
Reducing memory allocation using lazy evaluation #29

Open
Shuenhoy opened this issue Aug 13, 2024 · 5 comments
Labels
enhancement New feature or request


@Shuenhoy

Thank you for this project! The .NET world really needs good numerics and linear algebra libraries these days.

The README indicates that the impact of memory allocation on performance has already been noticed, and in-place operations are suggested as a remedy.
In principle, such in-place operations do avoid memory allocations. However, they require the coder to manually arrange the order of operations for a complex chained expression, and they make the code harder to read.

The popular libraries in the C++ world (e.g. Eigen3 and Blaze) use a lazy evaluation technique to handle such difficulties. For example, `var X = A.Add(B)` produces a `MatAdd<Mat, Mat>` instead of a `Mat`, and `var X = A.Add(B.MultiplyBy(C))` produces a `MatAdd<Mat, MatMul<Mat, Mat>>`, where `MatAdd` and `MatMul` only record the operations to be performed (without actually performing them) and are something like a `ref struct` that needs no heap allocation.

Finally, `X.Eval()` allocates a new matrix, while `X.EvalTo(M)` uses the pre-allocated `M`. Only at this point is the lazy evaluation actually performed, and all the intermediate temporary allocations can be avoided.

(This may be simplified by the upcoming Extension Everything feature; e.g. `var R = X.Eval()` might become just `Mat R = X`, but I am not sure.)

@sinshu
Owner

sinshu commented Aug 13, 2024

Thanks for the kind comment 😊

Regarding the lazy evaluation technique, while it's an interesting topic, it's difficult for me to implement and maintain due to my current workload 😖

Anyway, I really hope that interest in numerical computation and ML expands within the .NET community.

@sinshu sinshu added the enhancement New feature or request label Aug 13, 2024
@MarsRover75

MarsRover75 commented Aug 28, 2024

> The popular libraries in the C++ world (e.g. Eigen3 and Blaze) use a lazy evaluation technique to handle such difficulties. For example, `var X = A.Add(B)` produces a `MatAdd<Mat, Mat>` instead of a `Mat`, and `var X = A.Add(B.MultiplyBy(C))` produces a `MatAdd<Mat, MatMul<Mat, Mat>>`, where `MatAdd` and `MatMul` only record the operations to be performed (without actually performing them) and are something like a `ref struct` that needs no heap allocation.

IMHO, in C# any lazy evaluation will imply either boxing of value types or allocation of delegates, which in both cases means heap allocations (and the impossibility of using ref structs).
Maybe I'm wrong, but that's exactly the reason there are so many constraints on Span usage.

With the current approach, all intermediate objects are structs, so they are stack-allocated by the compiler, which is cheap.

Something like this could be achieved with code generators, but that's another level of effort.

UPD: There's some hope with the upcoming .NET 10, though: https://github.com/dotnet/runtime/issues/104936.
But by that time I hope we'll also have a more usable System.Numerics.Tensors.Tensor.

@Shuenhoy
Author

Shuenhoy commented Aug 28, 2024

> > The popular libraries in the C++ world (e.g. Eigen3 and Blaze) use a lazy evaluation technique to handle such difficulties. For example, `var X = A.Add(B)` produces a `MatAdd<Mat, Mat>` instead of a `Mat`, and `var X = A.Add(B.MultiplyBy(C))` produces a `MatAdd<Mat, MatMul<Mat, Mat>>`, where `MatAdd` and `MatMul` only record the operations to be performed (without actually performing them) and are something like a `ref struct` that needs no heap allocation.
>
> IMHO, in C# any lazy evaluation will imply either boxing of value types or allocation of delegates, which in both cases means heap allocations (and the impossibility of using ref structs). Maybe I'm wrong, but that's exactly the reason there are so many constraints on Span usage.
>
> With the current approach, all intermediate objects are structs, so they are stack-allocated by the compiler, which is cheap.
>
> Something like this could be achieved with code generators, but that's another level of effort.
>
> UPD: There's some hope with the upcoming .NET 10, though: https://github.com/dotnet/runtime/issues/104936. But by that time I hope we'll also have a more usable System.Numerics.Tensors.Tensor.

Here by "lazy evaluation" I do not refer to any specific existing implementation like `System.Lazy<T>`, which indeed introduces heap allocations as you described. Instead, I mean a general evaluation strategy, and some extra work may be needed to avoid the heap allocations. I have made some preliminary experiments: https://github.com/Shuenhoy/DotnetETExp. In short, I let the structs implement some interfaces; when a struct is passed to a generic method with an interface constraint, boxing can be avoided. The approach in that repo is not good enough yet, though, as the struct may still be boxed accidentally. C# 13 will allow ref structs to implement interfaces, which may improve the situation.

As for System.Numerics.Tensors, of course I also hope it will be good enough to use. However, AFAIK Microsoft is not particularly experienced with numerics, so I am not confident they will get it right. In fact, I have already seen some severe criticism (arguing that the whole design is problematic) of System.Numerics.Tensors, and of System.Numerics in general. You could use a translator to read https://zhuanlan.zhihu.com/p/713157309 if you are interested.

@MarsRover75

My point was that when constructing any deferred-execution expression chain, you'll have to return something from its building blocks to get nice syntax.

You cannot return a ref struct as an interface it implements (a ref struct can't be converted to an instance of an interface it implements); those interfaces serve different goals.
So what will be returned instead? Some interface for a class (regular structs are boxed when referenced as an interface they implement) that represents the expression being built, or a stateful delegate?
All of this leads to heap allocation at some point.

@Shuenhoy
Author

Shuenhoy commented Aug 28, 2024

> My point was that when constructing any deferred-execution expression chain, you'll have to return something from its building blocks to get nice syntax.
>
> You cannot return a ref struct as an interface it implements (a ref struct can't be converted to an instance of an interface it implements); those interfaces serve different goals. So what will be returned instead? Some interface for a class (regular structs are boxed when referenced as an interface they implement) that represents the expression being built, or a stateful delegate? All of this leads to heap allocation at some point.

Why is returning the raw ref struct a problem? I cannot see why we would need to convert it to an interface or delegate at all.

For example, suppose the operation is defined as:

```csharp
struct AddGeneric<V1, V2> : IVecExp
    where V1 : IVecExp
    where V2 : IVecExp
{
    V1 left;
    V2 right;
}
```

The expression `v1.Add(v2).Add(v3)` will have a value of type `AddGeneric<AddGeneric<Vec, Vec>, Vec>`, where the inner `AddGeneric<Vec, Vec>` does not need to be boxed, because the type of `left` in `AddGeneric` is declared as a generic parameter with constraints, rather than as an interface type like:

```csharp
struct AddInterface : IVecExp
{
    IVec left;
    IVec right;
}
```

`AddInterface` will lead to the heap allocations you are concerned about, which we want to avoid, but `AddGeneric` will not.
