Fix Matrix on diagonal and triangular matrices for types without zero #36256

Open · wants to merge 1 commit into master
Conversation

blegat (Contributor) commented Jun 12, 2020

Similar to #35743

blegat (Contributor, Author) commented Jun 15, 2020

The failing tests are due to types T, such as Matrix{Float64}, for which zero(::T) is defined but zero(::Type{T}) is not.
For these types, Matrix{T}(::LowerTriangular{T}) works, as it only requires zero(::T) to be defined, but the current PR does not, because it calls typeof(zero(T)).
I see two solutions to this (both sketched after the list):

  1. Define a function zero_type(T::Type) = typeof(zero(T)) that should be implemented for Array next to
    zero(x::AbstractArray{T}) where {T} = fill!(similar(x), zero(T))

    and for AbstractSparseArray next to
    zero(a::AbstractSparseArray) = spzeros(eltype(a), size(a)...)

    The issue with this is that it can be considered a breaking change.
  2. Use Base.promote_op(zero, T) instead of typeof(zero(T)).
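
For concreteness, here is a minimal sketch of both options (zero_type is a hypothetical name introduced here for illustration; Base.promote_op is the existing function):

    # Option 1 (sketch): a zero_type function with a generic fallback plus a
    # specialization defined next to the corresponding zero method for Array.
    zero_type(::Type{T}) where {T} = typeof(zero(T))
    zero_type(::Type{Array{T,N}}) where {T,N} = Array{typeof(zero(T)),N}

    # Option 2 (sketch): ask inference for the type of zero applied to a value
    # of type T, so that only zero(::T) needs to exist.
    zero_type2(::Type{T}) where {T} = Base.promote_op(zero, T)

    zero_type(Matrix{Float64})   # Matrix{Float64}, without needing zero(::Type{Matrix{Float64}})
    zero_type2(Matrix{Float64})  # Matrix{Float64}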

codecov-commenter commented Jun 15, 2020

Codecov Report

Merging #36256 into master will decrease coverage by 0.01%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##           master   #36256      +/-   ##
==========================================
- Coverage   86.10%   86.09%   -0.02%     
==========================================
  Files         349      349              
  Lines       65163    65163              
==========================================
- Hits        56109    56102       -7     
- Misses       9054     9061       +7     
Impacted Files Coverage Δ
stdlib/LinearAlgebra/src/givens.jl 56.73% <0.00%> (-4.81%) ⬇️
base/bitset.jl 92.23% <0.00%> (-0.98%) ⬇️
base/missing.jl 87.97% <0.00%> (-0.55%) ⬇️
stdlib/SparseArrays/src/linalg.jl 94.94% <0.00%> (-0.30%) ⬇️
stdlib/LinearAlgebra/src/triangular.jl 96.31% <0.00%> (-0.26%) ⬇️
stdlib/REPL/src/REPL.jl 80.69% <0.00%> (ø)
stdlib/REPL/src/LineEdit.jl 79.25% <0.00%> (ø)
stdlib/LinearAlgebra/src/bidiag.jl 95.87% <0.00%> (+0.18%) ⬆️
stdlib/SparseArrays/src/sparsevector.jl 95.26% <0.00%> (+0.18%) ⬆️
stdlib/SparseArrays/src/sparsematrix.jl 90.00% <0.00%> (+0.31%) ⬆️
... and 1 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 9ed10de...3fa1f75.

andreasnoack (Member) commented:
I'm a little reluctant about these changes. My gut feeling is that zero(T) isa T is generally assumed in the linear algebra code here and elsewhere. I'm not saying this change is necessarily wrong, but I'm not sure we understand the implications.

blegat (Contributor, Author) commented Jun 16, 2020

I understand that many parts of LinearAlgebra were written with Numbers in mind, and in fact most of it continues to work with other types, such as JuMP functions and polynomial functions.
There are a few places where too much is assumed, and this triggers errors when, e.g., users of JuMP (resp. MultivariatePolynomials) try to do linear algebra with JuMP functions (resp. polynomials).
We define a type AbstractMutable in MutableArithmetics. Subtypes of AbstractMutable are mutable types. Some methods in the stdlib (e.g. mul!, dot) would be more efficient if they exploited this mutability, so we redefine these methods for AbstractMutable.
It happens that mutable types are also often types with non-Number behaviors (e.g. !(zero(T) isa T)), so we also try to redefine the methods where these assumptions are made for AbstractMutable.
This is however a bit hacky and was not the role of AbstractMutable (it is meant to rewrite methods for performance by exploiting mutability).
We made a list of the different issues with assumptions made in LinearAlgebra in jump-dev/MutableArithmetics.jl#47.
You can see that many places where these kinds of assumptions were made in LinearAlgebra have already been fixed; this is currently the last one remaining.
In fact, this PR is similar in spirit to #18218, where we also needed promote_op to handle more general cases, and that change was very useful for JuMP and polynomial types.
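
For concreteness, a self-contained toy sketch of a type whose zero has a different type, mimicking the JuMP situation where the zero of a variable is an affine expression (these are stand-in types, not the actual JuMP API):

    # A variable-like type whose zero is an expression, not another variable.
    struct Var
        name::Symbol
    end
    struct Aff                       # stand-in for an affine expression
        terms::Dict{Var,Float64}
    end
    Base.zero(::Type{Var}) = Aff(Dict{Var,Float64}())
    Base.zero(::Var) = zero(Var)

    zero(Var) isa Var                # false: the zero(T) isa T assumption breaks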

tkf (Member) commented Jun 18, 2020

Some methods in the stdlib (e.g. mul!, dot) would be more efficient if they exploited this mutability, so we redefine these methods for AbstractMutable.

I don't think relying on computations in the type domain is good practice in Julia. It doesn't compose well, because it's rather unavoidable to rely on promote_op (or a very complicated trait system) if you go with such a design principle. Using return_type (via promote_op) like this means that your API is undefined. (Yes, this applies to a lot of Base/stdlib APIs, too.)

I do agree that exploiting in-place operations on nested mutable data structures would be highly beneficial. But I think the logic has to occur in the value domain rather than in the type domain. The only route I can imagine for this to happen is to embrace the mutate-or-widen strategy everywhere recursively in LinearAlgebra/Base functions. I'm exploring this in BangBang.jl. It is primarily for "containers", but I added a few compute-oriented APIs like materialize!!, mul!!, etc., and I find it powerful. Of course, using the mutate-or-widen strategy doesn't mean that we can't use return_type. We just have to make sure that it is used solely for optimization.
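
To illustrate, a minimal sketch of the mutate-or-widen pattern (illustrative names, assuming dest is an ordinary mutable Array; not BangBang's actual implementation):

    # Try to add src into dest in place; if the result type doesn't fit dest,
    # allocate a widened array instead. Callers must use the return value.
    function add!!(dest::AbstractVector, src::AbstractVector)
        T = promote_type(eltype(dest), eltype(src))
        if T <: eltype(dest)
            dest .+= src          # fast path: mutate in place
            return dest
        else
            return dest .+ src    # widen: allocate a new array of eltype T
        end
    end

    add!!([1, 2], [3, 4])         # mutates and returns the Vector{Int}
    add!!([1, 2], [0.5, 0.5])     # returns a new Vector{Float64}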

I understand that this is not a short-term solution, but I think it's something worth keeping in mind before using promote_op everywhere and designing APIs that rely on it.

blegat (Contributor, Author) commented Jun 19, 2020

Thanks for the link to BangBang, it looks interesting. The difference with MutableArithmetics (MA) is that the latter focuses more on arithmetic/algebra and on operations that originally do not mutate but return their result, such as *, +, -, ...

it's rather unavoidable to rely on promote_op (or a very complicated trait system) if you go with such a design principle.

In order to write algorithms that work for any type while still exploiting mutability when possible, you must be able to apply an operation to some objects, store the result in another object by mutating it if possible, and in any case return the result (for instance, with Int it would not mutate, while with BigInt it would; since the code is written for any type, you don't know which case you are in). This is provided by MA.operate! and MA.operate_to!.
It turns out that this is quite boilerplate-heavy and error-prone to implement, as it does two things (checking whether we can mutate, and mutating if we can).
Moreover, sometimes your code is not type-independent: you know that you want to mutate the object, and if that's not possible then it's a bug and you want an error, not silently inefficient code. This is provided by MA.mutable_operate! and MA.mutable_operate_to!.
The implementation of MA.operate! can just redirect to MA.mutable_operate!, but, as mentioned, it is quite error-prone to implement. In fact, there is one way to implement it that relies on a property that seems to hold for all types implementing MA:
either output is not mutable (e.g. Int, Float64) and MA.operate_to!(op, output, ...) should just redirect to op(...); or it is mutable, in which case, if op(...) has the same type as output, we should redirect to MA.mutable_operate_to!(op, output, ...), and otherwise again just redirect to op(...). A sketch of this rule is given below.
To do this, you need to be able to get the return type, so types should implement MA.promote_operation (which does not rely on promote_op).
While it seems like a lot of work to require types to implement MA.promote_operation, it is still less work than implementing MA.operate! and MA.operate_to!, and you get a reliable MA.promote_operation that users can use for free (and it's ok to use it, unlike Base.promote_op, which should be avoided). So it's a case where, by implementing less, you get more features!
I'm not saying that in Julia all types should implement MA.promote_operation for all functions in addition to implementing the functions.
I'm just saying that for certain functions (typically the ones encountered in arithmetic), the approach seems very fruitful.
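
Here is a runnable toy sketch of that rule (is_mutable and the promote_operation methods below are illustrative stand-ins, not MutableArithmetics' actual source):

    is_mutable(::Type) = false
    is_mutable(::Type{BigInt}) = true

    # Types declare the result type of an operation explicitly, without
    # relying on Base.promote_op.
    promote_operation(::typeof(+), ::Type{BigInt}, ::Type{BigInt}) = BigInt
    promote_operation(::typeof(+), ::Type{Int}, ::Type{Int}) = Int

    # In-place add via Julia's internal GMP wrapper.
    mutable_operate_to!(::typeof(+), out::BigInt, a::BigInt, b::BigInt) =
        Base.GMP.MPZ.add!(out, a, b)

    function operate_to!(op, output, args...)
        if is_mutable(typeof(output)) &&
           promote_operation(op, typeof.(args)...) == typeof(output)
            return mutable_operate_to!(op, output, args...)  # reuse output's storage
        else
            return op(args...)                               # allocate a new result
        end
    end

    operate_to!(+, big(0), big(1), big(2))  # mutates the BigInt buffer
    operate_to!(+, 0, 1, 2)                 # Int is not mutable: plain 1 + 2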

Another example where we could get more features with less effort is matrix multiplication.
Currently, in LinearAlgebra and SparseArrays (spmatmul! is not doing exactly that, but it could be rewritten like this), there are many implementations of *(A, B) that basically do the following:

  1. Guess the type of the resulting array
  2. Create a new array to store the result
  3. Redirect to LinearAlgebra.mul!

So in fact, we could have only one method for *, together with many methods giving the type of the resulting array and many methods creating a new array from the type of an array (the latter already exist, e.g. similar).
This is the approach we used in MA and it seems to work quite well: https://github.com/jump-dev/MutableArithmetics.jl/blob/7057356d3f154b1a76ca385e03a650d834c80665/src/linear_algebra.jl#L215-L226.
By doing this, we get more features, as there is now a function that can be used to get the return type of a matrix multiplication, and it requires less effort, as implementing this function is less work than implementing *, since it is one of the tasks that * currently does anyway.
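
A sketch of that refactoring (generic_mul and matmul_eltype are hypothetical names, not the actual stdlib code):

    using LinearAlgebra

    # Step 1: a dedicated, overloadable function for the result's element type.
    # x*y + x*y mirrors the multiply-accumulate performed inside a matmul.
    matmul_eltype(A, B) = Base.promote_op((x, y) -> x * y + x * y, eltype(A), eltype(B))

    # Steps 2 and 3: allocate with similar, then forward to the in-place kernel.
    function generic_mul(A::AbstractMatrix, B::AbstractMatrix)
        C = similar(A, matmul_eltype(A, B), (size(A, 1), size(B, 2)))
        return mul!(C, A, B)
    end

    generic_mul([1 2; 3 4], [5 6; 7 8])  # == [19 22; 43 50]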

codecov-io commented:

Codecov Report

Merging #36256 (8d91b11) into master (4d94a63) will decrease coverage by 0.03%.
The diff coverage is n/a.


@@            Coverage Diff             @@
##           master   #36256      +/-   ##
==========================================
- Coverage   86.12%   86.09%   -0.04%     
==========================================
  Files         349      349              
  Lines       65258    65163      -95     
==========================================
- Hits        56205    56102     -103     
- Misses       9053     9061       +8     
Impacted Files Coverage Δ
stdlib/LinearAlgebra/src/givens.jl 56.73% <0.00%> (-4.81%) ⬇️
stdlib/Test/src/logging.jl 65.75% <0.00%> (-1.86%) ⬇️
base/compiler/ssair/ir.jl 81.66% <0.00%> (-1.63%) ⬇️
base/cartesian.jl 81.81% <0.00%> (-0.86%) ⬇️
base/int.jl 53.00% <0.00%> (-0.71%) ⬇️
base/gmp.jl 87.04% <0.00%> (-0.29%) ⬇️
stdlib/LinearAlgebra/src/triangular.jl 96.31% <0.00%> (-0.26%) ⬇️
base/compiler/ssair/driver.jl 90.69% <0.00%> (-0.22%) ⬇️
base/compiler/abstractinterpretation.jl 89.18% <0.00%> (-0.20%) ⬇️
stdlib/LinearAlgebra/src/bidiag.jl 95.87% <0.00%> (-0.19%) ⬇️
... and 22 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 4d94a63...e5c3f3a.
