Factorization clarification question #75

marc-vdm · 2024-06-20T10:43:22Z

I'm writing some custom BW code (I can share it over email, but it has to remain private for now) where I observe a ~60% speed increase (~~2600~~ 200 solves/sec vs ~~1600~~ 123 solves/sec) when I factorize beforehand.
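
As a quick check of the arithmetic behind the ~60% figure, the sketch below recomputes it from the two throughput numbers quoted above. The variable names are illustrative only.

```python
# Throughput figures quoted above (solves per second).
with_factorize = 200.0
without_factorize = 123.0

# Relative speed-up of the faster path over the slower one.
speedup = with_factorize / without_factorize - 1.0

# The same gap expressed as a slowdown of the slower path.
slowdown = 1.0 - without_factorize / with_factorize

print(f"speed-up: {speedup:.1%}, slowdown: {slowdown:.1%}")
```

The same gap therefore reads as roughly a 60% speed-up in one direction and a little under a 40% slowdown in the other.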

Are you certain the below is correct?

PyPardiso/pypardiso/scipy_aliases.py

Lines 73 to 74 in 0f7afd0

  !!! Use spsolve directly whenever possible !!! Contrary to the scipy implementation there is no performance 

  gain in PyPardiso by using factorized instead of spsolve.

You may be correct that the computational/memory cost is not worth it if you only perform a few -whatever number few is- Ax=b solves, on the same A, but when I do many solves (>20k), this does seem to matter quite a lot.

edit: Also note that def factorize() itself has conflicting information with the above.

PyPardiso/pypardiso/pardiso_wrapper.py

Lines 145 to 152 in 0f7afd0

def factorize(self, A):
    """
    Factorize the matrix A, the factorization will automatically be used if the same matrix A is passed to the
    solve method. This will drastically increase the speed of solve, if solve is called more than once for the
    same matrix A

    --- Parameters ---
    A: sparse square CSR matrix (scipy.sparse.csr.csr_matrix), CSC matrix also possible


cmutel · 2024-06-20T12:12:15Z

I am 99% sure that the difference you are seeing is due to this check:

PyPardiso/pypardiso/scipy_aliases.py

Line 44 in 0f7afd0

solver._check_A(A)

If the matrix wasn't factorized, the solve would be at least an order of magnitude slower. What this line does is make sure the A matrix, which is factorized, is the same now as when it was factorized in the first place. factorize doesn't make this check and is therefore somewhat faster.

cmutel · 2024-06-20T12:15:43Z

The ideal option, in my opinion, would be to be able to call spsolve without the A matrix check, probably through an additional input argument. This would still preserve compatibility with scipy input arguments, and avoid paying the memory cost of keeping another copy of the A matrix around:

PyPardiso/pypardiso/scipy_aliases.py

Lines 71 to 77 in 0f7afd0

  Don't use the factorized method for very large matrices, because it needs to keep an additional copy of 

  A in memory. 

  !!! Use spsolve directly whenever possible !!! Contrary to the scipy implementation there is no performance 

  gain in PyPardiso by using factorized instead of spsolve. 

  """ 

 solve_b = functools.partial(spsolve, A.tocsr().copy(), squeeze=False, solver=solver)

haasad · 2024-06-20T12:25:54Z

You should be able to do the following for maximum speed:

import pypardiso

solver = pypardiso.ps
solver.factorize(A)
solver.set_phase(33)
x = solver._call_pardiso(A, b)

This skips all the checks etc.

marc-vdm · 2024-06-20T12:36:40Z

Thanks for your replies both!
I'll experiment with Adrian's solution by rewriting def solve_linear_system() and def decompose_technosphere() and see how that does (or doesn't) change the results.

I'll report back with some findings

haasad · 2024-06-20T12:42:23Z

Please be aware that factorized is in a way just a convenience function, so pypardiso can be used as drop-in replacement for scipy.sparse.linalg.factorized. It still uses pypardiso.spsolve under the hood and pypardiso.spsolve always does a factorization for re-use by subsequent calls to spsolve.
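
To illustrate the "convenience function" pattern described above, here is a minimal sketch in plain Python. toy_spsolve and toy_factorized are hypothetical stand-ins, not PyPardiso functions, and the "matrix" is just a list of diagonal entries; the point is only how functools.partial binds a private copy of A so that the returned callable needs nothing but b.

```python
import copy
import functools


def toy_spsolve(A, b):
    """Hypothetical stand-in for a sparse solve: A holds the diagonal entries."""
    return [bi / ai for ai, bi in zip(A, b)]


def toy_factorized(A):
    # Bind a private copy of A, so the returned callable only takes b.
    # This copy is the extra memory cost mentioned in the docstring above.
    return functools.partial(toy_spsolve, copy.copy(A))


A = [2.0, 4.0]
solve_b = toy_factorized(A)
x1 = solve_b([2.0, 4.0])

# Mutating the caller's A afterwards does not affect the bound copy.
A[0] = 100.0
x2 = solve_b([2.0, 4.0])
```

In this toy, x1 and x2 are identical because solve_b keeps working on its own copy of A, which is the trade-off behind the advice not to use factorized for very large matrices.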

haasad · 2024-06-20T13:16:56Z

I think I found the reason for the performance difference that you see. I assume the brightway technosphere matrix is in csc format?

factorized converts A to csr format and keeps a copy of it:

PyPardiso/pypardiso/scipy_aliases.py

Line 77 in 0f7afd0

 solve_b = functools.partial(spsolve, A.tocsr().copy(), squeeze=False, solver=solver) 

spsolve converts to csr on every call:

PyPardiso/pypardiso/scipy_aliases.py

Lines 41 to 42 in 0f7afd0

if sp.issparse(A) and A.format == "csc":
    A = A.tocsr()  # fixes issue with brightway2 technosphere matrix

Unfortunately my commit message from 2016 is pretty useless: 7a60c84

pardiso can deal with both csc and csr formats:

PyPardiso/pypardiso/pardiso_wrapper.py

Lines 219 to 227 in 0f7afd0

if sp.issparse(A) and A.format == "csr":
    self._solve_transposed = False
    self.set_iparm(12, 0)
elif sp.issparse(A) and A.format == "csc":
    self._solve_transposed = True
    self.set_iparm(12, 1)
else:
    msg = 'PyPardiso requires matrix A to be in CSR or CSC format, but matrix A is: {}'.format(type(A))
    raise TypeError(msg)

haasad · 2024-06-20T13:20:35Z

Looks like I actually wrote down the reasoning for this in #7, even with a jupyter notebook and everything 😊

marc-vdm · 2024-06-24T09:47:40Z

Alright, here's some findings:

@cmutel
I am 99% sure that the difference you are seeing is due to

PyPardiso/pypardiso/scipy_aliases.py

Line 44 in 0f7afd0

solver._check_A(A)

I'm not entirely sure this is correct. I think it may be the conversion to csr that is costing so much time.

PyPardiso/pypardiso/scipy_aliases.py

Lines 41 to 42 in 0f7afd0

if sp.issparse(A) and A.format == "csc":
    A = A.tocsr()  # fixes issue with brightway2 technosphere matrix

When I convert my BW matrices to csr beforehand, I get the same speed as calling lca.decompose_technosphere() (which does this for us). solver._check_A does, however, re-do the checks for csc/csr, which doesn't seem needed when calling spsolve as we already convert to csr anyway.
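
The effect of converting beforehand can be sketched with scipy alone. This is an illustration under assumptions, not a benchmark of PyPardiso: per_call_guard is a hypothetical function that only mirrors the quoted csc check, and the tridiagonal matrix is a small stand-in for a technosphere matrix.

```python
import timeit

import scipy.sparse as sp

n = 2000
# Stand-in matrix stored as CSC, the format discussed for Brightway.
A_csc = sp.diags([-1.0, 4.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")


def per_call_guard(A):
    # Mirrors the quoted guard: convert only when the input is CSC.
    if sp.issparse(A) and A.format == "csc":
        A = A.tocsr()
    return A


# Convert once, outside any loop over right-hand sides.
A_csr = A_csc.tocsr()

# With CSC input the guard converts on every call; with CSR input it does nothing.
t_csc = timeit.timeit(lambda: per_call_guard(A_csc), number=200)
t_csr = timeit.timeit(lambda: per_call_guard(A_csr), number=200)
print(f"guard on CSC input: {t_csc:.4f}s, on CSR input: {t_csr:.4f}s")
```

The timings depend on the machine and matrix, so no expected numbers are given; the point is that the CSR input passes through the guard unchanged.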

I think if Brightway would set its default format to csr instead of csc, we can speed up these calculations by default. The current BW25 implementation (which blocks the factorization in lca.decompose_technosphere()) seems to be incurring a speed penalty for no reason unless the user thinks to convert lca.technosphere_matrix to csr themselves. This confirms what Adrian is saying in this comment. I understand the original csc format for UMFPACK compatibility, but most users are on x86 and thus use pypardiso, so perhaps this default is not the best.

@haasad
You should be able to do the following for maximum speed:
import pypardiso

solver = pypardiso.ps
solver.factorize(A)
solver.set_phase(33)
x = solver._call_pardiso(A, b)

This adds ~10% speed for me. While that's nice if you know what you're doing, it's perhaps safer that this is indeed not done for BW.

Now to look ahead:
What makes sense for BW?
I would like to see lca.decompose_technosphere() not be blocked when using pypardiso -especially with a wrong warning type- but perhaps I'm misunderstanding something here?
Furthermore, does it make sense to set the sparse format based on the solver? We're checking the solver anyway, so setting the 'correct' sparse format could help speed things up.

Let me know if you'd like to see a PR for BW25, Chris; I'll try to do it soon-ish then.

haasad · 2024-06-24T15:18:01Z

I would like to see lca.decompose_technosphere() not be blocked when using pypardiso -especially with a wrong warning type- but perhaps I'm misunderstanding something here?

- pypardiso always does a factorization + solve, the warning in bw2calc is correct
- the benefits that you see don't come from factorization, but are a side-effect of not having to do the repeated sparse format conversion, i.e. A = A.tocsr()
- pypardiso needs the brightway technosphere matrix to be in csr, that's why the conversion to csr is enforced in pypardiso.spsolve
  - the reasoning for this is explained in Wrong results for ill-conditioned CSC-matrices #7

Furthermore, does it make sense to set the sparse format based on the solver? We're checking the solver anyway, so setting the 'correct' sparse format could help speed things up.

I also think that the best way would be if brightway uses csr format as default in combination with pypardiso.

This adds ~10% speed for me. While that's nice if you know what you're doing, it's perhaps safer that this is indeed not done for BW.

I don't think the trade-off of 10% speed versus skipping all safety checks is worth it. If you do an LCA calculation, modify some values in the technosphere matrix and then do another LCA calculation, you might get completely wrong results without the checks in pypardiso.
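
The risk described above can be shown with a toy example. ToySolver is hypothetical and is not PyPardiso's implementation; it caches the inverse of a 2x2 matrix as a stand-in for a stored factorization, and the check flag stands in for the safety checks being discussed.

```python
class ToySolver:
    """Hypothetical 2x2 solver that caches its 'factorization' (the inverse)."""

    def __init__(self):
        self._cached_A = None  # copy of the matrix that was factorized
        self._inverse = None   # stands in for the stored factorization

    def factorize(self, A):
        (a, b), (c, d) = A
        det = a * d - b * c
        self._inverse = ((d / det, -b / det), (-c / det, a / det))
        self._cached_A = tuple(tuple(row) for row in A)

    def solve(self, A, rhs, check=True):
        A_now = tuple(tuple(row) for row in A)
        # Re-factorize if nothing is cached, or if the check sees a changed A.
        if self._inverse is None or (check and A_now != self._cached_A):
            self.factorize(A)
        inv = self._inverse
        return tuple(inv[i][0] * rhs[0] + inv[i][1] * rhs[1] for i in range(2))


solver = ToySolver()
A = [[2.0, 0.0], [0.0, 4.0]]
first = solver.solve(A, (2.0, 4.0))

A[0][0] = 4.0  # modify a value after the first calculation

stale = solver.solve(A, (2.0, 4.0), check=False)  # reuses the old factorization
fresh = solver.solve(A, (2.0, 4.0), check=True)   # notices the change
```

Here stale silently repeats the answer for the old matrix, while fresh reflects the modified value.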

marc-vdm · 2024-06-25T11:39:07Z

pypardiso always does a factorization + solve, the warning in bw2calc is correct

Well, partially IMO. Because of this warning and the factorization being put in an else:, BW users with pypardiso (which is still most of them) will keep using a csc-format technosphere, whereas previously lca.decompose_technosphere() would convert it to csr. In BW25, a user now needs to manually convert their technosphere or lose the ~60% speed-up, as the technosphere is otherwise converted to csr for every new FU.

We can either automatically keep technosphere (and for consistency also biosphere?) in the correct format for the solver (csc for scipy, csr for pypardiso) -which is the best solution IMO-, or, we allow lca.decompose_technosphere() to call factorized(), which would convert the technosphere to the correct csr format.
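
The first option could look roughly like the sketch below. PREFERRED_FORMAT and to_solver_format are hypothetical names used for illustration and are not part of Brightway or pypardiso; the format-per-solver mapping is taken from the pairing named above (csc for scipy, csr for pypardiso).

```python
import scipy.sparse as sp

# Hypothetical mapping, using the formats named in this thread.
PREFERRED_FORMAT = {"pypardiso": "csr", "scipy": "csc"}


def to_solver_format(matrix, solver_name):
    """Return `matrix` in the sparse format preferred by `solver_name`."""
    return matrix.asformat(PREFERRED_FORMAT[solver_name])


# Small stand-in for a technosphere matrix, stored as CSC.
technosphere = sp.identity(3, format="csc")
for_pardiso = to_solver_format(technosphere, "pypardiso")
for_scipy = to_solver_format(technosphere, "scipy")
```

Doing the conversion once when the matrix is built means the per-call csc check in spsolve has nothing left to convert.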

I'll close this discussion now, I think we can continue in brightway-lca/brightway2-calc#98 as this is a BW issue, not a pypardiso issue.

marc-vdm mentioned this issue Jun 20, 2024

Factorizing matrices with pypardiso is not a no-op brightway-lca/brightway2-calc#98

Closed

marc-vdm closed this as completed Jun 25, 2024

marc-vdm mentioned this issue Jul 17, 2024

Force correct self.technosphere_matrix format for solver brightway-lca/brightway2-calc#101

Merged
