[ADD] EKFAC #127

runame · 2024-09-17T04:35:59Z

Implements EKFAC (and its inverse) support (resolves #116).

At some point, I think we should refactor KFACLinearOperator and KFACInverseLinearOperator to inherit from KroneckerProductLinearOperator and EigendecomposedKroneckerProductLinearOperator (or similar) classes, since torch_matmat and other methods could be shared. In addition, KFACInverseLinearOperator currently does not support properties such as trace and det, which could also be shared. I created #126 to track this.
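A minimal sketch of what such a shared base class could provide, assuming a single Kronecker product of two square factors. The class name KroneckerProductSketch and its methods are hypothetical and not part of curvlinops, and NumPy stands in for PyTorch to keep the example self-contained.

```python
import numpy as np


class KroneckerProductSketch:
    """Hypothetical base class for the operator A ⊗ B (not a curvlinops class)."""

    def __init__(self, A: np.ndarray, B: np.ndarray):
        self.A, self.B = A, B  # square factors of shape (n, n) and (m, m)

    def matmat(self, X: np.ndarray) -> np.ndarray:
        """Compute (A ⊗ B) @ X without forming the (n*m, n*m) matrix."""
        n, m = self.A.shape[0], self.B.shape[0]
        # Row-major identity: (A ⊗ B) vec(V) = vec(A V Bᵀ) for V of shape (n, m).
        V = X.T.reshape(-1, n, m)  # one (n, m) matrix per column of X
        return (self.A @ V @ self.B.T).reshape(-1, n * m).T

    @property
    def trace(self) -> float:
        # tr(A ⊗ B) = tr(A) * tr(B)
        return float(np.trace(self.A) * np.trace(self.B))

    @property
    def det(self) -> float:
        # det(A ⊗ B) = det(A)^m * det(B)^n
        n, m = self.A.shape[0], self.B.shape[0]
        return float(np.linalg.det(self.A) ** m * np.linalg.det(self.B) ** n)
```

Both the forward and the inverse operator could reuse these methods, since the inverse of a Kronecker product is again a Kronecker product of the inverted factors.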

coveralls · 2024-09-17T04:55:01Z

Pull Request Test Coverage Report for Build 10975103378

Details

207 of 210 (98.57%) changed or added relevant lines in 2 files are covered.
2 unchanged lines in 1 file lost coverage.
Overall coverage increased (+0.5%) to 89.5%

Changes Missing Coverage	Covered Lines	Changed/Added Lines	%
curvlinops/inverse.py	36	37	97.3%
curvlinops/kfac.py	171	173	98.84%

Files with Coverage Reduction	New Missed Lines	%
curvlinops/kfac.py	2	93.71%

Totals
Change from base Build 10408891176:	0.5%
Covered Lines:	1449
Relevant Lines:	1619

💛 - Coveralls

runame · 2024-09-20T01:32:34Z

@f-dangel One thing that is not tested and that could be wrong is the per-example gradient computation when there is weight sharing.

f-dangel

Gave some refactoring comments.

Overall, while reading through the diff, I was wondering whether there is a better way to separate out the eigenvalue correction of EKFAC. Ideally, we could keep KFAC as is and implement EKFAC separately, e.g. by inheriting EKFAC from KFAC.

Do you have a good idea how to do this? Otherwise, I believe this PR will make the code a lot more complex and, in the long term, complicate extending KFAC, especially for developers who are less familiar with EKFAC.
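One way the inheritance idea could look, as a rough sketch for a single layer: the base class works in the Kronecker eigenbasis and exposes its eigenvalues through one method, and the subclass overrides only that method. The class names are hypothetical, NumPy stands in for PyTorch, and averaging over examples is an assumption (the right scaling depends on the loss reduction).

```python
import numpy as np


class KFACBlockSketch:
    """Hypothetical single-layer KFAC block B ⊗ A in its eigenbasis."""

    def __init__(self, A: np.ndarray, B: np.ndarray):
        # A: input covariance (d_in, d_in); B: output-gradient covariance (d_out, d_out)
        self.lam_A, self.Q_A = np.linalg.eigh(A)
        self.lam_B, self.Q_B = np.linalg.eigh(B)

    def eigenvalues(self) -> np.ndarray:
        # KFAC: the eigenvalues are all products of the factors' eigenvalues.
        return np.outer(self.lam_B, self.lam_A)  # (d_out, d_in)

    def matvec(self, V: np.ndarray) -> np.ndarray:
        # Rotate into the Kronecker eigenbasis, scale, rotate back.
        rotated = self.Q_B.T @ V @ self.Q_A
        return self.Q_B @ (self.eigenvalues() * rotated) @ self.Q_A.T


class EKFACBlockSketch(KFACBlockSketch):
    """Only the eigenvalues differ; matvec is inherited unchanged."""

    def __init__(self, A, B, per_example_grads):
        super().__init__(A, B)
        # Second moment of the per-example gradients in the eigenbasis.
        rotated = self.Q_B.T @ per_example_grads @ self.Q_A  # (N, d_out, d_in)
        self.corrected = (rotated**2).mean(axis=0)

    def eigenvalues(self) -> np.ndarray:
        return self.corrected
```

With this split, code that only cares about KFAC never touches the correction, and the EKFAC-specific state lives entirely in the subclass.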

curvlinops/inverse.py

curvlinops/kfac.py

f-dangel · 2024-09-20T22:27:02Z

curvlinops/kfac.py

+ # Delete the cached activations
+ self._cached_activations.clear()


Are these cached activations concatenated over batches? Why don't they have to be cleared inside the data loop?

runame · 2024-09-21 (edited)

No, they are simply overwritten on each batch. This avoids redundantly clearing the cache right before it is filled again anyway. Do you think it would be cleaner to clear the cache explicitly in every iteration?
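A toy illustration of the overwrite behaviour described here, assuming the cache is a dict keyed by module name. The hook function and the data are made up for the example and do not mirror the actual curvlinops code.

```python
from typing import Dict, List

cached_activations: Dict[str, List[float]] = {}
cache_sizes: List[int] = []


def forward_hook(module_name: str, activations: List[float]) -> None:
    """Hypothetical hook: store the current batch's input for one module."""
    cached_activations[module_name] = activations  # replaces the previous batch


batches = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
for batch in batches:
    forward_hook("linear1", batch)
    # ... the backward pass would consume cached_activations["linear1"] here ...
    cache_sizes.append(len(cached_activations))

# One entry per module at all times, so nothing accumulates across batches.
assert cache_sizes == [1, 1, 1]
assert cached_activations["linear1"] == [5.0, 6.0]

# A single clear after the data loop releases the last batch.
cached_activations.clear()
```

Clearing inside the loop would give the same result; it would mainly make the lifetime of each cached batch explicit.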

f-dangel · 2024-09-20T22:30:38Z

curvlinops/kfac.py

+ "d_out1 d_out2, ... d_out1 d_in1, d_in1 d_in2 -> ... d_out2 d_in2",
+ )
+ .square_()
+ .sum(dim=0)


Is this sum correct, or do you want to sum out the ... of the einsum result?

Based on the above variable, I would change .sum(dim=0) into .sum(list(range(shared_axes)))

Also check the else branch below for the same suggestions.
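To make the difference concrete, here is a small NumPy sketch (NumPy's einsum stands in for the einops call in the diff, and all tensors are random stand-ins). It assumes the rotated per-example gradients carry two leading axes, and it derives the number of leading axes from the einsum result, which is a choice of this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in = 3, 4
Q_g = rng.standard_normal((d_out, d_out))  # stand-in for an eigenvector matrix
Q_a = rng.standard_normal((d_in, d_in))
# Per-example gradients with two leading axes (e.g. batch and a shared axis).
grads = rng.standard_normal((5, 2, d_out, d_in))

# NumPy spelling of
# "d_out1 d_out2, ... d_out1 d_in1, d_in1 d_in2 -> ... d_out2 d_in2".
rotated_sq = np.einsum("ab,...ac,cd->...bd", Q_g, grads, Q_a) ** 2

leading_axes = rotated_sq.ndim - 2  # everything matched by "..."
only_first = rotated_sq.sum(axis=0)
all_leading = rotated_sq.sum(axis=tuple(range(leading_axes)))

print(only_first.shape)   # (2, 3, 4): one leading axis survives
print(all_leading.shape)  # (3, 4): matches the (d_out, d_in) weight shape
```

With a single leading axis the two sums coincide, which is why the issue only shows up once the ellipsis matches more than one axis.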

f-dangel · 2024-09-20T22:32:05Z

curvlinops/kfac.py

+ per_example_gradient = einsum(
+ g,
+ self._cached_activations[module_name],
+ "shared d_out, shared d_in -> shared d_out d_in",


shared should be replaced by ...

Then, add a line shared_axes = g.ndim - 2.
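A short sketch of why the ellipsis helps, again using NumPy's einsum as a stand-in for einops: a named axis such as shared matches exactly one axis, while ... matches any number of leading axes. The function name and shapes are illustrative only.

```python
import numpy as np


def per_example_gradient(g: np.ndarray, a: np.ndarray) -> np.ndarray:
    """Outer product over the last axis; "..." matches any leading axes."""
    # NumPy spelling of "... d_out, ... d_in -> ... d_out d_in".
    return np.einsum("...o,...i->...oi", g, a)


rng = np.random.default_rng(0)

# One leading axis: a fixed "shared" label could handle this as well.
g1, a1 = rng.standard_normal((5, 3)), rng.standard_normal((5, 4))
print(per_example_gradient(g1, a1).shape)  # (5, 3, 4)

# Two leading axes: only the ellipsis version accepts this input.
g2, a2 = rng.standard_normal((5, 2, 3)), rng.standard_normal((5, 2, 4))
print(per_example_gradient(g2, a2).shape)  # (5, 2, 3, 4)
```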

curvlinops/kfac.py

runame added 12 commits August 7, 2024 10:48

fd812f7 Add compute_eigendecomposition
08bdf7c Merge branch 'main' into ekfac
290fd1b Add EKFAC test coverage
7c72d97 Implement EKFAC
18062ef Add inverse EKFAC test coverage
b096726 Add inverse EKFAC support
a2dec74 Fix flake8
d60c876 Fix docstring and lower test numerical threshold
b309300 Fix MetaEnum docstring
725c413 Fix tests for FOOF+eigenvalue correction
1e4769d Fix black
31cab8a Ignore flake8 too complex error

runame added the enhancement label Sep 17, 2024

runame requested a review from f-dangel September 17, 2024 04:35

runame self-assigned this Sep 17, 2024

f-dangel reviewed Sep 20, 2024

runame added 4 commits September 21, 2024 14:34

89c814f Refactor inverse
cf45ef5 Address KFAC refactor suggestions
7422b00 Fix docstring and error catching in test
3583070 Fix test
