Commit

Merge branch 'dilated-convolutions' into string-padding
f-dangel committed Oct 31, 2023
2 parents a46471f + 2d4e5b0 commit 51c329f
Showing 40 changed files with 1,149 additions and 705 deletions.
23 changes: 18 additions & 5 deletions README.md
@@ -26,7 +26,9 @@ The main feature is a `torch.optim.Optimizer` which works like most PyTorch opti
data-parallel](https://pytorch.org/tutorials/intermediate/ddp_tutorial.html)
(DDP) training[^1]

-The pre-conditioner matrices support different structures that allow to reduce cost ([overview](TODO Insert link to example)).
+The pre-conditioner matrices support different structures that allow to reduce
+cost
+([overview](https://singd.readthedocs.io/en/latest/generated/gallery/example_05_structures/)).

## Installation

@@ -42,10 +44,21 @@ The pre-conditioner matrices support different structures that allow to reduce c

## Usage

-- [Basic example](TODO Insert link to example)
-- Examples for [supported features](TODO Insert link to gallery)
-- [Advanced example](TODO Insert link to example)
-- [Supported structures](TODO Insert link to example)
+- [Basic
+example](https://singd.readthedocs.io/en/latest/generated/gallery/example_01_basic/)
+- Examples for [supported
+features](https://singd.readthedocs.io/en/latest/generated/gallery/)
+- [Advanced
+example](https://singd.readthedocs.io/en/latest/generated/gallery/example_04_advanced/)
+- [Supported
+structures](https://singd.readthedocs.io/en/latest/generated/gallery/example_05_structures/)
+
+## Limitations
+
+- `SINGD` does not support graph neural networks (GNN)
+
+- The code has stabilized only recently. Expect things to break and help us
+improve by filing issues.

## Citation

23 changes: 23 additions & 0 deletions changelog
@@ -0,0 +1,23 @@
# Changelog

All notable changes to this project will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

### Added

### Changed

### Deprecated

### Fixed

## [0.0.1] - 2023-10-31

Initial release

[unreleased]: https://github.com/f-dangel/singd/compare/v0.0.1...HEAD
[0.0.1]: https://github.com/f-dangel/singd/releases/tag/v0.0.1
2 changes: 1 addition & 1 deletion docs/examples/example_03_param_groups.py
@@ -60,7 +60,7 @@
"momentum": 0.9,
"weight_decay": 1e-2,
"lr_cov": 1e-2,
-"batch_averaged": True,
+"loss_average": "batch",
"T": 1,
"alpha1": 0.5,
}
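
The hunk above replaces the `batch_averaged` entry with `loss_average` in the hyperparameter dictionary. A minimal sketch of how an existing dictionary could be migrated follows. The helper `migrate_hyperparams` is hypothetical and not part of `singd`. It only handles the one mapping this diff shows (`True` becomes `"batch"`) and refuses to guess any other value.

```python
def migrate_hyperparams(old: dict) -> dict:
    """Rename ``batch_averaged`` to ``loss_average`` (hypothetical helper).

    Only the mapping visible in the diff is supported:
    ``batch_averaged=True`` -> ``loss_average="batch"``.
    """
    new = dict(old)  # copy so the caller's dictionary is not mutated
    if "batch_averaged" in new:
        batch_averaged = new.pop("batch_averaged")
        if batch_averaged is not True:
            # The diff does not show what other values map to, so do not guess.
            raise ValueError(
                f"No known mapping for batch_averaged={batch_averaged!r}"
            )
        new["loss_average"] = "batch"
    return new


old_hyperparams = {
    "momentum": 0.9,
    "weight_decay": 1e-2,
    "lr_cov": 1e-2,
    "batch_averaged": True,
    "T": 1,
    "alpha1": 0.5,
}
new_hyperparams = migrate_hyperparams(old_hyperparams)
```

Entries other than `batch_averaged` are passed through unchanged, and the input dictionary keeps its old key.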
19 changes: 14 additions & 5 deletions docs/examples/example_04_advanced.py
@@ -35,18 +35,21 @@
MAX_STEPS = 100 # quit training after this many steps
DEV = device("cuda" if cuda.is_available() else "cpu")

-BATCH_SIZE = 32
+MICRO_BATCH_SIZE = 6 # [ACC]
+ITERS_TO_ACCUMULATE = 4 # [ACC]
+NUM_PROCS = 2 # [ACC]

-MICRO_BATCH_SIZE = 8 # [ACC]
-assert BATCH_SIZE % MICRO_BATCH_SIZE == 0 # [ACC]
+BATCH_SIZE = MICRO_BATCH_SIZE * ITERS_TO_ACCUMULATE * NUM_PROCS

train_dataset = MNIST(
"./data",
train=True,
download=True,
transform=Compose([ToTensor(), Normalize(mean=(0.1307,), std=(0.3081,))]),
)
-train_loader = DataLoader(dataset=train_dataset, batch_size=BATCH_SIZE, shuffle=True)
+train_loader = DataLoader(
+dataset=train_dataset, batch_size=BATCH_SIZE, shuffle=True, drop_last=True
+)

model = Sequential(
Conv2d(1, 3, kernel_size=5, stride=2),
@@ -90,7 +93,7 @@
"momentum": 0.9,
"weight_decay": 1e-2,
"lr_cov": 1e-2,
-"batch_averaged": True,
+"loss_average": "batch",
"T": 1,
"alpha1": 0.5,
"structures": ("dense", "dense"),
@@ -155,6 +158,12 @@
with autocast(device_type=amp_device_type, dtype=amp_dtype): # [AMP]
loss = loss_func(model(inputs_micro), target_micro)

+# [ACC] Each per-datum loss must be scaled relative to the total
+# number of data points accumulated in a gradient, see
+# https://pytorch.org/docs/stable/notes/amp_examples.html#working-with-scaled-gradients
+if loss_func.reduction == "mean":
+loss *= MICRO_BATCH_SIZE / BATCH_SIZE
+
# [AMP] Backward passes under ``autocast`` are not recommended, see
# (https://pytorch.org/docs/stable/amp.html#torch.autocast).
# Therefore, this part happens outside the ``autocast`` context
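
The `[ACC]` scaling added in this hunk can be checked with plain arithmetic. The sketch below uses made-up per-datum losses instead of a real model and a single loop over all micro-batches; both are assumptions for illustration. It shows that mean-reduced micro-batch losses, each scaled by `MICRO_BATCH_SIZE / BATCH_SIZE`, sum to the mean loss over the full batch. Since gradients are linear in the loss, the same identity carries over to accumulated gradients.

```python
import math

# Constants as in the updated example
MICRO_BATCH_SIZE = 6
ITERS_TO_ACCUMULATE = 4
NUM_PROCS = 2
BATCH_SIZE = MICRO_BATCH_SIZE * ITERS_TO_ACCUMULATE * NUM_PROCS

# Made-up per-datum losses standing in for the output of a loss function
per_datum_losses = [0.1 * (i + 1) for i in range(BATCH_SIZE)]

# Reference: mean loss over the full batch
full_batch_mean = sum(per_datum_losses) / BATCH_SIZE

# Accumulate scaled micro-batch losses, mimicking ``reduction="mean"``
accumulated = 0.0
for start in range(0, BATCH_SIZE, MICRO_BATCH_SIZE):
    micro = per_datum_losses[start : start + MICRO_BATCH_SIZE]
    micro_mean = sum(micro) / MICRO_BATCH_SIZE
    accumulated += micro_mean * MICRO_BATCH_SIZE / BATCH_SIZE

matches = math.isclose(accumulated, full_batch_mean)
```

Without the scaling factor, the accumulated value would be larger than the full-batch mean by a factor of `BATCH_SIZE / MICRO_BATCH_SIZE`, which is 8 with these constants.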
5 changes: 3 additions & 2 deletions docs/examples/example_05_structures.py
@@ -28,8 +28,9 @@
# $\mathbf{m}_\mathbf{K}$, while the second entry specifies the structure of
# $\mathbf{C}$ and its momentum $\mathbf{m}_\mathbf{C}$ (see the [paper](TODO
# Insert link to arXiv submission) for details). It is even possible to specify
-# structures on a per-layer basis (see [this](TODO Insert link to param groups
-# example) example).
+# structures on a per-layer basis (see
+# [this](https://singd.readthedocs.io/en/latest/generated/gallery/example_03_param_groups/)
+# example).
#
# The following structures are available:

7 changes: 7 additions & 0 deletions docs/interface.md
@@ -0,0 +1,7 @@
This section lists the interface for structured matrices, that is the operations
they need to implement to work in SINGD. It serves **for internal purposes
only**. This is useful for developers that wish to add a new structured matrix
class to the code that cannot be constructed with one of the available
templates.

::: singd.structures.base.StructuredMatrix
59 changes: 59 additions & 0 deletions docs/structures.md
@@ -0,0 +1,59 @@
Here we provide a list of structured matrices. This list is meant **for internal
purposes only**. It exists because it is more convenient to read the rendered
LaTeX code rather than the docstring source.

::: singd.structures.dense.DenseMatrix
options:
members:
- __init__

::: singd.structures.hierarchical.Hierarchical15_15Matrix
options:
members:
- __init__

# DIAGONAL

::: singd.structures.diagonal.DiagonalMatrix
options:
members:
- __init__

::: singd.structures.blockdiagonal.Block30DiagonalMatrix
options:
members:
- __init__

# LOWER-TRIANGULAR

::: singd.structures.triltoeplitz.TrilToeplitzMatrix
options:
members:
- __init__

::: singd.structures.trilbottomrightdiag.TrilBottomRightDiagonalMatrix
options:
members:
- __init__

::: singd.structures.triltopleftdiag.TrilTopLeftDiagonalMatrix
options:
members:
- __init__

# UPPER-TRIANGULAR

::: singd.structures.triutoeplitz.TriuToeplitzMatrix
options:
members:
- __init__

::: singd.structures.triubottomrightdiag.TriuBottomRightDiagonalMatrix
options:
members:
- __init__

::: singd.structures.triutopleftdiag.TriuTopLeftDiagonalMatrix
options:
members:
- __init__
24 changes: 24 additions & 0 deletions docs/templates.md
@@ -0,0 +1,24 @@
Here we provide a list of templates that can be used to create new structured
matrices. This list is meant **for internal purposes only**. It exists because
it is more convenient to read the rendered LaTeX code rather than the docstring
source.

::: singd.structures.blockdiagonal.BlockDiagonalMatrixTemplate
options:
members:
-

::: singd.structures.hierarchical.HierarchicalMatrixTemplate
options:
members:
-

::: singd.structures.recursive.RecursiveBottomLeftMatrixTemplate
options:
members:
-

::: singd.structures.recursive.RecursiveTopRightMatrixTemplate
options:
members:
-
5 changes: 2 additions & 3 deletions makefile
@@ -58,10 +58,9 @@ install-test:
.PHONY: test test-light

test:
-@pytest -vx --run-optional-tests=expensive --cov=singd test
-
+@pytest -vx --run-optional-tests=expensive --cov=singd --doctest-modules test singd
test-light:
-@pytest -vx --cov=singd test
+@pytest -vx --cov=singd --doctest-modules test singd

.PHONY: install-lint
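
With `--doctest-modules` and `singd` added to the test paths, pytest also collects examples embedded in the package's docstrings. A minimal sketch of what such a docstring could look like follows. The function `micro_batches_per_step` is hypothetical and not taken from the `singd` code base.

```python
import doctest


def micro_batches_per_step(batch_size: int, micro_batch_size: int) -> int:
    """Return how many micro-batches make up one batch.

    >>> micro_batches_per_step(48, 6)
    8
    >>> micro_batches_per_step(32, 6)
    Traceback (most recent call last):
        ...
    ValueError: batch_size must be divisible by micro_batch_size
    """
    if batch_size % micro_batch_size != 0:
        raise ValueError("batch_size must be divisible by micro_batch_size")
    return batch_size // micro_batch_size


if __name__ == "__main__":
    # Running the file directly checks the docstring examples as well
    doctest.testmod()
```

Each `>>>` line is executed and its output compared against the text that follows, so the examples double as documentation and as regression tests.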

6 changes: 5 additions & 1 deletion mkdocs.yml
@@ -10,6 +10,10 @@ nav:
- Code Examples: generated/gallery
- API Documentation: api.md
- Developer Notes: develop.md
+- Internal:
+- Structures: structures.md
+- Templates: templates.md
+- Interface: interface.md
theme:
name: material
features:
@@ -34,7 +38,7 @@ plugins:
options:
show_root_heading: true
show_source: true
-show_bases: false
+show_bases: true
show_signature_annotations: true
separate_signature: true
docstring_section_style: list
1 change: 1 addition & 0 deletions setup.cfg
@@ -78,6 +78,7 @@ doc =
mkdocstrings[python]==0.22.0
mkdocs-gallery==0.7.8
matplotlib # structure visualizations
+torchvision # MNIST

# Dependencies needed to run fine-tuning experiments
fine_tuning =
42 changes: 0 additions & 42 deletions singd/optim/accumulator.py

This file was deleted.
