Add pretrained MAE weights, option to load checkpoints in ViT builder #479

Closed
ebsmothers wants to merge 3 commits into main

Conversation

@ebsmothers (Contributor) commented Oct 4, 2023

Summary:
MAE fine-tuning is done on the encoder (ViT) only, so this change makes it easy to load MAE pretrained weights directly into our ViT class.
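
For context, a minimal sketch of the kind of loading this enables (the builder name, signature, and checkpoint handling below are illustrative assumptions, not the exact torchmultimodal API):

from typing import Optional

import torch
from torch import nn

# Stand-in for the real ViT encoder, which lives in
# torchmultimodal/modules/encoders/vision_transformer.py.
class TinyViT(nn.Module):
    def __init__(self, hidden_dim: int = 768) -> None:
        super().__init__()
        self.proj = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.proj(x)

# Hypothetical builder: construct the encoder and optionally load an
# MAE-pretrained checkpoint (URL or local path) straight into it.
def build_vit(ckpt_path: Optional[str] = None) -> nn.Module:
    model = TinyViT()
    if ckpt_path is not None:
        if ckpt_path.startswith(("http://", "https://")):
            state_dict = torch.hub.load_state_dict_from_url(ckpt_path, map_location="cpu")
        else:
            state_dict = torch.load(ckpt_path, map_location="cpu")
        model.load_state_dict(state_dict)
    return model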

Test plan:

python -m pytest -v tests/models/*
...
========== 207 passed, 25 warnings in 424.67s (0:07:04) ===========================

python -m pytest -v tests/modules/*
...
======================== 192 passed, 2 skipped, 22 warnings in 10.75s ==========================

Test instantiating ViT using MAE pretrained weights for each of the 3 checkpoints:

[Screenshot, 2023-10-05: successful instantiation for all three checkpoints]
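
(That spot-check amounts to roughly the loop below; the three checkpoint URLs are placeholders, since the exact names in the screenshot did not survive, and build_vit is the hypothetical builder sketched under the summary.)

# Placeholder URLs for the three MAE checkpoints referenced in the test plan;
# the real entries live in this PR's MAE_MODEL_MAPPING.
CHECKPOINTS = [
    "https://example.com/mae_vit_b_16.pth",
    "https://example.com/mae_vit_l_16.pth",
    "https://example.com/mae_vit_h_14.pth",
]

for url in CHECKPOINTS:
    model = build_vit(ckpt_path=url)  # hypothetical builder from the sketch above
    n_params = sum(p.numel() for p in model.parameters())
    print(f"{url}: loaded ViT with {n_params} parameters")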

@facebook-github-bot added the CLA Signed label on Oct 4, 2023.
@codecov-commenter commented Oct 4, 2023

Codecov Report

Attention: 7 lines in your changes are missing coverage. Please review.

Comparison is base (0de91e1) 72.21% compared to head (f488945) 72.18%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #479      +/-   ##
==========================================
- Coverage   72.21%   72.18%   -0.03%     
==========================================
  Files         187      187              
  Lines       13160    13174      +14     
==========================================
+ Hits         9503     9510       +7     
- Misses       3657     3664       +7     
Files                                                  | Coverage Δ
...hmultimodal/modules/encoders/vision_transformer.py | 52.72% <66.66%> (+0.80%) ⬆️
...orchmultimodal/models/masked_auto_encoder/model.py | 92.98% <45.45%> (-5.08%) ⬇️


@@ -20,6 +20,13 @@
)


MAE_MODEL_MAPPING = {
Contributor (reviewer):

Isn't it nicer to expose a vit_mae* wrapper with pretrained=True here itself, like CLIP, rather than making the user pass the checkpoint around?

@ebsmothers (Contributor, author):

Yeah, that's fine too. I thought it was a bit weird because we would then be mixing builders across files (i.e. we'd either have a builder for MAE inside vision_transformer.py or a builder returning ViT inside MAE model.py). But maybe the second option (I think that's what you're suggesting?) isn't so bad.
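
For concreteness, the CLIP-style wrapper being discussed would look roughly like this (the builder name, mapping contents, and URL are placeholders rather than the API that actually landed):

import torch
from torch import nn

# Placeholder mapping from model name to MAE checkpoint URL; the real
# MAE_MODEL_MAPPING in this diff holds the actual entries.
MAE_MODEL_MAPPING = {
    "vit_b_16": "https://example.com/mae_pretrain_vit_base.pth",
}

# Hypothetical wrapper: the user asks for a pretrained ViT by name with
# pretrained=True and never handles the checkpoint path directly.
def vit_b_16_mae(pretrained: bool = True) -> nn.Module:
    model = nn.Sequential(nn.Linear(768, 768))  # stand-in for the real ViT builder
    if pretrained:
        state_dict = torch.hub.load_state_dict_from_url(
            MAE_MODEL_MAPPING["vit_b_16"], map_location="cpu"
        )
        model.load_state_dict(state_dict)
    return model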

@facebook-github-bot commented:
@ebsmothers has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot commented:
@ebsmothers merged this pull request in 6f32ca1.
