
Support returning hidden state in CLIPTextEncoder #442

Closed · wants to merge 3 commits

Conversation


@22quinn (Contributor) commented Aug 4, 2023

Summary:
Added a return_hidden_state arg to CLIPTextEncoder. If set to True, forward returns a tuple of the final embedding and the last hidden state.

Test plan:
pytest tests/models/clip/test_text_encoder.py

cc @abhinavarora. Moved the code here for easier review and testing.
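
For context, a rough usage sketch of the behaviour described in the summary (flag placement follows the summary as written; the reviewers below suggest passing it to forward instead, so the merged API may differ):

import torch
from torchmultimodal.models.clip.text_encoder import CLIPTextEncoder

# All constructor arguments other than return_hidden_state are left at their defaults here.
encoder = CLIPTextEncoder(return_hidden_state=True)
# [batch_size, context_length] token ids; the shape assumes the default context length of 77
text = torch.randint(0, 100, (2, 77))
embeddings, hidden_state = encoder(text)
# embeddings: [batch_size, embedding_dim]; hidden_state: [batch_size, context_length, width]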

@facebook-github-bot added the CLA Signed label on Aug 4, 2023 (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed).
@codecov-commenter commented Aug 7, 2023

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.05% 🎉

Comparison is base (81e281c) 68.72% compared to head (977e72a) 68.77%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #442      +/-   ##
==========================================
+ Coverage   68.72%   68.77%   +0.05%     
==========================================
  Files         169      169              
  Lines       11374    11393      +19     
==========================================
+ Hits         7817     7836      +19     
  Misses       3557     3557              
Files Changed                                  Coverage Δ
tests/models/clip/test_text_encoder.py         100.00% <100.00%> (ø)
torchmultimodal/models/clip/text_encoder.py    100.00% <100.00%> (ø)


        )
        # embeddings now has size [bs, embedding_dim]

        if self.return_hidden_state:
            return embeddings, hidden_state
nit: use named tuple
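
A possible shape for that suggestion (a sketch only; the class and field names here are illustrative, not taken from the PR):

from typing import NamedTuple

from torch import Tensor


class CLIPTextEncoderOutput(NamedTuple):
    # Projected EOT-token embedding, [batch_size, embedding_dim]
    embeddings: Tensor
    # Output of the final LayerNorm, [batch_size, context_length, width]
    hidden_state: Tensor

forward could then end with return CLIPTextEncoderOutput(embeddings=embeddings, hidden_state=hidden_state), which keeps tuple unpacking working while making the fields self-documenting.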

assert isinstance(text_encoder, torch.nn.Module)

actual_clip_init, actual_hidden_state = text_encoder(text)
print(actual_hidden_state)
remove

expected_hidden_state = torch.Tensor(
    [
        [
            [6.348165e-01, -4.137459e-02, -1.604239e00, 1.010798e00],
nit: can we limit to 4 decimal places given atol is being set
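
For illustration, rounding the expected values to four decimals still passes a comparison with an absolute tolerance of 1e-4 (values taken from the first row above; the tolerance used in the actual test may differ):

import torch

actual = torch.tensor([6.348165e-01, -4.137459e-02, -1.604239e00, 1.010798e00])
expected = torch.tensor([0.6348, -0.0414, -1.6042, 1.0108])
# Passes: every element differs from its four-decimal rounding by well under 1e-4.
torch.testing.assert_close(actual, expected, rtol=0.0, atol=1e-4)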

@@ -77,6 +82,8 @@ def __init__(
        if use_clip_init:
            self.initialize_parameters()

        self.return_hidden_state = return_hidden_state
you can make this an arg to fwd

@abhinavarora (Contributor) commented Aug 7, 2023
+1 to what Ankita said! We don't need this to be a module-level property.
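
A minimal, self-contained sketch of that pattern, with the flag passed to forward instead of stored on the module (a toy module for illustration, not the actual CLIPTextEncoder code):

from typing import Tuple, Union

from torch import Tensor, nn


class EncoderWithOptionalHiddenState(nn.Module):
    """Toy module illustrating the suggestion; layers and shapes are placeholders."""

    def __init__(self, width: int = 8, embedding_dim: int = 4) -> None:
        super().__init__()
        self.ln_final = nn.LayerNorm(width)
        self.projection = nn.Linear(width, embedding_dim, bias=False)

    def forward(
        self, features: Tensor, return_hidden_state: bool = False
    ) -> Union[Tensor, Tuple[Tensor, Tensor]]:
        # features: [batch_size, context_length, width]
        hidden_state = self.ln_final(features)
        # Pool the last position as a simple stand-in for the EOT-token selection.
        projected_embeddings = self.projection(hidden_state[:, -1, :])
        if return_hidden_state:
            return projected_embeddings, hidden_state
        return projected_embeddings

Calling enc(x) returns only the embeddings, while enc(x, return_hidden_state=True) returns both, so no state needs to live on the module.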

@@ -120,11 +127,13 @@ def forward(self, text: Tensor) -> Tensor:

        # [n_ctx, bs, transformer.width] -> [bs, n_ctx, transformer.width]
        embeddings = torch.permute(embeddings, (1, 0, 2))
-       embeddings = self.ln_final(embeddings)
+       hidden_state = self.ln_final(embeddings)
        # take features from the eot embedding (the highest number in each sequence)
        embeddings = self.projection(
I would suggest renaming this to something like projected_embeddings to avoid re-using the embeddings name as it was used before.
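
Putting the two suggestions together, the tail of forward might then read roughly as follows (an illustrative fragment, not the merged diff):

        hidden_state = self.ln_final(embeddings)
        # take features from the eot embedding (the highest number in each sequence)
        projected_embeddings = self.projection(
            ...  # same EOT-token selection as before, applied to hidden_state
        )
        if return_hidden_state:
            return projected_embeddings, hidden_state
        return projected_embeddings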

@22quinn (Contributor, Author) commented Aug 8, 2023

Thanks both for the review! I've addressed all comments. buck test looks good.

buck test @mode/dev-nosan //torchmultimodal/tests
Tests finished: Pass 317. Fail 0. Fatal 0. Skip 6. Build failure 0

@ankitade (Contributor) commented Aug 8, 2023

Can you import the diff and land it there?

@facebook-github-bot commented

@abhinavarora has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot commented

@abhinavarora merged this pull request in a1cc8f3.

Labels: CLA Signed, Merged · 5 participants