Add support for Whisper #693

TimoImhof · 2024-04-30T13:51:20Z

This PR adds adapter support for the Whisper model from openai and builds upon work done previously in #572.

Key Additions:

1. Adapter Support for Whisper Model:
   - Incorporates adapter functionality to enhance the flexibility and adaptability of the Whisper model.
2. AudioClassificationHead Class:
   - Introduces a new AudioClassificationHead class.
   - Note: This class is currently not utilized by any model. The WhisperForAudioClassification static model only uses the Whisper Encoder, making it incompatible with encoder-decoder AdapterModels. I am open to feedback on whether to retain or remove this class.
3. Enhanced Head Functions:
   - Expanded the argument options for some heads by adding a layer argument with a default value.
   - This change allows for greater customization, but I'm unsure if the previous limitations were intentional. Feedback on this modification is welcome.
4. Preprocessing Scripts for Audio Datasets:
   - Added preprocessing scripts tailored for audio datasets.
   - These scripts are now utilized in the Whisper tests within the test suite, replacing the use of arbitrary samples.
   - I am open to suggestions on whether to retain these scripts.

calpt

This looks good overall, thanks so much for working on it!

Did a first pass & left a couple of comments, mainly questions to understand the changes you made.

docs/model_overview.md

src/adapters/head_utils.py

src/adapters/models/whisper/modeling_whisper.py

tests/test_adapter_heads.py

tests/test_adapter.py

tests/methods/base.py

src/adapters/heads/language_modeling.py

src/adapters/model_mixin.py

tests/test_adapter_embeddings.py

lenglaender

Looks good! Have you already trained an adapter on a task to see that our implementation yields the expected results?

src/adapters/head_utils.py

src/adapters/model_mixin.py

calpt

lgtm

calpt · 2024-08-04T08:44:54Z

src/adapters/methods/reft.py

+        # if cached indexing matrices are computed for different hidden_states size -> recompute
+        cache_invalidated = False
+        if hasattr(context, "pref_idx") and hasattr(context, "suff_idx"):
+            cache_invalidated = context.suff_idx.size(1) != seq_len


great catch! should we check the full shape here since bsz and ddim might also change?

Yes, then we have all potential cases covered 👍

However, I have now realized that my checking logic was not correct; the indexing matrices and hidden_states never have the same size at dim 1:

- hidden_states.size(1) is the sequence length
- suff_idx.size(1) is the number of positions

We need to check the actual values of suff_idx to see whether any indexing values are out of bounds. I adapted the logic and added checks for the remaining dimensions as well.
Once the tests pass locally, I will push the changes for review.
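To make the revised check concrete, here is a minimal sketch of the idea in plain Python. It is not the code in src/adapters/methods/reft.py: the `CachedIndex` class and the `build_suffix_index` and `cache_invalidated` helpers are hypothetical stand-ins that only track the shape and the largest index of a cached matrix, and the sketch assumes hidden states of shape (batch size, sequence length, hidden dim).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class CachedIndex:
    """Hypothetical summary of a cached indexing matrix such as suff_idx."""

    shape: Tuple[int, int, int]  # (batch size, number of positions, hidden dim)
    max_index: int  # largest sequence position the matrix points at


def build_suffix_index(hidden_shape: Tuple[int, int, int], n_positions: int) -> CachedIndex:
    """Describe an index over the last n_positions tokens of each sequence."""
    bsz, seq_len, dim = hidden_shape
    n = min(n_positions, seq_len)
    return CachedIndex(shape=(bsz, n, dim), max_index=seq_len - 1)


def cache_invalidated(cached: Optional[CachedIndex], hidden_shape: Tuple[int, int, int]) -> bool:
    """Return True if the cached index cannot be reused for hidden_shape."""
    if cached is None:
        return True
    bsz, seq_len, dim = hidden_shape
    cached_bsz, _, cached_dim = cached.shape
    # Dim 1 is deliberately not compared: for the index it holds the number
    # of positions, for the hidden states it holds the sequence length.
    if cached_bsz != bsz or cached_dim != dim:
        return True
    # An index pointing at or past seq_len would be out of bounds.
    return cached.max_index >= seq_len


cached = build_suffix_index((2, 10, 16), n_positions=3)
print(cache_invalidated(cached, (2, 10, 16)))  # same hidden_states shape
print(cache_invalidated(cached, (2, 5, 16)))   # shorter sequence
print(cache_invalidated(cached, (4, 10, 16)))  # different batch size
```

The sketch covers only the cases discussed in this thread: a changed batch size or hidden dim, and index values that fall outside the new sequence length.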

WIP: adding a tutorial notebook for the Whisper support being worked on in #693. Feedback is always appreciated!
Co-authored-by: TimoImhof <62378375+TimoImhof@users.noreply.github.com>

TimoImhof and others added 21 commits April 2, 2024 14:37

save current progress:

ad9fe2b

- setup import structure and files
- implement mixins
- implement WithAdapter classes for modeling

Merge branch 'adapter-hub:main' into dev/whisper

7ebb5e8

Implement WhisperAdapterModel:

1e10398

- create WhisperSdpaAttentionAdapters module
- create WhisperAdapterModel
- make style
- add whisper to CONFIG_CLASS_KEYS_MAPPING
- add whisper to model importstructure

add logger for Attention module

c775d43

Add Whisper model to documentation

23f1c68

Add WhisperDecoderWrapperAdaptersMixin:

f4b3df8

- make style
- add proof of concept head to head_utils
- add whisper to parallel composition white list

Add tests

dc40973

save progress

a0a89ed

save progress

f36133c

overwrite get_input_samples method to fix tests requiring simple input and add documentation

f011bb3

add support for speech samples with "input_features" as tensor name

08464c2

fix wrong input argument

32a6434

upload dev files for experiments

d13b6d3

upload dev files for experiments

f5e4269

update SpeechTestBase

70f7651

Add copy info and add flash attention

e38cd9e

Changes:

8587f0f

- remove redundant files
- add minor modifications

Changes:

909fecb

- reorganize dev dir
- experiment more with training
- Problem: dimensions of input samples

Delete dev dir

fcaa21e

add TODOS

9bd7065

make method more general

25171bb

TimoImhof mentioned this pull request Apr 30, 2024

[WIP] Added Adapters to Whisper Model from openai #572

Closed

TimoImhof changed the title ~~[WIP] Add Support for Whisper~~ [WIP] Add support for Whisper Apr 30, 2024

calpt linked an issue Apr 30, 2024 that may be closed by this pull request

Support for openai Whisper #466

Closed

3 tasks

TimoImhof added 6 commits May 2, 2024 11:28

add methods necessary for head usage

6405be4

Add TODO

c52f0c3

remove redundant code

cc00ed6

add comment & enable all tests

24f72d6

Add special check for vision models

182f5a5

make style

e44a482

TimoImhof requested a review from lenglaender June 19, 2024 20:58

calpt reviewed Jun 29, 2024

julian-fong mentioned this pull request Jul 9, 2024

Adding a notebook for adapters whisper support #717

Merged

TimoImhof and others added 8 commits July 9, 2024 23:27

Update src/adapters/model_mixin.py

af2ddc2

Co-authored-by: calpt <calpt@mail.de>

Apply suggestions

57b411a

Merge remote-tracking branch 'origin/dev/whisper' into dev/whisper

61f3742

Merge branch 'main' into dev/whisper

f01b51e

Fix failing test and refactor speech model case handling

1f8573f

Fix failing test

8d04de1

Fix overwriting arguments

5b41382

make style

327381e

lenglaender reviewed Jul 17, 2024

src/adapters/head_utils.py (outdated, resolved)

src/adapters/model_mixin.py (outdated, resolved)

TimoImhof and others added 4 commits July 24, 2024 20:52

Address remaining comments, fix conversion test, correct documentation, fix seq2seq trainer bug

a30bb6c

Revert forward function signature modification

ad47696

Merge branch 'adapter-hub:main' into dev/whisper

6111b07

make style

53b9cd9

TimoImhof requested review from calpt and lenglaender July 30, 2024 19:07

TimoImhof added 2 commits July 30, 2024 21:20

Remove redundant head - not supported by any model

7526514

Add Future TODO for seq2seqtrainer

0588db6

calpt approved these changes Aug 1, 2024

TimoImhof added 3 commits August 3, 2024 13:05

Merge branch 'refs/heads/main' into dev/whisper

1751a25

# Conflicts: # tests/test_adapter_heads.py

Incorporate pyreft tests

08911d8

Add check for changing hidden_states size

88bd867

calpt reviewed Aug 4, 2024

TimoImhof added 3 commits August 4, 2024 17:23

Adapt checking logic

3de2581

Merge branch 'refs/heads/main' into dev/whisper

89377f8

Fix attention classes and generation

5f4f20c

TimoImhof merged commit a99e47c into adapter-hub:main Aug 8, 2024
4 checks passed
