Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for Whisper #693

Merged
merged 72 commits into from
Aug 8, 2024
Merged

Add support for Whisper #693

merged 72 commits into from
Aug 8, 2024

Conversation

TimoImhof
Copy link
Contributor

@TimoImhof TimoImhof commented Apr 30, 2024

This PR adds adapter support for the Whisper model from openai and builds upon work done previously in #572.

Key Additions:

  1. Adapter Support for Whisper Model:

    • Incorporates adapter functionality to enhance the flexibility and adaptability of the Whisper model.
  2. AudioClassificationHead Class:

    • Introduces a new AudioClassificationHead class.
    • Note: This class is currently not utilized by any model. The WhisperForAudioClassification static model only uses the Whisper Encoder, making it incompatible with encoder-decoder AdapterModels. I am open to feedback on whether to retain or remove this class.
  3. Enhanced Head Functions:

    • Expanded the argument options for some heads by adding a layer argument with a default value.
    • This change allows for greater customization, but I'm unsure if the previous limitations were intentional. Feedback on this modification is welcome.
  4. Preprocessing Scripts for Audio Datasets:

    • Added preprocessing scripts tailored for audio datasets.
    • These scripts are now utilized in the Whisper tests within the test suite, replacing the use of arbitrary samples.
    • I am open to suggestions on whether to retain these scripts.

TimoImhof and others added 21 commits April 2, 2024 14:37
- setup import structure and files
- implement mixins
- implement WithAdapter classes for modeling
- create WhisperSdpaAttentionAdapters module
- create WhisperAdapterModel
- make style
- add whisper to CONFIG_CLASS_KEYS_MAPPING
- add whisper to model importstructure
- make style
- add proof of concept head to head_utils
- add whisper to parallel composition white list
- remove redundant files
- add minor modifications
- reorganize dev dir
- experiment more with training
   - Problem: dimensions of input samples
@TimoImhof TimoImhof changed the title [WIP] Add Support for Whisper [WIP] Add support for Whisper Apr 30, 2024
@calpt calpt linked an issue Apr 30, 2024 that may be closed by this pull request
3 tasks
@TimoImhof TimoImhof requested a review from lenglaender June 19, 2024 20:58
Copy link
Member

@calpt calpt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good overall, thanks so much for working on it!

Did a first pass & left a couple of comments, mainly question to understand changes you made.

docs/model_overview.md Outdated Show resolved Hide resolved
src/adapters/head_utils.py Outdated Show resolved Hide resolved
tests/test_adapter_heads.py Outdated Show resolved Hide resolved
tests/test_adapter_heads.py Outdated Show resolved Hide resolved
tests/test_adapter.py Outdated Show resolved Hide resolved
tests/methods/base.py Outdated Show resolved Hide resolved
src/adapters/heads/language_modeling.py Show resolved Hide resolved
src/adapters/model_mixin.py Outdated Show resolved Hide resolved
tests/test_adapter_embeddings.py Outdated Show resolved Hide resolved
Copy link
Member

@lenglaender lenglaender left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Have you already trained an adapter on a task to see that our implementation yields the expected results?

src/adapters/head_utils.py Outdated Show resolved Hide resolved
src/adapters/model_mixin.py Outdated Show resolved Hide resolved
@TimoImhof TimoImhof requested review from calpt and lenglaender July 30, 2024 19:07
Copy link
Member

@calpt calpt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

# if cached indexing matrices are computed for different hidden_states size -> recompute
cache_invalidated = False
if hasattr(context, "pref_idx") and hasattr(context, "suff_idx"):
cache_invalidated = context.suff_idx.size(1) != seq_len
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great catch! should we check the full shape here since bsz and ddim might also change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, then we have all potential cases covered 👍

Copy link
Contributor Author

@TimoImhof TimoImhof Aug 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However I now realized that my checking logic was not correct; the indixing matrices and hidden_states do never have the same value at dim1:

  • the hidden_states[1] represent the sequence length
  • the suff_idx[1] represents the number of positions

We need to check for the actual values of suff_idx to see if the indexing values are out of bounds. I adapted the logic and added checks for the residual dimensions as well.
When the tests passed locally I will push the changes for review

@TimoImhof TimoImhof merged commit a99e47c into adapter-hub:main Aug 8, 2024
4 checks passed
TimoImhof added a commit that referenced this pull request Aug 9, 2024
WIP - Adding a tutorial notebook for whisper support being worked on
#693. Feedback is always appreciated!

---------

Co-authored-by: TimoImhof <62378375+TimoImhof@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support for openai Whisper
3 participants