FIX: More VeRA tests, fix tests, more checks #1900

Conversation

BenjaminBossan
Member

  • Fixes incorrect config for VeRA in a test
  • Add VeRA to multi-adapter tests
  • Add more checks on the VeRA A/B shapes

The latter becomes necessary when we add more than one VeRA adapter. The shapes for VeRA A and B are only determined once, when the first VeRA adapter is created. After that, they are fixed. However, users may add a second VeRA adapter. As long as that adapter targets the same layers and has the same rank, we're good. But if it targets other, bigger layers, or if it has a higher rank, the shapes of VeRA A and/or VeRA B will be too small, resulting in an error during the forward pass. To prevent this, we now check the shapes already during initialization of the new adapter and raise an error right away.
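Below is a minimal sketch of the scenario the new check guards against. It uses a toy module and hypothetical names; the exception type and exact message are assumptions based on the test snippets further down in this thread, not a verbatim copy of the implementation.

```python
# Minimal sketch (toy model, hypothetical names; exception type and message
# are illustrative, not guaranteed to match the implementation exactly).
import torch.nn as nn
from peft import VeraConfig, get_peft_model


class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.lin0 = nn.Linear(10, 10)

    def forward(self, x):
        return self.lin0(x)


# The first VeRA adapter determines the shapes of the shared vera_A/vera_B
# buffers; with r=8 they are sized for rank 8.
config0 = VeraConfig(r=8, target_modules=["lin0"], init_weights=False)
peft_model = get_peft_model(MLP(), config0)

# A second adapter with a larger rank (or targeting bigger layers) would need
# larger shared buffers; instead of failing later during the forward pass, the
# new check raises as soon as the adapter is added.
config1 = VeraConfig(r=16, target_modules=["lin0"], init_weights=False)
try:
    peft_model.add_adapter("second", config1)
except ValueError as exc:
    print(exc)  # e.g. "vera_A has a size of 8 but 16 or greater is required"
```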

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@BenjaminBossan
Member Author

Could you please review @dkopi?

Contributor

@dkopi left a comment

Looks good, I've only added a few comments regarding the raised error message.

To fix this problem in VeRA properly, I think the simplest solution would be to add optional config arguments that allow specifying the shared A/B shapes manually; the other way would be to create separate "shared A/B" matrices for each adapter.
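As a purely hypothetical illustration of the first option (these fields do not exist in the actual VeraConfig), the idea would be something along these lines:

```python
# Hypothetical sketch only: optional fields that pre-size the shared
# vera_A / vera_B buffers so that adapters added later still fit.
from dataclasses import dataclass
from typing import Optional


@dataclass
class HypotheticalVeraConfig:
    r: int = 256
    # upper bounds used when allocating the shared buffers for the first adapter
    shared_r: Optional[int] = None             # rank dimension of vera_A / vera_B
    shared_in_features: Optional[int] = None   # second dimension of vera_A
    shared_out_features: Optional[int] = None  # first dimension of vera_B
```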

@@ -100,6 +100,23 @@ def update_layer(
# we can take any of the existing adapter's parameters, as they should all be identical
vera_A_param = list(self.vera_A.values())[0]
vera_B_param = list(self.vera_B.values())[0]

error_tmpl = (
"{} has a size of {} but {} is or greater required; this probably happened because an additional VeRA "
Contributor

a typo - should be "{} has a size of {} but {} or greater is required [...]", right?

Contributor

We could also have separate messages for an incompatible r dimension vs. incompatible input/output dimensions; we could also give a hint that adding the adapters in a different order may resolve the problem.


model = get_peft_model(self.get_model(), config0)
# not full message but enough to identify the error
msg = "vera_A has a size of 10 but 20 is or greater required"
Contributor

same here


model = get_peft_model(self.get_model(), config0)
# not full message but enough to identify the error
msg = "vera_A has a size of 123 but 456 is or greater required"
Contributor

same here

@BenjaminBossan
Member Author

BenjaminBossan commented Jul 4, 2024

@dkopi Thanks for the helpful comments, please check if I have addressed them sufficiently. CI is failing for unrelated reasons.

Also a thought that I had:

With your recent fix to the shape issue, we should be able to allow some models that formerly didn't work, right?

# "btlm": ["c_proj", "c_attn"], # tested, does not work because of different shapes
"codegen": ["qkv_proj"],
# "mistral": ["q_proj", "v_proj"], # tested, does not work because of different shapes
# "mixtral": ["q_proj", "v_proj"], # tested, does not work because of different shapes
"stablelm": ["q_proj", "v_proj"],
# "phi": ["q_proj", "v_proj", "fc1", "fc2"], # tested, does not work because of different shapes
"phi": ["q_proj", "v_proj"],
# "gemma": ["q_proj", "v_proj"], # tested, does not work because of different shapes

I cannot test at the moment, but I think we can uncomment them.

@dkopi
Contributor

dkopi commented Jul 4, 2024

Yeah, some or all of them should work now;

I've already tested adapting all linear layers of gemma and phi and it worked

@BenjaminBossan
Member Author

@matthewdouglas LMK if you feel ready for reviewing this PR.

Member

@sayakpaul left a comment

Thanks. Very minor comments.

Comment on lines +471 to +475
"VeRA Same",
"vera",
VeraConfig,
{"target_modules": ["lin0"], "init_weights": False},
{"target_modules": ["lin0"], "init_weights": False},
Member

So essentially, these two adapters ("VeRA Same" and "vera") are the same. Right?

Member Author

"VeRA Same" is just the name of the test, "vera" is the name of the method being used. I added a comment at the top that describes the items of this tuple for clarification.

The two adapters here have the same config, but they are two different adapters.
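For reference, a hedged reconstruction of what the tuple items mean, based on the lines quoted above (the clarifying comment actually added in the PR may be worded differently):

```python
from peft import VeraConfig

# (test case name, PEFT method name, config class, kwargs for adapter 0, kwargs for adapter 1)
multi_adapter_case = (
    "VeRA Same",  # display name of the test case
    "vera",       # name of the PEFT method being tested
    VeraConfig,   # config class used to build both adapters
    {"target_modules": ["lin0"], "init_weights": False},  # config kwargs for the first adapter
    {"target_modules": ["lin0"], "init_weights": False},  # config kwargs for the second adapter
)
```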


model = get_peft_model(self.get_model(), config0)
# not full message but enough to identify the error
msg = "vera_A has a size of 123 but 456 or greater is required"
Member

Can we programmatically determine 123 and 456?

Member Author

Changed the code to make it obvious where the numbers come from.
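One plausible way to make the numbers self-explanatory (a sketch only; the actual change in the PR may differ) is to define the ranks once and reuse them in both the configs and the expected message. This reuses `self.get_model()` from the test class quoted above and assumes the raised error is a `ValueError`:

```python
import pytest
from peft import VeraConfig, get_peft_model

r0, r1 = 123, 456
config0 = VeraConfig(r=r0, target_modules=["lin0"], init_weights=False)
config1 = VeraConfig(r=r1, target_modules=["lin0"], init_weights=False)
model = get_peft_model(self.get_model(), config0)
# not the full message, but enough to identify the error
msg = f"vera_A has a size of {r0} but {r1} or greater is required"
with pytest.raises(ValueError, match=msg):
    model.add_adapter("other", config1)
```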


model = get_peft_model(self.get_model(), config0)
# not full message but enough to identify the error
msg = "vera_A has a size of 10 but 20 or greater is required"
Member

Same. Can we programmatically determine the magic numbers here?

Member Author

Changed the code to make it obvious where the numbers come from.

@BenjaminBossan
Member Author

@sayakpaul Your points should be addressed, LMK if you want to do another review or if the PR can be merged.

@sayakpaul merged commit ba75bb1 into huggingface:main Jul 22, 2024
14 checks passed
@sayakpaul
Member

Thanks for clarifying the comments.

@BenjaminBossan deleted the fix-vera-tests-and-additional-shape-checks branch July 22, 2024 14:27