Integrate X-LoRA #1491

Merged: 190 commits, Jul 5, 2024

Conversation

@EricLBuehler (Contributor) commented Feb 20, 2024

Paper link: https://arxiv.org/abs/2402.07148

This PR integrates X-LoRA by creating a new tuner model type on the level of LoraModel. Please see #1472.

Changes

Although the new model type is a subclass of LoraModel, this is only an implementation detail to remove the need for nested PeftModels. In a similar vein, I have updated the signatures of the tuner __init__ functions to allow the method swapping and to ensure that XLoraModel is a tuner and not on the "level" of PeftModel.

  • Update the signatures of the tuner __init__ functions to take a back reference (not used by all tuners, only XLoraModel).
  • Implement and export XLoraModel and XLoraConfig.
  • The special API for X-LoRA is located in XLoraModel (a rough usage sketch follows this list).
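
A rough usage sketch of the intended API (the adapter paths and config fields below are placeholders, and the exact fields may differ from what is finally merged):

from transformers import AutoModelForCausalLM
from peft import XLoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
config = XLoraConfig(
    task_type="CAUSAL_LM",
    hidden_size=base.config.hidden_size,
    adapters={"0": "./lora_adapter_0", "1": "./lora_adapter_1"},  # paths to separately trained LoRA adapters (placeholders)
)
model = get_peft_model(base, config)  # wraps the base model with XLoraModel as the tuner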

Status

  • Instantiate a LoraModel somewhere before layer swapping in XLoraModel.__init__
  • Fully integrate.
  • Test implementation.
  • Add documentation for methods.

@BenjaminBossan (Member)

Let me know once this is ready for review.

@EricLBuehler (Contributor, Author)

Hi @BenjaminBossan, I think this is ready for review.

@BenjaminBossan (Member) left a comment

Thanks so much for working on this PR. I did not do an in-depth review yet, but from what I saw it already looks very solid.

Before working on the details, I think it's best to get the overall design polished. For this, I still have a few questions, please check out my comments. Together, I'm sure we can figure out a way to simplify some of the code.

Also, it would really help to have examples and/or tests to see X-LoRA in action. Do you have something that you could add? Later, we should also work on documentation, but this can wait for now.

Edit: For other readers, here is the paper link: https://arxiv.org/abs/2402.07148

(Several inline review comments on src/peft/tuners/xlora/insertion.py and src/peft/tuners/xlora/model.py were marked as resolved.)

        npy = result.numpy()
        numpy.save(file, npy)

    def flush_log_scalings(self, path: str):

Member:

Could you explain why we need this? Is this just for debugging/monitoring purposes?

Contributor Author:

This API enables the user to get a log of the scalings. It is useful for generating visualizations such as this.

Member:

I see. In that case, there should be some kind of example provided to illustrate how to use this; otherwise, I think users will not discover this feature.

Also, I'd suggest for this method to only return the indices_map and allow the caller to decide themselves if they want to save it on disk as json or do something else with it.

Contributor Author:

Wouldn't returning the indices_map (or perhaps the seqlens_map, as it contains the tensors) make flush_log_scalings redundant with get_scalings_log? Of course, a dedicated method to calculate the indices_map may be helpful, and with such a method added I think removing flush_log_scalings would be OK.

Additionally, there is a wrapper method on XLoraModel, alongside the rest of the API, which contains a docstring, so it should be easy to find. The method on the classifier is internal, and I have now prefixed it with _.
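
For illustration, a hedged sketch of how the scalings-logging wrapper API is meant to be used (model and inputs are assumed to be an X-LoRA model and tokenized inputs; the method names follow this discussion and the xlora package, so they may differ slightly in the merged version):

model.enable_scalings_logging()                        # start recording classifier scalings
outputs = model.generate(**inputs, max_new_tokens=20)
scalings_log = model.get_scalings_log()                # list of scaling tensors recorded during generation
model.disable_scalings_logging()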

@@ -149,7 +155,7 @@ def active_adapters(self) -> list[str]:
         adapters = self.active_adapter
         if isinstance(adapters, str):
             adapters = [adapters]
-        return adapters
+        return list(filter(lambda x: len(x) > 0, adapters))

Member:

Not sure why this is needed. Is this because of self.active_adapter = ""? Could you please explain the reason behind that?

Contributor Author:

During some of the testing I did, self.active_adapters would return a function instead of executing the @property. Through debugging, I discovered that it was somehow connected to self.active_adapter not being set, as no adapter is initially loaded.

However, I could instead allow the default adapter to be loaded and then delete it later; that should make these changes obsolete. I have pushed a commit that hopefully resolves this.

@@ -123,7 +129,7 @@ def __init__(self, model: PreTrainedModel, peft_config: PeftConfig, adapter_name
         else:
             self._peft_config = None
             cls = PEFT_TYPE_TO_MODEL_MAPPING[peft_config.peft_type]
-            self.base_model = cls(model, {adapter_name: peft_config}, adapter_name)
+            self.base_model = cls(model, {adapter_name: peft_config}, adapter_name, self)

Member:

I really want to avoid this, as this looks like a mixing of concerns. Surely, we can figure out a better way. Could you explain why this was needed?

Contributor Author:

This was needed because in XLoraModel.__init__ we load adapters into the PeftModel via load_adapter, for use by the XLoraClassifier. This way, automatic loading is achieved. However, all that matters is that we are able to get some reference to the PeftModel into the constructor.

Perhaps we could add some code after we call cls(model, ...) to check whether self.base_model is an XLoraModel. Then, we could call a __post_init__ which would take the PeftModel self and do the adapter loading. Would this be more elegant?
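
A hypothetical sketch of that proposal (the method name and the check are as described above, not final PR code):

self.base_model = cls(model, {adapter_name: peft_config}, adapter_name)
if isinstance(self.base_model, XLoraModel):
    # hand the tuner a reference to this PeftModel so it can load the X-LoRA adapters
    self.base_model.__post_init__(self)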

Contributor Author:

I just pushed a commit (c5cdfc3) which is the latest one currently for easy reversion. Does this resolve the concern?

(Further inline review comments on src/peft/peft_model.py, src/peft/tuners/xlora/classifier.py, and src/peft/tuners/xlora/model.py were marked as resolved.)

    model_peft.get_nb_trainable_parameters,
    model_peft.generate,
)
model_peft.save_pretrained = peft_model_wrapper.save_pretrained  # type: ignore

Member:

I don't think I fully understand this yet. Above, we call:

self.base_model.__xlora_post_init__(model, peft_config, adapter_name)

but here it looks like this method requires 4 arguments. What is model_peft? Is this the PeftModel?

So what I suspect is happening here is that you want to modify the behavior of generate and save_pretrained of the PeftModel without actually modifying PeftModel, is this the goal?

When it comes to generate, the PeftModel calls generate on the base_model anyway, could we not add the modification at that point?

When it comes to save_pretrained, I think we could check if we can somehow make the two approaches work together, I'd need to understand what the custom save_pretrained method does differently.

One easy way that we could allow custom save_pretrained methods without a lot of changes in PeftModel would be something like this:

class PeftModel:
    def save_pretrained(...):
        if hasattr(self.base_model, "_custom_save_pretrained"):
            return self.base_model._custom_save_pretrained(...)
        # same code as right now

This way, we only added 2 extra lines in PeftModel but would allow the underlying model to implement their own save_pretrained method.

# TODO(EricLBuehler): Evaluate effectiveness and performance degradation
self.peft_model.base_model.eval()
if not self.config.use_trainable_adapters:
    for name, param in self.peft_model.base_model.named_parameters():

Member:

This looks strange to me, generate should normally not have anything to do with parameter updates. Could you explain why this is required?

Contributor Author:

Certainly! We discovered during training that the adapters were being set to trainable after each generate call. If you could provide any insight into why that may be the case, that would be great!
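
For context, a hedged reconstruction of what the loop in the snippet above goes on to do (the "lora_" name filter and the requires_grad flip are assumptions, not the exact PR code):

self.peft_model.base_model.eval()
if not self.config.use_trainable_adapters:
    for name, param in self.peft_model.base_model.named_parameters():
        if "lora_" in name:
            # re-freeze adapter weights that were unexpectedly flipped to trainable
            param.requires_grad = False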

(Inline comment on src/peft/tuners/xlora/model.py marked as resolved.)

@BenjaminBossan (Member)

We discovered during training that the adapters were being set to trainable after each generate call.

This should certainly not happen. Could you please open an issue and provide an example so that we can fix it?

Regarding the state of the PR, it's actually a bit hard for me to tell which comments have been addressed and which haven't (as GH removes them when the line has been changed, even if it may not have been addressed). Regardless of some designs which I think could be improved, I think the most efficient way forward would be if you could provide an example/tests so that I can see X-LoRA in action. This may help answer some questions I still have.

@EricLBuehler (Contributor, Author)

Thank you, I will work on some example code to reproduce the behavior and will raise an issue.

Regarding examples of X-LoRA, we have put together some examples using the xlora package API here. Although this is slightly different from the XLoraModel proposed here (e.g., add_xlora_to_model and from_pretrained, which correspond to get_peft_model and PeftModel respectively in this PR), it documents the same methods.

I hope this will work as a demonstration of X-LoRA in action. Would it be better for me to provide some examples using this PR?

@BenjaminBossan (Member)

Regarding examples of X-LoRA, we have put together some examples using the xlora package API here. Although this is slightly different from the XLoraModel proposed here (e.g., add_xlora_to_model and from_pretrained, which correspond to get_peft_model and PeftModel respectively in this PR), it documents the same methods.

I hope this will work as a demonstration of X-LoRA in action. Would it be better for me to provide some examples using this PR?

I took a look at these examples, thanks. But let's get something to work based on this PR, even if very basic. This helps me understand what code paths are taken and how the different parts interact. Otherwise, the review is much harder for me. Also, we will need something like this eventually because we want to add tests for X-LoRA. As I said, it can be very basic for a start.

@EricLBuehler (Contributor, Author)

I was able to put together a small example which shows how one would use the API as documented here; I have attached it as a plain text file, as GH does not allow .py files to be attached.
example.txt

This is very basic and just shows creation, generation and simple API usage. I hope this helps!

@BenjaminBossan (Member)

I tried to run your example but encountered some problems:

  1. There is still a merge conflict in the forward method of lora.Linear. This is because we merged DoRA recently. My suggestion would be to apply X-LoRA only when not using DoRA. A quick and dirty solution should be enough for now, I think we'll rework that part in the future anyway.
  2. As I don't have the adapter checkpoints referenced in the script, I tried to create some checkpoints with randomly initialized LoRA weights. However, this led to a bizarre error when trying to load them. It turns out that when I tried to create a normal LoRA adapter, it was not applied to any layer, despite setting target_modules correctly. Could it be possible that something has been messed up?

@EricLBuehler (Contributor, Author) commented Feb 29, 2024

Thank you for trying it out. I ran the following code on a local machine with the latest installation from this branch to test the loading of a normal LoRA adapter, and it seemed to work, as after printing the model I can see lora.Linear in the specified layers.

from transformers import AutoModelForCausalLM, OPTForCausalLM, AutoTokenizer
from peft import LoraConfig

model_id = "HuggingFaceH4/zephyr-7b-beta"
model = AutoModelForCausalLM.from_pretrained(model_id)

lora_config = LoraConfig(
    target_modules=[
        "q_proj",
        "gate_proj",
        "o_proj",
        "v_proj",
        "k_proj"
    ],
    init_lora_weights=False
)

model.add_adapter(lora_config, adapter_name="adapter_1")
print(model)

Strangely, when I printed out the model in the X-LoRA test script, it showed proper injection of the adapters as well as the classifier. To begin fixing this, could you please provide a minimal reproducible example of the LoRA adapter loading so that I can find the error?

@BenjaminBossan (Member)

could you please provide a minimal reproducible example of the LoRA adapter loading so that I can find the error?

Using your exact branch with the merge conflict removed, when I run your script with this slight modification, I get the issue of no LoRA weights being applied:

- model.add_adapter(lora_config, adapter_name="adapter_1")
+ from peft import get_peft_model
+ model = get_peft_model(model, lora_config)

@EricLBuehler (Contributor, Author) commented Feb 29, 2024

I found the bug: it was because of a mistake in a hasattr check. I added this because an XLoraModel should not have a default adapter injected. Perhaps you could try it again?

@BenjaminBossan (Member)

Thanks for fixing this, it should now be working. There is still an issue with a merge conflict being unresolved as mentioned earlier:

  1. There is still a merge conflict in the forward method of lora.Linear. This is because we merged DoRA recently. My suggestion would be to apply X-LoRA only when not using DoRA. A quick and dirty solution should be enough for now, I think we'll rework that part in the future anyway.

@EricLBuehler (Contributor, Author)

I have now fixed the merge conflict.

@EricLBuehler (Contributor, Author)

Hi @BenjaminBossan, I'm not sure if you have had a chance to look at the updated PR; the merge conflict has been resolved, and I think it is ready for another review. Would you prefer that a new PR be opened as a cleaner slate for further reviews?

@BenjaminBossan (Member) left a comment

Thanks a lot for working on my previous comments. I was busy with some other stuff and only today could I do another detailed review of your PR. As I have to go now, it is unfortunately not 100% complete, but I still wanted to give you the feedback so you don't have to wait.

In addition to the individual comments I made, I have a few general questions:

  1. How is the X-LoRA adapter trained? Could you please provide an example? Eventually, we'll want to move this to unit tests.
  2. Could you please add the copyright notice to all new modules?
  3. X-LoRA only really works with transformers language models, right? Can we document this more clearly? Also, do you think it would be possible to make this work with other types of models?
  4. I'm not a fan of the type annotations of the style self.inner: nn.ModuleList = nn.ModuleList([]) or model: nn.Module = self.model, especially when followed by a # type: ignore. Same with the use of typing.cast. Is that because your IDE flags the code otherwise? Maybe you could deactivate the type checker for this project, as PEFT isn't really well annotated.

(Inline comment on src/peft/tuners/tuners_utils.py marked as resolved.)

if not isinstance(config, XLoraConfig):
    raise TypeError(f"Expected 'XLoraConfig', got '{type(config)}' instead.")

device = infer_device()  # As in PeftModel.load_adapter, torch_device = infer_device(

Member:

comment is cut off?

(Inline comments on src/peft/tuners/xlora/classifier.py, src/peft/tuners/xlora/config.py, and src/peft/tuners/xlora/model.py were marked as resolved.)

    ) -> None:
        super().__init__(model, config, adapter_name)

    def _xlora_post_init(

Member:

I wonder if it wouldn't be better to convert this to a standalone function, something like def post_init_lora(peft_model). Not sure if we need all the other arguments, can they not be derived from the PeftModel instance?

Contributor Author:

I'm not sure. Most of those are deeply nested by the time post_init_lora is called, so I thought it would improve readability to pass them this way. Would you prefer that they be accessed through the PeftModel?
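
A purely illustrative shape of the standalone variant being discussed; the attribute accesses are assumptions about what could be derived from the PeftModel instance:

def post_init_lora(peft_model):
    xlora_model = peft_model.base_model               # the XLoraModel tuner
    base_model = xlora_model.model                    # the wrapped transformers model
    xlora_config = peft_model.peft_config["default"]  # the XLoraConfig (adapter name assumed)
    # ... load the referenced LoRA adapters and attach the classifier here ...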

(Inline comment on src/peft/tuners/xlora/util.py marked as resolved.)

config.device = torch.device(device)

# If we are passed adapters in the kwargs, it is already in the config.
# If no adapters are passed, config.adapters is None

Member:

Do I understand correctly that this is for the case where we call save_pretrained on the X-LoRA model and then load this pretrained model again with from_pretrained? The only "new" thing added in that case would be the X-LoRA classifier, right?

Contributor Author:

Yes, that is correct. I simply load the weights for the classifier here.
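
A hedged sketch of the round trip being discussed, assuming model is the X-LoRA PeftModel created earlier (the paths are placeholders and the entry points follow the intended PR API):

model.save_pretrained("./xlora_checkpoint")  # saves the adapters and the X-LoRA classifier weights

from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("HuggingFaceH4/zephyr-7b-beta")
restored = PeftModel.from_pretrained(base, "./xlora_checkpoint")  # the classifier weights are reloaded here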

@@ -434,7 +446,14 @@ def forward(self, x: torch.Tensor, *args: Any, **kwargs: Any) -> torch.Tensor:
            x = x.to(lora_A.weight.dtype)

            if not self.use_dora[active_adapter]:
                result = result + lora_B(lora_A(dropout(x))) * scaling
                if _xlora_layer is not None:

Member:

I'm not so happy with this addition to the LoraLayers. It makes reading and understanding them more complex and requires all LoRA layers to be updated (e.g. what about the bnb lora layers?).

I couldn't come up with an alternative yet, but I wonder if we could achieve something with wrapping and/or forward hooks. I'll continue to think about this tomorrow but wanted to let you know already in case you have some ideas.

Contributor Author:

I agree, it is not very easy to read. Perhaps we could implement some sort of hook in the LoraLayer so that techniques such as DoRA and X-LoRA could use that instead of modifying the layer source?
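
Purely as an illustration of the hook mechanism (not part of this PR, and it simplifies the X-LoRA semantics to a single output scaling), something along these lines could post-process a layer's output without editing the layer source:

import torch

def xlora_scaling_hook(module: torch.nn.Module, inputs, output):
    # a hypothetical per-layer scaling that a classifier could have stored on the module
    scaling = getattr(module, "_xlora_scaling", None)
    return output if scaling is None else output * scaling

handle = lora_layer.register_forward_hook(xlora_scaling_hook)  # lora_layer: an injected LoRA layer (assumed)
# handle.remove() detaches the hook again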

@EricLBuehler (Contributor, Author)

Thank you for your review! I have updated the code with your suggestions.

  • How is the X-LoRA adapter trained? Could you please provide an example? Eventually, we'll want to move this to unit tests.
    Each LoRA adapter for an X-LoRA model is trained separately as a normal LoRA adapter. The weights are then loaded, and the X-LoRA model is trained by training only the classifier (see the sketch after this list). Would you recommend that I add a few sentences to the docstring detailing the above, or some other code example? If a code example would be better, what part of the X-LoRA training process should I show?
  • Could you please add the copyright notice to all new modules?
    I have added it to each new module.
  • X-LoRA only really works with transformers language models, right? Can we document this more clearly? Also, do you think it would be possible to make this work with other types of models?
    With the current implementation of the classifier, it will not work with other types of models, as it requires a sequence length. However, it would be possible to make it work with other types of models by changing the way the resulting scalings are reshaped/expanded. I have added a note in the XLoraModel docstring.
  • I'm not a fan of the type annotations of the style self.inner: nn.ModuleList = nn.ModuleList([]) or model: nn.Module = self.model, especially when followed by a # type: ignore. Same with the use of typing.cast. Is that because your IDE flags the code otherwise? Maybe you could deactivate the type checker for this project, as PEFT isn't really well annotated.
    Yes, my IDE would flag the code otherwise. I have removed these.
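
As referenced above, a rough sketch of that training flow, assuming model is an X-LoRA model created from separately trained LoRA checkpoints (the dataloader and optimizer settings are placeholders):

import torch

# By default only the X-LoRA classifier parameters should be trainable
# (the loaded LoRA adapters stay frozen unless use_trainable_adapters is set).
trainable_params = [p for _, p in model.named_parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(trainable_params, lr=1e-4)

for batch in dataloader:       # dataloader: a standard causal-LM dataloader (assumed), batches include labels
    loss = model(**batch).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()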

@BenjaminBossan (Member) left a comment

Thanks for all the updates. I left a few comments, but let's focus more on the bigger picture and not each detail.

When I tried to run your code, I encountered an error:

AttributeError: 'XLoraConfig' object has no attribute 'target_modules'

I also think some other parts of the code don't quite work. To avoid this in the future, let's start adding unit tests. Let's start simple and add a new file tests/test_xlora.py with a functional test based on the example you posted earlier. It should also contain a test of training the XLoRA classifier.

Regarding the issue of only working with transformers language models, I think it's fine for the start. We can think of generalizing this in a follow up PR.

Again, thanks a lot for this contribution and your patience.

(Inline comment on src/peft/tuners/xlora/classifier.py marked as resolved.)

logits = self.last.forward(hidden_state)

### Repeat to make layerwise scalings if the classifier layer does not

Member:

Could you please add this explanation as a comment? Thanks.

(Inline comments on src/peft/tuners/xlora/util.py, src/peft/tuners/xlora/model.py, and src/peft/tuners/lora/layer.py were marked as resolved.)

@BenjaminBossan (Member)

How is the state of the PR? Let me know if you need any help with it.

@EricLBuehler (Contributor, Author) left a comment

Thank you for your comments! I have fixed the embedding layer, and all tests pass when I run:

pytest tests/test_xlora.py

Here is the coverage:

src/peft/tuners/xlora/__init__.py                         3      0   100%
src/peft/tuners/xlora/classifier.py                      88      9    90%   74-77, 80, 118-120, 142-143
src/peft/tuners/xlora/config.py                          35      7    80%   82-85, 87-90, 93, 96, 101
src/peft/tuners/xlora/layer.py                          110     30    73%   112, 114, 159, 161, 170-171, 186, 194-223
src/peft/tuners/xlora/model.py                          171     14    92%   72-82, 142, 150, 154, 159-160, 275, 307, 312, 315

I also updated ruff and ran make style which produced no formatting changes but gave several errors such as the following, which I ignored as I have not modified that part of the codebase.

src/peft/tuners/tuners_utils.py:568:9: F811 Redefinition of unused `active_adapter` from line 503
    |
567 |     @property
568 |     def active_adapter(self) -> str | list[str]:
    |         ^^^^^^^^^^^^^^ F811
569 |         # use a property to ensure that active_adapter is not set directly, instead use the set_adapter method
570 |         return self._active_adapter
    |
    = help: Remove definition: `active_adapter`


device = None
for module in base.modules():
    # Check the exact type because classes like OPTLearnedPositionalEmbedding inherit from nn.Embedding

Contributor Author:

Should we check for the exact type or use isinstance? My thought was that isinstance is not strict enough, but I cannot think of a case (at the moment) where a class is a subtype of a LoRA layer and would need to be handled differently.

Member:

I think isinstance should also work, though the only existing layer that would be affected is AdaLoraLayer, so it won't make a big difference either way.
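
For illustration only, the difference between the two checks (the nn.Embedding example comes from the comment in the snippet above; the helper names are hypothetical):

import torch.nn as nn

def is_exact_embedding(module: nn.Module) -> bool:
    # exact type check: rejects subclasses such as OPTLearnedPositionalEmbedding
    return type(module) is nn.Embedding

def is_embedding_like(module: nn.Module) -> bool:
    # isinstance: also accepts subclasses
    return isinstance(module, nn.Embedding)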

@EricLBuehler (Contributor, Author) commented Jul 1, 2024

It looks like test_save_load_functional_pt failed in one CI run but passed in another? That seems super strange; could it be the order in which test_save_load_functional_pt and test_save_load_functional are executed?

The only state which should be shared are the saved model files, I think.

@BenjaminBossan (Member)

It looks like test_save_load_functional_pt failed in one CI run but passed in another? That seems super strange; could it be the order in which test_save_load_functional_pt and test_save_load_functional are executed?

Yeah, really strange. I can't replicate the issue locally, and some CI runs pass while others fail, so it looks flaky.

The only state which should be shared are the saved model files, I think.

How are the model files shared? The tmp_dir should be a separate one for the two tests. The only shared state I can spot at the moment is the tokenizer. Could you please try making its fixture "function" scoped instead of "class"? That shouldn't slow down the tests by much.

@EricLBuehler (Contributor, Author)

How are the model files shared? The tmp_dir should be a separate one for the two tests. The only shared state I can spot at the moment is the tokenizer. Could you please try making its fixture "function" scoped instead of "class"? That shouldn't slow down the tests by much.

I made the tmp_dir and the tokenizer both function scoped; perhaps an interaction in the tmp directory was the problem.

@EricLBuehler (Contributor, Author)

I added the scope information to the LoRA adapter saving too, and the tests pass locally now.

@BenjaminBossan (Member)

I added the scope information to the LoRA adapter saving too, and the tests pass locally now.

Does that mean you could reproduce the error locally? How did you run the tests, did you use CPU or GPU, what OS did you use?

Ideally, we should fix the issue without changing the scope of the saved lora adapters to "function", as this means we need to create them again for each test, which is pretty wasteful. Do you have any suspicion what kind of side effect could be responsible for the test failure?

@EricLBuehler (Contributor, Author)

Does that mean you could reproduce the error locally? How did you run the tests, did you use CPU or GPU, what OS did you use?

I ran pytest tests/test_xlora.py and all tests passed on my CUDA GPU with WSL2.

Ideally, we should fix the issue without changing the scope of the saved lora adapters to "function", as this means we need to create them again for each test, which is pretty wasteful. Do you have any suspicion what kind of side effect could be responsible for the test failure?

Yeah, I'm curious whether recreating them will be the solution though - perhaps something is being overwritten when we save with PyTorch? I noticed that when I flip the order of test_save_load_functional_pt and test_save_load_functional so that test_save_load_functional comes before test_save_load_functional_pt, I can sometimes reproduce the issue locally, which implies that it is sporadic.

@BenjaminBossan (Member)

Yeah, I'm curious whether recreating them will be the solution though - perhaps something is being overwritten when we save with PyTorch? I noticed that when I flip the order of test_save_load_functional_pt and test_save_load_functional so that test_save_load_functional comes before test_save_load_functional_pt, I can sometimes reproduce the issue locally, which implies that it is sporadic.

Oh I see now, when I change the order I also get the error. The tests are dumping everything into the same temporary directory. There is an easy fix for this: let's use a separate temporary directory for each test using the tmp_path fixture provided by pytest. Here are the steps to take (a rough fixture sketch follows the list):

  • Let's rename tmp_dir to lora_dir, just to avoid confusion with the name. Adjust saved_lora_adapters accordingly.
  • Let's create a lora_embedding_dir fixture which is the same as lora_dir but we will use it for saved_lora_embedding_adapters.
    • Now saved_lora_adapters and saved_lora_embedding_adapters are created once ("class" scope) and they have separate directories, so there should be no side effect from the tests.
  • Finally, replace all tmp_dir that are being used by the tests by tmp_path, i.e. the built-in pytest fixture. This now ensures that each test gets their own temporary directory.
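
Roughly, the fixture layout I have in mind (the bodies are illustrative only; the real tests will differ):

import pytest

@pytest.fixture(scope="class")
def lora_dir(tmp_path_factory):
    # one directory per test class for the saved LoRA adapters
    return tmp_path_factory.mktemp("lora")

@pytest.fixture(scope="class")
def lora_embedding_dir(tmp_path_factory):
    # separate directory for the saved LoRA embedding adapters
    return tmp_path_factory.mktemp("lora_embedding")

def test_save_load_functional(tmp_path, saved_lora_adapters):
    # each test writes into its own tmp_path provided by pytest
    ...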

@EricLBuehler (Contributor, Author) commented Jul 2, 2024

@BenjaminBossan I updated the tests to use the tmp_path fixture and did the renames of tmp_dir -> lora_dir and created lora_embedding_dir. The tests pass now on my machine when I invert the order (I left it inverted in the committed test).

@EricLBuehler (Contributor, Author)

Looks like the tests are failing with this error:

FAILED tests/test_decoder_models.py::PeftDecoderModelTester::test_inference_safetensors_14_test_hf_internal_testing_tiny_random_GPTNeoXForCausalLM_boft - requests.exceptions.ReadTimeout: (ReadTimeoutError("HTTPSConnectionPool(host='huggingface.co', port=443): Read timed out. (read timeout=10)"), '(Request ID: 73f83a27-fedd-45f8-9c59-4b8a51388973)')

But the X-LoRA tests pass.

@BenjaminBossan (Member) left a comment

Tests are now passing, nice! As you correctly observed, the failing CI is unrelated, so no need to worry about that.

While thinking about this implementation a bit more, I think I came up with a further simplification, which would allow us to revert all changes to lora/model.py. This would be quite nice, because I don't want to make LoraModel more complex for X-LoRA, since the former should theoretically not have to know about the latter. Please check whether my suggestion makes sense.

@@ -164,7 +164,8 @@ def _prepare_model(self, peft_config: LoraConfig, model: nn.Module):
model (`nn.Module`):
The model that is going to be adapted.
"""
if peft_config.layer_replication:
# Handle X-LoRA case

Member:

I think I found a way to allow us to remove all these changes to lora/model.py. The main issue is the following: During XLoraModel.__init__, we want to create the LoraModel instance based on the XLoraConfig. This LoraModel is not supposed to contain any actual LoRA adapter, as those will come later through the config.adapters. So what we need is to implement a way to create this "empty" LoraModel without needing to change LoraModel itself. Here is my idea:

First, let's make a change to XLoraModel.__init__ by creating a copy of the XLoraConfig that "imitates" a normal LoraConfig:

modified   src/peft/tuners/xlora/model.py
@@ -140,7 +140,15 @@ class XLoraModel(BaseTuner):
             conf = config[adapter_name]
         else:
             conf = config
-        lora_model = LoraModel(model, config.copy(), adapter_name)
+
+        # create an empty LoraModel
+        base_lora_config = copy.copy(conf)
+        base_lora_config.target_modules = DUMMY_TARGET_MODULES
+        # imitate a LoraConfig, fields might need to be updated if LoraConfig is updated
+        base_lora_config.layer_replication = None
+        base_lora_config.bias = "none"
+        lora_model = LoraModel(model, base_lora_config, adapter_name)
+

So we set the required attributes so that we no longer need the extra checks in LoraModel that this PR adds. Also, we have added DUMMY_TARGET_MODULES. What is this? It's a constant defined in constants.py as DUMMY_TARGET_MODULES = "dummy-target-modules". This is a special value that allows us to create an "empty" LoraModel. For this to work, we also need a small change in BaseTuner:

modified   src/peft/tuners/tuners_utils.py
@@ -395,6 +395,10 @@ class BaseTuner(nn.Module, ABC):
         self._prepare_model(peft_config, model)
         is_target_modules_in_base_model = False
         key_list = [key for key, _ in model.named_modules()]
+        if getattr(peft_config, "target_modules", None) == DUMMY_TARGET_MODULES:
+            # dummy adapter, we allow not matching any module
+            key_list = []
+            is_target_modules_in_base_model = True

I tested this locally and the tests pass after reverting all changes to lora/model.py.

I think this is the more elegant solution, because it should not be necessary for LoraModel to know about XLoRA.

@EricLBuehler (Contributor, Author)

All tests pass locally after the separation of concerns change.

@EricLBuehler (Contributor, Author)

Are test failures perhaps caused by the fact that we are downloading models during the testing?

@BenjaminBossan (Member)

Are test failures perhaps caused by the fact that we are downloading models during the testing?

Yes, you can ignore them, we have some strange issues with timeouts lately.

@EricLBuehler (Contributor, Author)

Ah, ok. Are there any other changes you would like me to make before merge?

@BenjaminBossan (Member) left a comment

I think I don't see any issues left that would require fixing, so finally this PR can be approved :-) I know it's been a long time in the making, so thanks a lot for your patience and your work on X-LoRA.

I'll give you the opportunity to also do a last check to see if we missed anything, since there were a lot of smaller changes recently. If you give the thumbs up, I can merge the PR.

Note that I still think it is very important to also add documentation and at least one example. Otherwise, users will have a very hard time discovering this method, which would be a pity, given the amount of work that went into it. So I hope you'll add those in a future PR. That one should be a lot less work ;-)



@EricLBuehler (Contributor, Author)

I think this is ready to merge, all the X-LoRA functionality is implemented! Perhaps I can do a follow-up PR to add docs. Thanks for all your help.

@BenjaminBossan merged commit 58afb34 into huggingface:main on Jul 5, 2024
12 of 14 checks passed

@BenjaminBossan (Member)

Perhaps I can do a follow-up PR to add docs.

That would be really great.
