Handled mixed modality #14
base: master
This raises the following error:
[rank2]: sources = self.list_data_dict[i]
[rank2]: TypeError: list indices must be integers or slices, not list
Also, to the best of my knowledge, the sampler should only speed up training; it is not critical for running the code. I have some other issues with my environment, so I can't really dig into the code.
I'll try to benchmark the grouping modality code from LLaVA.
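For anyone trying to reproduce the traceback above, here is a minimal sketch of one plausible cause, not a confirmed diagnosis: a sampler that yields whole batches (lists of indices) being consumed as if it yielded single indices, so the dataset's `__getitem__` receives a list. `ToyDataset`, `fetch_as_sampler`, and `fetch_as_batch_sampler` are hypothetical names used only for illustration; only the `list_data_dict` attribute comes from the traceback.

```python
class ToyDataset:
    """Stand-in for the real dataset; only `list_data_dict` mirrors the traceback."""

    def __init__(self, records):
        self.list_data_dict = records

    def __len__(self):
        return len(self.list_data_dict)

    def __getitem__(self, i):
        # Same indexing pattern as the failing line in the traceback.
        return self.list_data_dict[i]


def fetch_as_sampler(dataset, sampler_output):
    # Treats every yielded item as ONE index. If the sampler actually
    # yields lists of indices, dataset[i] is called with a list.
    return [dataset[i] for i in sampler_output]


def fetch_as_batch_sampler(dataset, batch_sampler_output):
    # Treats every yielded item as a BATCH of indices.
    return [[dataset[i] for i in batch] for batch in batch_sampler_output]


dataset = ToyDataset(["a", "b", "c", "d"])
batches = [[0, 2], [1, 3]]  # what a batch-level sampler would yield

try:
    fetch_as_sampler(dataset, batches)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)
print(fetch_as_batch_sampler(dataset, batches))
```

If this is indeed the cause, the fix would be to hand the batch-yielding sampler to the data loader as a batch sampler rather than as a per-sample sampler.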
modified: scripts/finetune_lora.sh
modified: src/training/trainer.py
This PR introduces support for datasets containing mixed modalities by implementing the following changes:
- Conditional Image Data Inclusion: Checks whether at least one image is associated with a given datapoint. If no images are tagged, `all_pixel_values` and `all_image_grid_thw` are excluded from the final data dictionary, `data_dict`.
- Homogeneous Batch Sampling: Implements a custom sampler, `HomogeneousBatchSampler`, to ensure each batch is homogeneous, containing either only text or exclusively text-image pairs.
- Sampler Methods for Training and Evaluation: Adds `_get_eval_sampler` and `_get_train_sampler` methods to `QwenTrainer`
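
The first two changes listed above can be sketched roughly as follows. This is an illustrative approximation in plain Python, not the code in this PR: the constructor arguments, the `has_image` flag list, the dictionary keys, and the `build_data_dict` helper are all assumptions made for the example. Only the names `HomogeneousBatchSampler`, `all_pixel_values`, `all_image_grid_thw`, and `data_dict` come from the description.

```python
import random


def build_data_dict(input_ids, labels, all_pixel_values, all_image_grid_thw):
    """Hypothetical helper: attach image tensors only when images exist."""
    data_dict = {"input_ids": input_ids, "labels": labels}
    if all_pixel_values:  # at least one image for this datapoint
        # Key names are assumed for illustration.
        data_dict["pixel_values"] = all_pixel_values
        data_dict["image_grid_thw"] = all_image_grid_thw
    return data_dict


class HomogeneousBatchSampler:
    """Yields lists of indices; each list is all text-only or all text-image.

    Sketch only: a real version would likely subclass a framework sampler
    class and handle distributed training.
    """

    def __init__(self, has_image, batch_size, shuffle=True, seed=0):
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.seed = seed
        # Partition dataset indices by modality.
        self.text_only = [i for i, flag in enumerate(has_image) if not flag]
        self.with_image = [i for i, flag in enumerate(has_image) if flag]

    def _chunks(self, indices):
        size = self.batch_size
        return [indices[k:k + size] for k in range(0, len(indices), size)]

    def __iter__(self):
        rng = random.Random(self.seed)
        batches = []
        for group in (list(self.text_only), list(self.with_image)):
            if self.shuffle:
                rng.shuffle(group)  # shuffle within a modality
            batches.extend(self._chunks(group))
        if self.shuffle:
            rng.shuffle(batches)  # interleave text and image batches
        return iter(batches)

    def __len__(self):
        # Ceiling division per modality group.
        groups = (self.text_only, self.with_image)
        return sum(-(-len(g) // self.batch_size) for g in groups)


has_image = [True, False, True, True, False]
sampler = HomogeneousBatchSampler(has_image, batch_size=2)
batches = list(sampler)
print(batches)
```

Because this sampler yields whole batches, with PyTorch it would be passed to `torch.utils.data.DataLoader` through the `batch_sampler` argument rather than `sampler`.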