Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handled mixed modality #14

Open
wants to merge 9 commits into
base: master
Choose a base branch
from
Open

Conversation

mano3-1
Copy link

@mano3-1 mano3-1 commented Oct 30, 2024

This PR introduces support for datasets containing mixed modalities by implementing the following changes:

  • Conditional Image Data Inclusion: Checks if there is atleast one image associated with a given datapoint. If no images are tagged, all_pixel_values and all_image_grid_thw are excluded from the final data dictionary, data_dict.

  • Homogeneous Batch Sampling: Implements a custom sampler, HomogeneousBatchSampler, to ensure each batch is homogeneous—either containing only text or exclusively text-image pairs.

  • Sampler Methods for Training and Evaluation: Adds _get_eval_sampler and _get_train_sampler methods to QwenTrainer

@mano3-1 mano3-1 changed the title handled mixed modality Handled mixed modality Oct 30, 2024
@mano3-1 mano3-1 mentioned this pull request Oct 30, 2024
Copy link
Owner

@2U1 2U1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes an error of

[rank2]:     sources = self.list_data_dict[i]
[rank2]: TypeError: list indices must be integers or slices, not list

Also, to the best of my knowledge the sampler should just boost the training speed, not a ciritcal issue for running the code. I have some other issues with my env so, I can't really dig in to the code.
I'll try to benchmark the grouping modality code from LLaVA.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants