
Fix/custom dataset chat template #665

Merged 5 commits into main on Sep 25, 2024

Conversation

@mreso (Contributor) commented Sep 10, 2024

What does this PR do?

This PR fixes the custom dataset example for tokenizers that add a system prompt by default.

Fixes #669
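For context, here is a minimal sketch of per-turn dialog tokenization with loss masking, the pattern the custom dataset example uses (the function name mirrors the `test_tokenize_dialog` test above, but this is illustrative code, not the repository's implementation; `encode` is a hypothetical stand-in for a real tokenizer). The failure mode this PR addresses arises when a chat template is applied to each message individually: a tokenizer that injects a system prompt by default would prepend it to every turn, so the template has to be rendered without that implicit system prompt.

```python
def render_turn(role: str, content: str) -> str:
    # Llama 3 style turn: header + content + end-of-turn tag.
    # Note: no system prompt is injected here; if the tokenizer's chat
    # template adds one by default, it must be suppressed per-turn.
    return f"<|start_header_id|>{role}<|end_header_id|>\n\n{content}<|eot_id|>"

def tokenize_dialog(dialog, encode):
    """dialog: list of {"role", "content"} dicts; encode: str -> list[int]."""
    input_ids, labels = [], []
    for msg in dialog:
        ids = encode(render_turn(msg["role"], msg["content"]))
        input_ids.extend(ids)
        # Mask non-assistant turns with -100 so only assistant tokens
        # contribute to the training loss.
        labels.extend(ids if msg["role"] == "assistant" else [-100] * len(ids))
    return {"input_ids": input_ids, "labels": labels}
```

A duplicated default system prompt would show up here as unexpected masked tokens at the start of every turn, which is exactly what a tokenization test like the one below can catch.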

Feature/Issue validation/testing

Please describe the tests that you ran to verify your changes and relevant result summary. Provide instructions so it can be reproduced.
Please also list any relevant details for your test configuration.

  • pytest src/tests/datasets/test_custom_dataset.py -s -k test_tokenize_dialog
========================= test session starts =========================
platform linux -- Python 3.10.14, pytest-8.3.2, pluggy-1.5.0
rootdir: /home/mreso/llama-recipes
configfile: pyproject.toml
plugins: anyio-4.4.0, mock-3.14.0
collected 6 items / 4 deselected / 2 selected

src/tests/datasets/test_custom_dataset.py ..

========================== warnings summary ===========================
src/tests/datasets/test_custom_dataset.py::test_tokenize_dialog[meta-llama/Llama-2-7b-hf]
  /home/mreso/llama-recipes/src/llama_recipes/model_checkpointing/checkpoint_handler.py:17: DeprecationWarning: `torch.distributed._shard.checkpoint` will be deprecated, use `torch.distributed.checkpoint` instead
    from torch.distributed._shard.checkpoint import (

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
============ 2 passed, 4 deselected, 1 warning in 3.10s ============

Before submitting

Thanks for contributing 🎉!

@init27 (Contributor) left a comment


Thanks for adding the tests as well!

@init27 init27 merged commit ee1768d into main Sep 25, 2024
3 checks passed