Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add system role to deepseek chat template #3031

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

AllentDan
Copy link
Collaborator

No description provided.

@lvhan028 lvhan028 self-requested a review January 15, 2025 03:08
@lvhan028
Copy link
Collaborator

model = MODELS.get('deepseek')()
messages = [{
    'role': 'system',
    'content': 'you are a helpful assistant'
}, {
    'role': 'user',
    'content': 'who are you'
}, {
    'role': 'assistant',
    'content': 'I am an AI'
}, {
    'role': 'user',
    'content': 'hi'
}]
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained(
    'deepseek-ai/DeepSeek-V2-Lite', trust_remote_code=True)
ref = tokenizer.apply_chat_template(messages, tokenize=False)
res = '<|begin▁of▁sentence|>' + model.messages2prompt(messages)
assert res.startswith(ref)

It tests failed using the above case.

@AllentDan
Copy link
Collaborator Author

AllentDan commented Jan 15, 2025

But in real testing, 'who are you' should be responsed with ' I am an AI'

@AllentDan AllentDan changed the title remove the space following assistant Add system role to deepseek chat template Jan 15, 2025
@lvhan028 lvhan028 requested a review from RunningLeon January 15, 2025 06:53
@lvhan028
Copy link
Collaborator

Can we add UTs to cover ti?

Copy link
Collaborator

@RunningLeon RunningLeon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@lvhan028
Copy link
Collaborator

[:-1] should be removed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants