add max_completions_tokens for o1 series models #857

Merged
merged 2 commits into sashabaranov:master on Sep 20, 2024

Conversation

chococola
Contributor

Describe the change
For the new o1 series models the max_tokens attribute does not work; the documentation marks this field as deprecated.
OpenAI introduced a new attribute, max_completion_tokens.

To manage costs with the o1 series models, you can limit the total number of tokens the model generates (including both reasoning and completion tokens) by using the [max_completion_tokens](https://platform.openai.com/docs/api-reference/chat/create#chat-create-max_completion_tokens) parameter.

In previous models, the max_tokens parameter controlled both the number of tokens generated and the number of tokens visible to the user, which were always equal. However, with the o1 series, the total tokens generated can exceed the number of visible tokens due to the internal reasoning tokens.

Because some applications might rely on max_tokens matching the number of tokens received from the API, the o1 series introduces max_completion_tokens to explicitly control the total number of tokens generated by the model, including both reasoning and visible completion tokens. This explicit opt-in ensures no existing applications break when using the new models. The max_tokens parameter continues to function as before for all previous models.
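To illustrate the accounting described above with made-up numbers: the cap set by max_completion_tokens applies to reasoning plus visible tokens, so the text a caller receives can be much shorter than the limit suggests. A toy sketch (all counts are illustrative, not API output):

```go
package main

import "fmt"

func main() {
	// Hypothetical token counts for a single o1 response.
	maxCompletionTokens := 500 // the cap passed in the request
	reasoningTokens := 300     // internal reasoning tokens, never returned to the caller

	// The visible completion can use at most whatever the reasoning left over.
	visibleTokens := maxCompletionTokens - reasoningTokens
	fmt.Println(visibleTokens) // 200
}
```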

Provide OpenAI documentation link
https://platform.openai.com/docs/api-reference/chat/create#chat-create-max_tokens
https://platform.openai.com/docs/api-reference/chat/create#chat-create-max_completion_tokens
https://platform.openai.com/docs/guides/reasoning/controlling-costs
https://community.openai.com/t/why-was-max-tokens-changed-to-max-completion-tokens/938077

Describe your solution
Update ChatCompletionRequest model


codecov bot commented Sep 20, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.03%. Comparing base (774fc9d) to head (be774ca).
Report is 51 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #857      +/-   ##
==========================================
+ Coverage   98.46%   99.03%   +0.57%     
==========================================
  Files          24       26       +2     
  Lines        1364     1452      +88     
==========================================
+ Hits         1343     1438      +95     
+ Misses         15        8       -7     
  Partials        6        6              


@chococola
Contributor Author

@sashabaranov please take a look at the PR 🙃

@sashabaranov
Owner

@chococola great catch, thank you for the PR!

Could we also panic when o1 is used and max_tokens≠0?

@sashabaranov
Owner

Or just return an error, panic might be too much :)

@chococola
Contributor Author

> @chococola great catch, thank you for the PR!
>
> Could we also panic when o1 is used and max_tokens≠0?

panic would be too much, and would probably break backwards compatibility for older models.

Maybe we should mark the attribute as deprecated? Or leave a comment?

> return an error

In all places? And what about batch requests?

@sashabaranov
Owner

@chococola let's cover the most frequent chat completion + streaming case, all other cases would eventually be covered by other PRs :D

@chococola force-pushed the max_completions_tokens branch from 4d313aa to be774ca on September 20, 2024 20:31
@chococola
Contributor Author

@sashabaranov I added a validation check for o1 series models, plus some validation for fields that are limited in beta:
https://platform.openai.com/docs/guides/reasoning/beta-limitations
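A rough sketch of such a check; the helper name, error message, and detecting the o1 series by a model-name prefix are assumptions for illustration, not the PR's actual code:

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// errO1MaxTokensDeprecated is a made-up sentinel error for this sketch.
var errO1MaxTokensDeprecated = errors.New(
	"o1 series models do not support max_tokens; use max_completion_tokens")

// ChatCompletionRequest is pared down to the fields the check needs.
type ChatCompletionRequest struct {
	Model               string
	MaxTokens           int
	MaxCompletionTokens int
}

// validateRequest rejects max_tokens on o1 series models; identifying
// the series via the "o1" prefix is an assumption of this sketch.
func validateRequest(req ChatCompletionRequest) error {
	if strings.HasPrefix(req.Model, "o1") && req.MaxTokens != 0 {
		return errO1MaxTokensDeprecated
	}
	return nil
}

func main() {
	// o1 with max_tokens set: rejected before the request is sent.
	fmt.Println(validateRequest(ChatCompletionRequest{Model: "o1-preview", MaxTokens: 100}))
	// Older model with max_tokens: still allowed.
	fmt.Println(validateRequest(ChatCompletionRequest{Model: "gpt-4", MaxTokens: 100}))
}
```

Returning an error instead of panicking, as discussed above, keeps older models working unchanged while failing fast for the unsupported combination.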


@sashabaranov (Owner) left a comment


Wow, o1 is a can of worms! Thank you so much for looking so deeply into this! 🙌🏻

@sashabaranov sashabaranov merged commit 9add1c3 into sashabaranov:master Sep 20, 2024
3 checks passed