server: add OpenAI compatible response format for legacy /completions with b… #10645

Open
wants to merge 1 commit into
base: master


@Nero7991 Nero7991 commented Dec 4, 2024

This is based on a previous PR

However, @ngxson seems to be working on refactoring server.cpp to prevent use of JSON, as stated here, so I don't expect this to be merged easily. Still, it might be of use to someone else.

Support for (almost) the full OpenAI API response format for the legacy completion-related endpoints (including when logprobs is specified).

When oai_compat is set to True in the request (as suggested by @ngxson), the old response format is used (check tests).
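For anyone who wants to try this from a client, here is a minimal sketch of building a request body and reading per-token logprobs from a response. It assumes the response follows the publicly documented OpenAI legacy completions shape (`choices[0].logprobs.tokens` and `token_logprobs`); since this PR only claims "almost" full compatibility, the exact fields returned by the server may differ. The helper names and the sample response are hypothetical and for illustration only, not output from the server.

```python
def build_completion_request(prompt, logprobs=None, oai_compat=None, max_tokens=16):
    """Build a JSON-serializable body for a legacy completions request.

    `oai_compat` is the request flag named in this PR description; it is only
    included when explicitly passed. See the PR tests for its exact effect.
    """
    body = {"prompt": prompt, "max_tokens": max_tokens}
    if logprobs is not None:
        body["logprobs"] = logprobs
    if oai_compat is not None:
        body["oai_compat"] = oai_compat
    return body


def token_logprobs(response):
    """Return (token, logprob) pairs from the first choice, or [] if absent."""
    choices = response.get("choices") or []
    if not choices:
        return []
    lp = choices[0].get("logprobs") or {}
    return list(zip(lp.get("tokens", []), lp.get("token_logprobs", [])))


# Hand-written sample in the OpenAI legacy completions shape (illustrative only).
sample_response = {
    "object": "text_completion",
    "choices": [
        {
            "index": 0,
            "text": "Hello world",
            "finish_reason": "length",
            "logprobs": {
                "tokens": ["Hello", " world"],
                "token_logprobs": [-0.1, -0.5],
                "top_logprobs": [{"Hello": -0.1}, {" world": -0.5}],
                "text_offset": [0, 5],
            },
        }
    ],
}

if __name__ == "__main__":
    print(build_completion_request("Say hello", logprobs=1, oai_compat=True))
    for token, logprob in token_logprobs(sample_response):
        print(repr(token), logprob)
```

The body from `build_completion_request` can be sent with any HTTP client to the server's completions endpoint; the parser tolerates a missing or null `logprobs` field so the same code works when logprobs is not requested.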

HELM benchmarks from CRFM support an OpenAI-compatible API server that uses this endpoint, so this change enables testing differently quantized models for degradation against this benchmark. I tested it on a QwQ Preview 32B GGUF Q4_K_M to evaluate the model against other frontier models. I've described that here.

@github-actions github-actions bot added examples python python script changes server labels Dec 4, 2024