Estimate token usage, cost #870
Comments
hi @pengelbrecht - thanks for the issue! let me know if something like this is what you're looking for and/or feel free to make a specific enhancement request
It seems like that would work. However, I'm primarily a hobby programmer who appreciates the simplicity of Marvin, so the subclassing approach might be a bit beyond my expertise. Ideally, I'd prefer something simpler and more in line with Marvin's design philosophy. Unfortunately, I'm not really qualified to suggest a specific alternative. Sorry.
thanks for the response @pengelbrecht - no worries. if you don't mind, what would your ideal experience look like? people often have drastically different ideas about what they want token tracking to look like, but your perspective would be useful for building a sense of what a common-sense / middle-of-the-road offering might look like
Here's how I do it today with direct OpenAI API use. But returning a tuple doesn't feel very Marvinesque :)

    from typing import Optional, Tuple

    from openai import AsyncOpenAI

    # Reads OPENAI_API_KEY from the environment.
    client = AsyncOpenAI()

    # Placeholder defaults: the original comment did not show these values.
    _default_system_prompt = "You are a helpful assistant."
    _default_model = "gpt-3.5-turbo"
    _default_temperature = 0.0


    def openai_cost_usd(
        model_name: str, prompt_tokens: int, completion_tokens: int
    ) -> Optional[float]:
        """Return the cost in USD, or None for a model without a known price."""
        # Prices are USD per one million tokens (prompt, then completion).
        if model_name == "gpt-4-turbo-preview":
            return prompt_tokens * 10.0 / 1e6 + completion_tokens * 30.0 / 1e6
        elif model_name == "gpt-3.5-turbo":
            return prompt_tokens * 0.5 / 1e6 + completion_tokens * 1.5 / 1e6
        else:
            return None


    async def fetch_chat_completion(
        user_message: str,
        system_prompt: str = _default_system_prompt,
        model_name: str = _default_model,
        temperature: float = _default_temperature,
    ) -> Tuple[str, int, Optional[float]]:
        """Fetch a single chat completion for a user message"""
        chat_completion = await client.chat.completions.create(
            messages=[
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_message},
            ],
            model=model_name,
            temperature=temperature,
        )
        response_message = chat_completion.choices[0].message.content
        prompt_tokens = chat_completion.usage.prompt_tokens
        completion_tokens = chat_completion.usage.completion_tokens
        total_tokens = prompt_tokens + completion_tokens
        cost = openai_cost_usd(model_name, prompt_tokens, completion_tokens)
        return response_message, total_tokens, cost
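One way to avoid the bare tuple, sketched here as an illustration rather than anything Marvin provides: keep the prices in a lookup table and return a small value object that derives the totals. The names `Usage` and `PRICES_PER_MILLION` are hypothetical, and the prices are simply the ones from the function above, so they may be out of date.

```python
from dataclasses import dataclass
from typing import Optional

# USD per one million tokens as (prompt, completion). Copied from the
# comment above; real prices change, so treat this table as illustrative.
PRICES_PER_MILLION = {
    "gpt-4-turbo-preview": (10.0, 30.0),
    "gpt-3.5-turbo": (0.5, 1.5),
}


@dataclass(frozen=True)
class Usage:
    """Hypothetical container for the token counts of one call."""

    model_name: str
    prompt_tokens: int
    completion_tokens: int

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens

    @property
    def cost_usd(self) -> Optional[float]:
        # None signals "no known price for this model", as in the original.
        prices = PRICES_PER_MILLION.get(self.model_name)
        if prices is None:
            return None
        prompt_price, completion_price = prices
        return (
            self.prompt_tokens * prompt_price
            + self.completion_tokens * completion_price
        ) / 1e6


usage = Usage("gpt-3.5-turbo", prompt_tokens=1000, completion_tokens=500)
print(usage.total_tokens, usage.cost_usd)
```

A function like `fetch_chat_completion` could then return the message together with one `Usage` value, and adding a model only means adding a row to the table.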
litellm's approach is wonderful (https://litellm.vercel.app/docs/completion/token_usage), but I guess there's no parallel to the completion object in Marvin's approach?
Discussed in #546
Originally posted by ww-jermaine August 25, 2023
Hello, is there a way to estimate the token usage and cost per call of ai_fn, ai_model, etc.?
Something like the callback from langchain:
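As a library-agnostic sketch of the callback idea (this is not langchain's or Marvin's API, and every name in it is hypothetical), a context manager can accumulate the token counts that wrapped calls report while it is active:

```python
from contextlib import contextmanager
from contextvars import ContextVar
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class UsageTotals:
    """Running totals for all calls made inside one tracking block."""

    prompt_tokens: int = 0
    completion_tokens: int = 0
    calls: int = 0

    @property
    def total_tokens(self) -> int:
        return self.prompt_tokens + self.completion_tokens


# The tracker that is currently active, if any.
_active: ContextVar[Optional[UsageTotals]] = ContextVar(
    "active_usage", default=None
)


@contextmanager
def track_usage() -> Iterator[UsageTotals]:
    """Collect usage reported via record_usage() while the block runs."""
    totals = UsageTotals()
    token = _active.set(totals)
    try:
        yield totals
    finally:
        # Restore whatever tracker (or None) was active before this block.
        _active.reset(token)


def record_usage(prompt_tokens: int, completion_tokens: int) -> None:
    """Called by an API wrapper after each response; no-op if untracked."""
    totals = _active.get()
    if totals is not None:
        totals.prompt_tokens += prompt_tokens
        totals.completion_tokens += completion_tokens
        totals.calls += 1


with track_usage() as totals:
    # Stand-ins for what a wrapped API call would report from its response.
    record_usage(120, 30)
    record_usage(80, 20)

print(totals.calls, totals.total_tokens)
```

The assumption here is that whatever code makes the API request calls `record_usage` with the counts from the response's usage field. With nested blocks, only the innermost tracker receives the counts.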