
Add explicit prompt/completion logging/attributes, skip tracing for synchronous components #240

Merged: 6 commits, Aug 22, 2023

Conversation

petersalas (Contributor) commented Aug 16, 2023

  1. Add explicit (conversational) prompt/completion logging and span attributes, including token counts (see the sketch below).
  2. Skip OpenTelemetry tracing for non-root synchronous components, which drastically simplifies the traces.
  3. Remove the universal `ai.jsx.result.tokenCount` span attribute, which has no way to handle conversational messages correctly.
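
A minimal sketch of what (1) could look like on a model-call span, using the standard `@opentelemetry/api` package. Only the `ai.jsx.prompt` attribute name is confirmed later in this thread; the `ai.jsx.completion` name and the message shape are assumptions for illustration:

```ts
import { trace } from '@opentelemetry/api';

// Hedged sketch: record the conversational prompt/completion, with
// per-message token counts, as attributes on the model-call span.
const tracer = trace.getTracer('ai.jsx');
const span = tracer.startSpan('OpenAIChatModel');
span.setAttribute(
  'ai.jsx.prompt', // attribute name confirmed in this thread
  JSON.stringify([{ type: 'user', text: 'What is the weather in Seattle?', tokenCount: 9 }])
);
span.setAttribute(
  'ai.jsx.completion', // assumed name, implied by the PR title
  JSON.stringify([{ type: 'assistant', text: 'It is raining.', tokenCount: 5 }])
);
span.end();
```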

petersalas requested a review from NickHeiner on August 16, 2023 at 20:58

petersalas (Contributor, Author) commented

@NickHeiner do you think this tracing/logging is good/comprehensive enough to remove the full-render logging/token-count attribute? Here's an example trace from the use-tools example.

NickHeiner (Contributor) commented

> @NickHeiner do you think this tracing/logging is good/comprehensive enough to remove the full-render logging/token-count attribute? Here's an example trace from the use-tools example.
>
> [image: example trace from the use-tools example]

Maybe I'm missing something, but where do we see how many tokens UseTools resolved to?

petersalas (Contributor, Author) commented

> Maybe I'm missing something, but where do we see how many tokens UseTools resolved to?

You can't -- my claim is that token counts are only meaningful in the context of a single model, so token counts are only available on the `<OpenAIChatModel>` spans. In particular, with things like context-window trimming, the relationship between what `<UseTools>` renders to and the underlying model calls is non-trivial. (Also, which tokenizer should be used? What if different models are used?)

That said, a better-defined thing we could do is aggregate token usage as reported by the descendant model calls, but that's a different beast. (My inclination is also to do that aggregation downstream rather than in the code itself.)
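
A hedged sketch of that downstream aggregation, operating on exported spans rather than inside AI.JSX itself; the exported-span shape and the message format are assumptions for illustration:

```ts
// Sum token usage reported by model-call spans in an exported trace.
interface ExportedSpan {
  name: string;
  attributes: Record<string, string>;
}

function totalTokens(spans: ExportedSpan[], attribute: string): number {
  return spans
    .filter((span) => span.name === 'OpenAIChatModel')
    .flatMap(
      (span) => JSON.parse(span.attributes[attribute] ?? '[]') as { tokenCount?: number }[]
    )
    .reduce((sum, message) => sum + (message.tokenCount ?? 0), 0);
}

// e.g. totalTokens(spans, 'ai.jsx.prompt') sums prompt tokens across all model calls
```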

NickHeiner (Contributor) commented

>> Maybe I'm missing something, but where do we see how many tokens UseTools resolved to?
>
> You can't -- my claim is that token counts are only meaningful in the context of a single model, so token counts are only available on the `<OpenAIChatModel>` spans. In particular, with things like context-window trimming, the relationship between what `<UseTools>` renders to and the underlying model calls is non-trivial. (Also, which tokenizer should be used? What if different models are used?)
>
> That said, a better-defined thing we could do is aggregate token usage as reported by the descendant model calls, but that's a different beast. (My inclination is also to do that aggregation downstream rather than in the code itself.)

Ok, great point that the tokenizer means the count is only meaningful within the context of a single model.

Broadly, the thing I'm trying to support is the ability to ask "how many tokens did I spend on docs vs API responses". My original solution was to give an output size for every component, but I agree with your points that there are some problems with that. Could we address that with your downstream aggregation suggestion?

NickHeiner (Contributor) left a comment

I disagree with some aspects of how this is being done but don't want to block on it.

Can you update https://docs.ai-jsx.com/guides/observability to reflect these changes?

And perhaps add one or two more unit tests?

petersalas (Contributor, Author) commented Aug 22, 2023

> Broadly, the thing I'm trying to support is the ability to ask "how many tokens did I spend on docs vs API responses". My original solution was to give an output size for every component, but I agree with your points that there are some problems with that. Could we address that with your downstream aggregation suggestion?

IIUC, that scenario is already addressed by the newly added `ai.jsx.prompt` attribute, which shows each conversation message given to the LLM along with its token count.
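
For example, a consumer could answer the "docs vs API responses" question by summing per-message token counts from `ai.jsx.prompt`. This is an illustrative sketch only: the message shape and the labeling heuristic are assumptions, not part of this PR:

```ts
type PromptMessage = { type: string; text: string; tokenCount: number };

// Partition prompt token spend by message source, given the JSON-serialized
// value of the ai.jsx.prompt attribute. The '[docs]' prefix is a hypothetical
// convention for tagging documentation messages.
function tokensBySource(promptAttribute: string): Map<string, number> {
  const totals = new Map<string, number>();
  for (const message of JSON.parse(promptAttribute) as PromptMessage[]) {
    const source = message.text.startsWith('[docs]') ? 'docs' : 'api';
    totals.set(source, (totals.get(source) ?? 0) + message.tokenCount);
  }
  return totals;
}
```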
