Mismatch to OpenAI's tokenizer? #29

Closed Answered by niieani
hitsthings asked this question in Q&A
The old OpenAI tokenizer playground used GPT-3, which used the p50k_base encoding. They've updated it now.
gpt-tokenizer's playground uses cl100k_base by default, which is compatible with GPT-3.5 and GPT-4.
Different tokenizer encoding, different result. Hope this helps!
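
To make the "different encoding, different result" point concrete, here is a minimal sketch using a toy BPE-style tokenizer. Everything in it is an assumption for illustration: `toyBpe` is a hypothetical helper (not part of gpt-tokenizer), and the merge tables `toyMergesA` and `toyMergesB` are made up, not the real p50k_base or cl100k_base vocabularies. It also simplifies real BPE by applying each merge rule once, in priority order.

```typescript
// Toy illustration only: the merge tables below are invented and are NOT
// the real p50k_base or cl100k_base vocabularies.
type Merge = [string, string]; // a pair of adjacent tokens to join

// Simplified BPE: start from single characters, then apply each merge rule
// in priority order across the whole token sequence.
function toyBpe(text: string, merges: Merge[]): string[] {
  let tokens: string[] = Array.from(text);
  for (const [left, right] of merges) {
    const next: string[] = [];
    let i = 0;
    while (i < tokens.length) {
      if (i + 1 < tokens.length && tokens[i] === left && tokens[i + 1] === right) {
        next.push(left + right); // join the matching pair into one token
        i += 2;
      } else {
        next.push(tokens[i]);
        i += 1;
      }
    }
    tokens = next;
  }
  return tokens;
}

// Two hypothetical "encodings" that learned different merges.
const toyMergesA: Merge[] = [["l", "o"], ["lo", "w"], ["e", "r"]];
const toyMergesB: Merge[] = [["w", "e"], ["we", "r"], ["l", "o"]];

const tokensA = toyBpe("lower", toyMergesA); // ["low", "er"]
const tokensB = toyBpe("lower", toyMergesB); // ["lo", "wer"]

console.log(tokensA, tokensB);
```

Both token lists join back to the same text, but the split points differ, so token IDs and counts from one encoding can't be compared against another. To reproduce a specific playground's output with gpt-tokenizer, you'd need to use the matching encoding; as far as I know the README documents per-encoding import paths, so check there for the exact form.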

Answer selected by niieani