tokenizers fork: Include upstream changes for platform dependent libs in CGO #33

Closed
gregfurman opened this issue Jul 16, 2024 · 3 comments

gregfurman commented Jul 16, 2024

Hey 👋

Thanks for this library. I'm currently using it to build some inference tooling plugins for a stream processor.

Can the knights-analytics/tokenizer fork be updated to include upstream PR 18, "Update to allow for platform dependent libs in CGO"?

I think this will make compilation a bit easier, especially for people like me on a Mac, where /usr/lib is a protected directory and having to include tokenizers_srcdir_relative is not really ideal when building.

https://github.com/daulet/tokenizers/blob/d9aff87d16f3db537ee005fb45ebca26049e7916/tokenizer.go#L6
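For context, here is a minimal sketch of how a build could point the linker at a custom library directory instead of /usr/lib, assuming the upstream cgo directives leave the search path configurable. `CGO_LDFLAGS` is the standard Go environment variable for extra linker flags; the helper names `withCGOLibDir` and `buildCommand` and the path `/opt/tokenizers/lib` are hypothetical and not part of either repository.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// withCGOLibDir returns a copy of env in which CGO_LDFLAGS also carries
// -L<libDir>, preserving any linker flags that were already set.
// (Hypothetical helper, for illustration only.)
func withCGOLibDir(env []string, libDir string) []string {
	const key = "CGO_LDFLAGS="
	flag := "-L" + libDir
	out := make([]string, 0, len(env)+1)
	existing := ""
	for _, kv := range env {
		if strings.HasPrefix(kv, key) {
			existing = strings.TrimPrefix(kv, key)
			continue
		}
		out = append(out, kv)
	}
	if existing != "" {
		flag = existing + " " + flag
	}
	return append(out, key+flag)
}

// buildCommand prepares (but does not run) `go build ./...` with the
// given library directory added to the cgo linker search path.
func buildCommand(libDir string) *exec.Cmd {
	cmd := exec.Command("go", "build", "./...")
	cmd.Env = withCGOLibDir(os.Environ(), libDir)
	return cmd
}

func main() {
	// Assumed location of libtokenizers.a; adjust to wherever it was built.
	cmd := buildCommand("/opt/tokenizers/lib")
	fmt.Println(cmd.Env[len(cmd.Env)-1])
}
```

The same effect can usually be had by exporting `CGO_LDFLAGS` in the shell before running `go build`; the helper only makes the idea explicit for tooling that drives the build programmatically.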

RJKeevil (Collaborator) commented Jul 16, 2024

Hi @gregfurman, yes, we are planning to move back to the base tokenizer project now that it has configurable paths. We just need to contribute one PR back to the repo first, as we rely on having offsets for some of our pipeline types. I'll update once that is in.

RJKeevil (Collaborator) commented

FYI, the PR is here: daulet/tokenizers#21

RJKeevil (Collaborator) commented

This change is now part of the v0.1.4 release
