
docs: update README on preparing quantized model #19

Closed
wants to merge 2 commits

Conversation

@krschacht (Contributor) commented Jul 2, 2024

docs: update description of preparing quantized model in usage section.

Corrects some outdated references to files within the llama.cpp repo and updates the example to use a smaller model.
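The preparation flow the README update describes can be sketched as shell commands. This is a minimal sketch, assuming the post-rename llama.cpp tool names current as of mid-2024 (`convert_hf_to_gguf.py`, `llama-quantize`); the model path, output filenames, and quantization type are illustrative assumptions, not taken from the PR diff:

```shell
# Sketch: preparing a quantized GGUF model with llama.cpp
# (illustrative paths; assumes a Hugging Face checkpoint already cloned locally)

# 1. Convert the Hugging Face checkpoint to GGUF.
#    In recent llama.cpp versions this script replaced the older convert.py.
python convert_hf_to_gguf.py path/to/model --outfile model-f16.gguf

# 2. Quantize the converted model; q4_0 is one example quantization type.
#    The binary was renamed from ./quantize to ./llama-quantize in mid-2024.
./llama-quantize model-f16.gguf model-q4_0.gguf q4_0
```

The resulting `model-q4_0.gguf` is the kind of file llama_cpp.rb would then load.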
@krschacht (Contributor Author)

@yoshoku I don't understand your commitlint. I get the line-length rule, but it's referring to some "subject" and "type" that elude me. You're welcome to reject this PR if you don't care about updating this; I was mostly keeping these notes for myself while trying to get your project working, since some of the references had changed within llama_cpp.

@yoshoku (Owner) commented Jul 3, 2024

@krschacht Thank you for your contribution. llama_cpp.rb adopts Conventional Commits (https://www.conventionalcommits.org/en/v1.0.0/). For example, the commit message for this change might be: docs: update description of preparing quantized model in usage section
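For readers unfamiliar with the convention: a Conventional Commits subject line starts with a type (such as `docs`, `feat`, or `fix`), an optional scope in parentheses, then a colon and a description. The check below is an illustrative approximation of what a commitlint setup enforces, not its exact rule set, and the type list is an assumption based on the common Angular-style configuration:

```shell
# Illustrative check: does a commit subject follow the Conventional Commits shape?
subject="docs: update description of preparing quantized model in usage section"

# Approximate type list from the common Angular-style convention
if echo "$subject" | grep -Eq '^(build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)(\([^)]+\))?!?: .+'; then
  echo "subject OK"
else
  echo "subject rejected"
fi
```

A subject like "Correct README.md" fails this check because it has no leading type and colon, which is the error commitlint was reporting.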

@krschacht changed the title from "Correct README.md" to "docs: update README on preparing quantized model" on Jul 3, 2024
@krschacht (Contributor Author) commented Jul 3, 2024

@yoshoku Ah, got it. Somehow I never knew about conventional commits! TIL :) I just updated the PR so hopefully it's ready to go.

BTW, do you have other plans for this project? I was very excited to find this. I found it while looking for the Ruby equivalent of Python's https://github.com/jncraton/languagemodels. I really like that that pip package uses CTranslate2 as the backend, and I haven't found a Ruby gem providing bindings for CTranslate2.

The one issue I ran into with your project was when I tried to load a model like LaMini-Flan-T5-248M. I'm new to this space, but apparently Llama is its own model architecture whereas T5 is a different one, so I can't use llama_cpp.rb to run a T5 model.

@yoshoku (Owner) commented Jul 4, 2024

@krschacht I wanted you to fix the git commit message, not the pull request description. But I understood the gist of the pull request, so I fixed the README with you as a co-author: 8edfd6d. I am going to close this pull request, but please do not take it personally.

@yoshoku closed this Jul 4, 2024
@krschacht (Contributor Author)

@yoshoku I don't mind at all! Linters... I'm not looking for points. :) I also run a project and I jump in to help get PRs over the line all the time. It's often easier.

I really am interested in whether you have other plans for this project. I was thinking about trying to create a version of what you did, but for CTranslate2. But maybe I'm wrong and your llama_cpp bindings can also work for T5 models? Anyway, I'm curious where you plan to take your project.

@yoshoku (Owner) commented Jul 5, 2024

@krschacht I think it would be a good idea to create bindings for CTranslate2, but I am pretty busy these days so I probably will not have time to do it.
llama.cpp only recently added support for the T5 architecture, so llama_cpp.rb does not yet support it: ggerganov/llama.cpp#8141. I plan to add bindings for newly added functions such as llama_model_has_encoder, but I cannot guarantee that the example scripts will also support the T5 architecture, as this depends on my free time.

@krschacht (Contributor Author)

I didn't realize llama.cpp had recently added support for T5. I found an interesting thread about the performance of CTranslate2 vs. llama.cpp, and it sounds like some of the performance optimizations in CTranslate2 are already in llama.cpp, so the performance may not be that different.

Anyway, I appreciate you creating this project and for sharing this insight. Consider me a motivated & interested "user" in case you ever need help with testing, bugs, implementing specific pieces, etc. I'll keep playing with llama_cpp.rb now that I have it all working!
