Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ollama and requests: support proper streaming #14

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

maxwell-bland
Copy link

Supports proper streaming of inputs via curl and incremental building of autocomplete suggestions for Ollama. This is necessary on lower-end or non-GPU machines. Additionally, kills previous curl calls to ensure that multiple curls will not be fired in a row during autocomplete, which kills the ollama server.

Needs testing on non-ollama services (I have no access).

Thanks!
Maxwell

Supports proper streaming of inputs via curl and incremental
building of autocomplete suggestions for Ollama. This is
necessary on lower-end or non-GPU machines. Additionally,
kills previous curl calls to ensure that multiple curls
will not be fired in a row during autocomplete, which kills
the ollama server.
@maxwell-bland
Copy link
Author

Update, this also has a back-end issue since Ollama's http server is stateful and does not handle canceling curl very well. Potentially the ollama backend should be reworked to use a pipe to ollama run directly.

ollama/ollama#3225

@tzachar
Copy link
Owner

tzachar commented Mar 20, 2024

@maxwell-bland
waiting for you to address my comments.
10x

@maxwell-bland
Copy link
Author

maxwell-bland commented Mar 21, 2024

@tzachar not sure they went through, the merge pull request (below) should highlight review changes?

I need to fix ollama's http server setup to add appropriate request cancellation so it doesn't get flooded, but do not have an abundance of time just yet, hopefully this weekend

@tzachar
Copy link
Owner

tzachar commented Mar 24, 2024

@maxwell-bland
You did not push any more changes to your branch.

@maxwell-bland
Copy link
Author

maxwell-bland commented Mar 24, 2024

@tzachar sorry if it was not clear: your comments did not get posted, as far as I can see. What is your opinion on the change? Were there some comments in the code that I missed?

I think it was fine to use curl in the short term, but it would probably be better to switch it to a subprocess that works via pipes since it would reduce latency and difficulties in managing requests. I will update this commit once I have some more time.

I looked at adding timeout/session management to ollama yesterday but it will be a bit of a slog.

@tzachar
Copy link
Owner

tzachar commented Mar 25, 2024

@maxwell-bland you can see my comments above in this thread, with a pending badge.

Copy link
Author

@maxwell-bland maxwell-bland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a comment

@maxwell-bland
Copy link
Author

@tzachar apologies, I see no "view reviewed changes" similar to my own comment above.

potentially it is just github being unintuitive. Maybe you could leave your changes in a direct comment on the pull request? I checked through the inbox on the site, etc, and cannot find the comments anywhere

@tzachar
Copy link
Owner

tzachar commented Mar 28, 2024

All my comments are included inside gitlab's review process.
you can access it from the top of the pull request (https://github.com/tzachar/cmp-ai/pull/14/files/f4877c51e2c4354dea5eb3ffa95530b481b7c8d3).
search for "view reviewed changes"

@alexandradeas
Copy link

alexandradeas commented Apr 9, 2024

@tzachar if the comments show as pending that means they haven't been sent yet. If you write comments as part of a PR review (and not just as individual comments) you need to give a decision on the PR before they're posted - either Accept/Request Changes/Comment I believe.

There's a guide here about the process https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/reviewing-proposed-changes-in-a-pull-request

@@ -44,10 +46,30 @@ function Service:Get(url, headers, data, cb)
args[#args + 1] = h
end

job
if lastjob ~= nil then
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not use the job:shutdown method here? using kill means it will only work on linux/mac

options = self.params.options,
}
local new_data = {}
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems off.
You redefine it inside the callback later (line 35).

@tzachar
Copy link
Owner

tzachar commented Apr 10, 2024

Thanks @alexandradeas
My bad.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants