ollama and requests: support proper streaming #14
base: main
Conversation
Supports proper streaming of responses via curl and incremental building of autocomplete suggestions for Ollama. This is necessary on lower-end or non-GPU machines, where waiting for the full completion is too slow. Additionally, kills the previous curl call before starting a new one, so that autocomplete cannot fire multiple curls in a row, which brings down the ollama server.
Update: this also has a back-end issue, since Ollama's HTTP server is stateful and does not handle cancelled curl requests very well. Potentially the ollama backend should be reworked to use a pipe to a long-lived subprocess instead.
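For illustration, here is a minimal sketch of the incremental-streaming approach described above, driving curl through plenary.nvim's Job against Ollama's streaming /api/generate endpoint. The URL, model name, and the `on_suggestion` callback are assumptions for the example, not code from this PR:

```lua
local Job = require('plenary.job')

-- Sketch: stream Ollama's newline-delimited JSON output through curl and
-- grow the suggestion chunk by chunk instead of waiting for completion.
local function stream_generate(prompt, on_suggestion)
  local suggestion = ''
  local job = Job:new({
    command = 'curl',
    args = {
      '--no-buffer', '-s', -- flush chunks to stdout as they arrive
      'http://localhost:11434/api/generate',
      '-d', vim.json.encode({ model = 'codellama', prompt = prompt, stream = true }),
    },
    on_stdout = function(_, line)
      -- each streamed line is one JSON object: { "response": "...", "done": false }
      local ok, chunk = pcall(vim.json.decode, line)
      if ok and chunk.response then
        suggestion = suggestion .. chunk.response
        -- plenary callbacks run off the main loop; schedule any UI work
        vim.schedule(function() on_suggestion(suggestion, chunk.done) end)
      end
    end,
  })
  job:start()
  return job
end
```

Because each line is a standalone JSON object, the suggestion can be surfaced as soon as the first chunks arrive rather than only after the whole completion finishes.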
@maxwell-bland
@tzachar not sure they went through; the merge pull request section (below) should highlight review changes? I need to fix ollama's HTTP server setup to add appropriate request cancellation so it doesn't get flooded, but I do not have an abundance of time just yet. Hopefully this weekend.
@maxwell-bland
@tzachar sorry if it was not clear: your comments did not get posted, as far as I can see. What is your opinion on the change? Were there some comments in the code that I missed? I think it was fine to use curl in the short term, but it would probably be better to switch to a subprocess that works via pipes, since that would reduce latency and the difficulty of managing requests. I will update this commit once I have some more time. I looked at adding timeout/session management to ollama yesterday, but it will be a bit of a slog.
@maxwell-bland you can see my comments above in this thread, with a "pending" marker.
This is a comment
@tzachar apologies, I see no "view reviewed changes" similar to my own comment above; potentially it is just GitHub being unintuitive. Maybe you could leave your changes in a direct comment on the pull request? I checked through the inbox on the site, etc., and cannot find the comments anywhere.
All my comments are included inside GitHub's review process.
@tzachar if the comments show as pending, that means they haven't been sent yet. If you write comments as part of a PR review (and not just as individual comments), you need to give a decision on the PR before they're posted - either Approve/Request Changes/Comment, I believe. There's a guide here about the process: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/reviewing-proposed-changes-in-a-pull-request
```diff
@@ -44,10 +46,30 @@ function Service:Get(url, headers, data, cb)
     args[#args + 1] = h
   end

+  if lastjob ~= nil then
```
Why not use the job:shutdown method here? Using kill means it will only work on Linux/Mac.
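For comparison, a sketch of the cross-platform variant this comment suggests; `lastjob` comes from the diff above, and the exit code and signal values are assumptions:

```lua
-- Sketch: cancel the previous in-flight request with plenary's
-- Job:shutdown(code, signal), which goes through libuv rather than
-- shelling out to `kill`, so it is not limited to Linux/Mac.
-- SIGTERM (15) is an assumed choice of signal here.
if lastjob ~= nil then
  lastjob:shutdown(0, 15)
  lastjob = nil
end
```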
```diff
   options = self.params.options,
 }
+local new_data = {}
```
This seems off. You redefine it inside the callback later (line 35).
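A hedged illustration of the shadowing being pointed out here, with the surrounding code assumed rather than taken from the diff:

```lua
local Job = require('plenary.job')

local new_data = {} -- outer accumulator (the flagged line)

local job = Job:new({
  command = 'curl', -- placeholder command for the illustration
  on_stdout = function(_, line)
    -- `local` here creates a fresh, unrelated table on every callback,
    -- shadowing the outer new_data, so nothing ever accumulates in it:
    local new_data = {}
    new_data[#new_data + 1] = line
    -- dropping `local` (or renaming one of the two variables) would let
    -- the callback append into the outer table instead
  end,
})
```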
Thanks @alexandradeas |
Needs testing on non-ollama services (I have no access).
Thanks!
Maxwell