Support for "Online" LLM #100

cpursley · 2024-04-10T13:56:12Z

Maybe out of scope, but you know what would be cool: doing this sort of thing (a Perplexity clone, i.e., an "online" live LLM):

- https://news.ycombinator.com/item?id=39923404
- "Please add your ideas - infrastructure planning thread": nilsherzig/LLocalSearch#38 (architecture)
- https://github.com/searxng/searxng (open source search aggregator)

Basically a RAG approach that knows when to run a live search through the searxng aggregator. Perhaps combined with pgvector and Bumblebee (to harness the GPU) for a true open source setup. And maybe hooked up to Crawly to fetch the page content in parallel.

Could make for an interesting open source Fly setup (as all the services could be run there).

Or a simpler version using the Bing API: #48

brainlid (Maintainer) · 2024-04-18T01:21:52Z

I think it's a cool idea.

0 replies

cpursley (Author) · 2024-04-19T12:51:39Z

Just came across this interesting project:

0 replies

cpursley (Author) · 2024-04-24T16:00:22Z

I'm thinking through approaches for building "websearch". Perhaps as another tool (like the calculator), with a swappable search engine adapter (Bing API, Brave, searxng, etc.)?

You could specify via config how many results it should fetch in parallel. But it's unclear how best to handle the next step: semantically chunk the HTML and vectorize it in memory via Nx? Or dump it into Postgres with pgvector?
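To make the adapter idea concrete, here is a minimal Python sketch of a swappable search engine interface with a configurable result count and parallel page fetching. Every name in it (`SearchEngine`, `StaticEngine`, `WebSearchConfig`, `web_search`) is hypothetical and not part of any existing library; `StaticEngine` returns canned results in place of a real Bing, Brave, or searxng call, and the page fetcher is passed in as a plain function so the sketch needs no network access.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Protocol


@dataclass
class SearchResult:
    title: str
    url: str


class SearchEngine(Protocol):
    """Anything with a search(query, limit) method can act as an adapter."""

    def search(self, query: str, limit: int) -> list[SearchResult]: ...


class StaticEngine:
    """Stand-in adapter with canned results (hypothetical, for illustration)."""

    def __init__(self, results: list[SearchResult]) -> None:
        self._results = results

    def search(self, query: str, limit: int) -> list[SearchResult]:
        return self._results[:limit]


@dataclass
class WebSearchConfig:
    engine: SearchEngine
    max_results: int = 3  # how many search results to use
    max_workers: int = 3  # how many pages to fetch at the same time


def web_search(
    query: str,
    config: WebSearchConfig,
    fetch: Callable[[str], str],
) -> list[tuple[SearchResult, str]]:
    """Run the search, then fetch every result page concurrently."""
    results = config.engine.search(query, config.max_results)
    with ThreadPoolExecutor(max_workers=config.max_workers) as pool:
        # pool.map keeps the pages in the same order as the results.
        pages = list(pool.map(fetch, (r.url for r in results)))
    return list(zip(results, pages))


engine = StaticEngine([
    SearchResult("A", "https://example.com/a"),
    SearchResult("B", "https://example.com/b"),
    SearchResult("C", "https://example.com/c"),
])
config = WebSearchConfig(engine=engine, max_results=2, max_workers=2)


def fake_fetch(url: str) -> str:
    # Placeholder for a real HTTP request.
    return f"<html>content of {url}</html>"


hits = web_search("elixir langchain", config, fake_fetch)
```

Swapping engines then means passing a different object with the same `search` method; the rest of the pipeline does not change.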

Then there are some other things that might need to be done for better reliability, such as the grader and hallucination checker described in this video: "Reliable, fully local RAG agents with LLaMA3".

There are also some interesting ideas here: https://docs.tavily.com/blog/building-openai-assistant

Thoughts?

7 replies

cpursley Apr 27, 2024
Author

Regarding the web results, I'd think that typical web pages (even just the useful parts of the text) would often end up being too many tokens if dumped right into the LLM context?

brainlid Apr 27, 2024
Maintainer

You may be right. I was finally able to watch the YouTube video "Reliable, fully local RAG agents with LLaMA3" that you mentioned. It's cool. And yes, in that example the web search results were vectorized and searched that way.

cpursley Jun 15, 2024
Author

I personally don't think the web results need to be vectorized, just scraped and returned to the LLM for analysis.

I ended up having pretty good results with this approach using a larger context window model (Phi 3). Here's a little gist: https://gist.github.com/cpursley/b4af2ff3b56c912f659bd5300e422790#file-web_search-ex - the interesting part is probably the prompts.
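As a rough illustration of the scrape-and-return approach, the Python sketch below strips scraped HTML down to visible text and packs it into a single prompt under a token budget. It is not the code or the prompts from the gist: the function names, the prompt wording, and the 4-characters-per-token estimate are all assumptions made for the example, and real tokenizers vary by model.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects visible text and skips <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def html_to_text(html: str) -> str:
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def build_prompt(question, pages, token_budget, chars_per_token=4):
    """Build one prompt from (url, html) pairs.

    Each page gets an equal share of the budget and is truncated to fit.
    chars_per_token is a crude assumption, not a real tokenizer.
    """
    if not pages:
        return f"Question: {question}"
    per_page_chars = (token_budget // len(pages)) * chars_per_token
    sections = [
        f"Source: {url}\n{html_to_text(html)[:per_page_chars]}"
        for url, html in pages
    ]
    return (
        "Answer the question using only the sources below.\n\n"
        + "\n\n".join(sections)
        + f"\n\nQuestion: {question}"
    )


sample_html = (
    "<html><head><style>p{color:red}</style></head>"
    "<body><p>Hello</p><script>x=1</script><p>world</p></body></html>"
)
sample_text = html_to_text(sample_html)
sample_prompt = build_prompt("q", [("u", "<p>" + "a" * 100 + "</p>")], token_budget=10)
```

An equal split per page is the simplest policy; a larger context window only raises `token_budget`, it does not remove the need for one.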

What are your thoughts on creating some kind of WebSearch or WebSummarizer Chain?

brainlid Jun 15, 2024
Maintainer

@cpursley, from what I've read or seen, I imagine it as a tool that's available to be called within a chain. And yes, I would like to have it. 🙂

cpursley Jun 16, 2024
Author

Cool. I need to refine it a good bit and make the search engine swappable, as well as the LLM. Also, maybe add a text chunk vector retrieval option (instead of using LLMs to parse and summarize).
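One way the text chunk retrieval option could look is sketched below in Python: split the scraped text into overlapping word chunks, score each chunk against the query, and keep only the best ones. To stay self-contained, the sketch uses bag-of-words counts with cosine similarity as a stand-in for real embeddings; a real setup would presumably use an embedding model (for example via Bumblebee) and possibly pgvector. All function names and the chunk sizes are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text: str, chunk_words: int = 50, overlap: int = 10) -> list[str]:
    """Split text into chunks of chunk_words words that share overlap words."""
    if overlap >= chunk_words:
        raise ValueError("overlap must be smaller than chunk_words")
    words = text.split()
    if not words:
        return []
    step = chunk_words - overlap
    stop = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_words]) for i in range(0, stop, step)]


def embed(text: str) -> Counter:
    """Bag-of-words term counts; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def top_chunks(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


docs = [
    "Elixir runs on the BEAM virtual machine",
    "Postgres supports vector search through pgvector",
    "Bananas are rich in potassium",
]
best = top_chunks("vector search in Postgres", docs, k=1)
numbered = chunk_text(" ".join(str(i) for i in range(120)), chunk_words=50, overlap=10)
```

Only the selected chunks would then be sent to the LLM, which trades an extra retrieval step for a much smaller prompt.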
