Running neThing on a locally hosted model, perhaps using CRAG (Corrective Retrieval-Augmented Generation)

I came across this project (which has an accompanying tutorial) on running Corrective RAG apps on locally hosted LLM models.

Combining what I know about LangChain's and LlamaIndex's respective programmatic functionalities (i.e. structured bundling of the query + desired output format in the prompt), the part at 12:08, where a JSON output with a binary 1 or 0 grades the relevance of the retrieved component, seems like a paradigm-shifting aspect: a surprisingly simple mechanism that gives a RAG agent n≥1 'shots on goal', so to speak, at selecting the right reference documents and thereby providing a much better contextualized answer.
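For anyone curious, the grading step can be sketched roughly like this: prompt the model to emit a JSON object with a binary score, then parse it to filter the retrieved chunks. Note the `fake_llm`, `grade_chunk`, and `filter_retrieved` names (and the keyword heuristic standing in for a real model) are my own placeholders, not the tutorial's code; in practice the LLM call would go through LangChain/ollama.

```python
import json

# Hypothetical stand-in for a local LLM call. In a real CRAG pipeline this
# would be a LangChain/ollama chat model returning a string like '{"score": 1}'.
def fake_llm(prompt: str) -> str:
    # Trivial keyword-overlap heuristic so the sketch runs without a model.
    question, chunk = prompt.split("\n---\n")
    relevant = any(w in chunk.lower() for w in question.lower().split())
    return json.dumps({"score": 1 if relevant else 0})

def grade_chunk(question: str, chunk: str, llm=fake_llm) -> bool:
    """Ask the model for a binary relevance score and parse the JSON reply."""
    prompt = f"{question}\n---\n{chunk}"
    try:
        return json.loads(llm(prompt)).get("score") == 1
    except json.JSONDecodeError:
        return False  # unparseable model output counts as irrelevant

def filter_retrieved(question: str, chunks: list[str]) -> list[str]:
    """Keep only the retrieved chunks the grader scored as relevant."""
    return [c for c in chunks if grade_chunk(question, c)]

docs = ["LangChain supports structured output.", "Bananas are yellow."]
print(filter_retrieved("structured output in LangChain", docs))
```

The key design point is that the binary JSON score gives you a machine-checkable decision: anything graded 0 (or unparseable) is dropped before generation, which is what opens the door to re-retrieval or query rewriting.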

Wondering what any of y'all might think of this?

Do we know whether these open-source, 'small'-parameter models can generate code comparably to the mainstream tools?

re: local models that are good at codegen, see this:

there are currently about 4-5 that are better than gpt-3.5-turbo. a few can even be run on ollama.

re: the “shots on goal”, i am currently implementing a retry loop for when the code doesn’t compile.
