So, You Want To Improve Your RAG Pipeline
Ways to go from prototype to production with LlamaIndex
Update: the actual implementation of how to improve the RAG pipeline with different indexing techniques can be found in my follow-up post:
LLMs are a fantastic innovation, but they have one major flaw: their knowledge is frozen at a training cut-off date, and they have a tendency to make up facts out of thin air. The danger is that LLMs always sound confident in their responses, and it often takes only a small tweak to the prompt to fool them.
Retrieval-Augmented Generation (RAG) exists to address this issue. RAG makes LLMs significantly more useful by providing factual context for them to draw on when answering queries.

With a few lines of code and the quick-start guide of a framework like LlamaIndex, anyone can build a chatbot that chats with their private documents, or even an entire agent capable of searching the internet.
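To make that concrete, here is a minimal sketch of the kind of quick-start those tutorials walk you through. It assumes the `llama_index` package with its `core` module layout and a local `./data` folder of documents; both are illustrative assumptions, not something from the original post:

```python
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Load every document from a local folder (the path is an assumption).
documents = SimpleDirectoryReader("data").load_data()

# Build an in-memory vector index using the default embedding model.
index = VectorStoreIndex.from_documents(documents)

# Ask a question against the indexed documents.
query_engine = index.as_query_engine()
print(query_engine.query("What do these documents say?"))
```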
BUT
You will never have a production-ready system by following the quick-start guide alone.
Those five lines of code will not result in a very functional bot. RAG is simple to prototype but difficult to bring to production, that is, to the point where customers would find it satisfactory. After a short tutorial, RAG might operate at an okay level, but bridging the gap to real production grade usually takes considerable testing and strategy. Best practices are still being developed and can change based on the use case, yet finding them is worthwhile, whether that means different indexing techniques, different embedding algorithms, or swapping the LLM itself.
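As an example of the kind of knob you can turn, here is a hedged sketch of swapping the default embedding model and LLM in LlamaIndex. The specific model names and the `llama_index.embeddings.huggingface` / `llama_index.llms.openai` integration packages are assumptions for illustration, not a recommendation from this post:

```python
from llama_index.core import Settings
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.openai import OpenAI

# Swap the default embedding algorithm (model name is an assumption).
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

# Swap the default LLM (model choice is an assumption).
Settings.llm = OpenAI(model="gpt-4o-mini", temperature=0)

# Any index or query engine built after this point uses the new components.
```

Changing either component changes both retrieval quality and answer quality, which is why this kind of experimentation is where much of the production-hardening effort goes.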
In this post, I will discuss how to raise the quality of RAG systems. It is aimed at RAG builders who want to bridge the performance gap between entry-level setups and production-level performance.
There are three stages in the RAG pipeline:
- Indexing Stage
- Querying Stage