Retrieval-Augmented Generation (RAG) is a technique that enhances language models by integrating them with search capabilities. The model uses data fetched by a search to inform its responses, combining its own capabilities with external data sources. In 2023, RAG emerged as a leading framework for building applications such as question-answering services that draw on web search and numerous "chat with your data" apps. This trend also revived interest in vector search, leading to startups that build vector database solutions on open-source search indices such as Faiss and add features for better data handling.
This collection of notebooks implements some of the more advanced RAG techniques.
The basic process of RAG involves breaking down texts into smaller pieces, converting these pieces into vectors using a transformer encoder, and storing these vectors in a searchable index. When a query is received, it's also turned into a vector using the same encoding process. The system then searches this vector against the index to find the most relevant text chunks, which are provided to the LLM as context for generating an answer. This method enables the model to produce responses that are informed by a broader context, making the answers more relevant and accurate.
The basic RAG process involves three main steps:
This approach typically yields okay results, but we can do much better. That's why it's sometimes called "naive" RAG.
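The naive pipeline described above (chunk, encode, index, retrieve, build a prompt) can be sketched in a few lines. This is only an illustration under loud assumptions: `embed` is a toy bag-of-words counter standing in for a real transformer encoder, the "index" is a plain Python list rather than a vector store such as Faiss, and every function name here is hypothetical. The final LLM call is left out; the sketch stops at the prompt that would be sent to the model.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=12):
    """Split text into chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text):
    """ASSUMPTION: toy stand-in for a transformer encoder (word-count vector)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents, chunk_size=12):
    """Chunk every document and store (chunk, vector) pairs in a list."""
    return [
        (chunk, embed(chunk))
        for doc in documents
        for chunk in chunk_text(doc, chunk_size)
    ]


def retrieve(query, index, top_k=2):
    """Encode the query the same way as the chunks and return the closest chunks."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(query, chunks):
    """Assemble the context-augmented prompt that would be passed to the LLM."""
    context = "\n".join(f"- {chunk}" for chunk in chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    )


documents = [
    "Faiss is an open-source library for similarity search over dense vectors.",
    "Bread dough needs time to rise before it goes into the oven.",
]
index = build_index(documents)
query = "Which library does similarity search over vectors?"
top_chunks = retrieve(query, index, top_k=1)
prompt = build_prompt(query, top_chunks)
print(prompt)
```

In a real system the word-count vectors would be replaced by dense embeddings and the linear scan by an approximate nearest-neighbor index, but the control flow stays the same.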
Various schemes have been developed to improve the relevance of retrieved documents and the quality of generated responses. Advanced RAG techniques typically try to improve the three qualities sometimes called the "RAG Triad": context relevance, groundedness, and answer relevance.
Source: https://www.trulens.org/trulens_eval/getting_started/core_concepts/rag_triad/
Advanced RAG techniques can involve more sophisticated retrieval mechanisms, better ways of encoding context, and more effective methods for generating responses. These improvements can significantly enhance the performance of RAG systems, making them more accurate and reliable in a wide range of applications.
Source: https://pub.towardsai.net/advanced-rag-techniques-an-illustrated-overview-04d193d8fec6