Retrieval Augmented Generation (RAG) with agents is currently one of the hottest topics in generative AI.
The basic idea of RAG is to connect large language models with search technology. The search technology is used to retrieve information that is relevant to a given conversation, which the large language model can then exploit when generating a natural language response. Though the idea is straightforward, many decisions have to be made when it comes to its implementation.
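The retrieve-then-generate idea can be sketched in a few lines. This is a minimal, illustrative pipeline: the word-overlap retriever and the `generate` stub (which only assembles the augmented prompt) are placeholders for a real search backend and a real LLM call.

```python
def retrieve(query, documents, k=1):
    """Rank documents by word overlap with the query and return the top k.

    A toy stand-in for a real retriever (e.g. BM25 or dense embeddings).
    """
    q_words = set(query.lower().split())
    scored = sorted(
        documents,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]


def generate(query, context):
    """Placeholder for an LLM call: builds the augmented prompt only."""
    return f"Context: {' '.join(context)}\nQuestion: {query}\nAnswer:"


docs = [
    "RAG combines retrieval with language model generation.",
    "GPUs accelerate deep learning workloads.",
]
query = "What does RAG combine?"
prompt = generate(query, retrieve(query, docs))
```

In a deployed system, each of these steps (chunking, retriever choice, prompt construction, model selection) is one of the implementation decisions mentioned above.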
For research on RAG, it is hence critical to be able to measure the performance of a RAG system. In this project, we study the state of the art in RAG evaluation, deploy RAG systems with different settings on our GPU cluster, and compare their performance on various benchmark datasets and with respect to their user experience.
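One common building block of such evaluations, sketched here as an illustration (the function name and data layout are our own, not from a specific benchmark), is recall@k for the retrieval component: the fraction of queries for which at least one relevant document appears among the top-k retrieved results.

```python
def recall_at_k(retrieved, relevant, k=5):
    """Fraction of queries with at least one relevant doc in the top k.

    retrieved: list of ranked document-id lists, one per query.
    relevant:  list of sets of relevant document ids, one per query.
    """
    hits = sum(
        1
        for ranked, rel in zip(retrieved, relevant)
        if any(doc_id in rel for doc_id in ranked[:k])
    )
    return hits / len(retrieved)


# Two queries: the first retrieves a relevant doc in its top 3, the second does not.
retrieved = [["d1", "d2", "d3"], ["d4", "d5", "d6"]]
relevant = [{"d2"}, {"d9"}]
score = recall_at_k(retrieved, relevant, k=3)  # 0.5
```

Generation quality is typically measured separately, e.g. with answer-overlap metrics or LLM-as-judge protocols, so that retrieval and generation errors can be attributed to the right component.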