A Tale of Trust and Accuracy: Base vs. Instruct LLMs in RAG Systems

Bibliographic Details
Title:	A Tale of Trust and Accuracy: Base vs. Instruct LLMs in RAG Systems
Authors:	Cuconasu, Florin; Trappolini, Giovanni; Tonellotto, Nicola; Silvestri, Fabrizio
Publication Year:	2024
Collection:	Computer Science
Subject Terms:	Computer Science - Computation and Language, Computer Science - Information Retrieval
Description:	Retrieval Augmented Generation (RAG) represents a significant advancement in artificial intelligence, combining a retrieval phase with a generative phase, with the latter typically being powered by large language models (LLMs). The current common practices in RAG involve using "instructed" LLMs, which are fine-tuned with supervised training to enhance their ability to follow instructions and are aligned with human preferences using state-of-the-art techniques. Contrary to popular belief, our study demonstrates that base models outperform their instructed counterparts in RAG tasks by 20% on average under our experimental settings. This finding challenges the prevailing assumptions about the superiority of instructed LLMs in RAG applications. Further investigations reveal a more nuanced situation, questioning fundamental aspects of RAG and suggesting the need for broader discussions on the topic; or, as Fromm would have it, "Seldom is a glance at the statistics enough to understand the meaning of the figures".
Document Type:	Working Paper
Open Access:	http://arxiv.org/abs/2406.14972
Accession Number:	edsarx.2406.14972
Database:	arXiv

