Retrieval-augmented generation

2024-09-08 (modified: 2025-04-08)
Alias: RAG

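One of the techniques for improving the quality of large language model (LLM) answers. Information related to the user's original prompt is appended to that prompt to “augment” it, and the model then generates its answer based on the augmented prompt.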
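As a rough illustration of the "augment, then generate" idea, here is a minimal sketch. It assumes a toy word-overlap retriever instead of an embedding model and vector database, and it stops at building the augmented prompt rather than calling any LLM. The names `retrieve` and `augment_prompt` and the prompt wording are hypothetical, not taken from any library.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(query, documents, k=2):
    """Toy retriever (assumption): rank documents by word overlap with the query."""
    query_tokens = tokenize(query)
    ranked = sorted(
        documents,
        key=lambda doc: len(query_tokens & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]


def augment_prompt(query, documents, k=2):
    """Attach the top-k retrieved documents to the original prompt."""
    context = "\n".join(f"- {doc}" for doc in retrieve(query, documents, k))
    return (
        "Use the following context to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


docs = [
    "RAG augments a prompt with retrieved documents.",
    "Bananas are rich in potassium.",
    "A retriever selects documents relevant to the query.",
]
query = "How does RAG use retrieved documents?"
prompt = augment_prompt(query, docs, k=1)
print(prompt)  # the augmented prompt would then be sent to an LLM
```

A real system would likely replace the word-overlap scoring with embedding similarity or BM25, but the shape stays the same: retrieve, attach to the prompt, generate.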

Articles

2025-03-19 - Introducing RAG 2.0 - Contextual AI
- “A typical RAG system today uses a frozen off-the-shelf model for embeddings, a vector database for retrieval, and a black-box language model for generation, stitched together through prompting or an orchestration framework. This leads to a “Frankenstein’s monster” of generative AI: the individual components technically work, but the whole is far from optimal. These systems are brittle, lack any machine learning or specialization to the domain they are being deployed to, require extensive prompting, and suffer from cascading errors. As a result, RAG systems rarely pass the production bar.”
- “The RAG 2.0 approach pretrains, fine-tunes, and aligns all components as a single integrated system, backpropagating through both the language model and the retriever to maximize performance.”
2024-11-05 - HtmlRAG: HTML is Better Than Plain Text for Modeling Retrieved Knowledge in RAG Systems
2024-09-20 - Introducing Contextual Retrieval (Anthropic)
- “A method that dramatically improves the retrieval step in RAG”
- “This method can reduce the number of failed retrievals by 49% and, when combined with reranking, by 67%.”
- “The company’s revenue grew by 3% over the previous quarter.” → “This chunk is from an SEC filing on ACME corp’s performance in Q2 2023; the previous quarter’s revenue was $314 million. The company’s revenue grew by 3% over the previous quarter.”
- How is the context attached to each chunk? Claude Haiku + Prompt Caching.
2024-08-19 - The RAG Playbook - jxnl.co
2020 - Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks - the paper that first introduced RAG
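
The chunk-contextualizing step described in the Contextual Retrieval entry above can be sketched roughly as follows. This is only an illustration under stated assumptions: `contextualize_chunks` and `fake_generate_context` are hypothetical names, and the context generator is a stub standing in for an LLM call (the entry mentions Claude Haiku), so no real API is used here.

```python
from typing import Callable, List


def contextualize_chunks(
    document: str,
    chunks: List[str],
    generate_context: Callable[[str, str], str],
) -> List[str]:
    """Prepend a short, document-aware context to each chunk before indexing.

    `generate_context(document, chunk)` is a placeholder for an LLM call
    (assumption); it should return a sentence situating the chunk in the document.
    """
    contextualized = []
    for chunk in chunks:
        context = generate_context(document, chunk).strip()
        contextualized.append(f"{context} {chunk}" if context else chunk)
    return contextualized


def fake_generate_context(document: str, chunk: str) -> str:
    """Stub that returns the example context quoted in the entry above."""
    return (
        "This chunk is from an SEC filing on ACME corp's performance in Q2 2023; "
        "the previous quarter's revenue was $314 million."
    )


chunks = ["The company's revenue grew by 3% over the previous quarter."]
contextualized = contextualize_chunks("<full filing text>", chunks, fake_generate_context)
print(contextualized[0])
```

Because the same full document is passed for every chunk, the document part of the request repeats across calls, which is presumably why the entry pairs the model with prompt caching.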

Tools

github.com/bhavnicksm/chonkie
- “The no-nonsense RAG chunking library that’s lightweight, lightning-fast, and ready to CHONK your texts”
Introducing AutoRAG: fully managed Retrieval-Augmented Generation on Cloudflare
