<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Reranking on Chen Kai Blog</title><link>https://www.chenk.top/en/tags/reranking/</link><description>Recent content in Reranking on Chen Kai Blog</description><generator>Hugo</generator><language>en</language><lastBuildDate>Fri, 03 Apr 2026 09:00:00 +0000</lastBuildDate><atom:link href="https://www.chenk.top/en/tags/reranking/index.xml" rel="self" type="application/rss+xml"/><item><title>LLM Engineering (8): Retrieval-Augmented Generation</title><link>https://www.chenk.top/en/llm-engineering/08-rag/</link><pubDate>Fri, 03 Apr 2026 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/llm-engineering/08-rag/</guid><description>&lt;p>RAG is the most over-deployed and under-engineered pattern in LLM applications. The 2024 demo loop — embed everything with &lt;code>text-embedding-3-large&lt;/code>, dump into pgvector, top-5 cosine — works for 1000 documents and a forgiving demo. It does not survive 100K real documents and a customer who notices when the answer is wrong. This chapter is what I wish more teams knew before they built their second generation of RAG.&lt;/p>
&lt;p>The original RAG paper (&lt;a href="https://arxiv.org/abs/2005.11401" target="_blank" rel="noopener noreferrer">Lewis et al., 2020 &lt;span aria-hidden="true" style="font-size:0.75em; opacity:0.55; margin-left:2px;">↗&lt;/span>&lt;/a>
) framed retrieval-augmented generation as a hybrid model: a dense retriever (DPR) trained jointly with a generator (BART), so that retrieval was optimized for end-task accuracy. Production RAG in 2026 doesn&amp;rsquo;t look much like the system Lewis et al. built: modern systems use frozen pre-trained embedders, separate rerankers, and decoder-only generators that are not trained against the retriever. But the core insight (keep knowledge in a retrievable store, separate from the model that reasons over it) survived and became the dominant paradigm. The &lt;a href="https://arxiv.org/abs/2312.10997" target="_blank" rel="noopener noreferrer">Gao et al. (2023) RAG survey &lt;span aria-hidden="true" style="font-size:0.75em; opacity:0.55; margin-left:2px;">↗&lt;/span>&lt;/a>
 is the best comprehensive overview of the post-2020 evolution into &amp;ldquo;Naive RAG → Advanced RAG → Modular RAG.&amp;rdquo;&lt;/p></description></item></channel></rss>