<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>SFT on Chen Kai Blog</title><link>https://www.chenk.top/en/tags/sft/</link><description>Recent content in SFT on Chen Kai Blog</description><generator>Hugo</generator><language>en</language><lastBuildDate>Mon, 30 Mar 2026 09:00:00 +0000</lastBuildDate><atom:link href="https://www.chenk.top/en/tags/sft/index.xml" rel="self" type="application/rss+xml"/><item><title>LLM Engineering (4): Post-training — SFT, DPO, RLHF, RLAIF</title><link>https://www.chenk.top/en/llm-engineering/04-post-training/</link><pubDate>Mon, 30 Mar 2026 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/llm-engineering/04-post-training/</guid><description>&lt;p>A base model fresh from pretraining can complete text, but it does not reliably follow instructions, refuse harmful requests, or maintain a persona; those behaviors come from post-training. Post-training is also where the gap between a research paper&amp;rsquo;s claims and a production-grade model lies. This chapter covers what each post-training algorithm optimizes, why most reward models are subtly flawed, and which methods are effective in 2026.&lt;/p>
&lt;p>&lt;figure class="article-figure">
 &lt;img src="https://blog-pic-ck.oss-cn-beijing.aliyuncs.com/posts/en/llm-engineering/04-post-training/illustration_1.png" alt="LLM Engineering (4): Post-training — SFT, DPO, RLHF, RLAIF — Chapter overview" loading="lazy" decoding="async" class="content-image">
 
&lt;/figure>
&lt;/p></description></item><item><title>Aliyun PAI (3): PAI-DLC — Distributed Training Without the Cluster Pain</title><link>https://www.chenk.top/en/aliyun-pai/03-pai-dlc-distributed-training/</link><pubDate>Sat, 07 Mar 2026 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/aliyun-pai/03-pai-dlc-distributed-training/</guid><description>&lt;p>A DSW notebook is for one engineer on one GPU. When you need eight GPUs across two nodes or training that runs longer than eight hours, you switch to &lt;strong>DLC&lt;/strong>. DLC is PAI&amp;rsquo;s job-submission front-end for a managed Kubernetes cluster. You describe what you want (image, command, resources, data mounts), and DLC schedules pods, runs them to completion, persists logs, and reports the results. The docs call this &lt;em>Deep Learning Containers&lt;/em>; we just say &amp;ldquo;DLC job&amp;rdquo;.&lt;/p></description></item></channel></rss>