<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Aliyun PAI on Chen Kai Blog</title><link>https://www.chenk.top/en/categories/aliyun-pai/</link><description>Recent content in Aliyun PAI on Chen Kai Blog</description><generator>Hugo</generator><language>en</language><lastBuildDate>Mon, 09 Mar 2026 09:00:00 +0000</lastBuildDate><atom:link href="https://www.chenk.top/en/categories/aliyun-pai/index.xml" rel="self" type="application/rss+xml"/><item><title>Aliyun PAI (5): Designer vs Model Gallery — When the GUIs Actually Earn Their Keep</title><link>https://www.chenk.top/en/aliyun-pai/05-pai-designer-vs-quickstart/</link><pubDate>Mon, 09 Mar 2026 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/aliyun-pai/05-pai-designer-vs-quickstart/</guid><description>&lt;p>The first four articles covered the underlying primitives — DSW, DLC, EAS — that you orchestrate with Python. This one focuses on two GUI products that wrap these primitives and provide a runnable solution for users who don&amp;rsquo;t want to write Python: &lt;strong>PAI-Designer&lt;/strong> for drag-and-drop tabular pipelines, and &lt;strong>Model Gallery&lt;/strong> for zero-code open-source model deployment and fine-tuning. While serious engineers might not use them first, they are the right choice in two specific situations.&lt;/p></description></item><item><title>Aliyun PAI (4): PAI-EAS — Model Serving, Cold Starts, and the TPS Lie</title><link>https://www.chenk.top/en/aliyun-pai/04-pai-eas-model-serving/</link><pubDate>Sun, 08 Mar 2026 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/aliyun-pai/04-pai-eas-model-serving/</guid><description>&lt;p>EAS is where the money goes. DSW costs a few hundred RMB a month for development. DLC costs spike. 
EAS bills 24/7 because someone might call your endpoint, and the &amp;ldquo;minimum replica count&amp;rdquo; in the autoscaler config is the most critical setting in the entire platform. This article covers what I wish I&amp;rsquo;d known before shipping our first production endpoint.&lt;/p>
&lt;p>&lt;figure class="article-figure">
 &lt;img src="https://blog-pic-ck.oss-cn-beijing.aliyuncs.com/posts/en/aliyun-pai/04-pai-eas-model-serving/illustration_1.png" alt="Aliyun PAI (4): PAI-EAS — Model Serving, Cold Starts, and the TPS Lie — Chapter overview" loading="lazy" decoding="async" class="content-image">
 
&lt;/figure>
&lt;/p></description></item><item><title>Aliyun PAI (3): PAI-DLC — Distributed Training Without the Cluster Pain</title><link>https://www.chenk.top/en/aliyun-pai/03-pai-dlc-distributed-training/</link><pubDate>Sat, 07 Mar 2026 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/aliyun-pai/03-pai-dlc-distributed-training/</guid><description>&lt;p>A DSW notebook is for one engineer on one GPU. When you need eight GPUs across two nodes or training that runs longer than eight hours, you switch to &lt;strong>DLC&lt;/strong>. DLC is PAI&amp;rsquo;s job-submission front-end for a managed Kubernetes cluster. You describe what you want (image, command, resources, data mounts), and DLC schedules pods, runs them to completion, persists logs, and reports the results. The docs call this &lt;em>Deep Learning Containers&lt;/em>; we just say &amp;ldquo;DLC job&amp;rdquo;.&lt;/p></description></item><item><title>Aliyun PAI (2): PAI-DSW — Notebooks That Don't Eat Your Weights</title><link>https://www.chenk.top/en/aliyun-pai/02-pai-dsw-notebook/</link><pubDate>Fri, 06 Mar 2026 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/aliyun-pai/02-pai-dsw-notebook/</guid><description>&lt;p>Every time I onboard a new ML engineer to PAI, the first day looks the same. They start a DSW instance, &lt;code>pip install&lt;/code> their world, train for an hour, restart the kernel for some reason, and then ask me where their model file went. The honest answer — &amp;ldquo;in &lt;code>/root&lt;/code> on a node that no longer exists&amp;rdquo; — is the kind of lesson you only need to learn once. 
This article is the version of that lesson you read in advance.&lt;/p></description></item><item><title>Aliyun PAI (1): Platform Overview and the Product Family Map</title><link>https://www.chenk.top/en/aliyun-pai/01-platform-overview/</link><pubDate>Thu, 05 Mar 2026 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/aliyun-pai/01-platform-overview/</guid><description>&lt;p>If your team trains or serves models on Alibaba Cloud, you&amp;rsquo;ll eventually use the PAI console. PAI is the umbrella; underneath it are the actual workhorses — a notebook product, a distributed training service, a model-serving service, and a few GUI/quick-deploy layers. I have spent about eighteen months running real LLM workloads on it for an AI marketing platform, and this series is the field guide I wish I&amp;rsquo;d had before deploying my first endpoint.&lt;/p></description></item></channel></rss>