<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Infrastructure as Code on Chen Kai Blog</title><link>https://www.chenk.top/en/tags/infrastructure-as-code/</link><description>Recent content in Infrastructure as Code on Chen Kai Blog</description><generator>Hugo</generator><language>en</language><lastBuildDate>Sat, 09 May 2026 09:00:00 +0000</lastBuildDate><atom:link href="https://www.chenk.top/en/tags/infrastructure-as-code/index.xml" rel="self" type="application/rss+xml"/><item><title>Alibaba Cloud Full Stack (12): End-to-End — One Terraform Apply for Everything</title><link>https://www.chenk.top/en/aliyun-fullstack/12-terraform-e2e/</link><pubDate>Sat, 09 May 2026 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/aliyun-fullstack/12-terraform-e2e/</guid><description>&lt;p>Eleven articles. Dozens of CLI commands. Hundreds of manual steps. Now we throw all of that away and rebuild the entire stack with a single &lt;code>terraform apply&lt;/code>. This is why infrastructure-as-code exists.&lt;/p>
&lt;p>Over the past eleven parts of this series, we have clicked through consoles, typed &lt;code>aliyun&lt;/code> CLI commands, and manually configured everything from VPCs to Function Compute triggers. It worked. We learned every resource intimately because we built each one by hand. But if I asked you right now to recreate that entire stack in a new region — the VPC with its three tiers and two availability zones, the ECS instance with its cloud-init script, the RDS MySQL HA setup, the OSS bucket with lifecycle rules, the RAM policies, the SLS log pipeline, the Function Compute event processing — you would need at least a full day of careful work. And you would inevitably miss something. A security group rule. A backup policy. A CORS configuration.&lt;/p></description></item><item><title>Terraform for AI Agents (2): Provider, Auth, and Remote State on OSS</title><link>https://www.chenk.top/en/terraform-agents/02-provider-and-state-setup/</link><pubDate>Sat, 14 Mar 2026 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/terraform-agents/02-provider-and-state-setup/</guid><description>&lt;p>This article is where you stop reading and start typing. By the end, you&amp;rsquo;ll have:&lt;/p>
&lt;ol>
&lt;li>The &lt;code>alicloud&lt;/code> Terraform provider installed and version-pinned.&lt;/li>
&lt;li>Authentication wired up — through the right method, not the convenient one.&lt;/li>
&lt;li>Remote state on an OSS bucket with Tablestore-based locking.&lt;/li>
&lt;li>Three workspaces (&lt;code>dev&lt;/code>, &lt;code>staging&lt;/code>, &lt;code>prod&lt;/code>) that share a backend but isolate state.&lt;/li>
&lt;li>A working &lt;code>terraform plan&lt;/code> against an empty config.&lt;/li>
&lt;/ol>
&lt;p>Nothing here provisions an agent yet. This lays the foundation for all future articles. If you skip this and try to wing it in article 3, you&amp;rsquo;ll likely face a state-corruption incident within a week.&lt;/p></description></item><item><title>Terraform for AI Agents (1): Why IaC Is the Only Sane Way to Ship Agents</title><link>https://www.chenk.top/en/terraform-agents/01-why-terraform-for-agents/</link><pubDate>Thu, 12 Mar 2026 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/terraform-agents/01-why-terraform-for-agents/</guid><description>&lt;p>I have shipped four agent systems on Alibaba Cloud in the last eighteen months. Three of them started life as a &lt;code>tmux&lt;/code> session on a single ECS instance someone created by clicking through the console. All three of those needed a panicked weekend of rebuilding when the second engineer joined the project, when the prod region had a stockout, or when the security team asked for a network diagram.&lt;/p>
&lt;p>The fourth started life as &lt;code>terraform apply&lt;/code>. It is the only one I haven&amp;rsquo;t lost a weekend to.&lt;/p></description></item><item><title>Cloud Computing (7): Cloud Operations and DevOps Practices</title><link>https://www.chenk.top/en/cloud-computing/operations-devops/</link><pubDate>Fri, 26 May 2023 09:00:00 +0000</pubDate><guid>https://www.chenk.top/en/cloud-computing/operations-devops/</guid><description>&lt;p>&lt;figure class="article-figure">
 &lt;img src="https://blog-pic-ck.oss-cn-beijing.aliyuncs.com/posts/en/cloud-computing/operations-devops/illustration_1.png" alt="Cloud Computing (7): Cloud Operations and DevOps Practices — Chapter overview" loading="lazy" decoding="async" class="content-image">
 
&lt;/figure>
&lt;/p>
&lt;p>In 2017 GitLab lost six hours of database state. An engineer, exhausted, ran &lt;code>rm -rf&lt;/code> on the wrong server during an incident. The backup procedures had silently been broken for months; nobody noticed because no one was restoring from backups. The lesson is not &amp;ldquo;be careful with rm&amp;rdquo;. The lesson is that operations is a &lt;em>system&lt;/em> — tools, runbooks, monitoring, automation, and the rituals around them. When the system is healthy, no single tired engineer can take down production. When the system is rotten, every late-night fix is one keystroke from disaster.&lt;/p></description></item></channel></rss>