Architecture on Chen Kai Blog

Product Thinking (1): Architecture Design — From Monolith to Autonomous Agents

Sat, 30 May 2026 09:00:00 +0000

The Shape of a System

Every architecture is a frozen argument. It records what you believed about the problem at the time you committed the code. Looking back across four systems I built over eighteen months — a marketing content platform (~70k lines TypeScript), a zero-dependency skill routing engine, an autonomous research agent (~315k lines Python), and a multi-model coding orchestrator — I can trace how my architectural instincts shifted. Not always forward. Sometimes sideways. But there is a clear progression: from “keep it in one process” to “let the agents govern themselves.”

OpenClaw QuickStart (3): The Six Layers That Make the Agent Loop Work

Fri, 10 Apr 2026 09:00:00 +0000

You can use OpenClaw for months without reading this. But the first time you need to write a skill, debug a misrouted message, or figure out why the agent forgot something, you’ll want to know what each component does.

The six layers

LLM Engineering (1): Architectures from Transformer to MoE

Fri, 27 Mar 2026 09:00:00 +0000

The 2017 Transformer block is still the silhouette of every production LLM in 2026, but almost every internal piece has been swapped, sparsified, or specialized. This series covers the modern stack end to end — architecture, training, inference, retrieval, evaluation, safety, deployment. Chapter 1 is about the block itself: what attention looks like in a 2026 model, how MoE breaks the param-FLOPs link, and where the non-attention alternatives (Mamba, RWKV) actually beat the Transformer.

LLM Workflows and Application Architecture: Enterprise Implementation Guide

Thu, 31 Jul 2025 09:00:00 +0000

Most LLM tutorials end where the interesting work begins. They show you how to call a chat completion endpoint, attach a vector store, and wrap the whole thing in a Streamlit demo. None of that is wrong, but none of it is what breaks at 3 a.m. when 10,000 users hit your service at once and every other answer is a hallucination.

This article is about everything that comes after the demo. It is opinionated on purpose: production LLM systems are mostly plain distributed systems with one non-deterministic component bolted on, and most of the engineering effort goes into containing that non-determinism. We will work through seven dimensions — application architecture, workflow patterns, the RAG-vs-fine-tune decision, deployment topology, cost, observability, and enterprise integration — keeping each one short, concrete, and grounded in the levers that actually move the needle.

System Design (8): Case Studies — URL Shortener, Chat System, News Feed

Sun, 27 Jul 2025 09:00:00 +0000

The best way to learn system design is to practice it. Reading about individual components — caching, queues, load balancers — builds your vocabulary, but designing a complete system is where you learn to compose those components into something that actually works.

This article walks through three classic system design problems end to end. Each follows the framework from the first article in this series: clarify requirements, estimate scale, design the architecture, deep dive into critical components, and identify bottlenecks.

System Design (6): Microservices vs Monoliths — The Honest Tradeoff

Tue, 22 Jul 2025 09:00:00 +0000

In 2018, the team behind Segment, a customer data platform processing billions of events per month, published a blog post titled “Goodbye Microservices.” They had decomposed their monolith into over 140 microservices, and the result was not the engineering utopia they expected. Instead, they spent most of their time fighting the complexity of the distributed system itself: service discovery failures, cascading timeouts, inconsistent deployment pipelines, and an explosion of inter-service communication bugs. They consolidated back to a monolith and reported dramatic improvements in developer productivity and system reliability.

Cloud Computing (1): Fundamentals and Architecture

Wed, 01 Feb 2023 09:00:00 +0000

Every team building software in 2025 inherits the same buy-or-rent question their predecessors faced — only the answer has flipped. Twenty years ago you put hardware in a closet; today you describe the hardware in YAML and a global provider conjures it up in seconds, bills it by the second, and tears it down when you stop paying. Cloud computing is not just “someone else’s computer”. It is a programmable, metered, multi-tenant abstraction over compute, storage and networking that has fundamentally changed how businesses are built and how engineers spend their day.