Terraform for AI Agents (8): End-to-End — research-agent-stack in One Apply
Stitching the seven modules into one repo, running terraform apply once, and watching a complete agent runtime — VPC, ECS, RDS, OpenSearch, OSS, LLM gateway, SLS observability, cost alarms — come up in seven minutes. Real apply output, the module DAG, and the starter repo to fork.
This is the article where everything from articles 2 through 7 lands in one place. By the end you’ll have run terraform apply once and produced a complete, observable, budgeted agent runtime stack on Alibaba Cloud. About 31 resources, ~7 minutes of wall clock.
The stack we’re building:

Five layers — edge, compute, memory, platform, ops — composed from the modules we built across this series.
Project structure
```
research-agent-stack/
├── README.md
├── versions.tf              # Terraform + provider pinning
├── backend.tf               # OSS + Tablestore remote state
├── providers.tf             # alicloud + alicloud.beijing alias
├── variables.tf             # top-level inputs
├── locals.tf                # workspace-aware computed locals
├── main.tf                  # module composition
├── outputs.tf               # endpoints + connection strings
├── env/
│   ├── dev.tfvars
│   ├── staging.tfvars
│   └── prod.tfvars
├── secrets/
│   └── secrets.auto.tfvars  # gitignored — provider keys
├── modules/
│   ├── vpc-baseline/        # article 3
│   ├── storage/             # article 5
│   ├── compute/             # article 4
│   ├── llm-gateway/         # article 6
│   └── observability/       # article 7
└── scripts/
    ├── cloud-init/
    │   ├── agent.sh
    │   └── gateway.sh
    └── restore-drill.sh
```
Seven `*.tf` files at the top, five modules in `modules/`, environment-specific values in `env/*.tfvars`, secrets out of git in `secrets/secrets.auto.tfvars`. This is the layout I use on every project — boring is good.
main.tf — the composition
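Sketched in HCL, the composition looks roughly like this (the input and output names are assumptions reconstructed from the earlier articles, not verbatim module code):

```hcl
module "vpc" {
  source      = "./modules/vpc-baseline"
  name_prefix = local.name_prefix
  cidr_block  = var.vpc_cidr
}

module "storage" {
  source      = "./modules/storage"
  vpc_id      = module.vpc.vpc_id
  vswitch_ids = module.vpc.private_vswitch_ids
  kms_key_ids = module.vpc.kms_key_ids
}

module "gateway" {
  source       = "./modules/llm-gateway"
  vpc_id       = module.vpc.vpc_id
  vswitch_ids  = module.vpc.private_vswitch_ids
  agent_quotas = var.agent_quotas
}

module "compute" {
  source      = "./modules/compute"
  vpc_id      = module.vpc.vpc_id
  vswitch_ids = module.vpc.private_vswitch_ids
  db_endpoint = module.storage.rds_endpoint   # feeds cloud-init templates
  gateway_url = module.gateway.gateway_url
}

module "observability" {
  source       = "./modules/observability"
  agent_sg_ids = module.compute.security_group_ids
  cost_ceiling = var.cost_ceiling
}
```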
Five module calls. Notice how each module takes upstream modules' outputs as inputs: module.compute reads module.vpc, module.storage, and module.gateway, and module.observability in turn reads module.compute. That dependency wiring is what Terraform uses to build the apply DAG:

Network and KMS sit at the top — they have no dependencies. Storage, compute, and gateway depend on network + KMS but are independent of each other, so Terraform builds them in parallel. Compute also depends on storage and gateway because the cloud-init template needs their endpoints. Observability and alarms depend on compute because they reference the SG IDs.
variables.tf
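A representative pair of inputs, as a sketch (the variable names and types here are illustrative; the `sensitive` flag is the point):

```hcl
variable "dashscope_api_key" {
  description = "DashScope API key handed to the LLM gateway"
  type        = string
  sensitive   = true
}

variable "agent_quotas" {
  description = "Per-agent monthly LLM budget in CNY"
  type        = map(number)
  default     = {}
}
```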
`sensitive = true` keeps Terraform from printing the value in plan/apply output. The values still land in tfstate (which is why we encrypted the OSS bucket back in article 2).
env/dev.tfvars
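A sketch of what the dev values might look like (variable names and agent names are illustrative; the ¥100 dev cost ceiling matches the prod comparison later in this article):

```hcl
# env/dev.tfvars: small sizes, low ceiling
vpc_cidr     = "10.20.0.0/16"
ecs_count    = 1
cost_ceiling = 100   # CNY/month, vs ¥800 in prod

agent_quotas = {
  researcher = 50
  summarizer = 30
}
```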
secrets/secrets.auto.tfvars (gitignored)
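Shape-wise, the secrets file is nothing more than values for the `sensitive` variables (keys shown here are placeholders, obviously):

```hcl
# secrets/secrets.auto.tfvars (gitignored)
dashscope_api_key = "sk-..."
dingtalk_webhook  = "https://oapi.dingtalk.com/robot/send?access_token=..."
```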
`*.auto.tfvars` files are auto-loaded without `-var-file`. Make sure `secrets/` is in `.gitignore` from the very first commit.
The apply
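The invocation itself is the standard sequence (a sketch; the secrets file auto-loads, so only the env file needs `-var-file`):

```shell
terraform workspace select dev    # or: terraform workspace new dev
terraform init
terraform plan -var-file=env/dev.tfvars -out=dev.plan
terraform apply dev.plan
```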
Real timing on a fresh apply:

The wall-clock breakdown:
- 0-60s: VPC, vSwitch, NAT, EIP, KMS keys — fast resources
- 60-380s: RDS (5 minutes), OpenSearch (5.5 minutes), ECS (~2 minutes), gateway (~1.5 minutes) — all parallel, gated by the slowest
- 380-460s: agent app deploy, observability resources, alarms
About 7 minutes total, dominated by RDS and OpenSearch provisioning. Re-applies on no-change runs settle in under 30 seconds because Terraform only diffs.
A trimmed apply transcript:
```
Terraform will perform the following actions:

  # module.vpc.alicloud_vpc.this will be created
  + resource "alicloud_vpc" "this" {
      + cidr_block = "10.20.0.0/16"
      + vpc_name   = "agents-dev"
        ...
    }

  ... (29 more resources) ...

Plan: 31 to add, 0 to change, 0 to destroy.

Changes to Outputs:
  + agent_endpoints      = (known after apply)
  + gateway_url          = (known after apply)
  + sls_dashboard_url    = (known after apply)
  + total_estimated_cost = "~¥1450/month at dev sizing"

Do you want to perform these actions in workspace "dev"?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

module.vpc.alicloud_vpc.this: Creating...
module.vpc.alicloud_kms_key.this["memory"]: Creating...
module.vpc.alicloud_kms_key.this["secrets"]: Creating...
module.vpc.alicloud_kms_key.this["logs"]: Creating...
module.vpc.alicloud_vpc.this: Creation complete after 4s [id=vpc-uf6abc123]
module.vpc.alicloud_vswitch.private["0"]: Creating...
module.vpc.alicloud_vswitch.private["1"]: Creating...
module.vpc.alicloud_vswitch.private["2"]: Creating...
module.vpc.alicloud_vswitch.public["0"]: Creating...
...
module.storage.alicloud_db_instance.memory: Still creating... [4m30s elapsed]
module.storage.alicloud_opensearch_app_group.vector: Still creating... [5m10s elapsed]
module.storage.alicloud_db_instance.memory: Creation complete after 4m38s [id=pgm-uf6def456]
module.storage.alicloud_opensearch_app_group.vector: Creation complete after 5m24s [id=os-uf6ghi789]
...
module.compute.alicloud_instance.agent[0]: Creation complete after 1m52s [id=i-uf6jkl012]
module.gateway.alicloud_alb_listener.gateway: Creation complete after 12s
module.observability.alicloud_log_alert.cost_ceiling: Creation complete after 3s
...

Apply complete! Resources: 31 added, 0 changed, 0 destroyed.

Outputs:

agent_endpoints = [
  "http://alb-uf6.cn-shanghai.alb.aliyuncs.com",
]
gateway_url = "http://alb-uf7.cn-shanghai.alb.aliyuncs.com/v1"
sls_dashboard_url = "https://sls.console.aliyun.com/lognext/project/agents-dev/dashboard/agent-cost-overview"
total_estimated_cost = "~¥1450/month at dev sizing"
```
That’s a complete agent stack. ALB endpoint, gateway URL, the SLS dashboard URL — paste any of them into a browser and they work.
Day-2 operations
The stack is up. Now what?
Adding a new agent
- Add an entry to `var.agent_quotas` in `dev.tfvars`
- `terraform apply -var-file=env/dev.tfvars`
- The `null_resource` provisions a new LiteLLM key
- Deploy your new agent code with the new `LITELLM_API_KEY` env var
About 30 seconds end-to-end.
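Concretely, step one is a single line in the quota map (agent names here are hypothetical):

```hcl
agent_quotas = {
  researcher = 50
  summarizer = 30
  planner    = 40   # the new agent: add this line, then re-apply
}
```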
Scaling up
Change `ecs_count` in the module call (or set it via tfvars). `terraform apply` brings up new instances, attaches them to the ALB, and old instances stay healthy throughout (`create_before_destroy`). Zero downtime.
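The zero-downtime behavior comes from the lifecycle block inside the compute module; sketched here from article 4's pattern, not copied verbatim:

```hcl
resource "alicloud_instance" "agent" {
  count = var.ecs_count
  # ... image, instance type, vswitch, cloud-init user_data ...

  lifecycle {
    # Replacement instances come up and pass ALB health checks
    # before the old ones are torn down.
    create_before_destroy = true
  }
}
```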
Promoting dev → prod
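The promotion is the same sequence as dev, pointed at the prod workspace and tfvars (a sketch):

```shell
terraform workspace select prod   # or: terraform workspace new prod
terraform plan -var-file=env/prod.tfvars -out=prod.plan
terraform apply prod.plan
```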
Same modules, different sizes (HA RDS, larger OpenSearch quota, more ECS, real DingTalk webhook, real LLM keys, cost ceiling at ¥800 instead of ¥100). The first prod apply takes 7-10 minutes; subsequent applies are seconds.
Destroying dev
When you’re done experimenting:
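The teardown, with the plan-first habit built in (a sketch):

```shell
terraform workspace select dev
terraform plan -destroy -var-file=env/dev.tfvars   # read this output first
terraform destroy -var-file=env/dev.tfvars
```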
In prod this will fail, because `deletion_protection = true` on prod-sized resources and `prevent_destroy = true` on the bootstrap state bucket block it. That's intentional. In dev, setting `deletion_protection = local.is_prod` means protection is only enabled in prod, so `terraform destroy` goes through.
Real-world tip: Always `terraform plan -destroy` before `terraform destroy`. Read the plan output. The number of resources being destroyed should match what you intend. I have seen an engineer accidentally destroy staging because they forgot to switch workspaces.
Connecting your actual agent code
The stack is the platform. The agent itself comes from your repo (var.agent_repo_url) and is deployed by cloud-init at ECS launch. The minimal contract your agent code needs to honor:
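In shape, the contract is a handful of environment variables. The names below are a sketch, not the series' exact list (only `LITELLM_API_KEY` and the gateway URL appear verbatim above):

```shell
# What cloud-init exports to the agent process at launch (illustrative names)
DATABASE_URL=postgres://agent:***@pgm-uf6def456.pg.rds.aliyuncs.com:5432/memory
VECTOR_ENDPOINT=http://os-uf6ghi789.opensearch.aliyuncs.com
OSS_BUCKET=agents-dev-artifacts
LITELLM_BASE_URL=http://alb-uf7.cn-shanghai.alb.aliyuncs.com/v1
LITELLM_API_KEY=sk-agent-researcher-***
SLS_PROJECT=agents-dev
```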
All of these get values from Terraform outputs. The agent code stays cloud-agnostic in shape — it just reads env vars — but is fully wired into the Aliyun stack at runtime.
Cost summary
A real bill for dev workspace, low traffic:
| Component | Monthly |
|---|---|
| VPC + NAT + EIP | ~¥150 |
| ECS x1 (c7.large) | ~¥250 |
| RDS Postgres (small) | ~¥350 |
| OpenSearch vector | ~¥800 |
| OSS (10 GB Standard) | ~¥2 |
| LLM Gateway ECS x1 | ~¥150 |
| ALB (small) | ~¥50 |
| SLS + ARMS | ~¥300 |
| KMS | ~¥10 |
| Total dev | ~¥2060/mo |
Prod with HA, larger sizes, cross-region DR: roughly ¥6000-9000/mo before LLM API cost. The LLM bill is usually the biggest line item — which is why article 6’s gateway and article 7’s cost alarms exist.
What I skipped
- CDN for serving artifact URLs publicly — alicloud_cdn_domain works, but most agents serve artifacts through their own gateway
- WAF in front of the ALB — required for public-facing prod, but the dev stack uses an Intranet ALB
- PrivateLink to DashScope — saves NAT egress cost at scale, configurable via alicloud_privatelink_*
- Custom domain + SSL — alicloud_alb_listener supports SSL certs but you have to bring the cert (or use ACM)
All four are worth adding once the basics work. Don’t add them on day 1.
Where to go from here
You now have a production-shaped agent runtime on Alibaba Cloud, fully expressed in Terraform, with observability, secret management, and cost guards built in. The next steps depend on your project:
- More agents: add to `var.agent_quotas` and `terraform apply`
- Different LLM providers: add to `local.litellm_config` in the gateway module
- Multiple regions: add provider aliases and replicate the stack
- GitOps: wrap `terraform apply` in a CI pipeline gated by PR review
- Pulumi or Crossplane migration: the resource graph translates directly
The single most important thing is that your infrastructure is now in git. Every change is reviewable. Every environment is reproducible. Every cost is attributable. That’s what IaC buys you, and it’s what makes shipping agents on Aliyun a sustainable practice instead of a perpetual scramble.
Thanks for reading the series. If you ship a stack based on this, I’d love to hear what you changed and why — that’s how the patterns evolve.