Terraform for AI Agents (5): Storage — Vector, Relational, and Object Memory
An agent has three kinds of memory and they map onto three Aliyun services: PolarDB/RDS for sessions, OpenSearch (vector edition) or pgvector for embeddings, OSS for artifacts. Real Terraform for each, plus the lifecycle and backup rules that keep the bill flat.
An agent’s memory is the part most tutorials hand-wave. “Just put the embeddings in Pinecone, the sessions in Postgres, the screenshots in S3.” On Aliyun, all three exist as managed services, and Terraform-provisioning them right is the difference between “memory works” and “we lost three weeks of conversation history because the disk filled up at 4am”.
This article covers all three layers, the Terraform for each, and the boring-but-critical lifecycle and backup rules.
The three-layer memory model

The mental model:
- Short-term / session — what the agent did in the current run and the last few runs. Conversation turns, tool calls, intermediate state. Schema-stable, low-latency, transactional. Goes in a relational database.
- Long-term / semantic — embeddings of documents, prior outputs, and recall corpus. Hybrid lexical + vector search. Goes in a vector store.
- Artifact / blob — generated images, PDFs, screenshots, run snapshots. Sometimes large, often write-once-read-rarely. Goes in object storage.
Don’t conflate them. I have watched a team try to put 50 GB of generated PDFs in Postgres because “it has a bytea column”. Cost ten times what OSS would have, query latency went to mush, backups took hours.
Layer 1: relational, RDS for PostgreSQL
For session state — turn-by-turn conversation, tool-call traces, user identity — you want a real RDBMS. PostgreSQL is my default; MySQL works fine if your team prefers it. PolarDB is the next step up when you need horizontal scale.
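A sketch of the shape this takes. The `memory` CMK, the `agents-${var.env}` naming, and the VPC module outputs are assumptions carried over from earlier articles; attribute names should be checked against the current alicloud provider docs, since several (notably the backup policy fields) have been renamed across provider versions.

```hcl
resource "random_password" "rds_admin" {
  length  = 32
  special = false
}

resource "alicloud_db_instance" "memory" {
  engine                   = "PostgreSQL"
  engine_version           = "14.0"
  instance_type            = "pg.n2.medium.1c"
  instance_storage         = 100
  db_instance_storage_type = "cloud_essd"
  vswitch_id               = module.vpc.storage_vswitch_id # output name assumed
  zone_id                  = var.zone_id
  zone_id_slave_a          = var.env == "prod" ? var.zone_id_standby : null
  encryption_key           = alicloud_kms_key.memory.id # at-rest encryption, memory CMK
  deletion_protection      = var.env == "prod"
  instance_name            = "agents-${var.env}-memory"
}

resource "alicloud_rds_account" "admin" {
  db_instance_id   = alicloud_db_instance.memory.id
  account_name     = "agent_admin"
  account_password = random_password.rds_admin.result
  account_type     = "Super"
}

resource "alicloud_db_backup_policy" "memory" {
  instance_id      = alicloud_db_instance.memory.id
  backup_period    = ["Monday", "Wednesday", "Friday"] # three times a week
  retention_period = var.env == "prod" ? 30 : 7
}

resource "alicloud_kms_secret" "rds_admin" {
  secret_name       = "agents-${var.env}-rds-admin"
  secret_data       = random_password.rds_admin.result
  version_id        = "v1"
  encryption_key_id = alicloud_kms_key.memory.id
}
```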
Highlights:
- Password lives in KMS Secrets Manager from birth. Generated by `random_password`, written to `alicloud_kms_secret`, retrieved by the agent at startup via STS. The plaintext never appears in a repo or a config file; note that `random_password` does record its result in tfstate, so keep state in an encrypted backend.
- `encryption_key` ties the disk to the `memory` CMK. At-rest encryption, no extra cost.
- `backup_period` + `retention_period` create automated backups three times a week, kept 30 days in prod, 7 in dev. RDS backups are stored on OSS; you don’t manage the bucket.
- `zone_id_slave_a` in prod creates a hot standby in a second zone. Failover is sub-30s. The cost is 2× — worth it for prod, overkill for dev.
- `deletion_protection` in prod blocks `terraform destroy` from killing the database. Always.
Real-world tip: PolarDB is the right choice once your sessions table crosses ~10M rows or you need read replicas without downtime. Migration from RDS to PolarDB is well-documented and Terraform handles both. Don’t start there — RDS is simpler and cheaper at small scale.
Layer 2: vector store
You have two reasonable choices on Aliyun for the vector layer:
- OpenSearch Vector Search Edition — managed, Lucene-backed, supports HNSW + IVF, billed per QPS quota
- PolarDB or RDS PostgreSQL with `pgvector` — co-located with your relational data, free in terms of new infra, slower past ~1M vectors
For anything past prototype, I prefer OpenSearch. The cost is real (~¥800/mo for the smallest instance), but you get hybrid lexical+vector search out of the box, which is the right shape for retrieval.
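A minimal sketch of the app group, assuming pay-as-you-go and the series’ naming convention. The `quota` values (document size, compute units, spec string) are illustrative placeholders; size them against your corpus and verify the exact field names in the alicloud provider docs.

```hcl
resource "alicloud_opensearch_app_group" "vector" {
  app_group_name = "agents-${var.env}-vector"
  type           = "enhanced" # Vector Search runs on the enhanced edition
  payment_type   = "PayAsYouGo"

  quota {
    doc_size         = 1  # GB of documents, placeholder
    compute_resource = 20 # compute quota, placeholder
    spec             = "opensearch.share.common"
  }
}
```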
The app group is the OpenSearch concept that holds an index. From here you create the index schema via the OpenSearch console or SDK — the `alicloud_opensearch_app_group` resource exists, but the schema is an operational concern, not a provisioning one.
If you go the pgvector route instead, add this to the RDS database creation:
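The Terraform side is just one more database on the existing instance; enabling the extension itself is a single SQL statement that belongs in your first migration, not in HCL. Database and resource names here are illustrative.

```hcl
resource "alicloud_db_database" "vectors" {
  instance_id   = alicloud_db_instance.memory.id
  name          = "agent_vectors"
  character_set = "UTF8"
}

# Then, from the first migration your app runs (not from Terraform):
#   CREATE EXTENSION IF NOT EXISTS vector;
```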
The Terraform half is just the database; the schema is application code (Alembic, Flyway, sqlx-migrate — pick one). Don’t try to manage table schemas in Terraform; that path leads to madness.
Layer 3: object storage
OSS is where artifacts go: generated images, PDFs, screenshots, run-trace tarballs, model checkpoints if you fine-tune.
The official “Create a bucket with Terraform” practice doc covers the basics. For an agent stack:
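A sketch of the bucket with the three features discussed below — the random suffix, lifecycle tiering, and versioning — plus KMS encryption with the `memory` CMK assumed from earlier articles. Storage-class strings and block names should be checked against the alicloud provider docs.

```hcl
resource "random_id" "artifacts_suffix" {
  byte_length = 4
}

resource "alicloud_oss_bucket" "artifacts" {
  bucket = "agents-${var.env}-artifacts-${random_id.artifacts_suffix.hex}"

  versioning {
    status = "Enabled"
  }

  server_side_encryption_rule {
    sse_algorithm     = "KMS"
    kms_master_key_id = alicloud_kms_key.memory.id
  }

  lifecycle_rule {
    id      = "tier-then-expire"
    enabled = true
    prefix  = ""

    transitions {
      days          = 30
      storage_class = "IA"
    }
    transitions {
      days          = 90
      storage_class = "Archive"
    }
    transitions {
      days          = 365
      storage_class = "ColdArchive"
    }

    expiration {
      days = 730 # delete at two years
    }

    noncurrent_version_expiration {
      days = 180 # prune old object versions
    }
  }
}
```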
Three things worth a closer look:
Bucket-name uniqueness
OSS bucket names are globally unique across all Aliyun customers. The random_id suffix avoids the “name already taken” plan failure that bites every first-time user. Once the bucket is created, the name is stable.
Lifecycle tiering
The lifecycle_rule block is the single biggest cost lever in OSS:

- Standard (0-30 days, ~¥0.12/GB/mo) — what you write to by default
- Infrequent Access (30-90 days, ~¥0.08/GB/mo) — cheaper storage, plus a per-GB retrieval fee
- Archive (90-365 days, ~¥0.033/GB/mo) — minutes-to-hours retrieval
- Cold Archive (365+ days, ~¥0.015/GB/mo) — hours retrieval, the cheapest
For agent artifacts, this rule says: keep 30 days hot, then move to IA, then Archive at 3 months, then Cold Archive at a year, then delete at two years. At the rates above, a 1 TB artifact corpus costs ~¥120/mo if it all sits in Standard and under ¥30/mo once most of it has aged into Archive and Cold Archive. Codify it in HCL once; the saving scales with the corpus, and at tens of terabytes it is five figures a year.
Versioning
versioning { status = "Enabled" } keeps every object version. An agent that overwrites artifacts/run-123/output.pdf doesn’t actually destroy the previous version — it’s still there with a different version ID. Two reasons this matters:
- Recovery. A bug overwrote 50,000 objects with garbage? Restore the previous versions in a script.
- Tamper-evidence. Combined with WORM (Write-Once-Read-Many) policies, this gives you regulatory compliance for free.
The cost is real — versioned objects accumulate. Pair versioning with a noncurrent_version_expiration rule in the lifecycle to prune old versions after, say, 180 days.
The backup story
A Terraform-managed backup setup looks like this:

- RDS: built-in automated backups (already in our HCL above)
- OSS: versioning + cross-region replication for disaster recovery
- OpenSearch: snapshot to OSS via the `alicloud_opensearch_*` snapshot resources
Cross-region replication for OSS is one resource:
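Something like the following, assuming a DR bucket named `artifacts_dr` already exists in the target region (sketch only; field names per the alicloud provider’s `alicloud_oss_bucket_replication` resource):

```hcl
resource "alicloud_oss_bucket_replication" "artifacts_dr" {
  bucket = alicloud_oss_bucket.artifacts.bucket
  action = "ALL"

  destination {
    bucket   = alicloud_oss_bucket.artifacts_dr.bucket
    location = "oss-cn-shanghai"
  }

  historical_object_replication = "enabled"
}
```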
The aliased provider lets one Terraform run touch two regions:
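A sketch of the alias plus the DR bucket it provisions; the alias name, region, and bucket naming are assumptions for illustration:

```hcl
provider "alicloud" {
  alias  = "dr"
  region = "cn-shanghai"
}

# The DR bucket lives in the second region via the alias.
resource "alicloud_oss_bucket" "artifacts_dr" {
  provider = alicloud.dr
  bucket   = "agents-${var.env}-artifacts-dr-${random_id.artifacts_suffix.hex}"
}
```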
For a research agent that’s mainly stateless, you might decide DR isn’t worth the storage doubling. For a customer-facing one with conversation history that legally must persist, it’s mandatory.
Real-world tip: Test the restore on a schedule. A backup you have never restored is just an expensive hope. I run a `restore-drill.sh` script monthly that pulls a random RDS backup into a `cn-shanghai-dr` instance and runs schema/checksum verification. It is the most useful 30 minutes I spend each month.
Connecting compute to storage
The ECS instance from article 4 needs to actually reach this storage. Three pieces:
- Network — already done. The `agent_runtime_sg_id` output from the VPC module is the source for the `memory_rds_sg` and `vector_store_sg` ingress rules.
- Credentials — the agent reads the DB password from KMS Secrets Manager via STS:

```python
from alibabacloud_kms20160120.client import Client as KmsClient
from alibabacloud_kms20160120.models import GetSecretValueRequest

resp = kms_client.get_secret_value(
    GetSecretValueRequest(secret_name="agents-prod-rds-admin")
)
db_password = resp.body.secret_data
```

- Endpoints — Terraform outputs them:
```hcl
output "rds_endpoint" {
  value = alicloud_db_instance.memory.connection_string
}

output "vector_endpoint" {
  value = alicloud_opensearch_app_group.vector.api_domain
}

output "artifacts_bucket" {
  value = alicloud_oss_bucket.artifacts.bucket
}
```
The agent reads these from environment variables that cloud-init sets from the Terraform outputs. No hardcoded endpoints, no manual config files.
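One way that wiring can look, sketched; the template filename, the variable names, and the instance resource are assumptions, not details from article 4:

```hcl
resource "alicloud_instance" "agent" {
  # ...the compute config from article 4...

  user_data = base64encode(templatefile("${path.module}/agent-env.tftpl", {
    rds_endpoint     = alicloud_db_instance.memory.connection_string
    vector_endpoint  = alicloud_opensearch_app_group.vector.api_domain
    artifacts_bucket = alicloud_oss_bucket.artifacts.bucket
  }))
}

# agent-env.tftpl would contain something like:
#   #cloud-config
#   write_files:
#     - path: /etc/agent/env
#       content: |
#         RDS_ENDPOINT=${rds_endpoint}
#         VECTOR_ENDPOINT=${vector_endpoint}
#         ARTIFACTS_BUCKET=${artifacts_bucket}
```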
What it costs (monthly, dev workspace, low traffic)
- RDS PostgreSQL (`pg.n2.medium.1c`, 100 GB ESSD): ~¥350/mo
- OpenSearch vector (smallest): ~¥800/mo
- OSS (10 GB Standard, lifecycle on): ~¥1.5/mo + traffic
- KMS (covered in article 3): ~¥10/mo
Roughly ¥1200/mo for the storage layer in dev. Prod with HA RDS, larger OpenSearch, more OSS will be ¥3000-5000/mo. This is where the cost pressure starts being real — article 7 shows how to track and alert on it.
What’s next
Article 6 builds the LLM gateway in front of the compute we provisioned in article 4 and the storage we just provisioned. That’s the place where API keys live, quotas get enforced, and per-agent cost gets attributed. By the end of article 6 you’ll have a complete agent-runnable stack — the last two articles wire observability and cost control over the top.