Series · Databases · Chapter 5

Databases (5): NoSQL — Document, Key-Value, Column, and Graph

A practical tour of the four major NoSQL families — document, key-value, wide-column, and graph — including the CAP theorem and when to use each model.

Not everything fits neatly into rows and columns. A social network’s friend graph, a product catalog with wildly varying attributes, a real-time leaderboard, a recommendation engine’s relationship web — these workloads push relational databases into awkward territory. NoSQL databases exist because different data models solve different problems better. The trick is knowing which one to reach for.


Why NoSQL?#

The term “NoSQL” is misleading. It does not mean “no SQL” — some NoSQL databases support SQL-like query languages. It means “not only SQL” or, more accurately, “non-relational.” The motivations for NoSQL fall into three categories:

NoSQL database types

  1. Schema flexibility: Your data does not have a fixed schema, or the schema changes frequently
  2. Scale-out architecture: You need horizontal scaling beyond what a single relational database can handle
  3. Data model fit: Your data is naturally a document, graph, key-value pair, or time series — not a table

Let us explore each family.

Document Stores: MongoDB#

Document databases store data as semi-structured documents, typically JSON (or BSON in MongoDB’s case). Each document can have a different structure — no fixed schema.

Document vs relational model

Data Model#

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
// A user document in MongoDB
{
  "_id": ObjectId("507f1f77bcf86cd799439011"),
  "email": "alice@example.com",
  "name": "Alice Chen",
  "addresses": [
    {
      "type": "home",
      "street": "123 Main St",
      "city": "San Francisco",
      "state": "CA",
      "zip": "94105"
    },
    {
      "type": "work",
      "street": "456 Market St",
      "city": "San Francisco",
      "state": "CA",
      "zip": "94103"
    }
  ],
  "preferences": {
    "newsletter": true,
    "theme": "dark",
    "language": "en"
  },
  "created_at": ISODate("2023-11-15T10:30:00Z")
}

In a relational database, this would require at least 3 tables: users, addresses, user_preferences. In MongoDB, it is one document. No JOINs needed.

CRUD Operations#

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
// Connect to MongoDB
const db = client.db("ecommerce");
const users = db.collection("users");

// Create
await users.insertOne({
  email: "alice@example.com",
  name: "Alice Chen",
  addresses: [{ type: "home", city: "San Francisco" }],
  created_at: new Date()
});

// Read
const user = await users.findOne({ email: "alice@example.com" });

// Read with projection (select specific fields)
const userBasic = await users.findOne(
  { email: "alice@example.com" },
  { projection: { name: 1, email: 1, _id: 0 } }
);

// Update: add a new address
await users.updateOne(
  { email: "alice@example.com" },
  { $push: { addresses: { type: "work", city: "Oakland" } } }
);

// Update: increment a counter
await users.updateOne(
  { _id: userId },
  { $inc: { login_count: 1 }, $set: { last_login: new Date() } }
);

// Delete
await users.deleteOne({ email: "alice@example.com" });

Querying and Filtering#

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
// Find users in San Francisco with dark theme
const result = await users.find({
  "addresses.city": "San Francisco",
  "preferences.theme": "dark"
}).sort({ created_at: -1 }).limit(10).toArray();

// Find users created in the last 30 days
const recent = await users.find({
  created_at: { $gte: new Date(Date.now() - 30 * 24 * 60 * 60 * 1000) }
}).toArray();

// Text search (requires text index)
await users.createIndex({ name: "text", email: "text" });
const searchResults = await users.find({
  $text: { $search: "alice chen" }
}).toArray();

Aggregation Pipeline#

MongoDB’s aggregation framework is surprisingly powerful — it can do many things that SQL does with GROUP BY, JOINs, and window functions:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
// Revenue by product category for the last quarter
const pipeline = [
  // Stage 1: Filter orders from last quarter
  { $match: {
    status: "completed",
    created_at: { $gte: new Date("2023-10-01") }
  }},
  // Stage 2: Unwind order items array (one doc per item)
  { $unwind: "$items" },
  // Stage 3: Lookup product details
  { $lookup: {
    from: "products",
    localField: "items.product_id",
    foreignField: "_id",
    as: "product"
  }},
  // Stage 4: Flatten product array
  { $unwind: "$product" },
  // Stage 5: Group by category
  { $group: {
    _id: "$product.category",
    total_revenue: { $sum: { $multiply: ["$items.quantity", "$items.price"] } },
    order_count: { $sum: 1 },
    avg_order_value: { $avg: { $multiply: ["$items.quantity", "$items.price"] } }
  }},
  // Stage 6: Sort by revenue
  { $sort: { total_revenue: -1 } },
  // Stage 7: Rename fields for output
  { $project: {
    category: "$_id",
    total_revenue: { $round: ["$total_revenue", 2] },
    order_count: 1,
    avg_order_value: { $round: ["$avg_order_value", 2] },
    _id: 0
  }}
];

const results = await orders.aggregate(pipeline).toArray();
// [
//   { category: "Electronics", total_revenue: 45230.50, order_count: 312, avg_order_value: 145.00 },
//   { category: "Books",       total_revenue: 12890.00, order_count: 567, avg_order_value: 22.73 },
//   ...
// ]

When Documents Work Well#

ScenarioWhy documents fit
Product catalogsProducts have different attributes (shoes vs laptops)
Content managementArticles, blog posts with nested comments
User profilesVarying preferences and metadata
Event loggingSemi-structured event data
Mobile app backendsJSON in, JSON out, schema evolves fast

When Documents Do Not Work Well#

  • Many-to-many relationships: Duplication or manual reference management
  • Complex transactions across documents: Limited multi-document transaction support
  • Heavy aggregation/analytics: SQL databases and column stores are more efficient
  • Strong consistency requirements: Eventual consistency is the default in distributed setups

Key-Value Stores: Redis#

Cap theorem triangle floating in space consistency availabil

A key-value store is the simplest NoSQL model: you give it a key, it gives you back a value. Redis takes this further by supporting rich data structures as values.

Data Structures#

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
# Strings: the basic key-value pair
SET user:1:name "Alice Chen"
GET user:1:name                    # "Alice Chen"

SET page:home:views 0
INCR page:home:views               # 1 (atomic increment)
INCR page:home:views               # 2
INCRBY page:home:views 100         # 102

# Strings with expiration (TTL)
SET session:abc123 '{"user_id":1,"role":"admin"}' EX 3600  # expires in 1 hour
TTL session:abc123                 # 3597 (seconds remaining)

# Hashes: like a mini-document
HSET user:1 name "Alice" email "alice@example.com" login_count 42
HGET user:1 name                   # "Alice"
HGETALL user:1                     # name "Alice" email "alice@example.com" login_count "42"
HINCRBY user:1 login_count 1       # 43

# Lists: ordered collection (linked list under the hood)
LPUSH notifications:user:1 "New order #1234"
LPUSH notifications:user:1 "Payment received"
LRANGE notifications:user:1 0 9    # latest 10 notifications
LLEN notifications:user:1          # number of notifications

# Sets: unordered unique collection
SADD product:42:tags "electronics" "wireless" "bluetooth"
SMEMBERS product:42:tags           # {"electronics", "wireless", "bluetooth"}
SISMEMBER product:42:tags "wireless"  # 1 (true)
# Set operations
SINTER product:42:tags product:99:tags  # tags common to both products

# Sorted Sets: ordered by score (great for leaderboards, rankings)
ZADD leaderboard 1500 "player:alice"
ZADD leaderboard 2300 "player:bob"
ZADD leaderboard 1800 "player:carol"
ZREVRANGE leaderboard 0 2 WITHSCORES
# 1) "player:bob"    2) "2300"
# 3) "player:carol"  4) "1800"
# 5) "player:alice"  6) "1500"
ZRANK leaderboard "player:carol"   # 1 (0-indexed, ascending)
ZREVRANK leaderboard "player:carol" # 1 (0-indexed, descending)

Persistence: RDB vs AOF#

Redis is primarily an in-memory database, but it offers two persistence mechanisms:

FeatureRDB (Snapshotting)AOF (Append-Only File)
How it worksPeriodic full snapshot to diskLogs every write command
Data loss riskUp to last snapshot intervalConfigurable: every second or every command
Recovery speedFast (load binary file)Slower (replay all commands)
File sizeCompact (binary format)Larger (text commands, but can be compacted)
CPU impactSpike during snapshot (fork)Steady (append to file)
1
2
3
4
5
6
7
# redis.conf: enable both for maximum safety
save 900 1        # snapshot if >= 1 key changed in 900 seconds
save 300 10       # snapshot if >= 10 keys changed in 300 seconds
save 60 10000     # snapshot if >= 10000 keys changed in 60 seconds

appendonly yes
appendfsync everysec   # fsync once per second (good balance)

Common Redis Patterns#

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
# Rate limiting (sliding window)
# Allow max 100 requests per minute per user
MULTI
ZADD ratelimit:user:1 1702345678.123 "req-uuid-1"
ZREMRANGEBYSCORE ratelimit:user:1 0 1702345618.123  # remove entries > 60s old
ZCARD ratelimit:user:1  # count remaining entries
EXPIRE ratelimit:user:1 60
EXEC

# Distributed lock (simplified)
SET lock:process-payments "" NX EX 30  # acquire lock, 30s timeout
# NX = only set if key does not exist
# Returns OK if acquired, nil if already locked

# Pub/Sub
SUBSCRIBE channel:orders
PUBLISH channel:orders '{"order_id": 1234, "action": "created"}'

# Cache-aside pattern
# 1. Check cache
GET product:42
# 2. If miss, query database and set cache
SET product:42 '{"name":"Widget","price":9.99}' EX 300  # 5-min TTL

Wide-Column Stores: Cassandra#

Different nosql database types as distinct architectural sty

Wide-column databases (sometimes called column-family stores) are designed for massive scale with predictable performance. Apache Cassandra is the most prominent example.

Column-family store layout

Data Model#

Cassandra uses tables with a primary key composed of:

  • Partition key: determines which node stores the data (distribution)
  • Clustering key: determines sort order within a partition
1
2
3
4
5
6
7
8
9
-- CQL (Cassandra Query Language)
CREATE TABLE user_activity (
    user_id     UUID,
    activity_date DATE,
    activity_time TIMESTAMP,
    activity_type TEXT,
    details       MAP<TEXT, TEXT>,
    PRIMARY KEY ((user_id), activity_date, activity_time)
) WITH CLUSTERING ORDER BY (activity_date DESC, activity_time DESC);

Here, user_id is the partition key and (activity_date, activity_time) is the clustering key. All activities for a single user are stored together on the same node, sorted by date and time descending.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
-- Insert data
INSERT INTO user_activity (user_id, activity_date, activity_time, activity_type, details)
VALUES (
    550e8400-e29b-41d4-a716-446655440000,
    '2023-12-15',
    '2023-12-15T14:30:00Z',
    'purchase',
    {'product_id': '42', 'amount': '149.99'}
);

-- Query: get recent activity for a user (efficient - single partition)
SELECT * FROM user_activity
WHERE user_id = 550e8400-e29b-41d4-a716-446655440000
  AND activity_date >= '2023-12-01'
LIMIT 20;

-- Query across partitions: AVOID THIS (full cluster scan)
-- SELECT * FROM user_activity WHERE activity_type = 'purchase';
-- This requires ALLOW FILTERING and scans the entire cluster

When to Use Cassandra#

Use caseWhy Cassandra fits
Time-series dataPartition by entity, cluster by time
IoT sensor dataMassive write throughput, predictable latency
User activity logsPartition by user, query recent activity
Messaging / chatPartition by conversation, cluster by timestamp
Geographic dataReplicate across data centers

Cassandra Anti-Patterns#

  • Random reads across partitions: Each partition may be on a different node
  • Joins: Not supported — denormalize or use materialized views
  • Lightweight transactions: Cassandra supports them but they are expensive (Paxos-based)
  • Secondary indexes on high-cardinality columns: Poor performance

Graph Databases: Neo4j#

When relationships are the data — social networks, fraud detection, recommendation engines, knowledge graphs — a graph database is the natural fit.

Graph database traversal

Data Model#

Graphs have two primitives:

  • Nodes (vertices): entities with labels and properties
  • Relationships (edges): typed connections between nodes, also with properties
1
2
3
4
5
(Alice:Person {name: "Alice", age: 30})
    -[:FRIENDS_WITH {since: 2020}]->
(Bob:Person {name: "Bob", age: 28})
    -[:WORKS_AT {role: "Engineer"}]->
(Acme:Company {name: "Acme Corp", industry: "Tech"})

Cypher Query Language#

// Create nodes and relationships
CREATE (alice:Person {name: "Alice", age: 30})
CREATE (bob:Person {name: "Bob", age: 28})
CREATE (carol:Person {name: "Carol", age: 32})
CREATE (acme:Company {name: "Acme Corp"})
CREATE (alice)-[:FRIENDS_WITH {since: 2020}]->(bob)
CREATE (alice)-[:FRIENDS_WITH {since: 2019}]->(carol)
CREATE (bob)-[:WORKS_AT {role: "Engineer", since: 2021}]->(acme)
CREATE (carol)-[:WORKS_AT {role: "Designer", since: 2020}]->(acme)

// Find Alice's friends
MATCH (alice:Person {name: "Alice"})-[:FRIENDS_WITH]->(friend)
RETURN friend.name, friend.age

// Find friends of friends (2 hops)
MATCH (alice:Person {name: "Alice"})-[:FRIENDS_WITH*2]->(fof)
WHERE fof <> alice
RETURN DISTINCT fof.name

// Shortest path between two people
MATCH path = shortestPath(
  (alice:Person {name: "Alice"})-[:FRIENDS_WITH*]-(bob:Person {name: "Bob"})
)
RETURN path, length(path)

// Recommendation: people who work at the same company as Alice's friends
MATCH (alice:Person {name: "Alice"})-[:FRIENDS_WITH]->(friend)-[:WORKS_AT]->(company)<-[:WORKS_AT]-(colleague)
WHERE NOT (alice)-[:FRIENDS_WITH]->(colleague)
  AND colleague <> alice
RETURN colleague.name, company.name, count(*) AS mutual_connections
ORDER BY mutual_connections DESC

Graph vs Relational: The JOIN Problem#

Finding friends-of-friends-of-friends in SQL:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
-- 3-hop friend query in SQL (painful)
SELECT DISTINCT p4.name
FROM friendships f1
JOIN friendships f2 ON f1.friend_id = f2.person_id
JOIN friendships f3 ON f2.friend_id = f3.person_id
JOIN people p4 ON f3.friend_id = p4.person_id
WHERE f1.person_id = 1
  AND p4.person_id != 1;
-- Performance degrades exponentially with each hop
-- On a social graph with millions of users, this is impractical

The same in Cypher:

MATCH (alice:Person {id: 1})-[:FRIENDS_WITH*3]->(fofof)
WHERE fofof <> alice
RETURN DISTINCT fofof.name
// Graph databases use index-free adjacency — each node directly
// references its neighbors. No join tables, no index lookups.
// Performance depends on the number of results, not the total graph size.

Time-Series Databases#

Time-series data — metrics, IoT sensor readings, financial ticks, application logs — has unique access patterns: write-heavy, append-only, queries almost always filter by time range, and old data can be downsampled or expired.

Why Not Just PostgreSQL?#

PostgreSQL can store time-series data, but at scale it struggles:

ChallengeRDBMS approachTime-series DB approach
Billions of rowsTable bloat, slow vacuumPartitioned by time, automatic retention
Write throughputWAL bottleneck at >100K rows/sBatched append, LSM/columnar
Range queriesB-tree seeks per rowSequential scan of time-ordered chunks
Data retentionManual DELETE + vacuumAutomatic TTL policies
AggregationFull scan or materialized viewsPre-computed rollups

TimescaleDB (PostgreSQL Extension)#

TimescaleDB adds time-series superpowers to PostgreSQL — you keep full SQL compatibility:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
-- Create a hypertable (auto-partitioned by time)
CREATE TABLE metrics (
    time TIMESTAMPTZ NOT NULL,
    host VARCHAR(50),
    cpu_usage FLOAT,
    memory_usage FLOAT,
    disk_io FLOAT
);

SELECT create_hypertable('metrics', 'time');

-- Insert works exactly like normal PostgreSQL
INSERT INTO metrics VALUES
(NOW(), 'web-01', 72.3, 84.1, 12.5),
(NOW(), 'web-02', 45.8, 62.4, 8.2);

-- Time-bucket aggregation (unique to TimescaleDB)
SELECT
    time_bucket('5 minutes', time) AS bucket,
    host,
    AVG(cpu_usage) AS avg_cpu,
    MAX(cpu_usage) AS peak_cpu
FROM metrics
WHERE time > NOW() - INTERVAL '1 hour'
GROUP BY bucket, host
ORDER BY bucket DESC;

-- Continuous aggregates (materialized, auto-refreshed)
CREATE MATERIALIZED VIEW hourly_metrics
WITH (timescaledb.continuous) AS
SELECT
    time_bucket('1 hour', time) AS hour,
    host,
    AVG(cpu_usage) AS avg_cpu,
    percentile_cont(0.95) WITHIN GROUP (ORDER BY cpu_usage) AS p95_cpu
FROM metrics
GROUP BY hour, host;

-- Data retention policy (auto-delete data older than 90 days)
SELECT add_retention_policy('metrics', INTERVAL '90 days');

InfluxDB#

InfluxDB uses a custom query language (Flux) optimized for time-series operations:

from(bucket: "monitoring")
  |> range(start: -1h)
  |> filter(fn: (r) => r._measurement == "cpu" and r.host == "web-01")
  |> aggregateWindow(every: 5m, fn: mean)
  |> yield()

Choosing a Time-Series Database#

DatabaseBest forQuery languageDeployment
TimescaleDBTeams already on PostgreSQLSQL (full)Extension (self-host or cloud)
InfluxDBMetrics/monitoring pipelinesFlux / InfluxQLStandalone / Cloud
QuestDBUltra-low-latency ingestionSQL subsetStandalone
ClickHouseAnalytics on event dataSQL (extended)Standalone / Cloud
PrometheusPull-based metrics collectionPromQLStandalone (with remote storage)

Vector Databases#

Vector databases store high-dimensional embeddings and support similarity search. They power semantic search, recommendation systems, RAG (Retrieval-Augmented Generation), and image retrieval.

How Vector Search Works#

Traditional databases find exact matches. Vector databases find the nearest neighbors in embedding space:

1
2
3
4
5
6
Query: "How to deploy a Python app?"
                                        Cosine similarity
Embedding → [0.23, -0.14, 0.87, ...]  ─────────────────→  Top-K results
                                        0.94: "Deploying Flask to production"
                                        0.91: "Python app Docker guide"
                                        0.87: "CI/CD pipeline setup"

Index Types#

AlgorithmSpeedAccuracyMemoryBest for
Flat (brute force)Slow100%Low<100K vectors
IVF (Inverted File)Fast~95%Medium100K-10M vectors
HNSW (Hierarchical NSW)Very fast~98%HighGeneral purpose
PQ (Product Quantization)Fast~90%Very lowBillions of vectors

Milvus / Zilliz#

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
from pymilvus import connections, Collection, FieldSchema, CollectionSchema, DataType

# Connect
connections.connect(host="localhost", port="19530")

# Define schema
fields = [
    FieldSchema("id", DataType.INT64, is_primary=True),
    FieldSchema("text", DataType.VARCHAR, max_length=1000),
    FieldSchema("embedding", DataType.FLOAT_VECTOR, dim=1536),
]
schema = CollectionSchema(fields)
collection = Collection("documents", schema)

# Create HNSW index
collection.create_index("embedding", {
    "index_type": "HNSW",
    "metric_type": "COSINE",
    "params": {"M": 16, "efConstruction": 200},
})

# Search
results = collection.search(
    data=[query_embedding],  # your 1536-dim vector
    anns_field="embedding",
    param={"metric_type": "COSINE", "params": {"ef": 100}},
    limit=10,
    output_fields=["text"],
)

pgvector (PostgreSQL Extension)#

For teams already on PostgreSQL who want vector search without a new database:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
-- Enable extension
CREATE EXTENSION vector;

-- Table with vector column
CREATE TABLE documents (
    id SERIAL PRIMARY KEY,
    content TEXT,
    embedding vector(1536)  -- OpenAI ada-002 dimension
);

-- HNSW index for fast similarity search
CREATE INDEX ON documents
USING hnsw (embedding vector_cosine_ops)
WITH (m = 16, ef_construction = 200);

-- Semantic search
SELECT id, content, 1 - (embedding <=> $1) AS similarity
FROM documents
ORDER BY embedding <=> $1  -- <=> is cosine distance
LIMIT 10;

When to Use a Dedicated Vector DB vs pgvector#

FactorpgvectorDedicated (Milvus, Pinecone, Qdrant)
Vector count<5MMillions to billions
Existing stackAlready on PostgreSQLGreenfield or specialized
Hybrid queriesSQL + vector in one querySeparate systems
ThroughputModerate (~1K QPS)High (~100K QPS)
FilteringFull SQL WHERE clauseMetadata filtering (limited)
Operational costZero (extension)New infrastructure

Multi-Model Databases#

Some databases support multiple data models in a single system, eliminating the need to synchronize between separate databases.

Examples#

DatabaseModels supportedUse case
PostgreSQL + extensionsRelational, Document (jsonb), Vector (pgvector), Time-series (TimescaleDB), Graph (Apache AGE)“Everything database” for medium scale
ArangoDBDocument, Graph, Key-ValueApplications needing graph + document
SurrealDBDocument, Graph, RelationalNew projects wanting flexibility
FaunaDBDocument, Relational, TemporalServerless, multi-region
CosmosDBDocument, Graph, Key-Value, Column-family, TableAzure ecosystem

The PostgreSQL Ecosystem Approach#

Instead of learning a new database, extend PostgreSQL:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
-- Document model (jsonb)
CREATE TABLE products (id SERIAL PRIMARY KEY, data JSONB);

-- Time-series (TimescaleDB)
CREATE TABLE metrics (time TIMESTAMPTZ, value FLOAT);
SELECT create_hypertable('metrics', 'time');

-- Vector search (pgvector)
CREATE TABLE embeddings (id INT, vec vector(768));

-- Graph queries (Apache AGE)
LOAD 'age';
SELECT * FROM cypher('social', $$ MATCH (a)-[:FOLLOWS]->(b) RETURN a, b $$) AS (a agtype, b agtype);

-- Full-text search (built-in)
CREATE INDEX idx_fts ON articles USING GIN (to_tsvector('english', content));
SELECT * FROM articles WHERE to_tsvector('english', content) @@ to_tsquery('database & vector');

One deployment, one backup strategy, one connection pool, full ACID across all models. The tradeoff: each extension is competent but not best-in-class at extreme scale.

The CAP Theorem#

The CAP theorem states that a distributed system can provide at most two of these three guarantees:

CAP theorem tradeoffs

  • Consistency: Every read receives the most recent write
  • Availability: Every request receives a response (even if not the most recent data)
  • Partition Tolerance: The system continues operating despite network partitions

Since network partitions are unavoidable in distributed systems, the real choice is between CP and AP:

ChoiceBehavior during partitionExamples
CP (Consistency + Partition Tolerance)Refuses requests it cannot guarantee are consistentHBase, MongoDB (with majority write concern), etcd, ZooKeeper
AP (Availability + Partition Tolerance)Serves requests but may return stale dataCassandra, DynamoDB, CouchDB, Riak
CA (Consistency + Availability)Not possible in a distributed systemSingle-node PostgreSQL / MySQL (not distributed)

In practice, most databases let you tune the consistency/availability trade-off per operation:

1
2
3
4
5
6
7
8
// MongoDB: tunable write concern
await collection.insertOne(doc, {
  writeConcern: { w: "majority", j: true }  // CP behavior
});

await collection.insertOne(doc, {
  writeConcern: { w: 1 }  // AP behavior (acknowledged by primary only)
});
1
2
3
4
5
6
7
-- Cassandra: tunable consistency per query
-- Quorum reads + quorum writes = strong consistency
SELECT * FROM users WHERE user_id = ? CONSISTENCY QUORUM;
INSERT INTO users (...) VALUES (...) USING CONSISTENCY QUORUM;

-- ONE = fast but possibly stale
SELECT * FROM users WHERE user_id = ? CONSISTENCY ONE;

NewSQL: The Best of Both Worlds?#

NewSQL databases attempt to provide SQL + ACID + horizontal scaling:

DatabaseArchitectureKey feature
CockroachDBRaft consensus, range-based shardingPostgreSQL wire protocol, survives zone failures
TiDBTiKV storage (RocksDB) + TiDB SQL layerMySQL protocol compatible, HTAP (hybrid)
YugabyteDBDocDB storage, Raft consensusPostgreSQL and Cassandra compatible APIs
Google SpannerTrueTime (atomic clocks), PaxosGlobal consistency with external consistency
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
-- CockroachDB: looks like PostgreSQL, scales like Cassandra
CREATE TABLE orders (
    order_id UUID DEFAULT gen_random_uuid() PRIMARY KEY,
    user_id UUID NOT NULL,
    total DECIMAL(10,2),
    created_at TIMESTAMPTZ DEFAULT now()
);

-- Transactions work exactly like PostgreSQL
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
-- But the data is distributed across multiple nodes with Raft consensus

Decision Table: Choosing the Right Database#

RequirementBest fitExamples
ACID transactions, complex queriesRelationalPostgreSQL, MySQL
Flexible schema, nested documentsDocument storeMongoDB, Firestore
Ultra-low latency cachingKey-valueRedis, Memcached
Massive write throughput, time-seriesWide-columnCassandra, HBase
Relationship-heavy queriesGraphNeo4j, Amazon Neptune
SQL + horizontal scalingNewSQLCockroachDB, TiDB
Real-time analyticsColumn-orientedClickHouse, DuckDB
Full-text searchSearch engineElasticsearch, Meilisearch
Global distribution with strong consistencyManaged NewSQLGoogle Spanner, CockroachDB

The right answer is often “PostgreSQL” for your primary data store, with a specialized database for specific workloads. Most successful systems use 2-3 databases, not one.

What’s Next#

Whether you choose relational or NoSQL, a single machine eventually becomes a bottleneck. In the next article, we will explore replication and partitioning — the techniques that let databases scale beyond one server while maintaining (some level of) consistency.

In this series

Databases 8 parts

  1. 01 Databases (1): Data Models and SQL — Why Tables Won (For Now)
  2. 02 Databases (2): Indexing and Query Planning — How Databases Find Your Data
  3. 03 Databases (3): Transactions and Concurrency — ACID, Isolation Levels, and Locking
  4. 04 Databases (4): Storage Engines — How Data Hits Disk
  5. 05 Databases (5): NoSQL — Document, Key-Value, Column, and Graph you are here
  6. 06 Databases (6): Replication and Partitioning — Scaling Beyond One Machine
  7. 07 Databases (7): Distributed Transactions — 2PC, Saga, and Why Consensus Is Hard
  8. 08 Databases (8): Databases in Practice — Migration, Monitoring, and War Stories

Liked this piece?

Follow on GitHub for the next one — usually one a week.

GitHub