OLAPOps: medium

ClickHouse

ClickHouse is an open-source columnar OLAP database management system designed for real-time analytical queries on billions of rows with sub-second response times. Developed at Yandex for web analytics (Yandex.Metrica processes over 20 billion events per day), it uses columnar storage with aggressive compression, vectorized query execution, and sparse indexing to achieve orders-of-magnitude better performance than row-oriented databases for analytical workloads. ClickHouse supports SQL with extensions for approximate query processing, materialized views, and array/map operations.

Strengths

Columnar storage with LZ4/ZSTD compression achieves 10-40x compression ratios, reducing I/O dramaticallyVectorized query execution processes data in batches using SIMD instructions for CPU efficiencyIngestion rates of millions of rows per second per node with async inserts and buffer tablesMergeTree engine family supports automatic data partitioning, TTL-based expiration, and background mergesApproximate query functions (uniq, quantile) trade minimal accuracy for massive speed improvementsMaterialized views enable incremental pre-aggregation at insert time for dashboard acceleration

Weaknesses

Not designed for point lookups or OLTP; primary key lookups scan sparse index blocks, not individual rowsUPDATE and DELETE operations are heavy 'mutations' that rewrite entire data parts asynchronouslyNo built-in ACID transactions; concurrent inserts to the same partition can create many small partsJoins on large tables are memory-intensive; hash joins require the right-side table to fit in RAMReplication relies on ZooKeeper/ClickHouse Keeper, adding operational complexityDistributed queries across shards can be slow if data is not co-located by the sharding key

Ideal Workloads

-Real-time analytics dashboards over billions of events (web analytics, product analytics, ad tech)
-Log and event storage with time-based partitioning and TTL-based automatic cleanup
-Financial and business intelligence reporting with complex aggregations and window functions
-Metrics and observability backends as an alternative to specialized time-series databases

Scaling Model

Scales vertically by leveraging all CPU cores via parallel query execution. Horizontal scaling via sharding with a Distributed table engine that routes queries across shards. Each shard is typically a ReplicatedMergeTree with 2-3 replicas using ClickHouse Keeper for coordination. Data is inserted into local tables and queried via distributed tables. SharedMergeTree (ClickHouse Cloud) decouples compute from storage using object storage.

Consistency Model

Strong consistency within a single node for reads (read-your-own-writes). Replicated tables use a log-based replication model where all replicas converge to the same state by replaying the same sequence of operations. Mutations (UPDATEs/DELETEs) are applied asynchronously and eventually consistent. INSERT operations are atomic at the block level (default ~1M rows or 1 partition). No multi-statement transactions.

When to Use

You need sub-second analytical queries over billions of rows without pre-aggregation
Your workload is append-heavy with rare updates or deletes (event logs, metrics, clickstreams)
You want high compression ratios to minimize storage costs for large analytical datasets
You need real-time materialized views for dashboard acceleration
You are building an observability or product analytics platform

When Not to Use

You need OLTP capabilities with frequent point updates and deletes
Your workload requires multi-row ACID transactions
You need low-latency key-value lookups by primary key
Your queries involve complex many-to-many joins across large tables
You want a fully managed, zero-ops experience (though ClickHouse Cloud is improving this)

Source: editorial — Based on ClickHouse documentation, Yandex.Metrica architecture, and OLAP benchmarking data