DatabasesB-treehash-indexquery-optimizationcomposite-indexcovering-indexPostgreSQLMySQLperformance

Database Indexing

A database index is a data structure (typically a B-tree or hash table) that provides fast lookup of rows by specific column values, trading storage space and write overhead for dramatically faster read queries.

Without an index, a database must scan every row in a table to find matching records (full table scan) — O(n). An index creates a sorted data structure that allows O(log n) lookups, similar to a book's index. The most common index type is a B-tree (balanced tree), which supports equality lookups, range queries, and sorting. Indexes dramatically speed up reads but slow down writes (every INSERT/UPDATE must also update the index) and consume additional storage. Choosing the right indexes is one of the most impactful performance optimizations in any application.

Tradeoffs

Strengths

Dramatic read performance: Turns O(n) full table scans into O(log n) lookups — often 1000–100,000x faster.
Enables sorting: Indexes pre-sort data, eliminating expensive in-memory sorts.
Constraint enforcement: Unique indexes enforce data integrity at the database level.
Covering indexes: Can answer queries entirely from the index, avoiding heap access entirely.
Partial indexes: Index only relevant subsets of data, saving space and improving performance.

Weaknesses

Write overhead: Every INSERT, UPDATE, and DELETE must update all relevant indexes. Tables with many indexes have significantly slower writes.
Storage cost: Indexes consume additional disk space — sometimes 50–100% of the base table size.
Maintenance: Indexes can become bloated and fragmented, requiring periodic maintenance (VACUUM, REINDEX).
Complexity: Choosing optimal indexes requires understanding query patterns, selectivity, and the query planner.
Over-indexing risk: Too many indexes waste storage, slow writes, and increase backup size.
Update-heavy tables: Frequently updated columns in an index cause high index churn.

Likely Follow-Up Questions

How do you decide which columns to index?
What is the difference between a clustered and a non-clustered index?
How do composite indexes work with the leftmost prefix rule?
What is index bloat and how do you address it?
When would you use a GIN index vs. a B-tree index?
How do you use EXPLAIN ANALYZE to optimize queries?

Related Concepts

SQL vs NoSQL Database Sharding Database Replication Caching

Source: editorial — Synthesized from PostgreSQL and MySQL documentation, Use The Index Luke (by Markus Winand), and database performance engineering best practices.