SDI.
All Concepts
DatabasesB-treehash-indexquery-optimizationcomposite-indexcovering-indexPostgreSQLMySQLperformance

Database Indexing

A database index is a data structure (typically a B-tree or hash table) that provides fast lookup of rows by specific column values, trading storage space and write overhead for dramatically faster read queries.

Without an index, a database must scan every row in a table to find matching records (full table scan) — O(n). An index creates a sorted data structure that allows O(log n) lookups, similar to a book's index. The most common index type is a B-tree (balanced tree), which supports equality lookups, range queries, and sorting. Indexes dramatically speed up reads but slow down writes (every INSERT/UPDATE must also update the index) and consume additional storage. Choosing the right indexes is one of the most impactful performance optimizations in any application.

Tradeoffs

Strengths

  • Dramatic read performance: Turns O(n) full table scans into O(log n) lookups — often 1000–100,000x faster.
  • Enables sorting: Indexes pre-sort data, eliminating expensive in-memory sorts.
  • Constraint enforcement: Unique indexes enforce data integrity at the database level.
  • Covering indexes: Can answer queries entirely from the index, avoiding heap access entirely.
  • Partial indexes: Index only relevant subsets of data, saving space and improving performance.

Weaknesses

  • Write overhead: Every INSERT, UPDATE, and DELETE must update all relevant indexes. Tables with many indexes have significantly slower writes.
  • Storage cost: Indexes consume additional disk space — sometimes 50–100% of the base table size.
  • Maintenance: Indexes can become bloated and fragmented, requiring periodic maintenance (VACUUM, REINDEX).
  • Complexity: Choosing optimal indexes requires understanding query patterns, selectivity, and the query planner.
  • Over-indexing risk: Too many indexes waste storage, slow writes, and increase backup size.
  • Update-heavy tables: Frequently updated columns in an index cause high index churn.

Likely Follow-Up Questions

  • How do you decide which columns to index?
  • What is the difference between a clustered and a non-clustered index?
  • How do composite indexes work with the leftmost prefix rule?
  • What is index bloat and how do you address it?
  • When would you use a GIN index vs. a B-tree index?
  • How do you use EXPLAIN ANALYZE to optimize queries?

Source: editorial — Synthesized from PostgreSQL and MySQL documentation, Use The Index Luke (by Markus Winand), and database performance engineering best practices.

Command Palette

Search for a command to run...