Graph · Ops: medium

Neo4j

Neo4j is a widely used native graph database, purpose-built for storing and traversing highly connected data using nodes, relationships, and properties. Its index-free adjacency storage model means that following a single relationship from a node is a constant-time, O(1), operation regardless of total graph size, so the cost of a traversal depends on the part of the graph it visits rather than on the whole dataset. In a relational database, by contrast, each hop is a join whose cost grows with table size. Neo4j's Cypher query language provides an intuitive, pattern-matching syntax for expressing graph traversals, path finding, and complex relationship queries.
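The idea behind index-free adjacency can be illustrated with a minimal Python sketch. This is a toy in-memory model, not Neo4j's storage engine: the `Node` class and the `connect`, `neighbors`, and `neighbors_via_join` helpers are hypothetical names introduced only for illustration. Each node keeps direct references to its own relationships, so a one-hop lookup touches only that node, while the join-style lookup without an index scans every row of an edge table.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy node that stores direct references to its outgoing relationships."""
    name: str
    out: list = field(default_factory=list)  # list of (rel_type, Node) pairs


def connect(src, rel_type, dst):
    # Store the relationship on the source node itself (adjacency by reference).
    src.out.append((rel_type, dst))


def neighbors(node, rel_type):
    # Work is proportional to this node's own relationships, not to graph size.
    return [dst.name for (t, dst) in node.out if t == rel_type]


def neighbors_via_join(edge_table, src_name, rel_type):
    # Relational-style contrast: with no index, every row of the table is scanned.
    return [dst for (src, t, dst) in edge_table if src == src_name and t == rel_type]


alice, bob, carol = Node("alice"), Node("bob"), Node("carol")
connect(alice, "FOLLOWS", bob)
connect(alice, "FOLLOWS", carol)
connect(bob, "FOLLOWS", carol)

edge_table = [
    ("alice", "FOLLOWS", "bob"),
    ("alice", "FOLLOWS", "carol"),
    ("bob", "FOLLOWS", "carol"),
]

print(neighbors(alice, "FOLLOWS"))
print(neighbors_via_join(edge_table, "alice", "FOLLOWS"))
```

Both calls return the same neighbors; the difference is how much data each one has to look at as the graph grows.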

Strengths

  • Index-free adjacency enables constant-time relationship traversals regardless of total graph size
  • Cypher query language is expressive and readable for complex graph patterns, shortest paths, and subgraph matching
  • ACID-compliant transactions with full support for constraints, indexes, and schema enforcement
  • Graph Data Science (GDS) library, available as a plugin, for PageRank, community detection, pathfinding, and centrality analysis
  • Native graph storage engine optimized for deep traversals (6+ hops) that would require expensive recursive joins in SQL

Weaknesses

  • Not designed for aggregation-heavy analytics or tabular reporting; lacks columnar storage optimizations
  • Horizontal scaling is limited; sharding a graph across nodes requires Fabric or application-level partitioning
  • Memory requirements are high because performance depends on caching the graph structure in RAM
  • Bulk loading large datasets (billions of nodes) requires the offline neo4j-admin import tool (`neo4j-admin database import` in Neo4j 5.x); online bulk inserts are slower
  • The Cypher query optimizer can produce suboptimal plans for complex queries, which then need tuning with PROFILE and EXPLAIN
  • Community edition is single-instance only; clustering requires an Enterprise license

Ideal Workloads

  • Social networks and recommendation engines modeling user-to-user and user-to-item relationships
  • Fraud detection systems identifying suspicious patterns across transaction graphs in real time
  • Knowledge graphs and ontologies for semantic search, content tagging, and entity resolution
  • Network and IT infrastructure mapping with dependency analysis and impact assessment
  • Identity and access management modeling complex role hierarchies and permission inheritance
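As one illustration of these workloads, permission inheritance over a role hierarchy is a variable-depth traversal. The sketch below is plain Python rather than Cypher, and the role names, the `parents` and `grants` mappings, and the `effective_permissions` helper are all hypothetical. It walks from a role up through every ancestor role and collects the permissions granted along the way, which is the kind of query a graph database expresses as a single variable-length pattern.

```python
def effective_permissions(role, parents, grants):
    """Collect permissions granted to a role and to all of its ancestor roles."""
    seen, stack, perms = set(), [role], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guard against cycles and repeated ancestors
        seen.add(current)
        perms |= grants.get(current, set())
        stack.extend(parents.get(current, ()))
    return perms


# Hypothetical hierarchy: admin inherits from editor, editor inherits from viewer.
parents = {"admin": ["editor"], "editor": ["viewer"]}
grants = {"viewer": {"read"}, "editor": {"write"}, "admin": {"delete"}}

print(sorted(effective_permissions("admin", parents, grants)))
print(sorted(effective_permissions("editor", parents, grants)))
```

The depth of the hierarchy does not appear anywhere in the code, which is the property that makes such data awkward to query with a fixed number of SQL joins.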

Scaling Model

Neo4j primarily scales vertically, by adding memory so that more of the graph fits in cache. Reads scale out through clustering: core servers (primaries in Neo4j 5.x terminology) handle writes, and read replicas (secondaries) serve additional read traffic. Neo4j Fabric (surfaced as composite databases in Neo4j 5.x) enables federated queries across multiple databases for logical sharding. Neo4j Aura, the managed service, handles infrastructure scaling. True horizontal write scaling across partitioned subgraphs requires careful data modeling to minimize cross-partition traversals.
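Why partitioning depends on data modeling can be shown with a small Python sketch. It is not tied to any Neo4j API: the `cross_partition_edges` helper, the node names, and both partition assignments are hypothetical. It counts the relationships whose endpoints land on different partitions, since each of those would turn a local hop into a cross-partition traversal.

```python
def cross_partition_edges(edges, partition_of):
    """Return the relationships whose endpoints live on different partitions."""
    return [(a, b) for a, b in edges if partition_of[a] != partition_of[b]]


# Hypothetical graph: u1, u2, u3 form one tight group; u4, u5 form another.
edges = [("u1", "u2"), ("u2", "u3"), ("u1", "u3"), ("u4", "u5"), ("u3", "u4")]

# Assignment that follows the graph's community structure.
by_community = {"u1": 0, "u2": 0, "u3": 0, "u4": 1, "u5": 1}

# Assignment that ignores structure (alternating placement).
by_round_robin = {"u1": 0, "u2": 1, "u3": 0, "u4": 1, "u5": 0}

print(len(cross_partition_edges(edges, by_community)))
print(len(cross_partition_edges(edges, by_round_robin)))
```

In this toy graph the structure-aware assignment cuts one relationship and the alternating assignment cuts four, so the same data produces very different amounts of cross-partition work.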

Consistency Model

Neo4j is fully ACID-compliant, with read-committed isolation by default. In a cluster it provides causal consistency: a bookmark token from a write transaction can be passed along with a later read so that the server answering it, including a read replica, returns data at least as fresh as that write. Core servers use Raft consensus for leader election and log replication. Read replicas are eventually consistent on their own but can enforce causal ordering via bookmarks.
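The bookmark mechanism can be sketched as a toy simulation in Python. This is not the Neo4j driver API: the `Leader` and `Replica` classes and their methods are hypothetical, and the bookmark is modeled as a plain transaction id. One simplification is labeled in the code: this replica rejects a read it cannot yet satisfy, whereas a real deployment would be expected to wait for the replica to catch up.

```python
class Leader:
    """Toy write leader that appends every write to an ordered log."""

    def __init__(self):
        self.log = []  # list of (tx_id, key, value)

    def write(self, key, value):
        tx_id = len(self.log) + 1
        self.log.append((tx_id, key, value))
        return tx_id  # the transaction id doubles as the bookmark


class Replica:
    """Toy read replica that applies the leader's log asynchronously."""

    def __init__(self):
        self.applied_tx = 0  # id of the last transaction applied locally
        self.data = {}

    def catch_up(self, leader):
        # Apply every log entry this replica has not seen yet, in order.
        for tx_id, key, value in leader.log[self.applied_tx:]:
            self.data[key] = value
            self.applied_tx = tx_id

    def read(self, key, bookmark=0):
        # Simplification: reject instead of waiting when behind the bookmark.
        if self.applied_tx < bookmark:
            raise RuntimeError("replica is behind the bookmark")
        return self.data.get(key)


leader, replica = Leader(), Replica()
bookmark = leader.write("balance", 100)

print(replica.read("balance"))            # no bookmark: a stale answer is allowed
replica.catch_up(leader)
print(replica.read("balance", bookmark))  # with bookmark: at least as fresh as the write
```

Without a bookmark the replica may answer from stale state; with one, it answers only once it has applied the write the bookmark refers to.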

When to Use

  • Your data is inherently graph-structured with complex, variable-depth relationships
  • You need real-time traversals across many relationship hops (friend-of-friend, shortest path, pattern matching)
  • You are building a recommendation engine, fraud detection system, or knowledge graph
  • Relationship queries in your relational database require recursive CTEs or multiple self-joins and are too slow
  • You need graph algorithms (PageRank, community detection, betweenness centrality) as first-class operations
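A friend-of-friend style query is a bounded breadth-first traversal. The Python sketch below assumes a hypothetical adjacency mapping and a hypothetical `within_hops` helper; it stands in for what a variable-length pattern match does conceptually and is not how Neo4j executes Cypher. It returns every node reachable within a hop limit together with its shortest hop distance.

```python
from collections import deque


def within_hops(graph, start, max_hops):
    """Return {node: hop_distance} for nodes reachable in 1..max_hops hops."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in graph.get(node, ()):
            if nxt not in dist:  # first visit is the shortest distance in BFS
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    del dist[start]  # the start node is not its own friend
    return dist


# Hypothetical friendship chain: ann - bob - cat - dan.
graph = {
    "ann": ["bob"],
    "bob": ["ann", "cat"],
    "cat": ["bob", "dan"],
    "dan": ["cat"],
}

print(within_hops(graph, "ann", 2))
```

Raising `max_hops` widens the search without changing the code, which mirrors how a graph query changes only its depth bound while an equivalent SQL query would need another self-join or a recursive CTE.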

When Not to Use

  • Your data is primarily tabular with simple foreign-key relationships and few joins
  • You need high-throughput bulk analytics or aggregations over millions of records
  • Your workload is write-heavy with millions of inserts per second (consider a wide-column or key-value store)
  • You need to shard your data across many nodes for horizontal write scalability
  • Your relationships are simple and shallow (1-2 hops); a relational database with proper indexing will suffice

Source: editorial — Based on Neo4j 5.x documentation and graph database design patterns
