Neo4j
Neo4j is a widely used native graph database, purpose-built for storing and traversing highly connected data as nodes, relationships, and properties. Its index-free adjacency storage model means each node holds direct references to its relationships, so following a single relationship is an O(1) operation regardless of total graph size, and the cost of a traversal grows with the number of relationships it visits rather than with the size of the graph. In a relational database, by contrast, each hop is a join whose cost typically grows with table size. Neo4j's Cypher query language provides an intuitive, pattern-matching syntax for expressing graph traversals, path finding, and complex relationship queries.
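The idea behind index-free adjacency can be sketched in a few lines of plain Python. This is an illustrative in-memory model only, not Neo4j's storage format or API; the `Node` class, the `FRIEND` relationship type, and the `friends_of_friends` helper are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """A graph node that holds direct references to its neighbours."""
    name: str
    # Outgoing relationships grouped by type, e.g. {"FRIEND": [Node, ...]}
    out: dict = field(default_factory=dict)

    def relate(self, rel_type: str, other: "Node") -> None:
        self.out.setdefault(rel_type, []).append(other)

    def neighbours(self, rel_type: str) -> list:
        # No global index lookup or table scan: follow the stored references.
        return self.out.get(rel_type, [])


def friends_of_friends(person: Node) -> set:
    """Names reachable in two FRIEND hops, excluding the person and direct friends."""
    direct = {id(f) for f in person.neighbours("FRIEND")}
    result = set()
    for friend in person.neighbours("FRIEND"):
        for fof in friend.neighbours("FRIEND"):
            if fof is not person and id(fof) not in direct:
                result.add(fof.name)
    return result


alice, bob, carol, dave = (Node(n) for n in ("alice", "bob", "carol", "dave"))
alice.relate("FRIEND", bob)
bob.relate("FRIEND", alice)
bob.relate("FRIEND", carol)
carol.relate("FRIEND", dave)

# Unrelated nodes enlarge the graph but are never touched by the traversal.
crowd = [Node(f"user{i}") for i in range(10_000)]

print(friends_of_friends(alice))  # {'carol'}
```

The traversal only ever touches the nodes it walks through, which is why the 10,000 unrelated nodes do not affect its cost.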
Strengths
Weaknesses
Ideal Workloads
- Social networks and recommendation engines modeling user-to-user and user-to-item relationships
- Fraud detection systems identifying suspicious patterns across transaction graphs in real time
- Knowledge graphs and ontologies for semantic search, content tagging, and entity resolution
- Network and IT infrastructure mapping with dependency analysis and impact assessment
- Identity and access management modeling complex role hierarchies and permission inheritance
Scaling Model
Neo4j scales primarily vertically: adding memory lets it cache more of the graph. Reads scale out through causal clustering, in which core servers accept writes and read replicas serve read queries (Neo4j 5.x calls these roles primaries and secondaries). Neo4j Fabric (composite databases in 5.x) enables federated queries across multiple databases, which allows logical sharding. Neo4j Aura, the managed service, handles infrastructure scaling. True horizontal write scaling requires partitioning the graph into subgraphs, which in turn requires careful data modeling to minimize cross-partition traversals.
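To make the cost of cross-partition traversals concrete, the sketch below counts how many relationships are cut by two different ways of assigning nodes to partitions. It is a toy calculation on a made-up six-node graph; the function name and both partition maps are assumptions for illustration and do not correspond to any Neo4j feature.

```python
def cross_partition_edges(edges, partition_of):
    """Count relationships whose endpoints live in different partitions."""
    return sum(1 for a, b in edges if partition_of[a] != partition_of[b])


# Two tightly knit groups joined by a single relationship (carol-dave).
edges = [
    ("alice", "bob"), ("bob", "carol"), ("carol", "alice"),
    ("dave", "erin"), ("erin", "frank"),
    ("carol", "dave"),
]

# Partitioning that follows the natural communities in the data.
by_community = {"alice": 0, "bob": 0, "carol": 0, "dave": 1, "erin": 1, "frank": 1}

# Partitioning that ignores structure (nodes dealt out alternately).
round_robin = {"alice": 0, "bob": 1, "carol": 0, "dave": 1, "erin": 0, "frank": 1}

print(cross_partition_edges(edges, by_community))  # 1
print(cross_partition_edges(edges, round_robin))   # 5
```

Every cut relationship is a hop that would leave one database and continue in another, so a model that keeps related nodes together keeps most traversals local.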
Consistency Model
Neo4j transactions are fully ACID-compliant, with read-committed isolation by default. Causal clustering provides causal consistency: a bookmark token from a write transaction can be passed to read replicas to ensure they return data at least as fresh as that write. Core servers use Raft consensus for leader election and log replication. Without a bookmark, read replicas are only eventually consistent and may briefly serve stale data.
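The bookmark mechanism described above can be illustrated with a small simulation. This is a toy model, not the Neo4j driver API: the `Leader` and `Replica` classes are hypothetical, the bookmark is simplified to a log position, and the replica pulls missing entries immediately, where a real replica would wait until replication has caught up.

```python
class Leader:
    """Toy write leader: an append-only log of committed writes."""

    def __init__(self):
        self.log = []

    def write(self, key, value):
        self.log.append((key, value))
        # The bookmark is the position of this commit in the log.
        return len(self.log)


class Replica:
    """Toy read replica that applies the leader's log lazily."""

    def __init__(self, leader):
        self.leader = leader
        self.applied = 0  # number of log entries applied so far
        self.data = {}

    def catch_up(self, up_to=None):
        target = len(self.leader.log) if up_to is None else up_to
        for key, value in self.leader.log[self.applied:target]:
            self.data[key] = value
        self.applied = max(self.applied, target)

    def read(self, key, bookmark=0):
        # With a bookmark, never answer from a state older than that write.
        if self.applied < bookmark:
            self.catch_up(bookmark)
        return self.data.get(key)


leader = Leader()
replica = Replica(leader)

bookmark = leader.write("balance", 100)

stale = replica.read("balance")                     # replica has not replicated yet
fresh = replica.read("balance", bookmark=bookmark)  # forced up to the write

print(stale, fresh)  # None 100
```

The first read shows plain eventual consistency, where the replica answers from whatever it has applied. The second shows read-your-own-writes behaviour once the bookmark is supplied.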
When to Use
- Your data is inherently graph-structured with complex, variable-depth relationships
- You need real-time traversals across many relationship hops (friend-of-friend, shortest path, pattern matching)
- You are building a recommendation engine, fraud detection system, or knowledge graph
- Relationship queries in your relational database require recursive CTEs or multiple self-joins and are too slow
- You need graph algorithms (PageRank, community detection, betweenness centrality) as first-class operations
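As an illustration of the variable-depth traversals mentioned in the list above, here is a minimal breadth-first shortest-path search over an adjacency map, in plain Python. In Neo4j this kind of query would normally be expressed declaratively in Cypher (for example with its `shortestPath` function) rather than hand-written; the graph data and the `shortest_path` helper below are assumptions made up for the example.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Return the fewest-hops path from start to goal, or None if unreachable."""
    if start == goal:
        return [start]
    previous = {start: None}  # node -> node it was first reached from
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for neighbour in graph.get(current, ()):
            if neighbour in previous:
                continue  # already reached by a path that is at least as short
            previous[neighbour] = current
            if neighbour == goal:
                # Walk the predecessor chain back to the start.
                path = [goal]
                while previous[path[-1]] is not None:
                    path.append(previous[path[-1]])
                return path[::-1]
            queue.append(neighbour)
    return None


# Directed "follows" relationships between hypothetical users.
graph = {
    "alice": ["bob", "carol"],
    "bob": ["dave"],
    "carol": ["dave"],
    "dave": ["erin"],
    "erin": [],
}

print(shortest_path(graph, "alice", "erin"))  # ['alice', 'bob', 'dave', 'erin']
print(shortest_path(graph, "erin", "alice"))  # None
```

The number of hops is not known in advance, which is the property that makes the equivalent SQL depend on recursive CTEs or repeated self-joins.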
When Not to Use
- Your data is primarily tabular with simple foreign-key relationships and few joins
- You need high-throughput bulk analytics or aggregations over millions of records
- Your workload is write-heavy with millions of inserts per second (consider a wide-column or key-value store)
- You need to shard your data across many nodes for horizontal write scalability
- Your relationships are simple and shallow (1-2 hops); a relational database with proper indexing will suffice
Source: editorial — Based on Neo4j 5.x documentation and graph database design patterns