DBDB.io The Encyclopedia of Database Systems · Est. 2017
Database of Databases

Database Entry

DGraph


Dgraph is a distributed graph database that supports GraphQL. It emphasizes concurrency in distributed environments by minimizing network calls.[04][05]

Source Code
https://github.com/dgraph-io/dgraph[02]
Country of Origin
US
Start Year
2015 [06]
Coding Agent
Project Types
Commercial, Open Source
Written in
Go
Supported Languages
Go
Embeds / Uses
BadgerDB
Operating Systems
Linux, macOS
License
Apache v2

History[06][07][08][09]


In July 2015, Manish Rai Jain created Dgraph, drawing on his previous experience at Google, where he led a project to unify all the data structures for serving web search under a backend graph system. The first version, v0.1, was released in December 2015.

In 2020, the company launched the hosted Dgraph Cloud service.

Dgraph Labs was acquired by Hypermode, Inc. in 2023. The combined company was then acquired by Istari Digital in October 2025.

Checkpoints[10]


A mutation that reaches the database is not immediately written to the posting lists on disk via BadgerDB. Regenerating a posting list is expensive because all of its postings must be kept sorted, so Dgraph avoids doing it too often. Instead, every mutation is logged and synced to disk in append-only files called write-ahead logs, so any acknowledged write is always on disk. After a system crash, Dgraph recovers by replaying all mutations logged since the last write to the posting list.
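The log-then-replay idea can be illustrated with a small sketch. This is not Dgraph's on-disk format or API: the class `TinyWAL`, the JSON-lines encoding, the mutation fields (`op`, `attr`, `entity`, `uid`), and the `applied` counter marking the last write to the posting lists are all assumptions made for illustration.

```python
import json
import os
import tempfile


class TinyWAL:
    """Toy append-only log (hypothetical; not Dgraph's actual WAL format)."""

    def __init__(self, path):
        self.path = path

    def append(self, mutation):
        # Append one mutation and force it to disk before acknowledging it.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(mutation) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def replay(self):
        # Yield every logged mutation in the order it was written.
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                yield json.loads(line)


def apply(posting_lists, mutation):
    # Apply one mutation, keeping the affected posting list sorted.
    key = (mutation["attr"], mutation["entity"])
    uids = set(posting_lists.get(key, []))
    if mutation["op"] == "set":
        uids.add(mutation["uid"])
    else:  # "del"
        uids.discard(mutation["uid"])
    posting_lists[key] = sorted(uids)


def recover(wal, checkpoint, applied):
    """Rebuild state from the last persisted posting lists plus later log entries.

    `checkpoint` is the posting-list state last written to disk and `applied`
    is how many log entries it already reflects (both assumed bookkeeping).
    """
    posting_lists = {k: list(v) for k, v in checkpoint.items()}
    for i, mutation in enumerate(wal.replay()):
        if i >= applied:
            apply(posting_lists, mutation)
    return posting_lists
```

In this sketch an acknowledged write is durable as soon as `append` returns, while the sorted posting lists are rebuilt only during `recover`, which mirrors the trade-off described above.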

Compression[11]


Dgraph Alpha lets you configure the compression of data on disk using the --badger superflag’s compression option. You can choose between the Snappy and Zstandard compression algorithms, or choose not to compress data on disk.

Concurrency Control[04]


Dgraph supports MVCC, Read Snapshots, and Distributed ACID transactions.

Data Model[12]


Dgraph is a horizontally scalable and distributed GraphQL database with a graph backend.

Foreign Keys[13]


In contrast to foreign keys in a relational database, nodes in a graph database do not possess properties. Relationships are represented explicitly by edges and should not exist implicitly. In Dgraph, creating relationships on top of data is the only way to model the data.

Indexes[14][15]


The DBMS uses BadgerDB, a persistent key-value store written in Go.

Isolation Levels[16]


Transactions are based on Snapshot Isolation (not Serializable Snapshot Isolation), because conflicts are determined by writes (not reads).
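The practical difference between the two isolation levels can be shown with a toy multi-version store that validates only write sets at commit time. This is a sketch, not Dgraph's transaction code: `Store`, `Txn`, `ConflictError`, and the integer logical clock are hypothetical names introduced here.

```python
class ConflictError(Exception):
    """Raised when another transaction committed a write to the same key first."""


class Store:
    """Toy multi-version store (hypothetical, for illustration only)."""

    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value), ascending
        self.clock = 0      # logical timestamp of the latest commit

    def begin(self):
        return Txn(self, start_ts=self.clock)

    def read_at(self, key, ts):
        # Return the latest value committed at or before ts.
        value = None
        for commit_ts, v in self.versions.get(key, []):
            if commit_ts <= ts:
                value = v
        return value

    def last_commit_ts(self, key):
        history = self.versions.get(key)
        return history[-1][0] if history else 0


class Txn:
    def __init__(self, store, start_ts):
        self.store = store
        self.start_ts = start_ts
        self.writes = {}

    def get(self, key):
        # Reads see the transaction's own writes, then its start snapshot.
        if key in self.writes:
            return self.writes[key]
        return self.store.read_at(key, self.start_ts)

    def set(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Only the write set is validated; reads are never checked.
        for key in self.writes:
            if self.store.last_commit_ts(key) > self.start_ts:
                raise ConflictError(key)
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append((self.store.clock, value))
        return self.store.clock
```

Under this scheme two transactions that write the same key conflict, but two transactions that each read the key the other writes both commit. That anomaly (write skew) is what Serializable Snapshot Isolation would additionally reject.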

Joins[17]


Dgraph's PostingList structure stores all DirectedEdges for an Attribute in the form Attribute: Entity -> sorted list of ValueId, which already contains all the data needed for a join. Each RPC call to the cluster therefore results in only one join rather than multiple joins. The join is reduced to a lookup in the database rather than being performed in the application layer.
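A minimal sketch of this layout, assuming an in-memory dictionary in place of the real storage layer: `PostingStore`, `add_edge`, and `intersect_sorted` are illustrative names, not Dgraph identifiers. It shows why a join becomes a lookup followed by a linear merge of already-sorted lists.

```python
from bisect import insort


class PostingStore:
    """Maps (attribute, entity) to a sorted list of value IDs (toy model)."""

    def __init__(self):
        self.lists = {}

    def add_edge(self, attribute, entity, value_id):
        plist = self.lists.setdefault((attribute, entity), [])
        if value_id not in plist:
            insort(plist, value_id)  # keep the posting list sorted

    def get(self, attribute, entity):
        # A single key lookup returns every edge target for this attribute.
        return self.lists.get((attribute, entity), [])


def intersect_sorted(a, b):
    """Two-pointer intersection of two sorted lists of IDs."""
    i, j, out = 0, 0, []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out
```

For example, the friends shared by entities 1 and 2 would be `intersect_sorted(store.get("friend", 1), store.get("friend", 2))`: two lookups and one merge, with no per-row join in the application.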

Logging[18]


Dgraph's logging scheme is close to logical logging. Every mutation is logged and then synced to disk via an append-only log. Additionally, two mutation layers, one for replacements and one for additions/deletions, hold mutations in memory, which allows dirty posting lists to be garbage-collected periodically via BadgerDB. This reduces the need to recreate the posting lists.

Query Compilation


Query Execution[17]


Query Interface[19][20][21]


Dgraph uses a variation of GraphQL (created by Facebook) called DQL as its query language, chosen for GraphQL's graph-like query syntax, schema validation, and subgraph-shaped responses. DQL differs in that it supports graph operations and removes features that do not suit the structure of a graph database.

Storage Architecture[17]


The BadgerDB library decides how data is served from memory, SSD, or disk. So that processing can continue, updates to posting lists can be held in memory as an overlay over the immutable posting list. Two separate update layers are provided, one for replacements and one for additions/deletions, which allows iteration over postings in memory without fetching anything from disk.
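One way to picture the overlay is the sketch below. The class and field names (`PostingList`, `replaced`, `added`, `deleted`, `rollup`) are assumptions for illustration and do not come from Dgraph's source; the point is only that readers can iterate a merged view while the base list stays untouched.

```python
import heapq


class PostingList:
    """Immutable base list plus in-memory mutation layers (toy model)."""

    def __init__(self, base):
        # base: dict of uid -> value; stored as an immutable sorted tuple.
        self.base = tuple(sorted(base.items()))
        self._base_uids = {uid for uid, _ in self.base}
        self.replaced = {}    # replace layer: uid -> new value for base uids
        self.added = {}       # add/delete layer: postings absent from base
        self.deleted = set()  # add/delete layer: base uids hidden from readers

    def set(self, uid, value):
        if uid in self._base_uids:
            self.deleted.discard(uid)
            self.replaced[uid] = value
        else:
            self.added[uid] = value

    def delete(self, uid):
        if uid in self._base_uids:
            self.deleted.add(uid)
            self.replaced.pop(uid, None)
        else:
            self.added.pop(uid, None)

    def iterate(self):
        # Merge base and added postings in uid order, then apply the layers.
        merged = heapq.merge(self.base, sorted(self.added.items()))
        for uid, value in merged:
            if uid in self.deleted:
                continue
            yield uid, self.replaced.get(uid, value)

    def rollup(self):
        # Fold the layers into a fresh immutable base (the expensive step).
        return PostingList(dict(self.iterate()))
```

Here `iterate` never rewrites the base, so the costly regeneration of the sorted list is deferred until `rollup` is called.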

Storage Model[22]


Dgraph uses BadgerDB (an application library rather than a database) for key-value storage of posting lists on disk. However, all data handling still happens at the Dgraph level rather than in BadgerDB; BadgerDB serves as Dgraph's interface to the disk.

Storage Organization[22][15][23]


Stored Procedures[24][25]


Dgraph supports Persistent Queries. When a client uses persistent queries, the client only sends the hash of a query to the server. The server has a list of known hashes and uses the associated query accordingly.
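The hash-lookup flow can be sketched as follows. The use of SHA-256 and the names `PersistedQueryServer`, `register`, and `resolve` are assumptions for illustration; the entry does not specify the hash function or the server-side interface.

```python
import hashlib


def query_hash(query):
    # Assumed hash function: SHA-256 over the UTF-8 query text.
    return hashlib.sha256(query.encode("utf-8")).hexdigest()


class PersistedQueryServer:
    """Keeps a table of known query hashes (toy model, not Dgraph's API)."""

    def __init__(self):
        self.known = {}

    def register(self, query):
        digest = query_hash(query)
        self.known[digest] = query
        return digest

    def resolve(self, digest):
        # The client sends only the hash; the server looks up the full query.
        if digest not in self.known:
            raise KeyError("unknown persisted query hash")
        return self.known[digest]
```

A client that has registered a query once can afterwards send just `query_hash(query)`, which keeps requests small regardless of how long the query text is.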

System Architecture[17]


Dgraph uses the Raft consensus algorithm for communication between servers. During each term (election cycle), a vote is held to elect a single leader. Communication then flows one way, via RPC, from the leader to the followers; the servers do not share disks. Each server exposes a gRPC interface, which the query processor calls to retrieve data. Clients must locate the cluster to interact with it. A client can pick any server in the cluster at random. If that server is not the leader, the request is rejected and the leader's information is returned, so the client can re-route its query to the leader.
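The client-side redirect described above might look like the sketch below. It models only the routing step, not Raft elections or log replication, and every name in it (`Cluster`, `Server`, `NotLeader`, `send`) is hypothetical rather than part of a Dgraph client library.

```python
import random


class NotLeader(Exception):
    """Rejection that carries the current leader's ID back to the client."""

    def __init__(self, leader_id):
        super().__init__(leader_id)
        self.leader_id = leader_id


class Server:
    def __init__(self, server_id, cluster):
        self.id = server_id
        self.cluster = cluster

    def handle(self, query):
        # A follower rejects the request and passes along the leader's ID.
        if self.cluster.leader_id != self.id:
            raise NotLeader(self.cluster.leader_id)
        return f"{self.id} executed: {query}"


class Cluster:
    def __init__(self, server_ids, leader_id):
        self.leader_id = leader_id
        self.servers = {sid: Server(sid, self) for sid in server_ids}


def send(cluster, query, rng=random):
    """Pick a random server; on rejection, retry once against the leader."""
    target = rng.choice(sorted(cluster.servers))
    hops = [target]
    try:
        reply = cluster.servers[target].handle(query)
    except NotLeader as err:
        hops.append(err.leader_id)
        reply = cluster.servers[err.leader_id].handle(query)
    return reply, hops
```

With a three-server cluster led by `"b"`, `send(cluster, "q")` reaches the leader in at most two hops, whichever server is picked first.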

Views


Compatible Systems
BadgerDB
Derivative Systems
BadgerDB

Citations

27 sources
  1. Dgraph Documentation dgraph.io
  2. GitHub - dgraph-io/dgraph: high-performance graph database for real-time use cases · GitHub github.com
  3. https://dgraph.io/docs dgraph.io Dead — Check Archive
  4. https://dgraph.io/blog/post/why-google-needed-graph-serving-system dgraph.io Dead — Check Archive
  5. Jepsen: Dgraph 1.1.1 jepsen.io
  6. Dgraph raises $3M for its open-source distributed graph database, hits 1.0 release | TechCrunch techcrunch.com
  7. Big Data Archives | TechRepublic techrepublic.com
  8. Dgraph GraphQL database users detail graph use cases | TechTarget techtarget.com
  9. After two failed startups, ex-Google employee secures $1.45 million in funding and the backing of the co-founder of Atlassian - SmartCompany smartcompany.com.au
  10. https://dgraph.io/docs/design-concepts/concepts#write-ahead-logs dgraph.io Dead — Check Archive
  11. https://dgraph.io/docs/deploy/data-compression dgraph.io Dead — Check Archive
  12. https://dgraph.io/docs/dgraph-overview dgraph.io Dead — Check Archive
  13. https://dgraph.io/docs/design-concepts/concepts#edges dgraph.io Dead — Check Archive
  14. https://dgraph.io/docs/faq#why-doesnt-dgraph-use-boltdb-or-rocksdb dgraph.io Dead — Check Archive
  15. https://dgraph.io/blog/post/badger-over-rocksdb-in-dgraph dgraph.io Dead — Check Archive
  16. https://dgraph.io/docs/design-concepts/consistency-model dgraph.io Dead — Check Archive
  17. https://dgraph.io/docs/design-concepts/concepts dgraph.io Dead — Check Archive
  18. https://dgraph.io/docs/design-concepts/concepts#mutations dgraph.io Dead — Check Archive
  19. https://dgraph.io/docs/graphql/overview dgraph.io Dead — Check Archive
  20. https://facebook.github.io/graphql github.io Dead — Check Archive
  21. https://dgraph.io/docs/query-language/graphql-fundamentals dgraph.io Dead — Check Archive
  22. https://dgraph.io/docs/design-concepts/concepts#badger dgraph.io Dead — Check Archive
  23. https://dgraph.io/blog/post/badger dgraph.io Dead — Check Archive
  24. https://dgraph.io/docs/graphql/queries/persistent-queries dgraph.io Dead — Check Archive
  25. https://dgraph.io/docs/query-language/functions dgraph.io Dead — Check Archive
  26. https://www.hypermode.com/blog/the-future-of-dgraph-is-open-serverless-and-ai-ready hypermode.com Dead — Check Archive
  27. https://github.com/dgraph-io/dgraph/commit/6e9dab460e3f6bd936857d878fd2f3479a0b4f8d github.com
Revision #22 Last Updated: