Dgraph: Synchronously Replicated, Transactional and Distributed Graph Database
- Two kinds of nodes:
- Zeroes are administrative nodes that:
- Hand out monotonic timestamps
- Form consensus on metadata about the rest of the cluster
- Handle rebalances/reshards
- Alphas store data
- Nodes are divided into groups; one group for zeroes, and 1..N groups for alphas.
- Each group forms a Raft cluster and can have 1, 3, or 5 replicas.
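The timestamp-handout duty above can be sketched as a monotonic counter owned by the Zero leader. This is a toy sketch under stated assumptions: in Dgraph this state is replicated through Raft, and a lock stands in for that here; all names are hypothetical.

```python
import threading

class TimestampOracle:
    """Toy sketch of Zero handing out monotonic timestamps.
    A lock stands in for Raft replication; names are hypothetical."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_ts = 1

    def lease(self, count=1):
        """Hand out `count` consecutive timestamps, never reused."""
        with self._lock:
            start = self._next_ts
            self._next_ts += count
            return range(start, start + count)

oracle = TimestampOracle()
a = oracle.lease()   # range(1, 2)
b = oracle.lease(3)  # range(2, 5)
```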
- Primitive data structure is a triple, representing either:
- (subject, predicate, object) → (0xab, <follower_of>, 0xff)
- (subject, predicate, value) → (0xab, <name>, "John")
- All subjects are given a globally unique id (uid), e.g. 0xab.
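The two triple shapes above can be modeled as a small tagged tuple. A hedged sketch: the field names are mine, not Dgraph's.

```python
from typing import NamedTuple, Union

class Triple(NamedTuple):
    """One record: either an edge to another uid, or a literal value.
    Field names are illustrative, not Dgraph's."""
    subject: int          # uid, e.g. 0xab
    predicate: str        # e.g. "follower_of" or "name"
    obj: Union[int, str]  # another uid, or a literal value

edge = Triple(0xab, "follower_of", 0xff)  # (subject, predicate, object)
attr = Triple(0xab, "name", "John")       # (subject, predicate, value)
```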
- Data is sharded by relationship/predicate rather than by entity.
- For example, if your data store contains information about which user follows which other user, users are entities, and “follows” is the relationship
- Dgraph places all data for the “follows” relationship (across all entities) in a single group.
- A group can contain more than one relationship.
- Critically, what happens when a relationship can’t fit on a single node anymore?
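A minimal sketch of the predicate-based placement described above. The group numbers are illustrative; in practice Zero decides which predicates live in which group.

```python
# Predicate-based sharding: every triple for a given predicate lands
# in the same group, regardless of subject. Assignments are made up.
predicate_to_group = {
    "follower_of": 1,
    "name": 2,
}

def group_for(triple):
    subject, predicate, obj = triple
    return predicate_to_group[predicate]

# All "follows" edges go to group 1, whoever the subject is:
assert group_for((0xab, "follower_of", 0xff)) == 1
assert group_for((0xcd, "follower_of", 0xab)) == 1
```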
- Dgraph uses Badger (in-house LSM KV store) to persist data; metadata is stored in Raft.
- The paper glosses over this, but how does Dgraph ensure that Badger and Raft are in sync?
- Uses GraphQL for queries, over gRPC/Protobuf or vanilla HTTP.
- All records with the same subject and predicate are grouped into the same Badger key:
- The Badger value is a sorted list of uids/values, and can be split if a single KV-pair becomes too large.
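The key/value layout above can be sketched as a map from (predicate, subject) to a sorted list of uids. Badger's real keys are byte-encoded; the dict-keyed-by-tuple shape here is an assumption for illustration.

```python
from collections import defaultdict
import bisect

# Sketch of the (predicate, subject) -> sorted posting list layout.
# Real Badger keys are byte-encoded; a tuple-keyed dict stands in.
store = defaultdict(list)

def add_edge(subject, predicate, obj_uid):
    postings = store[(predicate, subject)]
    i = bisect.bisect_left(postings, obj_uid)
    if i == len(postings) or postings[i] != obj_uid:
        postings.insert(i, obj_uid)  # keep the list sorted and deduped

add_edge(0xab, "follower_of", 0xff)
add_edge(0xab, "follower_of", 0x01)
add_edge(0xab, "follower_of", 0xff)  # duplicate, ignored
# store[("follower_of", 0xab)] is now the sorted list [0x01, 0xff]
```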
- Dgraph can execute rebalances by marking shards read-only; are writes rejected during this process?
- Supports lock-free transactions via MVCC
- The entire “Lock-Free High Availability Transaction Processing” section was impenetrable to me
- Lots of jargon and not much elucidation
- Reads are/can be linearizable.
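The MVCC scheme the paper gestures at can be sketched as timestamped versions per key: a read at start_ts sees the newest version committed at or before start_ts, with no locks taken. The shape below is my guess at the idea, not Dgraph's actual implementation.

```python
# Minimal MVCC sketch: each key holds versions tagged with commit
# timestamps. Readers never block writers; they just pick the newest
# version visible at their start timestamp. Hypothetical shape.
versions = {}  # key -> sorted list of (commit_ts, value)

def commit(key, value, commit_ts):
    versions.setdefault(key, []).append((commit_ts, value))
    versions[key].sort()

def read(key, start_ts):
    best = None
    for commit_ts, value in versions.get(key, []):
        if commit_ts <= start_ts:
            best = value  # list is sorted, so the last match wins
    return best

commit("name/0xab", "John", commit_ts=5)
commit("name/0xab", "Johnny", commit_ts=9)
read("name/0xab", start_ts=7)  # "John": the ts=9 version is invisible
read("name/0xab", start_ts=9)  # "Johnny"
```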
- On the whole this was a quick, crisp paper, but it felt like a lot of details were omitted or glossed over, particularly in the second half.