Dgraph: Synchronously Replicated, Transactional and Distributed Graph Database (Paper)

https://dgraph.io/paper

  • Two kinds of nodes:
    • Zeroes are administrative nodes that handle things like:
      • Handing out monotonic timestamps
      • Form consensus on metadata about the rest of the cluster
      • Handle rebalances/reshards
    • Alphas store data
  • Nodes are divided into groups; one group for zeroes, and 1..N groups for alphas.
    • Each group forms a Raft cluster and can have 1, 3, or 5 replicas.
  • Primitive data structure is a triple, representing either:
    • (subject, predicate, object) → (0xab, <follower_of>, 0xff)
    • (subject, predicate, value) → (0xab, <name>, "John")
  • All subjects are given a globally unique uid
  • Data is sharded by relationship/predicate rather than by entity.
    • For example, if your data store contains information about which user follows which other user, users are entities, and “follows” is the relationship
    • Dgraph places all data for the “follows” relationship (across all entities) in a single group.
    • A group can contain more than one relationship.
    • Critically, what happens when a relationship can’t fit on a single node anymore?
  • Dgraph uses Badger (in-house LSM KV store) to persist data; metadata is stored in Raft.
    • The paper glosses over this, but how does Dgraph ensure that Badger and Raft are in sync?
  • Uses GraphQL for queries, over GRPC/Protobuf or vanilla HTTP.
  • All records with the same subject and predicate are grouped into the same Badger key:
    • The Badger value is a sorted list of uids/values, and can be split if a single KV-pair becomes too large.
  • Dgraph can execute rebalances by marking shards read-only; are writes rejected during this process?
  • Supports lock-free transactions via MVCC
    • The entire Lock-Free High Availability Transaction Processing was impenetrable to me
    • Lots of jargon and not much elucidation
  • Reads are/can be linerizable.
  • On the whole this was a quick, crisp paper, but it felt like a lot of details were omitted or glossed over, particularly in the second half.

Annotated Paper

Edit