Very very good talk (didn’t expect anything less); a couple of highlights:
- All Jepsen “clients” run in the same JVM, so every operation is recorded against a single clock; analyzing the temporal order of concurrent events after the fact is critical to finding issues.
- Jepsen treats “failed” and “timeout” as two very different error conditions.
- ZooKeeper passed its Jepsen test; no other (advertised) database did.
- Hazelcast has (had?) a number of very questionable primitives.
- I know why distributed locks are a bad idea / not technically viable, but I didn’t really understand Kyle’s explanation about this (something about side effects).
- Be very careful if you’re picking a database for production; read the documentation, and test things yourself.
- Pick the right tradeoff for your use case; 10% writes lost in the name of efficiency might be perfectly acceptable (or even desirable).
- Raft paper
- Cache line
- Only goes forward in time
- Goes forward/backward in time
- Byzantine fault tolerance + its use in blockchain algorithms
- Probabilistic execution
- Flake IDs
- Distributed locks
- tc / iptables