---

title: Systems design, algorithms, etc.

Distributed systems

Distributed systems have to deal with:

* Group membership, which can be:
  * Centralized (a master node), where the main concern is handling master failover. Consensus algorithms like Paxos and Raft help here.
  * Gossip based, where the main concern is reducing the network contention caused by communication between peers. Mechanisms like SWIM, Scuttlebutt, and Dynamo's Merkle tree approach help here.
* Failure detection (ping based, SWIM, Scuttlebutt)
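To make the ping-based and SWIM-style failure detection idea concrete, here is a minimal sketch of one probe round: a direct ping, followed by indirect pings through a few helpers if the direct ping fails. The names `swim_probe` and `can_reach`, and the parameter `k`, are assumptions made for illustration; the network is simulated by a plain function rather than real sockets, and real SWIM adds timeouts, a suspicion state, and gossip-based dissemination that are left out here.

```python
import random


def swim_probe(prober, target, members, can_reach, k=2, rng=random):
    """Return True if `target` looks alive from `prober`'s point of view.

    `can_reach(src, dst)` is a hypothetical stand-in for "src pinged dst
    and got an ack back in time".
    """
    # Step 1: direct ping.
    if can_reach(prober, target):
        return True

    # Step 2: ask up to k other members to ping the target on our behalf.
    helpers = [m for m in members if m not in (prober, target)]
    for helper in rng.sample(helpers, min(k, len(helpers))):
        if can_reach(prober, helper) and can_reach(helper, target):
            return True

    # No direct or indirect ack: the target would now be suspected.
    return False


# Simulated cluster: the a -> b link is broken, but b itself is healthy.
members = ["a", "b", "c", "d"]
broken_links = {("a", "b")}
dead_nodes = set()


def can_reach(src, dst):
    return (src, dst) not in broken_links and dst not in dead_nodes


alive = swim_probe("a", "b", members, can_reach)
```

The indirect step is what keeps a single bad link between two peers from being mistaken for a node failure.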

Distributed Databases

Probabilistic data structures/algorithms like Bloom filters, etc.

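Probabilistic data structures use results from probability theory to trade exactness for desirable properties such as a small memory footprint. For example, a Bloom filter represents set membership in very little memory: it can tell you with certainty that an item is not in the set, but a positive answer only means the item is probably in the set (false positives are possible).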
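A minimal Bloom filter sketch illustrates this behaviour. The class name `BloomFilter`, the default sizes, and the choice of salted SHA-256 as the hash family are assumptions made for illustration, not a tuned implementation.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        # One bit per position, packed into bytes.
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False means "definitely not added"; True means "probably added".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


seen = BloomFilter()
seen.add("article-1")
```

Here `seen.might_contain("article-1")` is always true, because an added item is never reported as missing. For an item that was never added the answer is usually false, but it can occasionally be true, which is the false-positive case.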

Uses

HashiCorp designed Lifeguard to address the problem of unhealthy nodes in a gossip protocol incorrectly marking healthy nodes as unhealthy. They used probabilistic data structures like Bloom filters to increase confidence in a node's health: paper, blog, talk
Medium used Bloom filters to avoid recommending items that a user has already read.

Non uses

https://twitter.com/_joemag/status/1207752056213082112

Related data structures

Cuckoo filters support both adding and deleting items dynamically, whereas a standard Bloom filter does not support deletion.
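As a rough sketch of how deletion becomes possible, the following toy cuckoo filter stores a short fingerprint of each item in one of two candidate buckets, so removing an item just means removing its fingerprint. The class name `CuckooFilter`, the 8-bit fingerprint, and the default sizes are assumptions made for illustration.

```python
import hashlib
import random


class CuckooFilter:
    def __init__(self, num_buckets=64, bucket_size=4, max_kicks=500):
        # A power-of-two bucket count keeps the XOR index trick reversible.
        if num_buckets & (num_buckets - 1) != 0:
            raise ValueError("num_buckets must be a power of two")
        self.num_buckets = num_buckets
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(num_buckets)]

    def _hash(self, data):
        return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

    def _fingerprint(self, item):
        # Fingerprint in the range 1..255.
        return self._hash(b"fp:" + item.encode("utf-8")) % 255 + 1

    def _candidates(self, item):
        fp = self._fingerprint(item)
        i1 = self._hash(item.encode("utf-8")) % self.num_buckets
        return fp, i1, self._alt_index(i1, fp)

    def _alt_index(self, index, fp):
        # The other candidate bucket depends only on (index, fingerprint).
        return (index ^ self._hash(bytes([fp]))) % self.num_buckets

    def insert(self, item):
        fp, i1, i2 = self._candidates(item)
        for i in (i1, i2):
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        # Both buckets are full: evict a resident fingerprint and relocate it.
        i = random.choice((i1, i2))
        for _ in range(self.max_kicks):
            slot = random.randrange(len(self.buckets[i]))
            fp, self.buckets[i][slot] = self.buckets[i][slot], fp
            i = self._alt_index(i, fp)
            if len(self.buckets[i]) < self.bucket_size:
                self.buckets[i].append(fp)
                return True
        return False  # Too many relocations: treat the filter as full.

    def might_contain(self, item):
        fp, i1, i2 = self._candidates(item)
        return fp in self.buckets[i1] or fp in self.buckets[i2]

    def delete(self, item):
        fp, i1, i2 = self._candidates(item)
        for i in (i1, i2):
            if fp in self.buckets[i]:
                self.buckets[i].remove(fp)
                return True
        return False
```

Deletion is only safe for items that were actually inserted: deleting an item that was never added can remove the matching fingerprint of a different item.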