
Conflict Resolution: LWW vs Vector Clocks

A network partition splits the cluster for 90 seconds. A user edits their shopping cart on both sides: adds a book via replicas on the left, adds headphones via replicas on the right.
The partition heals and the same key now has two legitimate histories. Who wins?
Option one: Last-Writer-Wins.
Each write carries a timestamp, and the highest one silently replaces the rest. It is simple, and it is what Cassandra does by default, but it has a sharp edge. With clock skew between coordinators (NTP typically keeps well-configured machines within a few milliseconds to tens of milliseconds of each other, and misconfigured nodes can drift by seconds), LWW can keep the write that actually happened earlier and discard the later one: the user's book vanishes with no error, no log, nothing.
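The skew failure can be sketched in a few lines. This is a minimal illustration, not Cassandra's implementation: the `Write` type, the `lww_resolve` helper, the timeline, and the 5-second skew are all assumptions chosen to make the effect visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Write:
    value: frozenset      # the whole cart, stored under one key
    timestamp_ms: int     # stamped from the coordinator's local clock


def lww_resolve(a: Write, b: Write) -> Write:
    """Keep the write with the higher timestamp and drop the other.

    Ties go to `a` in this sketch; real stores define their own tie-break.
    """
    return a if a.timestamp_ms >= b.timestamp_ms else b


# Hypothetical timeline in "true" milliseconds: headphones first, book second.
HEADPHONES_TRUE_MS = 10_000
BOOK_TRUE_MS = 12_000

# Assumed skew: the left coordinator's clock runs 5 seconds slow.
LEFT_SKEW_MS = -5_000
RIGHT_SKEW_MS = 0

book = Write(frozenset({"book"}), BOOK_TRUE_MS + LEFT_SKEW_MS)  # stamped 7_000
headphones = Write(frozenset({"headphones"}), HEADPHONES_TRUE_MS + RIGHT_SKEW_MS)  # stamped 10_000

winner = lww_resolve(book, headphones)
print(sorted(winner.value))  # ['headphones']
```

The book was added later in real time, yet its slow-clock timestamp is lower, so it is the write that disappears. Nothing in `lww_resolve` can detect this, which is why the loss is silent.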
Silent data loss is the worst failure mode in storage.
Option two: vector clocks.
Each value carries a vector of per-node counters recording its causal history. When one history descends from the other, the store resolves the conflict automatically by keeping the descendant and discarding the ancestor; when the vectors are concurrent (neither descends from the other), the store keeps BOTH as siblings and returns them to the application, which must merge them (the Dynamo cart union: book AND headphones survive).
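The descends-or-concurrent check can be sketched as follows. This is an illustrative sketch under stated assumptions, not Dynamo's or Riak's code: the node names `n1` and `n2`, the `descends`, `reconcile`, and `merge_clocks` helpers, and the dict-based clock representation are all hypothetical.

```python
from typing import Dict, FrozenSet, List, Tuple

VClock = Dict[str, int]                    # node id -> counter
Version = Tuple[VClock, FrozenSet[str]]    # (clock, cart contents)


def descends(a: VClock, b: VClock) -> bool:
    """True if `a` has seen every event recorded in `b`."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def reconcile(versions: List[Version]) -> List[Version]:
    """Drop versions strictly dominated by another; the rest are siblings."""
    survivors = []
    for i, (clock, value) in enumerate(versions):
        dominated = any(
            j != i and descends(other, clock) and not descends(clock, other)
            for j, (other, _) in enumerate(versions)
        )
        if not dominated:
            survivors.append((clock, value))
    return survivors


def merge_clocks(a: VClock, b: VClock) -> VClock:
    """Pointwise maximum: the clock of a value that has seen both histories."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in a.keys() | b.keys()}


# Empty cart written once via n1, then edited on both sides of the partition.
base: Version = ({"n1": 1}, frozenset())
left: Version = ({"n1": 2}, frozenset({"book"}))                  # via n1
right: Version = ({"n1": 1, "n2": 1}, frozenset({"headphones"}))  # via n2

siblings = reconcile([base, left, right])

# Application-level merge for a cart: union the sibling values.
merged_cart = frozenset().union(*(value for _, value in siblings))
merged_clock = merge_clocks(siblings[0][0], siblings[1][0])
print(len(siblings), sorted(merged_cart))  # 2 ['book', 'headphones']
```

The `base` version is dropped because both edits descend from it, while `left` and `right` survive as siblings because neither clock dominates the other. The union merge is a choice the application makes; the store only reports that the two versions are concurrent.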
Correctness moves the merge burden to every reader, and sibling explosion under heavy concurrency is real; Riak walked this road and later added CRDTs to automate common merges.
Option three: CRDTs, data types (counters, sets, maps) whose merge is commutative, associative, and idempotent, so replicas converge without coordination.
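A grow-only counter is one of the simplest CRDTs and shows why those merge properties matter. This is a minimal sketch, not any particular library's API: the `GCounter` class and the replica names `left` and `right` are hypothetical.

```python
from typing import Dict, Optional


class GCounter:
    """Grow-only counter: one slot per node, merged by pointwise maximum."""

    def __init__(self, node_id: str, counts: Optional[Dict[str, int]] = None):
        self.node_id = node_id
        self.counts: Dict[str, int] = dict(counts or {})

    def increment(self, n: int = 1) -> None:
        # A replica only ever advances its own slot.
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + n

    def merge(self, other: "GCounter") -> "GCounter":
        # max() is commutative, associative, and idempotent, so replicas
        # converge regardless of merge order or repeated delivery.
        nodes = self.counts.keys() | other.counts.keys()
        merged = {
            k: max(self.counts.get(k, 0), other.counts.get(k, 0)) for k in nodes
        }
        return GCounter(self.node_id, merged)

    def value(self) -> int:
        return sum(self.counts.values())


# Each side of the partition increments independently.
left = GCounter("left")
left.increment(2)
right = GCounter("right")
right.increment(3)

print(left.merge(right).value())               # 5
print(right.merge(left).value())               # 5 (order does not matter)
print(left.merge(right).merge(right).value())  # 5 (re-merging is harmless)
```

No increment is lost and no application-level merge is needed, but the counter can only grow. Supporting decrements or set removals requires richer CRDT designs, which is part of why not everything is expressible.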
Perfect where they fit; not everything is expressible.
Our position: LWW for values where losing a concurrent write is tolerable (caches, presence, metrics), vector clocks or CRDTs where it is not (carts, documents, counters), and NTP monitoring with bounded-skew alerts wherever LWW exists.
What if the interviewer asks: why not synchronized clocks (TrueTime)?
Spanner's TrueTime, backed by GPS receivers and atomic clocks, exposes a bounded clock uncertainty and changes the game, but it is a Google-scale investment; Dynamo-class systems assume commodity clocks and design for skew instead.
Why it matters in interviews
The silently vanished cart item is the concrete failure that makes LWW's trade-off real, and offering the LWW-where-tolerable, vector-clocks-where-not split shows judgment. Naming clock skew as the mechanism (not just "LWW is bad") survives the follow-up.