
Play Count Pipeline for Royalty Accounting

A listener presses play, listens for 15 seconds, then skips. Does that count as a play?
If we count it, we pay the rights holder a fraction of a cent they did not earn. If we lose a legitimate 30-second play event, the artist loses revenue. The constraint: 12 billion play events per day must be counted with zero double-counts and zero lost plays, because every counted play triggers a royalty payment.
We chose a 30-second threshold (not 15 seconds or full-track completion) because this is the industry standard adopted by Spotify, Apple Music, and YouTube Music. Below 30 seconds, the play is not counted for royalty purposes.
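The threshold rule is simple enough to state in code. A minimal sketch (the function name is illustrative, not from a real codebase):

```python
ROYALTY_THRESHOLD_SECONDS = 30  # industry-standard countable-play threshold


def is_countable_play(listened_seconds: float) -> bool:
    """A play counts for royalties only at or past the 30-second mark."""
    return listened_seconds >= ROYALTY_THRESHOLD_SECONDS
```

A 15-second skip returns False and generates no royalty event; a 30-second listen returns True.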
The client sends a play_start event when the track begins and a play_complete event when the 30-second mark is reached. We chose Kafka with exactly-once semantics (not at-least-once with dedup) as the event backbone because at-least-once would require a massive deduplication layer to prevent double-counting across 139K events per second.
Kafka's idempotent producer plus transactional consumer guarantees each event is processed exactly once. Events flow: client to Play Ingest Service to Kafka (partitioned by track_id) to Play Counter Service to Cassandra for real-time counts and to a data warehouse for royalty calculation.
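The exactly-once guarantee comes down to a handful of client settings. A sketch of the producer and consumer configuration, assuming the confluent-kafka Python client; broker addresses, the transactional id, and the group id are placeholders:

```python
# Producer side (Play Ingest Service): idempotence dedupes broker-level
# retries, and the transactional id lets sends commit atomically.
producer_config = {
    "bootstrap.servers": "kafka-1:9092,kafka-2:9092",  # placeholder brokers
    "enable.idempotence": True,   # broker drops duplicate retried sends
    "acks": "all",                # wait for the full in-sync replica set
    "transactional.id": "play-ingest-1",  # stable per-producer-instance id
}

# Consumer side (Play Counter Service): read_committed hides events from
# transactions that aborted, completing the exactly-once chain.
consumer_config = {
    "bootstrap.servers": "kafka-1:9092,kafka-2:9092",
    "group.id": "play-counter",
    "isolation.level": "read_committed",
}
```

Messages are produced with `key=track_id` so partitioning matches the royalty aggregation key.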
We partition Kafka by track_id (not user_id) because royalty aggregation groups by track. Daily aggregation: 300M DAU × 40 songs/day = 12B events/day.
At 200 bytes per event: 12B × 200 B = 2.4 TB/day. Trade-off: exactly-once Kafka adds 10 to 15% latency overhead versus at-least-once, but for financial data, accuracy is non-negotiable.
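The capacity estimates above check out with quick arithmetic:

```python
DAU = 300_000_000            # daily active users
SONGS_PER_USER_PER_DAY = 40
EVENT_SIZE_BYTES = 200
SECONDS_PER_DAY = 86_400

events_per_day = DAU * SONGS_PER_USER_PER_DAY        # 12 billion events/day
events_per_second = events_per_day / SECONDS_PER_DAY # ~139K events/second
bytes_per_day = events_per_day * EVENT_SIZE_BYTES    # 2.4 TB/day
```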
A likely interviewer follow-up: what if the client loses connectivity at 29 seconds? The client buffers the event locally and retries with the same idempotency key when connectivity returns.
The server deduplicates on the idempotency key.
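Server-side deduplication on the idempotency key can be sketched in a few lines; an in-process set stands in for whatever durable keyed store a real service would use, and the key shape shown is illustrative:

```python
class PlayEventDeduper:
    """Accepts each play_complete event exactly once.

    A real service backs this with durable state (e.g. the transactional
    consumer's store), not an in-memory set that dies with the process.
    """

    def __init__(self) -> None:
        self._seen: set[str] = set()

    def accept(self, idempotency_key: str) -> bool:
        """Return True the first time a key is seen; drop retries."""
        if idempotency_key in self._seen:
            return False
        self._seen.add(idempotency_key)
        return True
```

A client retry of the same buffered event carries the same key, so the second `accept` call returns False and no double payment occurs.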
Why it matters in interviews
This is the concept interviewers use to test financial data pipeline design. Explaining why we chose exactly-once Kafka over at-least-once with dedup, and why the 30-second threshold exists, shows we understand that play counting is a financial transaction, not a vanity metric.