CAP Theorem in Practice
What CAP Actually Says
- When a network partition happens, you must choose between consistency and availability for the requests it affects.
- Partition tolerance is not optional in distributed systems.
- CAP is about behavior under partition, not normal operation.
Operational Interpretation
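- Consistency: reads reflect the latest acknowledged write, in line with a single valid ordering of operations (linearizability in the formal statement of CAP).
- Availability: every request to a non-failed node receives a non-error response without waiting for unreachable nodes.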
- Under partition: you either reject some operations or serve potentially stale or divergent data.
Per Operation Choices
- Read path can be more available than write path.
- Some operations can tolerate staleness, others cannot.
- Strong consistency is often required for money movement and permissions.
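
One way to make per-operation choices explicit is a small policy table consulted on every request. The sketch below is illustrative only: the operation names, the `Consistency` enum, and the `decide` function are hypothetical and not taken from any particular system.

```python
from enum import Enum


class Consistency(Enum):
    STRONG = "strong"      # must observe the latest acknowledged write
    EVENTUAL = "eventual"  # may be served from possibly stale data


# Hypothetical operation names, chosen only for illustration.
POLICY = {
    "transfer_funds": Consistency.STRONG,
    "check_permission": Consistency.STRONG,
    "view_activity_feed": Consistency.EVENTUAL,
    "read_profile": Consistency.EVENTUAL,
}


def decide(operation: str, partitioned: bool) -> str:
    """Return 'serve', 'serve_stale', or 'reject' for one operation."""
    # Unknown operations default to the safe side (strong consistency).
    level = POLICY.get(operation, Consistency.STRONG)
    if not partitioned:
        return "serve"
    return "reject" if level is Consistency.STRONG else "serve_stale"
```

Under a partition, `decide("transfer_funds", True)` rejects the request while `decide("view_activity_feed", True)` serves possibly stale data. Defaulting unknown operations to strong consistency is a design assumption: it trades availability for safety when an operation has not been classified yet.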
Common Patterns
- CP systems: reject writes that cannot reach a quorum during a partition, preserving consistency.
- AP systems: accept writes on both sides of the partition and reconcile later using a defined conflict resolution policy.
- Hybrid: use CP for critical metadata, AP for high volume events.
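
The difference between the CP and AP patterns can be sketched as what a single node does with a write when it cannot reach a majority of the cluster. This is a toy model under stated assumptions: the `Replica` class, its fields, and the majority rule are hypothetical simplifications, not the behavior of any specific database.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    mode: str             # "CP" or "AP"
    cluster_size: int     # total number of nodes in the cluster
    reachable_peers: int  # peers this node can currently contact
    data: dict = field(default_factory=dict)
    pending: list = field(default_factory=list)  # AP writes awaiting reconciliation

    def has_majority(self) -> bool:
        # Count this node plus the peers it can reach.
        return (self.reachable_peers + 1) > self.cluster_size // 2

    def write(self, key, value) -> bool:
        """Return True if the write was accepted, False if rejected."""
        if self.has_majority():
            self.data[key] = value
            return True
        if self.mode == "CP":
            # CP: refuse the write rather than risk divergence.
            return False
        # AP: accept locally and remember the write for later reconciliation.
        self.data[key] = value
        self.pending.append((key, value))
        return True
```

In a five-node cluster, a node that can reach only one peer rejects writes in CP mode and queues them for reconciliation in AP mode. A hybrid deployment would run the critical metadata store in the first mode and the high volume event store in the second.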
Quorums as a Dial
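- Quorum reads and writes increase consistency but reduce availability during failures: with N replicas, read quorum R, and write quorum W, choosing R + W > N makes every read quorum overlap every write quorum, at the cost of needing more nodes reachable per request.
- Lower quorums increase availability, but once R + W <= N a read can miss the latest write and return stale data.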
- Production rule: define quorum policy per dataset and operation.
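
The quorum dial can be checked with simple arithmetic. The helper below is a minimal sketch, assuming strict (non-sloppy) quorums over N replicas with read quorum R and write quorum W; the function name and the keys of the returned dictionary are illustrative.

```python
def quorum_properties(n: int, r: int, w: int) -> dict:
    """Summarize what a strict quorum configuration does and does not guarantee."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must be between 1 and n")
    return {
        # Every read quorum shares at least one node with every write quorum.
        "read_overlaps_write": r + w > n,
        # Any two write quorums share at least one node.
        "writes_overlap": 2 * w > n,
        # How many replicas can be unreachable before the operation fails.
        "read_tolerates_failures": n - r,
        "write_tolerates_failures": n - w,
    }
```

For example, `quorum_properties(3, 2, 2)` reports overlapping quorums with one tolerated failure on each path, while `quorum_properties(3, 1, 1)` tolerates two failures on each path but gives up the overlap, so a read can miss the latest write.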
Failure Modes
- Split brain: two nodes both act as leader because of weak leader election or stale leases.
- Stale cache entries lead to wrong authorization decisions, such as honoring a permission that has been revoked.
- Read-your-writes breaks when reads are routed to lagging replicas.
- Reconciliation leaves conflicts unresolved when no resolution policy is defined.
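
The read-your-writes failure can be avoided by having each session remember the version of its last write and routing reads only to nodes that have applied at least that version. The sketch below assumes a single primary that assigns monotonically increasing versions; `Node`, `applied_version`, and `choose_read_node` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    name: str
    applied_version: int  # highest write version this node has applied


def choose_read_node(primary: Node, replicas: List[Node], min_version: int) -> Node:
    """Pick a replica that has caught up to the session's last write.

    Falls back to the primary when every replica is still lagging.
    """
    for replica in replicas:
        if replica.applied_version >= min_version:
            return replica
    return primary
```

The fallback to the primary is a consistency-first choice: during a partition the primary may be unreachable, in which case the read fails rather than returning data older than the session's own write.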
Incident Playbook
- Detect partition: rising timeouts, asymmetric reachability, replica lag spike.
- Freeze critical writes if needed to prevent divergence.
- Reduce load with backoff, circuit breakers, and load shedding.
- Confirm leadership and membership view before resuming full traffic.
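
The load reduction step often relies on a circuit breaker in front of calls to a struggling dependency. The following is a minimal sketch of that pattern, not a production implementation: the class, its default thresholds, and the injectable `clock` parameter are assumptions made for illustration.

```python
import time


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_after_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # consecutive failures before opening
        self.reset_after_s = reset_after_s          # cool-down before a trial request
        self.clock = clock                          # injectable for testing
        self.failures = 0
        self.opened_at = None                       # None means the circuit is closed

    def allow(self) -> bool:
        """Return True if a request may be attempted right now."""
        if self.opened_at is None:
            return True
        # After the cool-down, let trial requests through (half-open state).
        return self.clock() - self.opened_at >= self.reset_after_s

    def record_success(self) -> None:
        # Any success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            # Open (or re-open) the circuit and restart the cool-down.
            self.opened_at = self.clock()
```

Callers check `allow()` before each request and report the outcome with `record_success()` or `record_failure()`. While the circuit is open, requests fail fast instead of piling up timeouts against nodes on the far side of the partition.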
Production Checklist
- Define which operations require strong consistency.
- Document behavior under partition: reject, degrade, or reconcile.
- Test partition scenarios and validate recovery steps.
- Monitor leader health, quorum success rate, and replica lag.