Active-Active: Two Regions Serving Traffic
Active-active means multiple regions serve production traffic at the same time. The main benefits are regional resilience and lower latency for users, who are served from a nearby region. The main cost is correctness complexity: when both regions accept traffic, you must define how state stays consistent.
Active-Active Variants
- Active-active reads, single-writer writes: easiest form; reads are local, writes route to a home region.
- Geo-partitioned active-active: each region owns a subset of data (tenant/user home region).
- Multi-writer active-active: any region can write any data; requires conflict resolution.
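The three variants differ mainly in where a write is allowed to land. A minimal sketch of that routing decision follows; the `Dataset` type, the `TENANT_HOME` mapping, and the region names are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Ownership(Enum):
    SINGLE_WRITER = "single_writer"      # all writes go to one home region
    GEO_PARTITIONED = "geo_partitioned"  # each tenant has its own home region
    MULTI_WRITER = "multi_writer"        # any region may accept the write


@dataclass(frozen=True)
class Dataset:
    name: str
    ownership: Ownership
    home_region: Optional[str] = None  # only meaningful for SINGLE_WRITER


# Hypothetical tenant -> home region mapping for geo-partitioned data.
TENANT_HOME = {"tenant-a": "us-east", "tenant-b": "eu-west"}


def write_region(dataset: Dataset, tenant: str, local_region: str) -> str:
    """Return the region that should accept a write for this dataset."""
    if dataset.ownership is Ownership.SINGLE_WRITER:
        if dataset.home_region is None:
            raise ValueError(f"{dataset.name}: single-writer dataset needs a home region")
        return dataset.home_region
    if dataset.ownership is Ownership.GEO_PARTITIONED:
        return TENANT_HOME[tenant]
    # MULTI_WRITER: accept the write locally; conflicts must be resolved later.
    return local_region
```

For example, a write to a single-writer dataset homed in `us-east` routes to `us-east` even when the request arrives in `eu-west`, while a multi-writer dataset accepts the write wherever it arrives.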
Core Challenge: Write Coordination
If two regions can write the same record, conflicts are inevitable. If your domain cannot tolerate the resulting conflict semantics (for example, last-write-wins silently discarding one of two concurrent updates), avoid multi-writer for that dataset. A production rule: if you cannot explain conflicts to users, do not accept multi-writer writes for that flow.
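To make the cost of last-write-wins concrete, here is a small sketch of a whole-record LWW merge. The `Version` type, the timestamps, and the region-name tie-breaker are assumptions chosen for illustration, not a description of any particular database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Version:
    value: dict
    timestamp_ms: int  # writer's wall-clock time (assumed comparable across regions)
    region: str        # used only as a deterministic tie-breaker


def lww_merge(a: Version, b: Version) -> Version:
    """Whole-record last-write-wins: keep the version with the later timestamp."""
    return max(a, b, key=lambda v: (v.timestamp_ms, v.region))


# Two regions update different fields of the same record at nearly the same time.
base = {"email": "old@example.com", "plan": "free"}
us = Version({**base, "email": "new@example.com"}, timestamp_ms=1_000, region="us-east")
eu = Version({**base, "plan": "pro"}, timestamp_ms=1_005, region="eu-west")

winner = lww_merge(us, eu)
# The eu-west record wins as a whole, so the email change made in us-east is dropped.
```

Neither write failed from the user's point of view, yet one of them disappears after replication converges. That is the behavior you would have to be able to explain.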
Operational Challenge: Partial Partition
Active-active failures are often partial. One region may be healthy, but cross-region replication or routing is degraded. Without clear ownership and timeouts, traffic can bounce between regions, creating tail-latency spikes and inconsistent reads.
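One way to keep requests from bouncing is to carry a hop count and a remaining time budget with each request, and to fail fast instead of forwarding again. The sketch below assumes a hypothetical `Request` type, a one-hop limit, and an 80 ms cross-region round trip; all three are illustrative values, not recommendations.

```python
from dataclasses import dataclass

MAX_HOPS = 1                 # assumed policy: forward to the owner at most once
CROSS_REGION_RTT_MS = 80.0   # assumed round-trip time between regions


@dataclass(frozen=True)
class Request:
    key: str
    owner_region: str   # region that owns this key
    budget_ms: float    # time left before the caller's deadline
    hops: int = 0       # how many times this request was already forwarded


def route(req: Request, local_region: str, link_healthy: bool) -> str:
    """Decide whether to serve locally, forward to the owner, or fail fast."""
    if req.owner_region == local_region:
        return "serve_local"
    if req.hops >= MAX_HOPS:
        return "reject"  # already forwarded once; never bounce it back
    if not link_healthy or req.budget_ms < CROSS_REGION_RTT_MS:
        return "reject"  # the forward could not finish within the deadline
    return "forward"
```

A rejected request surfaces as a fast, explicit error (or a degraded response) rather than as a slow one that consumed its whole deadline crossing regions.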
Make Locality the Default
Active-active succeeds when most requests are served fully within one region. The more cross-region calls you have on the hot path, the less latency benefit you get and the more p99 latency suffers.
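A rough back-of-the-envelope model shows why. The sketch below assumes cross-region calls are made sequentially and that each call is slow independently with the same probability; real systems violate both assumptions to some degree, and the numbers are invented for illustration.

```python
def sequential_latency_ms(local_ms: float, cross_region_calls: int, rtt_ms: float) -> float:
    """Typical latency when each cross-region call adds one full round trip."""
    return local_ms + cross_region_calls * rtt_ms


def p_any_slow(calls: int, p_slow_per_call: float) -> float:
    """Probability that at least one of `calls` independent calls is slow."""
    return 1.0 - (1.0 - p_slow_per_call) ** calls


# Assumed figures: 20 ms of local work, 80 ms cross-region round trip.
fully_local = sequential_latency_ms(20.0, 0, 80.0)
two_remote_calls = sequential_latency_ms(20.0, 2, 80.0)

# If each remote call lands in its own slowest 1% independently, a request
# making five such calls hits at least one slow call far more often than 1%.
tail_risk = p_any_slow(5, 0.01)
```

Under these assumptions, two remote calls turn a 20 ms request into a 180 ms one, and five remote calls raise the chance of hitting a slow call from 1% to roughly 4.9%.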
Production Checklist
- Define ownership: single-writer, geo-owned, or multi-writer per dataset.
- Define read-after-write behavior per endpoint.
- Instrument replication lag and conflict rates (if applicable).
- Practice regional isolation: can one region run without the other?
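Two of the checklist items, read-after-write behavior and replication lag, can be expressed in terms of log positions. The sketch below assumes each write returns a monotonically increasing sequence number that the client session remembers, and that each replica exposes the highest sequence it has applied; `ReplicaState` and both functions are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ReplicaState:
    region: str
    applied_seq: int  # highest replicated log sequence applied in this region


def replication_lag(home_committed_seq: int, replica: ReplicaState) -> int:
    """Lag in log entries between the home region and a replica (never negative)."""
    return max(0, home_committed_seq - replica.applied_seq)


def choose_read_region(
    session_last_write_seq: Optional[int],
    local: ReplicaState,
    home_region: str,
) -> str:
    """Serve the read locally only if the local replica has the session's last write."""
    if session_last_write_seq is None:
        return local.region  # session has not written; a local read is safe
    if local.applied_seq >= session_last_write_seq:
        return local.region  # the replica has caught up to this session's write
    return home_region       # fall back to the home region for read-after-write
```

An endpoint that does not need read-after-write can skip the check and always read locally. The point of the checklist item is to make that choice explicitly per endpoint.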
Production-First Takeaway
Active-active is not “turn on multi-region.” It is a strict locality and ownership discipline. Start with active-active reads + single-writer writes, then evolve only if the product truly requires multi-writer behavior.