DATABASE-ADVANCED Contents

Phantoms, Lost Updates, and Anomalies

Concrete anomaly scenarios and the exact isolation/locking needed to prevent them.

On this page

Anomalies Are Not Theory: They Become Production Bugs

Isolation levels exist because concurrent transactions can create surprising outcomes. These outcomes are called anomalies. If your business rules assume they cannot happen, you will ship correctness bugs.

This lesson covers three high-impact anomalies:

  • Phantom reads
  • Lost updates
  • Write skew (often missed, very dangerous)

Setup: A Simple Inventory Table

CREATE TABLE inventory (
  sku        VARCHAR(64) PRIMARY KEY,
  available  INT NOT NULL
);

INSERT INTO inventory (sku, available)
VALUES ('chair-1', 1);

Lost Update: The Classic “Double Spend” Bug

Lost update happens when two transactions read the same value and both write a derived value, overwriting each other.

Session Script: Lost Update

T1:

BEGIN;
SELECT available FROM inventory WHERE sku = 'chair-1'; -- returns 1
-- application decides to decrement to 0
UPDATE inventory SET available = 0 WHERE sku = 'chair-1';
-- do not commit yet

T2:

BEGIN;
SELECT available FROM inventory WHERE sku = 'chair-1'; -- also returns 1
UPDATE inventory SET available = 0 WHERE sku = 'chair-1';
COMMIT;

T1:

COMMIT;

Outcome: both transactions “sold” the last item. The second write overwrote the first decision. The database may not prevent this under weaker isolation unless you use locking or atomic update patterns.

Fix Pattern: Atomic Conditional Update

A production-grade fix is to avoid “read then write.” Do it in one atomic statement:

UPDATE inventory
SET available = available - 1
WHERE sku = 'chair-1'
  AND available > 0;

If affected rows = 1, you sold it. If affected rows = 0, it was out of stock. This pattern avoids lost updates without requiring higher isolation.

Phantom Read: New Rows Appear in the Same Transaction

Phantom read happens when a transaction re-runs a query and sees new rows that match the predicate, inserted by another transaction.

Setup: Bookings Table

CREATE TABLE bookings (
  id         BIGINT PRIMARY KEY,
  room_id    INT NOT NULL,
  start_at   DATETIME NOT NULL,
  end_at     DATETIME NOT NULL
);

Session Script: Phantom Read

T1:

BEGIN;
SELECT COUNT(*)
FROM bookings
WHERE room_id = 7
  AND start_at >= '2026-03-01'
  AND start_at <  '2026-03-02'; -- returns 0

T2:

BEGIN;
INSERT INTO bookings (id, room_id, start_at, end_at)
VALUES (1001, 7, '2026-03-01 10:00:00', '2026-03-01 11:00:00');
COMMIT;

T1:

SELECT COUNT(*)
FROM bookings
WHERE room_id = 7
  AND start_at >= '2026-03-01'
  AND start_at <  '2026-03-02'; -- now returns 1 (phantom)

Under some isolation levels, this can happen because each statement sees the latest committed data.

Fix Options for Phantoms

  • Use stronger isolation (often serializable) when the invariant depends on stable predicate sets.
  • Use explicit locks or range locks (engine-specific) to prevent inserts into the range.
  • Design schema constraints that enforce the invariant (preferred when possible).

Write Skew: The Bug Many Teams Miss

Write skew happens when two transactions read overlapping data, make decisions that are individually valid, and then write to different rows, violating a global invariant.

This can happen under snapshot-based isolation (commonly repeatable read / snapshot isolation) because both transactions see the same snapshot and do not conflict on the same row.

Example Invariant: “At Least One Doctor On Call”

CREATE TABLE oncall (
  doctor_id INT PRIMARY KEY,
  on_call   INT NOT NULL -- 1 = on, 0 = off
);

INSERT INTO oncall (doctor_id, on_call)
VALUES (1, 1), (2, 1);

Invariant: at least one doctor must remain on_call=1.

Session Script: Write Skew

T1:

BEGIN;
SELECT COUNT(*) FROM oncall WHERE on_call = 1; -- returns 2
-- T1 decides it is safe to go off call
UPDATE oncall SET on_call = 0 WHERE doctor_id = 1;
-- do not commit yet

T2:

BEGIN;
SELECT COUNT(*) FROM oncall WHERE on_call = 1; -- also returns 2 (same snapshot)
UPDATE oncall SET on_call = 0 WHERE doctor_id = 2;
COMMIT;

T1:

COMMIT;

Outcome: both doctors are off call. No row-level conflict occurred because T1 and T2 updated different rows. The invariant is violated.

Fix Patterns for Write Skew

  • Serializable isolation with retry logic (often the cleanest correctness fix).
  • Explicit locking on a shared “guard” row or advisory lock to create a conflict.
  • Schema-level constraints if you can model the invariant declaratively.
  • Single-row counter pattern (careful: can become hotspot).

Production Guidance: Prefer Declarative Constraints

If you can enforce invariants with unique constraints, check constraints, or exclusion constraints (engine-specific), do it. It reduces reliance on isolation semantics and application correctness.

Failure Modes in Production

  • Overselling: lost update due to read-then-write logic.
  • Double spend: balance decremented twice with no atomic check.
  • Booking overlap: phantom inserts violate availability checks.
  • Write skew invariant break: “looks correct” logic violates global rule.
  • Retry storms: serializable fixes without backoff cause cascading load.

Operational Checklist

  • Identify business invariants explicitly (write them down).
  • Classify invariants: single-row vs multi-row vs range/predicate based.
  • Avoid read-then-write when atomic conditional updates can work.
  • For predicate invariants (ranges), consider serializable or explicit range locks.
  • For write skew risks, use serializable or a shared guard lock/row.
  • Implement safe retries with jitter/backoff for conflicts and serialization failures.
  • Load test concurrency scenarios; anomalies appear only under contention.
  • Log and monitor failed conditional updates (they are expected under contention).
  • After changes, verify invariants with periodic consistency checks.
  • Educate app teams: anomalies are predictable, not “random bugs.”

Summary

Phantoms, lost updates, and write skew are not academic. They are the core concurrency bug patterns behind oversells, double spends, and broken invariants. The fix is to choose the correct isolation level and pair it with atomic updates, constraints, and robust retry strategies.