Kafka Internals (Partitions, ISR, Replication)
Kafka Internals: How the Distributed Log Really Works
Kafka is best understood as a distributed, replicated, append-only log. Producers append records to topics, Kafka stores them durably on brokers, and consumers read them by tracking offsets. Unlike traditional message queues, Kafka is optimized for high throughput and replayable streams, which makes its internal mechanics especially important for production correctness and performance.
Core Building Blocks
Topics and Partitions
A topic is split into partitions. Each partition is an ordered log where records are appended sequentially. Kafka guarantees ordering only within a single partition, not across the entire topic.
Key implications:
- Parallelism scales with partition count.
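- Consumer concurrency within a group is bounded by the partition count: each partition is assigned to at most one consumer in the group, so extra consumers sit idle.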
- Ordering constraints must be expressed through partitioning keys.
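To make the last point concrete, here is a minimal sketch of key-based partitioning. It assumes a CRC32 hash purely for illustration; Kafka's Java client hashes keyed records with murmur2, so real partition numbers will differ. The function name `partition_for` and the sample keys are hypothetical.

```python
import zlib

def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a record key to a partition index.

    Assumption: crc32 stands in for the real hash function. Only the
    property "same key, same partition" carries over to Kafka.
    """
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return zlib.crc32(key) % num_partitions

# Records sharing a key land in the same partition, so their relative
# order is preserved. Records with different keys may interleave freely.
events = [(b"order-17", "created"), (b"order-42", "created"), (b"order-17", "paid")]
by_partition = {}
for key, value in events:
    by_partition.setdefault(partition_for(key, 6), []).append((key, value))
```

Note that a mapping of this kind depends on the partition count, so adding partitions to a topic can change where a given key lands.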
Brokers
Brokers are Kafka servers that store partitions on disk. A single broker can host many partitions from many topics. Storage performance (especially disk flush latency) and network bandwidth are often the real throughput limiters.
Replication: Leaders, Followers, and ISR
Kafka replicates partitions for durability and availability. Each partition has one leader and, when the replication factor is greater than 1, one or more followers. Producers write to the leader, and consumers read from it by default. Followers replicate by fetching from the leader.
In-Sync Replicas (ISR)
The ISR is the set of replicas, including the leader, that are caught up enough to be considered safe for acknowledgments. If a replica falls behind the leader for longer than the configured lag threshold, it is removed from the ISR. A shrinking ISR is a durability risk and often an early warning signal of storage or network pressure.
Reference: Apache Kafka documentation
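The ISR membership rule can be sketched as a simple time-based check. This is a toy model, not broker code: the function name, the data shape, and the 30-second value are assumptions chosen for illustration, loosely modeled on a lag-time threshold such as the broker setting `replica.lag.time.max.ms`.

```python
def in_sync_replicas(leader_id, last_caught_up_ms, now_ms, max_lag_ms=30_000):
    """Return the set of replica ids that count as in sync.

    last_caught_up_ms maps follower id -> the time (ms) at which that
    follower last matched the leader's log end. A follower stays in the
    ISR while that time is within max_lag_ms of now. The leader is
    always a member of its own ISR.
    """
    isr = {leader_id}
    for replica_id, caught_up_ms in last_caught_up_ms.items():
        if now_ms - caught_up_ms <= max_lag_ms:
            isr.add(replica_id)
    return isr

# Follower 2 caught up 5 s ago and stays in; follower 3 has been
# behind for 40 s and drops out.
isr = in_sync_replicas(1, {2: 95_000, 3: 60_000}, now_ms=100_000)
```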
Durability and Acknowledgments
Producer acknowledgments control the durability-latency tradeoff:
- acks=0: fastest, can lose messages silently.
- acks=1: leader writes locally then responds; risk if leader fails before followers replicate.
- acks=all: the leader responds only after every replica currently in the ISR has the record; strongest durability among standard modes.
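In production, durability depends on the combination of acks mode, ISR health, and disk behavior. Even with acks=all, durability weakens if the ISR shrinks to 1, because the leader is then the only replica that must hold the record before the write is acknowledged. Setting min.insync.replicas to 2 or more makes the broker reject such writes instead of accepting them with a single copy.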
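The interaction between acks, ISR size, and `min.insync.replicas` (the minimum ISR size required for an acks=all write) can be sketched as two small functions. This is a simplified model of the decision, not broker code, and the function names are hypothetical.

```python
def leader_accepts_write(acks, isr_size, min_insync_replicas):
    """Return True if the write is accepted under this simplified model.

    min.insync.replicas is only enforced for acks=all; with acks=0 or
    acks=1 the leader accepts the write regardless of ISR size.
    """
    if acks not in (0, 1, "all"):
        raise ValueError("acks must be 0, 1, or 'all'")
    if acks == "all" and isr_size < min_insync_replicas:
        return False  # producer sees a not-enough-replicas style error
    return True

def copies_at_ack(acks, isr_size):
    """Replicas guaranteed to hold the record when the producer sees success."""
    if acks == 0:
        return 0          # producer does not wait for any broker response
    if acks == 1:
        return 1          # only the leader's local write is guaranteed
    return isr_size       # every replica currently in the ISR

# With the ISR shrunk to the leader alone, acks=all guarantees one copy
# unless min.insync.replicas forces the write to be rejected.
assert copies_at_ack("all", isr_size=1) == 1
assert leader_accepts_write("all", isr_size=1, min_insync_replicas=2) is False
```

The trade-off is explicit here: a higher `min.insync.replicas` turns silent durability loss into visible write failures, which costs availability during an ISR shrink.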
Log Structure: Segments and Indexes
Each partition log is stored as multiple segment files. Kafka appends records to the active segment. Older segments are sealed and can be cleaned or deleted based on retention. To enable efficient reads, Kafka maintains indexes that map offsets to file positions.
Why this matters operationally:
- Large segments affect recovery time after broker restart.
- Retention and compaction settings impact disk growth patterns.
- Index corruption or disk faults can cause partition unavailability.
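The offset-to-position lookup described above can be sketched with a binary search over a sparse index. The sketch assumes segments are identified by their base offset and that the index holds only some offsets, so a reader jumps to the nearest earlier entry and scans forward from there. Names and sample values are hypothetical, and the real on-disk index format differs in detail.

```python
import bisect

def find_segment(base_offsets, target_offset):
    """Return the base offset of the segment that should hold target_offset.

    base_offsets is a sorted list with one entry per segment.
    """
    i = bisect.bisect_right(base_offsets, target_offset) - 1
    if i < 0:
        raise KeyError(f"offset {target_offset} is below the log start")
    return base_offsets[i]

def scan_start(index_entries, target_offset):
    """Return the file position from which to start scanning.

    index_entries is a sorted list of (offset, file_position) pairs.
    The entry with the largest offset <= target_offset wins; if no
    entry qualifies, scanning starts at the beginning of the segment.
    """
    offsets = [offset for offset, _ in index_entries]
    i = bisect.bisect_right(offsets, target_offset) - 1
    return index_entries[i][1] if i >= 0 else 0

segments = [0, 1000, 2000]                        # base offsets of three segments
index_1000 = [(1000, 0), (1040, 4096), (1085, 8192)]
segment = find_segment(segments, 1050)            # 1000
position = scan_start(index_1000, 1050)           # 4096
```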
Retention vs Compaction
Kafka can delete old data by time/size retention or keep the latest record per key using log compaction. Compaction is powerful for changelog topics and state rebuilds, but it changes the semantics of what is retained on disk.
Reference: Kafka log compaction
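As an illustration of the "latest record per key" semantics, here is a minimal sketch of compaction over an in-memory list. It deliberately ignores details of the real cleaner, such as how the active segment and delete markers are handled; the function name and sample records are hypothetical.

```python
def compact(records):
    """Keep only the latest record for each key.

    records is a list of (offset, key, value) tuples in offset order.
    Surviving records keep their original offsets, so the compacted log
    has gaps rather than renumbered offsets.
    """
    latest = {}
    for offset, key, value in records:
        latest[key] = (offset, key, value)   # later offsets overwrite earlier ones
    return sorted(latest.values(), key=lambda record: record[0])

log = [
    (0, "a", "1"),
    (1, "b", "1"),
    (2, "a", "2"),
    (3, "c", "1"),
    (4, "b", "2"),
]
compacted = compact(log)
# compacted == [(2, "a", "2"), (3, "c", "1"), (4, "b", "2")]
```

A consumer replaying the compacted log can rebuild the final state per key, but it can no longer observe the intermediate values.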
Consumers: Offsets and Group Coordination
Offsets
Consumers track progress using offsets. An offset is a record's position in the partition log. Kafka stores committed offsets (commonly in the internal __consumer_offsets topic). The key production question is: when do you commit offsets relative to processing side effects?
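- Commit too early: a crash after the commit but before processing completes skips the record (at-most-once behavior).
- Commit too late: a crash after processing but before the commit replays the record (at-least-once behavior), so side effects must tolerate duplicates.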
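The two orderings can be compared with a small simulation. It models a single consumer that crashes once, between its two steps, and then restarts from the committed offset. Everything here is a hypothetical stand-in: no Kafka client is involved, and `applied` represents whatever side effect the consumer performs.

```python
def simulate(records, commit_before_processing, crash_index):
    """Return the records whose side effect was applied.

    The consumer crashes exactly once, while handling records[crash_index],
    in the gap between committing and processing (or processing and
    committing). After the crash it resumes from the committed offset.
    """
    applied = []
    committed = 0      # next offset to read after a restart
    position = 0
    crashed = False
    while position < len(records):
        crash_now = position == crash_index and not crashed
        if commit_before_processing:
            committed = position + 1            # step 1: commit
            if crash_now:
                crashed = True
                position = committed            # restart skips the record
                continue
            applied.append(records[position])   # step 2: side effect
        else:
            applied.append(records[position])   # step 1: side effect
            if crash_now:
                crashed = True
                position = committed            # restart replays the record
                continue
            committed = position + 1            # step 2: commit
        position += 1
    return applied

early = simulate(["a", "b", "c"], commit_before_processing=True, crash_index=1)
late = simulate(["a", "b", "c"], commit_before_processing=False, crash_index=1)
# early == ["a", "c"]            record "b" is lost
# late  == ["a", "b", "b", "c"]  record "b" is applied twice
```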
Consumer Groups and Rebalancing
Consumers in a group divide partitions among themselves. When membership changes (deployment, crash, scaling), Kafka triggers a rebalance and partitions move. During rebalance, throughput can drop and processing pauses can occur depending on configuration. Poor rebalance behavior is a frequent source of latency spikes.
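To see why membership changes are disruptive, consider a toy round-robin assignment and count how many partitions change owner when a consumer joins. This is a deliberately naive sketch; Kafka's built-in assignors are more elaborate, and some are designed specifically to reduce this kind of movement. All names are hypothetical.

```python
def assign(partitions, consumers):
    """Distribute partitions over consumers in round-robin order."""
    members = sorted(consumers)
    if not members:
        raise ValueError("at least one consumer is required")
    assignment = {member: [] for member in members}
    for i, partition in enumerate(sorted(partitions)):
        assignment[members[i % len(members)]].append(partition)
    return assignment

def moved(before, after):
    """Return the partitions whose owner differs between two assignments."""
    owner_before = {p: m for m, ps in before.items() for p in ps}
    owner_after = {p: m for m, ps in after.items() for p in ps}
    return sorted(p for p, m in owner_after.items() if owner_before.get(p) != m)

two = assign(range(6), ["c1", "c2"])          # c1: [0, 2, 4], c2: [1, 3, 5]
three = assign(range(6), ["c1", "c2", "c3"])  # c1: [0, 3], c2: [1, 4], c3: [2, 5]
changed = moved(two, three)                   # [2, 3, 4, 5]
```

In this model, adding one consumer moves four of six partitions, and each move interrupts processing for that partition. The same function also shows the concurrency bound: with seven consumers and six partitions, one consumer receives nothing.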
Production Scenario: ISR Shrink and Durability Risk
Symptom
Write latency increases, and brokers report frequent under-replicated partitions. Shortly after, a broker failure causes data loss for recent events.
Root Cause
Followers fell behind due to disk saturation and left ISR. With ISR size reduced, acknowledgments no longer represented multi-replica durability. A leader failure occurred before replicas caught up.
Diagnosis
- Under-replicated partitions increase.
- ISR size drops to 1 for critical partitions.
- Disk latency spikes on follower brokers.
Resolution
- Restore disk headroom (move to SSDs, isolate log disks, reduce I/O contention).
- Throttle producers during incident windows.
- Alert on ISR shrink and under-replicated partitions as correctness risks.
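The alerting idea can be sketched as a classification over per-partition replica state. The `PartitionState` type, the severity labels, and the default threshold are assumptions for illustration; in practice the replica and ISR lists would come from whatever metadata or metrics source your monitoring already uses.

```python
from dataclasses import dataclass

@dataclass
class PartitionState:
    topic: str
    partition: int
    replicas: tuple   # broker ids assigned to the partition
    isr: tuple        # broker ids currently in sync

def severity(state, min_insync_replicas=2):
    """Classify a partition by replication health.

    "critical": the ISR is below min_insync_replicas, so durability or
                acks=all availability is already compromised.
    "warning":  under-replicated (ISR smaller than the replica set).
    "ok":       every assigned replica is in sync.
    """
    if len(state.isr) < min_insync_replicas:
        return "critical"
    if len(state.isr) < len(state.replicas):
        return "warning"
    return "ok"

states = [
    PartitionState("orders", 0, (1, 2, 3), (1, 2, 3)),
    PartitionState("orders", 1, (1, 2, 3), (2, 3)),
    PartitionState("orders", 2, (1, 2, 3), (3,)),
]
alerts = [(s.topic, s.partition, severity(s)) for s in states if severity(s) != "ok"]
```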
Operational Checklist
- Is partition count aligned with required throughput and consumer concurrency?
- Are you monitoring ISR size and under-replicated partitions?
- Do producer acks and min.insync.replicas match durability requirements?
- Are retention/compaction policies documented per topic?
- Is consumer offset commit strategy aligned with idempotency and side effects?
Failure Injection Test
# Kafka durability and rebalance validation
1) Produce load to a replicated topic
2) Saturate disk on one follower broker
3) Observe ISR shrink and latency impact
4) Kill the leader broker for that partition
5) Verify availability and measure any data loss window
6) Trigger consumer group scaling and measure rebalance pause
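For step 5, one way to measure the loss window is to compare the record ids the producer saw acknowledged against the ids a consumer later reads back. The sketch below assumes the test harness tags every record with a unique id and collects both lists; the function name and report fields are hypothetical.

```python
from collections import Counter

def loss_report(acked_ids, consumed_ids):
    """Compare acknowledged record ids with the ids read back afterwards.

    lost:       acknowledged by the broker but never consumed.
    duplicated: consumed more than once.
    """
    counts = Counter(consumed_ids)
    lost = [record_id for record_id in acked_ids if counts[record_id] == 0]
    duplicated = sorted(record_id for record_id, n in counts.items() if n > 1)
    return {"acked": len(acked_ids), "lost": lost, "duplicated": duplicated}

report = loss_report(acked_ids=[1, 2, 3, 4, 5], consumed_ids=[1, 2, 2, 3])
# report == {"acked": 5, "lost": [4, 5], "duplicated": [2]}
```

Only acknowledged ids belong in `acked_ids`. Records whose send failed are expected to be missing and should not count as loss.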
Key Takeaways
- Kafka is a replicated log; partitions define ordering and parallelism.
- ISR health is a first-class durability signal.
- Producer acks are only meaningful when ISR is healthy.
- Offsets define processing guarantees; commit timing must match idempotency.
- Rebalancing is an operational event that must be measured and tuned.
Kafka reliability is not only about running brokers. It is about continuously maintaining replication health, disk headroom, and consumer correctness semantics. The internals determine whether your system degrades safely under stress or fails with silent data loss.