cgroups Basics (CPU/Memory Isolation)

Understand resource control foundations behind containers and systemd slices.

What Are cgroups?

Control groups (cgroups) are a Linux kernel feature that limits, accounts for, and isolates the resource usage of groups of processes. They control:

  • CPU usage
  • Memory usage
  • I/O bandwidth
  • Number of processes

Container tooling such as Docker and Kubernetes relies on cgroups, but cgroups exist without containers too. On a systemd-based production host, every service already runs inside one.

Why cgroups Matter in Production

Without isolation:

  • A memory leak can crash the entire host
  • A batch job can starve your API
  • A runaway process can fork bomb the system

With cgroups:

  • Each service gets defined resource boundaries
  • Blast radius is reduced
  • Incidents are contained

cgroups v1 vs v2

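Most current Linux distributions default to cgroups v2 (the unified hierarchy); older releases use v1 or a hybrid of the two. You can check:

mount | grep cgroup

If the only entry is a cgroup2 filesystem mounted on /sys/fs/cgroup, you are using v2. If you also see per-controller cgroup mounts (cpu, memory, and so on), the host is running v1 or hybrid mode.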
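The mount output can be ambiguous on hybrid hosts, so a stricter check looks at the filesystem type of the cgroup root. The following is a minimal sketch, assuming GNU stat and the conventional /sys/fs/cgroup mount point; the function names are illustrative, not part of any tool.

```shell
#!/bin/sh
# Map the filesystem type mounted at the cgroup root to a cgroup mode.
classify_cgroup_fs() {
    case "$1" in
        cgroup2fs) echo "v2" ;;            # unified hierarchy
        tmpfs)     echo "v1 or hybrid" ;;  # per-controller mounts sit under a tmpfs
        *)         echo "unknown" ;;
    esac
}

# Report the mode for a cgroup root (defaults to /sys/fs/cgroup).
cgroup_version() {
    root="${1:-/sys/fs/cgroup}"
    classify_cgroup_fs "$(stat -fc %T "$root" 2>/dev/null)"
}

# Usage:
#   cgroup_version
```

Splitting the classification from the stat call keeps the mapping easy to check without access to a real cgroup mount.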

How systemd Uses cgroups

Every systemd service runs inside a cgroup. You can inspect:

systemctl status myapp
systemd-cgls

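systemd creates this tree automatically: slices group units hierarchically (for example system.slice), services hold processes that systemd started, and scopes hold processes started externally, such as login sessions.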

Limiting CPU with systemd

Inside your unit file:

[Service]
CPUQuota=50%

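This caps the service at 50% of one CPU's time, summed across all of its processes and threads. Values above 100% are allowed on multi-core hosts: CPUQuota=200% permits two full CPUs.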
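On a cgroups v2 host the quota ends up in the service's cpu.max file as a "<quota> <period>" pair in microseconds. The arithmetic can be sketched as follows, assuming systemd's default period of 100 ms; the helper name cpu_quota_to_cpu_max is hypothetical.

```shell
#!/bin/sh
# Convert a CPUQuota percentage into the "<quota> <period>" pair found in cpu.max.
# Assumes a period of 100000 microseconds (100 ms) unless one is passed as $2.
cpu_quota_to_cpu_max() {
    percent="$1"
    period_us="${2:-100000}"
    echo "$(( percent * period_us / 100 )) $period_us"
}

# Usage:
#   cpu_quota_to_cpu_max 50     # prints "50000 100000": half of one CPU
#   cpu_quota_to_cpu_max 200    # prints "200000 100000": two full CPUs
```

Comparing this value with the contents of cpu.max is a quick way to confirm that a unit-file change actually reached the kernel.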

Limiting Memory

[Service]
MemoryMax=500M

If the service's cgroup reaches this limit and the kernel cannot reclaim enough memory, the OOM killer kills a process inside that cgroup. Better that one service dies than the entire host.
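To confirm after the fact that the limit was what killed a process, the cgroup's memory.events file on cgroups v2 keeps an oom_kill counter. A minimal sketch that reads it, where the function name oom_kills is illustrative and the cgroup directory is passed in by the caller:

```shell
#!/bin/sh
# Print the oom_kill counter from a cgroup's memory.events file.
# $1 is the cgroup directory, e.g. /sys/fs/cgroup/<path>.
# Prints 0 if the file has no oom_kill line.
oom_kills() {
    awk '$1 == "oom_kill" { print $2; found = 1 }
         END { if (!found) print 0 }' "$1/memory.events"
}

# Usage:
#   oom_kills /sys/fs/cgroup/<path>
```

A counter that grows between two checks means processes in that cgroup were killed for exceeding the memory limit in the interval.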

Live Inspection

Inspect cgroup paths:

cat /proc/<pid>/cgroup

Check memory usage (v2 example):

cat /sys/fs/cgroup/<path>/memory.current
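The two steps above can be chained: on cgroups v2 the entry in /proc/<pid>/cgroup starts with "0::", and the rest of the line is the path below the cgroup root. A minimal sketch, assuming a v2 host; the function names and the optional root parameters are illustrative.

```shell
#!/bin/sh
# Extract the cgroups v2 path from a /proc/<pid>/cgroup file.
# The v2 entry is the line that starts with "0::".
cgroup_v2_path() {
    sed -n 's/^0:://p' "$1"
}

# Print memory.current (bytes) for the cgroup that a PID belongs to.
# $2 and $3 override the cgroup and proc roots; they default to the real ones.
pid_memory_current() {
    pid="$1"
    cgroup_root="${2:-/sys/fs/cgroup}"
    proc_root="${3:-/proc}"
    path="$(cgroup_v2_path "$proc_root/$pid/cgroup")"
    cat "$cgroup_root$path/memory.current"
}

# Usage:
#   pid_memory_current <pid>
```

During an incident this answers "which cgroup is this process in, and how much memory is that cgroup using" in one call.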

cgroups vs nice

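  • nice adjusts relative scheduling priority; it only matters when processes compete for CPU, and a niced process can still use all of an idle CPU
  • cgroups enforce limits on CPU, memory, and process count regardless of load

nice expresses a preference. cgroups are enforcement.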

Fork Bomb Protection

You can limit process count:

[Service]
TasksMax=500

This prevents uncontrolled process spawning.
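To see how close a service is to its task limit, cgroups v2 exposes pids.current and pids.max in the service's cgroup directory. A minimal sketch, with task_usage as a hypothetical helper name:

```shell
#!/bin/sh
# Print "<current>/<limit>" task counts for a cgroup directory.
# pids.max holds either a number or the literal string "max" (no limit).
task_usage() {
    current="$(cat "$1/pids.current")"
    limit="$(cat "$1/pids.max")"
    echo "$current/$limit"
}

# Usage:
#   task_usage /sys/fs/cgroup/<path>
```

Tasks here means threads as well as processes, so a heavily threaded service needs a higher limit than its process count suggests.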

Production Pattern: Isolate Everything

Good production design:

  • Each service runs in its own cgroup
  • CPUQuota defined for non-critical services
  • MemoryMax defined for untrusted workloads
  • TasksMax set to safe values
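One way to apply this pattern without editing the packaged unit file is a systemd drop-in. The following is a sketch only: the function name, the limits.conf file name, and the limit values (reused from the examples on this page) are assumptions to adapt per service.

```shell
#!/bin/sh
# Write a drop-in containing the three limits discussed on this page.
# $1 is the drop-in directory, e.g. /etc/systemd/system/myapp.service.d
write_limits_dropin() {
    dir="$1"
    mkdir -p "$dir" || return 1
    cat > "$dir/limits.conf" <<'EOF'
[Service]
CPUQuota=50%
MemoryMax=500M
TasksMax=500
EOF
}

# Usage (as root), then reload systemd and restart the service:
#   write_limits_dropin /etc/systemd/system/myapp.service.d
#   systemctl daemon-reload
#   systemctl restart myapp
```

Keeping limits in a separate drop-in means package upgrades that replace the main unit file do not silently remove them.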

Common Production Mistakes

  • Running everything without limits
  • Blaming code when the issue is resource starvation
  • Using nice instead of proper cgroup limits
  • Not monitoring memory.current or the per-cgroup pressure files (memory.pressure, cpu.pressure)
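For the pressure item, cgroups v2 on kernels built with pressure stall information (PSI) exposes memory.pressure and cpu.pressure in each cgroup directory. A minimal sketch that pulls out one figure to feed into monitoring, assuming PSI is enabled; the function name is illustrative.

```shell
#!/bin/sh
# Print the "some avg10" value from a PSI file such as memory.pressure.
# avg10 is the percentage of the last 10 seconds during which at least
# one task in the cgroup was stalled waiting on the resource.
pressure_some_avg10() {
    awk '$1 == "some" { sub(/^avg10=/, "", $2); print $2 }' "$1"
}

# Usage:
#   pressure_some_avg10 /sys/fs/cgroup/<path>/memory.pressure
```

A sustained non-zero value is an early sign that a service is being squeezed by its limits, often before anything is killed.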

Mental Model

cgroups are containment. They define how much damage a service can do to a host. In production, isolation is more important than raw performance.

Production Checklist

  • Use systemd limits (CPUQuota, MemoryMax, TasksMax)
  • Inspect cgroup assignments during incidents
  • Prefer isolation over shared unlimited resources
  • Monitor memory and CPU pressure per service