Goroutines Explained
Goroutines: Lightweight but Not Free
Goroutines are often described as “cheap threads.” This is partially true — they are lightweight compared to OS threads — but they are not free. In production systems, uncontrolled goroutines lead to memory growth, scheduler pressure, and degraded latency.
Understanding goroutines is not about syntax. It is about lifecycle control.
How Goroutines Actually Work
Go uses an M:N scheduler model:
- G = Goroutine
- M = OS thread
- P = Logical processor
The runtime schedules many goroutines onto a smaller number of OS threads. This enables massive concurrency without massive thread overhead.
Check CPU Parallelism
runtime.GOMAXPROCS(0)
Passing 0 queries the current setting without changing it. The value caps how many OS threads can execute Go code simultaneously.
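As a minimal sketch, the setting can be read alongside the logical CPU count. The helper name parallelism is made up for this example; runtime.NumCPU and runtime.GOMAXPROCS are standard library calls.

```go
package main

import (
	"fmt"
	"runtime"
)

// parallelism reports the logical CPU count and the current GOMAXPROCS
// setting. Passing 0 to GOMAXPROCS only reads the value.
func parallelism() (cpus, procs int) {
	return runtime.NumCPU(), runtime.GOMAXPROCS(0)
}

func main() {
	cpus, procs := parallelism()
	fmt.Printf("logical CPUs: %d, GOMAXPROCS: %d\n", cpus, procs)
}
```

The two numbers can differ when GOMAXPROCS has been overridden, for example through the GOMAXPROCS environment variable.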
Real Production Failure: Goroutine Leak
A service handling HTTP requests spawned a goroutine for each background task but never canceled them when requests ended.
go func() {
	process(data)
}()
Under load, requests timed out, but the background goroutines kept running. Over several hours memory usage climbed steadily until the container was OOM-killed.
Root cause: unbounded goroutine creation + no cancellation strategy.
Detecting Goroutine Leaks
Check Goroutine Count
runtime.NumGoroutine()
Sudden growth under steady load is a red flag.
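To see what that growth looks like, here is a small sketch that deliberately parks goroutines on a channel and measures the change in runtime.NumGoroutine(). The helpers leak and growth are hypothetical names for this illustration, and the blocked goroutines are released at the end so the demo itself does not leak.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// leak starts n goroutines that block on a channel nobody writes to.
// It returns a release function that unblocks them and waits for exit.
func leak(n int) (release func()) {
	stop := make(chan struct{})
	var started, done sync.WaitGroup
	started.Add(n)
	done.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer done.Done()
			started.Done()
			<-stop // blocked until release is called
		}()
	}
	started.Wait() // make sure every goroutine is actually running
	return func() {
		close(stop)
		done.Wait()
	}
}

// growth reports how many goroutines were added by parking n of them.
func growth(n int) int {
	before := runtime.NumGoroutine()
	release := leak(n)
	defer release()
	return runtime.NumGoroutine() - before
}

func main() {
	fmt.Println("goroutines added:", growth(100))
}
```

In a real service the count would be sampled periodically rather than around a single call, but the signal is the same: a number that rises and never comes back down.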
Using pprof
import _ "net/http/pprof"
The blank import registers its handlers on http.DefaultServeMux. Serve that mux on an internal port, then inspect:
curl 'http://localhost:6060/debug/pprof/goroutine?debug=2'
This shows stack traces of all active goroutines.
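The sketch below shows the wiring, under some assumptions. The address localhost:6060 is taken from the curl command, and the helper names goroutineDump and dumpHasStacks are made up. To keep the example self-checking without opening a port, it calls the default mux in-process through net/http/httptest; the comment in main shows the more usual way to serve it.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	_ "net/http/pprof" // registers /debug/pprof/* on http.DefaultServeMux
	"strings"
)

// goroutineDump fetches the same report the curl command would,
// but in-process, by calling the default mux directly.
func goroutineDump() string {
	req := httptest.NewRequest(http.MethodGet, "/debug/pprof/goroutine?debug=2", nil)
	rec := httptest.NewRecorder()
	http.DefaultServeMux.ServeHTTP(rec, req)
	return rec.Body.String()
}

// dumpHasStacks reports whether the dump contains at least one
// "goroutine N [state]:" style header.
func dumpHasStacks() bool {
	return strings.Contains(goroutineDump(), "goroutine ")
}

func main() {
	// In a real service you would typically serve the mux on an
	// internal-only address instead, for example:
	//   go func() { log.Println(http.ListenAndServe("localhost:6060", nil)) }()
	fmt.Println("dump contains stacks:", dumpHasStacks())
}
```

When reading a dump, look for many goroutines parked at the same line: that is usually where the leak is.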
Goroutine Lifecycle Discipline
Every goroutine must have:
- A clear start condition
- A clear exit condition
- A cancellation mechanism
Context Is Not Optional
Use context for cancellation.
ctx, cancel := context.WithTimeout(parentCtx, 2*time.Second)
defer cancel()
go func() {
	select {
	case <-ctx.Done():
		return
	case result := <-workChan:
		handle(result)
	}
}()
If your goroutine ignores context, it may outlive its usefulness.
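One way to make the exit observable is to have the goroutine signal when it returns. The following is a sketch, not a prescribed pattern: worker and exitsOnCancel are hypothetical names, and the one-second grace period is an arbitrary choice for the demo.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// worker drains workChan until ctx is canceled, then closes done so the
// caller can observe that the goroutine really exited.
func worker(ctx context.Context, workChan <-chan int, done chan<- struct{}) {
	defer close(done)
	for {
		select {
		case <-ctx.Done():
			return
		case v := <-workChan:
			_ = v // placeholder for real handling
		}
	}
}

// exitsOnCancel starts a worker, cancels its context, and reports whether
// the worker exited within one second.
func exitsOnCancel() bool {
	ctx, cancel := context.WithCancel(context.Background())
	workChan := make(chan int)
	done := make(chan struct{})
	go worker(ctx, workChan, done)

	cancel()
	select {
	case <-done:
		return true
	case <-time.After(time.Second):
		return false
	}
}

func main() {
	fmt.Println("worker exited after cancel:", exitsOnCancel())
}
```

The done channel costs almost nothing and turns "the goroutine probably stopped" into something a caller or a test can wait on.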
Blocking Without Exit: A Classic Leak
go func() {
	for msg := range ch {
		process(msg)
	}
}()
If ch is never closed, this goroutine never exits.
Always define ownership of channel closing.
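A common convention, sketched here with hypothetical produce and consume helpers, is that the sole sender owns the channel and closes it when it is done. The consumer's range loop then ends on its own.

```go
package main

import "fmt"

// produce owns ch: it is the only sender, so it is the one that closes it.
func produce(msgs []string) <-chan string {
	ch := make(chan string)
	go func() {
		defer close(ch) // closing ends the consumer's range loop
		for _, m := range msgs {
			ch <- m
		}
	}()
	return ch
}

// consume ranges over ch and returns how many messages it saw.
// It returns only because the producer closes the channel.
func consume(ch <-chan string) int {
	n := 0
	for range ch {
		n++
	}
	return n
}

func main() {
	fmt.Println("messages consumed:", consume(produce([]string{"a", "b", "c"})))
}
```

Returning a receive-only channel makes the ownership explicit in the type: callers cannot close it by mistake. Note that this sketch assumes the consumer reads until the channel is closed; if it might stop early, the producer also needs a cancellation path.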
Unbounded Concurrency
This pattern is dangerous:
for _, item := range items {
	go process(item)
}
If items has 10,000 elements, you just spawned 10,000 goroutines.
This may:
- Exhaust memory
- Overwhelm downstream systems
- Increase GC pressure
Bounded Concurrency Pattern
sem := make(chan struct{}, 10)
for _, item := range items {
	sem <- struct{}{}
	go func(it Item) {
		defer func() { <-sem }()
		process(it)
	}(item)
}
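This caps in-flight goroutines at 10: once the buffer is full, the send blocks until a running goroutine releases its slot. Note that the loop can finish while the last goroutines are still running.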
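The semaphore loop does not by itself wait for the final goroutines to finish. A sketch that adds a sync.WaitGroup for that, and records the peak concurrency so the limit can be checked, might look like this. The function runBounded and its signature are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// runBounded calls work for each item with at most limit calls in flight,
// waits for all of them, and returns the highest concurrency it observed.
func runBounded(items []int, limit int, work func(int)) int {
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup
	var mu sync.Mutex
	inFlight, peak := 0, 0

	for _, item := range items {
		sem <- struct{}{} // blocks while limit goroutines are running
		wg.Add(1)
		go func(it int) {
			defer wg.Done()
			defer func() { <-sem }() // release the slot before signaling done

			mu.Lock()
			inFlight++
			if inFlight > peak {
				peak = inFlight
			}
			mu.Unlock()

			work(it)

			mu.Lock()
			inFlight--
			mu.Unlock()
		}(item)
	}
	wg.Wait() // do not return until every goroutine has finished
	return peak
}

func main() {
	items := make([]int, 100)
	peak := runBounded(items, 10, func(int) { time.Sleep(time.Millisecond) })
	fmt.Println("peak concurrency:", peak)
}
```

The peak counter exists only to demonstrate the bound; production code would normally drop it and keep just the semaphore and the WaitGroup.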
Scheduler and Preemption
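Go 1.14 introduced asynchronous preemption, which lets the runtime interrupt a goroutine even inside a tight loop that makes no function calls. Before 1.14, such a loop could monopolize its thread and stall other goroutines and the garbage collector.
Preemption keeps scheduling fair, but a busy-wait loop still burns a full CPU core. Prefer blocking on a channel, timer, or sync primitive over spinning.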
Stack Growth Model
Each goroutine starts with a small stack (about 2 KB) that grows dynamically as needed.
This is efficient, but the memory still adds up: one million idle goroutines need roughly 2 GB for stacks alone.
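The per-goroutine cost can be estimated empirically. The sketch below parks n goroutines and divides the growth in runtime.MemStats.StackInuse by n. The helper name stackBytesPerGoroutine is made up, and the result is an approximation that varies with Go version and platform.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// stackBytesPerGoroutine parks n goroutines and returns the average growth
// in stack memory per goroutine, as reported by MemStats.StackInuse.
func stackBytesPerGoroutine(n int) uint64 {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)

	stop := make(chan struct{})
	var started, done sync.WaitGroup
	started.Add(n)
	done.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer done.Done()
			started.Done()
			<-stop // stay parked while memory is measured
		}()
	}
	started.Wait()
	runtime.ReadMemStats(&after)

	// Release the parked goroutines so the measurement does not leak them.
	close(stop)
	done.Wait()

	if after.StackInuse <= before.StackInuse {
		return 0
	}
	return (after.StackInuse - before.StackInuse) / uint64(n)
}

func main() {
	fmt.Println("approx. stack bytes per goroutine:", stackBytesPerGoroutine(10000))
}
```

These goroutines do almost nothing, so the figure is close to a lower bound; goroutines with deep call chains or large local variables will have grown their stacks well beyond it.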
Goroutines and Panic Handling
A panic that is not recovered in the goroutine where it occurs crashes the entire process; a recover in the parent goroutine does not help.
go func() {
	defer func() {
		if r := recover(); r != nil {
			log.Println("recovered:", r)
		}
	}()
	riskyOperation()
}()
Use recover cautiously. Panics often indicate programmer errors.
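Rather than only logging, a recovered panic can be handed back to the caller as an error so that it is not silently swallowed. This is one possible shape, with safeGo as a hypothetical helper name.

```go
package main

import "fmt"

// safeGo runs fn in a new goroutine and converts a panic into an error
// delivered on the returned channel (nil if fn returned normally).
func safeGo(fn func()) <-chan error {
	errc := make(chan error, 1) // buffered so the goroutine never blocks on send
	go func() {
		defer func() {
			if r := recover(); r != nil {
				errc <- fmt.Errorf("recovered panic: %v", r)
				return
			}
			errc <- nil
		}()
		fn()
	}()
	return errc
}

func main() {
	err := <-safeGo(func() { panic("boom") })
	fmt.Println("error from goroutine:", err)
}
```

The caller decides what to do with the error: log it, count it in a metric, or shut down. That keeps the policy out of the goroutine itself.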
Testing for Goroutine Leaks
In tests, check goroutine count before and after.
before := runtime.NumGoroutine()
// run test logic
after := runtime.NumGoroutine()
if after > before {
	t.Fatal("possible goroutine leak")
}
It is not perfect (unrelated goroutines can start or finish in between), but it is a useful heuristic.
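A common source of false positives is checking the count before finished goroutines have fully exited. One way to reduce that, sketched below, is to poll for a short settle period before declaring a leak. The helpers leaked and demo are hypothetical, and the 200 ms window is an arbitrary choice.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// leaked runs fn and reports whether the goroutine count is still above
// its starting value after polling for up to wait.
func leaked(fn func(), wait time.Duration) bool {
	before := runtime.NumGoroutine()
	fn()
	deadline := time.Now().Add(wait)
	for {
		if runtime.NumGoroutine() <= before {
			return false // count settled back: no leak detected
		}
		if time.Now().After(deadline) {
			return true // still elevated after the settle period
		}
		time.Sleep(10 * time.Millisecond)
	}
}

// demo checks one deliberately stuck goroutine and one that exits at once.
func demo() (leaky, clean bool) {
	stop := make(chan struct{})
	defer close(stop) // release the deliberately stuck goroutine afterwards

	leaky = leaked(func() { go func() { <-stop }() }, 200*time.Millisecond)
	clean = leaked(func() { go func() {}() }, 200*time.Millisecond)
	return leaky, clean
}

func main() {
	leaky, clean := demo()
	fmt.Println("stuck goroutine flagged:", leaky, "| short-lived goroutine flagged:", clean)
}
```

In a test, the same helper can wrap the logic under test, with t.Fatal called when it returns true.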
Operational Monitoring
Export metrics:
- runtime.NumGoroutine()
- GC pause time
- Memory allocation
Spikes in goroutines often precede latency issues.
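As one low-dependency option, the standard library's expvar package can publish these values as JSON. The sketch below assumes a metric name of "goroutines", which is invented for this example; expvar already publishes runtime.MemStats (including GC pause and allocation figures) under "memstats".

```go
package main

import (
	"expvar"
	"fmt"
	"runtime"
)

func init() {
	// Publish a live goroutine count under the hypothetical name "goroutines".
	// The function is evaluated each time the variable is read.
	expvar.Publish("goroutines", expvar.Func(func() interface{} {
		return runtime.NumGoroutine()
	}))
}

// goroutineMetric returns the published value rendered as JSON text.
func goroutineMetric() string {
	return expvar.Get("goroutines").String()
}

func main() {
	// Importing expvar also registers a handler at /debug/vars on
	// http.DefaultServeMux; serving that mux exposes all variables as JSON.
	fmt.Println("goroutines =", goroutineMetric())
}
```

Teams that already run a metrics stack would register an equivalent gauge with their own client library instead; the point is that the value is sampled continuously, not only during incidents.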
Anti-Patterns
- Spawning goroutines inside loops without limits
- Ignoring context cancellation
- Never closing channels
- Fire-and-forget background tasks
- Swallowing panics silently
Operational Checklist
- Every goroutine has an exit condition
- Context propagated correctly
- Concurrency bounded where needed
- Goroutine count monitored
- pprof enabled in non-prod environments
- No unbounded background tasks
Final Perspective
Goroutines are one of Go’s greatest strengths — and one of its most subtle operational risks. Lightweight does not mean infinite. In production systems, concurrency must be controlled, observable, and cancelable. Discipline around goroutine lifecycle separates robust services from unstable ones.