cgroup v1 vs cgroup v2: real differences and concrete impacts on Kubernetes workloads
"cgroup v2 is no longer “experimental”. Most modern Linux distributions and Kubernetes releases are moving to it by default.
This article focuses on what actually changed, what breaks, and what you really observe in Kubernetes.
Quick reminder: what cgroups do
Control Groups (cgroups) are a Linux kernel feature used to:
- limit resources (CPU, memory, I/O, pids…)
- account for resource usage
- enforce isolation between workloads
Containers do not work without cgroups. Kubernetes relies on them heavily via the container runtime.
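Before diving into the differences, here is a minimal sketch of the interface itself, assuming a cgroup v2 host, root access, and the memory controller enabled at the root (the group name and value are examples):

```bash
# create a group, cap its memory, and move the current shell into it
mkdir /sys/fs/cgroup/demo
echo 100M > /sys/fs/cgroup/demo/memory.max   # hard memory limit for the group
echo $$ > /sys/fs/cgroup/demo/cgroup.procs   # attach this shell to the group
cat /proc/self/cgroup                        # -> 0::/demo
```

Container runtimes do essentially this (with many more knobs) for every container they start.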
High-level difference
| Aspect | cgroup v1 | cgroup v2 |
|---|---|---|
| Hierarchy | Multiple independent hierarchies | Single unified hierarchy |
| Controllers | Enabled per hierarchy | Enabled top-down |
| Resource accounting | Fragmented | Consistent |
| Memory management | Weak isolation | Strong isolation |
| Kernel complexity | High | Lower |
| Kubernetes future | Legacy | Target state |
Architecture changes that matter
1. Unified hierarchy (biggest change)
cgroup v1
- Each controller (cpu, memory, blkio…) has its own hierarchy
- A process can be attached to a different cgroup in each hierarchy
- Leads to inconsistent accounting
cgroup v2
- One single hierarchy for all controllers
- Every process belongs to one and only one cgroup
- Resource distribution becomes predictable
Kubernetes impact
- Pod-level resource accounting is now accurate
- CPU and memory enforcement behave consistently
- Less “ghost” usage reported by kubelet
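This is easy to observe from inside a pod or directly on a node; a quick sketch, with illustrative paths (the exact slice names depend on the runtime and cgroup driver):

```bash
# v1: one membership line per controller hierarchy, e.g.
#   12:memory:/kubepods/burstable/pod<uid>/<container-id>
#   11:cpu,cpuacct:/kubepods/burstable/pod<uid>/<container-id>
#   ...
# v2: a single unified membership line, e.g.
#   0::/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod<uid>.slice/...
cat /proc/self/cgroup

# On v2 every controller hangs off the same mount:
cat /sys/fs/cgroup/cgroup.controllers   # e.g. cpuset cpu io memory hugetlb pids
```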
CPU behavior differences
CPU throttling
v1
- `cpu.cfs_quota_us` applied per cgroup
- Bursty workloads often throttled unexpectedly
- CPU shares and quotas interact poorly
v2
- CPU controller redesigned: `cpu.max` replaces quota/period, `cpu.weight` replaces shares
- Fairer distribution across siblings
- Better handling of bursts
Observed changes
- Less CPU throttling for short-lived containers
- HPA reacts more accurately to CPU usage
- Latency-sensitive workloads (API servers, gateways) behave more smoothly
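The on-disk interface changes accordingly; a sketch of how the same CPU limit and the throttling counters look under each version (pod paths are illustrative and truncated):

```bash
# v1: quota, period and shares are separate files under the cpu hierarchy
cat /sys/fs/cgroup/cpu/kubepods/burstable/pod<uid>/cpu.cfs_quota_us   # e.g. 50000
cat /sys/fs/cgroup/cpu/kubepods/burstable/pod<uid>/cpu.shares         # e.g. 512

# v2: a single cpu.max file, "<quota> <period>" or "max <period>"
cat /sys/fs/cgroup/kubepods.slice/.../cpu.max      # e.g. "50000 100000" = 0.5 CPU
cat /sys/fs/cgroup/kubepods.slice/.../cpu.weight   # replaces shares, default 100

# Throttling counters live in cpu.stat in both versions
grep throttled /sys/fs/cgroup/kubepods.slice/.../cpu.stat   # nr_throttled, throttled_usec
```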
Memory management changes (critical)
OOM behavior
v1
- Memory limit enforcement is aggressive
- Kernel can kill processes outside the container
- No clear distinction between reclaimable vs non-reclaimable memory
v2
- Memory pressure propagation is hierarchical
- OOM killer is scoped correctly
- Introduces `memory.min` (protected memory) and `memory.high` (throttling threshold below the hard limit)
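A sketch of how those knobs appear on a node (the pod path is illustrative; when the kubelet MemoryQoS feature gate is enabled, these values are derived from pod requests and limits):

```bash
POD_CG=/sys/fs/cgroup/kubepods.slice/kubepods-burstable.slice/...   # illustrative
cat $POD_CG/memory.min       # protected from reclaim under memory pressure
cat $POD_CG/memory.high      # soft ceiling: usage above it triggers reclaim/throttling
cat $POD_CG/memory.max       # hard limit: exceeding it triggers the OOM killer
cat $POD_CG/memory.current   # current usage
grep oom_kill $POD_CG/memory.events   # how often the OOM killer fired in this cgroup
```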
Observed changes in Kubernetes
- Fewer node-level OOM kills
- Pods respect memory limits more strictly
- System daemons are less impacted by noisy neighbors
Important: this directly improves node stability.
Swap handling (major improvement)
v1
- Swap accounting is optional and often disabled
- Hard to reason about memory + swap usage
v2
- Swap is first-class
- `memory.swap.max` supported
- Predictable swap behavior per pod
Kubernetes impact
- kubelet can enforce swap limits correctly
- Safer to enable swap on nodes (when configured properly)
- Reduces random OOMs on memory pressure spikes
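A sketch of the per-cgroup swap interface on v2 (illustrative path; for pods this additionally depends on the kubelet NodeSwap feature and its configured swap behavior):

```bash
POD_CG=/sys/fs/cgroup/kubepods.slice/...   # illustrative
cat $POD_CG/memory.swap.max       # swap limit for this cgroup ("max" = unlimited, 0 = no swap)
cat $POD_CG/memory.swap.current   # swap currently used by this cgroup
# For comparison, v1 only offered memory.memsw.limit_in_bytes (memory + swap combined),
# and often only when swap accounting was enabled at boot (swapaccount=1).
```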
I/O control differences
v1
- `blkio` controller depends on the legacy block layer
- Poor support for modern NVMe devices
- Hard to tune
v2
- Replaced by the `io` controller
- Works with cgroup-aware I/O schedulers
- More predictable throttling
Observed changes
- Stateful workloads (databases, queues) experience:
- more stable latency
- fewer I/O starvation scenarios
- Better isolation between noisy pods
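A sketch of the v2 `io` interface (the device major:minor and limits are examples; the pod path is illustrative):

```bash
POD_CG=/sys/fs/cgroup/kubepods.slice/...   # illustrative
cat $POD_CG/io.stat     # per-device counters: "259:0 rbytes=... wbytes=... rios=... wios=..."
cat $POD_CG/io.weight   # proportional weight (needs a weight-capable policy such as BFQ or io.cost)
# Throttle device 259:0 to 10 MB/s writes and 1000 read IOPS:
echo "259:0 wbps=10485760 riops=1000" > $POD_CG/io.max
```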
Kubernetes + container runtime differences
Container runtimes
| Runtime | v1 support | v2 support |
|---|---|---|
| containerd | ✅ | ✅ (default) |
| CRI-O | ✅ | ✅ (recommended) |
| Docker (legacy) | ⚠️ | ❌ |
Breaking change
- Docker (dockershim) does not fully support cgroup v2
- This is one reason Kubernetes removed dockershim
kubelet behavior changes
With cgroup v2:
- kubelet should use the systemd cgroup driver (and it must match the runtime's driver)
- CPU and memory stats are more accurate
- Pod eviction decisions improve
If you still use `--cgroup-driver=cgroupfs`, you will hit incompatibilities.
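A quick way to verify that the drivers line up on a node; the file locations below are the usual kubeadm, containerd and CRI-O defaults and may differ in your setup:

```bash
# kubelet: cgroup driver in its config file
grep -i cgroupDriver /var/lib/kubelet/config.yaml        # expect: cgroupDriver: systemd

# containerd: runc must also be told to use the systemd driver
grep -i SystemdCgroup /etc/containerd/config.toml        # expect: SystemdCgroup = true

# CRI-O (if used):
grep -i cgroup_manager /etc/crio/crio.conf 2>/dev/null   # expect: cgroup_manager = "systemd"
```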
Common breaking changes observed
1. Monitoring tools
Some tools still expect v1 paths:
- `/sys/fs/cgroup/memory/...`
- `/sys/fs/cgroup/cpu/...`
With v2:
- Single unified mount
- Metrics location changes
Impact
- Older exporters break
- Custom scripts fail silently
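The fix is usually a path and file-name translation rather than a rewrite; a sketch of the most common mappings (cgroup paths are illustrative):

```bash
CG=kubepods.slice/...   # illustrative container/pod cgroup, relative to the mount
# CPU usage: v1 cpuacct.usage (nanoseconds) vs v2 cpu.stat usage_usec (microseconds)
cat /sys/fs/cgroup/cpuacct/$CG/cpuacct.usage 2>/dev/null \
  || grep usage_usec /sys/fs/cgroup/$CG/cpu.stat
# Memory usage and limit: v1 *_in_bytes files vs v2 memory.current / memory.max
cat /sys/fs/cgroup/memory/$CG/memory.usage_in_bytes 2>/dev/null \
  || cat /sys/fs/cgroup/$CG/memory.current
cat /sys/fs/cgroup/memory/$CG/memory.limit_in_bytes 2>/dev/null \
  || cat /sys/fs/cgroup/$CG/memory.max
```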
2. Hardcoded resource tuning
Examples:
- Custom scripts writing directly to v1 files
- Old systemd unit overrides
These stop working entirely on v2.
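For example, a node-tuning script that wrote limits into the v1 memory hierarchy has no effect (or fails outright) on a v2 node and needs to target the unified files instead (group name and value are illustrative):

```bash
# v1-era script: the per-controller mount no longer exists on a v2 node
echo $((512 * 1024 * 1024)) > /sys/fs/cgroup/memory/mygroup/memory.limit_in_bytes

# v2 equivalent: same intent, unified hierarchy and new file name
echo 512M > /sys/fs/cgroup/mygroup/memory.max
```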
3. Overcommitted CPU workloads
Because CPU fairness improves:
- Some workloads receive less CPU than before
- Especially batch jobs abusing shares
This exposes bad resource requests.
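Part of the surprise is the unit change: v1 shares are translated into v2 weights by the runtime. A sketch using the conversion runc applies (the pod value is an example):

```bash
# v1: cpu.shares, default 1024, range 2..262144
# v2: cpu.weight, default 100, range 1..10000
shares=512                                      # e.g. a pod requesting 500m CPU
weight=$(( 1 + (shares - 2) * 9999 / 262142 ))  # runc's shares -> weight conversion
echo "$weight"                                  # -> 20
```

Relative ratios between pods are roughly preserved, but the absolute numbers change, so dashboards or scripts that read `cpu.shares` directly need updating.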
Kubernetes features that benefit most from cgroup v2
- Guaranteed QoS pods
- Memory QoS (alpha/beta features)
- Vertical Pod Autoscaler accuracy
- Node stability under pressure
- Multi-tenant clusters
When you still might see issues
- Legacy kernels (< 5.8)
- Old monitoring stacks
- Docker-based nodes
- Custom admission controllers relying on v1 semantics
Recommendation
If you run:
- Kubernetes ≥ 1.25
- containerd or CRI-O
- systemd-based nodes
👉 Use cgroup v2
cgroup v1 should be considered deprecated operational debt, not a tuning option.
Quick validation command
Run `stat -fc %T /sys/fs/cgroup` on the node.
Expected output for v2: `cgroup2fs` (a cgroup v1 host typically reports `tmpfs`).
Final take
cgroup v2 is not just a cleanup. It changes real behavior: CPU fairness, memory isolation, OOM safety, and observability.
If your cluster “behaves better” after switching to v2, that’s not magic — that’s the kernel finally doing the right thing.