Kubernetes Resource Isolation - 03. CPU Isolation in Kubernetes Using cgroups
SEGMENT 3 — CPU ISOLATION IN KUBERNETES USING CGROUPS
We break this segment into 6 major parts:
- CPU Resource Types in Kubernetes
- CPU Requests → cpu.shares / cpu.weight
- CPU Limits → cpu.cfs_quota_us (throttling)
- QoS Effects on CPU
- Real-world Behavior & Noisy Neighbor Scenarios
- CPU Pinning, cpuset, & NUMA topology (optional advanced topic)
Let’s go through each.
Part 1 — CPU Resource Types in Kubernetes
In a Pod, you can specify:
resources:
  requests:
    cpu: "500m"
  limits:
    cpu: "2"
These do different things:
1. CPU Requests → scheduling + relative share
- Tells the scheduler: “I need this much CPU capacity”
- Translates to cpu.shares (cgroup v1) or cpu.weight (cgroup v2)
- Used ONLY during contention (when CPU is 100% used)
2. CPU Limits → absolute maximum
- Translates to cpu.cfs_quota_us / cpu.cfs_period_us
- Caps CPU usage even when node is idle
- Enforces throttling
That’s the key conceptual split:
**Requests = fairness. Limits = hard caps.**
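To make the split concrete, here is a minimal Pod sketch (the name and image are placeholders, not from the original) with comments noting which cgroup knob each field drives:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: cpu-demo              # hypothetical name, for illustration only
spec:
  containers:
  - name: app
    image: nginx              # placeholder image; any workload applies
    resources:
      requests:
        cpu: "500m"           # scheduler reservation; -> cpu.shares ~= 512 (v1) / cpu.weight (v2)
      limits:
        cpu: "2"              # hard cap; -> cpu.cfs_quota_us = 200000 per 100000 us period
```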
Part 2 — CPU Requests → cgroup shares/weights
cgroup v1
cpu.shares
Formula:
cpu.shares = cpu_request * 1024
Examples:
| Request | cpu.shares |
|---|---|
| 100m | ~102 |
| 250m | ~256 |
| 1 CPU | 1024 |
| 4 CPUs | 4096 |
How shares work
- Shares only matter when multiple cgroups want CPU and CPU is full.
- If node is idle → container can use ALL CPU regardless of shares.
Example:
- Pod A requests 1 CPU → shares = 1024
- Pod B requests 0.1 CPU → shares = 102
When both try to fully use CPU on a 1-core node:
- A gets 1024/(1024+102) ≈ 90%
- B gets 102/(1024+102) ≈ 10%
If node is idle and B is alone:
- B can use 100% of a CPU even though it requested only 0.1 CPU.
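As a sketch, the two Pods above could look like this (names and images are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-a                 # hypothetical name
spec:
  containers:
  - name: app
    image: nginx              # placeholder image
    resources:
      requests:
        cpu: "1"              # -> cpu.shares = 1024 on cgroup v1
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-b                 # hypothetical name
spec:
  containers:
  - name: app
    image: nginx              # placeholder image
    resources:
      requests:
        cpu: "100m"           # -> cpu.shares ~= 102; loses roughly 10:1 to pod-a under contention
```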
cgroup v2
cpu.weight (1–10000)
The kubelet computes the same share value as above and maps it into the 1–10000 weight range; the standard conversion (used by the kubelet and runc) is approximately:
cpu.weight = 1 + ((cpu.shares - 2) * 9999) / 262142
So a 1-CPU request (1024 shares) becomes a weight of about 39. The behavior is equivalent: relative fairness under contention.
Part 3 — CPU Limits → cfs_quota_us (hard throttling)
This is the most important piece of CPU isolation.
Kubernetes sets:
cpu.cfs_period_us = 100000   (100 ms)
cpu.cfs_quota_us = limit * 100000
Example:
cpu limit = 2 CPUs → quota = 200000
cpu limit = 0.5 CPU → quota = 50000
Meaning: in every 100 ms window, this cgroup can use only quota microseconds of CPU time.
If container tries to exceed it → kernel throttles it.
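For example, a 500m limit becomes a quota of 50000 microseconds per 100000-microsecond period (cgroup v2 writes the same cap as cpu.max = "50000 100000"). A minimal sketch, with placeholder name and image:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: throttle-demo         # hypothetical name
spec:
  containers:
  - name: app
    image: nginx              # placeholder image
    resources:
      limits:
        cpu: "500m"           # -> cpu.cfs_quota_us = 50000, cpu.cfs_period_us = 100000
                              #    (cgroup v2: cpu.max = "50000 100000")
      # with no explicit request, Kubernetes defaults the CPU request to the limit
```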
How throttling feels inside the Pod
Inside the container, your application will see:
- bursts of good performance, then
- short stalls while the cgroup waits for the next CFS period (anywhere from a few milliseconds up to most of the 100 ms window, depending on how early the quota ran out)
This impacts:
- JVM apps
- Go apps with goroutine scheduling
- Latency-sensitive workloads
- High-throughput microservices
If you set no CPU limit:
Then:
cpu.cfs_quota_us = -1   (on cgroup v2: cpu.max = "max 100000")
Meaning:
- unlimited CPU usage (no CFS throttling)
- but still bounded by cpu.shares / cpu.weight under contention
- this is actually the recommended setting for many apps (e.g., Java apps with thread pools); see the sketch below
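A minimal sketch of that request-but-no-limit pattern (name and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: no-limit-demo         # hypothetical name
spec:
  containers:
  - name: app
    image: nginx              # placeholder image
    resources:
      requests:
        cpu: "500m"           # guarantees a fair share under contention (shares/weight)
      # no cpu limit: quota stays unlimited, so the app can burst into idle CPU
```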
Part 4 — QoS Effects on CPU Isolation
Review of classes:
BestEffort
- No request, no limit → cpu.weight = 1 (the minimum; cpu.shares = 2 on v1), and no CFS quota
- They get CPU only when nobody else needs it
Burstable
- Has requests but may not have limits
- Gets proportional CPU during contention
- Can be throttled if limit < actual demand
Guaranteed
- Requests == limits for all containers
- They get exactly what they requested: the share weight reflects the full request, and the CFS quota caps them at the same value
- Strongest, most predictable CPU isolation
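A minimal sketch of a Guaranteed pod (name and image are placeholders); note that memory requests and limits must also match for the Pod to be classed Guaranteed:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-demo       # hypothetical name
spec:
  containers:
  - name: app
    image: nginx              # placeholder image
    resources:
      requests:
        cpu: "2"
        memory: "1Gi"
      limits:
        cpu: "2"              # equal to the request
        memory: "1Gi"         # equal to the request -> Guaranteed QoS
```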
Part 5 — Real-world CPU Isolation Behavior
Scenario A — Noisy neighbor with no limit
Pod A:
limit = 200m
Pod B:
no limit
If both fully utilize CPU:
- A is throttled at 200m
- B uses everything else
If node is idle:
- B uses 100% CPU
- A uses up to its limit, then throttles
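Sketched as manifests (names and images are placeholders; B is given a small request here purely for illustration):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pod-a                 # hypothetical name
spec:
  containers:
  - name: app
    image: nginx              # placeholder image
    resources:
      limits:
        cpu: "200m"           # hard-capped at 0.2 CPU, even on an idle node
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-b                 # hypothetical name
spec:
  containers:
  - name: app
    image: nginx              # placeholder image
    resources:
      requests:
        cpu: "100m"           # no limit: free to soak up whatever CPU A's cap leaves behind
```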
Scenario B — Both have limits
Pod A:
limit = 1 CPU
Pod B:
limit = 2 CPU
Total node capacity: 4 CPUs
If both try to use as much as possible:
- A capped at 1
- B capped at 2
- node has 1 CPU leftover idle
Scenario C — High request, low limit (invalid)
Pod:
request = 4 CPU
limit = 1 CPU
- The API server rejects this spec outright: a container's CPU request must be less than or equal to its CPU limit.
- If it were admitted, the scheduler would reserve 4 CPUs while the kubelet capped the container at 1 CPU, causing heavy throttling and CPU-starvation behavior.
This is why requests may never exceed limits, and why a limit far below a workload's real CPU demand is a bad practice.
Scenario D — BestEffort pod vs Guaranteed pod
Guaranteed pod:
request = 2 CPU, limit = 2 CPU → shares = 2048
BestEffort pod:
no request → shares = 2
Under contention:
- Guaranteed pod wins 1000:1 ratio
- BestEffort pod gets CPU scraps (may get <1% CPU when node is busy)
Scenario E — Multi-container Pod
CPU limits are per-container.
Pod:
- container A: limit = 1
- container B: limit = 1
Container A cannot steal CPU from container B. Throttling is per-container.
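A minimal sketch of such a Pod (name and images are placeholders); each container lands in its own child cgroup with its own quota:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: two-containers        # hypothetical name
spec:
  containers:
  - name: container-a
    image: nginx              # placeholder image
    resources:
      limits:
        cpu: "1"              # A's own quota: 100000 us per 100 ms period
  - name: container-b
    image: nginx              # placeholder image
    resources:
      limits:
        cpu: "1"              # B's quota is enforced independently of A's
```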
Part 6 — CPU Pinning, cpuset, and NUMA Topology (Advanced)
This enters the world of:
- cpuset.cpus
- Guaranteed pods with integer CPU requests
- Topology Manager
- Static CPU Manager Policy
Kubernetes can pin containers to specific CPUs, BUT only when:
- --cpu-manager-policy=static is enabled on the kubelet
- the Pod is Guaranteed
- Requests & limits for CPU are whole integers (2, 3, 4 — NO “500m”)
Then kubelet will allocate exclusive CPUs like:
cpuset.cpus = "2-3"
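A minimal sketch of a pod eligible for exclusive cores, assuming the node's kubelet is configured with cpuManagerPolicy: static (the KubeletConfiguration field behind --cpu-manager-policy=static); name and image are placeholders:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: pinned-demo           # hypothetical name
spec:
  containers:
  - name: app
    image: nginx              # placeholder image
    resources:
      requests:
        cpu: "2"              # whole integer, so eligible for exclusive CPUs
        memory: "1Gi"
      limits:
        cpu: "2"              # equal to request -> Guaranteed QoS
        memory: "1Gi"
```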
Benefits:
- No context switch noise
- Lower latency jitter
- NUMA-local memory improves throughput
Used for:
- trading systems
- HPC
- AI inference workloads
- low-latency telecom apps
I can do a full “CPU Manager + Topology Manager” segment later if you want.
Segment 3 Summary
You now understand:
CPU Requests
- Convert to share weights (cpu.shares / cpu.weight)
- Only matter during contention
CPU Limits
- Convert to CFS quota → throttling
- Hard caps even on idle nodes
QoS classes
- Control share-weight priorities
- Guaranteed pods get best isolation
Real behavior
- Limits far below real demand are an anti-pattern (and requests above limits are rejected outright)
- Apps can burst above requests
- Throttling causes micro-pauses
- BestEffort workloads get scraps
Advanced
- CPU pinning possible with special kubelet configs
Next step
Pick the next deep dive:
👉 Segment 4 — Memory Isolation
(memory.max, memory.high, OOM, eviction, memory QoS)
👉 Segment 5 — I/O, PID limits, cpuset, hugetlb & other controllers
👉 Segment 6 — CPU Manager & Topology Manager (full deep dive)
(how Kubernetes assigns exclusive CPUs, NUMA awareness)
Which one do you want next?