Kubernetes Resource Isolation - 12. Ultimate Node Sizing Guide for AKS, EKS, and GKE
Segment 12 is where we get extremely practical about selecting the right node sizes and VM shapes in AKS/EKS/GKE. This is one of the most important but least understood aspects of Kubernetes performance engineering.
Choosing the wrong node size leads to:
- Constant evictions
- Memory pressure
- CPU throttling
- NUMA imbalance
- Poor inference latency
- Overpaying for unused cores
- Starved node system components (CNI agents, CSI drivers, monitoring agents)
This guide will help you select the best node types for:
- Microservices
- JVM workloads
- High-throughput services
- Dataplanes (Cilium, Envoy)
- Redis, Postgres
- AI/ML
- Spark
- GPU workloads
Let’s go deep.
SEGMENT 12 — Ultimate Node Sizing Guide for AKS, EKS, and GKE
We will cover:
- The principles for choosing node sizes
- CPU-to-memory ratios that actually work
- Understanding NUMA (critical!)
- Choosing VM families in each cloud (AKS/EKS/GKE)
- Node sizes for different workload types
- When to use large nodes vs many small nodes
- When to use local SSD
- Cost optimization rules
PART 1 — Principles of Good Node Sizing
These are universal across AKS/EKS/GKE.
1. Memory pressure kills nodes — not CPU
Always design node capacity with memory as the primary constraint.
Nodes rarely fail from high CPU usage. Nodes frequently fail from memory exhaustion → eviction → OOM → kubelet death → NotReady.
2. NUMA topology heavily affects performance
Nodes with ≥ 2 sockets or ≥ 2 NUMA nodes require careful placement.
- JVM
- Redis
- AI inference
- network dataplanes
These workloads cannot randomly bounce across NUMA nodes.
Prefer single-NUMA nodes for latency-sensitive workloads.
3. Avoid nodes with > 64 vCPUs unless you run CPU-pinned workloads
Large nodes → more NUMA domains → more cgroup fragmentation → lower efficiency.
4. Prefer more medium nodes over fewer huge nodes
- reduces blast radius
- avoids multi-Pod NUMA fragmentation
- improves bin packing
- reduces eviction chain reactions
5. Always leave space for system daemons
Rule of thumb:
- Reserve 6–12% of node memory
- Reserve 0.5–1.5 vCPU for system/kube daemons
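To see what your provider already reserves, compare each node's capacity against its allocatable resources (allocatable = capacity minus kube/system reserves and the hard eviction threshold). A minimal sketch, assuming a reachable cluster and the official `kubernetes` Python client installed:

```python
# pip install kubernetes
from kubernetes import client, config

def parse_mem_gib(quantity: str) -> float:
    """Convert a Kubernetes memory quantity (e.g. '32927756Ki', '64Gi') to GiB."""
    units = {"Ki": 1 / (1024 ** 2), "Mi": 1 / 1024, "Gi": 1.0, "Ti": 1024.0}
    for suffix, factor in units.items():
        if quantity.endswith(suffix):
            return float(quantity[: -len(suffix)]) * factor
    return float(quantity) / (1024 ** 3)  # plain bytes

def report_reserves() -> None:
    config.load_kube_config()  # or config.load_incluster_config() inside the cluster
    for node in client.CoreV1Api().list_node().items:
        cap = parse_mem_gib(node.status.capacity["memory"])
        alloc = parse_mem_gib(node.status.allocatable["memory"])
        reserved = cap - alloc  # kube-reserved + system-reserved + eviction threshold
        print(f"{node.metadata.name}: capacity={cap:.1f} GiB, "
              f"allocatable={alloc:.1f} GiB, reserved={reserved:.1f} GiB "
              f"({reserved / cap:.0%})")

if __name__ == "__main__":
    report_reserves()
```

If the reserved share falls well outside the 6–12% rule of thumb, revisit your kubelet reservation flags or pick a different node size.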
PART 2 — Recommended CPU : Memory Ratios
Use these ratios as starting points:
| Workload Type | Recommended Ratio |
|---|---|
| Stateless microservices (Go, Node, Python) | 1 vCPU : 2–4 GiB |
| JVM microservices (Spring Boot, Micronaut) | 1 vCPU : 3–8 GiB |
| Databases (Redis, Postgres) | 1 vCPU : 4–8 GiB |
| High-throughput dataplane (Envoy, Cilium) | 1 vCPU : 1–2 GiB |
| AI Inference (CPU-heavy) | 1 vCPU : 1–3 GiB |
| AI w/ GPU | CPU not bottleneck → 1 vCPU : 4–16 GiB |
| Spark/Flink executors | 1 vCPU : 2–8 GiB, memory-bound |
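If you want to codify the table, a small helper can flag node shapes whose GiB-per-vCPU falls outside the band for a workload class. The bands below simply mirror the table; tune them with your own profiling data:

```python
# CPU:memory ratio bands (GiB of memory per vCPU), mirroring the table above.
RATIO_BANDS = {
    "stateless-microservice": (2, 4),
    "jvm-microservice": (3, 8),
    "database": (4, 8),
    "dataplane": (1, 2),
    "ai-inference-cpu": (1, 3),
    "ai-gpu": (4, 16),
    "spark-flink": (2, 8),
}

def ratio_fits(vcpus: int, mem_gib: int, workload: str) -> bool:
    """Return True if the node shape's GiB-per-vCPU falls inside the band."""
    low, high = RATIO_BANDS[workload]
    return low <= mem_gib / vcpus <= high

# Example: an 8 vCPU / 32 GiB node (4 GiB per vCPU) suits JVM microservices,
# but is memory-heavy for an Envoy/Cilium dataplane pool.
print(ratio_fits(8, 32, "jvm-microservice"))  # True
print(ratio_fits(8, 32, "dataplane"))         # False
```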
PART 3 — NUMA Topology Explained (Critical Selection Factor)
How to think about NUMA:
- Single NUMA node = predictable, consistent latency
- Multiple NUMA nodes = remote memory access, 20–80% slowdowns for AI/Redis/Envoy, and more complex scheduling
Cloud providers rarely document NUMA boundaries, but here is the approximate mapping:
AWS (EKS) NUMA
- m5 / c5 / r5 → 1 NUMA node up to 24–32 vCPUs
- m6i / c6i / r6i → 1 NUMA until ~32–48 vCPUs
- m5.24xlarge / c5.24xlarge → 2 NUMA nodes
Azure (AKS) NUMA
Azure uses “CPU groups”, but effectively:
- D-series, E-series → 1 NUMA up to ~32 vCPUs
- F-series → 1 NUMA up to ~16 vCPUs
- Lsv2 → 2+ NUMA nodes (local SSD optimized)
GCP (GKE) NUMA
- n2-standard, e2-standard → single NUMA up to 32 vCPUs
- n2-highmem/highcpu → single NUMA up to 48 vCPUs
- a2 / g2 GPU nodes → large, multi-NUMA topologies
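Since providers rarely spell this out, the most reliable option is to check on the node itself: Linux exposes the NUMA layout under `/sys/devices/system/node`. A short sketch you could run on a node (for example from a privileged debug pod with the host filesystem visible):

```python
# Inspect NUMA topology from Linux sysfs (run directly on the node).
from pathlib import Path

def numa_topology(sysfs_root: str = "/sys/devices/system/node") -> None:
    nodes = sorted(Path(sysfs_root).glob("node[0-9]*"))
    for node in nodes:
        cpulist = (node / "cpulist").read_text().strip()
        meminfo = (node / "meminfo").read_text()
        total_kb = int(meminfo.split("MemTotal:")[1].split("kB")[0])
        print(f"{node.name}: CPUs {cpulist}, {total_kb / 1024**2:.1f} GiB")
    if len(nodes) > 1:
        print("Multiple NUMA nodes: pin latency-sensitive Pods with the "
              "static CPU Manager policy and Topology Manager.")

numa_topology()
```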
PART 4 — Recommended VM Families Per Cloud
AKS (Azure)
⭐ Best General Purpose Workload Nodes:
- D4s_v5, D8s_v5, D16s_v5
Balance of:
- memory
- CPU
- no NUMA surprises
⭐ Best Compute Nodes:
- F4s_v2, F8s_v2
Best for:
- Cilium agents
- API gateways
- small services
Avoid > F16 (NUMA segmentation)
⭐ Best Memory-Optimized:
- E8ds_v5, E16ds_v5, E20
Ideal for:
- Java
- Elasticsearch
- Redis
⭐ Best for NVMe-heavy workloads:
- L8s_v3, L16s_v3
For:
- Spark
- batch
- caching
- databases with high random IO
⭐ Best CPU-optimized for AI/DPDK:
- D8as_v5, F8as_v4 (start with 8 cores to keep single NUMA)
EKS (AWS)
⭐ Best general workloads:
- m6i.large / xlarge / 2xlarge / 4xlarge
⭐ Best for high-throughput:
- c6i.xlarge / 2xlarge
⭐ Best memory-heavy:
- r6i.xlarge / 2xlarge / 4xlarge
⭐ Best AI CPU-side pre/post processing:
- c7g (Graviton3) — excellent price/performance
- m7g — best balance
⭐ Best with local SSD:
- i3.xlarge / 2xlarge (best throughput in AWS)
Avoid:
- m5.24xlarge
- c5.18xlarge (NUMA splitting → inconsistent performance)
GKE (Google Cloud)
⭐ Best general workloads:
- n2-standard-4 / 8 / 16
⭐ Best memory workloads:
- n2-highmem-4 / 8 / 16
⭐ Best CPU-heavy:
- c2-standard-4 / 8
⭐ Best local SSD:
- n2-standard-8 w/ Local SSD
Avoid:
- n1 or older instance types
- Very large machine types (> 64 vCPUs)
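If you drive node pool creation from automation (Terraform variables, cluster templates), the recommendations above can be expressed as a simple lookup. This mapping is just the lists from this part restated as data; treat the SKUs as starting points, not hard requirements:

```python
# Starting-point node SKUs per cloud and workload class, taken from the lists above.
# Note: the Azure API expects the "Standard_" prefix (e.g. Standard_D8s_v5).
NODE_SKUS = {
    "aks": {
        "general":   ["D4s_v5", "D8s_v5", "D16s_v5"],
        "compute":   ["F4s_v2", "F8s_v2"],
        "memory":    ["E8ds_v5", "E16ds_v5"],
        "local-ssd": ["L8s_v3", "L16s_v3"],
    },
    "eks": {
        "general":   ["m6i.large", "m6i.xlarge", "m6i.2xlarge", "m6i.4xlarge"],
        "compute":   ["c6i.xlarge", "c6i.2xlarge"],
        "memory":    ["r6i.xlarge", "r6i.2xlarge", "r6i.4xlarge"],
        "local-ssd": ["i3.xlarge", "i3.2xlarge"],
    },
    "gke": {
        "general":   ["n2-standard-4", "n2-standard-8", "n2-standard-16"],
        "compute":   ["c2-standard-4", "c2-standard-8"],
        "memory":    ["n2-highmem-4", "n2-highmem-8", "n2-highmem-16"],
        "local-ssd": ["n2-standard-8"],  # plus attached Local SSD
    },
}

def pick_sku(cloud: str, workload_class: str, size_index: int = 0) -> str:
    """Return a candidate node SKU for a cloud and workload class."""
    return NODE_SKUS[cloud][workload_class][size_index]

print(pick_sku("eks", "memory", 1))  # r6i.2xlarge
```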
PART 5 — Node Sizes Per Workload Type
1. Microservices (Go, Node, Python)
Best sizes:
- 4 vCPU / 16 GiB
- 8 vCPU / 32 GiB
Why:
- Good bin packing
- No NUMA pressure
- Fits 10–25 Pods safely
Avoid:
- Very small nodes (inefficient)
- Very large nodes (blast radius)
2. JVM Apps (Spring Boot, Pega, Kafka clients)
Needs:
- high memory per Pod
- JVM heap + direct buffers
Best sizes:
- 8 vCPU / 64 GiB
- 16 vCPU / 128 GiB
If each Pod needs ~4 GiB of heap (plus roughly 1 GiB of off-heap overhead):
- a 64 GiB node fits roughly 10–12 Pods
- with headroom left for system daemons (see the sketch below)
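The same arithmetic, written out so you can plug in your own reserves and per-Pod requests. The reserve fraction and eviction headroom are the rule-of-thumb numbers from Part 1, and the 5 GiB per-Pod figure assumes 4 GiB of heap plus roughly 1 GiB of off-heap overhead:

```python
def pods_per_node(node_mem_gib: float, pod_mem_gib: float,
                  reserve_fraction: float = 0.10,
                  eviction_headroom_gib: float = 1.0) -> int:
    """Rough count of Pods that fit on a node by memory alone."""
    allocatable = node_mem_gib * (1 - reserve_fraction) - eviction_headroom_gib
    return int(allocatable // pod_mem_gib)

# 64 GiB node, ~10% reserved for system/kube daemons, 1 GiB eviction headroom,
# JVM Pods needing ~5 GiB each (4 GiB heap + off-heap/metaspace/threads):
print(pods_per_node(64, 5.0))   # 11 Pods
```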
3. Redis / Memcached
Needs:
- single NUMA node
- predictable CPU
- local SSD optional
Best sizes:
- 8 vCPU / 64 GiB
- 16 vCPU / 128 GiB
Never deploy Redis on:
- multi-NUMA 32–64 core nodes (unless CPU pinned)
4. Envoy Proxy / API Gateway
Needs:
- stable CPU
- no throttling
- low jitter
Best sizes:
- 4 vCPU / 8 GiB
- 8 vCPU / 16 GiB
Run fewer Pods per node for isolation.
5. AI/ML Inference (CPU-bound)
Needs:
- NUMA alignment
- large memory for models
- predictable batching latency
Best sizes:
- 8 vCPU / 32 GiB
- 16 vCPU / 64 GiB
With CPUManager:
- Pin 4–8 CPUs exclusively to each inference worker (see the sketch below)
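Exclusive pinning with the static CPU Manager policy only applies to Guaranteed-QoS Pods whose containers request integer CPUs with requests equal to limits. A minimal sketch of such a Pod built with the `kubernetes` Python client (the image and names are placeholders, and the node pool must run the kubelet with `cpuManagerPolicy: static`):

```python
from kubernetes import client

# Guaranteed QoS: requests == limits with integer CPUs, so the static CPU Manager
# policy can grant this container exclusive cores on the node.
inference_pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="inference-worker"),
    spec=client.V1PodSpec(
        containers=[
            client.V1Container(
                name="inference",
                image="registry.example.com/inference:latest",  # placeholder image
                resources=client.V1ResourceRequirements(
                    requests={"cpu": "8", "memory": "32Gi"},
                    limits={"cpu": "8", "memory": "32Gi"},
                ),
            )
        ],
    ),
)

# client.CoreV1Api().create_namespaced_pod(namespace="ml", body=inference_pod)
```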
6. AI/ML with GPU
CPU sizing is secondary.
Good rule:
- 4–6 vCPUs per GPU
- 16–32 GiB memory per GPU
Node example:
- A10 GPU node → 8 vCPU / 32 GiB
- A100 GPU node → 32 vCPU / 128–256 GiB
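The same rule as quick arithmetic for a whole node; the per-GPU figures are the rough ratios above, not vendor requirements:

```python
def gpu_node_shape(gpus: int, vcpus_per_gpu: int = 6, mem_gib_per_gpu: int = 32) -> dict:
    """Rough CPU/memory envelope for a GPU node, sized per attached GPU."""
    return {"gpus": gpus,
            "vcpus": gpus * vcpus_per_gpu,
            "memory_gib": gpus * mem_gib_per_gpu}

print(gpu_node_shape(1, vcpus_per_gpu=6, mem_gib_per_gpu=32))  # roughly an A10-class node
print(gpu_node_shape(8, vcpus_per_gpu=4, mem_gib_per_gpu=32))  # 8-GPU A100-class node: 32 vCPU / 256 GiB
```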
7. Databases (Postgres, MySQL, Elasticsearch)
Needs:
- huge page cache
- high memory
- stable IO
Best sizes:
- 8 vCPU / 64 GiB
- 16 vCPU / 128 GiB
With local SSD:
- Lsv2 (AKS)
- i3/i4i (EKS)
- n2-standard w/ local SSD (GKE)
Avoid:
- memory-poor compute nodes
8. Spark / Flink / Ray
Executors need:
- memory
- local SSD
- CPU bursts
Best sizes:
- 16 vCPU / 64 GiB
- 32 vCPU / 128 GiB
- with local SSD
Avoid:
- small nodes (executor fragmentation)
- massive nodes (NUMA issues)
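When sizing these nodes, also account for Spark's executor memory overhead (by default the larger of 384 MiB or 10% of `spark.executor.memory`), which sits on top of the heap you request. A rough executors-per-node estimate, reusing the Part 1 reserve rule of thumb:

```python
def executors_per_node(node_mem_gib: float, executor_mem_gib: float,
                       executor_cores: int, node_vcpus: int,
                       reserve_fraction: float = 0.10) -> int:
    """Executors that fit per node, including Spark's default memory overhead."""
    overhead = max(0.375, 0.10 * executor_mem_gib)   # max(384 MiB, 10% of executor memory)
    per_executor = executor_mem_gib + overhead
    by_memory = int(node_mem_gib * (1 - reserve_fraction) // per_executor)
    by_cpu = node_vcpus // executor_cores
    return min(by_memory, by_cpu)

# 32 vCPU / 128 GiB node, 16 GiB / 4-core executors:
print(executors_per_node(128, 16, 4, 32))   # 6 executors (memory-bound, not CPU-bound)
```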
PART 6 — When to Use Large Nodes vs Small Nodes
Use small/medium nodes (<16 vCPU) for:
- microservices
- latency-sensitive workloads
- Cilium/Envoy
- Redis
- AI inference
- clusters with high Pod churn
Benefits:
- low blast radius
- easier bin packing
- fast autoscaling
Use large nodes (32–64 vCPU) for:
- Spark executors
- Flink task managers
- ETL workloads
- AI training (multi-GPU nodes)
Avoid very large nodes (>64 vCPU) unless:
- you’re doing ML training
- pods are pinned to cores
- you fully understand NUMA management
PART 7 — Local SSD Guidance
Use nodes with local SSD when:
- Redis
- Postgres WAL/logs
- Spark shuffle
- ML preprocessing
- High local IO workloads
Avoid local SSD for:
- general microservices (no benefit)
- workloads using remote storage (EBS/EFS/Azure Disk/Premium)
PART 8 — Cost Optimization Rules
1. Use medium nodes for better bin packing
- 8 vCPU / 32 GiB is the global sweet spot
2. Avoid high-memory SKUs unless necessary
- r-series / E-series carry a cost premium
3. Prefer Graviton (AWS) or Ampere (GKE/Oracle) over x86 where compatible
- 20–40% cheaper
- better price/performance
4. GPU nodes: choose the smallest CPU SKU that meets throughput
- oversizing CPU around GPUs is the #1 cost waste in AI clusters
5. Use autoscaling with Pod Disruption Budgets
- avoids evacuation storms
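A quick way to sanity-check these rules is to compare cost per allocatable GiB (or per vCPU) across candidate SKUs. The hourly prices below are made-up placeholders; substitute your actual regional or committed-use rates:

```python
# Hypothetical hourly prices -- replace with your real rates per region/commitment.
CANDIDATES = [
    # (name, vcpus, mem_gib, hourly_usd)
    ("general-8x32",  8, 32, 0.38),
    ("highmem-8x64",  8, 64, 0.55),
    ("graviton-8x32", 8, 32, 0.31),
]

RESERVE_FRACTION = 0.10   # rule-of-thumb system/kube reserve from Part 1

for name, vcpus, mem_gib, price in CANDIDATES:
    usable_gib = mem_gib * (1 - RESERVE_FRACTION)
    print(f"{name}: ${price / usable_gib:.4f} per allocatable GiB-hour, "
          f"${price / vcpus:.4f} per vCPU-hour")
```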
SEGMENT 12 SUMMARY
You now have a cloud-agnostic, workload-driven node sizing strategy:
Core Principles
- memory > CPU
- avoid NUMA fragmentation
- prefer several medium nodes
- leave room for system daemons
Best VM Families
- Azure: D-series, E-series, F-series, Lsv2 for SSD
- AWS: m6i, c6i, r6i, c7g (Graviton), i3/i4i
- GCP: n2-standard, n2-highmem, c2-standard
Per-Workload Node Size Playbooks
- Microservices → 4–8 vCPU
- JVM → 8–16 vCPU, high-memory
- Redis → 8 vCPU single-NUMA
- AI inference → 8–16 vCPU
- AI GPU → 4–6 CPUs per GPU
- Spark → 16–32 vCPU, local SSD
Cost Optimization
- medium nodes pack best
- avoid big NUMA nodes
- Graviton/Ampere highly efficient
- GPU nodes should minimize CPU