Kubernetes Resource Isolation - 14: A Catalog of Cluster Design Patterns

October 17, 2025  4 minute read  

Segment 14 is a catalog of cluster design patterns you can combine:

  • How to slice the cluster into node pools
  • How to slice workloads via namespaces, tenants, and QoS
  • How to use taints/tolerations, priority classes, PDBs, and topology to control behavior
  • When to make more clusters vs fewer clusters

I’ll keep each pattern fairly tight so you can remix them.


1. Node Pool Segmentation Patterns

1.1 General vs Specialized Pools

Pattern:

  • general-pool for 80–90% of workloads
  • One or more specialized pools:

    • perf (CPUManager, TopologyManager)
    • gpu
    • batch
    • db or stateful

Mechanics:

  • Labels:

    kubectl label node node-1 node-pool=general
    kubectl label node node-2 node-pool=perf
    
  • Taints on special pools:

    kubectl taint node node-2 perf-only=true:NoSchedule
    
  • Workload spec:

    nodeSelector:
      node-pool: perf
    tolerations:
      - key: "perf-only"
        operator: "Exists"
        effect: "NoSchedule"
    

When to use: almost always. This is the baseline pattern.
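
The perf pool assumes kubelet-level settings on those nodes. A minimal sketch of the relevant KubeletConfiguration fields; how you ship this depends on your distro or managed node-group tooling:

apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cpuManagerPolicy: static                  # exclusive CPUs for Guaranteed Pods with integer CPU requests
topologyManagerPolicy: single-numa-node   # keep CPU and device allocations on one NUMA node
reservedSystemCPUs: "0,1"                 # illustrative; the static policy needs an explicit CPU reservation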


1.2 Horizontal Isolation by “Noisy Class”

Separate node pools for:

  • system (CNI, CSI, metrics, logging)
  • user-apps
  • noisy-batch (Spark, ETL, big cronjobs)

Idea: Keep noisy, spiky workloads from contaminating general services.

Mechanics:

  • System DaemonSets:

    nodeSelector:
      node-role.kubernetes.io/system: "true"
    
  • Batch node pool tainted:

    kubectl taint node batch-pool batch-only=true:NoSchedule
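
To keep user apps off the system pool as well, taint it and let the system DaemonSets tolerate that taint. A minimal sketch; the taint key mirrors the label above and the node name is illustrative:

kubectl taint node system-node-1 node-role.kubernetes.io/system=true:NoSchedule

# and in the system DaemonSets:
tolerations:
  - key: "node-role.kubernetes.io/system"
    operator: "Exists"
    effect: "NoSchedule"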
    

1.3 Cost/Hardware Pools

Pools by machine type:

  • spot or preemptible
  • standard
  • high-mem
  • ssd-local

Use them like:

  • Non-critical workers → spot
  • Latency-critical → standard
  • Memory-heavy → high-mem
  • Spark/Redis → ssd-local

Key: Every pool has labels & taints; workloads choose via nodeSelector / nodeAffinity + tolerations.
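
For example, a fault-tolerant worker that prefers the spot pool while tolerating its taint. A sketch: the node-pool label follows the scheme from 1.1, and the spot-only taint key is an assumption:

affinity:
  nodeAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        preference:
          matchExpressions:
            - key: node-pool
              operator: In
              values: ["spot"]
tolerations:
  - key: "spot-only"        # assumed taint on the spot pool
    operator: "Exists"
    effect: "NoSchedule"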


2. Namespace & Tenant Patterns

2.1 Namespace-per-team / namespace-per-product

Pattern:

  • team-a-dev, team-a-prod
  • product-x-dev, product-x-prod

Controls per namespace:

  • ResourceQuota
  • LimitRange
  • NetworkPolicy
  • RBAC

Example:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-a-quota
  namespace: team-a-prod
spec:
  hard:
    requests.cpu: "40"
    requests.memory: "80Gi"
    limits.cpu: "80"
    limits.memory: "160Gi"
    pods: "200"
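
Alongside the quota, a LimitRange gives containers that omit resources sane defaults (values are illustrative):

apiVersion: v1
kind: LimitRange
metadata:
  name: team-a-defaults
  namespace: team-a-prod
spec:
  limits:
    - type: Container
      defaultRequest:        # applied when a container sets no requests
        cpu: "100m"
        memory: "128Mi"
      default:               # applied when a container sets no limits
        cpu: "500m"
        memory: "512Mi"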

When to use: Multi-team clusters, platform teams serving app teams.


2.2 Soft Multi-Tenancy vs Hard Multi-Tenancy

  • Soft: Same cluster, tenants isolated via namespaces, quotas, network policies, RBAC. Most enterprises.
  • Hard: Separate clusters per tenant or per BU, sometimes separate accounts/subscriptions.

Rules of thumb:

  • If tenants can be semi-trusted & share infra → soft.
  • If you need strong isolation / different compliance regimes / hard security boundaries → multiple clusters.

3. Workload Admission & QoS Patterns

3.1 Enforce Requests & Limits via Policy

Use an admission policy (OPA/Gatekeeper, Kyverno, or built-in ValidatingAdmissionPolicy) to:

  • Reject Pods without resources.requests & resources.limits
  • Forbid BestEffort except for debug namespaces
  • Enforce max/min resource sizes per namespace

Pattern:

  • Default: require at least requests and limits.memory.
  • Exception: a special allow-bursty namespace (see the policy sketch below).
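
As one concrete option, a Kyverno ClusterPolicy along these lines enforces the pattern. This is a sketch; the allow-bursty namespace is the exception from the list above:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-requests-limits
spec:
  validationFailureAction: Enforce
  rules:
    - name: validate-resources
      match:
        any:
          - resources:
              kinds: ["Pod"]
      exclude:
        any:
          - resources:
              namespaces: ["allow-bursty"]
      validate:
        message: "CPU/memory requests and a memory limit are required."
        pattern:
          spec:
            containers:
              - resources:
                  requests:
                    cpu: "?*"
                    memory: "?*"
                  limits:
                    memory: "?*"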

3.2 Priority Classes for SLO Layers

Define PriorityClasses like:

  • system-critical (CNI, kube-dns)
  • platform-critical (ingress, logging, metrics)
  • business-critical (user-facing prod services)
  • batch (ETL, reports)
  • best-effort (preemptible stuff)

Example:

apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: business-critical
value: 900
globalDefault: false

Use in Pod spec:

priorityClassName: business-critical

Behavior:

  • On resource pressure, lower-priority Pods get evicted first.
  • Scheduler gives high-priority workloads first dibs on resources.

3.3 PodDisruptionBudget (PDB) + Autoscaling

Pattern:

  • For every stateful or important stateless workload, define a PDB:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-api-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels: {app: my-api}

Combine with:

  • HPA for scale-out (sketched below)
  • Cluster Autoscaler / Karpenter for node scale-out
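
Combined with the PDB above, a minimal autoscaling/v2 HPA for the same my-api Deployment might look like this; replica counts are illustrative, and minReplicas should stay above the PDB's minAvailable:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-api
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-api
  minReplicas: 3          # stays above minAvailable: 2
  maxReplicas: 12
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70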

This gives:

  • Safe rollouts
  • Safe node drain / spot preemption
  • Enough replicas for resilience

4. Topology & Failure-Domain Patterns

4.1 Spread Across Zones / Nodes

Use topology spread constraints or anti-affinity:

topologySpreadConstraints:
  - maxSkew: 1
    topologyKey: topology.kubernetes.io/zone
    whenUnsatisfiable: ScheduleAnyway
    labelSelector:
      matchLabels:
        app: my-api

Or simpler:

affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 100
        podAffinityTerm:
          topologyKey: kubernetes.io/hostname
          labelSelector:
            matchLabels:
              app: my-api

Goal: avoid all replicas landing on the same node or in the same AZ.


4.2 Zone-aware Node Pools

Per cloud:

  • Separate node pools per AZ
  • Label nodes with zone
  • Use topologySpreadConstraints to distribute workloads evenly

This prevents:

  • All traffic going through a single zone
  • A single-AZ outage taking the entire app down
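
To confirm the zone labels your spread constraints key on, list them directly; topology.kubernetes.io/zone is the standard well-known label:

kubectl get nodes -L topology.kubernetes.io/zone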

5. Security & Network Isolation Patterns

5.1 Zero-Trust-by-default NetworkPolicy

Base policy in each namespace:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress

Then explicit “allow” policies for:

  • namespace-local communication
  • calls to specific backends (DBs, APIs)
  • calls to observability stack
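
For example, namespace-local traffic can be re-opened with a policy like this (a sketch; tighten the selectors for anything more specific):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace
spec:
  podSelector: {}            # applies to all Pods in the namespace
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}    # any Pod in the same namespace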

Pattern: No ingress/egress allowed by default → everything opt-in.


5.2 Security Boundary Namespaces

For particularly sensitive apps, combine:

  • Dedicated namespace
  • Dedicated node pool (taints)
  • Strict NetworkPolicy
  • Stricter Pod Security admission at the restricted level (the PSP replacement); namespace labels sketched below
  • Separate secrets store (external KMS, Vault, AKV, etc.)
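
The Pod Security piece is just namespace labels; a sketch with an illustrative namespace name:

apiVersion: v1
kind: Namespace
metadata:
  name: payments-sensitive            # illustrative
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted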

This is a cluster-within-a-cluster pattern.


6. Multi-Cluster Patterns

6.1 Env-tier Clusters

One of the most common:

  • prod cluster(s)
  • nonprod cluster(s) (dev/uat/stage)

Sometimes:

  • prod-us, prod-eu (data residency)

Pros:

  • Strong blast-radius isolation
  • Simple mental model: “prod is sacred”

Cons:

  • More control-plane overhead
  • You need a GitOps story that understands multiple clusters (ArgoCD, Flux).

6.2 Function-based Clusters

Patterns like:

  • core-platform cluster (ingress, observability, shared platform services)
  • app-tenant clusters for main product lines
  • data cluster for Kafka/Spark/Cassandra

This is helpful if:

  • Data-plane loads are wildly different than API-plane loads
  • Observability stack is heavy and you want to isolate it

7. Putting It Together – Example Design

Here’s a concrete cluster design pattern you can adapt:

Clusters

  • corp-nonprod
  • corp-prod

Node Pools in each cluster

  • system (small, stable, for CNI/CSI/monitoring)
  • general (default microservice nodes, e.g. Azure D/E-series, AWS m6i, GCP n2)
  • perf (CPUManager+TopologyManager, latency/cpu-critical)
  • batch (cheaper, spot, larger nodes)
  • db (memory-heavy, local SSD, tainted)

Namespaces

  • platform-system (CNI, CSI, logging, metrics, ingress)
  • platform-observability (Prometheus, Loki, Tempo, etc.)
  • team-a-dev, team-a-prod
  • team-b-dev, team-b-prod
  • shared-services (auth, messaging, etc.)

Controls

  • ResourceQuota + LimitRange per team namespace
  • NetworkPolicy default-deny per namespace
  • PriorityClasses:

    • system-critical
    • platform-critical
    • business-critical
    • batch-low

Scheduling hints

  • Platform & observability → system & general pools
  • Latency-critical apps → perf pool (Guaranteed, pinned CPUs)
  • Spark jobs → batch pool (spot, large nodes, local SSD)
  • Redis/DB → db pool (memory-heavy, local SSD)

8. Quick design checklist

When you design or refactor a cluster, ask:

  1. Do I have at least two node pools? (general + something else)
  2. Are system components isolated or competing with apps?
  3. Do teams have clear namespace boundaries, quotas, and limits?
  4. Are BestEffort workloads controlled or confined?
  5. Do I have PriorityClasses & PDBs for production services?
  6. Are workloads spread across zones and nodes?
  7. Do sensitive workloads have network & node isolation?
  8. Do I need multiple clusters for prod vs nonprod or for legal isolation?

If the answer to most of these is “yes”, you’re in serious platform-engineering territory already.
