---
name: kubernetes-orchestration
description: Comprehensive guide to Kubernetes container orchestration, covering workloads, networking, storage, security, and production operations
version: 1.0.0
category: infrastructure
tags:
- kubernetes
- k8s
- containers
- orchestration
- cloud-native
- deployment
- microservices
- devops
- infrastructure
- container-management
prerequisites:
- Basic understanding of containerization (Docker)
- Familiarity with YAML syntax
- Command-line interface experience
- Basic networking concepts
- Linux fundamentals
---
# Kubernetes Orchestration Skill
## Table of Contents
1. [Introduction](#introduction)
2. [Core Concepts](#core-concepts)
3. [Workloads](#workloads)
4. [Services and Networking](#services-and-networking)
5. [Ingress Controllers](#ingress-controllers)
6. [Configuration Management](#configuration-management)
7. [Storage](#storage)
8. [Namespaces and Resource Isolation](#namespaces-and-resource-isolation)
9. [Security and RBAC](#security-and-rbac)
10. [Autoscaling](#autoscaling)
11. [Monitoring and Observability](#monitoring-and-observability)
12. [Logging](#logging)
13. [Production Operations](#production-operations)
14. [Troubleshooting](#troubleshooting)
## Introduction
Kubernetes is an open-source container orchestration platform that automates the deployment, scaling, and management of containerized applications. It provides a robust framework for running distributed systems resiliently, handling scaling and failover for your applications, and providing deployment patterns.
### Key Benefits
- **Service Discovery and Load Balancing**: Automatic DNS and load balancing for containers
- **Storage Orchestration**: Mount storage systems from local, cloud, or network storage
- **Automated Rollouts and Rollbacks**: Declarative deployment with health monitoring
- **Automatic Bin Packing**: Optimal placement of containers based on resource requirements
- **Self-Healing**: Automatic restart, replacement, and rescheduling of failed containers
- **Secret and Configuration Management**: Store and manage sensitive information securely
- **Horizontal Scaling**: Add or remove replicas manually or automatically based on load
- **Batch Execution**: Manage batch and CI workloads
## Core Concepts
### Cluster Architecture
A Kubernetes cluster consists of:
**Control Plane Components:**
- **kube-apiserver**: The API server is the front end for the Kubernetes control plane
- **etcd**: Consistent and highly-available key-value store for all cluster data
- **kube-scheduler**: Watches for newly created Pods and assigns them to nodes
- **kube-controller-manager**: Runs controller processes
- **cloud-controller-manager**: Integrates with cloud provider APIs
**Node Components:**
- **kubelet**: Agent that runs on each node and ensures containers are running
- **kube-proxy**: Network proxy maintaining network rules on nodes
- **container runtime**: Software responsible for running containers (containerd, CRI-O)
### Objects and Specifications
Kubernetes objects are persistent entities representing the state of your cluster. Every object includes:
- **metadata**: Data about the object (name, namespace, labels, annotations)
- **spec**: The desired state
- **status**: The current state (managed by Kubernetes)
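As a rough illustration of how these three sections fit together, the sketch below annotates a Deployment. The name `example-app` and the values shown under `status` are hypothetical. `status` is written by Kubernetes and is not part of a manifest you apply, so it appears only as a comment.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:                  # identifies the object
  name: example-app        # hypothetical name
  namespace: default
  labels:
    app: example-app
spec:                      # desired state, written by you
  replicas: 3
  selector:
    matchLabels:
      app: example-app
  template:
    metadata:
      labels:
        app: example-app
    spec:
      containers:
      - name: web
        image: nginx:1.21
# status is reported by the control plane, for example:
# status:
#   replicas: 3
#   readyReplicas: 3
```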
## Workloads
### Pods
Pods are the smallest deployable units in Kubernetes, representing one or more containers that share storage and network resources.
**Basic Pod Example:**
```yaml
apiVersion: v1
kind: Pod
metadata:
name: nginx-pod
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.21
ports:
- containerPort: 80
```
**Multi-Container Pod:**
```yaml
apiVersion: v1
kind: Pod
metadata:
name: multi-container-pod
spec:
containers:
- name: nginx
image: nginx:1.21
ports:
- containerPort: 80
volumeMounts:
- name: shared-data
mountPath: /usr/share/nginx/html
- name: sidecar
image: busybox
command: ['sh', '-c', 'while true; do echo "$(date)" > /pod-data/index.html; sleep 30; done']
volumeMounts:
- name: shared-data
mountPath: /pod-data
volumes:
- name: shared-data
emptyDir: {}
```
**Pod with Init Container:**
```yaml
apiVersion: v1
kind: Pod
metadata:
name: init-demo
spec:
initContainers:
- name: install
image: busybox:1.28
command:
- wget
- "-O"
- "/work-dir/index.html"
- http://info.cern.ch
volumeMounts:
- name: workdir
mountPath: "/work-dir"
containers:
- name: nginx
image: nginx:1.21
ports:
- containerPort: 80
volumeMounts:
- name: workdir
mountPath: /usr/share/nginx/html
volumes:
- name: workdir
emptyDir: {}
```
**Pod with Security Context:**
```yaml
apiVersion: v1
kind: Pod
metadata:
name: security-context-demo
spec:
securityContext:
runAsUser: 1000
runAsGroup: 3000
fsGroup: 2000
containers:
- name: sec-ctx-container
image: busybox
command: [ "sh", "-c", "sleep 1h" ]
securityContext:
allowPrivilegeEscalation: false
capabilities:
drop:
- NET_RAW
- ALL
runAsNonRoot: true
seccompProfile:
type: RuntimeDefault
```
**Pod with Resource Limits and Requests:**
```yaml
apiVersion: v1
kind: Pod
metadata:
name: resource-demo
spec:
containers:
- name: app
image: nginx:1.21
resources:
requests:
memory: "64Mi"
cpu: "250m"
limits:
memory: "128Mi"
cpu: "500m"
```
**Pod with Probes:**
```yaml
apiVersion: v1
kind: Pod
metadata:
name: probe-demo
spec:
containers:
- name: app
image: nginx:1.21
ports:
- containerPort: 80
livenessProbe:
httpGet:
path: /healthz
port: 80
initialDelaySeconds: 3
periodSeconds: 10
timeoutSeconds: 1
failureThreshold: 3
readinessProbe:
httpGet:
path: /ready
port: 80
initialDelaySeconds: 5
periodSeconds: 5
successThreshold: 1
startupProbe:
httpGet:
path: /startup
port: 80
initialDelaySeconds: 0
periodSeconds: 10
failureThreshold: 30
```
### Deployments
Deployments provide declarative updates for Pods and ReplicaSets, enabling rolling updates and rollbacks.
**Basic Deployment:**
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: nginx-deployment
labels:
app: nginx
spec:
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: nginx:1.21
ports:
- containerPort: 80
```
**Deployment with Rolling Update Strategy:**
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: rolling-update-deployment
spec:
replicas: 5
strategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 1
maxUnavailable: 1
selector:
matchLabels:
app: myapp
template:
metadata:
labels:
app: myapp
spec:
containers:
- name: myapp
image: myapp:v2
ports:
- containerPort: 8080
resources:
requests:
cpu: 100m
memory: 128Mi
limits:
cpu: 500m
memory: 512Mi
```
**Deployment with Recreate Strategy:**
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
name: recreate-deployment
spec:
replicas: 3
strategy:
type: Recreate
selector:
matchLabels:
app: database-migration
template:
metadata:
labels:
app: database-migration
spec:
containers:
- name: migrator
image: migrator:v1
```
**Blue-Green Deployment Pattern:**
```yaml
# Blue Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp-blue
spec:
replicas: 3
selector:
matchLabels:
app: myapp
version: blue
template:
metadata:
labels:
app: myapp
version: blue
spec:
containers:
- name: myapp
image: myapp:v1.0
ports:
- containerPort: 8080
---
# Green Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
name: myapp-green
spec:
replicas: 3
selector:
matchLabels:
app: myapp
version: green
template:
metadata:
labels:
app: myapp
version: green
spec:
containers:
- name: myapp
image: myapp:v2.0
ports:
- containerPort: 8080
```
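The two Deployments above only run the versions side by side. The traffic switch is usually done by a Service whose selector points at one colour. A minimal sketch, assuming a Service named `myapp` (the Service name is not from the original):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp            # hypothetical Service name
spec:
  selector:
    app: myapp
    version: blue        # change to "green" to cut traffic over to v2.0
  ports:
  - protocol: TCP
    port: 80
    targetPort: 8080     # matches containerPort in both Deployments
```

Changing `version` back to `blue` rolls traffic back without touching either Deployment.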
### StatefulSets
StatefulSets manage stateful applications requiring stable network identities and persistent storage.
**Basic StatefulSet with Headless Service:**
```yaml
apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: nginx
spec:
ports:
- port: 80
name: web
clusterIP: None
selector:
app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: web
spec:
serviceName: "nginx"
replicas: 3
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: registry.k8s.io/nginx-slim:0.21
ports:
- containerPort: 80
name: web
volumeMounts:
- name: www
mountPath: /usr/share/nginx/html
volumeClaimTemplates:
- metadata:
name: www
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 1Gi
```
**StatefulSet with Parallel Pod Management:**
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: web-parallel
spec:
serviceName: "nginx"
podManagementPolicy: "Parallel"
replicas: 5
selector:
matchLabels:
app: nginx
template:
metadata:
labels:
app: nginx
spec:
containers:
- name: nginx
image: registry.k8s.io/nginx-slim:0.24
ports:
- containerPort: 80
name: web
volumeMounts:
- name: www
mountPath: /usr/share/nginx/html
volumeClaimTemplates:
- metadata:
name: www
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 1Gi
```
**StatefulSet for Database (MySQL):**
```yaml
apiVersion: v1
kind: Service
metadata:
name: mysql-headless
spec:
ports:
- port: 3306
name: mysql
clusterIP: None
selector:
app: mysql
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: mysql
spec:
serviceName: mysql-headless
replicas: 3
selector:
matchLabels:
app: mysql
template:
metadata:
labels:
app: mysql
spec:
containers:
- name: mysql
image: mysql:8.0
ports:
- containerPort: 3306
name: mysql
env:
- name: MYSQL_ROOT_PASSWORD
valueFrom:
secretKeyRef:
name: mysql-secret
key: root-password
volumeMounts:
- name: data
mountPath: /var/lib/mysql
volumeClaimTemplates:
- metadata:
name: data
spec:
accessModes: [ "ReadWriteOnce" ]
resources:
requests:
storage: 10Gi
```
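The StatefulSet above reads its root password from a Secret named `mysql-secret` with the key `root-password`, which is not defined elsewhere in this guide. A minimal sketch of that Secret follows; the password value is a placeholder. Note also that the three replicas are independent MySQL instances, each with its own volume, since no replication is configured here.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: mysql-secret
type: Opaque
stringData:
  # Placeholder value; supply the real password from a secure source
  root-password: "change-me"
```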
### DaemonSets
DaemonSets ensure that all or specific nodes run a copy of a Pod, ideal for logging, monitoring, and cluster storage.
**Logging DaemonSet:**
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
      tolerations:
      # Allow scheduling on control-plane nodes. Current clusters use the
      # control-plane taint; the master key is kept for older clusters.
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule
      containers:
      - name: fluentd-elasticsearch
        image: quay.io/fluentd_elasticsearch/fluentd:v2.5.2
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
```
**Monitoring DaemonSet (Node Exporter):**
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
name: node-exporter
namespace: monitoring
spec:
selector:
matchLabels:
app: node-exporter
template:
metadata:
labels:
app: node-exporter
spec:
hostNetwork: true
hostPID: true
containers:
- name: node-exporter
image: prom/node-exporter:v1.3.1
args:
- --path.procfs=/host/proc
- --path.sysfs=/host/sys
- --path.rootfs=/host/root
- --collector.filesystem.ignored-mount-points=^/(dev|proc|sys|var/lib/docker/.+)($|/)
ports:
- containerPort: 9100
hostPort: 9100
name: metrics
volumeMounts:
- name: proc
mountPath: /host/proc
readOnly: true
- name: sys
mountPath: /host/sys
readOnly: true
- name: root
mountPath: /host/root
readOnly: true
volumes:
- name: proc
hostPath:
path: /proc
- name: sys
hostPath:
path: /sys
- name: root
hostPath:
path: /
```
### Jobs
Jobs create one or more Pods and ensure a specified number successfully complete.
**Basic Job:**
```yaml
apiVersion: batch/v1
kind: Job
metadata:
name: pi
spec:
template:
spec:
containers:
- name: pi
image: perl:5.34
command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
restartPolicy: Never
backoffLimit: 4
```
**Parallel Job with Fixed Completions:**
```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: parallel-job
spec:
  completions: 8
  parallelism: 2
  # Indexed mode is what makes JOB_COMPLETION_INDEX (0 to 7) available in each Pod
  completionMode: Indexed
  template:
    spec:
      containers:
      - name: worker
        image: busybox
        command: ["sh", "-c", "echo Processing item $JOB_COMPLETION_INDEX && sleep 5"]
      restartPolicy: Never
  backoffLimit: 3
```
**Job with TTL After Finished:**
```yaml
apiVersion: batch/v1
kind: Job
metadata:
name: ttl-job
spec:
ttlSecondsAfterFinished: 100
template:
spec:
containers:
- name: cleaner
image: busybox
command: ["sh", "-c", "echo Cleaning up && sleep 10"]
restartPolicy: Never
```
### CronJobs
CronJobs create Jobs on a repeating schedule.
**Basic CronJob:**
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
name: hello
spec:
schedule: "*/5 * * * *"
jobTemplate:
spec:
template:
spec:
containers:
- name: hello
image: busybox
command:
- /bin/sh
- -c
- date; echo Hello from the Kubernetes cluster
restartPolicy: OnFailure
```
**Backup CronJob:**
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
name: database-backup
spec:
schedule: "0 2 * * *"
successfulJobsHistoryLimit: 3
failedJobsHistoryLimit: 1
concurrencyPolicy: Forbid
jobTemplate:
spec:
template:
spec:
containers:
- name: backup
image: postgres:14
command:
- /bin/sh
- -c
- pg_dump -h $DB_HOST -U $DB_USER $DB_NAME | gzip > /backup/db-$(date +%Y%m%d-%H%M%S).sql.gz
env:
- name: DB_HOST
value: postgres-service
- name: DB_USER
valueFrom:
secretKeyRef:
name: db-credentials
key: username
- name: PGPASSWORD
valueFrom:
secretKeyRef:
name: db-credentials
key: password
- name: DB_NAME
value: mydb
volumeMounts:
- name: backup-storage
mountPath: /backup
restartPolicy: OnFailure
volumes:
- name: backup-storage
persistentVolumeClaim:
claimName: backup-pvc
```
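The backup CronJob mounts a claim named `backup-pvc` that is not defined in this guide. A minimal sketch of such a claim is shown here; the size and storage class are assumptions and should be adjusted to the cluster.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: backup-pvc
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi              # assumed size
  storageClassName: standard     # assumed StorageClass name
```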
**Report Generation CronJob:**
```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
name: daily-report
spec:
schedule: "0 8 * * 1-5"
timeZone: "America/New_York"
startingDeadlineSeconds: 3600
concurrencyPolicy: Replace
jobTemplate:
spec:
template:
spec:
containers:
- name: report-generator
image: report-app:v1
command:
- python
- generate_report.py
- --format=pdf
- [email protected]
restartPolicy: OnFailure
```
## Services and Networking
### ClusterIP Service
Default service type providing internal cluster communication.
```yaml
apiVersion: v1
kind: Service
metadata:
name: backend-service
spec:
type: ClusterIP
selector:
app: backend
ports:
- protocol: TCP
port: 80
targetPort: 8080
```
**Service with Multiple Ports:**
```yaml
apiVersion: v1
kind: Service
metadata:
name: multi-port-service
spec:
selector:
app: myapp
ports:
- name: http
protocol: TCP
port: 80
targetPort: 8080
- name: https
protocol: TCP
port: 443
targetPort: 8443
- name: metrics
protocol: TCP
port: 9090
targetPort: 9090
```
**Headless Service:**
```yaml
apiVersion: v1
kind: Service
metadata:
name: stateful-service
spec:
clusterIP: None
selector:
app: stateful-app
ports:
- port: 80
targetPort: 8080
```
### NodePort Service
Exposes the service on each node's IP at a static port.
```yaml
apiVersion: v1
kind: Service
metadata:
name: nodeport-service
spec:
type: NodePort
selector:
app: frontend
ports:
- protocol: TCP
port: 80
targetPort: 8080
nodePort: 30080
```
### LoadBalancer Service
Creates an external load balancer in cloud environments.
```yaml
apiVersion: v1
kind: Service
metadata:
name: loadbalancer-service
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
spec:
type: LoadBalancer
selector:
app: web
ports:
- protocol: TCP
port: 80
targetPort: 8080
loadBalancerSourceRanges:
- "10.0.0.0/8"
- "172.16.0.0/12"
```
### ExternalName Service
Maps a service to an external DNS name.
```yaml
apiVersion: v1
kind: Service
metadata:
name: external-database
spec:
type: ExternalName
externalName: database.example.com
```
### Service with Session Affinity
```yaml
apiVersion: v1
kind: Service
metadata:
name: sticky-service
spec:
selector:
app: myapp
ports:
- protocol: TCP
port: 80
targetPort: 8080
sessionAffinity: ClientIP
sessionAffinityConfig:
clientIP:
timeoutSeconds: 10800
```
## Ingress Controllers
Ingress manages external access to services, typically HTTP/HTTPS, providing load balancing, SSL termination, and name-based virtual hosting.
### Basic Ingress
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: basic-ingress
spec:
ingressClassName: nginx
rules:
- host: myapp.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: myapp-service
port:
number: 80
```
### Ingress with TLS
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: tls-ingress
spec:
ingressClassName: nginx
tls:
- hosts:
- myapp.example.com
secretName: myapp-tls-secret
rules:
- host: myapp.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: myapp-service
port:
number: 80
```
### Path-Based Routing
```yaml
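apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: path-based-ingress
  annotations:
    # Capture the rest of the path so /api/users reaches the backend as /users.
    # A bare rewrite-target of "/" would send every request to "/".
    nginx.ingress.kubernetes.io/use-regex: "true"
    nginx.ingress.kubernetes.io/rewrite-target: /$2
spec:
  ingressClassName: nginx
  rules:
  - host: example.com
    http:
      paths:
      - path: /api(/|$)(.*)
        pathType: ImplementationSpecific
        backend:
          service:
            name: api-service
            port:
              number: 8080
      - path: /web(/|$)(.*)
        pathType: ImplementationSpecific
        backend:
          service:
            name: web-service
            port:
              number: 80
      - path: /admin(/|$)(.*)
        pathType: ImplementationSpecific
        backend:
          service:
            name: admin-service
            port:
              number: 3000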
```
### Multi-Host Ingress
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: multi-host-ingress
spec:
ingressClassName: nginx
tls:
- hosts:
- app1.example.com
- app2.example.com
secretName: multi-tls-secret
rules:
- host: app1.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: app1-service
port:
number: 80
- host: app2.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: app2-service
port:
number: 80
```
### Ingress with Authentication
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: auth-ingress
annotations:
nginx.ingress.kubernetes.io/auth-type: basic
nginx.ingress.kubernetes.io/auth-secret: basic-auth
nginx.ingress.kubernetes.io/auth-realm: 'Authentication Required'
spec:
ingressClassName: nginx
rules:
- host: secure.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: secure-service
port:
number: 80
```
### Ingress with Rate Limiting
```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
name: rate-limit-ingress
annotations:
nginx.ingress.kubernetes.io/limit-rps: "10"
nginx.ingress.kubernetes.io/limit-connections: "5"
spec:
ingressClassName: nginx
rules:
- host: api.example.com
http:
paths:
- path: /
pathType: Prefix
backend:
service:
name: api-service
port:
number: 8080
```
## Configuration Management
### ConfigMaps
ConfigMaps store non-confidential data in key-value pairs.
**ConfigMap from Literals:**
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: app-config
data:
database_host: "postgres.default.svc.cluster.local"
database_port: "5432"
log_level: "INFO"
feature_flags: |
feature1=enabled
feature2=disabled
feature3=enabled
```
**ConfigMap with File Content:**
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: nginx-config
data:
nginx.conf: |
events {
worker_connections 1024;
}
http {
server {
listen 80;
location / {
root /usr/share/nginx/html;
index index.html;
}
}
}
```
**Using ConfigMap as Environment Variables:**
```yaml
apiVersion: v1
kind: Pod
metadata:
name: config-env-pod
spec:
containers:
- name: app
image: myapp:v1
envFrom:
- configMapRef:
name: app-config
env:
- name: SPECIFIC_CONFIG
valueFrom:
configMapKeyRef:
name: app-config
key: log_level
```
**Using ConfigMap as Volume:**
```yaml
apiVersion: v1
kind: Pod
metadata:
name: config-volume-pod
spec:
containers:
- name: nginx
image: nginx:1.21
volumeMounts:
- name: config-volume
mountPath: /etc/nginx/nginx.conf
subPath: nginx.conf
volumes:
- name: config-volume
configMap:
name: nginx-config
```
### Secrets
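Secrets store sensitive information such as passwords, tokens, and keys. Values under `data` are base64-encoded, not encrypted, so access to Secrets should be restricted with RBAC.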
**Opaque Secret:**
```yaml
apiVersion: v1
kind: Secret
metadata:
name: db-credentials
type: Opaque
data:
username: YWRtaW4= # base64 encoded "admin"
password: cGFzc3dvcmQxMjM= # base64 encoded "password123"
stringData:
connection-string: "postgresql://admin:password123@postgres:5432/mydb"
```
**TLS Secret:**
```yaml
apiVersion: v1
kind: Secret
metadata:
name: tls-secret
type: kubernetes.io/tls
data:
tls.crt: LS0tLS1CRUdJTi... # base64 encoded certificate
tls.key: LS0tLS1CRUdJTi... # base64 encoded private key
```
**Docker Registry Secret:**
```yaml
apiVersion: v1
kind: Secret
metadata:
name: registry-credentials
type: kubernetes.io/dockerconfigjson
data:
.dockerconfigjson: eyJhdXRocyI6eyJodHRwczovL2luZGV4...
```
**Using Secrets as Environment Variables:**
```yaml
apiVersion: v1
kind: Pod
metadata:
name: secret-env-pod
spec:
containers:
- name: app
image: myapp:v1
env:
- name: DB_USERNAME
valueFrom:
secretKeyRef:
name: db-credentials
key: username
- name: DB_PASSWORD
valueFrom:
secretKeyRef:
name: db-credentials
key: password
```
**Using Secrets as Volume:**
```yaml
apiVersion: v1
kind: Pod
metadata:
name: secret-volume-pod
spec:
containers:
- name: app
image: myapp:v1
volumeMounts:
- name: secret-volume
mountPath: /etc/secrets
readOnly: true
volumes:
- name: secret-volume
secret:
secretName: db-credentials
```
**Pod with Service Account and Secrets:**
```yaml
apiVersion: v1
kind: Pod
metadata:
name: prod-db-client-pod
labels:
name: prod-db-client
spec:
serviceAccountName: prod-db-client
containers:
- name: db-client-container
image: postgres:14
env:
- name: DATABASE_URL
valueFrom:
secretKeyRef:
name: db-credentials
key: connection-string
```
## Storage
### PersistentVolumes (PV)
PersistentVolumes are cluster-level storage resources.
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
name: pv-example
spec:
capacity:
storage: 10Gi
accessModes:
- ReadWriteOnce
persistentVolumeReclaimPolicy: Retain
storageClassName: standard
hostPath:
path: /mnt/data
```
**NFS PersistentVolume:**
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
name: nfs-pv
spec:
capacity:
storage: 100Gi
accessModes:
- ReadWriteMany
nfs:
server: nfs-server.example.com
path: /exports/data
persistentVolumeReclaimPolicy: Retain
storageClassName: nfs
```
### PersistentVolumeClaims (PVC)
PVCs request storage from PersistentVolumes.
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: mysql-pvc
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 20Gi
storageClassName: standard
```
**PVC with Selector:**
```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: selective-pvc
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 10Gi
storageClassName: fast-ssd
selector:
matchLabels:
environment: production
tier: database
```
### StorageClass
StorageClasses define different classes of storage.
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: fast-ssd
provisioner: ebs.csi.aws.com # gp3 with iops and throughput requires the AWS EBS CSI driver
parameters:
type: gp3
iops: "3000"
throughput: "125"
encrypted: "true"
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: true
```
**Azure StorageClass:**
```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: azure-premium
provisioner: kubernetes.io/azure-disk
parameters:
storageaccounttype: Premium_LRS
kind: Managed
reclaimPolicy: Delete
allowVolumeExpansion: true
```
**Using PVC in Pod:**
```yaml
apiVersion: v1
kind: Pod
metadata:
name: pvc-pod
spec:
containers:
- name: app
image: nginx:1.21
volumeMounts:
- name: data
mountPath: /usr/share/nginx/html
volumes:
- name: data
persistentVolumeClaim:
claimName: mysql-pvc
```
## Namespaces and Resource Isolation
### Creating Namespaces
```yaml
apiVersion: v1
kind: Namespace
metadata:
name: development
labels:
environment: dev
team: engineering
```
**Namespace with Annotations:**
```yaml
apiVersion: v1
kind: Namespace
metadata:
name: production
labels:
environment: prod
compliance: required
annotations:
owner: "[email protected]"
cost-center: "12345"
```
### ResourceQuota
ResourceQuotas limit resource consumption in a namespace.
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
name: compute-quota
namespace: development
spec:
hard:
requests.cpu: "10"
requests.memory: 20Gi
limits.cpu: "20"
limits.memory: 40Gi
persistentvolumeclaims: "10"
pods: "50"
```
**Object Count Quota:**
```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
name: object-quota
namespace: development
spec:
hard:
configmaps: "10"
secrets: "10"
services: "10"
services.loadbalancers: "2"
services.nodeports: "5"
```
### LimitRange
LimitRanges set default resource requests and limits, and enforce minimum and maximum values, for containers and PersistentVolumeClaims in a namespace.
```yaml
apiVersion: v1
kind: LimitRange
metadata:
name: resource-limits
namespace: development
spec:
limits:
- max:
cpu: "2"
memory: 4Gi
min:
cpu: 100m
memory: 128Mi
default:
cpu: 500m
memory: 512Mi
defaultRequest:
cpu: 200m
memory: 256Mi
type: Container
- max:
storage: 10Gi
min:
storage: 1Gi
type: PersistentVolumeClaim
```
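To illustrate the defaulting behaviour, here is a sketch of a Pod created in the `development` namespace without a `resources` block. The Pod name is hypothetical, and the comments describe what the LimitRange above would be expected to inject at admission time.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: defaults-demo        # hypothetical name
  namespace: development
spec:
  containers:
  - name: app
    image: nginx:1.21
    # No resources block here. With the LimitRange above in place,
    # the container is admitted with:
    #   requests: cpu 200m, memory 256Mi   (from defaultRequest)
    #   limits:   cpu 500m, memory 512Mi   (from default)
```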
## Security and RBAC
### ServiceAccounts
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: app-service-account
namespace: default
```
**ServiceAccount with Image Pull Secrets:**
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: build-robot
namespace: default
imagePullSecrets:
- name: registry-credentials
```
### Roles and RoleBindings
**Role (Namespace-scoped):**
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
namespace: development
name: pod-reader
rules:
- apiGroups: [""]
resources: ["pods"]
verbs: ["get", "watch", "list"]
- apiGroups: [""]
resources: ["pods/log"]
verbs: ["get"]
```
**RoleBinding:**
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: read-pods
namespace: development
subjects:
- kind: User
name: jane
apiGroup: rbac.authorization.k8s.io
- kind: ServiceAccount
name: app-service-account
namespace: development
roleRef:
kind: Role
name: pod-reader
apiGroup: rbac.authorization.k8s.io
```
### ClusterRole and ClusterRoleBinding
**ClusterRole (Cluster-scoped):**
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: cluster-admin-role
rules:
- apiGroups: [""]
resources: ["nodes"]
verbs: ["get", "list", "watch"]
- apiGroups: ["apps"]
resources: ["deployments", "statefulsets", "daemonsets"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: [""]
resources: ["namespaces"]
verbs: ["get", "list"]
```
**ClusterRoleBinding:**
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: cluster-admin-binding
subjects:
- kind: User
name: admin-user
apiGroup: rbac.authorization.k8s.io
- kind: Group
name: system:masters
apiGroup: rbac.authorization.k8s.io
roleRef:
kind: ClusterRole
name: cluster-admin-role
apiGroup: rbac.authorization.k8s.io
```
**Developer Role:**
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
namespace: development
name: developer
rules:
- apiGroups: ["", "apps", "batch"]
resources: ["pods", "deployments", "services", "configmaps", "secrets", "jobs", "cronjobs"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: [""]
resources: ["pods/log", "pods/exec"]
verbs: ["get", "create"]
```
### NetworkPolicy
**Deny All Ingress:**
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: deny-all-ingress
namespace: production
spec:
podSelector: {}
policyTypes:
- Ingress
```
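A stricter baseline denies egress as well. The sketch below extends the policy above to both directions; the policy name is an assumption. NetworkPolicies only take effect when the cluster's network plugin enforces them.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all     # hypothetical name
  namespace: production
spec:
  podSelector: {}            # selects every Pod in the namespace
  policyTypes:
  - Ingress
  - Egress
```

With this in place, every allowed flow, including DNS lookups, needs its own explicit allow policy.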
**Allow Specific Ingress:**
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-frontend
namespace: production
spec:
podSelector:
matchLabels:
app: backend
policyTypes:
- Ingress
ingress:
- from:
- podSelector:
matchLabels:
app: frontend
ports:
- protocol: TCP
port: 8080
```
**Allow from Specific Namespace:**
```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
name: allow-from-namespace
spec:
podSelector:
matchLabels:
app: myapp
policyTypes:
- Ingress
ingress:
- from:
- namespaceSelector:
matchLabels:
environment: production
podSelector:
matchLabels:
role: client
```
**Egress Network Policy:**
```yaml
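apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
spec:
  podSelector:
    matchLabels:
      app: myapp
  policyTypes:
  - Egress
  egress:
  - to:
    - namespaceSelector:
        matchLabels:
          # Label set automatically on every namespace in current Kubernetes versions
          kubernetes.io/metadata.name: kube-system
    ports:
    - protocol: UDP
      port: 53
    # DNS falls back to TCP for large responses
    - protocol: TCP
      port: 53
  - to:
    - podSelector:
        matchLabels:
          app: database
    ports:
    - protocol: TCP
      port: 5432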
```
### PodSecurityPolicy
PodSecurityPolicy was deprecated in Kubernetes 1.21 and removed in 1.25. The manifest below applies only to older clusters.
```yaml
apiVersion: policy/v1beta1
kind: PodSecurityPolicy
metadata:
name: restricted
spec:
privileged: false
allowPrivilegeEscalation: false
requiredDropCapabilities:
- ALL
volumes:
- 'configMap'
- 'emptyDir'
- 'projected'
- 'secret'
- 'downwardAPI'
- 'persistentVolumeClaim'
hostNetwork: false
hostIPC: false
hostPID: false
runAsUser:
rule: MustRunAsNonRoot
seLinux:
rule: RunAsAny
fsGroup:
rule: RunAsAny
readOnlyRootFilesystem: false
```
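On Kubernetes 1.25 and later, broadly similar restrictions can be expressed with Pod Security Admission labels on a namespace. The sketch below assumes the `production` namespace used elsewhere in this guide and applies the built-in `restricted` profile. It is not a one-to-one translation of the policy above.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    # Reject Pods that violate the restricted profile
    pod-security.kubernetes.io/enforce: restricted
    # Also record violations in audit logs and return warnings to clients
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
```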
## Autoscaling
### Horizontal Pod Autoscaler (HPA)
HPA automatically scales the number of Pods based on observed metrics.
**CPU-based HPA:**
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: webapp-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: webapp
minReplicas: 2
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
```
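A `Utilization` target is computed as a percentage of each container's CPU request, so the target Deployment needs requests set or the HPA cannot calculate a value. A minimal sketch of a matching `webapp` Deployment follows; the image and the request size are assumptions.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webapp
spec:
  # replicas is omitted so the HPA owns the replica count
  selector:
    matchLabels:
      app: webapp
  template:
    metadata:
      labels:
        app: webapp
    spec:
      containers:
      - name: webapp
        image: nginx:1.21      # placeholder image
        resources:
          requests:
            cpu: 200m          # a 70% target means scaling out above about 140m average usage
```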
**Memory-based HPA:**
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: memory-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: memory-intensive-app
minReplicas: 3
maxReplicas: 15
metrics:
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 80
```
**Multi-Metric HPA:**
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: multi-metric-hpa
spec:
scaleTargetRef:
apiVersion: apps/v1
kind: Deployment
name: api-server
minReplicas: 2
maxReplicas: 20
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 60
- type: Resource
resource:
name: memory
target:
type: Utilization
averageUtilization: 70
- type: Pods
pods:
metric:
name: http_requests_per_second
target:
type: AverageValue
averageValue: "1000"
behavior:
scaleDown:
stabilizationWindowSeconds: 300
policies:
- type: Percent
value: 50
periodSeconds: 60
scaleUp:
stabilizationWindowSeconds: 0
policies:
- type: Percent
value: 100
periodSeconds: 30
- type: Pods
value: 4
periodSeconds: 30
selectPolicy: Max
```
### Vertical Pod Autoscaler (VPA)
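VPA automatically adjusts CPU and memory requests, scaling limits proportionally. It is not part of core Kubernetes: the VerticalPodAutoscaler CRD and its controllers must be installed separately.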
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
name: app-vpa
spec:
targetRef:
apiVersion: apps/v1
kind: Deployment
name: myapp
updatePolicy:
updateMode: "Auto"
resourcePolicy:
containerPolicies:
- containerName: app
minAllowed:
cpu: 100m
memory: 128Mi
maxAllowed:
cpu: 2
memory: 4Gi
controlledResources: ["cpu", "memory"]
```
### Cluster Autoscaler
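Cluster Autoscaler adjusts the number of nodes in the cluster. The ConfigMap below configures its priority expander, which applies when the autoscaler runs with `--expander=priority`. Node groups matching a higher number are preferred, so on-demand groups are chosen before spot groups here.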
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
name: cluster-autoscaler-priority-expander
namespace: kube-system
data:
priorities: |
10:
- .*-spot-.*
50:
- .*-ondemand-.*
```
## Monitoring and Observability
### Metrics Server
Metrics Server provides resource usage metrics.
```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
name: metrics-server
namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: metrics-server
namespace: kube-system
spec:
selector:
matchLabels:
k8s-app: metrics-server
template:
metadata:
labels:
k8s-app: metrics-server
spec:
serviceAccountName: metrics-server
containers:
- name: metrics-server
image: registry.k8s.io/metrics-server/metrics-server:v0.6.1
args:
- --cert-dir=/tmp
- --secure-port=4443
- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
- --kubelet-use-node-status-port
ports:
- name: https
containerPort: 4443
protocol: TCP
```
### Prometheus
**Prometheus Deployment:**
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prometheus
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prometheus
  template:
    metadata:
      labels:
        app: prometheus
    spec:
      serviceAccountName: prometheus
      containers:
      - name: prometheus
        image: prom/prometheus:v2.40.0
        args:
        - '--config.file=/etc/prometheus/prometheus.yml'
        - '--storage.tsdb.path=/prometheus'
        - '--storage.tsdb.retention.time=15d'
        ports:
        - containerPort: 9090
        volumeMounts:
        - name: config-volume
          mountPath: /etc/prometheus
        - name: storage-volume
          mountPath: /prometheus
      volumes:
      - name: config-volume
        configMap:
          name: prometheus-config
      - name: storage-volume
        persistentVolumeClaim:
          claimName: prometheus-pvc
```
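The Deployment above mounts a ConfigMap named `prometheus-config` and a claim named `prometheus-pvc`, neither of which is defined in this section. A minimal sketch of both might look like the following; the scrape job, the annotation-based filter, and the 20Gi size are assumptions, and the `prometheus` service account would also need RBAC permission to list and watch pods for the discovery to work.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-config
  namespace: monitoring
data:
  prometheus.yml: |
    global:
      scrape_interval: 30s
    scrape_configs:
    - job_name: kubernetes-pods
      kubernetes_sd_configs:
      - role: pod
      relabel_configs:
      # Keep only pods annotated with prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: prometheus-pvc
  namespace: monitoring
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 20Gi   # assumed size; tune to your retention window
```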
**ServiceMonitor for Prometheus Operator:**
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: app-metrics
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: myapp
  endpoints:
  - port: metrics
    interval: 30s
    path: /metrics
```
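A ServiceMonitor selects Services by label and refers to a Service port by name, so the target Service has to carry both. The sketch below shows a matching Service; the port number 8080 is an assumption. By default a ServiceMonitor only looks at Services in its own namespace, so a Service living elsewhere would additionally require `spec.namespaceSelector` on the ServiceMonitor.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: myapp
  namespace: monitoring
  labels:
    app: myapp          # matched by the ServiceMonitor's selector
spec:
  selector:
    app: myapp
  ports:
  - name: metrics       # matched by endpoints[].port in the ServiceMonitor
    port: 8080          # assumed metrics port
    targetPort: 8080
```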
### Grafana
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app: grafana
  template:
    metadata:
      labels:
        app: grafana
    spec:
      containers:
      - name: grafana
        image: grafana/grafana:9.3.0
        ports:
        - containerPort: 3000
        env:
        - name: GF_SECURITY_ADMIN_PASSWORD
          valueFrom:
            secretKeyRef:
              name: grafana-credentials
              key: admin-password
        volumeMounts:
        - name: grafana-storage
          mountPath: /var/lib/grafana
      volumes:
      - name: grafana-storage
        persistentVolumeClaim:
          claimName: grafana-pvc
```
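The Grafana container reads its admin password from a Secret named `grafana-credentials`, which must exist before the pod starts. A minimal sketch of that Secret follows; the password value is a placeholder, and in practice the Secret would usually be created out of band rather than committed alongside the manifests.

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: grafana-credentials
  namespace: monitoring
type: Opaque
stringData:
  admin-password: "change-me"   # placeholder; stringData is base64-encoded by the API server
```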
## Logging
### Fluentd DaemonSet
```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
spec:
  selector:
    matchLabels:
      k8s-app: fluentd-logging
  template:
    metadata:
      labels:
        k8s-app: fluentd-logging
    spec:
      # serviceAccountName is the current field; serviceAccount is a deprecated alias
      serviceAccountName: fluentd
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.15-debian-elasticsearch7-1
        env:
        - name: FLUENT_ELASTICSEARCH_HOST
          value: "elasticsearch.logging.svc.cluster.local"
        - name: FLUENT_ELASTICSEARCH_PORT
          value: "9200"
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
        - name: config-volume
          mountPath: /fluentd/etc
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
      - name: config-volume
        configMap:
          name: fluentd-config
```
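The DaemonSet runs under a service account named `fluentd`, and Fluentd's Kubernetes metadata enrichment needs read access to pods and namespaces. The following sketch shows one way to provide that; the ClusterRole and binding names are assumptions chosen to match the service account.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
rules:
- apiGroups: [""]
  resources: ["pods", "namespaces"]
  verbs: ["get", "list", "watch"]   # read-only access for log metadata
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluentd
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluentd
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: kube-system
```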
### Elasticsearch
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: elasticsearch
  namespace: logging
spec:
  serviceName: elasticsearch
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:8.5.0
        env:
        - name: cluster.name
          value: "k8s-logs"
        - name: node.name
          valueFrom:
            fieldRef:
              fieldPath: metadata.name
        - name: discovery.seed_hosts
          value: "elasticsearch-0.elasticsearch,elasticsearch-1.elasticsearch,elasticsearch-2.elasticsearch"
        - name: cluster.initial_master_nodes
          value: "elasticsearch-0,elasticsearch-1,elasticsearch-2"
        - name: ES_JAVA_OPTS
          value: "-Xms512m -Xmx512m"
        ports:
        - containerPort: 9200
          name: rest
        - containerPort: 9300
          name: inter-node
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 50Gi
```
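The `discovery.seed_hosts` entries such as `elasticsearch-0.elasticsearch` only resolve if a headless Service with the name given in `serviceName` exists in the same namespace. A minimal sketch of that Service, reusing the port names from the StatefulSet, could be:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
  namespace: logging
spec:
  clusterIP: None          # headless: gives each pod a stable DNS record
  selector:
    app: elasticsearch
  ports:
  - name: rest
    port: 9200
  - name: inter-node
    port: 9300
```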
## Production Operations
### Health Checks and Probes
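Kubernetes provides three types of probes:
1. **Liveness Probe**: Determines whether a container is still healthy; the kubelet restarts the container when the probe fails
2. **Readiness Probe**: Determines whether a container is ready to serve traffic; a pod that fails it is removed from Service endpoints
3. **Startup Probe**: Determines whether the application has finished starting; liveness and readiness probes are held off until it succeeds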
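The three probe types can be combined on one container. The Pod below is a minimal sketch, assuming an nginx container that answers HTTP on port 80 at `/`; the pod name `probe-demo` and all timing values are illustrative, not recommendations.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo            # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.21
    ports:
    - containerPort: 80
    startupProbe:             # gates the other probes until the app is up
      httpGet:
        path: /
        port: 80
      periodSeconds: 5
      failureThreshold: 30    # up to 30 x 5s = 150s allowed for startup
    livenessProbe:            # container is restarted after 3 consecutive failures
      httpGet:
        path: /
        port: 80
      periodSeconds: 10
      failureThreshold: 3
    readinessProbe:           # pod is removed from Service endpoints while failing
      httpGet:
        path: /
        port: 80
      periodSeconds: 5
      failureThreshold: 3
```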
### Rolling Updates
```bash
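# Update the image of container "myapp" to trigger a rolling update
kubectl set image deployment/myapp myapp=myapp:v2
# Watch the rollout until it completes or fails
kubectl rollout status deployment/myapp
# List previous revisions
kubectl rollout history deployment/myapp
# Roll back to the previous revision
kubectl rollout undo deployment/myapp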
```
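How aggressively a rollout replaces pods is controlled by the Deployment's update strategy. The fragment below is a sketch of a conservative setting, assuming capacity must never drop below the desired replica count; the specific values are illustrative.

```yaml
# Fragment of a Deployment spec (not a complete manifest)
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most one extra pod above the desired count
      maxUnavailable: 0    # never remove an old pod before its replacement is ready
  minReadySeconds: 10      # a new pod must stay ready this long to count as available
```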
### Pod Disruption Budgets
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: app-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: myapp
```
**PDB with Max Unavailable:**
```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: database-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: database
```
### Taints and Tolerations
**Node Taint:**
```bash
kubectl taint nodes node1 key=value:NoSchedule
```
**Pod Toleration:**
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: toleration-pod
spec:
  tolerations:
  - key: "key"
    operator: "Equal"
    value: "value"
    effect: "NoSchedule"
  containers:
  - name: app
    image: nginx:1.21
```
### Node Affinity
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: affinity-pod
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: zone
            operator: In
            values:
            - us-east-1a
  containers:
  - name: app
    image: nginx:1.21
```
### Pod Anti-Affinity
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-server
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - web
            topologyKey: "kubernetes.io/hostname"
      containers:
      - name: web
        image: nginx:1.21
```
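With a required anti-affinity rule on `kubernetes.io/hostname`, each replica needs its own node, so three replicas on a cluster with fewer than three schedulable nodes leave pods Pending. Where spreading is desirable but not mandatory, a preferred rule is a softer alternative. The fragment below is a sketch of that variant for the same `app: web` pods; the weight of 100 is an assumption.

```yaml
# Fragment of the pod template spec (replaces the required rule above)
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100                     # 1-100; higher means a stronger preference
      podAffinityTerm:
        labelSelector:
          matchLabels:
            app: web
        topologyKey: "kubernetes.io/hostname"
```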
## Troubleshooting
### Common kubectl Commands
```bash
# Get resources
kubectl get pods
kubectl get deployments
kubectl get services
kubectl get nodes
# Describe resources
kubectl describe pod <pod-name>
kubectl describe node <node-name>
# View logs
kubectl logs <pod-name>
kubectl logs <pod-name> -c <container-name>
kubectl logs -f <pod-name> # Follow logs
# Execute commands in pod
kubectl exec -it <pod-name> -- /bin/bash
kubectl exec <pod-name> -- ls /app
# Port forwarding
kubectl port-forward pod/<pod-name> 8080:80
kubectl port-forward service/<service-name> 8080:80
# Resource usage
kubectl top nodes
kubectl top pods
# Events
kubectl get events --sort-by='.lastTimestamp'
kubectl get events --field-selector involvedObject.name=<pod-name>
# Debug
kubectl debug node/<node-name> -it --image=ubuntu
kubectl run debug-pod --rm -i --tty --image=busybox -- /bin/sh
```
### Common Issues and Solutions
**Pod Stuck in Pending State:**
- Check node resources: `kubectl describe node`
- Check PVC binding: `kubectl describe pvc`
- Check pod events: `kubectl describe pod <pod-name>`
**CrashLoopBackOff:**
- Check logs: `kubectl logs <pod-name> --previous`
- Check resource limits
- Verify liveness and readiness probes
**ImagePullBackOff:**
- Verify image name and tag
- Check registry credentials
- Verify network connectivity
**Service Not Accessible:**
- Verify service selector matches pod labels
- Check endpoints: `kubectl get endpoints <service-name>`
- Test DNS: `kubectl run -it --rm debug --image=busybox --restart=Never -- nslookup <service-name>`
This comprehensive guide covers the essential aspects of Kubernetes orchestration. For production deployments, always follow security best practices, implement proper monitoring and logging, and regularly update your cluster and applications.
This skill is a comprehensive guide to Kubernetes container orchestration covering workloads, networking, storage, security, and production operations. It distills core concepts, object types, and practical YAML patterns for Deployments, StatefulSets, DaemonSets, Jobs, probes, resource requests/limits, and security contexts. The content targets engineers who run containerized apps in development and production.
The skill inspects and explains Kubernetes primitives and control-plane/node components, showing how objects (metadata, spec, status) map to real behavior. It provides concrete YAML examples and recommended patterns for rollout strategies, stateful apps, cluster-level services, ingress, storage claims, probes, autoscaling, logging, and monitoring. It also outlines operational tasks: RBAC, namespaces, troubleshooting, and production readiness.
**How do I choose between Deployment, StatefulSet, and DaemonSet?**
Use a Deployment for stateless, scalable apps, a StatefulSet when pods need stable identities and persistent storage, and a DaemonSet to run a pod on every (or selected) node for logging, monitoring, or node-local services.
**What protection should I add before production rollouts?**
Set resource requests/limits, configure probes, enable RBAC and network policies, use health checks in CI, run load tests, and add monitoring and alerting to detect regressions after rollout.