Kubernetes basics
Introduction
You've created containers with Docker — great! But now imagine: 50 containers running for one app. What if a container crashes? What if traffic spikes? What if you need to deploy a new version? 😰
Managing all that by hand is impossible. That's why we use Kubernetes (K8s) — like an autopilot for containers! 🚀
In this article we'll cover Kubernetes core concepts, architecture, and hands-on basics — and see why K8s is a game-changer for AI apps!
Why Kubernetes?
Problems when using Docker alone:
| Problem | Without K8s | With K8s |
|---|---|---|
| Container crash | Manual restart | Auto restart ♻️ |
| Traffic spike | Manual scaling | Auto scaling 📈 |
| New deployment | Downtime risk | Zero downtime 🟢 |
| Load balance | Manual config | Automatic ⚖️ |
| Health check | You monitor | K8s monitors 🏥 |
Real-world example: your AI chatbot app suddenly gets 10x traffic:
- Without K8s: server crashes, users angry 😤
- With K8s: auto-scales, 10 new pods created, users happy 😊
Google launches billions of containers every week — on Borg, the internal system whose lessons shaped Kubernetes. That experience is what Google released to the world as open-source Kubernetes. 🎁
Kubernetes Architecture
```
┌─────────────────────────────────────────────────────┐
│                  KUBERNETES CLUSTER                 │
├─────────────────────────────────────────────────────┤
│                                                     │
│  ┌─────────────── Control Plane ────────────────┐   │
│  │ ┌──────────┐ ┌──────────┐ ┌───────────────┐  │   │
│  │ │   API    │ │ Scheduler│ │  Controller   │  │   │
│  │ │  Server  │ │          │ │   Manager     │  │   │
│  │ └──────────┘ └──────────┘ └───────────────┘  │   │
│  │ ┌──────────────────────────────────────────┐ │   │
│  │ │             etcd (Database)              │ │   │
│  │ └──────────────────────────────────────────┘ │   │
│  └──────────────────────────────────────────────┘   │
│                                                     │
│  ┌─── Worker Node 1 ───┐  ┌─── Worker Node 2 ───┐   │
│  │ ┌─Pod──┐  ┌─Pod──┐  │  │ ┌─Pod──┐  ┌─Pod──┐  │   │
│  │ │ 🤖   │  │ 🧠   │  │  │ │ 🤖   │  │ 📊   │  │   │
│  │ │ API  │  │Model │  │  │ │ API  │  │Redis │  │   │
│  │ └──────┘  └──────┘  │  │ └──────┘  └──────┘  │   │
│  │   kubelet + proxy   │  │   kubelet + proxy   │   │
│  └─────────────────────┘  └─────────────────────┘   │
└─────────────────────────────────────────────────────┘
```
Core Concepts — K8s Building Blocks
The key concepts in K8s:
1. Pod 🫛
- Smallest unit in K8s
- Holds one or more containers
- One pod = one app instance
2. Deployment 📋
- Manages pods
- You declare "run 3 copies of my AI API"
- Auto-replace crashed pods
3. Service 🔌
- Gives pods a stable network address
- Load balances across them
- Types: ClusterIP, NodePort, LoadBalancer
4. Namespace 📁
- Virtual cluster — organizes resources
- Lets you keep dev, staging, and production separate
5. ConfigMap & Secret 🔑
- Stores configuration data
- Secrets = sensitive data like API keys (base64-encoded; enable encryption at rest for production)
6. Ingress 🚪
- Manages external traffic
- Domain routing: api.myapp.com → API pod, app.myapp.com → Web pod
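As a hedged sketch, the domain routing above might be expressed with an Ingress manifest like this — the hostnames, Service names, and ports are illustrative, not from a real cluster:

```yaml
# Illustrative Ingress: route two hostnames to two backend Services.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-ingress
spec:
  rules:
    - host: api.myapp.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-service   # Service in front of the API pods (example name)
                port:
                  number: 80
    - host: app.myapp.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-service   # Service in front of the Web pods (example name)
                port:
                  number: 80
```

An Ingress controller (e.g. NGINX Ingress) must be installed in the cluster for this routing to take effect.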
Pod Lifecycle
A pod's lifecycle stages:
1. Pending — waiting to be scheduled
2. Running — containers have started ✅
3. Succeeded — task completed (batch jobs)
4. Failed — a container crashed ❌
5. Unknown — Node communication lost
Important: Pods are ephemeral — they die and get replaced. To persist data, use Persistent Volumes!
If a pod crashes, K8s automatically creates a new one — no need to stress! 🧘
YAML Files — K8s Language
In K8s, everything is defined in YAML files. Basic structure:
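Here's a minimal Deployment manifest as a sketch — the app name and image are examples, not a real project:

```yaml
# Illustrative Deployment — names and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-api
  labels:
    app: ai-api
spec:
  replicas: 3                        # how many pods you want
  selector:
    matchLabels:
      app: ai-api
  template:
    metadata:
      labels:
        app: ai-api                  # labels identify the pods
    spec:
      containers:
        - name: ai-api
          image: myrepo/ai-api:1.0   # Docker image (example name)
          ports:
            - containerPort: 8000
          resources:
            requests:
              cpu: "250m"
              memory: "256Mi"
            limits:                  # CPU/Memory limits
              cpu: "500m"
              memory: "512Mi"
```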
Key fields:
- replicas: how many pods you want
- image: Docker image
- resources: CPU/Memory limits
- labels: identify your pods
Essential kubectl Commands
kubectl = the K8s CLI tool. Most-used commands:
| Command | What It Does |
|---|---|
| `kubectl get pods` | Lists all pods |
| `kubectl get services` | Lists all services |
| `kubectl apply -f file.yaml` | Applies a config |
| `kubectl delete pod name` | Deletes a pod |
| `kubectl logs pod-name` | Shows a pod's logs |
| `kubectl exec -it pod -- bash` | Opens a shell inside a pod |
| `kubectl scale deploy/app --replicas=5` | Scales a deployment |
| `kubectl describe pod name` | Detailed info |
Quick start flow:
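A typical first loop might look like this — file and resource names are placeholders, and it assumes a running cluster:

```shell
kubectl apply -f deployment.yaml               # create/update resources from YAML
kubectl get pods                               # watch the pods come up
kubectl logs my-app-pod                        # check a pod's logs (example name)
kubectl scale deployment/my-app --replicas=5   # scale out
kubectl delete -f deployment.yaml              # clean up when done
```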
Services — Network Access for Pods
Pods can't be accessed directly (their IPs keep changing). A Service provides a stable endpoint.
Service Types:
1. ClusterIP (Default) 🔒
- Internal access only
- Use it for backend services
- Example: only the API server can reach the AI model service
2. NodePort 🚪
- External access via node port (30000-32767)
- Use it for development/testing
3. LoadBalancer ⚖️
- The cloud provider creates a load balancer
- Best for production
- Example: distributes user traffic across the AI API pods
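As a sketch, a LoadBalancer Service might look like this — the name, label, and ports are examples:

```yaml
# Illustrative LoadBalancer Service — exposes pods labeled app: ai-api.
apiVersion: v1
kind: Service
metadata:
  name: ai-api-lb
spec:
  type: LoadBalancer    # change to ClusterIP (default) or NodePort as needed
  selector:
    app: ai-api         # must match your pods' labels (example)
  ports:
    - port: 80          # stable port clients connect to
      targetPort: 8000  # port the container listens on (example)
```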
Auto Scaling in K8s
K8s has 2 types of scaling:
Horizontal Pod Autoscaler (HPA) 📊
- Adds/removes pods based on CPU/memory usage
- "If CPU goes above 70%, scale up to 10 pods"
Vertical Pod Autoscaler (VPA) 📏
- Adjusts a pod's resource limits
- Auto-increases resources when a pod "needs more memory"
For AI apps, HPA is the most useful — it auto-scales as inference requests increase! 🚀
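The HPA rule above can be written as a manifest like this — target names and thresholds are illustrative (it also assumes the metrics-server is installed):

```yaml
# Illustrative HPA: keep the ai-api Deployment between 2 and 10 pods,
# targeting 70% average CPU utilization.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ai-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ai-api          # Deployment to scale (example name)
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale up above 70% average CPU
```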
Local Setup with Minikube
Try K8s locally with Minikube! 💻
Your own K8s cluster, ready in 5 minutes! 🎉
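A minimal install-and-start sketch (macOS with Homebrew shown — other platforms have their own installers, see the Minikube docs):

```shell
brew install minikube    # install Minikube (macOS/Homebrew)
minikube start           # create a local single-node cluster
kubectl get nodes        # verify the node shows Ready
minikube dashboard       # optional: open the web UI
```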
Other options: kind (K8s in Docker), k3s (lightweight K8s), Docker Desktop (built-in K8s).
K8s for AI/ML Workloads
K8s features that matter for AI apps:
GPU Scheduling 🎮
- Use the NVIDIA GPU Operator
- Allocate GPUs to specific pods
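A pod requesting a GPU might look like this — it assumes the NVIDIA device plugin / GPU Operator is installed on the cluster, and the image name is an example:

```yaml
# Illustrative GPU pod: schedules onto a node with a free NVIDIA GPU.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-inference
spec:
  containers:
    - name: model
      image: myrepo/llm-server:1.0   # example image
      resources:
        limits:
          nvidia.com/gpu: 1          # request exactly one GPU
```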
Model Serving 🧠
- KServe — ML model serving framework
- Seldon Core — ML deployment platform
- Auto-scaling based on inference load
Training Jobs 📚
- Kubeflow — ML pipeline orchestration
- Distributed training across multiple GPU nodes
- Job scheduling and queue management
AI-specific tools on K8s:
| Tool | Purpose |
|---|---|
| Kubeflow | ML Pipelines |
| KServe | Model Serving |
| Argo Workflows | DAG workflows |
| MLflow | Experiment tracking |
| Ray | Distributed computing |
Try It: K8s Setup Prompt
K8s Best Practices
For production K8s, follow these:
✅ Always set resource limits — otherwise one pod can eat a whole node's resources
✅ Add health checks (readiness + liveness probes)
✅ Use namespaces — keep dev/staging/prod separate
✅ Enable RBAC — control who can do what
✅ Manage secrets properly — never hardcode them
✅ Use rolling updates — zero downtime
✅ Set Pod Disruption Budgets — safety during maintenance
✅ Set up monitoring with Prometheus + Grafana
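The health-check practice can be sketched as a container spec fragment like this — paths, port, and timings are illustrative. Liveness restarts a stuck container; readiness gates traffic to it:

```yaml
# Illustrative probes on a container spec (example paths/ports).
containers:
  - name: ai-api
    image: myrepo/ai-api:1.0
    ports:
      - containerPort: 8000
    livenessProbe:              # restart the container if this fails
      httpGet:
        path: /healthz
        port: 8000
      initialDelaySeconds: 10
      periodSeconds: 15
    readinessProbe:             # only send traffic once this passes
      httpGet:
        path: /ready
        port: 8000
      initialDelaySeconds: 5
      periodSeconds: 10
```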
Managed Kubernetes Services
Managing your own K8s cluster is a headache. Cloud providers offer managed versions:
| Service | Provider | Best For |
|---|---|---|
| **EKS** | AWS | Enterprise, large scale |
| **GKE** | Google Cloud | AI/ML workloads, Autopilot |
| **AKS** | Azure | Microsoft ecosystem |
GKE Autopilot — Google manages everything; you just apply YAML. No node management! Best for AI apps starting out.
Pricing comparison (3 nodes, small):
- EKS: ~$73/month cluster fee + EC2 costs
- GKE: cluster fee comparable to EKS, but a free-tier credit covers one zonal or Autopilot cluster + VM costs
- AKS: free cluster tier + VM costs
Recommendation: start with GKE Autopilot — the easiest and most AI-friendly! 🎯
✅ Key Takeaways
✅ Kubernetes Orchestration — Auto-manages hundreds or thousands of containers: crash recovery, scaling, zero-downtime deployment. Docker alone can't reach production scale
✅ Core Abstractions — Pod (smallest unit, one+ containers), Deployment (manage pods, replicas), Service (stable network access, load balance), Namespace (virtual clusters)
✅ Architecture Simple — Control Plane (API server, scheduler, controllers), Worker Nodes (Kubelet, runtime), etcd (database). Masters manage, nodes run pods
✅ Scaling Two Types — HPA (CPU/memory threshold auto-add pods), VPA (auto-increase resources). AI apps: HPA inference load based, VPA model size optimization
✅ YAML Manifests — Everything YAML files. Deployment spec (replicas, image, resources), Service spec (selector, ports), ConfigMap (config), Secret (sensitive data)
✅ AI/ML Specific — GPU scheduling (nvidia.com/gpu limits), KServe (model serving), Kubeflow (ML pipelines), Argo (workflows). Distributed training possible
✅ Kubectl Commands — Get pods/services, apply/delete resources, logs/exec into pods, scale deployments. Learning curve initial, powerful once mastered
✅ Managed Services — GKE (best AI), EKS (AWS), AKS (Azure). Cluster creation, upgrades, security provider-managed. Focus code, not infrastructure
🏁 🎮 Mini Challenge
Challenge: Deploy Flask App to Kubernetes (Minikube)
Set up a local K8s cluster → deploy a Flask app:
Step 1: Minikube Install & Start 🚀
Step 2: Docker Image Create 🐳
Step 3: Kubernetes Manifest Create 📝
Step 4: Deploy to Minikube 🎯
Step 5: Access App 💻
Step 6: Scale & Monitor 📊
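One possible shape for the steps above, end to end — all names are placeholders, and it assumes the manifests from Step 3 exist as `deployment.yaml` and `service.yaml`:

```shell
eval $(minikube docker-env)                       # build images into Minikube's Docker
docker build -t flask-app:1.0 .                   # Step 2: build the Flask image
kubectl apply -f deployment.yaml -f service.yaml  # Step 4: deploy
minikube service flask-app --url                  # Step 5: print the app's URL
kubectl scale deployment/flask-app --replicas=3   # Step 6: scale
kubectl get pods -w                               # watch the pods
```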
Completion Time: 2 hours
Tools: Minikube, kubectl, Docker, YAML
Real Kubernetes experience ⭐
💼 Interview Questions
Q1: How does Kubernetes auto-scale? What is HPA (Horizontal Pod Autoscaler)?
A: HPA monitors metrics (CPU, memory, custom metrics). When a threshold is exceeded, pods are added automatically — e.g. CPU crosses 80% → 2 more pods. Perfect for variable traffic. AI inference: request count spikes → scale up; requests drop → scale down.
Q2: Stateful vs stateless apps — what's the difference in K8s?
A: Stateless: any pod can serve any request (easy, scalable). Stateful: specific pods hold specific state (databases, sessions). K8s provides StatefulSets for stateful apps (ordered, persistent). AI apps are mostly stateless (inference), but training jobs are stateful (persistent storage, GPU allocation).
Q3: Namespace use case — why organize?
A: Namespaces = virtual clusters. Multi-team: keep production, staging, and dev separate. Resource quotas: team A uses 50% of the cluster, team B the other 50%. RBAC: different access levels. Isolation: one namespace breaks → others unaffected.
Q4: ConfigMap vs Secret — what's the difference in Kubernetes?
A: ConfigMap = non-sensitive config (database host, API endpoint). Secret = sensitive (passwords, API keys, credentials). ConfigMap: plaintext, versioning easy. Secret: base64 encoded (not encrypted by default — additional encryption setup needed for production).
Q5: Resource requests vs limits — how does K8s scheduling use them?
A: Request = guaranteed minimum (the pod is guaranteed to get it). Limit = maximum (the pod can't exceed it). Set both CPU and memory. The scheduler checks requests → finds a node that fits → allocates. This avoids over-provisioning and prevents under-provisioning. AI apps: GPU requests are essential; set limits to prevent resource hogging.
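As a sketch, requests and limits sit on a container spec like this — the values are examples only:

```yaml
# Illustrative resources block: requests guide scheduling, limits cap usage.
resources:
  requests:
    cpu: "500m"        # guaranteed half a CPU core
    memory: "512Mi"    # guaranteed 512 MiB
  limits:
    cpu: "1"           # throttled above 1 core
    memory: "1Gi"      # OOM-killed above 1 GiB
```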
Frequently Asked Questions
What is the smallest deployable unit in Kubernetes?