
Kubernetes basics

Intermediate · 14 min read · 📅 Updated: 2026-02-17

Introduction

You've created containers with Docker — great! But now imagine this: 50 containers running for one app. What if a container crashes? What if traffic spikes? What if you need to deploy a new version? 😰


Managing all of that by hand is impossible. That's exactly why we use Kubernetes (K8s) — it's like an autopilot for your containers! 🚀


In this article we'll cover Kubernetes core concepts, architecture, and hands-on basics. You'll also learn why K8s is a game-changer for AI apps!

Why Kubernetes?

Problems you run into when using Docker alone:


| Problem | Without K8s | With K8s |
|---|---|---|
| Container crash | Manual restart | Auto restart ♻️ |
| Traffic spike | Manual scaling | Auto scaling 📈 |
| New deployment | Downtime risk | Zero downtime 🟢 |
| Load balancing | Manual config | Automatic ⚖️ |
| Health checks | You monitor | K8s monitors 🏥 |

Real-world example: your AI chatbot app suddenly gets 10x traffic:

  • Without K8s: Server crash, users angry 😤
  • With K8s: Auto-scale, 10 new pods create, users happy 😊

Google runs billions of containers every week — orchestrated by Borg, the internal system that inspired Kubernetes. That's why they released K8s as open source. 🎁

Kubernetes Architecture

🏗️ Architecture Diagram
┌─────────────────────────────────────────────────────┐
│              KUBERNETES CLUSTER                       │
├─────────────────────────────────────────────────────┤
│                                                       │
│  ┌─────────────── Control Plane ──────────────────┐  │
│  │  ┌──────────┐ ┌──────────┐ ┌───────────────┐  │  │
│  │  │ API      │ │ Scheduler│ │ Controller    │  │  │
│  │  │ Server   │ │          │ │ Manager       │  │  │
│  │  └──────────┘ └──────────┘ └───────────────┘  │  │
│  │  ┌──────────────────────────────────────────┐  │  │
│  │  │              etcd (Database)              │  │  │
│  │  └──────────────────────────────────────────┘  │  │
│  └────────────────────────────────────────────────┘  │
│                         │                             │
│  ┌─── Worker Node 1 ───┐  ┌─── Worker Node 2 ───┐   │
│  │ ┌─Pod──┐ ┌─Pod──┐   │  │ ┌─Pod──┐ ┌─Pod──┐   │   │
│  │ │ 🤖  │ │ 🧠  │   │  │ │ 🤖  │ │ 📊  │   │   │
│  │ │ API  │ │Model │   │  │ │ API  │ │Redis │   │   │
│  │ └──────┘ └──────┘   │  │ └──────┘ └──────┘   │   │
│  │    kubelet + proxy   │  │    kubelet + proxy   │   │
│  └──────────────────────┘  └──────────────────────┘  │
└─────────────────────────────────────────────────────┘

Core Concepts — K8s Building Blocks

The most important concepts in K8s:


1. Pod 🫛

  • The smallest unit in K8s
  • Holds one or more containers
  • One pod = one app instance
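The "smallest unit" idea is easiest to see in a manifest. A minimal sketch (the pod name and image are illustrative, not from this article's app):

```yaml
# pod.yaml — one pod wrapping one container
apiVersion: v1
kind: Pod
metadata:
  name: my-first-pod
spec:
  containers:
  - name: app
    image: nginx:alpine
    ports:
    - containerPort: 80
```

In practice you rarely create bare Pods — Deployments (next) create and manage them for you.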

2. Deployment 📋

  • Manages your pods
  • You declare: "run 3 copies of my AI API"
  • Automatically replaces crashed pods

3. Service 🔌

  • Gives pods a stable network address
  • Load balances across them
  • Types: ClusterIP, NodePort, LoadBalancer

4. Namespace 📁

  • A virtual cluster — organizes your resources
  • Lets you separate dev, staging, and production
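A namespace is itself just a tiny manifest. A sketch (the names are illustrative):

```yaml
# namespaces.yaml — one namespace per environment
apiVersion: v1
kind: Namespace
metadata:
  name: staging
---
apiVersion: v1
kind: Namespace
metadata:
  name: production
```

Apply it with `kubectl apply -f namespaces.yaml`, then target one with `kubectl get pods -n staging`.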

5. ConfigMap & Secret 🔑

  • Stores configuration data
  • Secret = sensitive data (API keys) — base64-encoded, not encrypted by default
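A minimal sketch of both (the names and values are placeholders; note that Secret values must be base64-encoded):

```yaml
# config.yaml — non-sensitive config plus a secret
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  MODEL_NAME: "gpt-small"
  LOG_LEVEL: "info"
---
apiVersion: v1
kind: Secret
metadata:
  name: app-secret
type: Opaque
data:
  API_KEY: bXktYXBpLWtleQ==   # base64 of "my-api-key"
```

Pods consume these via `envFrom` or volume mounts, so config stays out of your images.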

6. Ingress 🚪

  • Manages external traffic
  • Domain routing: api.myapp.com → API pod, app.myapp.com → Web pod
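The domain routing above can be sketched as an Ingress manifest (the hostnames and service names are illustrative, and an ingress controller such as NGINX must be installed in the cluster):

```yaml
# ingress.yaml — route each domain to a different service
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-ingress
spec:
  rules:
  - host: api.myapp.com          # API traffic
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: ai-api-service
            port:
              number: 80
  - host: app.myapp.com          # Web traffic
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-service
            port:
              number: 80
```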

Pod Lifecycle

💡 Tip

A pod's lifecycle stages:

1. Pending — waiting to be scheduled

2. Running — the container has started ✅

3. Succeeded — the task completed (batch jobs)

4. Failed — the container crashed ❌

5. Unknown — node communication lost

Important: Pods are ephemeral — they die and get replaced. Use Persistent Volumes to store data!

If a pod crashes, K8s automatically creates a new one — no need to stress! 🧘
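Because pods are ephemeral, durable data lives in a PersistentVolumeClaim that pods mount. A minimal sketch (the name and size are placeholders):

```yaml
# pvc.yaml — claim 10Gi of durable storage
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: model-storage
spec:
  accessModes:
  - ReadWriteOnce        # mountable by one node at a time
  resources:
    requests:
      storage: 10Gi
```

A pod then mounts it via `spec.volumes[].persistentVolumeClaim.claimName` — the data survives even when the pod is replaced.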

YAML Files — K8s Language

In K8s, everything is defined in YAML files. Basic structure:


yaml
# deployment.yaml — AI API Deploy
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-chatbot
spec:
  replicas: 3          # run 3 pods
  selector:
    matchLabels:
      app: ai-chatbot
  template:
    metadata:
      labels:
        app: ai-chatbot
    spec:
      containers:
      - name: chatbot
        image: myrepo/ai-chatbot:v1
        ports:
        - containerPort: 8000
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "500m"

Key fields:

  • replicas: How many pods you want
  • image: Docker image
  • resources: CPU/Memory limits
  • labels: Used to identify pods

Essential kubectl Commands

kubectl is the K8s CLI tool. The most used commands:


| Command | What It Does |
|---|---|
| `kubectl get pods` | List all pods |
| `kubectl get services` | Show all services |
| `kubectl apply -f file.yaml` | Apply a config |
| `kubectl delete pod name` | Delete a pod |
| `kubectl logs pod-name` | View logs |
| `kubectl exec -it pod -- bash` | Open a shell inside a pod |
| `kubectl scale deploy/app --replicas=5` | Scale a deployment |
| `kubectl describe pod name` | Detailed info |

Quick start flow:

bash
# Deploy
kubectl apply -f deployment.yaml
kubectl apply -f service.yaml

# Check status
kubectl get pods
kubectl get svc

# Follow the logs
kubectl logs -f ai-chatbot-pod-xyz

Services — Network Access for Pods

You can't reach pods directly (their IPs keep changing). A Service gives you a stable endpoint.


Service Types:


1. ClusterIP (Default) 🔒

  • Internal access only
  • Use it for backend services
  • Example: only the API server can reach the AI model service

2. NodePort 🚪

  • External access via a node port (30000-32767)
  • Use it for development/testing

3. LoadBalancer ⚖️

  • Creates a cloud provider load balancer
  • Best for production
  • Example: distributes user traffic to your AI API

yaml
apiVersion: v1
kind: Service
metadata:
  name: ai-api-service
spec:
  type: LoadBalancer
  selector:
    app: ai-chatbot
  ports:
  - port: 80
    targetPort: 8000

Auto Scaling in K8s

K8s has 2 types of scaling:


Horizontal Pod Autoscaler (HPA) 📊

  • Adds/removes pods based on CPU/memory usage
  • "If CPU goes above 70%, scale up to 10 pods"

yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ai-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ai-chatbot
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70

Vertical Pod Autoscaler (VPA) 📏

  • Adjusts a pod's resource requests/limits
  • Auto-increases them when a pod needs more memory

For AI apps, HPA is the most useful — when inference requests increase, it auto-scales! 🚀
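For comparison, a VPA manifest sketch — note the VPA is a separate add-on (the `autoscaling.k8s.io` API), not part of a vanilla cluster, so this only works once it's installed:

```yaml
# vpa.yaml — let K8s right-size the ai-chatbot pods
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: ai-api-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ai-chatbot
  updatePolicy:
    updateMode: "Auto"   # VPA may evict pods to apply new resource values
```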

Local Setup with Minikube

Example

Try K8s locally — with Minikube! 💻

bash
# Install Minikube
brew install minikube    # Mac
# or
curl -LO https://storage.googleapis.com/minikube/releases/latest/minikube-linux-amd64

# Start cluster
minikube start

# Deploy first app
kubectl create deployment hello-ai --image=nginx
kubectl expose deployment hello-ai --type=NodePort --port=80

# Access it
minikube service hello-ai

Your own K8s cluster, ready in 5 minutes! 🎉

Other options: kind (K8s in Docker), k3s (lightweight K8s), Docker Desktop (built-in K8s).

K8s for AI/ML Workloads

K8s features that matter for AI apps:


GPU Scheduling 🎮

  • Use the NVIDIA GPU Operator
  • Allocate GPUs to specific pods
yaml
resources:
  limits:
    nvidia.com/gpu: 1

Model Serving 🧠

  • KServe — ML model serving framework
  • Seldon Core — ML deployment platform
  • Auto-scaling based on inference load

Training Jobs 📚

  • Kubeflow — ML pipeline orchestration
  • Distributed training across multiple GPU nodes
  • Job scheduling and queue management

AI-specific tools on K8s:

| Tool | Purpose |
|---|---|
| Kubeflow | ML pipelines |
| KServe | Model serving |
| Argo Workflows | DAG workflows |
| MLflow | Experiment tracking |
| Ray | Distributed computing |

Try It: K8s Setup Prompt

📋 Copy-Paste Prompt
You are a Kubernetes expert for AI/ML applications.

I have an AI image classification API (Python FastAPI, TensorFlow model 800MB).
Requirements:
- Handle 5000 requests/minute peak
- Auto-scale from 2 to 20 pods
- GPU needed for inference
- Zero-downtime deployments
- Health checks

Generate complete Kubernetes YAML files:
1. Deployment with GPU resources
2. Service (LoadBalancer)
3. HPA configuration
4. Readiness & Liveness probes
5. ConfigMap for environment variables

K8s Best Practices

💡 Tip

Follow these for production K8s:

  • Always set resource limits — otherwise one pod can eat a whole node's resources
  • Add health checks (readiness + liveness probes)
  • Use namespaces — separate dev/staging/prod
  • Enable RBAC — control who can do what
  • Manage secrets properly — never hardcode them
  • Use rolling updates — zero downtime
  • Set Pod Disruption Budgets — safety during maintenance
  • Set up monitoring with Prometheus + Grafana
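Two of the practices above — health checks and rolling updates — live directly in the Deployment spec. A fragment sketch, reusing this article's ai-chatbot example (the `/health` path and timing numbers are illustrative):

```yaml
# Fragment of a Deployment spec — probes + zero-downtime rollout
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1          # at most 1 extra pod during a rollout
      maxUnavailable: 0    # never drop below the desired count
  template:
    spec:
      containers:
      - name: chatbot
        image: myrepo/ai-chatbot:v2
        readinessProbe:            # gate traffic until the pod is ready
          httpGet:
            path: /health
            port: 8000
          initialDelaySeconds: 5
          periodSeconds: 10
        livenessProbe:             # restart the container if it hangs
          httpGet:
            path: /health
            port: 8000
          initialDelaySeconds: 15
          periodSeconds: 20
```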

Managed Kubernetes Services

Managing your own K8s cluster is a headache. Cloud providers offer managed versions:


| Service | Provider | Best For |
|---|---|---|
| EKS | AWS | Enterprise, large scale |
| GKE | Google | AI/ML workloads, Autopilot |
| AKS | Azure | Microsoft ecosystem |

GKE Autopilot — Google manages everything; you just apply your YAML. No node management! A great fit for AI apps starting out.


Pricing comparison (3 small nodes):

  • EKS: ~$73/month (cluster management fee) + EC2 costs
  • GKE: management fee waived on the free tier (one zonal cluster) + VM costs
  • AKS: free cluster tier + VM costs

Recommendation: start with GKE Autopilot — the easiest option, and AI-friendly! 🎯

Key Takeaways

Kubernetes Orchestration — Auto-manages hundreds or thousands of containers: crash recovery, scaling, zero-downtime deployments. Docker alone can't handle production scale


Core Abstractions — Pod (smallest unit, one+ containers), Deployment (manage pods, replicas), Service (stable network access, load balance), Namespace (virtual clusters)


Architecture Simple — Control Plane (API server, scheduler, controllers), Worker Nodes (Kubelet, runtime), etcd (database). Masters manage, nodes run pods


Scaling Two Types — HPA (CPU/memory threshold auto-add pods), VPA (auto-increase resources). AI apps: HPA inference load based, VPA model size optimization


YAML Manifests — Everything YAML files. Deployment spec (replicas, image, resources), Service spec (selector, ports), ConfigMap (config), Secret (sensitive data)


AI/ML Specific — GPU scheduling (nvidia.com/gpu limits), KServe (model serving), Kubeflow (ML pipelines), Argo (workflows). Distributed training possible


Kubectl Commands — Get pods/services, apply/delete resources, view logs/exec into pods, scale deployments. Initial learning curve, powerful once mastered


Managed Services — GKE (best AI), EKS (AWS), AKS (Azure). Cluster creation, upgrades, security provider-managed. Focus code, not infrastructure

🏁 Mini Challenge

Challenge: Deploy Flask App to Kubernetes (Minikube)


Set up a local K8s cluster → deploy a Flask app:


Step 1: Minikube Install & Start 🚀

bash
# macOS/Linux (Homebrew): install minikube + kubectl
brew install minikube kubectl

# Start cluster (local K8s!)
minikube start
minikube dashboard  # open the web UI

Step 2: Create the Docker Image 🐳

python
# app.py
from flask import Flask
app = Flask(__name__)

@app.route("/")
def hello():
    return "Hello from Kubernetes!"

@app.route("/health")
def health():
    return {"status": "healthy"}

if __name__ == "__main__":
    # Bind to 0.0.0.0 so the container port is reachable from outside
    app.run(host="0.0.0.0", port=5000)

dockerfile
FROM python:3.9-slim
WORKDIR /app
COPY app.py .
RUN pip install flask
CMD ["python", "app.py"]

bash
# Build inside Minikube's Docker daemon so the cluster can find the image locally
eval $(minikube docker-env)
docker build -t flask-k8s:latest .

Step 3: Create the Kubernetes Manifests 📝

yaml
# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: flask-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: flask
  template:
    metadata:
      labels:
        app: flask
    spec:
      containers:
      - name: flask
        image: flask-k8s:latest
        imagePullPolicy: IfNotPresent   # use the locally built image
        ports:
        - containerPort: 5000
---
apiVersion: v1
kind: Service
metadata:
  name: flask-service
spec:
  type: LoadBalancer
  ports:
  - port: 80
    targetPort: 5000
  selector:
    app: flask

Step 4: Deploy to Minikube 🎯

bash
kubectl apply -f deployment.yaml
kubectl get pods     # verify 3 pods are running
kubectl get svc      # see the service

Step 5: Access App 💻

bash
minikube service flask-service
# Opens the URL automatically — view the app in your browser

Step 6: Scale & Monitor 📊

bash
kubectl scale deployment flask-app --replicas=5
kubectl get pods -w  # watch mode — see changes live
kubectl logs <pod-name>  # read a pod's logs

Estimated time: 2 hours

Tools: Minikube, kubectl, Docker, YAML

What you gain: real Kubernetes experience

💼 Interview Questions

Q1: How does Kubernetes auto-scale? What is the HPA (Horizontal Pod Autoscaler)?

A: The HPA monitors metrics (CPU, memory, custom metrics). When a threshold is exceeded, pods are added automatically — e.g. CPU crosses 80% → 2 more pods are added. Perfect for variable traffic. For AI inference: request count spikes → scale up; requests drop → scale down.


Q2: Stateful vs stateless apps — what's the difference in K8s?

A: Stateless: any pod can serve any request (easy, scalable). Stateful: specific pods hold state (databases, sessions). K8s provides StatefulSets for stateful apps (ordered startup, stable identity, persistent storage). AI apps are mostly stateless (inference), but training jobs are stateful (persistent storage, GPU allocation).


Q3: What are namespaces for — why organize resources?

A: Namespaces are virtual clusters. Multi-team setups: separate production, staging, and dev. Resource quotas: team A gets 50% of the cluster, team B gets 50%. RBAC: different access levels per namespace. Isolation: a crash in one namespace leaves the others unaffected.


Q4: ConfigMap vs Secret — what's the difference in Kubernetes?

A: ConfigMap = non-sensitive config (database host, API endpoint). Secret = sensitive data (passwords, API keys, credentials). ConfigMaps are plaintext and easy to version. Secrets are base64-encoded (not encrypted by default — production needs additional encryption at rest).


Q5: Resource requests/limits — how does K8s scheduling use them?

A: Request = guaranteed minimum (the pod is guaranteed to get it). Limit = hard maximum (the pod can't exceed it). Set both for CPU and memory. The scheduler checks requests → finds a node with capacity → allocates. This avoids over-provisioning and prevents starvation. AI apps: GPU requests are essential; set limits to stop resource hogging.

Frequently Asked Questions

What is Kubernetes?
Kubernetes (K8s) is an open-source container orchestration platform. It automatically deploys, scales, and manages Docker containers.
Can I use Kubernetes without Docker?
Kubernetes needs a container runtime. Docker is the most popular, but alternatives like containerd and CRI-O also work. In practice, the Docker + K8s combo is the most common.
Is Kubernetes hard to learn?
There's an initial learning curve, but the basics — Pods, Services, Deployments — click within 1-2 weeks. Practice locally with Minikube.
Do small projects need Kubernetes?
No! For 1-5 containers, Docker Compose is enough. Reach for Kubernetes when you need 10+ containers, auto-scaling, or high availability.