Kubernetes basics
Introduction
You've created containers with Docker — great! But now imagine: 50 containers running for one app. What if a container crashes? What if traffic spikes? What if you need to deploy a new version? 😰
Managing all that by hand is impossible. That's why we use Kubernetes (K8s) — like an autopilot for containers! 🚀
In this article we'll cover Kubernetes core concepts, architecture, and hands-on basics — and see why K8s is a game-changer for AI apps!
Why Kubernetes?
Problems when using Docker alone:
| Problem | Without K8s | With K8s |
|---|---|---|
| Container crash | Manual restart | Auto restart ♻️ |
| Traffic spike | Manual scaling | Auto scaling 📈 |
| New deployment | Downtime risk | Zero downtime 🟢 |
| Load balance | Manual config | Automatic ⚖️ |
| Health check | You monitor | K8s monitors 🏥 |
Real-world example: your AI chatbot app suddenly gets 10x traffic:
- Without K8s: server crashes, users angry 😤
- With K8s: auto-scales, 10 new pods created, users happy 😊
Google launches billions of containers every week — on Borg, the internal system whose lessons shaped Kubernetes. That experience is what Google released to the world as open-source Kubernetes. 🎁
Kubernetes Architecture
```
┌─────────────────────────────────────────────────────┐
│                  KUBERNETES CLUSTER                 │
├─────────────────────────────────────────────────────┤
│                                                     │
│  ┌─────────────── Control Plane ────────────────┐   │
│  │ ┌──────────┐ ┌──────────┐ ┌───────────────┐  │   │
│  │ │   API    │ │ Scheduler│ │  Controller   │  │   │
│  │ │  Server  │ │          │ │   Manager     │  │   │
│  │ └──────────┘ └──────────┘ └───────────────┘  │   │
│  │ ┌──────────────────────────────────────────┐ │   │
│  │ │             etcd (Database)              │ │   │
│  │ └──────────────────────────────────────────┘ │   │
│  └──────────────────────────────────────────────┘   │
│                                                     │
│  ┌─── Worker Node 1 ───┐  ┌─── Worker Node 2 ───┐   │
│  │ ┌─Pod──┐  ┌─Pod──┐  │  │ ┌─Pod──┐  ┌─Pod──┐  │   │
│  │ │ 🤖   │  │ 🧠   │  │  │ │ 🤖   │  │ 📊   │  │   │
│  │ │ API  │  │Model │  │  │ │ API  │  │Redis │  │   │
│  │ └──────┘  └──────┘  │  │ └──────┘  └──────┘  │   │
│  │   kubelet + proxy   │  │   kubelet + proxy   │   │
│  └─────────────────────┘  └─────────────────────┘   │
└─────────────────────────────────────────────────────┘
```
Core Concepts — K8s Building Blocks
The key concepts in K8s:
1. Pod 🫛
- Smallest unit in K8s
- Holds one or more containers
- One pod = one app instance
2. Deployment 📋
- Manages pods
- You declare "run 3 copies of my AI API"
- Auto-replace crashed pods
3. Service 🔌
- Gives pods a stable network address
- Load balances across them
- Types: ClusterIP, NodePort, LoadBalancer
4. Namespace 📁
- Virtual cluster — organizes resources
- Lets you keep dev, staging, and production separate
5. ConfigMap & Secret 🔑
- Stores configuration data
- Secrets = sensitive data like API keys (base64-encoded; enable encryption at rest for production)
6. Ingress 🚪
- Manages external traffic
- Domain routing: api.myapp.com → API pod, app.myapp.com → Web pod
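As a hedged sketch, the domain routing above might be expressed with an Ingress manifest like this — the hostnames, Service names, and ports are illustrative, not from a real cluster:

```yaml
# Illustrative Ingress: route two hostnames to two backend Services.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-ingress
spec:
  rules:
    - host: api.myapp.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: api-service   # Service in front of the API pods (example name)
                port:
                  number: 80
    - host: app.myapp.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: web-service   # Service in front of the Web pods (example name)
                port:
                  number: 80
```

An Ingress controller (e.g. NGINX Ingress) must be installed in the cluster for this routing to take effect.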
Pod Lifecycle
A pod's lifecycle stages:
1. Pending — waiting to be scheduled
2. Running — containers have started ✅
3. Succeeded — task completed (batch jobs)
4. Failed — a container crashed ❌
5. Unknown — Node communication lost
Important: Pods are ephemeral — they die and get replaced. To persist data, use Persistent Volumes!
If a pod crashes, K8s automatically creates a new one — no need to stress! 🧘
YAML Files — K8s Language
In K8s, everything is defined in YAML files. Basic structure:
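Here's a minimal Deployment manifest as a sketch — the app name and image are examples, not a real project:

```yaml
# Illustrative Deployment — names and image are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: ai-api
  labels:
    app: ai-api
spec:
  replicas: 3                        # how many pods you want
  selector:
    matchLabels:
      app: ai-api
  template:
    metadata:
      labels:
        app: ai-api                  # labels identify the pods
    spec:
      containers:
        - name: ai-api
          image: myrepo/ai-api:1.0   # Docker image (example name)
          ports:
            - containerPort: 8000
          resources:
            requests:
              cpu: "250m"
              memory: "256Mi"
            limits:                  # CPU/Memory limits
              cpu: "500m"
              memory: "512Mi"
```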
Key fields:
- replicas: how many pods you want
- image: Docker image
- resources: CPU/Memory limits
- labels: identify your pods
Essential kubectl Commands
kubectl = the K8s CLI tool. Most-used commands:
| Command | What It Does |
|---|---|
| `kubectl get pods` | Lists all pods |
| `kubectl get services` | Lists all services |
| `kubectl apply -f file.yaml` | Applies a config |
| `kubectl delete pod name` | Deletes a pod |
| `kubectl logs pod-name` | Shows a pod's logs |
| `kubectl exec -it pod -- bash` | Opens a shell inside a pod |
| `kubectl scale deploy/app --replicas=5` | Scales a deployment |
| `kubectl describe pod name` | Detailed info |
Quick start flow:
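A typical first loop might look like this — file and resource names are placeholders, and it assumes a running cluster:

```shell
kubectl apply -f deployment.yaml               # create/update resources from YAML
kubectl get pods                               # watch the pods come up
kubectl logs my-app-pod                        # check a pod's logs (example name)
kubectl scale deployment/my-app --replicas=5   # scale out
kubectl delete -f deployment.yaml              # clean up when done
```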
Services — Network Access for Pods
Pods can't be accessed directly (their IPs keep changing). A Service provides a stable endpoint.
Service Types:
1. ClusterIP (Default) 🔒
- Internal access only
- Use it for backend services
- Example: only the API server can reach the AI model service
2. NodePort 🚪
- External access via node port (30000-32767)
- Use it for development/testing
3. LoadBalancer ⚖️
- The cloud provider creates a load balancer
- Best for production
- Example: distributes user traffic across the AI API pods
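As a sketch, a LoadBalancer Service might look like this — the name, label, and ports are examples:

```yaml
# Illustrative LoadBalancer Service — exposes pods labeled app: ai-api.
apiVersion: v1
kind: Service
metadata:
  name: ai-api-lb
spec:
  type: LoadBalancer    # change to ClusterIP (default) or NodePort as needed
  selector:
    app: ai-api         # must match your pods' labels (example)
  ports:
    - port: 80          # stable port clients connect to
      targetPort: 8000  # port the container listens on (example)
```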
Auto Scaling in K8s
K8s has 2 types of scaling:
Horizontal Pod Autoscaler (HPA) 📊
- Adds/removes pods based on CPU/memory usage
- "If CPU goes above 70%, scale up to 10 pods"
Vertical Pod Autoscaler (VPA) 📏
- Adjusts a pod's resource limits
- Auto-increases resources when a pod "needs more memory"
For AI apps, HPA is the most useful — it auto-scales as inference requests increase! 🚀
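The HPA rule above can be written as a manifest like this — target names and thresholds are illustrative (it also assumes the metrics-server is installed):

```yaml
# Illustrative HPA: keep the ai-api Deployment between 2 and 10 pods,
# targeting 70% average CPU utilization.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: ai-api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: ai-api          # Deployment to scale (example name)
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale up above 70% average CPU
```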
Local Setup with Minikube
Try K8s locally with Minikube! 💻
Your own K8s cluster, ready in 5 minutes! 🎉
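A minimal install-and-start sketch (macOS with Homebrew shown — other platforms have their own installers, see the Minikube docs):

```shell
brew install minikube    # install Minikube (macOS/Homebrew)
minikube start           # create a local single-node cluster
kubectl get nodes        # verify the node shows Ready
minikube dashboard       # optional: open the web UI
```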
Other options: kind (K8s in Docker), k3s (lightweight K8s), Docker Desktop (built-in K8s).
K8s for AI/ML Workloads
K8s features that matter for AI apps:
GPU Scheduling 🎮
- Use the NVIDIA GPU Operator
- Allocate GPUs to specific pods
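A pod requesting a GPU might look like this — it assumes the NVIDIA device plugin / GPU Operator is installed on the cluster, and the image name is an example:

```yaml
# Illustrative GPU pod: schedules onto a node with a free NVIDIA GPU.
apiVersion: v1
kind: Pod
metadata:
  name: gpu-inference
spec:
  containers:
    - name: model
      image: myrepo/llm-server:1.0   # example image
      resources:
        limits:
          nvidia.com/gpu: 1          # request exactly one GPU
```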
Model Serving 🧠
- KServe — ML model serving framework
- Seldon Core — ML deployment platform
- Auto-scaling based on inference load
Training Jobs 📚
- Kubeflow — ML pipeline orchestration
- Distributed training across multiple GPU nodes
- Job scheduling and queue management
AI-specific tools on K8s:
| Tool | Purpose |
|---|---|
| Kubeflow | ML Pipelines |
| KServe | Model Serving |
| Argo Workflows | DAG workflows |
| MLflow | Experiment tracking |
| Ray | Distributed computing |
Try It: K8s Setup Prompt
K8s Best Practices
For production K8s, follow these:
✅ Always set resource limits — otherwise one pod can eat a whole node's resources
✅ Add health checks (readiness + liveness probes)
✅ Use namespaces — keep dev/staging/prod separate
✅ Enable RBAC — control who can do what
✅ Manage secrets properly — never hardcode them
✅ Use rolling updates — zero downtime
✅ Set Pod Disruption Budgets — safety during maintenance
✅ Set up monitoring with Prometheus + Grafana
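The health-check practice can be sketched as a container spec fragment like this — paths, port, and timings are illustrative. Liveness restarts a stuck container; readiness gates traffic to it:

```yaml
# Illustrative probes on a container spec (example paths/ports).
containers:
  - name: ai-api
    image: myrepo/ai-api:1.0
    ports:
      - containerPort: 8000
    livenessProbe:              # restart the container if this fails
      httpGet:
        path: /healthz
        port: 8000
      initialDelaySeconds: 10
      periodSeconds: 15
    readinessProbe:             # only send traffic once this passes
      httpGet:
        path: /ready
        port: 8000
      initialDelaySeconds: 5
      periodSeconds: 10
```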
Managed Kubernetes Services
Managing your own K8s cluster is a headache. Cloud providers offer managed versions:
| Service | Provider | Best For |
|---|---|---|
| **EKS** | AWS | Enterprise, large scale |
| **GKE** | Google Cloud | AI/ML workloads, Autopilot |
| **AKS** | Azure | Microsoft ecosystem |
GKE Autopilot — Google manages everything; you just apply YAML. No node management! Best for AI apps starting out.
Pricing comparison (3 nodes, small):
- EKS: ~$73/month cluster fee + EC2 costs
- GKE: cluster fee comparable to EKS, but a free-tier credit covers one zonal or Autopilot cluster + VM costs
- AKS: free cluster tier + VM costs
Recommendation: start with GKE Autopilot — the easiest and most AI-friendly! 🎯
✅ Key Takeaways
✅ Kubernetes Orchestration — Auto-manages hundreds or thousands of containers: crash recovery, scaling, zero-downtime deployment. Docker alone can't reach production scale
✅ Core Abstractions — Pod (smallest unit, one+ containers), Deployment (manage pods, replicas), Service (stable network access, load balance), Namespace (virtual clusters)
✅ Architecture Simple — Control Plane (API server, scheduler, controllers), Worker Nodes (Kubelet, runtime), etcd (database). Masters manage, nodes run pods
✅ Scaling Two Types — HPA (CPU/memory threshold auto-add pods), VPA (auto-increase resources). AI apps: HPA inference load based, VPA model size optimization
✅ YAML Manifests — Everything YAML files. Deployment spec (replicas, image, resources), Service spec (selector, ports), ConfigMap (config), Secret (sensitive data)
✅ AI/ML Specific — GPU scheduling (nvidia.com/gpu limits), KServe (model serving), Kubeflow (ML pipelines), Argo (workflows). Distributed training possible
✅ Kubectl Commands — Get pods/services, apply/delete resources, logs/exec into pods, scale deployments. Learning curve initial, powerful once mastered
✅ Managed Services — GKE (best AI), EKS (AWS), AKS (Azure). Cluster creation, upgrades, security provider-managed. Focus code, not infrastructure
🏁 🎮 Mini Challenge
Challenge: Deploy Flask App to Kubernetes (Minikube)
Set up a local K8s cluster → deploy a Flask app:
Step 1: Minikube Install & Start 🚀
Step 2: Docker Image Create 🐳
Step 3: Kubernetes Manifest Create 📝
Step 4: Deploy to Minikube 🎯
Step 5: Access App 💻
Step 6: Scale & Monitor 📊
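One possible shape for the steps above, end to end — all names are placeholders, and it assumes the manifests from Step 3 exist as `deployment.yaml` and `service.yaml`:

```shell
eval $(minikube docker-env)                       # build images into Minikube's Docker
docker build -t flask-app:1.0 .                   # Step 2: build the Flask image
kubectl apply -f deployment.yaml -f service.yaml  # Step 4: deploy
minikube service flask-app --url                  # Step 5: print the app's URL
kubectl scale deployment/flask-app --replicas=3   # Step 6: scale
kubectl get pods -w                               # watch the pods
```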
Completion Time: 2 hours
Tools: Minikube, kubectl, Docker, YAML
Real Kubernetes experience ⭐
💼 Interview Questions
Q1: How does Kubernetes auto-scale? What is HPA (Horizontal Pod Autoscaler)?
A: HPA monitors metrics (CPU, memory, custom metrics). When a threshold is exceeded, pods are added automatically — e.g. CPU crosses 80% → 2 more pods. Perfect for variable traffic. AI inference: request count spikes → scale up; requests drop → scale down.
Q2: Stateful vs stateless apps — what's the difference in K8s?
A: Stateless: any pod can serve any request (easy, scalable). Stateful: specific pods hold specific state (databases, sessions). K8s provides StatefulSets for stateful apps (ordered, persistent). AI apps are mostly stateless (inference), but training jobs are stateful (persistent storage, GPU allocation).
Q3: Namespace use case — why organize?
A: Namespaces = virtual clusters. Multi-team: keep production, staging, and dev separate. Resource quotas: team A uses 50% of the cluster, team B the other 50%. RBAC: different access levels. Isolation: one namespace breaks → others unaffected.
Q4: ConfigMap vs Secret — what's the difference in Kubernetes?
A: ConfigMap = non-sensitive config (database host, API endpoint). Secret = sensitive (passwords, API keys, credentials). ConfigMap: plaintext, versioning easy. Secret: base64 encoded (not encrypted by default — additional encryption setup needed for production).
Q5: Resource requests vs limits — how does K8s scheduling use them?
A: Request = guaranteed minimum (the pod is guaranteed to get it). Limit = maximum (the pod can't exceed it). Set both CPU and memory. The scheduler checks requests → finds a node that fits → allocates. This avoids over-provisioning and prevents under-provisioning. AI apps: GPU requests are essential; set limits to prevent resource hogging.
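As a sketch, requests and limits sit on a container spec like this — the values are examples only:

```yaml
# Illustrative resources block: requests guide scheduling, limits cap usage.
resources:
  requests:
    cpu: "500m"        # guaranteed half a CPU core
    memory: "512Mi"    # guaranteed 512 MiB
  limits:
    cpu: "1"           # throttled above 1 core
    memory: "1Gi"      # OOM-killed above 1 GiB
```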
Frequently Asked Questions
What is the smallest deployable unit in Kubernetes?