
Infrastructure as Code

Intermediate · ⏱ 14 min read · 📅 Updated: 2026-02-17

Introduction

You log in to the AWS console, create an EC2 instance, add a security group, set up RDS, all by hand: click, click, click. The next day you need the same setup in another region, so it's click, click, click all over again. 😩


With Infrastructure as Code (IaC), you write a file, run terraform apply, and you're done! 10 servers, 5 databases, and all the networking come up from a single command, in minutes instead of hours. 🚀


In this article we'll cover IaC fundamentals, hands-on Terraform, AI infrastructure setup, and best practices in detail. ⚙️

Why Infrastructure as Code?

The problems with managing infrastructure manually:


Without IaC 😰:

| Problem | Impact |
|---|---|
| Manual setup | Hours of clicking |
| No version history | "Who changed this?" |
| Inconsistency | Dev ≠ Staging ≠ Prod |
| No rollback | Mistakes are hard to undo |
| Documentation | Always outdated |
| Scaling | Each server set up by hand |

With IaC 😊:

| Benefit | How |
|---|---|
| **Automation** | One command, full setup |
| **Version control** | Track every change in Git |
| **Consistency** | Same code = same infra everywhere |
| **Rollback** | Git revert + re-apply = infra rollback |
| **Documentation** | The code IS the documentation |
| **Scaling** | count = 10 → 10 servers from one resource block |

Real example: Netflix manages thousands of servers with IaC. Doing that manually just isn't possible! 🎬

IaC Tools Landscape

Let's compare the popular IaC tools:


| Tool | Language | Cloud Support | Learning Curve | Best For |
|---|---|---|---|---|
| **Terraform** | HCL | Multi-cloud | Medium | Industry standard |
| **Pulumi** | Python/TS/Go | Multi-cloud | Easy (for developers) | Developers |
| **CloudFormation** | YAML/JSON | AWS only | Medium | AWS-only shops |
| **Bicep** | Bicep | Azure only | Easy | Azure users |
| **CDK** | TypeScript/Python | AWS | Medium | AWS + developers |
| **Ansible** | YAML | Multi-cloud | Easy | Config management |

Categories:

  • 🏗️ Provisioning: Terraform, Pulumi, CloudFormation (create infrastructure)
  • ⚙️ Configuration: Ansible, Chef, Puppet (configure servers)
  • 📦 Containers: Docker Compose, Kubernetes YAML (app deployment)

My recommendation: learn Terraform first, since it is the most widely used IaC tool in the industry. Then try Pulumi or CDK. 🎯

Terraform: Core Concepts

Terraform's building blocks:


1. Providers 🌐: the connection to a cloud platform

hcl
provider "aws" {
  region = "ap-south-1"
}

2. Resources 🏗️: what you want to create

hcl
resource "aws_instance" "ai_server" {
  ami           = "ami-0abcdef1234567890"
  instance_type = "g4dn.xlarge"  # GPU instance
  tags = { Name = "AI-Model-Server" }
}

3. Variables 📝: reusable values

hcl
variable "instance_count" {
  default = 3
}

4. Outputs 📤: display info about the created resources

hcl
output "server_ip" {
  value = aws_instance.ai_server.public_ip
}

5. State 💾: Terraform tracks what it created

  • The terraform.tfstate file stores the current state of your infrastructure
  • For teams, use remote state (for example an S3 bucket)
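
To make the role of state concrete, here is a toy Python sketch of how a plan can be derived by comparing the desired configuration with the recorded state. It is a mental model only, not Terraform's real algorithm, and the `plan` function and both dictionaries are hypothetical names made up for this illustration.

```python
def plan(desired, state):
    """Compare desired resources with recorded state and list the actions needed."""
    actions = {}
    for address, config in desired.items():
        if address not in state:
            actions[address] = "create"
        elif state[address] != config:
            actions[address] = "update"
    for address in state:
        if address not in desired:
            actions[address] = "delete"
    return actions


# What the .tf files ask for (hypothetical, simplified)
desired = {
    "aws_instance.ai_server": {"instance_type": "g4dn.xlarge"},
    "aws_s3_bucket.models": {"bucket": "ai-models-dev"},
}
# What the state file says already exists
state = {
    "aws_instance.ai_server": {"instance_type": "t3.micro"},
    "aws_db_instance.old": {"engine": "postgres"},
}

for address, action in sorted(plan(desired, state).items()):
    print(f"{action:7} {address}")
```

The sketch also hints at why a lost state file hurts: with an empty state, every resource looks like it has to be created again.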

Workflow: Write → Plan → Apply → Destroy 🔄

Terraform Workflow: Step by Step

✅ Example

Complete Terraform workflow:

bash
# 1. Initialize: download providers
terraform init

# 2. Format: clean code
terraform fmt

# 3. Validate: check syntax
terraform validate

# 4. Plan: preview changes (DRY RUN)
terraform plan
# Output ends with a summary such as "Plan: 3 to add, 0 to change, 0 to destroy."

# 5. Apply: create infrastructure!
terraform apply
# Type "yes" to confirm

# 6. Show: see current state
terraform show

# 7. Destroy: delete everything
terraform destroy

Critical rule: ALWAYS run terraform plan before apply! Apply only after reading the plan output. Otherwise you risk accidental deletions! ⚠️

Pro tip: in CI/CD, run terraform plan on the PR and terraform apply on merge. Review + automation! 🛡️
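
One way to make plan review in CI less error-prone is to inspect the machine-readable plan. The sketch below assumes the plan was saved with `terraform plan -out=tfplan` and exported with `terraform show -json tfplan`. It reads the `resource_changes` list and flags anything that would be deleted or replaced. The function name `find_destructive_changes` and the sample data are illustrative, not part of Terraform.

```python
import json


def find_destructive_changes(plan_json):
    """Return addresses of resources whose planned actions include a delete."""
    plan = json.loads(plan_json)
    flagged = []
    for change in plan.get("resource_changes", []):
        actions = change.get("change", {}).get("actions", [])
        if "delete" in actions:  # plain delete, or delete + create (replacement)
            flagged.append(change["address"])
    return flagged


# Trimmed-down sample in the shape of `terraform show -json` output (hypothetical values)
sample = json.dumps({
    "resource_changes": [
        {"address": "aws_instance.ai_server", "change": {"actions": ["update"]}},
        {"address": "aws_db_instance.metadata", "change": {"actions": ["delete", "create"]}},
        {"address": "aws_s3_bucket.models", "change": {"actions": ["no-op"]}},
    ]
})

destructive = find_destructive_changes(sample)
if destructive:
    print("Needs extra review before apply:", ", ".join(destructive))
```

A CI job could fail the PR check, or require an extra approval, whenever this list is non-empty.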

AI Infrastructure with Terraform

Real-world AI project infrastructure setup:


hcl
# main.tf: AI Platform Infrastructure

# VPC for isolation
resource "aws_vpc" "ai_vpc" {
  cidr_block = "10.0.0.0/16"
  tags       = { Name = "AI-Platform-VPC" }
}

# GPU Instance for Model Training
resource "aws_instance" "training_server" {
  count         = var.training_nodes
  ami           = "ami-deep-learning-ubuntu" # placeholder: use a real AMI ID for your region
  instance_type = "p3.2xlarge"               # V100 GPU
  subnet_id     = aws_subnet.private.id      # subnet defined elsewhere

  root_block_device {
    volume_size = 500 # 500 GB for datasets
  }

  tags = { Name = "Training-Node-${count.index}" }
}

# S3 for Model Storage
resource "aws_s3_bucket" "models" {
  bucket = "ai-models-${var.environment}"
}

# Versioning is its own resource in AWS provider v4 and later
resource "aws_s3_bucket_versioning" "models" {
  bucket = aws_s3_bucket.models.id
  versioning_configuration {
    status = "Enabled"
  }
}

# RDS for Metadata (credentials and networking omitted for brevity)
resource "aws_db_instance" "metadata" {
  engine            = "postgres"
  instance_class    = "db.r5.large"
  allocated_storage = 100 # GB
}

# API Server (CPU) for Inference (task definition defined elsewhere)
resource "aws_ecs_service" "inference_api" {
  name            = "ai-inference"
  desired_count   = var.api_replicas
  task_definition = aws_ecs_task_definition.inference.arn
}

A complete AI platform defined in a single file: training servers, model storage, database, and API! 🤖
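
As a rough mental model of `count` in the training_server block (a Python sketch, not how Terraform is implemented), one resource block with `count = N` expands into N instances addressed by index, and `${count.index}` is substituted per instance. `expand_count` is a hypothetical helper written for this illustration.

```python
def expand_count(resource_type, name, count, name_tag_template):
    """Expand one resource block into per-index instances, like count does."""
    instances = {}
    for index in range(count):
        address = f"{resource_type}.{name}[{index}]"
        instances[address] = {"Name": name_tag_template.format(index=index)}
    return instances


# Mirrors the training_server block with training_nodes = 3
nodes = expand_count("aws_instance", "training_server", 3, "Training-Node-{index}")
for address, tags in nodes.items():
    print(address, "->", tags["Name"])
```

Raising the count from 3 to 10 would add indices 3 through 9 while the first three addresses stay the same.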

Terraform Modules: Reusable Components

Modules are reusable infrastructure packages. They work like functions: write once, use everywhere!


Module structure:

code
modules/
  ai-platform/
    main.tf        ← Resources
    variables.tf   ← Inputs
    outputs.tf     ← Outputs
    README.md      ← Documentation

Using a module:

hcl
module "ai_staging" {
  source          = "./modules/ai-platform"
  environment     = "staging"
  gpu_count       = 1
  api_replicas    = 2
}

module "ai_production" {
  source          = "./modules/ai-platform"
  environment     = "production"
  gpu_count       = 4
  api_replicas    = 10
}

Same module, different configs: 1 GPU for staging, 4 GPUs for production. No duplicated code! 🧩


Public modules: the Terraform Registry has 10,000+ community modules. You can use ready-made modules for VPC, EKS, RDS, and more!
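
The function analogy can be made literal. In this Python sketch (an analogy only, nothing here calls Terraform), the module is a function, its input variables are parameters, and its outputs are the return value. `ai_platform` and the field names are hypothetical.

```python
def ai_platform(environment, gpu_count, api_replicas):
    """Stand-in for a module: inputs in, a description of resources out."""
    return {
        "bucket": f"ai-models-{environment}",
        "training_nodes": [f"training-{environment}-{i}" for i in range(gpu_count)],
        "api_replicas": api_replicas,
    }


# Two "module calls" with different inputs, like ai_staging and ai_production
staging = ai_platform("staging", gpu_count=1, api_replicas=2)
production = ai_platform("production", gpu_count=4, api_replicas=10)

print(len(staging["training_nodes"]), "GPU node(s) in staging")
print(len(production["training_nodes"]), "GPU node(s) in production")
```

Just as with a function, fixing a bug inside the module fixes it for every environment that calls it.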

State Management: A Critical Topic

⚠️ Warning

The Terraform state file is the most important and most dangerous file! ⚠️

What is state?

- terraform.tfstate: a JSON file that tracks your infrastructure

- Terraform uses it to compare "what exists" with "what should exist"

NEVER do these:

❌ Don't commit the state file to Git. It contains secrets!

❌ Don't edit the state file by hand. It can get corrupted!

❌ Don't let two people run apply at the same time. That causes state conflicts!

❌ Don't delete the state file. Terraform "forgets" everything it manages!
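
To see why committing state is risky, consider this Python sketch. It walks a heavily simplified, made-up state document and lists attributes whose names look sensitive. The nesting (resources, then instances, then attributes) follows the general shape of a state file, but the sample content, the `SENSITIVE_HINTS` list, and `find_sensitive_attributes` are assumptions for illustration.

```python
import json

SENSITIVE_HINTS = ("password", "secret", "token", "private_key")


def find_sensitive_attributes(state_json):
    """List 'type.name.attribute' entries whose attribute name looks sensitive."""
    state = json.loads(state_json)
    hits = []
    for resource in state.get("resources", []):
        for instance in resource.get("instances", []):
            for key in instance.get("attributes", {}):
                if any(hint in key.lower() for hint in SENSITIVE_HINTS):
                    hits.append(f"{resource['type']}.{resource['name']}.{key}")
    return hits


# Simplified, hypothetical state content
sample_state = json.dumps({
    "version": 4,
    "resources": [{
        "type": "aws_db_instance",
        "name": "metadata",
        "instances": [{"attributes": {"engine": "postgres", "password": "hunter2"}}],
    }],
})

print(find_sensitive_attributes(sample_state))
```

The password sits in the JSON as plain text, which is why state belongs in an encrypted remote backend with restricted access rather than in Git.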

Remote state setup (MUST for teams):

hcl
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "ai-platform/terraform.tfstate"
    region         = "ap-south-1"
    dynamodb_table = "terraform-locks"  # Locking!
    encrypt        = true
  }
}

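DynamoDB lock: if two people try to apply at the same time, only one acquires the lock. The other gets a lock error (or waits, if -lock-timeout is set) until the lock is released. This prevents state corruption! Recent Terraform releases can also lock directly in S3 with use_lockfile = true. 🔒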
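
The locking idea can be sketched in a few lines of Python. This is a toy in-memory model, not the DynamoDB API: a lock is acquired only if no entry exists for the state path, which mirrors a conditional write. `LockTable` and its methods are hypothetical.

```python
class LockTable:
    """Toy stand-in for a lock table keyed by state path."""

    def __init__(self):
        self._locks = {}

    def acquire(self, lock_id, owner):
        # Succeeds only if nobody holds the lock (like a conditional put)
        if lock_id in self._locks:
            return False
        self._locks[lock_id] = owner
        return True

    def release(self, lock_id, owner):
        # Only the current holder may release
        if self._locks.get(lock_id) == owner:
            del self._locks[lock_id]
            return True
        return False


table = LockTable()
state_path = "my-terraform-state/ai-platform/terraform.tfstate"

print("alice acquires:", table.acquire(state_path, "alice"))
print("bob acquires:", table.acquire(state_path, "bob"))
table.release(state_path, "alice")
print("bob retries:", table.acquire(state_path, "bob"))
```

The second caller is refused while the first holds the lock and succeeds only after it is released, so two applies never write state at the same time.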

Multi-Environment Setup

Managing separate environments for Dev, Staging, and Production:


Approach 1: Workspaces (Simple)

bash
terraform workspace new dev
terraform workspace new staging
terraform workspace new prod

terraform workspace select prod
terraform apply

Approach 2: Directory Structure (Recommended)

code
environments/
  dev/
    main.tf          → module source = "../../modules/ai-platform"
    terraform.tfvars → gpu_count = 1
  staging/
    main.tf
    terraform.tfvars → gpu_count = 2
  prod/
    main.tf
    terraform.tfvars → gpu_count = 8

Approach 3: Terragrunt (Advanced)

code
# terragrunt.hcl: DRY configuration
terraform {
  source = "../../modules/ai-platform"
}
inputs = {
  environment = "prod"
  gpu_count   = 8
}

Recommendation: start with the directory structure. As the team grows, move to Terragrunt! 📂
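
If the per-environment values live in one place, the variable files for the directory layout can be generated. This Python sketch writes a `terraform.tfvars.json` per environment (Terraform loads a file with that name automatically). The `ENVIRONMENTS` table and `write_tfvars` helper are hypothetical, and the gpu_count values are the ones from the directory tree in Approach 2.

```python
import json
import tempfile
from pathlib import Path

ENVIRONMENTS = {
    "dev": {"environment": "dev", "gpu_count": 1},
    "staging": {"environment": "staging", "gpu_count": 2},
    "prod": {"environment": "prod", "gpu_count": 8},
}


def write_tfvars(root):
    """Write environments/<env>/terraform.tfvars.json under root; return the paths."""
    written = []
    for env, values in ENVIRONMENTS.items():
        env_dir = Path(root) / "environments" / env
        env_dir.mkdir(parents=True, exist_ok=True)
        path = env_dir / "terraform.tfvars.json"
        path.write_text(json.dumps(values, indent=2) + "\n")
        written.append(path)
    return written


# Demo in a throwaway directory so nothing in a real repo is touched
with tempfile.TemporaryDirectory() as tmp:
    for path in write_tfvars(tmp):
        print(path.relative_to(tmp), "->", json.loads(path.read_text()))
```

Keeping the values in one table makes differences between environments easy to review in a single diff.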

IaC Pipeline Architecture

🏗️ Architecture Diagram
┌────────────────────────────────────────────────────────┐
│           INFRASTRUCTURE AS CODE PIPELINE              │
├────────────────────────────────────────────────────────┤
│                                                        │
│  👨‍💻 Developer                                          │
│    │ git push (*.tf files)                             │
│    ▼                                                   │
│  ┌──────────┐    ┌───────────────┐                     │
│  │  GitHub  │───▶│ GitHub Actions│                     │
│  │   PR     │    │    CI/CD      │                     │
│  └──────────┘    └───────┬───────┘                     │
│                     ┌────▼────┐                        │
│                     │ tf init │                        │
│                     │ tf fmt  │                        │
│                     │ tf plan │ ◀── PR Comment         │
│                     └────┬────┘     (plan output)      │
│                          │                             │
│                   ┌──────▼──────┐                      │
│                   │   Review    │ ◀── Team approves    │
│                   │   & Merge   │                      │
│                   └──────┬──────┘                      │
│                          │                             │
│               ┌──────────▼──────────┐                  │
│               │   terraform apply   │                  │
│               └──────────┬──────────┘                  │
│          ┌───────────────┼───────────────┐             │
│          ▼               ▼               ▼             │
│   ┌────────────┐  ┌───────────┐  ┌────────────┐        │
│   │  Dev Infra │  │  Staging  │  │ Production │        │
│   │  (auto)    │  │  (auto)   │  │ (approval) │        │
│   └────────────┘  └───────────┘  └────────────┘        │
│                                                        │
│   📦 State: S3 + DynamoDB Lock                         │
│   🔒 Secrets: HashiCorp Vault / AWS Secrets Manager    │
│                                                        │
└────────────────────────────────────────────────────────┘

Pulumi: A Developer-Friendly Alternative

Terraform requires learning HCL. With Pulumi, you can write IaC in your favourite language!


Pulumi with Python 🐍:

python
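import pulumi
import pulumi_aws as aws

# Create VPC
vpc = aws.ec2.Vpc("ai-vpc",
    cidr_block="10.0.0.0/16")

# Subnet and security group for the training instance
subnet = aws.ec2.Subnet("ai-subnet",
    vpc_id=vpc.id,
    cidr_block="10.0.1.0/24")
sg = aws.ec2.SecurityGroup("ai-sg",
    vpc_id=vpc.id)

# GPU Instance for AI Training
training = aws.ec2.Instance("ai-trainer",
    instance_type="p3.2xlarge",
    ami="ami-deep-learning",  # placeholder: use a real AMI ID for your region
    subnet_id=subnet.id,
    vpc_security_group_ids=[sg.id],
    tags={"Name": "AI-Training-Server"})

# S3 Bucket for Models
models = aws.s3.Bucket("ai-models",
    versioning={"enabled": True})

# Output the IP
pulumi.export("training_ip", training.public_ip)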

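Terraform vs Pulumi:

| Aspect | Terraform | Pulumi |
|---|---|---|
| Language | HCL (custom) | Python, TS, Go, C# |
| Loops/Logic | Limited (count, for_each, conditionals) | Full programming power |
| Testing | terraform test plus external tools | Native unit tests |
| Community | Massive | Growing |
| State | File/S3 | Pulumi Cloud (free tier) or self-managed backend |

When to use Pulumi: if you need complex logic, loops, and conditions, Pulumi is the better fit. For simple infra, Terraform is enough! 🎯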

IaC Security Best Practices

💡 Tip

Enforce infrastructure security at the code level:

🔒 1. No hardcoded secrets

hcl
# ❌ BAD
password = "mysecretpassword"

# ✅ GOOD
password = var.db_password  # From env/vault

🔒 2. Least privilege IAM

hcl
# Give the Terraform service account only the minimum permissions it needs

🔒 3. Encryption everywhere

hcl
# Bucket encryption is its own resource in AWS provider v4 and later
resource "aws_s3_bucket_server_side_encryption_configuration" "models" {
  bucket = aws_s3_bucket.models.id

  rule {
    apply_server_side_encryption_by_default {
      sse_algorithm = "aws:kms"
    }
  }
}

🔒 4. Security scanning

- tfsec: Terraform security scanner

- checkov: policy-as-code scanner

- Add them to your CI pipeline!

🔒 5. State encryption: always encrypt remote state!
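
As a taste of what scanners such as tfsec and checkov automate, here is a deliberately naive Python sketch that flags string literals assigned to secret-looking attribute names in Terraform source. The regular expression, `find_hardcoded_secrets`, and the list of attribute names are assumptions for illustration. Real scanners parse HCL properly and apply far more rules.

```python
import re

# Attribute names that suggest a secret, assigned a quoted literal
SECRET_ASSIGNMENT = re.compile(
    r'^\s*(\w*(?:password|secret|token)\w*)\s*=\s*"[^"$]+"',
    re.IGNORECASE,
)


def find_hardcoded_secrets(tf_source):
    """Return (line_number, attribute) pairs for literal secret assignments."""
    findings = []
    for number, line in enumerate(tf_source.splitlines(), start=1):
        match = SECRET_ASSIGNMENT.match(line)
        if match:
            findings.append((number, match.group(1)))
    return findings


sample = '''
resource "aws_db_instance" "metadata" {
  engine   = "postgres"
  password = "mysecretpassword"
}
resource "aws_db_instance" "safe" {
  password = var.db_password
}
'''

for number, attribute in find_hardcoded_secrets(sample):
    print(f"line {number}: hardcoded value for '{attribute}'")
```

Only the literal assignment is reported. The variable reference passes, which matches the BAD and GOOD examples in point 1.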

Prompt: Design AI Infrastructure

📋 Copy-Paste Prompt
You are a Cloud Infrastructure Architect.

Design Terraform configuration for an AI/ML platform with:
- VPC with public/private subnets across 2 AZs
- GPU instances (p3.2xlarge) for model training with auto-scaling
- ECS Fargate for model inference API
- S3 bucket for model artifacts with versioning
- RDS PostgreSQL for metadata
- CloudWatch monitoring and alerts
- All in ap-south-1 (Mumbai) region

Provide:
1. Complete main.tf, variables.tf, outputs.tf
2. Module structure recommendation
3. Remote state configuration
4. CI/CD pipeline for terraform apply
5. Cost estimation per month

Summary

Key takeaways:


✅ IaC = defining and managing infrastructure in code

✅ Terraform = industry standard, multi-cloud, HCL language

✅ Modules = reusable infrastructure components

✅ State = remote backend + locking is a MUST for teams

✅ Security = no hardcoded secrets, encryption, scanning

✅ Environments = directory structure for dev/staging/prod


Action item: install Terraform, then define an EC2 instance in a main.tf against an AWS Free Tier account. Run terraform apply and feel the magic! ✨


Next article: Monitoring AI Apps, an observability deep dive! 📊

🏁 🎮 Mini Challenge

Challenge: Create a GPU EC2 Instance using Terraform


Define the infrastructure in code, then deploy it with one command! 🏗️


Step 1: Install Terraform 📦

bash
# macOS
brew install terraform

# Verify
terraform version

Step 2: AWS Credentials Setup 🔑

bash
# AWS console → create access key
# ~/.aws/credentials file
[default]
aws_access_key_id = YOUR_KEY
aws_secret_access_key = YOUR_SECRET

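Step 3: Create the Terraform Configuration 📝

hcl
# main.tf
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = "us-east-1"
}

# VPC (network)
resource "aws_vpc" "ai_vpc" {
  cidr_block           = "10.0.0.0/16"
  enable_dns_hostnames = true
  tags = {
    Name = "ai-vpc"
  }
}

# Internet gateway and route so the instance is reachable from outside
resource "aws_internet_gateway" "ai_igw" {
  vpc_id = aws_vpc.ai_vpc.id
}

resource "aws_route_table" "ai_public" {
  vpc_id = aws_vpc.ai_vpc.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.ai_igw.id
  }
}

# Subnet
resource "aws_subnet" "ai_subnet" {
  vpc_id                  = aws_vpc.ai_vpc.id
  cidr_block              = "10.0.1.0/24"
  availability_zone       = "us-east-1a"
  map_public_ip_on_launch = true # needed for the public_ip output below
}

resource "aws_route_table_association" "ai_public" {
  subnet_id      = aws_subnet.ai_subnet.id
  route_table_id = aws_route_table.ai_public.id
}

# Security Group
resource "aws_security_group" "ai_sg" {
  vpc_id = aws_vpc.ai_vpc.id

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["YOUR_IP/32"]
  }

  ingress {
    from_port   = 8000
    to_port     = 8000
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Allow outbound traffic (Terraform does not add a default egress rule)
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}

# EC2 with GPU (g4dn.xlarge)
resource "aws_instance" "ai_gpu" {
  ami                    = "ami-0c55b159cbfafe1f0" # placeholder: AMI IDs are region-specific, use a current Deep Learning AMI
  instance_type          = "g4dn.xlarge"
  subnet_id              = aws_subnet.ai_subnet.id
  vpc_security_group_ids = [aws_security_group.ai_sg.id]
  key_name               = "YOUR_KEY_PAIR" # existing EC2 key pair, used by ssh in Step 5

  tags = {
    Name = "ai-gpu-server"
  }
}

output "instance_ip" {
  value = aws_instance.ai_gpu.public_ip
}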

Step 4: Initialize & Deploy 🚀

bash
# Download the AWS provider
terraform init

# Preview changes
terraform plan

# Deploy!
terraform apply

# See the output
terraform output instance_ip

Step 5: Connect & Run 🔌

bash
# SSH connect
ssh -i key.pem ubuntu@<instance_ip>

# NVIDIA GPU check
nvidia-smi

# Run the AI app!
python train_model.py

Step 6: Destroy (cleanup) 🧹

bash
# When done, remove resources
terraform destroy

# Cost saved! 💰

Completion Time: 2-3 hours

Skills: AWS, Terraform, Infrastructure as Code

Cost: ~$5-10 for usage ⭐

💼 Interview Questions

Q1: Terraform state file: why is it important, and what are the security concerns?

A: The state file is a snapshot of the current infrastructure (which resources exist, their IDs and attributes). Terraform reads the state, compares it with the desired configuration, and plans the changes. That is what makes destroys safe and updates idempotent. Security: sensitive data (passwords, keys) is stored in the state file, so encrypt it and keep it out of version control.


Q2: Terraform modules as reusable code: what is the best-practice structure?

A: A module is a directory with main.tf, variables.tf, and outputs.tf. Input variables allow customization, and outputs can be consumed by other modules. Structure: a root module plus child modules (networking, compute, database). Example: reuse one VPC module across multiple environments.


Q3: How do you manage dev/staging/prod environments in Terraform?

A: Option 1: separate directories (dev/, staging/, prod/), each with its own state. Option 2: workspaces (terraform workspace new prod), the same code with separate state. Option 2 is simpler but needs care (risk of accidentally deleting prod). Recommendation: separate directories for safety, plus variable files (dev.tfvars, prod.tfvars).


Q4: Terraform and version control: should the state file be committed?

A: No! Add the state file to .gitignore and use a remote backend (AWS S3 + DynamoDB lock). For teams this gives shared state (everyone is up to date), and locking prevents conflicts. Back up state locally or through the remote backend, never by pushing it to Git.


Q5: Terraform plan output: what about false-positive warnings?

A: The plan shows the exact changes, so review it carefully! Watch for "forces replacement" (the instance is recreated, data loss is possible). Sensitive outputs are hidden (secrets are not shown). Targets: deploy a specific resource (terraform apply -target=aws_instance.ai_gpu). Destruction dry run: terraform plan -destroy (a safe check before destroy).

Frequently Asked Questions

❓ What is IaC, in simple terms?
Infrastructure as Code means defining cloud resources (servers, databases, networks) in code files. Instead of clicking through a console manually, you write code and run it, and the infrastructure is created automatically. Version control, reuse, and automation all become possible.
❓ Terraform vs Pulumi: which is better?
Terraform is the industry standard, uses the HCL language, and has a massive community. Pulumi lets you use real programming languages (Python, TypeScript). Terraform is recommended for beginners. If you are already a Python expert, try Pulumi.
❓ How long does it take to learn IaC?
Terraform basics: 1 week. A real project setup (VPC, EC2, RDS): 2-3 weeks. Production-grade modules: 1-2 months. Start with AWS Free Tier + Terraform tutorials.
❓ Can't you manage the cloud without IaC?
You can, but it's a nightmare! Manual setup has no consistency, no documentation, and rollback is nearly impossible. Setting up 10 servers by hand vs one terraform apply: which is better? IaC is non-negotiable for serious teams.