
Secure AI coding

Advanced · 15 min read · 📅 Updated: 2026-02-22

Introduction

You've added AI features to your monolith app: login service, payment service, AND AI recommendations, all on one server. One day you update the AI model, and the entire app crashes! 💥


That is the monolith + AI problem. AI features are resource-hungry, need independent scaling, and are updated frequently. With a microservices architecture, each service can live and die independently!


In this article we cover AI-powered microservices architecture: design, communication, deployment, and real-world patterns! 🧩✨

When to Use Microservices for AI

Not every project needs microservices. Here's when they make sense:


| Signal | Monolith OK ✅ | Microservices Needed 🧩 |
|---|---|---|
| **Team Size** | < 5 developers | > 5 developers |
| **AI Models** | 1-2 simple models | 3+ complex models |
| **Scale** | < 10K requests/day | > 100K requests/day |
| **Deploy Frequency** | Weekly | Daily/multiple per day |
| **GPU Needs** | No GPU | GPU required |
| **Model Updates** | Monthly | Weekly/daily |
| **Fault Tolerance** | Some downtime OK | Zero downtime required |

Migration Path:

```
Stage 1: Monolith (everything together)
    ↓
Stage 2: Modular Monolith (AI in separate module)
    ↓
Stage 3: AI as separate service (2 services)
    ↓
Stage 4: Full microservices (5+ services)
```

Rule: Don't start with microservices — grow into them! Premature microservices = premature complexity! 🎯
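Stage 2 is the key stepping stone: put the AI code behind an interface so that Stage 3 becomes a drop-in swap. A minimal sketch (class names and the endpoint are illustrative assumptions, not from any specific framework):

```javascript
// Stage 2: modular monolith — the app talks to an interface,
// not to the model directly. Names here are illustrative.
class InProcessRecommender {
  async recommend(userId, limit) {
    // Local model call while everything is still one process
    return Array.from({ length: limit }, (_, i) => ({ id: `item-${i}`, userId }));
  }
}

// Stage 3 later: same interface, HTTP under the hood
class RemoteRecommender {
  constructor(baseUrl) { this.baseUrl = baseUrl; }
  async recommend(userId, limit) {
    const res = await fetch(`${this.baseUrl}/recommend`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ userId, limit }),
    });
    return res.json();
  }
}

// The rest of the app only ever sees `recommender.recommend(...)`
const recommender = new InProcessRecommender();
```

Swapping `InProcessRecommender` for `RemoteRecommender` is then a one-line change, which is exactly what makes the migration low-risk.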

AI Service Decomposition

🏗️ Architecture Diagram
Decomposing an AI app into microservices:

```
┌────────────────────────────────────────────────────┐
│                   API GATEWAY                       │
│  [Kong/Nginx] — Auth, Rate Limit, Routing          │
└───┬────────┬────────┬────────┬────────┬───────────┘
    │        │        │        │        │
    ▼        ▼        ▼        ▼        ▼
┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐
│ User │ │Search│ │Recom │ │Chat  │ │Noti- │
│ Svc  │ │ Svc  │ │ Svc  │ │ Svc  │ │fica- │
│      │ │      │ │      │ │(LLM) │ │tion  │
│ CRUD │ │Vector│ │ ML   │ │      │ │ Svc  │
│      │ │ DB   │ │Model │ │Stream│ │      │
└──┬───┘ └──┬───┘ └──┬───┘ └──┬───┘ └──┬───┘
   │        │        │        │        │
   ▼        ▼        ▼        ▼        ▼
┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐ ┌──────┐
│Postgr│ │Pine- │ │Redis │ │Anthro│ │Redis │
│  es  │ │cone  │ │Cache │ │pic   │ │Queue │
└──────┘ └──────┘ └──────┘ └──────┘ └──────┘
```

**Service Boundaries (AI-Specific):**
1. **User Service** — Auth, profiles (CPU, low resources)
2. **Search Service** — Vector search, semantic search (GPU optional, vector DB)
3. **Recommendation Service** — ML model inference (GPU, high memory)
4. **Chat Service** — LLM integration, streaming (API calls, WebSocket)
5. **Notification Service** — Async, queue-based (CPU, low resources)

Each service owns its **own database**. No shared DB! 🗄️
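For example, when the Recommendation Service needs user data, it calls the User Service's API instead of reading its tables (the URL, port, and endpoint here are illustrative assumptions):

```javascript
// Recommendation Service: user preferences live in the User Service's DB,
// so we ask its API — we never query that database directly.
async function getUserPreferences(userId, userSvcUrl = 'http://user-svc:3002') {
  const res = await fetch(`${userSvcUrl}/users/${userId}/preferences`);
  if (!res.ok) return { categories: [] };  // degrade gracefully, don't crash
  return res.json();
}
```

The API is the contract; the User Service can change its schema freely as long as the endpoint shape stays stable.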

Inter-Service Communication

Microservices communication patterns for AI:


1. Synchronous (REST/gRPC) — Real-time responses

```javascript
// User Service calls Recommendation Service
// REST
const recommendations = await fetch('http://recommendation-svc:3001/recommend', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ userId: '123', limit: 10 }),
}).then((res) => res.json());

// gRPC (faster for internal services)
const client = new RecommendationClient('recommendation-svc:50051');
const response = await client.getRecommendations({ userId: '123', limit: 10 });
```

2. Asynchronous (Message Queue) — Heavy AI tasks

```javascript
// Producer: API Service
await rabbitMQ.publish('ai-tasks', {
  id: 'task-1',
  type: 'generate-summary',
  documentId: 'doc-456',
  callbackUrl: 'http://api-svc:3000/webhook/summary',
});

// Consumer: AI Service (GPU worker)
rabbitMQ.consume('ai-tasks', async (message) => {
  const summary = await aiModel.summarize(message.documentId);
  await fetch(message.callbackUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ documentId: message.documentId, summary }),
  });
});
```

3. Event-Driven (Pub/Sub) — Reactive updates

```javascript
// When a user action happens — multiple services react
eventBus.publish('user.purchased', { userId: '123', productId: 'abc' });

// Recommendation Service listens
eventBus.subscribe('user.purchased', async (event) => {
  await updateUserPreferences(event.userId, event.productId);
  await retrainModel(event.userId);  // Personalization update
});

// Notification Service also listens
eventBus.subscribe('user.purchased', async (event) => {
  await sendRecommendationEmail(event.userId);
});
```

| Pattern | Latency | Coupling | Best For |
|---|---|---|---|
| **REST** | Low | Tight | Simple CRUD |
| **gRPC** | Very Low | Tight | Internal AI calls |
| **Message Queue** | High | Loose | Heavy AI tasks |
| **Event Bus** | Medium | Very Loose | Reactive updates |

API Gateway for AI Services

A smart API gateway design for AI microservices:


```javascript
// Kong/Custom API Gateway config
const routes = {
  // Route based on request type
  '/api/chat': {
    service: 'chat-service',
    rateLimit: { free: 20, pro: 200 },
    timeout: 60000,        // LLM responses take time
    streaming: true,       // SSE support
  },
  '/api/recommend': {
    service: 'recommendation-service',
    rateLimit: { free: 100, pro: 1000 },
    timeout: 5000,
    cache: { ttl: 1800 },  // Cache 30 min
  },
  '/api/search': {
    service: 'search-service',
    rateLimit: { free: 50, pro: 500 },
    timeout: 3000,
    cache: { ttl: 300 },   // Cache 5 min
  },
  '/api/users': {
    service: 'user-service',
    rateLimit: { free: 200, pro: 2000 },
    timeout: 2000,
  },
};

// Smart routing with fallback
async function routeRequest(req) {
  const route = routes[req.path];

  try {
    const response = await callService(route.service, req);
    return response;
  } catch (error) {
    // Circuit breaker: fall back when the AI service is down
    if (error.code === 'SERVICE_UNAVAILABLE') {
      return getFallbackResponse(req.path);
    }
    throw error;
  }
}
```

Gateway Responsibilities:

  • 🔐 Authentication & Authorization
  • 🚦 Rate Limiting (per tier, per endpoint)
  • 📊 Request Logging & Metrics
  • 🔄 Circuit Breaker & Fallback
  • 📦 Response Caching
  • 🌊 Streaming Support (SSE/WebSocket) for LLMs
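The per-tier `rateLimit` values in the config above need enforcement at the gateway. A minimal in-memory fixed-window sketch (a real gateway would back this with Redis or a built-in plugin; names are illustrative):

```javascript
// Fixed-window rate limiter keyed by user, with per-tier limits.
// In-memory only — fine for a single gateway instance, not a cluster.
class TierRateLimiter {
  constructor(limits, windowMs = 60_000) {
    this.limits = limits;          // e.g. { free: 20, pro: 200 } requests/window
    this.windowMs = windowMs;
    this.counters = new Map();     // userId → { windowStart, count }
  }

  allow(userId, tier, now = Date.now()) {
    const limit = this.limits[tier] ?? 0;
    let entry = this.counters.get(userId);
    if (!entry || now - entry.windowStart >= this.windowMs) {
      entry = { windowStart: now, count: 0 };   // new window: reset the counter
      this.counters.set(userId, entry);
    }
    if (entry.count >= limit) return false;     // over limit → respond 429
    entry.count++;
    return true;
  }
}

// Usage sketch:
// const limiter = new TierRateLimiter({ free: 20, pro: 200 });
// if (!limiter.allow(req.userId, req.tier)) return res.status(429).end();
```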

Circuit Breaker Pattern for AI

AI services will fail. A circuit breaker protects you:


```javascript
class CircuitBreaker {
  constructor(options = {}) {
    this.failureThreshold = options.failureThreshold || 5;
    this.resetTimeout = options.resetTimeout || 30000; // 30s
    this.state = 'CLOSED';   // CLOSED → OPEN → HALF_OPEN
    this.failureCount = 0;
    this.lastFailureTime = null;
  }

  async call(fn) {
    if (this.state === 'OPEN') {
      if (Date.now() - this.lastFailureTime > this.resetTimeout) {
        this.state = 'HALF_OPEN';
      } else {
        throw new Error('Circuit OPEN — using fallback');
      }
    }

    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }

  onSuccess() {
    this.failureCount = 0;
    this.state = 'CLOSED';
  }

  onFailure() {
    this.failureCount++;
    this.lastFailureTime = Date.now();
    if (this.failureCount >= this.failureThreshold) {
      this.state = 'OPEN';
      console.log('🔴 Circuit OPEN! AI service failing.');
    }
  }
}

// Usage
const aiBreaker = new CircuitBreaker({ failureThreshold: 3 });

app.post('/api/recommend', async (req, res) => {
  try {
    const result = await aiBreaker.call(() =>
      recommendationService.getRecommendations(req.body)
    );
    res.json(result);
  } catch (error) {
    // Fallback: return popular items instead of AI recommendations
    const popular = await getPopularItems();
    res.json({ items: popular, source: 'fallback' });
  }
});
```

States:

  • 🟢 CLOSED — Normal operation
  • 🔴 OPEN — AI service down, use fallback
  • 🟡 HALF_OPEN — Testing if service recovered

Even if the AI services crash, the user experience doesn't break! 🛡️
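One handy fallback source is the last successful response. A minimal last-known-good cache sketch that pairs with a circuit breaker (names are illustrative):

```javascript
// Last-known-good cache: serve stale AI results when the live call fails.
class LastGoodCache {
  constructor() { this.store = new Map(); }

  async callWithFallback(key, fn, defaultValue = null) {
    try {
      const result = await fn();
      this.store.set(key, result);              // remember the good result
      return { data: result, source: 'live' };
    } catch {
      if (this.store.has(key)) {
        return { data: this.store.get(key), source: 'stale' };
      }
      return { data: defaultValue, source: 'default' };
    }
  }
}
```

Slightly stale recommendations usually beat an error page; tagging the `source` lets the UI show a "may be outdated" hint.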

Docker Setup for AI Microservices

A Docker Compose setup for AI microservices:


```yaml
# docker-compose.yml
version: '3.8'

services:
  api-gateway:
    build: ./gateway
    ports: ['3000:3000']
    depends_on: [user-svc, recommendation-svc, chat-svc]

  user-svc:
    build: ./services/user
    environment:
      DATABASE_URL: postgres://db:5432/users

  recommendation-svc:
    build: ./services/recommendation
    environment:
      MODEL_PATH: /models/recommender.onnx
      REDIS_URL: redis://cache:6379
    volumes:
      - ./models:/models        # Mount model files
    deploy:
      resources:
        reservations:
          devices:
            - capabilities: [gpu]  # GPU access!

  chat-svc:
    build: ./services/chat
    environment:
      ANTHROPIC_API_KEY: ${ANTHROPIC_API_KEY}
      REDIS_URL: redis://cache:6379

  search-svc:
    build: ./services/search
    environment:
      PINECONE_API_KEY: ${PINECONE_API_KEY}

  # Infrastructure
  db:
    image: postgres:16
    volumes: ['pgdata:/var/lib/postgresql/data']

  cache:
    image: redis:7-alpine
    command: redis-server --maxmemory 256mb --maxmemory-policy allkeys-lru

  queue:
    image: rabbitmq:3-management
    ports: ['15672:15672']  # Management UI

volumes:
  pgdata:
```

AI Service Dockerfile:

```dockerfile
# services/recommendation/Dockerfile
FROM python:3.11-slim

# Install ML dependencies
RUN pip install torch onnxruntime fastapi uvicorn redis

COPY . /app
WORKDIR /app

# Pre-download model
RUN python download_model.py

EXPOSE 8000
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```

One `docker compose up` and the whole AI platform starts! 🐳

Service Mesh for AI Traffic

💡 Tip

An Istio/Linkerd service mesh makes AI traffic easy to manage:

```yaml
# Istio VirtualService — AI traffic splitting
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: recommendation-svc
spec:
  hosts: [recommendation-svc]
  http:
  - route:
    - destination:
        host: recommendation-svc
        subset: v1-current-model
      weight: 90              # 90% old model
    - destination:
        host: recommendation-svc
        subset: v2-new-model
      weight: 10              # 10% new model (canary!)
```

Service Mesh Benefits for AI:

- 🔄 Traffic splitting — A/B test models easily

- 🔒 mTLS — Secure inter-service communication

- 📊 Observability — Automatic metrics, tracing

- 🔁 Retry/Timeout — Automatic retry for AI failures

- 💀 Fault injection — Test AI service failures

It's a complex setup, but worth it for large-scale AI systems! 🎯

Data Consistency Across AI Services

Maintaining data consistency across microservices is challenging:


Saga Pattern for AI Workflows:

```javascript
// Example: AI Content Generation Saga
class ContentGenerationSaga {
  async execute(request) {
    const steps = [];

    try {
      // Step 1: Create content record
      const content = await userService.createContent(request);
      steps.push({ service: 'user', action: 'delete', id: content.id });

      // Step 2: Generate AI content
      const aiContent = await chatService.generate(request.prompt);
      steps.push({ service: 'chat', action: 'cleanup', id: aiContent.id });

      // Step 3: Run safety check
      const safetyResult = await moderationService.check(aiContent.text);
      if (!safetyResult.safe) {
        throw new Error('Content failed safety check');
      }

      // Step 4: Store and index
      await searchService.index(content.id, aiContent.text);

      return { success: true, content: aiContent };

    } catch (error) {
      // Compensating transactions — rollback!
      for (const step of steps.reverse()) {
        await this.compensate(step);
      }
      return { success: false, error: error.message };
    }
  }

  async compensate(step) {
    console.log(`🔄 Rolling back: ${step.service}.${step.action}`);
    // Each service has its own rollback logic
  }
}
```

Event Sourcing for AI audit trail:

```javascript
// Track every AI decision
eventStore.append('ai-decisions', {
  id: 'decision-001',
  type: 'RECOMMENDATION_GENERATED',
  userId: '123',
  model: 'recommender-v2',
  input: features,
  output: recommendations,
  confidence: 0.92,
  timestamp: Date.now(),
});
```

AI decisions should be auditable. Event sourcing helps! 📋
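Because every decision is an immutable event, "why did user 123 get these recommendations?" becomes a replay query. A minimal in-memory sketch (the `eventStore` API shown above is assumed, so this just models it directly):

```javascript
// Minimal append-only event store with an audit query.
class EventStore {
  constructor() { this.streams = new Map(); }

  append(stream, event) {
    if (!this.streams.has(stream)) this.streams.set(stream, []);
    const events = this.streams.get(stream);
    events.push({ ...event, seq: events.length });  // seq preserves ordering
  }

  // Audit: replay all decisions that affected one user
  decisionsFor(stream, userId) {
    return (this.streams.get(stream) ?? []).filter((e) => e.userId === userId);
  }
}

// Usage
const store = new EventStore();
store.append('ai-decisions', { type: 'RECOMMENDATION_GENERATED', userId: '123', model: 'v2' });
store.append('ai-decisions', { type: 'RECOMMENDATION_GENERATED', userId: '456', model: 'v2' });
console.log(store.decisionsFor('ai-decisions', '123').length); // 1
```

A production version would persist to Postgres, Kafka, or EventStoreDB, but the audit query is the same idea: filter the immutable log.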

Testing AI Microservices

Test each service independently:


```javascript
// Contract test — services agree on API shape
// recommendation-svc contract
const contract = {
  request: {
    method: 'POST',
    path: '/recommend',
    body: { userId: 'string', limit: 'number' },
  },
  response: {
    status: 200,
    body: {
      items: [{ id: 'string', score: 'number', title: 'string' }],
      model_version: 'string',
    },
  },
};

// Consumer test (API service tests recommendation contract)
test('recommendation service returns expected shape', async () => {
  const response = await request('http://recommendation-svc:3001')
    .post('/recommend')
    .send({ userId: 'test-user', limit: 5 });

  expect(response.status).toBe(200);
  expect(response.body.items).toBeInstanceOf(Array);
  expect(response.body.items[0]).toHaveProperty('id');
  expect(response.body.items[0]).toHaveProperty('score');
  expect(response.body.model_version).toBeDefined();
});

// Chaos test — AI service down scenario
test('system works when AI service is down', async () => {
  await stopService('recommendation-svc');

  const response = await request('http://api-gateway:3000')
    .get('/api/homepage')
    .expect(200);  // Should still work!

  expect(response.body.recommendations.source).toBe('fallback');
  await startService('recommendation-svc');
});
```

Test Types for AI Microservices:


| Test Type | What It Tests | Tools |
|---|---|---|
| **Unit** | Individual service logic | Jest, pytest |
| **Contract** | Service API agreements | Pact |
| **Integration** | Service interactions | Docker Compose |
| **Chaos** | Failure scenarios | Chaos Monkey |
| **Load** | Performance at scale | k6, Artillery |
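A first load test can be as simple as firing N concurrent requests and checking the p95 latency. A plain-Node sketch (real runs would use k6 or Artillery against the gateway; the target URL below is illustrative):

```javascript
// Fire `concurrency` calls at once and report the p95 latency in ms.
async function loadTest(fn, concurrency = 50) {
  const latencies = await Promise.all(
    Array.from({ length: concurrency }, async () => {
      const start = performance.now();
      await fn();
      return performance.now() - start;
    })
  );
  latencies.sort((a, b) => a - b);                       // ascending
  const p95 = latencies[Math.floor(latencies.length * 0.95)];
  return { count: latencies.length, p95 };
}

// Usage sketch (endpoint and threshold are illustrative):
// const { p95 } = await loadTest(() => fetch('http://api-gateway:3000/api/search?q=test'));
// if (p95 > 3000) throw new Error(`p95 too slow: ${p95}ms`);
```

Percentiles matter more than averages for AI endpoints: one slow model call can hide behind a healthy mean.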

Observability for AI Microservices

In a distributed AI system, you need to know what's happening:


Three Pillars:


1. Logs (What happened)

```javascript
// Structured logging with correlation ID
logger.info({
  correlationId: req.headers['x-correlation-id'],
  service: 'recommendation-svc',
  action: 'predict',
  userId: req.body.userId,
  modelVersion: 'v2.3',
  latencyMs: 145,
  cacheHit: false,
});
```

2. Metrics (How much)

```javascript
// Prometheus metrics
const inferenceLatency = new Histogram({
  name: 'ai_inference_duration_seconds',
  help: 'AI inference latency',
  labelNames: ['model', 'service'],
  buckets: [0.05, 0.1, 0.25, 0.5, 1, 2.5, 5],
});
```

3. Traces (Request journey across services)

```javascript
// OpenTelemetry distributed tracing
const tracer = trace.getTracer('recommendation-svc');

async function getRecommendations(userId) {
  return tracer.startActiveSpan('get-recommendations', async (span) => {
    span.setAttribute('user.id', userId);

    const features = await tracer.startActiveSpan('fetch-features', async (s) => {
      const f = await featureStore.get(userId);
      s.end();
      return f;
    });

    const predictions = await tracer.startActiveSpan('model-predict', async (s) => {
      const p = await model.predict(features);
      s.setAttribute('model.version', 'v2.3');
      s.end();
      return p;
    });

    span.end();
    return predictions;
  });
}
```

With traces, you can immediately see which service is slowing a request down! 🔍

AI Microservices Anti-Patterns

⚠️ Warning

⚠️ Avoid these common mistakes:

Distributed Monolith — Services tightly coupled, can't deploy independently

✅ Fix: Each service own database, own deployment pipeline

Chatty Services — Too many inter-service calls per request

✅ Fix: Batch calls, cache responses, use events instead

Shared AI Model — Multiple services load same model

✅ Fix: Dedicated inference service, other services call it

No Fallback — AI service down = entire app down

✅ Fix: Circuit breaker + fallback for every AI dependency

Synchronous Everything — All AI calls blocking

✅ Fix: Queue heavy tasks, webhook for results

Giant AI Service — One service does NLP + Vision + Recommendations

✅ Fix: Separate by AI domain — one model per service

Remember: Microservices = independent deployability. If you can't deploy one service without touching others — it's NOT microservices! 🎯
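The "Chatty Services" fix above, batching calls, can be sketched as a tiny request coalescer: collect IDs for a few milliseconds, then make one bulk backend call (DataLoader-style; names are illustrative):

```javascript
// Coalesce many load(id) calls into one batched backend call.
class BatchLoader {
  constructor(batchFn, delayMs = 10) {
    this.batchFn = batchFn;      // async (ids) => values in the same order
    this.delayMs = delayMs;
    this.queue = [];             // pending { id, resolve, reject }
    this.timer = null;
  }

  load(id) {
    return new Promise((resolve, reject) => {
      this.queue.push({ id, resolve, reject });
      if (!this.timer) {
        // First call in this window schedules the flush
        this.timer = setTimeout(() => this.flush(), this.delayMs);
      }
    });
  }

  async flush() {
    const batch = this.queue;
    this.queue = [];
    this.timer = null;
    try {
      const values = await this.batchFn(batch.map((item) => item.id));
      batch.forEach((item, i) => item.resolve(values[i]));
    } catch (err) {
      batch.forEach((item) => item.reject(err));
    }
  }
}

// 100 concurrent load() calls within the window → 1 backend call, not 100
```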

Key Takeaways

Threat modeling first — identify threats and understand attack surfaces before adding security features

Strict input validation — validate and sanitize all user inputs; use a whitelist approach to prevent injection attacks

Never hardcode secrets — use environment variables and vaults, rotate keys regularly, and maintain access logs

Enforce authentication + authorization — strong auth mechanisms, fine-grained permissions, secure session management

Prompt injection is a serious threat — in LLM applications, don't pass user inputs to the model directly; use templates, and validate outputs

Data encryption is essential — encrypt data at rest, use TLS/SSL in transit, anonymize sensitive data, enforce strict access controls

Security testing is continuous — schedule OWASP scanning, penetration testing, and security audits regularly; fix findings immediately

Regulatory compliance matters — follow GDPR, data protection laws, and industry standards; maintain audit trails and complete documentation

🏁 Mini Challenge

Challenge: Build Secure AI Application


Develop a security-focused AI application (60 mins):


  1. Threat Model: Identify key security risks (data leak, prompt injection, model poisoning)
  2. Input Validation: Implement and test comprehensive input validation
  3. Secrets Management: Set up environment variables and vault usage (no hardcoded keys!)
  4. Rate Limiting: Implement API rate limiting to protect against DDoS
  5. Encryption: Set up data encryption at rest and in transit
  6. Audit Logging: Log all AI-related actions and set up monitoring
  7. Security Testing: Scan for OWASP Top 10 vulnerabilities and fix them

Tools: OWASP ZAP, Bandit, Snyk, HashiCorp Vault, TLS/SSL


Deliverable: Secure application, security test report, hardening checklist 🔒

Interview Questions

Q1: What is the most critical security risk in AI applications?

A: Prompt injection (user inputs passed directly to the model), data leaks (sensitive info in training data), model poisoning (malicious training data), and API key exposure. Comprehensive input validation + secrets management are critical.


Q2: What is a prompt injection attack? How do you prevent it?

A: Attacker instructions get mixed into the user input that ends up in the AI prompt. Prevention: input validation, prompt templates (not string concatenation), rate limiting, output validation, and handling user input separately.


Q3: How do we ensure training data privacy for AI models?

A: Anonymize sensitive data, use differential privacy techniques, consider federated learning, strictly enforce data access controls, and maintain audit logs. Ensuring GDPR compliance is important.


Q4: What are the best practices for API key management in AI applications?

A: Never hardcode keys; use environment variables, a secrets vault (HashiCorp, AWS Secrets Manager), key rotation, per-key access controls, and audit logging. Exposed keys = security breach.


Q5: What is AI model safety testing — why check for adversarial examples?

A: Critical! Adversarial examples can fool the model. Do robustness testing, adversarial input testing, jailbreak testing (for LLMs), and define and enforce output boundaries.

Frequently Asked Questions

Do AI features need microservices, or is a monolith enough?
For small projects a monolith is fine. But AI inference uses heavy compute — isolating it as a separate service gives you independent scaling, deployment, and fault isolation.
gRPC vs REST for an AI microservice — which is better?
gRPC is better for internal AI services: faster, typed, with streaming support. REST is better for external/public APIs: simple and widely supported. Use a mix.
Do AI microservices need Docker?
Highly recommended! AI models need specific dependencies (CUDA, PyTorch versions). Containerizing with Docker solves the "works on my machine" problem. Orchestrate with Kubernetes.
How do you update an AI model in microservices?
Use blue-green or canary deployment. Deploy the new model version in a new container, shift traffic gradually, and rollback stays easy.
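The gradual traffic shift in that last answer can be sketched as a weighted version picker at the gateway, a toy version of what Istio's route weights do (function and version names are illustrative):

```javascript
// Pick a model version by weight — e.g. 90% v1, 10% v2 canary.
// `rand` is injectable so the choice can be tested deterministically.
function pickVersion(weights, rand = Math.random()) {
  // weights: { v1: 90, v2: 10 } — should sum to 100
  let threshold = rand * 100;
  for (const [version, weight] of Object.entries(weights)) {
    threshold -= weight;
    if (threshold < 0) return version;
  }
  return Object.keys(weights)[0];  // fallback for rounding edge cases
}

// Shifting the canary is just updating the weights:
// { v1: 90, v2: 10 } → { v1: 50, v2: 50 } → { v1: 0, v2: 100 }
```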