
Testing AI-generated code

Intermediate · ⏱ 13 min read · 📅 Updated: 2026-02-17

🧪 Introduction – Why AI Code Needs Extra Testing

AI code looks correct but behaves wrong – that's the danger! 😱


Human-written bugs vs AI-written bugs:


| Aspect | Human Bugs | AI Bugs |
|---|---|---|
| **Visibility** | Often obvious (typos, syntax) | Subtle, looks perfect |
| **Confidence** | Developer doubts own code | AI sounds very confident |
| **Pattern** | Predictable mistakes | Random hallucinations |
| **Edge cases** | Knows their own blind spots | Doesn't know what it doesn't know |
| **Security** | Aware of common vulnerabilities | Often generates insecure code |

The Confidence Problem: 🎭

AI-generated code will be syntactically perfect. It will compile. It will run. And still be logically wrong!


Example: an AI-written sort function sorts correctly... except for negative numbers! Without tests, a bug like this only gets found in production! 🐛


Testing = Your safety net when working with AI code! 🛡️


Study data:

  • 🔴 AI code without tests: 35% bug rate in production
  • 🟢 AI code with proper tests: 5% bug rate in production

7x difference! Testing is non-negotiable! ✅

📋 Testing Strategy for AI Code

AI code needs a special testing strategy:


The Testing Pyramid (AI-adjusted):


| Level | What | Coverage Target | Priority |
|---|---|---|---|
| **Unit Tests** | Individual functions | 80%+ | 🔴 Highest |
| **Integration Tests** | API + DB interactions | Key flows | 🟠 High |
| **Edge Case Tests** | Boundary values, nulls | All inputs | 🔴 Highest |
| **Security Tests** | Injection, XSS, auth | All endpoints | 🟠 High |
| **E2E Tests** | Full user flows | Critical paths | 🟡 Medium |

AI-specific testing additions:

  1. 🔍 Hallucination tests – does the API/method the AI used really exist?
  2. 🧩 Edge case marathon – null, undefined, empty, max, min, negative
  3. 🔒 Security sweep – test every input point
  4. 📊 Output validation – is the AI's output in the expected format?
  5. 🔄 Regression tests – did an AI refactor break old functionality?

Golden rule: AI code gets normal testing + 30% extra edge-case testing! 🏆

🎯 Unit Testing AI Code – The Foundation

Unit tests = Your first line of defense! 🛡️


When asking the AI to write unit tests:

code
Prompt: "Write comprehensive unit tests for this 
function. Include:
- Happy path (normal inputs)
- Edge cases (empty, null, undefined, 0, -1)
- Boundary values (max int, empty string, huge array)
- Error scenarios (invalid input types)
- Use describe/it blocks with clear test names"

Example – Testing a discount calculator:

javascript
describe('calculateDiscount', () => {
  // Happy path
  it('should apply 10% discount for orders over $100', () => {
    expect(calculateDiscount(150, 'SAVE10')).toBe(135);
  });

  // Edge cases
  it('should return 0 for zero amount', () => {
    expect(calculateDiscount(0, 'SAVE10')).toBe(0);
  });

  it('should handle negative amounts', () => {
    expect(() => calculateDiscount(-50, 'SAVE10'))
      .toThrow('Amount must be positive');
  });

  it('should handle null coupon', () => {
    expect(calculateDiscount(100, null)).toBe(100);
  });

  // Boundary values
  it('should handle maximum safe integer', () => {
    expect(calculateDiscount(Number.MAX_SAFE_INTEGER, 'SAVE10'))
      .toBeGreaterThan(0);
  });

  // Invalid input
  it('should throw for string amount', () => {
    expect(() => calculateDiscount('abc', 'SAVE10'))
      .toThrow('Invalid amount');
  });
});

Each test = One specific scenario! 🎯

⚠️ AI-Generated Tests Are Not Enough!

⚠️ Warning

Common problems with AI-generated tests:

1. 🎭 Happy path bias – AI mostly tests the success cases

2. 🔄 Implementation testing – tests the implementation, not the behavior

3. ❌ Tautological tests – copies the code into the test (always passes!)

4. 🧩 Missing edge cases – covers the obvious cases, misses the tricky ones

5. 📝 Weak assertions – toBeDefined() instead of a specific value check

Always supplement AI tests with:

- 🧠 Your domain knowledge – business-logic edge cases

- 🔒 Security scenarios – malicious inputs

- 💥 Chaos testing – what if the DB is down? API timeout?

- 📊 Data boundary tests – empty array, single item, 1 million items

🧩 Edge Case Testing – Where AI Fails Most

Edge cases = the AI's Achilles heel! 🎯


The Edge Case Checklist:


| Category | Test Cases |
|---|---|
| **Null/Undefined** | null, undefined, NaN |
| **Empty** | '', [], {}, 0, false |
| **Boundaries** | MAX_INT, MIN_INT, MAX_SAFE_INTEGER |
| **Strings** | Unicode 🎉, special chars <>&, very long (10K+) |
| **Arrays** | Empty, single item, duplicates, sorted, reversed |
| **Numbers** | 0, -0, Infinity, -Infinity, NaN, floats |
| **Dates** | Leap year, timezone, DST, epoch, far future |
| **Concurrency** | Simultaneous calls, race conditions |

AI prompt for edge cases:

code
"For this function, list ALL possible edge cases 
that could cause bugs. Think about:
- Unusual inputs
- Boundary values  
- Concurrent access
- Resource failures
- Unicode and special characters"

Example – AI missed edge case:

javascript
// AI-generated function
function getAverage(numbers) {
  return numbers.reduce((a, b) => a + b) / numbers.length;
}

// AI forgot: what if numbers = []?
// Result: TypeError! (reduce on an empty array with no initial value throws)

// Fixed version:
function getAverage(numbers) {
  if (!numbers?.length) return 0;
  return numbers.reduce((a, b) => a + b, 0) / numbers.length;
}

Always ask: "What if the input is empty/null/huge/negative?" 🤔

🔗 Integration Testing – AI Code + Your System

AI code works in isolation, but fails in your system! 🔗


Why integration tests matter:

  • AI doesn't know your database schema
  • AI doesn't know your auth system
  • AI doesn't know your error conventions
  • AI doesn't know your API contracts

Integration Test Example:

javascript
describe('User API Integration', () => {
  let testDb;
  
  beforeAll(async () => {
    testDb = await setupTestDatabase();
  });

  afterAll(async () => {
    await testDb.cleanup();
  });

  it('should create user and return with ID', async () => {
    const response = await request(app)
      .post('/api/users')
      .send({ name: 'Test User', email: 'test@example.com' })
      .expect(201);

    expect(response.body).toMatchObject({
      id: expect.any(String),
      name: 'Test User',
      email: 'test@example.com',
      createdAt: expect.any(String)
    });

    // Verify in database
    const dbUser = await testDb.users.findById(response.body.id);
    expect(dbUser).toBeTruthy();
    expect(dbUser.email).toBe('test@example.com');
  });

  it('should reject duplicate email', async () => {
    await request(app)
      .post('/api/users')
      .send({ name: 'User 1', email: 'dup@test.com' })
      .expect(201);

    await request(app)
      .post('/api/users')
      .send({ name: 'User 2', email: 'dup@test.com' })
      .expect(409);  // Conflict
  });
});

Before integrating AI code, write integration tests! 🔗

🎬 Real Scenario: Testing AI-Generated Auth Code

βœ… Example

Testing an AI-generated auth middleware:

javascript
describe('Auth Middleware', () => {
  let nextFn;
  beforeEach(() => { nextFn = jest.fn(); }); // fresh next() mock per test

  // ✅ Valid token
  it('should pass with valid JWT', async () => {
    const token = generateTestToken({ userId: '123' });
    const req = mockRequest({ authorization: `Bearer ${token}` });
    const res = mockResponse();
    
    await authMiddleware(req, res, nextFn);
    expect(req.user.userId).toBe('123');
    expect(nextFn).toHaveBeenCalled();
  });

  // 🔒 Security tests the AI MISSED:
  it('should reject expired token', async () => {
    const token = generateTestToken({ userId: '123' }, '-1h');
    const req = mockRequest({ authorization: `Bearer ${token}` });
    await authMiddleware(req, mockResponse(), nextFn);
    expect(nextFn).not.toHaveBeenCalled();
  });

  it('should reject tampered token', async () => {
    const token = generateTestToken({ userId: '123' }) + 'tampered';
    // ... should reject
  });

  it('should reject token with wrong algorithm', async () => {
    const token = jwt.sign({ userId: '123' }, 'key', { algorithm: 'none' });
    // ... should reject (algorithm confusion attack!)
  });

  it('should handle missing Authorization header', async () => {
    const req = mockRequest({});
    // ... should return 401
  });
});

AI only tests the happy path – YOU must add the security tests! 🔐

πŸ—οΈ Testing Architecture for AI Code

πŸ—οΈ Architecture Diagram
**Complete testing pipeline:**

```
┌─────────────────────────────────────────┐
│       AI GENERATES CODE 🤖              │
└─────────────┬───────────────────────────┘
              │
    ┌─────────▼───────────────┐
    │  STEP 1: AI TESTS       │
    │  "Write tests for this" │
    │  Quick baseline coverage│
    └─────────┬───────────────┘
              │
    ┌─────────▼───────────────┐
    │  STEP 2: EDGE CASES     │
    │  YOU add edge cases     │
    │  null, empty, boundary  │
    │  Unicode, concurrent    │
    └─────────┬───────────────┘
              │
    ┌─────────▼───────────────┐
    │  STEP 3: SECURITY       │
    │  Injection tests        │
    │  Auth bypass tests      │
    │  XSS, CSRF tests        │
    └─────────┬───────────────┘
              │
    ┌─────────▼───────────────┐
    │  STEP 4: INTEGRATION    │
    │  Database tests         │
    │  API contract tests     │
    │  Third-party mocks      │
    └─────────┬───────────────┘
              │
    ┌─────────▼───────────────┐
    │  STEP 5: MUTATION TEST  │
    │  Stryker / mutation     │
    │  "Are tests catching    │
    │   actual bugs?"         │
    └─────────┬───────────────┘
              │
    ┌─────────▼───────────────┐
    │  ✅ SHIP WITH           │
    │     CONFIDENCE! 🚀      │
    └─────────────────────────┘
```

**5 layers of testing = Maximum confidence!** 🛡️

🔒 Security Testing – Non-Negotiable!

Security vulnerabilities show up frequently in AI code! 🔍


Must-test security scenarios:


| Vulnerability | How to Test | AI Miss Rate |
|---|---|---|
| **SQL Injection** | Send `'; DROP TABLE--` | 60% |
| **XSS** | Send `<script>alert(1)</script>` | 50% |
| **Auth Bypass** | Access without token | 40% |
| **IDOR** | Access other user's data | 70% |
| **Path Traversal** | Send `../../etc/passwd` | 55% |
| **Rate Limiting** | 1000 requests/second | 80% |

Security Test Examples:

javascript
describe('Security Tests', () => {
  it('should prevent SQL injection', async () => {
    const maliciousInput = "'; DROP TABLE users; --";
    const response = await request(app)
      .get(`/api/users?search=${maliciousInput}`)
      .expect(200);
    
    // Table should still exist!
    const users = await db.query('SELECT count(*) FROM users');
    expect(users.count).toBeGreaterThan(0);
  });

  it('should sanitize XSS in user input', async () => {
    const xssPayload = '<script>alert("xss")</script>';
    const response = await request(app)
      .post('/api/comments')
      .send({ text: xssPayload });
    
    expect(response.body.text).not.toContain('<script>');
  });

  it('should prevent IDOR', async () => {
    const userAToken = getTokenForUser('userA');
    await request(app)
      .get('/api/users/userB/private-data')
      .set('Authorization', `Bearer ${userAToken}`)
      .expect(403);
  });
});

Write security tests for every API endpoint! 🔐

📊 Test Coverage – Quality over Quantity

Coverage = How much code tests cover 📈


Coverage targets:


| Code Type | Target | Why |
|---|---|---|
| **Business logic** | 90%+ | Core value; bugs here = $$ loss |
| **API handlers** | 85%+ | User-facing, security critical |
| **Utilities** | 80%+ | Shared code, many consumers |
| **UI components** | 60%+ | Snapshot + interaction tests |
| **Config/setup** | Skip | Low value, changes rarely |

Setup coverage tracking:

js
// jest.config.js (Vitest has a similar coverage-thresholds option)
{
  "coverageThreshold": {
    "global": {
      "branches": 80,
      "functions": 80,
      "lines": 80,
      "statements": 80
    }
  }
}

Coverage != Quality! ⚠️

javascript
// 100% coverage but USELESS test:
it('should work', () => {
  const result = calculateTax(100);
  expect(result).toBeDefined(); // 😀 What value??
});

// Lower coverage but VALUABLE test:
it('should calculate 18% GST correctly', () => {
  expect(calculateTax(100)).toBe(118);
  expect(calculateTax(0)).toBe(0);
  expect(calculateTax(999.99)).toBe(1179.99);
});

Strong assertions > high coverage! 💪

🔄 Mutation Testing – Are Your Tests Real?

Mutation testing = tests for your tests! 🧬


Concept: make small changes (mutations) to the code. If the tests don't catch them, the tests are weak!


How it works:

code
Original:  if (age >= 18) return true;
Mutation:  if (age >  18) return true;  // Changed >= to >
           if (age <= 18) return true;  // Changed >= to <=
           if (age >= 18) return false; // Changed return value

If tests still pass → tests are WEAK! 🚨
If tests fail → tests caught the mutation ✅

Setup Stryker (JavaScript):

bash
npm install --save-dev @stryker-mutator/core
npx stryker init
npx stryker run

Mutation Score:

| Score | Quality | Action |
|---|---|---|
| **90%+** | Excellent | Ship with confidence! |
| **80-90%** | Good | Review surviving mutants |
| **60-80%** | Needs work | Add more edge-case tests |
| **< 60%** | Weak tests | Major test improvement needed |

AI-generated tests typically score 50-65% – that's why you add your own! 🎯


Pro tip: show the AI the mutation test results and say, "Write tests to kill these surviving mutants"! 🤖

🤖 AI-Powered Test Generation Workflow

Best workflow for AI test generation:


Step 1: Generate baseline 🎯

code
"Write unit tests for this function. 
Cover happy path and basic error cases."

Step 2: Request edge cases 🧩

code
"Now add edge case tests: null inputs, 
empty arrays, boundary values, Unicode strings, 
concurrent calls."

Step 3: Request security tests 🔒

code
"Add security-focused tests: injection attempts, 
auth bypass, malicious inputs, XSS payloads."

Step 4: Review & enhance 👀

  • Read the AI's tests
  • Strengthen weak assertions
  • Add missing scenarios
  • Write business-logic tests

Step 5: Run mutation testing 🧬

bash
npx stryker run

  • Identify the surviving mutants
  • Fill those gaps

Coverage progression:

| Step | Coverage | Mutation Score |
|---|---|---|
| AI baseline | ~60% | ~50% |
| + Edge cases | ~75% | ~65% |
| + Security | ~80% | ~72% |
| + Your additions | ~85% | ~82% |
| + Mutation fixes | ~88% | ~90% |

Incremental improvement! Each step increases quality! 📈

💡 Test-Driven Development with AI

💡 Tip

TDD + AI = Powerful combo! 🏆

Workflow:

1. 📝 YOU write the test first (define expected behavior)

2. 🤖 AI writes the implementation (to pass your test)

3. 🔍 You review AI's implementation

4. 🔄 Refactor together

Why this works:

- Tests define YOUR requirements – AI can't miss them

- AI implements to pass your tests – focused output

- You control the quality through test design

- No "AI hallucination" problem – test catches it!

Example:

code
You: "I wrote these 15 tests for a password 
validator. Write the implementation that 
passes all tests."

AI will write exactly what you need – no more, no less! 🎯

✅ Key Takeaways

✅ AI code needs EXTRA testing – AI code carries a ~35% production bug rate; proper testing can cut it to 5%


✅ Unit tests + edge cases are essential – AI tests the happy path; YOU add null, empty, boundary, and concurrent-access cases


✅ Edge-case testing is critical – the AI's Achilles heel – test Unicode, MAX_INT, empty arrays, special characters, all of it


✅ Write integration tests separately – isolated unit-test success ≠ a guarantee that system-level integration works


✅ Security tests are non-negotiable – injection, XSS, auth bypass – never assume AI-generated code is secure


✅ Target 80%+ coverage – 100% is unnecessary; quality matters – strong assertions > high coverage numbers


✅ Verify with mutation testing – AI-generated tests can be weak – verify that your tests fail when the code is changed


✅ TDD + AI is powerful – write tests first, let AI implement – requirements stay clear and the AI follows them exactly

🏁 Mini Challenge

Challenge: Achieve 90%+ Code Coverage on AI Code


Get a real-world component to 90%+ coverage (50 mins):


  1. Generate Code: Ask the AI to implement a feature (login, payment, upload)
  2. Analyze: Run a coverage report and identify the gaps
  3. Edge Cases: Brainstorm and list the missing edge cases
  4. Write Tests: Write unit tests + edge cases + security tests
  5. Coverage: Achieve and verify 90%+ coverage
  6. Mutation Testing: Run Stryker and validate test quality
  7. Document: Document your test strategy + coverage report

Tools: Jest, Postman, nyc/istanbul for coverage, Stryker for mutation testing


Success Criteria: 90%+ coverage, all edge cases covered, mutation score > 80% 🎯

Interview Questions

Q1: AI-generated code testing strategy – manual vs AI-generated tests?

A: AI-generated tests are a good baseline, but you must add human judgment. Edge cases, security, and business-logic validation need human expertise. Ideal: AI baseline tests + human edge-case tests.


Q2: Should you aim for 100% code coverage?

A: No, it's not necessary! 100% coverage gives a false sense of security. 80-90% is a good target – focus on critical paths, complex logic, and security-sensitive code.


Q3: Is a TDD approach useful for testing AI code?

A: Extremely useful! Write the tests first, then have the AI generate the implementation. The tests define the requirements clearly, and the AI follows the specification exactly. Controlling test quality is the key to success.


Q4: Is performance testing important for AI code?

A: Yes, critical! AI-generated code is often inefficient – N+1 queries and memory leaks are possible. Establish a benchmark baseline, check the AI code's performance against it, and optimize.


Q5: What priority does security testing get in the overall strategy?

A: Highest priority! Security bugs in AI code have serious implications and create production vulnerabilities. Dependency scanning, input-validation tests, and authentication/authorization tests are mandatory for all AI code.

🚀 Next Steps – Ship AI Code with Confidence

Testing AI code = Professional discipline! 🏆


The Testing Formula:

code
AI Tests (baseline) + Edge Cases (you) +
Security (you) + Integration (you) =
Ship with Confidence! 🚀

Key mindset shift:

  • ❌ "AI wrote it, it probably works"
  • ✅ "AI wrote it, let me prove it works"

Time investment: 20-30% extra development time

Return: 7x fewer production bugs, 3x faster debugging, peaceful sleep! 😴


Remember: Untested AI code is a ticking time bomb 💣. Test it, prove it, ship it! 🚀

🧠 Knowledge Check

What is the most common problem with AI-generated tests?
0 of 1 answered