Kafka basics
Introduction
Think of a post office: it collects letters from an area, sorts them, and delivers them to different areas. Apache Kafka is the same concept, but for data: a digital post office.
LinkedIn created it in 2011, and today they process over 7 trillion messages per day with it. Now Uber, Netflix, Airbnb, and more than 80% of Fortune 500 companies use Kafka.
Why Kafka is everywhere:
- Handles millions of messages per second
- Messages persist on disk (for days or weeks)
- Horizontally scalable: just add more brokers
- Multiple consumers can read the same data
- Fault tolerant: data stays safe even if a broker crashes
Kafka is not just a tool: it's the central nervous system of modern data platforms. Let's understand how.
Core Concepts
Kafka's key building blocks:
1. Topic
- A category or feed name: like a folder for messages
- Examples: "orders", "payments", "user-events"
- Producers write to topics, consumers read from topics
2. Partition
- Each topic is subdivided into partitions
- Enables parallelism: more partitions = more throughput
- Messages within a partition are ordered (guaranteed!)
- Across partitions? No ordering guarantee
3. Broker
- A Kafka server instance
- Cluster = multiple brokers (typically 3+)
- Each broker stores some of the partitions
4. Producer
- An application that writes messages to topics
- Decides which partition each message goes to
5. Consumer
- An application that reads messages from topics
- Tracks its position (offset) in each partition
6. Consumer Group
- A group of consumers that work together
- Each partition is assigned to exactly ONE consumer in the group
- Enables parallel processing!
| Concept | Analogy | Purpose |
|---|---|---|
| Topic | TV Channel | Categorize messages |
| Partition | Lanes on a highway | Parallelism |
| Broker | Post office branch | Store & serve |
| Producer | News reporter | Send messages |
| Consumer | TV viewer | Read messages |
| Consumer Group | Family watching together | Load balance |
Kafka Architecture
```
                 KAFKA CLUSTER

 PRODUCERS         BROKERS                  CONSUMERS
 App 1 ──▶  Broker 1: orders [P0][P1] ──▶  Group A: C1, C2
 App 2 ──▶  Broker 2: orders [P2][P3] ──▶  Group B: C3
 App 3 ──▶  Broker 3: (replicas)

             ZooKeeper / KRaft
       (cluster metadata management)
```
Partitions: The Secret to Scale
Partitions are Kafka's superpower!
How partitioning works:
The partition key decides which partition a message lands in.
Why the partition key matters:
- Same key = same partition = ordering guaranteed for that key
- User U123's orders always arrive in order
- Different users are processed in parallel
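The routing rule can be sketched in Python. Kafka's default partitioner actually uses a murmur2 hash; CRC32 here is a stand-in just to keep the sketch dependency-free:

```python
import zlib

NUM_PARTITIONS = 4

def pick_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Simplified partitioner: Kafka really uses murmur2, but any stable
    hash shows the idea: same key -> same partition, every time."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

# All of user U123's orders land in one partition, so they stay ordered:
assert pick_partition("U123") == pick_partition("U123")

# Different users spread across partitions and get processed in parallel;
# with enough keys this typically covers all partitions:
partitions = {pick_partition(f"U{i}") for i in range(100)}
print(sorted(partitions))
```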
How many partitions?
- Rule of thumb: target throughput / single-consumer throughput
- Need 100 MB/s and each consumer handles 10 MB/s → at least 10 partitions
- More partitions = more parallelism, but more overhead too
- Start with 6-12, increase as needed
Warning: you can increase the partition count, but you can never decrease it! Plan carefully.
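The rule of thumb above as a quick calculation (the 2x growth buffer is a common convention, not a Kafka setting):

```python
import math

def min_partitions(target_mb_s: float, per_consumer_mb_s: float,
                   growth_buffer: float = 2.0) -> int:
    """Target throughput / per-consumer throughput, with a growth buffer,
    since you cannot shrink the partition count later."""
    base = math.ceil(target_mb_s / per_consumer_mb_s)
    return math.ceil(base * growth_buffer)

print(min_partitions(100, 10, growth_buffer=1.0))  # bare minimum: 10
print(min_partitions(100, 10))                     # with 2x buffer: 20
```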
Producer Configuration: Get It Right!
Producer configuration is critical in production:
acks setting: the most important config!
- acks=0: don't wait for broker acknowledgment (fastest, risky)
- acks=1: the leader acknowledges (good balance)
- acks=all: all in-sync replicas acknowledge (safest, slower)
Batching: don't send one message at a time!
Compression: reduce network and disk usage.
Idempotent producer: prevent duplicates.
Retries: for transient network blips.
Golden rule: Financial data? acks=all + enable.idempotence=true. Metrics/logs? acks=1 + compression is enough!
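The settings above map to confluent-kafka (librdkafka) config keys roughly like this; the values are illustrative assumptions, tune them to your workload:

```python
# Safe defaults for important data (e.g. payments); plain dicts,
# ready to pass to confluent_kafka.Producer(...).
safe_producer_conf = {
    "bootstrap.servers": "localhost:9092",  # assumption: local broker
    "acks": "all",                # all in-sync replicas must ack
    "enable.idempotence": True,   # broker de-duplicates retried sends
    "compression.type": "lz4",    # cut network and disk usage
    "linger.ms": 10,              # wait up to 10 ms to fill a batch
    "batch.size": 64_000,         # batch up to ~64 KB per partition
    "retries": 5,                 # survive transient network blips
}

# Looser profile for metrics/logs: leader ack is enough, skip idempotence.
fast_metrics_conf = {**safe_producer_conf,
                     "acks": "1",
                     "enable.idempotence": False}
```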
Consumer Groups: Parallel Processing
Consumer groups give Kafka its load balancing:
Scenario: an "orders" topic with 4 partitions
- 1 consumer in the group: it reads all 4 partitions
- 2 consumers in the group: 2 partitions each
- 4 consumers in the group: 1 partition each (maximum parallelism)
- 5 consumers in the group: 1 partition each, and the fifth sits idle
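A toy simulation of those scenarios (real Kafka uses range/round-robin/sticky assignors, but the invariant is the same: within a group, each partition goes to exactly one consumer):

```python
def assign_round_robin(partitions: list[int], consumers: list[str]) -> dict:
    """Toy group assignment: each partition goes to exactly one consumer;
    consumers beyond the partition count end up with an empty list (idle)."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment

parts = [0, 1, 2, 3]  # the "orders" topic with 4 partitions
print(assign_round_robin(parts, ["C1"]))                          # C1 gets all 4
print(assign_round_robin(parts, ["C1", "C2"]))                    # 2 each
print(assign_round_robin(parts, ["C1", "C2", "C3", "C4", "C5"]))  # C5 idle
```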
Key rules:
- Max useful consumers = partition count
- Multiple consumer groups = multiple independent readers
- Group A reads for analytics, Group B reads for ML: both get ALL messages
- A consumer crashed? Rebalance: the remaining consumers take over its partitions
Offset management:
- Each consumer tracks an offset (its position in each partition)
- Committed offset = "I've processed up to here"
- After a crash, it restarts from the last committed offset: no data loss!
Kafka with Python: Hands On
You can use Kafka from Python with the confluent-kafka library:
Producer:
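A minimal producer sketch with confluent-kafka. The broker address and topic name are assumptions; the connecting code is guarded behind a KAFKA_BOOTSTRAP environment variable so the file can be read or imported without a running broker:

```python
import json
import os

# Sample event we'll publish; keyed by user_id so one user's orders
# stay in a single partition and therefore remain ordered.
order = {"order_id": "O-1001", "user_id": "U123", "amount": 499.0}

def main():
    from confluent_kafka import Producer

    p = Producer({"bootstrap.servers": "localhost:9092", "acks": "all"})

    def on_delivery(err, msg):
        # Invoked from poll()/flush() once the broker acks (or fails) the send.
        if err is not None:
            print(f"delivery failed: {err}")
        else:
            print(f"delivered to {msg.topic()}[{msg.partition()}] @ {msg.offset()}")

    p.produce("orders", key=order["user_id"],
              value=json.dumps(order), callback=on_delivery)
    p.poll(0)   # serve delivery callbacks
    p.flush()   # block until all outstanding messages are acked

if __name__ == "__main__" and os.getenv("KAFKA_BOOTSTRAP"):
    main()
```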
Consumer:
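And a matching consumer sketch, with manual offset commits so "processed up to here" is only recorded after the message is actually handled (same KAFKA_BOOTSTRAP guard, same assumed broker and topic):

```python
import json
import os

conf = {
    "bootstrap.servers": "localhost:9092",
    "group.id": "order-processor",    # consumers sharing this id split the partitions
    "auto.offset.reset": "earliest",  # on first run, start from the beginning
    "enable.auto.commit": False,      # commit only after we processed the message
}

def main():
    from confluent_kafka import Consumer

    c = Consumer(conf)
    c.subscribe(["orders"])
    try:
        while True:
            msg = c.poll(1.0)          # wait up to 1 s for a message
            if msg is None:
                continue
            if msg.error():
                print(f"consumer error: {msg.error()}")
                continue
            event = json.loads(msg.value())
            print(f"partition {msg.partition()} offset {msg.offset()}: {event}")
            c.commit(message=msg)      # "I've processed up to here"
    finally:
        c.close()

if __name__ == "__main__" and os.getenv("KAFKA_BOOTSTRAP"):
    main()
```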
Start simple! Run Kafka in Docker and practice.
Replication: Data Safety
A broker crash must not lose data: replication saves us.
Replication Factor = 3 (standard production config)
How it works:
- The producer writes to the leader
- The leader replicates to the followers
- Followers that have caught up are in the ISR (In-Sync Replica) set
- If the leader crashes, a new leader is elected from the ISR
ISR (In-Sync Replicas):
- A follower that has caught up with the leader's data is "in sync"
- min.insync.replicas=2: at least 2 replicas must be in sync
- acks=all + min.insync.replicas=2 = data loss is almost impossible!
Scenario: Broker 1 crashes. Its partition leaders fail over to in-sync followers on the surviving brokers, and clients reconnect automatically.
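A toy check of the durability rule above: with acks=all, writes keep succeeding only while the ISR still has at least min.insync.replicas members:

```python
MIN_INSYNC_REPLICAS = 2

def write_allowed(isr: set, min_isr: int = MIN_INSYNC_REPLICAS) -> bool:
    """With acks=all, the broker rejects writes once the ISR shrinks
    below min.insync.replicas."""
    return len(isr) >= min_isr

isr = {"broker1", "broker2", "broker3"}  # replication factor 3, all in sync
assert write_allowed(isr)

isr.discard("broker1")                   # broker 1 crashes
assert write_allowed(isr)                # still 2 in sync: writes continue

isr.discard("broker2")                   # a second failure
assert not write_allowed(isr)            # acks=all producers now get errors
```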
Production config recommendation:
| Setting | Value | Why |
|---|---|---|
| replication.factor | 3 | Standard safety |
| min.insync.replicas | 2 | Tolerate 1 broker down |
| acks | all | No data loss |
ZooKeeper → KRaft Migration
Warning: ZooKeeper is being deprecated! On Kafka 3.5+, use KRaft mode.
ZooKeeper problems:
- A separate system to maintain: operational overhead
- Scaling limitations with large clusters
- Slow metadata operations
KRaft (Kafka Raft) benefits:
- No separate ZooKeeper cluster needed
- Faster controller failover (seconds vs minutes)
- Simplified operations
- Better scalability (millions of partitions)
Migration path:
1. Install Kafka 3.5+
2. Start new clusters in KRaft mode
3. Gradually migrate existing clusters
For new projects: always use KRaft mode! Drop the ZooKeeper dependency.
Real-World Kafka Use Cases
How companies use Kafka:
LinkedIn (Kafka creators):
- 7 trillion messages/day
- Activity tracking, metrics, logs
- 100+ Kafka clusters
Netflix:
- 1 trillion+ events/day
- Viewing activity → recommendation engine
- Real-time A/B testing results
Uber:
- Trip events, driver location, pricing
- Kafka → Flink → real-time features
- 1000+ microservices communicate through Kafka
Spotify:
- User listening events → Discover Weekly
- 500 billion events/day
- Real-time music recommendations
Common patterns across companies:
1. Event sourcing: all state changes as events
2. Microservice communication: async message passing
3. Real-time analytics: live dashboards
4. ML feature pipelines: fresh features for models
5. Audit logs: compliance and debugging
Kafka Connect: Easy Integrations
Move data without writing custom code: Kafka Connect!
What is it?
- Pre-built connectors for databases, file systems, cloud services
- No coding needed: just configuration!
Source connectors (data → Kafka): e.g., Debezium streaming MySQL changes into a topic.
Sink connectors (Kafka → data): e.g., an S3 sink archiving topics to object storage.
Popular Connectors:
| Connector | Direction | Use Case |
|---|---|---|
| Debezium MySQL | Source | CDC from MySQL |
| JDBC | Source/Sink | Any SQL database |
| S3 | Sink | Archive to S3 |
| Elasticsearch | Sink | Search index |
| BigQuery | Sink | Analytics warehouse |
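Connector configs are plain JSON posted to the Connect REST API; here is one as a Python dict. Hostnames, credentials, and table names are placeholders, and a real Debezium deployment needs a few more settings (e.g., the schema-history topic):

```python
import json

# Hypothetical Debezium MySQL source: stream changes from shop.orders
# into Kafka topics prefixed "shop". All values below are placeholders.
debezium_source = {
    "name": "mysql-orders-source",
    "config": {
        "connector.class": "io.debezium.connector.mysql.MySqlConnector",
        "database.hostname": "mysql",
        "database.port": "3306",
        "database.user": "debezium",
        "database.password": "change-me",
        "database.server.id": "184054",
        "topic.prefix": "shop",
        "table.include.list": "shop.orders",
    },
}

# This JSON body is what you'd POST to the Connect REST API:
print(json.dumps(debezium_source, indent=2))
```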
Kafka Connect = no-code data integration!
Kafka Monitoring Essentials
You have to monitor a production Kafka cluster; the key metrics:
Broker Metrics:
- Under-replicated partitions: should be 0; anything > 0 is a problem!
- Active controller count: should be exactly 1
- Request latency: produce/fetch request time
- Disk usage: driven by the retention policy
Producer Metrics:
- Record send rate: messages per second
- Record error rate: failed sends
- Average batch size: batching efficiency
Consumer Metrics:
- Consumer lag: the MOST IMPORTANT one! Messages behind = lag
- Fetch rate: consumption speed
- Commit rate: offset commit frequency
Consumer lag formula: lag = latest offset (log-end offset) - committed consumer offset, per partition.
Lag increasing = consumer slow or stuck! It needs immediate attention.
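The formula in code, per partition and summed for the whole group (offsets here are made-up example numbers):

```python
def consumer_lag(log_end_offsets: dict, committed_offsets: dict) -> dict:
    """Per-partition lag: log-end offset (latest) minus committed offset."""
    return {p: log_end_offsets[p] - committed_offsets.get(p, 0)
            for p in log_end_offsets}

log_end = {0: 1_500, 1: 1_420, 2: 1_510}   # latest offset per partition
committed = {0: 1_500, 1: 1_400, 2: 900}   # where the group has committed

per_partition = consumer_lag(log_end, committed)
print(per_partition)                 # {0: 0, 1: 20, 2: 610}
print(sum(per_partition.values()))   # total lag 630: partition 2 needs attention
```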
Tools:
- Kafka UI (open source): cluster visualization
- Burrow: consumer lag monitoring
- Grafana + Prometheus: dashboards
- Confluent Control Center: commercial
Prompt: Local Kafka Setup: ask your AI assistant for a docker-compose file that runs a single-node Kafka broker in KRaft mode for local practice.
Kafka Best Practices
Essential practices for production Kafka:
- Replication factor 3: the minimum for production
- Use Avro/Protobuf: JSON is slow at scale; use a schema registry
- Pick partition keys wisely: avoid hot partitions (don't use country as the key!)
- Monitor consumer lag: set up alerting; lag > threshold = page someone
- Retention policy: match it to the business need (7 days is common)
- Compaction: enable log compaction for state topics
- Security: enable SASL/SSL and configure ACLs
- Capacity planning: disk space = throughput × retention × replication factor
- Test failover: restart a broker monthly and verify recovery
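The capacity-planning rule from the checklist as quick arithmetic (the example numbers are assumptions; plug in your own):

```python
def required_disk_tb(throughput_mb_s: float, retention_days: int,
                     replication_factor: int) -> float:
    """Disk space = throughput x retention x replication factor."""
    seconds = retention_days * 24 * 60 * 60
    total_mb = throughput_mb_s * seconds * replication_factor
    return total_mb / 1_000_000  # MB -> TB (decimal units)

# 50 MB/s sustained, 7 days retention, replication factor 3:
print(round(required_disk_tb(50, 7, 3), 2))  # 90.72 TB, before compression
```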
Master Kafka and you'll be unstoppable in real-time data engineering!
Next: Airflow orchestration, for scheduling and managing pipelines!
Key Takeaways
- Kafka fundamentals: a distributed event streaming platform. Topics (categories), partitions (parallelism), brokers (servers), producers (write), consumers (read)
- Topics & partitions: topics categorize messages; partitions let them be processed in parallel. Messages in the same partition are ordered; across partitions there is no ordering guarantee. Partition count affects throughput
- Consumer groups: multiple consumers working together. Each partition goes to at most one consumer per group. Multiple groups read independently. Rebalancing is automatic
- Durability & replication: replication factor 3 is the production standard. Leader + in-sync followers. Leader crash → a follower is elected. Protection against data loss
- Producer configuration: acks=all (safest, slower), acks=1 (good balance), acks=0 (fastest, risky). Set up batching, compression, idempotence. Financial data: acks=all
- Offset management: consumers track their position. Committed offset = "processed up to here". Crash → resume from the last offset. No manual implementation needed; Kafka handles it
- Kafka Streams: a lightweight processing library. Topology building, stateless + stateful operations. Perfect for small-scale real-time processing
- Monitoring essentials: consumer lag (the critical metric), under-replicated partitions, controller status. Burrow is dedicated consumer-lag monitoring. Lag increasing → escalate
Mini Challenge
Challenge: Kafka topic + producer + consumer
A complete Kafka workflow, hands-on:
Step 1: Create a topic (5 min): an "orders" topic with 3 partitions.
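A sketch of step 1 using confluent-kafka's AdminClient, guarded behind a KAFKA_BOOTSTRAP environment variable so it only connects when a broker is actually running (replication factor 1 is for local single-broker practice only):

```python
import os

# Topic spec for the challenge: 3 partitions so a group can load-balance.
topic_spec = {"name": "orders", "partitions": 3, "replication": 1}

def create_topic():
    from confluent_kafka.admin import AdminClient, NewTopic

    admin = AdminClient({"bootstrap.servers": "localhost:9092"})
    futures = admin.create_topics([
        NewTopic(topic_spec["name"],
                 num_partitions=topic_spec["partitions"],
                 replication_factor=topic_spec["replication"])
    ])
    for topic, fut in futures.items():
        fut.result()  # raises if creation failed
        print(f"created topic {topic}")

if os.getenv("KAFKA_BOOTSTRAP"):
    create_topic()
```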
Step 2: Producer (10 min): send some order events keyed by user_id.
Step 3: Consumer group 1 (5 min): consume the topic with one consumer group.
Step 4: Consumer group 2 (5 min):
- Same code, but with group.id='order-processor-2' (a different group)
- Both groups receive the SAME messages, independently!
Observe:
- Consumer lag (monitor Kafka lag)
- The same messages reaching different consumers
- Partition assignment (3 partitions, load balanced)
Learning: Kafka's flexibility: multiple consumers, persistence, replay capability!
Interview Questions
Q1: Kafka's magic: how does it differ from a traditional queue?
A: Traditional queue: a message is deleted once consumed. Kafka: messages persist on disk (days/weeks). Multiple consumers can read the same message. Offsets are tracked. Replay is possible! Scalability, durability, flexibility: Kafka's advantages. But complexity increases.
Q2: How do you choose the number of partitions?
A: Throughput calculation first: need 1M messages/sec and a single partition handles 100K/sec → minimum 10 partitions. Then add a 2-3x buffer for growth. Is the partition distribution balanced? Check leadership. Too many = overhead, too few = throughput bottleneck. Monitor and adjust!
Q3: Consumer lag: how do you set up production alerts?
A: The CRITICAL metric! Lag increasing → consumers are slow. Alert when lag > threshold (e.g., an hour's worth of messages). Use Burrow (a monitoring tool). Investigate: consumer slow, network issue, partitions imbalanced? Fix: restart the consumer, scale up, optimize code.
Q4: Producer reliability: the acks configuration?
A: acks=0 (fastest, risky), acks=1 (leader confirms), acks=all (all replicas confirm = safest, slower). Financial data? acks=all is mandatory! Metrics/logs? acks=1 is fine. Trade-off: speed vs safety. Batch size, compression, retries: tune all producer config to the use case!
Q5: ZooKeeper → KRaft migration: business impact?
A: KRaft is better (simpler operations). But migration is complex: downtime risk. Strategy: new clusters in KRaft mode, migrate existing ones gradually. Kafka 3.5+ is already KRaft-ready. You gain performance and operational simplicity. Timeline: by 2026-27 the industry will have mostly adopted KRaft!
Frequently Asked Questions
A consumer group has 3 consumers, but the topic has only 2 partitions. What happens?