Retry, Backoff & Dead Letter Topic (DLT)
1. Big Picture (End-to-End Flow)
Producer
↓ (producer retry + backoff)
Kafka Topic
↓
Consumer
↓ (consumer retry + backoff)
Success ✅ [ OR ] DLT ❌
2. Retry — Two Types
2.1 Producer Retry
Retry when sending message to Kafka fails.
When
Producer → Kafka fails (broker down / network issue / timeout)
Config (Spring Boot)
spring:
  kafka:
    producer:
      acks: all
      retries: 3
      properties:
        retry.backoff.ms: 1000
        enable.idempotence: true
Flow
Send → fail → wait → retry → success
Key Points
- Happens on the producer side, BEFORE the broker acknowledges the write
- Controlled by Kafka client
- Risk: duplicate messages (if idempotence disabled)
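The Spring Boot keys above correspond to plain Kafka producer property names. The sketch below collects them in a `java.util.Properties` object so it runs without any Kafka dependency; the class name `ProducerRetryProps` is hypothetical, and the values simply mirror the config shown earlier.

```java
import java.util.Properties;

final class ProducerRetryProps {
    // Builds the retry-related settings using raw Kafka client property names.
    static Properties build() {
        Properties props = new Properties();
        props.put("acks", "all");                 // wait for all in-sync replicas
        props.put("retries", "3");                // resend up to 3 times on failure
        props.put("retry.backoff.ms", "1000");    // wait 1s between resend attempts
        props.put("enable.idempotence", "true");  // broker drops duplicate resends
        return props;
    }
}
```

In a real application these entries would sit alongside `bootstrap.servers` and the serializer settings before being handed to the producer.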
2.2 Consumer Retry
Retry when message processing fails.
When
Kafka → Consumer → DB/API fails
Flow
Read → process → fail → retry → success OR DLT
3. Backoff — Delay Between Retries
3.1 Producer Backoff
Config
spring:
  kafka:
    producer:
      properties:
        retry.backoff.ms: 1000
Purpose
- Avoid hammering Kafka
- Give broker time to recover
3.2 Consumer Backoff (Critical)
Example (Spring Kafka)
@RetryableTopic(
    attempts = "3",              // total attempts, including the first delivery
    backoff = @Backoff(
        delay = 5000,            // initial delay in ms
        multiplier = 2           // each later delay is doubled
    )
)
@KafkaListener(topics = "account.transaction.completed.v1")
public void consume(AccountTransactionCompletedEvent event) {
    process(event);
}
Behavior
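1st attempt fails → wait 5s
2nd attempt fails → wait 10s
3rd attempt fails → DLT
(attempts = "3" counts the first delivery, so there are only two waits; a 20s wait would need attempts = "4")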
Types of Backoff
| Type | Behavior |
|---|---|
| Fixed | 5s → 5s → 5s |
| Exponential (Recommended) | 5s → 10s → 20s |
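The two rows above can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming the attempt count includes the first delivery (so N attempts give N - 1 waits); `BackoffSchedule` and `delays` are hypothetical names, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

final class BackoffSchedule {
    // Returns the wait in ms before each retry.
    // totalAttempts includes the first try, so the list has totalAttempts - 1 entries.
    static List<Long> delays(long initialDelayMs, double multiplier, int totalAttempts) {
        List<Long> waits = new ArrayList<>();
        double delay = initialDelayMs;
        for (int retry = 1; retry < totalAttempts; retry++) {
            waits.add((long) delay);
            delay *= multiplier;   // multiplier 1.0 gives a fixed backoff
        }
        return waits;
    }
}
```

Calling `delays(5000, 1.0, 4)` yields the fixed pattern and `delays(5000, 2.0, 4)` yields the exponential one.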
Why Backoff is Critical
Without backoff:
Immediate retry → CPU spike → DB overload ❌
With backoff:
Retry with delay → system stabilizes ✅
4. Retry Strategies
Immediate Retry (Bad)
Fail → retry instantly → infinite loop ❌
Backoff Retry (Better)
Fail → wait → retry
Retry Topics (Best Practice)
Main Topic
↓
Retry Topic 1 (5s)
↓
Retry Topic 2 (30s)
↓
DLT
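The chain above is essentially an ordered list of topics where a failed record moves one step to the right. A minimal sketch of that routing rule follows; the topic names and the `RetryTopicChain` class are made up for illustration and do not reflect any framework's naming scheme.

```java
import java.util.List;
import java.util.Optional;

final class RetryTopicChain {
    // Hypothetical topic names, ordered from the main topic to the DLT.
    static final List<String> CHAIN = List.of(
            "orders", "orders-retry-5s", "orders-retry-30s", "orders-dlt");

    // Topic a failed record should be published to next;
    // empty when the record is already in the DLT (end of the chain).
    static Optional<String> nextTopic(String currentTopic) {
        int i = CHAIN.indexOf(currentTopic);
        if (i < 0) {
            throw new IllegalArgumentException("Unknown topic: " + currentTopic);
        }
        return i + 1 < CHAIN.size() ? Optional.of(CHAIN.get(i + 1)) : Optional.empty();
    }
}
```

Because each retry level is its own topic, the main topic's consumer keeps reading new records while failed ones wait elsewhere.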
5. Dead Letter Topic (DLT)
Kafka topic for permanently failed messages.
Important
- Just another Kafka topic
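- Independent of consumer offsets (it stores the failed records themselves)
- Not provided by the Kafka broker automatically (the consumer application or its framework publishes to it)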
Flow
Message fails → retries → still fails
→ send to DLT
→ commit offset
Spring Kafka DLT Handler
@DltHandler
public void handleDlt(AccountTransactionCompletedEvent event) {
    log.error("DLT Event: {}", event.getTransactionId());
    // store in DB / alert / manual fix
}
6. Offset + DLT Relationship
After sending to DLT: Consumer commits offset
Why? To avoid infinite reprocessing loop
Key Difference
| Concept | Meaning |
|---|---|
| Offset Commit | Consumer is done with the record (processed, or handed off to DLT) |
| DLT | Processing failed permanently |
7. Full Flow (Correct Design)
- Read message
- Process
- Fail
- Retry with backoff
- Still fail
- Send to DLT
- Commit offset
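The steps above can be sketched as a small in-memory simulation. Everything here is hypothetical scaffolding: `RetryThenDlt`, the `log` list, and the injected `pause` callback stand in for a real consumer, DLT producer, and offset commit, and the waiting is injected so the example does not actually sleep.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.LongConsumer;

final class RetryThenDlt {
    // Records what happened, in order, so the flow can be inspected.
    final List<String> log = new ArrayList<>();

    // Tries the handler up to maxAttempts times with exponential backoff,
    // hands the message to the DLT if every attempt fails, then commits.
    void handle(String message, Consumer<String> handler, int maxAttempts,
                long initialDelayMs, double multiplier, LongConsumer pause) {
        double delay = initialDelayMs;
        boolean done = false;
        for (int attempt = 1; attempt <= maxAttempts && !done; attempt++) {
            try {
                handler.accept(message);
                log.add("success");
                done = true;
            } catch (RuntimeException e) {
                log.add("fail#" + attempt);
                if (attempt < maxAttempts) {
                    pause.accept((long) delay);  // wait before the next attempt
                    delay *= multiplier;
                }
            }
        }
        if (!done) {
            log.add("dlt");      // stand-in for publishing the record to the DLT
        }
        log.add("commit");       // stand-in for committing the offset
    }
}
```

Note that the commit happens on both paths: once after success, and once after the DLT handoff, which is what stops the record from being redelivered forever.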
8. Banking Example (Real)
Scenario
Transaction Event → Ledger Service
Case 1 — DB Down (Temporary)
Fail → retry → success
Case 2 — Invalid Data (Permanent)
Fail → retry → fail → DLT
Case 3 — Kafka Down
Producer retry handles it
9. application.yaml (Consumer)
spring:
  kafka:
    consumer:
      enable-auto-commit: false
    listener:
      ack-mode: manual
10. Best Practices
Producer
- acks=all
- retries enabled
- idempotence enabled
- small backoff
Consumer
- use @RetryableTopic
- use exponential backoff
- always use DLT
- commit offset after handling
- implement idempotency
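Since retries and redelivery can hand the same event to the consumer more than once, "implement idempotency" usually means remembering which transaction IDs were already applied. A minimal sketch, assuming the ID is a string and using an in-memory set where a real service would need a durable store; `IdempotentHandler` is a hypothetical name.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class IdempotentHandler {
    // In-memory stand-in for a durable table of processed transaction IDs.
    private final Set<String> processed = ConcurrentHashMap.newKeySet();
    private final AtomicInteger applied = new AtomicInteger();

    // Applies the event only the first time its ID is seen; returns true if applied.
    boolean handle(String transactionId) {
        if (!processed.add(transactionId)) {
            return false;              // duplicate delivery: skip side effects
        }
        applied.incrementAndGet();     // stand-in for the real ledger update
        return true;
    }

    int appliedCount() {
        return applied.get();
    }
}
```

In a database-backed version, recording the ID and applying the update would normally share one transaction, so a crash between the two cannot lose or double-apply an event.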
11. Common Mistakes
- No DLT ❌
- Infinite retry ❌
- No backoff ❌
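- Blocking retry (waiting inside the listener thread stalls every record behind the failed one) ❌
- Mixing commit + retry (committing the offset before retries or the DLT handoff have finished) ❌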
12. Final Summary
| Concept | Side | Purpose |
|---|---|---|
| Producer Retry | Producer | Ensure delivery |
| Producer Backoff | Producer | Retry delay |
| Consumer Retry | Consumer | Ensure processing |
| Consumer Backoff | Consumer | Retry delay |
| DLT | Consumer | Store failed messages |
Final Insight
Kafka ensures delivery
You ensure correctness