Scalability in Distributed Systems
Scalability is the ability of an application to handle increased load (more users, more requests, more data) without sacrificing performance.
Example:
If a banking system suddenly gets 10× more customers,
the system should still respond reliably and within acceptable time.
Why Scalability Matters
Without proper scalability:
- Response time increases
- System becomes unstable
- Failures cascade
- Customer experience degrades
In modern systems, scalability is not optional — it is a design requirement.
Two Main Types of Scalability
Scalability is broadly classified into:
- Vertical Scaling (Scale Up)
- Horizontal Scaling (Scale Out)
1️⃣ Vertical Scaling (Scale Up)
Vertical scaling means increasing the capacity of a single machine.
You scale by:
- Adding more CPU cores
- Increasing RAM
- Using faster disks
No new machines are added — the same machine becomes stronger.
Example
Suppose your AccountService runs on a single Tomcat server.
When load increases, you upgrade the VM:
- CPU:
2 cores → 8 cores - RAM:
8 GB → 32 GB
The application remains the same, only the hardware changes.
Pros of Vertical Scaling
✅ Simple to implement
✅ No code changes required
✅ Works well for small or early-stage systems
Cons of Vertical Scaling
❌ Hardware upgrades are expensive
❌ Physical limits (CPU/RAM have a maximum)
❌ Downtime during upgrade
❌ Single point of failure
❌ Not suitable for large-scale systems
If that one server crashes, the entire service is down.
2️⃣ Horizontal Scaling (Scale Out)
Horizontal scaling means adding more instances of the application and distributing traffic among them using a Load Balancer.
Instead of making one machine stronger, you add more machines.
Example
Instead of one AccountService instance:
AccountService-1
AccountService-2
AccountService-3
A load balancer distributes requests across these instances.
All instances:
- Connect to the same database
OR - Use distributed storage (recommended for scale)
Pros of Horizontal Scaling
✅ Handles very high traffic
✅ High availability
✅ Fault tolerance (one instance failure doesn’t stop service)
✅ Easy to scale in cloud & containers
✅ Supports auto-scaling
Cons of Horizontal Scaling
❌ More complex architecture
❌ Requires load balancing
❌ Services must be stateless
❌ Session management must be external (Redis, DB)
Statelessness Requirement
For horizontal scaling to work:
- Services must not store session/state in memory
- State must be stored in:
- Database
- Cache (Redis)
- Token (JWT)
This is a core microservices principle.
Vertical vs Horizontal Scaling
| Aspect | Vertical Scaling | Horizontal Scaling |
|---|---|---|
| Scaling method | Bigger machine | More machines |
| Complexity | Low | High |
| Cost | High (hardware) | Optimized (cloud) |
| Availability | Low | High |
| Fault tolerance | ❌ No | ✅ Yes |
| Cloud friendly | ❌ Limited | ✅ Yes |
| Microservices fit | ❌ Poor | ✅ Excellent |
Which Scaling Is Used in Microservices?
Microservices primarily rely on horizontal scaling.
Vertical scaling may still be used:
- At database level (temporarily)
- For legacy systems
- For quick short-term fixes
But long-term scalable systems:
✔ Prefer horizontal scaling
Summary
- Scalability ensures system reliability under load
- Vertical scaling is simple but limited
- Horizontal scaling is complex but powerful
- Microservices are designed for horizontal scaling
- Stateless design is mandatory for scale-out