Scalability
Scaling in system design refers to the ability of a system to handle increased load or demand by growing in capacity. As your application gains users, handles more requests, or processes more data, scaling ensures it continues to perform well and meet user expectations.
Vertical Scaling (Scaling Up)
Vertical Scaling (also called scaling up) means increasing the capacity of a single machine/server to handle more load. Instead of adding more servers (like in horizontal scaling), you upgrade the existing machine with:
- More powerful CPU
- More RAM
- Faster SSD storage
- Better network bandwidth
Vertical scaling is often used in:
- Monolithic applications
- Databases (before sharding/replication)
- Early-stage startups where architecture is still simple
- Systems with tight dependencies or shared state (where horizontal scaling is hard)
Benefits of Vertical Scaling
| Advantage | Explanation |
|---|---|
| ✅ Simpler architecture | No need to manage multiple nodes or distributed systems |
| ✅ No code changes | App continues to run without refactoring |
| ✅ Faster to implement | Just upgrade the hardware or instance type |
| ✅ Useful for databases | Databases benefit from more memory and CPU |
Limitations of Vertical Scaling
| Limitation | Explanation |
|---|---|
| Hardware limit | You can only scale up to the most powerful machine available |
| Downtime possible | Upgrading may require rebooting the server |
| Cost increases steeply | Higher-tier machines cost disproportionately more |
| No fault tolerance | Single point of failure if the machine crashes |
Example of Vertical Scaling
Scenario: You built a Node.js-based blog platform. It runs on a single server (2 vCPU, 2 GB RAM). As traffic increases, your app slows down, especially under heavy request bursts.
Solution: You vertically scale by upgrading to a more powerful instance (8 vCPU, 32 GB RAM).
// server.js
const express = require("express");
const app = express();
app.get("/", (req, res) => {
// Simulate heavy computation
let sum = 0;
for (let i = 0; i < 1e7; i++) sum += i;
res.send("Welcome to my blog!");
});
app.listen(3000, () => console.log("Server started on port 3000"));
On a low-memory, low-CPU server, requests take time and queue up. Users may face timeouts or slow responses.
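To see why requests queue up, you can time the same loop from server.js on its own; the exact number depends on your machine, but while the loop runs, Node's single event loop cannot serve any other request:

```javascript
// Times the compute-heavy loop used in server.js.
// While this synchronous loop runs, the event loop is blocked.
const start = process.hrtime.bigint();

let sum = 0;
for (let i = 0; i < 1e7; i++) sum += i;

const ms = Number(process.hrtime.bigint() - start) / 1e6;
console.log(`Loop blocked the event loop for ${ms.toFixed(1)} ms (sum=${sum})`);
```

A faster CPU shortens this blocked window, which is exactly what vertical scaling buys you here.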
After Vertical Scaling
You upgrade the server (e.g., using AWS EC2):
- From: t3.small (2 vCPU, 2 GB RAM)
- To: m6i.2xlarge (8 vCPU, 32 GB RAM)
This boosts:
- Number of concurrent requests handled
- Speed of compute-heavy endpoints
- RAM available for Node.js heap and cache
No code changes needed.
Vertical Scaling for Databases
A common use case:
- You're using PostgreSQL with high query volume.
- Queries are slow due to lack of memory (no room for indexes/cache).
- You upgrade the DB instance to get more RAM & CPU.
Tools like Amazon RDS, DigitalOcean Managed DB, or Google Cloud SQL allow one-click vertical scaling.
Performance Comparison of Vertical Scaling
| Metric | Before Upgrade | After Upgrade |
|---|---|---|
| Avg. response time | 800ms | 150ms |
| Concurrent users | 100 | 1000+ |
| Memory usage | 95% (swap used) | 50% (no swap) |
Horizontal Scaling (Scaling Out)
Horizontal Scaling (also called scaling out) is the process of adding more machines or nodes to your system to handle increased load. Instead of upgrading a single machine (vertical scaling), you add more instances of your application or database and distribute traffic or data among them behind a load balancer.
It requires a more complex architecture and a stateless application design.
It’s used in:
- Web applications serving high traffic (e.g., Netflix, Facebook)
- Microservices architectures
- Cloud-native systems (Kubernetes, serverless)
- Big data processing systems
Benefits of Horizontal Scaling
| Advantage | Explanation |
|---|---|
| High scalability | Add as many servers as needed to meet demand |
| High availability | No single point of failure—if one server fails, others handle the load |
| Cost efficiency | Use many low-cost servers instead of one expensive one |
| Fault tolerance | Easy to design resilient systems |
| Easy automation | Works well with autoscaling in cloud environments |
Limitations of Horizontal Scaling
| Limitation | Explanation |
|---|---|
| 🚫 More complex system | Requires load balancing, service discovery, etc. |
| 🚫 Stateless requirement | App logic must avoid using local memory for session/state |
| 🚫 Network overhead | Data sharing across nodes adds latency and complexity |
Horizontal Scaling Architecture
+-------------------+
| Load Balancer |
+--------+----------+
|
+------------------+------------------+
| | |
+-----+ +-----+ +-----+
| App | | App | | App |
| #1 | | #2 | | #3 |
+-----+ +-----+ +-----+
Example of Horizontal Scaling
Scenario: You built a Node.js API using Express. As traffic increases, a single instance isn’t enough. You need to scale out.
Step 1: Create a Stateless Node.js App
You deploy multiple Node.js app instances using a load balancer like NGINX or AWS ELB to distribute incoming HTTP traffic.
// server.js
const express = require("express");
const app = express();
app.get("/", (req, res) => {
res.send(`Hello from process ${process.pid}`);
});
app.listen(3000, () => console.log(`Server running on port 3000`));
You can deploy this app on 3 servers and use a load balancer to route traffic across them.
To support horizontal scaling, make sure:
- No local in-memory state
- Sessions (if any) are stored in Redis or DB
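The session requirement above can be sketched as a small store abstraction. SessionStore here is a hypothetical wrapper, with an in-memory Map standing in for a real Redis client (e.g. from the redis npm package), so that any app instance behind the load balancer can look up any session:

```javascript
// Hypothetical session store: in production the Map would be a Redis client,
// so every instance sees the same sessions regardless of which one wrote them.
class SessionStore {
  constructor(backend = new Map()) {
    this.backend = backend; // swap in a Redis client here
  }
  async set(sessionId, data) {
    this.backend.set(sessionId, JSON.stringify(data));
  }
  async get(sessionId) {
    const raw = this.backend.get(sessionId);
    return raw ? JSON.parse(raw) : null;
  }
}

// Usage: one instance writes the session, another instance can read it.
const store = new SessionStore();
(async () => {
  await store.set("abc123", { userId: 42 });
  const session = await store.get("abc123");
  console.log(session); // { userId: 42 }
})();
```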
Step 2: Run Multiple Instances (e.g., Using cluster or Docker)
- Using the cluster module (simulates horizontal scaling on one machine):
// cluster.js
const cluster = require("cluster");
const os = require("os");
const numCPUs = os.cpus().length;
if (cluster.isPrimary) { // "isMaster" in Node.js versions before 16
console.log(`Master ${process.pid} is running`);
for (let i = 0; i < numCPUs; i++) {
cluster.fork(); // Spawn worker
}
} else {
require("./server"); // Worker runs app
}
This runs multiple processes on one machine — like simulating multiple servers.
- Real Horizontal Scaling (Multiple Servers + Load Balancer)
- Deploy your Node.js app on multiple VMs/containers (e.g., app1, app2, app3)
- Use NGINX or cloud load balancer to route traffic across them.
NGINX config (load balancing):
http {
upstream node_backend {
server 192.168.1.10:3000;
server 192.168.1.11:3000;
server 192.168.1.12:3000;
}
server {
listen 80;
location / {
proxy_pass http://node_backend;
}
}
}
Other Components You Might Add
- Session Store: Redis or Memcached (to share sessions across instances)
- Service Discovery: If using microservices (e.g., Consul, Eureka)
- Containerization: Docker, Kubernetes (to manage scaling and orchestration)
- Auto Scaling: AWS Auto Scaling Groups, GCP Instance Groups, or K8s Horizontal Pod Autoscaler
Strategies to Implement Scaling
Stateless Services
- Ensure your application doesn’t store session or state data in memory. Use external tools like Redis or databases for session storage.
- This allows easy replication across servers.
Load Balancing
- Distribute requests across instances.
- Load balancer uses algorithms like Round Robin, Least Connections, or IP Hashing.
# Sample NGINX config
upstream backend {
server app1.example.com;
server app2.example.com;
server app3.example.com;
}
server {
listen 80;
location / {
proxy_pass http://backend;
}
}
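Round Robin, the default algorithm in the NGINX config above, can be sketched in a few lines of JavaScript:

```javascript
// Round Robin: hand out backends in a fixed rotation.
function roundRobin(servers) {
  let next = 0;
  return () => servers[next++ % servers.length];
}

const pick = roundRobin(["app1", "app2", "app3"]);
console.log(pick(), pick(), pick(), pick()); // app1 app2 app3 app1
```

Least Connections and IP Hashing replace the rotation with "fewest active requests" and "hash of the client IP" respectively; the picker interface stays the same.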
Database Scaling
- Read Replicas: Separate read traffic from write.
- Sharding: Partition data across multiple databases.
- Caching: Use Redis or Memcached to cache frequent queries.
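The caching strategy above is usually implemented as cache-aside: check the cache first and only hit the database on a miss. A sketch, with a Map and a hypothetical queryDatabase function standing in for Redis and PostgreSQL:

```javascript
// Cache-aside: the cache is consulted before the database.
const cache = new Map(); // stands in for Redis

// Hypothetical DB call; in a real app this would query PostgreSQL.
async function queryDatabase(productId) {
  return { id: productId, name: `Product ${productId}` };
}

async function getProduct(productId) {
  const key = `product:${productId}`;
  if (cache.has(key)) return cache.get(key); // cache hit: no DB round trip
  const product = await queryDatabase(productId); // cache miss: go to the DB
  cache.set(key, product); // populate for subsequent readers
  return product;
}

(async () => {
  await getProduct(7); // miss: hits the DB
  await getProduct(7); // hit: served from cache
  console.log(`cache size: ${cache.size}`); // cache size: 1
})();
```

In production you would also set a TTL on each cache entry and invalidate it when the product changes.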
Example of Scaling
Scenario:
You’re building a product catalog service. Initially, you have:
- One Node.js server
- One PostgreSQL DB

As traffic grows, product searches slow down.
Solution:
- Scale Node.js horizontally: Use Docker/Kubernetes to spin up multiple Node.js containers.
- Introduce Redis cache: Cache popular search queries.
- Use PostgreSQL read replicas: Direct read-heavy operations (like product listings) to replicas.
- Add a load balancer: AWS Application Load Balancer routes traffic across Node.js containers.
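Directing reads to replicas and writes to the primary can be sketched as a small query router; the connection objects here are hypothetical placeholders for real PostgreSQL clients (e.g. from the pg package):

```javascript
// Hypothetical connections: in production these would be pg clients
// pointed at the primary and at one or more read replicas.
const primary = { name: "primary" };
const replicas = [{ name: "replica-1" }, { name: "replica-2" }];

let next = 0;
function pickConnection(sql) {
  // Writes must go to the primary; reads rotate across replicas.
  const isRead = /^\s*select/i.test(sql);
  return isRead ? replicas[next++ % replicas.length] : primary;
}

console.log(pickConnection("SELECT * FROM products").name); // replica-1
console.log(pickConnection("SELECT * FROM products").name); // replica-2
console.log(pickConnection("INSERT INTO products VALUES (1)").name); // primary
```

Real routers also have to handle replication lag, for example by sending a session's reads to the primary right after that session writes.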