
Load Balancer

A load balancer is a key component in system design that distributes incoming network or application traffic across multiple servers. Its primary goal is to optimize resource use, maximize throughput, reduce latency, and ensure fault tolerance.

What a Load Balancer Does

  • Distributes client requests to multiple backend servers (web/app servers).
  • Prevents overloading a single server.
  • Provides high availability—if one server fails, traffic is rerouted to healthy servers.
  • Improves scalability—more servers can be added easily to handle increased load.

Types of Load Balancers

Type                     Description
Layer 4 (Transport)      Operates at the TCP/UDP level. Balances based on IP and port. Fast and simple.
Layer 7 (Application)    Operates at the HTTP/HTTPS level. Can inspect content (URLs, cookies, etc.).
Global Load Balancer     Routes traffic between data centers or regions.
Internal Load Balancer   Routes traffic between microservices or internal systems.

Load Balancing Algorithms

Algorithm           Description
Round Robin         Requests are sent to each server in order.
Least Connections   Sends traffic to the server with the fewest active connections.
IP Hash             Uses the client's IP address to determine which server handles the request.
Weighted            Assigns more traffic to stronger servers with more capacity.
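These strategies each fit in a few lines of JavaScript. The sketch below is illustrative only: the server list, active-connection counts, and weights are invented for the example.

```javascript
// Illustrative server list; "active" counts open connections,
// "weight" reflects relative capacity.
const servers = [
  { host: "10.0.0.1", active: 2, weight: 1 },
  { host: "10.0.0.2", active: 0, weight: 3 },
  { host: "10.0.0.3", active: 5, weight: 1 },
];

// Round robin: rotate through the list in order.
let rrIndex = 0;
function roundRobin() {
  const server = servers[rrIndex];
  rrIndex = (rrIndex + 1) % servers.length;
  return server;
}

// Least connections: pick the server with the fewest active connections.
function leastConnections() {
  return servers.reduce((best, s) => (s.active < best.active ? s : best));
}

// IP hash: the same client IP always lands on the same server.
function ipHash(clientIp) {
  const sum = clientIp.split(".").reduce((acc, octet) => acc + Number(octet), 0);
  return servers[sum % servers.length];
}

// Weighted: heavier servers appear more often in the rotation.
const weightedPool = servers.flatMap((s) => Array(s.weight).fill(s));
```

Real load balancers use better hash functions and track connection counts atomically, but the selection logic is essentially this.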

Load Balancer Architecture Diagram

            +---------------------+
            |       Clients       |
            +---------------------+
                       |
                       v
            +----------------------+
            |    Load Balancer     | <--- Entry point (L4 or L7)
            +----------------------+
              /        |        \
             v         v         v
    +------------+ +------------+ +------------+
    | WebServer1 | | WebServer2 | | WebServer3 |
    +------------+ +------------+ +------------+

Example of Load Balancer

Let’s say you are building an online store like Amazon:

  • Traffic increases dramatically during sales events.
  • You deploy three web servers behind a Layer 7 Load Balancer (e.g., AWS ELB, Nginx, HAProxy).

What happens:

  1. User A accesses www.store.com.
  2. Request goes to the Load Balancer.
  3. The Load Balancer:
    • Checks which server is least loaded.
    • Sends User A’s request to WebServer2.
  4. WebServer2 handles the request and returns the response.

If WebServer2 crashes:

  • Load balancer detects health check failure.
  • Redirects future requests to WebServer1 and WebServer3.

Where Load Balancers Are Used

  • Frontend Traffic Management: Distributes user traffic across web servers.
  • Microservices Communication: Distributes service-to-service calls.
  • Multi-region Deployments: Global load balancers distribute traffic across continents.
  • Autoscaling: Automatically routes traffic to newly created instances.

Layer 4 vs Layer 7 Load Balancers

Feature           Layer 4 Load Balancer                                  Layer 7 Load Balancer
OSI Layer         Transport Layer (TCP/UDP)                              Application Layer (HTTP/HTTPS)
Traffic Type      Low-level protocols (TCP, UDP)                         High-level protocols (HTTP, WebSocket, gRPC)
Routing Logic     Based on IP address and port                           Based on URL path, headers, cookies, request content, etc.
Performance       Faster, lower latency (less inspection)                Slightly slower due to deeper request inspection
Use Case          Game servers, database traffic, generic TCP services   Web applications, REST APIs, websites
Sticky Sessions   Via IP hash or custom setup                            Supports cookie-based session persistence
SSL Termination   Often not supported or limited                         Fully supported
Flexibility       Limited routing logic                                  Highly flexible (can route /api/ to one service, /admin/ to another)
Examples          HAProxy (TCP mode), AWS NLB                            NGINX (HTTP mode), AWS ALB, Envoy, Traefik
When to choose which:

  • Layer 4: you’re routing database or chat server traffic (TCP-based).
  • Layer 7: you’re routing different API endpoints to different microservices based on URL path.

Real Deployment Example Using AWS

Components:

  • 3 EC2 Instances (Web Servers)
  • Amazon Application Load Balancer (ALB - Layer 7)
  • Auto Scaling Group
  • Amazon Route 53 (optional, for domain)

Diagram

         User (Browser)
               |
        Route 53 (DNS)
               |
  Application Load Balancer (Layer 7)
               |
  Auto Scaling Group (EC2 Instances)
          /    |    \
     EC2-1  EC2-2  EC2-3  (Web servers)

Flow Explanation

  1. User visits www.example.com.
  2. Route 53 resolves the domain to the ALB’s address.
  3. ALB inspects the HTTP request:
    • If request is /api/, route to Service A
    • If request is /admin/, route to Service B
  4. The ALB sends the request to one of the EC2 instances in the Auto Scaling Group, using Least Connections or Round Robin.
  5. If one instance fails, ALB detects it via health checks and reroutes traffic to healthy instances.
  6. You can scale up/down the number of EC2s automatically based on CPU or request rate.
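The path-based routing in step 3 can be imitated in plain JavaScript: inspect the request path, pick the matching backend pool, then round-robin within that pool. The route table and service addresses here are invented for illustration.

```javascript
// Map URL-path prefixes to backend pools (illustrative services).
const routes = [
  { prefix: "/api/", pool: [{ host: "10.0.1.10", port: 3000 }] },   // Service A
  { prefix: "/admin/", pool: [{ host: "10.0.2.10", port: 3000 }] }, // Service B
];
const defaultPool = [{ host: "10.0.3.10", port: 3000 }]; // everything else

// Layer 7 decision: choose a pool by inspecting the request path,
// then rotate round-robin inside the chosen pool.
const counters = new Map();
function chooseBackend(url) {
  const route = routes.find((r) => url.startsWith(r.prefix));
  const pool = route ? route.pool : defaultPool;
  const i = counters.get(pool) || 0;
  counters.set(pool, (i + 1) % pool.length);
  return pool[i];
}
```

This is the decision an ALB listener rule makes for you; in AWS you express the same prefixes as rule conditions instead of code.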

Tools Used

Component           AWS Service
Load Balancer       Application Load Balancer (ALB)
Backend Servers     EC2 Auto Scaling Group
Routing Domain      Route 53
Health Monitoring   ALB Health Checks
SSL Support         SSL Termination at ALB

Example: Custom Load Balancer in Node.js

                Client
                  |
     Node.js Load Balancer (custom)
        /         |          \
App Server 1  App Server 2  App Server 3
 (Node.js)     (Node.js)     (Node.js)

Custom Load Balancer in Node.js (Layer 4 Style)

You can create a basic TCP/HTTP round-robin load balancer in Node.js using the http and http-proxy or net modules.

Create Load Balancer (load-balancer.js):

const http = require("http");
const httpProxy = require("http-proxy");

// Create a proxy server
const proxy = httpProxy.createProxyServer({});

// Backend server targets
const targets = [
  { host: "localhost", port: 3001 },
  { host: "localhost", port: 3002 },
  { host: "localhost", port: 3003 },
];

let current = 0;

// Create the load balancer server
const server = http.createServer((req, res) => {
  // Round-robin selection
  const target = targets[current];
  current = (current + 1) % targets.length;

  proxy.web(
    req,
    res,
    { target: `http://${target.host}:${target.port}` },
    (err) => {
      // Respond 502 if the chosen backend is unreachable
      if (!res.headersSent) res.writeHead(502);
      res.end("Bad Gateway");
    }
  );
});

server.listen(8000, () => {
  console.log("Load balancer listening on http://localhost:8000");
});

Create Backend App Servers (app-server.js):

const http = require("http");

const port = process.env.PORT || 3001;

const server = http.createServer((req, res) => {
  res.writeHead(200);
  res.end(`Response from server on port ${port}`);
});

server.listen(port, () => {
  console.log(`Server running on port ${port}`);
});

Run Everything:

# Run each server in its own terminal, or background them with &
PORT=3001 node app-server.js &
PORT=3002 node app-server.js &
PORT=3003 node app-server.js &
node load-balancer.js
  • Now, open http://localhost:8000 in the browser.
  • Refresh multiple times — you'll see responses rotating from each app server.

Production Setup: Node.js Behind NGINX (Layer 7 Load Balancer)

Use NGINX to load balance requests across multiple Node.js servers.

File Structure:

project/
├── server1.js
├── server2.js
├── server3.js
├── nginx.conf

Each can be the same as earlier (serverX.js):

// server1.js
const http = require("http");
const port = process.env.PORT || 3001;

const server = http.createServer((req, res) => {
  res.writeHead(200);
  res.end(`Hello from Node server on port ${port}`);
});

server.listen(port, () => console.log(`Server running on port ${port}`));

Launch:

PORT=3001 node server1.js &
PORT=3002 node server2.js &
PORT=3003 node server3.js &

Configure NGINX (nginx.conf)

# The events block is required in a standalone nginx.conf
events {}

http {
  upstream node_servers {
    server 127.0.0.1:3001;
    server 127.0.0.1:3002;
    server 127.0.0.1:3003;
  }

  server {
    listen 8000;

    location / {
      proxy_pass http://node_servers;
      proxy_http_version 1.1;
      proxy_set_header Upgrade $http_upgrade;
      proxy_set_header Connection 'upgrade';
      proxy_set_header Host $host;
      proxy_cache_bypass $http_upgrade;
    }
  }
}

Reload or restart NGINX:

nginx -c /path/to/nginx.conf
nginx -s reload

Now when you visit http://localhost:8000, NGINX will load balance between Node.js servers.

Summary of Uses

Approach                  Use Case                           Tools
Node.js custom balancer   Learning, dev testing              http-proxy
NGINX + Node.js           Production-grade, static routing   NGINX
AWS ALB + Node.js         Scalable cloud apps                EC2, ALB, Route 53
Kubernetes + Ingress      Containerized Node.js apps         K8s, Ingress

How a Load Balancer Works

Imagine you have multiple Node.js servers, each running the same app (like an API or website). A load balancer sits in front of these servers and acts like a traffic controller:

  • It accepts incoming requests from users.
  • It decides which server should handle each request (based on a strategy like round robin or least connections).
  • It forwards the request to that chosen server.
  • The chosen server sends the response back, and the load balancer relays it to the user.

Let’s say you have this setup:

  • 3 Node.js servers on ports 3001, 3002, and 3003
  • 1 Load balancer on port 8000 (either NGINX or a custom one with http-proxy)

1. Client Sends Request

User opens a browser and visits:

http://localhost:8000/api/products

This request hits the load balancer, not the app servers directly.

2. Load Balancer Receives Request

The load balancer server (Node.js or NGINX) listens on port 8000.

It receives the request like this:

GET /api/products HTTP/1.1
Host: localhost:8000

3. Load Balancer Chooses a Backend Server

It uses a load balancing algorithm such as:

  • Round robin: rotates through the server list.
  • Least connections: chooses the least busy server.
  • IP hash: chooses based on the user’s IP.

Let’s say it's round robin and the current turn is:

target = { host: "localhost", port: 3002 };

4. Request Is Forwarded

Using a proxy (like http-proxy in Node.js or proxy_pass in NGINX), the load balancer forwards the original HTTP request to the selected server:

GET /api/products HTTP/1.1
Host: localhost:3002

5. Backend Server Handles It

Node.js server on port 3002 receives the request and runs the handler:

res.end("Product list from server 3002");

6. Response Is Sent Back to Client

The response travels:

Node.js server → Load Balancer → Client

The user sees:

Product list from server 3002

Even though the client sent the request to port 8000, the load balancer internally managed all the routing.

Behind the Scenes

Layer     Role
Layer 4   If using TCP/UDP only, the load balancer just reroutes IP + port (it doesn't inspect HTTP data).
Layer 7   The load balancer looks inside HTTP headers and URLs, enabling smart routing (e.g., send /admin to a different backend).

Real Example with Node.js Proxy

proxy.web(req, res, {
  target: `http://${target.host}:${target.port}`,
});
  • proxy.web() forwards the HTTP request to the chosen backend.
  • It streams data in real-time, handling headers, body, and response.
  • It also listens for errors and falls back if one server is down.

Health Check

A good load balancer (like NGINX, HAProxy, or AWS ALB) periodically checks if each server is healthy by sending a small test request (GET /healthz). If a server is down:

  • It’s removed from the rotation automatically.
  • Traffic continues flowing to the healthy servers.

Workflow of Load Balancer

[Client Request]

[Load Balancer]
↓ (decides which backend to use)
[Forwarded to Node.js Server]

[Node.js Server Response]

[Load Balancer relays response]

[Client gets the result]