What does a load balancer do?

A load balancer sits in front of a group of servers and spreads incoming requests across them. It checks which servers are healthy, routes traffic only to those, and keeps the application available even if one server fails or gets overloaded.

What is the difference between Layer 4 and Layer 7 load balancing?

A Layer 4 load balancer routes based on network information like IP address and port, without reading the request contents. A Layer 7 load balancer understands HTTP and can route based on the URL path, headers, or cookies, enabling smarter rules at a small performance cost.

What are common load balancing algorithms?

Round robin sends requests to each server in turn. Least connections favors the server with the fewest active connections. IP hash sends a given client consistently to the same server. Each fits a different traffic and session pattern.

Do I need a load balancer if I only have one server?

Not strictly, but adding one early lets you scale to multiple servers later without re-architecting, and it provides health checks and a stable entry point. Many cloud and serverless platforms include load balancing automatically.

What is a health check?

A health check is a periodic probe the load balancer sends to each backend server. If a server stops responding correctly, the balancer removes it from rotation until it recovers, so new requests stop going to an instance that is known to be broken.
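To make that concrete, here is a minimal Python sketch of the bookkeeping behind a health check. It is an illustration under stated assumptions, not how any particular balancer implements it: the `HealthChecker` class, the injected `probe` function, and the `unhealthy_after` / `healthy_after` thresholds are all hypothetical names. A real probe would typically be an HTTP or TCP request with a timeout.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class BackendState:
    healthy: bool = True
    consecutive_failures: int = 0
    consecutive_successes: int = 0


class HealthChecker:
    """Tracks which backends are in rotation, based on probe results."""

    def __init__(self, backends: List[str], probe: Callable[[str], bool],
                 unhealthy_after: int = 3, healthy_after: int = 2) -> None:
        self.probe = probe                      # hypothetical: returns True if the backend answered correctly
        self.unhealthy_after = unhealthy_after  # failures in a row before removal (assumed default)
        self.healthy_after = healthy_after      # successes in a row before re-admission (assumed default)
        self.state: Dict[str, BackendState] = {b: BackendState() for b in backends}

    def run_once(self) -> None:
        """Probe every backend once and update its rotation status."""
        for backend, st in self.state.items():
            if self.probe(backend):
                st.consecutive_failures = 0
                st.consecutive_successes += 1
                if not st.healthy and st.consecutive_successes >= self.healthy_after:
                    st.healthy = True
            else:
                st.consecutive_successes = 0
                st.consecutive_failures += 1
                if st.healthy and st.consecutive_failures >= self.unhealthy_after:
                    st.healthy = False

    def in_rotation(self) -> List[str]:
        """Backends that should currently receive traffic."""
        return [b for b, st in self.state.items() if st.healthy]
```

Requiring several failures in a row before removal, and several successes before re-admission, is a common way to avoid flapping on a single slow response. The trade-off is a short window in which a failing server can still receive requests.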

What is a Load Balancer?

A load balancer is a component that sits in front of a group of servers and spreads incoming traffic across them — routing each request to a healthy server so no single machine is overwhelmed and the application stays available even when one fails.

What a load balancer means

A load balancer is the traffic cop of a web system. Clients send their requests to a single stable address, and behind that address the load balancer decides which of several identical backend servers should handle each one. By spreading the work, it lets you serve far more traffic than any single machine could, and by routing around servers that are down, it keeps the service running through hardware failures and deployments.

It also provides a clean point of control. Because every request passes through it, the load balancer is a natural place to terminate encryption, enforce health checks, and present one consistent entry point even as the pool of servers behind it grows, shrinks, or gets replaced one at a time.

Where it came from

Load balancing grew up alongside the commercial web in the late 1990s. As popular sites outgrew a single server, hardware appliances appeared to fan requests out across a bank of machines. Over time the function moved into software and then into the cloud, where load balancers became managed services you configure rather than boxes you rack.

The underlying motivation never changed: a single server is both a capacity ceiling and a single point of failure. Putting a balancer in front turns one fragile server into a resilient, horizontally scalable pool — the architectural pattern that underpins virtually every high-traffic application today.

How it works

A load balancer continuously runs health checks (small periodic probes) against its backend servers and only sends traffic to the ones responding correctly. When a request arrives, it picks a server using an algorithm: round robin rotates through them in order, least connections favors the server with the fewest open connections, and IP hashing pins each client to the same server when sessions need to stick.
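The three algorithms named above are small enough to sketch in Python. This is an illustrative sketch, not a production balancer: the class and function names are hypothetical, the server list is assumed to already contain only healthy servers, and the caller is assumed to report when a connection closes.

```python
import hashlib
from itertools import count
from typing import Dict, List


class RoundRobin:
    """Rotates through the servers in order, wrapping around at the end."""

    def __init__(self, servers: List[str]) -> None:
        self.servers = servers
        self._counter = count()

    def pick(self) -> str:
        return self.servers[next(self._counter) % len(self.servers)]


class LeastConnections:
    """Picks the server with the fewest connections currently open."""

    def __init__(self, servers: List[str]) -> None:
        self.active: Dict[str, int] = {s: 0 for s in servers}

    def pick(self) -> str:
        # Ties go to whichever server was listed first.
        server = min(self.active, key=lambda s: self.active[s])
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        # Assumed to be called by the caller when the connection closes.
        self.active[server] -= 1


def ip_hash(client_ip: str, servers: List[str]) -> str:
    """Maps a client IP to the same server for as long as the pool is unchanged."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

One design caveat with the simple modulo in `ip_hash`: when a server is added or removed, most clients are remapped to a different server, so stickiness only holds while the pool stays the same.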

Load balancers operate at one of two levels. A Layer 4 balancer routes on raw network details — IP and port — without inspecting the contents, which makes it fast and protocol-agnostic. A Layer 7 balancer understands HTTP and can make decisions based on the URL path, headers, or cookies, allowing it to send /api traffic to one set of servers and the marketing site to another. Many setups also terminate TLS at the balancer, decrypting once at the edge so the backend servers do not each have to.
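The difference between the two layers comes down to what the routing decision is allowed to look at. The Python sketch below contrasts them under assumed names: the pool names, the port map, and both functions are hypothetical and only mirror the /api example above.

```python
from typing import Dict, List, Tuple

# Layer 4: only network details are visible. Here the decision uses the destination port.
# Hypothetical listener map; the ports and pool names are assumptions for illustration.
L4_LISTENERS: Dict[int, str] = {443: "web-pool", 5432: "db-pool"}


def choose_pool_l4(dst_port: int) -> str:
    """Routes on the destination port alone, without reading the payload."""
    return L4_LISTENERS[dst_port]


# Layer 7: the HTTP request has been parsed, so the URL path is available.
# Hypothetical rules; order matters, and the first matching prefix wins.
L7_ROUTES: List[Tuple[str, str]] = [
    ("/api", "api-pool"),
    ("/", "marketing-pool"),
]


def choose_pool_l7(path: str) -> str:
    """Routes on the request path, matching whole path segments."""
    for prefix, pool in L7_ROUTES:
        # "/api" matches "/api" and "/api/users" but not "/apiary".
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            return pool
    return "marketing-pool"
```

Matching on whole segments rather than raw string prefixes is a deliberate choice in this sketch, so that a page such as /apiary is not accidentally sent to the API servers.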

When it matters

A load balancer matters the moment uptime or scale is on the line. The instant you run more than one server — for capacity or for redundancy — you need something to distribute traffic between them, and that something is a load balancer. It is what lets you deploy a new version with zero downtime by draining traffic from old servers as new ones come up, and it is what keeps the lights on when a machine dies at 3 a.m. For small single-server apps it is optional, but designing for it early means you can scale horizontally later without rebuilding the front door.
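The zero-downtime deploy described above relies on draining: an old server stops receiving new requests but is allowed to finish the ones it already has. A minimal Python sketch of that idea follows. The `DrainingPool` class and its method names are hypothetical, and managed load balancers expose this behavior through their own settings rather than code like this.

```python
from typing import Dict, List, Set


class DrainingPool:
    """Tracks in-flight requests so servers can be retired without dropping work."""

    def __init__(self, servers: List[str]) -> None:
        self.in_flight: Dict[str, int] = {s: 0 for s in servers}
        self.draining: Set[str] = set()

    def add(self, server: str) -> None:
        """Bring a new server (for example, the new version) into the pool."""
        self.in_flight.setdefault(server, 0)

    def start_request(self) -> str:
        """Assign a new request to the least busy server that is not draining."""
        candidates = [s for s in self.in_flight if s not in self.draining]
        if not candidates:
            raise RuntimeError("no servers available for new requests")
        server = min(candidates, key=lambda s: self.in_flight[s])
        self.in_flight[server] += 1
        return server

    def finish_request(self, server: str) -> None:
        self.in_flight[server] -= 1

    def drain(self, server: str) -> None:
        """Stop sending new requests to this server; existing ones keep running."""
        self.draining.add(server)

    def remove_if_idle(self, server: str) -> bool:
        """Remove a draining server once its last request has finished."""
        if server in self.draining and self.in_flight[server] == 0:
            del self.in_flight[server]
            self.draining.discard(server)
            return True
        return False
```

In practice a drain usually also has a timeout, so that one very long request cannot hold an old server in the pool indefinitely. That is left out here to keep the sketch short.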

At QUANT LAB

Load balancing is built into how we architect for reliability, even when it is not a box the client ever sees. The serverless and managed platforms we deploy Next.js apps onto include load balancing transparently — traffic is spread across many execution environments automatically — so most clients get the resilience without operating a balancer themselves. When a project runs on its own cloud infrastructure, we place a managed load balancer in front of the application tier and configure the health checks, routing rules, and TLS termination explicitly.

The payoff shows up in our DevOps engineering work: because traffic flows through a balancer with health checks, we can roll out new versions gradually and pull a bad release out of rotation in seconds. That is the difference between a deploy that risks downtime and one nobody notices.
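One common way to get a gradual rollout through a balancer is weighted routing: a small share of traffic goes to the new version, and setting its weight to zero pulls it out of rotation. The Python sketch below illustrates the idea only. It is not a description of QUANT LAB's tooling, and the `pick_version` function, the version names, and the weights are all assumptions.

```python
import random
from typing import Dict, Optional


def pick_version(weights: Dict[str, int], rng: Optional[random.Random] = None) -> str:
    """Chooses a version at random, in proportion to its weight.

    A weight of 0 removes that version from rotation entirely.
    """
    rng = rng if rng is not None else random.Random()
    live = {version: w for version, w in weights.items() if w > 0}
    if not live:
        raise RuntimeError("no version has a positive weight")
    versions = list(live)
    return rng.choices(versions, weights=[live[v] for v in versions], k=1)[0]


# Hypothetical rollout: start by sending roughly 10% of requests to the new release.
rollout = {"stable": 90, "canary": 10}
# Rolling back is a weight change, not a redeploy.
rollback = {"stable": 100, "canary": 0}
```

Raising the canary weight in steps while watching error rates is what makes the rollout gradual. Because the rollback is only a configuration change, it can take effect as soon as the balancer picks it up.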

Talk to the engineer who would build it

If you want a 30-minute conversation about scaling your app reliably across multiple servers — not a pitch — book a call.
