What is a Load Balancer?
A load balancer is a component that sits in front of a group of servers and spreads incoming traffic across them — routing each request to a healthy server so no single machine is overwhelmed and the application stays available even when one fails.
What a load balancer means
A load balancer is the traffic cop of a web system. Clients send their requests to a single stable address, and behind that address the load balancer decides which of several identical backend servers should handle each one. By spreading the work, it lets you serve far more traffic than any single machine could, and by routing around servers that are down, it keeps the service running through hardware failures and deployments.
It also provides a clean point of control. Because every request passes through it, the load balancer is a natural place to terminate encryption, enforce health checks, and present one consistent entry point even as the pool of servers behind it grows, shrinks, or gets replaced one at a time.
Where it came from
Load balancing grew up alongside the commercial web in the late 1990s. As popular sites outgrew a single server, hardware appliances appeared to fan requests out across a bank of machines. Over time the function moved into software and then into the cloud, where load balancers became managed services you configure rather than boxes you rack.
The underlying motivation never changed: a single server is both a capacity ceiling and a single point of failure. Putting a balancer in front turns one fragile server into a resilient, horizontally scalable pool — the architectural pattern that underpins virtually every high-traffic application today.
How it works
A load balancer continuously runs health checks against its backend servers (small periodic probes, such as a TCP connect or an HTTP request to a status endpoint) and only sends traffic to the ones responding correctly. When a request arrives, it picks a server using an algorithm: round robin rotates through the servers in order, least connections favors the server with the fewest active connections, and IP hashing maps each client address to the same server, which helps when sessions need to stick.
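The three selection algorithms can be sketched in a few lines of Python. This is a minimal illustration, not how any particular product implements them: the `Backend` and `Pool` names are hypothetical, and the `healthy` flag stands in for the result of real periodic probes.

```python
import hashlib
from dataclasses import dataclass
from itertools import count


@dataclass
class Backend:
    name: str
    healthy: bool = True  # assumption: a separate health checker updates this flag
    active: int = 0       # connections currently assigned to this backend


class Pool:
    def __init__(self, backends):
        self.backends = list(backends)
        self._ticket = count()  # ever-increasing counter used by round robin

    def healthy(self):
        """Return only the backends that passed their last health check."""
        alive = [b for b in self.backends if b.healthy]
        if not alive:
            raise RuntimeError("no healthy backends available")
        return alive

    def round_robin(self):
        """Rotate through the healthy backends in order."""
        alive = self.healthy()
        return alive[next(self._ticket) % len(alive)]

    def least_connections(self):
        """Pick the healthy backend with the fewest active connections."""
        return min(self.healthy(), key=lambda b: b.active)

    def ip_hash(self, client_ip):
        """Map a client address to the same backend while the pool is unchanged."""
        alive = self.healthy()
        digest = hashlib.sha256(client_ip.encode()).digest()
        return alive[int.from_bytes(digest[:8], "big") % len(alive)]
```

Note that this simple modulo hash reassigns many clients whenever the set of healthy backends changes. Consistent hashing is a common way to limit that reshuffling, at the cost of extra complexity.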
Load balancers operate at one of two layers of the network stack. A Layer 4 balancer routes on transport-level details (IP address and port) without inspecting the payload, which makes it fast and independent of the application protocol. A Layer 7 balancer understands HTTP and can make decisions based on the URL path, headers, or cookies, allowing it to send /api traffic to one set of servers and the marketing site to another. Many setups also terminate TLS at the balancer, decrypting once at the edge so the backend servers do not each have to.
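Path-based routing at Layer 7 amounts to matching the request path against a list of rules. The sketch below assumes a longest-prefix-wins policy and uses made-up pool names (`api-servers`, `marketing-servers`); real balancers offer richer matching on hosts, headers, and cookies.

```python
# Hypothetical routing table: (path prefix, name of the backend pool).
ROUTES = [
    ("/api", "api-servers"),
    ("/", "marketing-servers"),  # catch-all for everything else
]


def matches(prefix, path):
    """True if `path` falls under `prefix`, respecting segment boundaries."""
    if prefix == "/":
        return True
    # "/api" should match "/api" and "/api/users", but not "/apiary".
    return path == prefix or path.startswith(prefix + "/")


def choose_pool(path, routes=ROUTES):
    """Return the pool for the longest matching prefix.

    Assumes `path` is already stripped of its query string.
    """
    candidates = [(prefix, pool) for prefix, pool in routes if matches(prefix, path)]
    if not candidates:
        raise LookupError(f"no route for {path!r}")
    return max(candidates, key=lambda rule: len(rule[0]))[1]
```

A Layer 4 balancer could not apply rules like these, because it never reads the HTTP request line. It only sees addresses and ports.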
When it matters
A load balancer matters as soon as uptime or scale is on the line. Once you run more than one server, whether for capacity or for redundancy, something has to distribute traffic between them, and that is the load balancer's job. It lets you deploy a new version with zero downtime by draining traffic from old servers as new ones come up, and it keeps the service running when a machine dies at 3 a.m. For small single-server apps it is optional, but designing for it early means you can scale horizontally later without rebuilding the front door.
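Draining is what makes the zero-downtime deploy work: a server marked as draining receives no new requests but is allowed to finish the ones it already has. A minimal sketch of that bookkeeping, with hypothetical `Server` and `RollingPool` names, might look like this:

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    version: str
    in_flight: int = 0      # requests currently being handled
    draining: bool = False  # True once the server is being retired


class RollingPool:
    def __init__(self, servers):
        self.servers = list(servers)

    def add(self, server):
        """Bring a new server into rotation."""
        self.servers.append(server)

    def start_request(self):
        """Assign a new request to the least busy server that is not draining."""
        targets = [s for s in self.servers if not s.draining]
        if not targets:
            raise RuntimeError("no servers accepting new requests")
        server = min(targets, key=lambda s: s.in_flight)
        server.in_flight += 1
        return server

    def finish_request(self, server):
        """Record a completed request and retire drained servers."""
        server.in_flight -= 1
        self._reap()

    def drain(self, name):
        """Stop sending new requests to `name`; existing ones may finish."""
        for s in self.servers:
            if s.name == name:
                s.draining = True
        self._reap()

    def _reap(self):
        # A draining server leaves the pool only once it is idle.
        self.servers = [
            s for s in self.servers if not (s.draining and s.in_flight == 0)
        ]
```

In a deploy, the new version would be added first, then each old server drained in turn. Managed balancers usually also put a timeout on draining so that a stuck connection cannot block the rollout indefinitely.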
At QUANT LAB
Load balancing is built into how we architect for reliability, even when it is not a box the client ever sees. The serverless and managed platforms we deploy Next.js apps onto include load balancing transparently — traffic is spread across many execution environments automatically — so most clients get the resilience without operating a balancer themselves. When a project runs on its own cloud infrastructure, we place a managed load balancer in front of the application tier and configure the health checks, routing rules, and TLS termination explicitly.
The payoff shows up in our DevOps engineering work: because traffic flows through a balancer with health checks, we can roll out new versions gradually and pull a bad release out of rotation in seconds. That is the difference between a deploy that risks downtime and one nobody notices.
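To make the idea of a gradual rollout concrete, here is a small sketch of percentage-based traffic splitting. It is an illustration under stated assumptions, not a description of any specific platform's mechanism: the `CanarySplit` class and the `stable` and `canary` labels are invented for the example.

```python
class CanarySplit:
    """Send a fixed percentage of requests to a new release."""

    def __init__(self, canary_percent=0):
        self.set_percent(canary_percent)
        self._seen = 0  # number of requests routed so far

    def set_percent(self, percent):
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self.canary_percent = percent

    def route(self):
        """Return 'canary' or 'stable', spreading canary picks evenly."""
        n = self._seen
        self._seen += 1
        # The running quota of canary requests grows by percent/100 per request;
        # route to the canary whenever that quota crosses a whole number.
        before = (n * self.canary_percent) // 100
        after = ((n + 1) * self.canary_percent) // 100
        return "canary" if after > before else "stable"

    def rollback(self):
        """Pull the new release out of rotation immediately."""
        self.set_percent(0)
```

Raising the percentage in steps while watching error rates gives the gradual rollout, and `rollback()` is the equivalent of pulling a bad release out of rotation. Production systems often choose randomly or by user identity instead of by request count.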