What is load testing in one sentence?

Load testing simulates realistic user traffic against a system to measure how it performs under expected and peak demand, revealing bottlenecks and breaking points before real users hit them in production.

What is the difference between load, stress, and soak testing?

Load testing checks behavior at expected traffic, including peak. Stress testing pushes past the expected limit to find the breaking point and to show how the system fails. Soak testing applies steady load for hours or days to catch slow-building problems such as memory leaks.

Why measure percentiles instead of averages?

Averages hide pain. A 200 ms average can still mean the slowest 1% of requests take 5 seconds. Percentiles like p95 and p99 describe the tail of the distribution, which is where real users feel slowness, so they give a more honest picture than the mean.

What tools are used for load testing?

Common open-source tools include k6, Apache JMeter, Gatling, and Locust. They generate concurrent virtual users, script realistic request flows, and report throughput, latency percentiles, and error rates under increasing load.

When should you run load tests?

Before a known traffic spike such as a launch or sale, when changing infrastructure, and ideally as a regular part of the pipeline so performance regressions are caught early rather than discovered during an outage.

What is Load Testing?

Load testing is the practice of simulating realistic user traffic against a system — hundreds or thousands of concurrent virtual users hammering it the way real customers would — to find out how it behaves under pressure, where it slows down, and exactly when it breaks, all before a real crowd shows up and finds out for you.

Why "it works on my machine" is not enough

A system that responds instantly for one developer can collapse the moment real traffic arrives. Database connections run out. A slow query that was fine at ten requests a second falls over at ten thousand. A memory leak that took a week to matter suddenly matters in an hour. None of this shows up in functional tests, which check that features are correct, not that they survive a crowd. Load testing exists to answer a different question: not "does it work?" but "does it still work when everyone shows up at once?"

The family of performance tests

"Load testing" is often used loosely, but there are distinct flavors. A load test applies the traffic you actually expect, including peak, and confirms the system holds up. A stress test deliberately pushes past the expected limit to discover the breaking point and — just as important — how the system fails: does it degrade gracefully or fall over completely? A spike test slams it with a sudden surge to mimic a viral moment. A soak (or endurance) test holds steady load for hours or days to catch slow killers like memory leaks and connection exhaustion that only appear over time.

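One way to picture the four flavors is as different shapes of virtual-user count over time. The sketch below is illustrative only: the function name `target_users`, the 1,000-user expected peak, the ramp lengths, and the 5x surge are all assumptions chosen for the example, not recommended values.

```python
def target_users(profile, t_min, expected=1000):
    """Return how many virtual users a profile asks for at minute t_min.

    Every shape and number here is an illustrative assumption.
    """
    if profile == "load":
        # Ramp up to the expected peak over 10 minutes, then hold there.
        return min(expected, expected * t_min // 10)
    if profile == "stress":
        # Start at the expected peak and add 50% of it every 10 minutes, uncapped.
        return expected * (2 + t_min // 10) // 2
    if profile == "spike":
        # Quiet baseline, then a sudden 5x surge between minutes 10 and 15.
        return expected * 5 if 10 <= t_min < 15 else expected // 10
    if profile == "soak":
        # Steady expected load for the whole (long) run.
        return expected
    raise ValueError(f"unknown profile: {profile}")


if __name__ == "__main__":
    for name in ("load", "stress", "spike", "soak"):
        print(name, [target_users(name, t) for t in (0, 5, 10, 12, 20, 30)])
```

Most load testing tools let you express shapes like these as stages or ramps; the point of the sketch is only that the four tests differ in the shape of the load, not in the tooling.
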
The metrics that matter

Throughput, the number of requests per second the system can handle, is the headline number, but latency is where the truth lives, and the key is to look at percentiles rather than averages. An average response time of 200 milliseconds can hide the fact that the slowest one percent of requests take five seconds. That is why teams track p95 and p99: the response times below which 95% and 99% of requests fall. The tail of the distribution is what real users feel. Alongside latency, watch the error rate as load climbs: the point where errors spike is effectively the system's ceiling.
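
To make the gap between averages and percentiles concrete, here is a small Python sketch. The latency samples are made up for illustration, and `percentile` is a hypothetical helper using the nearest-rank definition; real tools may interpolate and report slightly different values.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[max(rank, 1) - 1]


# Hypothetical response times in milliseconds:
# 900 fast requests, 80 slower ones, and 20 that take five seconds.
latencies_ms = [100] * 900 + [400] * 80 + [5000] * 20

mean_ms = statistics.mean(latencies_ms)
p50 = percentile(latencies_ms, 50)
p95 = percentile(latencies_ms, 95)
p99 = percentile(latencies_ms, 99)

print(f"mean={mean_ms} ms  p50={p50} ms  p95={p95} ms  p99={p99} ms")
# prints: mean=222 ms  p50=100 ms  p95=400 ms  p99=5000 ms
```

With this data the mean looks acceptable at 222 ms, while p99 shows that some requests take five full seconds.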

Doing it well

A load test is only as good as its realism. Tests should model actual user journeys (log in, browse, add to cart, check out) rather than hammer a single endpoint, and they should use realistic data and think time between actions. They must run against an environment that resembles production, because a test against an undersized staging box tells you about the staging box, not your real capacity. Tools like k6, JMeter, Gatling, and Locust generate the virtual users and report the numbers; the skill is in designing a scenario that reflects how people genuinely use the system.
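
As a rough illustration of a journey with think time, here is a minimal Python sketch that uses only the standard library. It is not how any of the tools above work internally: `send_request` is a hypothetical stand-in for a real HTTP call, the step names are taken from the journey described above, and the think time is shortened to milliseconds so the sketch runs quickly (real think time is usually measured in seconds).

```python
import random
import time
from concurrent.futures import ThreadPoolExecutor

JOURNEY = ["login", "browse", "add_to_cart", "checkout"]


def send_request(step):
    """Hypothetical stand-in for a real HTTP call; sleeps briefly to mimic server time."""
    time.sleep(random.uniform(0.001, 0.005))
    return 200


def run_user(user_id, think_time=(0.001, 0.003)):
    """Walk one virtual user through the whole journey, pausing between steps."""
    results = []
    for step in JOURNEY:
        start = time.perf_counter()
        status = send_request(step)
        elapsed_ms = (time.perf_counter() - start) * 1000
        results.append((step, status, elapsed_ms))
        time.sleep(random.uniform(*think_time))  # think time between actions
    return results


def run_load(users=20):
    """Run all virtual users concurrently and flatten their per-step results."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        per_user = list(pool.map(run_user, range(users)))
    return [result for user in per_user for result in user]


if __name__ == "__main__":
    results = run_load(users=20)
    print(f"{len(results)} requests recorded across {len(JOURNEY)} journey steps")
```

Recording the step name with each timing lets you report latency per step, so a slow checkout is not averaged away by a fast browse.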

Finding the fix, not just the failure

A load test that says "it broke at 5,000 users" is only half the value; the other half is knowing why. That is where the test pairs with observability and distributed tracing: while the load runs, you watch where time and resources go, and the bottleneck reveals itself — an unindexed query, a missing cache, a connection pool too small, a service that needs more instances. Often the fix is far cheaper than the brute-force answer of throwing hardware at the problem.
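
A simplified Python sketch of that idea: total up where traced time went while the load ran and see which component dominates. The span data and component names below are invented for illustration, and `time_by_component` is a hypothetical helper; a real tracing system records far richer data than a (component, duration) pair.

```python
from collections import defaultdict


def time_by_component(spans):
    """Sum span durations per component and return them largest first."""
    totals = defaultdict(float)
    for component, duration_ms in spans:
        totals[component] += duration_ms
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical spans collected during a load test: (component, duration in ms).
spans = [
    ("api_gateway", 4.0), ("auth_service", 12.0), ("orders_db_query", 310.0),
    ("api_gateway", 5.0), ("auth_service", 11.0), ("orders_db_query", 295.0),
    ("cache_lookup", 2.0), ("cache_lookup", 3.0),
]

ranked = time_by_component(spans)
worst, worst_ms = ranked[0]
share = worst_ms / sum(ms for _, ms in ranked)

print(f"{worst}: {worst_ms:.0f} ms ({share:.0%} of traced time)")
# prints: orders_db_query: 605 ms (94% of traced time)
```

In this made-up data, one database query accounts for nearly all of the traced time, which points at an index or a cache rather than more servers.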

At QUANT LAB

We treat load testing as part of shipping, not a panic move before a launch. On the platforms we build under SaaS platform development and validate under QA and test automation, we model realistic traffic, watch the percentiles, and find the bottleneck before recommending anything as heavy as sharding. Done early and repeatedly, it turns capacity from a guess into a number — so a client walks into their big day knowing what their system can take.
