MOFU Engineering Reference · 2026
Stripe Webhook Security: 12 Patterns That Survive Production
Engineering reference for production-grade Stripe webhooks. Signature verification, idempotency, retries, replay protection, queue-based processing, monitoring — and the gotchas that lose money.
By Bill Beltz, founder of QUANT LAB USA INC · Published May 12, 2026
Quick answer: how do I secure Stripe webhooks?
Verify the signature using the raw request body, enforce the timestamp tolerance, persist every event ID for deduplication, return a 2xx response under 200ms and queue the actual work, make every side-effect idempotent, monitor delivery success rate as an SLO, and rotate signing secrets through a dual-secret window. The most common production bug is parsing the JSON before verifying the signature — which silently breaks verification because the raw byte sequence changes.
Stripe webhooks are how the money state machine in your application stays consistent with the money state machine at Stripe. Every subscription created, every payment failure, every refund, every dispute — they all arrive as webhooks. Get this wrong and you have double-charges, missing entitlements, ghost subscriptions, and angry support emails.
We ship Stripe integrations as a core service at QUANT LAB USA. These twelve patterns are the production handbook. See also our Next.js + Stripe integration guide and our Stripe Connect Marketplace Architecture.
The 12 patterns at a glance
- Verify signatures with the raw request body
- Enforce timestamp tolerance for replay protection
- Persist every event ID for deduplication
- Acknowledge fast, process async
- Idempotency keys on every downstream side effect
- Subscribe only to events you handle
- Return 2xx for unhandled event types
- Queue-based processing with a durable worker
- Dead-letter queue for permanent failures
- Rotate signing secrets through dual-secret windows
- Monitor delivery success rate as an SLO
- Replay missed events from the dashboard
1. Verify signatures with the raw request body
Stripe signs the exact bytes it sent. Parsing the JSON and re-serializing it can change whitespace, escaping, and number formatting, so the bytes you verify no longer match the bytes Stripe signed. Read the body as a string or ArrayBuffer first, verify, then parse.
In Next.js App Router, the route handler must call request.text() before any JSON parsing. Use stripe.webhooks.constructEvent(rawBody, signature, secret) — never roll your own HMAC. The Stripe SDK handles signature scheme version (v1) and timestamp parsing.
Signing secrets begin with whsec_. Store in environment variables. Never log them.
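To see why the raw bytes matter, the sketch below signs the same event twice: once over the body as received and once over a parsed and re-serialized copy. It uses Node's built-in crypto purely for illustration. The secret, the payload, and the helper name sign are made up, and production code should still call stripe.webhooks.constructEvent instead of a hand-written HMAC.

```typescript
import { createHmac } from "node:crypto";

// Hypothetical helper, for illustration only: HMAC-SHA256 over "<timestamp>.<payload>",
// which is the signed-payload shape Stripe documents for the v1 scheme.
function sign(secret: string, timestamp: number, payload: string): string {
  return createHmac("sha256", secret).update(`${timestamp}.${payload}`, "utf8").digest("hex");
}

const secret = "whsec_example_not_real"; // made-up value
const timestamp = 1700000000;

// The body as it arrived on the wire (note the spaces after the colons).
const rawBody = '{"id": "evt_123", "type": "invoice.paid"}';

// What you end up with if you parse first and serialize again later.
const reserialized = JSON.stringify(JSON.parse(rawBody));

const signatureOverRaw = sign(secret, timestamp, rawBody);
const signatureOverReserialized = sign(secret, timestamp, reserialized);

// Same JSON value, different bytes, therefore different signatures.
const sameJsonValue = JSON.stringify(JSON.parse(rawBody)) === JSON.stringify(JSON.parse(reserialized));
const sameSignature = signatureOverRaw === signatureOverReserialized;
```

Both strings decode to the same object, yet sameSignature is false, which is exactly what a handler sees when a body parser runs before verification.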
2. Enforce timestamp tolerance for replay protection
Stripe's signed payload includes a Unix timestamp. The SDK rejects anything older than 5 minutes by default. This prevents an attacker who captures a valid webhook from replaying it weeks later.
Default tolerance is fine for most workloads. Tighten it (60 seconds) for high-value flows; loosen it (10 minutes) only if you operate in a degraded clock-skew environment.
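The tolerance check itself is small. The sketch below parses the t=...,v1=... format of the Stripe-Signature header and compares the timestamp's age with a tolerance in seconds. The function names are hypothetical and the code only illustrates the idea; the SDK already performs this check inside constructEvent, which accepts the tolerance as an optional argument.

```typescript
interface ParsedSignatureHeader {
  timestamp: number;      // Unix seconds taken from the "t=" element
  v1Signatures: string[]; // every "v1=" element (there can be more than one)
}

// Hypothetical parser for a header such as "t=1700000000,v1=abc...,v0=...".
function parseSignatureHeader(header: string): ParsedSignatureHeader | null {
  let timestamp = NaN;
  const v1Signatures: string[] = [];
  for (const part of header.split(",")) {
    const idx = part.indexOf("=");
    if (idx === -1) continue;
    const key = part.slice(0, idx).trim();
    const value = part.slice(idx + 1).trim();
    if (key === "t" && value !== "") timestamp = Number(value);
    if (key === "v1" && value !== "") v1Signatures.push(value);
  }
  if (!Number.isInteger(timestamp) || v1Signatures.length === 0) return null;
  return { timestamp, v1Signatures };
}

// True when the signed timestamp is no older than the tolerance (default 5 minutes).
function isWithinTolerance(timestamp: number, nowSeconds: number, toleranceSeconds = 300): boolean {
  return nowSeconds - timestamp <= toleranceSeconds;
}
```

Passing nowSeconds in explicitly, rather than reading the clock inside the function, keeps the check easy to unit test with fixed values.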
3. Persist every event ID for deduplication
Stripe will sometimes re-deliver the same event legitimately — even after you returned 200. Maintain a database table of processed event IDs and reject duplicates at the handler entry point.
Use the event's id field (which starts with evt_) as the primary key. Add a unique constraint and catch the integrity violation. Retention can be capped at 30 days: Stripe's automatic retries stop after about 3 days, so anything older can only arrive through a manual replay, which is unusual.
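A minimal sketch of the dedupe gate, using an in-memory map as a stand-in for the database table. The class and method names are hypothetical. In a real system recordIfNew would be an INSERT against a column with a unique constraint, with the integrity violation treated as a duplicate.

```typescript
// In-memory stand-in for a table with a unique constraint on the event ID.
class ProcessedEventStore {
  // event ID -> Unix seconds when it was first recorded
  private readonly firstSeen = new Map<string, number>();

  // Returns true the first time an ID is recorded and false for a duplicate.
  recordIfNew(eventId: string, nowSeconds: number): boolean {
    if (this.firstSeen.has(eventId)) return false;
    this.firstSeen.set(eventId, nowSeconds);
    return true;
  }

  // Drops IDs older than the retention window (30 days by default) and returns how many were removed.
  prune(nowSeconds: number, retentionSeconds = 30 * 24 * 60 * 60): number {
    let removed = 0;
    this.firstSeen.forEach((seenAt, eventId) => {
      if (nowSeconds - seenAt > retentionSeconds) {
        this.firstSeen.delete(eventId);
        removed += 1;
      }
    });
    return removed;
  }
}
```

The handler calls recordIfNew at its entry point and returns 200 immediately when the result is false.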
4. Acknowledge fast, process async
Stripe's webhook delivery timeout is 30 seconds. Anything slower counts as a failed delivery and is retried. Inline processing (sending emails, calling third-party APIs, running database transactions) risks timeouts and double-execution.
The pattern: verify, persist the event to a durable queue (BullMQ, SQS, Inngest, Trigger.dev), return 200. The worker processes the event asynchronously. If the worker fails, the queue handles retry without Stripe re-delivering.
Target inline ack time: under 200ms P95. We monitor this as an SLO.
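The verify, enqueue, ack flow can be sketched as a handler with its dependencies injected. Everything named here is hypothetical: verify is assumed to wrap stripe.webhooks.constructEvent, and enqueue is assumed to write to whichever durable queue you run. A real handler would be async; the synchronous shape only keeps the sketch short.

```typescript
interface WebhookEventLike {
  id: string;
  type: string;
}

interface AckDeps {
  // Assumed to wrap the SDK's signature check and throw on a bad signature or stale timestamp.
  verify: (rawBody: string, signatureHeader: string) => WebhookEventLike;
  // Assumed to write to a durable queue and throw if the write fails.
  enqueue: (event: WebhookEventLike) => void;
}

function acknowledgeWebhook(
  rawBody: string,
  signatureHeader: string | null,
  deps: AckDeps,
): { status: number } {
  if (signatureHeader === null) return { status: 400 };

  let event: WebhookEventLike;
  try {
    event = deps.verify(rawBody, signatureHeader);
  } catch {
    return { status: 400 }; // unverified payloads are never enqueued
  }

  try {
    deps.enqueue(event);
  } catch {
    return { status: 500 }; // not durably stored, so let Stripe retry the delivery
  }

  return { status: 200 }; // ack now; the worker does the real work later
}
```

Returning 500 when the enqueue fails is a deliberate choice in this sketch: Stripe's retry then acts as the safety net for the one step that has not yet been made durable.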
5. Idempotency keys on every downstream side effect
Every downstream side effect — sending email, calling external API, writing to your database, triggering a fulfillment job — must be idempotent. The same event arriving twice should produce the same observable outcome.
Pattern: use the Stripe event ID as the idempotency key for downstream API calls (Stripe APIs natively support an Idempotency-Key header). For internal mutations, use unique constraints and ON CONFLICT DO NOTHING. For email, dedupe by event_id + recipient.
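A small sketch of both ideas, with hypothetical names throughout. Suffixing the event ID with an action label is an assumption added here so that one event can drive several side effects without their keys colliding. The set stands in for a table with a unique constraint on (event_id, recipient).

```typescript
// Hypothetical helper: one stable key per (event, action) pair.
function idempotencyKeyFor(eventId: string, action: string): string {
  return `${eventId}:${action}`;
}

// In-memory stand-in for a unique constraint on (event_id, recipient).
class EmailDeduper {
  private readonly sent = new Set<string>();

  // Returns true if the caller should send, false if this (event, recipient) pair was already sent.
  shouldSend(eventId: string, recipient: string): boolean {
    const key = `${eventId}|${recipient.toLowerCase()}`;
    if (this.sent.has(key)) return false;
    this.sent.add(key);
    return true;
  }
}
```

With the Stripe Node SDK, a key like this would typically be passed as the idempotencyKey request option on the downstream call, so a retried worker job resolves to the same API request rather than a second one.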
6. Subscribe only to events you handle
Stripe lets you enable all events or specific event types per endpoint. Subscribe only to what your handler implements. This reduces noise, shrinks the handler's logic surface area, and lowers the risk that a new Stripe event type triggers behavior you did not intend.
Common event types for SaaS: customer.subscription.created, customer.subscription.updated, customer.subscription.deleted, invoice.paid, invoice.payment_failed, customer.subscription.trial_will_end, checkout.session.completed.
7. Return 2xx for unhandled event types
If an event arrives that your code does not handle, return 200, not 4xx. A 4xx may look like a way to signal a permanent failure, but Stripe treats any non-2xx response as a failed delivery and keeps retrying, so you end up spamming your own error logs and alerting.
Pattern: log the unknown event type at info level for visibility, then return 200. Add a handler when you actually want to act on it.
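One way to express this is a handler map with a default branch. The sketch below is illustrative; the type and function names are hypothetical, and logInfo stands in for whatever logger you use.

```typescript
interface IncomingStripeEvent {
  id: string;
  type: string;
}

type StripeEventHandler = (event: IncomingStripeEvent) => void;

// Routes an event to its handler; unknown types are logged at info level and still acknowledged.
function routeStripeEvent(
  event: IncomingStripeEvent,
  handlers: Map<string, StripeEventHandler>,
  logInfo: (message: string) => void,
): { status: number; handled: boolean } {
  const handler = handlers.get(event.type);
  if (handler === undefined) {
    logInfo(`unhandled Stripe event type ${event.type} (${event.id})`);
    return { status: 200, handled: false };
  }
  handler(event);
  return { status: 200, handled: true };
}
```

Adding support for a new event type then means registering one more entry in the map, with no change to the routing logic.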
8. Queue-based processing with a durable worker
The webhook handler enqueues. A separate worker process consumes. The worker has its own retry policy, its own scaling profile, and its own dead-letter behavior.
Our default stack: Inngest or Trigger.dev for managed; BullMQ on Redis for self-hosted. For higher-volume customers, SQS + Lambda. Each event becomes a job with the Stripe event ID as the deduplication key.
9. Dead-letter queue for permanent failures
Jobs that fail after all retries land in a dead-letter queue. A human reviews them, fixes the root cause, and replays. Without this, a transient bug becomes silent data drift.
Alert on any item in the DLQ. The bar for "you should know about this" is low — these are events your money state machine could not absorb.
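The retry-then-dead-letter behaviour can be sketched in a few lines. This is a simplified, hypothetical model (the names QueuedJob and runQueuedJob are made up); managed queues such as the ones named above implement the same idea with their own configuration.

```typescript
interface QueuedJob {
  eventId: string;  // the Stripe event ID doubles as the deduplication key
  attempts: number; // failed attempts so far
}

type JobOutcome = "done" | "retry" | "dead";

// Runs one attempt. After maxAttempts failures the job is moved to the dead-letter list.
function runQueuedJob(
  job: QueuedJob,
  processEvent: (eventId: string) => void,
  maxAttempts: number,
  deadLetters: QueuedJob[],
): JobOutcome {
  try {
    processEvent(job.eventId);
    return "done";
  } catch {
    job.attempts += 1;
    if (job.attempts >= maxAttempts) {
      deadLetters.push(job); // a human reviews, fixes the root cause, and replays
      return "dead";
    }
    return "retry"; // the queue re-schedules the job; Stripe is not involved
  }
}
```

An alert on deadLetters.length greater than zero matches the low bar described above.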
10. Rotate signing secrets through dual-secret windows
Stripe lets you add a second active signing secret to the same endpoint during rotation. The pattern:
- Generate a new signing secret in Stripe dashboard.
- Deploy code that tries the new secret first and falls back to the old secret, so events signed with either secret verify.
- Wait 24 hours to confirm new secret is in use and stable.
- Delete the old secret in Stripe dashboard.
- Deploy code that only uses the new secret.
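The fallback step in the list above can be sketched as a small wrapper around whatever verification call you use. The wrapper name is hypothetical. The verify callback is assumed to call stripe.webhooks.constructEvent with the given secret and to throw on failure.

```typescript
// Tries each secret in order (new first, then old) and reports which one verified.
function verifyWithAnySecret<T>(
  secrets: string[],
  verify: (secret: string) => T,
): { event: T; secretIndex: number } {
  let lastError: unknown = new Error("no signing secrets configured");
  for (let i = 0; i < secrets.length; i++) {
    try {
      return { event: verify(secrets[i]), secretIndex: i };
    } catch (err) {
      lastError = err; // remember the failure and try the next secret
    }
  }
  throw lastError; // none of the configured secrets verified the payload
}
```

Recording secretIndex as a metric is one way to confirm, during the waiting period, that deliveries are verifying against the new secret before the old one is removed.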
11. Monitor delivery success rate as an SLO
The Stripe dashboard surfaces delivery success rate per endpoint. Alert when it drops below 99%. Alongside that, monitor:
- P95 ack latency (target under 200ms)
- Worker processing latency (target under 30s)
- DLQ depth (target zero)
- Duplicate event rejections per day (informational)
- Unhandled event types per week (capture for triage)
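As a rough sketch of how the first three targets might be evaluated from your own metrics, the code below computes a nearest-rank P95 and checks a snapshot against the thresholds listed above. The snapshot shape and the function names are assumptions; in practice these checks usually live in your monitoring tool rather than in application code.

```typescript
// Nearest-rank percentile: the smallest sample with at least p percent of samples at or below it.
function percentileNearestRank(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = samples.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

// Hypothetical snapshot of one monitoring window.
interface WebhookSloSnapshot {
  ackLatenciesMs: number[];
  delivered: number;
  failed: number;
  dlqDepth: number;
}

// Returns a human-readable list of the targets this snapshot misses.
function findSloBreaches(snapshot: WebhookSloSnapshot): string[] {
  const breaches: string[] = [];
  const total = snapshot.delivered + snapshot.failed;
  if (total > 0 && snapshot.delivered / total < 0.99) {
    breaches.push("delivery success rate below 99%");
  }
  if (snapshot.ackLatenciesMs.length > 0 && percentileNearestRank(snapshot.ackLatenciesMs, 95) >= 200) {
    breaches.push("P95 ack latency at or above 200ms");
  }
  if (snapshot.dlqDepth > 0) {
    breaches.push("DLQ not empty");
  }
  return breaches;
}
```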
12. Replay missed events from the dashboard
When an outage causes Stripe to give up retrying, the Stripe dashboard lets you replay individual events or a time-window batch. Document the replay process in your runbook. Test it in staging at least once a year.
For very large replay windows, ask Stripe support for a bulk export — they can ship you a JSONL file of events for a date range that your worker can ingest offline.
Common failure modes in production
| Symptom | Root cause | Fix |
|---|---|---|
| All signatures fail | Body parsed before verification | Read raw bytes first |
| Duplicate emails after retry | No idempotency on email send | Dedupe by event_id + recipient |
| Timeouts on busy endpoints | Inline processing | Queue and ack fast |
| Out-of-order event processing | Worker concurrency without ordering; Stripe does not guarantee delivery order | Compare the event's created timestamp before applying, or re-fetch the object from the API |
| Missing entitlements after upgrade | Event dropped during outage | Replay from dashboard |
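For the out-of-order row, a minimal sketch of a guard that ignores stale events, assuming a flattened event shape. The field names other than created are hypothetical, and the map stands in for a database row per subscription.

```typescript
// Hypothetical flattened view of a subscription event.
interface SubscriptionEventView {
  eventId: string;
  created: number; // Unix seconds, from the event's created field
  subscriptionId: string;
  status: string;
}

class SubscriptionProjection {
  private readonly latest = new Map<string, { created: number; status: string }>();

  // Applies the event unless a strictly newer one was already applied; returns whether it was applied.
  apply(event: SubscriptionEventView): boolean {
    const current = this.latest.get(event.subscriptionId);
    if (current !== undefined && current.created > event.created) return false;
    this.latest.set(event.subscriptionId, { created: event.created, status: event.status });
    return true;
  }

  statusOf(subscriptionId: string): string | undefined {
    const current = this.latest.get(subscriptionId);
    return current === undefined ? undefined : current.status;
  }
}
```

Because created has one-second resolution, two events in the same second cannot be ordered this way. When that matters, re-fetching the object from the Stripe API gives the current state regardless of arrival order.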
Testing strategy: three layers
- Unit tests: Mock Stripe SDK, fixture event payloads, assert handler side effects.
- Integration tests: Use the Stripe CLI's stripe trigger command against a test Stripe account.
- Pre-prod tests: Mirror production webhook config in a sandbox; run full E2E flows.
For PCI considerations around webhooks, see our PCI-DSS Compliance for SaaS Checklist.
FAQ
How do I verify a Stripe webhook signature?
Capture the raw request body (not the parsed JSON), retrieve the Stripe-Signature header, and pass both to stripe.webhooks.constructEvent with your webhook signing secret. The function throws on a signature mismatch or when the signed timestamp falls outside the tolerance window. Never roll your own verification. In Next.js App Router, you must read the body as text or arrayBuffer to preserve the original bytes.
Why is the raw body required for signature verification?
Stripe signs the exact bytes it sent you. Parsing the JSON and re-serializing it can change whitespace, escaping, and number formatting, so the payload you reconstruct no longer matches what was signed. Read the raw body first, verify the signature, then parse the JSON. Skipping this is one of the most common causes of broken webhook integrations.
How do I prevent webhook replay attacks?
Three layers. First, verify the signature so the request must have come from Stripe. Second, enforce the timestamp tolerance window (default 5 minutes — anything older is rejected). Third, track processed event IDs in your database and reject duplicates. Stripe will sometimes legitimately re-deliver an event; idempotency handling is required.
What is webhook idempotency and why does it matter?
Stripe retries webhook delivery for up to 3 days if your endpoint returns non-2xx or times out, so the same event can arrive multiple times. Your handler must produce the same outcome on a 2nd, 5th, or 50th delivery: no double charges, no duplicate emails. The pattern is idempotency keys on every side effect plus event ID tracking in your database.
What does Stripe consider a successful webhook?
Any 2xx response within the timeout window (default 30 seconds, less than 10 strongly recommended). Anything else triggers retry. Slow handlers that eventually succeed still count as failures from Stripe's perspective if they exceed the timeout. Always acknowledge fast and queue the work — do not process inline.
Should I process webhook work inline or queue it?
Queue it. The handler should verify, persist, ack with 200, and return. Heavy work (sending emails, updating downstream systems, triggering jobs) belongs on a background worker. Inline processing risks timeouts, double-execution on retry, and database lock contention at scale.
How do I rotate a webhook signing secret without downtime?
Stripe lets you have multiple active signing secrets per endpoint during rotation windows. Generate the new secret in the Stripe dashboard, add it to your environment alongside the old one, deploy code that tries the new secret first and falls back to the old secret, retire the old secret once the new one is confirmed stable, then deploy code that only uses the new secret and remove the old secret from the environment.
What role do Stripe Workbench and the Stripe CLI play in webhook development?
The Stripe CLI forwards webhook events to localhost in real time for development. Stripe Workbench is the dashboard panel where you inspect event details, signing secrets, and delivery history. Both are essential for iterating safely. Never use production signing secrets in development environments.
How do I handle webhook events I don't care about?
Return 200 and move on. Stripe sends every enabled event type. Filtering at the endpoint configuration level (only subscribe to events you handle) reduces noise. Always return 2xx for unhandled event types — never 4xx or 5xx, which signals retry-worthy failure.
How do I test webhook handlers?
Three layers. Unit test the handler with fixture events from the Stripe API (real JSON shapes). Integration test against a Stripe test account using the Stripe CLI's stripe trigger command to fire real events. Pre-prod test with a sandbox account that mirrors production setup. Beyond those three, you can spot-check production by replaying events from the Stripe dashboard.
What happens if my webhook endpoint is down for a day?
Stripe retries with exponential backoff for up to 3 days. After that, the event is marked failed and not retried. You can manually replay events from the Stripe dashboard. Best practice: monitor your webhook endpoint's success rate as a critical SLO. A rate below 99% should page somebody.
How do I monitor webhook health in production?
Three signals. Stripe dashboard delivery success rate (alert below 99%). Your own metrics on event types received vs processed. Latency of inline ack time (P95 under 200ms is healthy). We export Stripe webhook metrics into Datadog/Sentry and alert on anomalies.
Related reading and next steps
Get your Stripe integration audited.
Free 30-minute Stripe integration review. We will walk through your webhook handler, idempotency posture, and SLO setup. If something is leaking money, you want to know.
More Stripe engineering reading
- Next.js + Stripe: The Complete Integration Guide. Server Actions, the Payment Element, webhook idempotency, and subscriptions.
- Stripe Connect Marketplace Architecture. Account types, fund flows, and reconciliation for multi-party payments.
- PCI-DSS Compliance for SaaS Checklist. What PCI scope reduction looks like when you route payments through Stripe.