# What You’ll Learn
This guide explains Next.js rate limiting for Route Handlers, Server Actions, and Edge runtime, and how to combine it with bot protection. You’ll get copy-pasteable code for token bucket limits, Redis-backed consistency across serverless instances, and practical WAF or CDN rules.
It’s written for teams shipping Next.js apps with public APIs, auth flows, search endpoints, contact forms, or any route that attracts scrapers and credential stuffing. If you also want a broader security baseline, start with our web application security checklist.
# Why Rate Limiting Matters in Next.js Specifically
Next.js makes it easy to expose powerful server capabilities through Route Handlers and Server Actions. That also makes abuse cheaper for attackers and more expensive for you.
Typical real-world failure modes we see in audits and incident reviews:
- Credential stuffing against `/api/login` creates CPU spikes, database load, and third-party auth costs.
- Scrapers hit search and product endpoints at high concurrency, causing cache misses and origin traffic.
- Form spam inflates email provider bills and fills your CRM with junk.
- LLM crawlers and generic bots trigger expensive server components and server-side rendering paths.
A good strategy is not a single limiter. It’s a layered system that blocks obvious abuse early and applies stricter rules only where it matters.
# Architecture Overview: A Layered Defense Model
Use the cheapest control first, then progressively more expensive checks.
| Layer | Where it runs | Best for | Typical tools |
|---|---|---|---|
| CDN or WAF rules | Before your app | Known bad bots, geo rules, basic request rate rules | Cloudflare WAF, Vercel WAF, AWS WAF |
| Edge middleware or Edge Route Handlers | At the edge | Fast IP-level shaping, early rejects, header-based checks | Next.js Middleware, Edge runtime |
| App-level limiter (Redis-backed) | Node runtime or Edge with Redis HTTP | Authenticated limits, per-user quotas, token bucket bursts | Upstash Redis, Redis on Fly, managed Redis |
| Business-logic throttles | Inside actions and services | Cost controls, downstream protection | DB-level caps, queueing, circuit breakers |
Runtime choice matters because Edge can reject earlier but has limitations. If you’re deciding between Edge and Node patterns, read Next.js Edge runtime vs Node.js runtime.
🎯 Key Takeaway: Treat rate limiting as a system. CDN/WAF blocks the noise, Edge rejects cheaply, and Redis-backed app limits enforce real quotas consistently.
# Choosing the Right Limiting Algorithm
Most teams start with fixed windows and quickly regret it because of burst behavior. Token bucket and leaky bucket are typically better for user experience.
## Fixed Window
- Example: 100 requests per minute.
- Problem: an attacker can send 100 requests at the end of one minute and 100 at the start of the next, effectively doubling throughput.
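The boundary problem is easy to reproduce. The following is a minimal in-memory sketch, not production code: `FixedWindowLimiter` is a hypothetical class written for this illustration, and timestamps are passed in explicitly so the run is deterministic.

```typescript
// Minimal in-memory fixed-window counter, for illustration only.
class FixedWindowLimiter {
  private counts = new Map<number, number>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request at time nowMs fits in its window.
  allow(nowMs: number): boolean {
    const windowId = Math.floor(nowMs / this.windowMs);
    const used = this.counts.get(windowId) ?? 0;
    if (used >= this.limit) return false;
    this.counts.set(windowId, used + 1);
    return true;
  }
}

// 100 requests per minute.
const fixedWindow = new FixedWindowLimiter(100, 60000);

let acceptedAtBoundary = 0;
// 100 requests in the last second of the first minute...
for (let i = 0; i < 100; i++) {
  if (fixedWindow.allow(59000 + i)) acceptedAtBoundary++;
}
// ...and 100 more in the first second of the next minute.
for (let i = 0; i < 100; i++) {
  if (fixedWindow.allow(60000 + i)) acceptedAtBoundary++;
}

console.log(acceptedAtBoundary); // 200
```

All 200 requests are accepted in a span of about 1.1 seconds, even though the nominal limit is 100 per minute.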
## Sliding Window
- Smooths spikes but can be more complex and storage-heavy.
- Good for precise enforcement, especially at the WAF layer.
## Token Bucket (recommended for most Next.js apps)
- Think of a bucket that refills at a steady rate.
- Allows short bursts while enforcing sustained rate.
- Works well for UX: users can refresh or retry without immediate blocking.
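To make the refill behavior concrete, here is a small in-memory sketch of a token bucket. It is an illustration only: the `TokenBucket` class is hypothetical, it holds state in process memory (so it would not be shared across serverless instances), and time is passed in explicitly to keep the example deterministic.

```typescript
// In-memory token bucket, for illustration only (state is per process).
class TokenBucket {
  private tokens: number;
  private lastMs: number;
  private readonly capacity: number;
  private readonly refillPerSec: number;

  constructor(capacity: number, refillPerSec: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity; // start full so the first burst is allowed
    this.lastMs = nowMs;
  }

  // Refill based on elapsed time, then try to spend one token.
  take(nowMs: number): boolean {
    const elapsedSec = Math.max(0, nowMs - this.lastMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastMs = nowMs;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// Burst of 3, refilling one token every 2 seconds (0.5 tokens per second).
const bucket = new TokenBucket(3, 0.5, 0);

const burst = [bucket.take(0), bucket.take(0), bucket.take(0)]; // burst is allowed
const overBurst = bucket.take(0);       // bucket is empty
const tooSoon = bucket.take(1000);      // only half a token has refilled
const afterRefill = bucket.take(2000);  // a full token is available again

console.log(burst, overBurst, tooSoon, afterRefill);
```

The burst of three succeeds immediately, the fourth request is rejected, and a request is accepted again once a full token has refilled.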
A practical starting point for token bucket settings:
| Endpoint type | Sustained rate | Burst | Notes |
|---|---|---|---|
| Login, OTP, password reset | 5 per minute | 10 | Also add per-account and per-IP keys |
| Search | 30 per minute | 60 | Cache aggressively and consider bot rules |
| Contact form | 2 per minute | 5 | Add CAPTCHA only after suspicion |
| Public API (unauth) | 60 per minute | 120 | Prefer API keys where possible |
| Server Actions mutating data | 20 per minute | 40 | Key by user ID and session |
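The table maps directly onto the two token bucket parameters: burst is the bucket capacity, and the sustained rate divided by 60 is the refill rate per second. A small sketch of that conversion follows; `perMinute` and `LIMITS` are hypothetical names chosen for this example.

```typescript
type TokenBucketConfig = {
  capacity: number;     // max tokens (burst)
  refillPerSec: number; // tokens per second (sustained rate)
};

// Convert "N per minute sustained, M burst" into a token bucket config.
function perMinute(sustainedPerMin: number, burst: number): TokenBucketConfig {
  return { capacity: burst, refillPerSec: sustainedPerMin / 60 };
}

// Starting points from the table above; tune them against real traffic.
const LIMITS = {
  login: perMinute(5, 10),
  search: perMinute(30, 60),
  contactForm: perMinute(2, 5),
  publicApi: perMinute(60, 120),
  serverActionMutation: perMinute(20, 40),
};

console.log(LIMITS.search); // { capacity: 60, refillPerSec: 0.5 }
```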
# Keying Strategy: Avoid IP-Only Limits
IP-only limits are easy but cause false positives for:
- Corporate NATs
- Mobile carriers
- Shared Wi‑Fi
- Proxies and VPNs
Prefer composite keys:
1. Authenticated user ID when available.
2. API key for programmatic access.
3. Session ID for anonymous but cookie-based flows.
4. IP as a secondary dimension.
5. Coarse UA bucket to reduce trivial evasion.
A safe pattern is to limit both per-user and per-IP, then block only when both are abusive, or apply stricter action when one is extreme.
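One way to sketch this pattern is shown below. Everything here is an assumption made for illustration: the `Identity` shape, the `rl:<scope>:<type>:<id>` key layout, the `deniedStreak` signal (a count of consecutive denials you would have to track yourself), and the threshold of 50. The UA bucket is left out to keep the example short.

```typescript
// Hypothetical shape for whatever identity your auth layer exposes.
type Identity = { userId?: string; apiKeyId?: string; sessionId?: string; ip: string };

// Pick the most specific identifier available, in the order listed above.
function primaryKey(scope: string, id: Identity): string {
  if (id.userId) return `rl:${scope}:user:${id.userId}`;
  if (id.apiKeyId) return `rl:${scope}:key:${id.apiKeyId}`;
  if (id.sessionId) return `rl:${scope}:session:${id.sessionId}`;
  return `rl:${scope}:ip:${id.ip}`;
}

// The IP key is always tracked as a secondary dimension.
function ipKey(scope: string, id: Identity): string {
  return `rl:${scope}:ip:${id.ip}`;
}

// Result of checking one dimension; deniedStreak is an assumed extra signal.
type Signal = { allowed: boolean; deniedStreak: number };
type Verdict = "allow" | "challenge" | "block";

// Block only when both dimensions are over limit; escalate when one is extreme.
function decide(primary: Signal, ip: Signal, extremeStreak: number = 50): Verdict {
  if (!primary.allowed && !ip.allowed) return "block";
  const primaryExtreme = !primary.allowed && primary.deniedStreak >= extremeStreak;
  const ipExtreme = !ip.allowed && ip.deniedStreak >= extremeStreak;
  return primaryExtreme || ipExtreme ? "challenge" : "allow";
}

const anon: Identity = { ip: "203.0.113.5" };
const user: Identity = { userId: "u_42", ip: "203.0.113.5" };
console.log(primaryKey("search", anon), primaryKey("search", user), ipKey("search", user));
```

With this policy, a busy office NAT that trips only the IP limit does not lock out a signed-in user whose own key is still within budget.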
⚠️ Warning: Never rely on `x-forwarded-for` unless you are behind a trusted proxy and your platform guarantees it. On most platforms you should use the request IP provided by the runtime or verified headers.
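If you do have to read `x-forwarded-for` yourself, the leftmost entry is the one a client can forge, because each proxy appends the address of the peer it received the request from. A minimal sketch, assuming you know exactly how many trusted proxies sit in front of your app (`trustedHops` is a hypothetical parameter you would set per deployment):

```typescript
// Return the address appended by the outermost trusted proxy, or null if unknown.
// trustedHops = number of proxies you control or trust between client and app.
function clientIpFromForwardedFor(xff: string | null, trustedHops: number): string | null {
  if (!xff || trustedHops < 1) return null;
  const hops = xff
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
  // Count from the right: entries further left were supplied by untrusted parties.
  const index = hops.length - trustedHops;
  if (index < 0) return null;
  return hops[index] ?? null;
}

// One trusted proxy: the last entry is what that proxy saw as the client.
const viaOneProxy = clientIpFromForwardedFor("1.2.3.4, 203.0.113.9", 1);
// Two trusted proxies (for example CDN plus load balancer), with a spoofed first entry.
const viaTwoProxies = clientIpFromForwardedFor("6.6.6.6, 198.51.100.7, 10.0.0.1", 2);

console.log(viaOneProxy, viaTwoProxies); // 203.0.113.9 198.51.100.7
```

When the platform exposes a verified client IP, prefer that over parsing the header at all.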
# Next.js Rate Limiting for Route Handlers (Node Runtime)
Node runtime is the most flexible environment for Redis clients and crypto libraries. It’s also common for database access and heavier logic.
## Step 1: Define a Token Bucket in Redis
A token bucket needs to store two values per key: current tokens and last refill timestamp. We’ll implement it with one atomic Redis script so concurrent requests behave correctly.
This example is designed for Redis-compatible services that support Lua scripts. If your provider does not support scripts, use a server-side rate limit product or a different algorithm.
```typescript
// lib/rateLimit.ts
export type RateLimitResult = {
  allowed: boolean;
  remaining: number;
  retryAfterMs: number;
};

export type TokenBucketConfig = {
  capacity: number; // max tokens (burst size)
  refillPerSec: number; // tokens added per second (sustained rate)
};

// Exported so the Redis wrapper can pass it to EVAL.
// Redis runs the whole script atomically, so concurrent requests cannot double-spend tokens.
export const LUA_TOKEN_BUCKET = `
local key = KEYS[1]
local capacity = tonumber(ARGV[1])
local refill_per_sec = tonumber(ARGV[2])
local now_ms = tonumber(ARGV[3])

-- Load current state; a missing key means a full bucket.
local data = redis.call("HMGET", key, "tokens", "ts")
local tokens = tonumber(data[1])
local ts = tonumber(data[2])
if tokens == nil then tokens = capacity end
if ts == nil then ts = now_ms end

-- Refill in proportion to elapsed time, capped at capacity.
local delta_ms = math.max(0, now_ms - ts)
local refill = (delta_ms / 1000.0) * refill_per_sec
tokens = math.min(capacity, tokens + refill)

-- Spend one token if available.
local allowed = 0
if tokens >= 1.0 then
  allowed = 1
  tokens = tokens - 1.0
end

-- Persist state; expire idle keys once the bucket would be full again (plus a margin).
redis.call("HMSET", key, "tokens", tokens, "ts", now_ms)
redis.call("PEXPIRE", key, math.ceil((capacity / refill_per_sec) * 1000) + 60000)

local retry_after_ms = 0
if allowed == 0 then
  retry_after_ms = math.ceil((1.0 - tokens) / refill_per_sec * 1000)
end
return {allowed, math.floor(tokens), retry_after_ms}
`;
```

## Step 2: Wire It to Your Redis Client

Below is a minimal wrapper that assumes your Redis client has an eval method.

```typescript
// lib/rateLimitRedis.ts
import { LUA_TOKEN_BUCKET } from "./rateLimit";
import type { RateLimitResult, TokenBucketConfig } from "./rateLimit";

export async function tokenBucketLimit(opts: {
  redis: any;
  key: string;
  config: TokenBucketConfig;
  nowMs?: number;
}): Promise<RateLimitResult> {
  const nowMs = opts.nowMs ?? Date.now();
  const { capacity, refillPerSec } = opts.config;

  // This options-object form matches node-redis v4; other clients
  // (for example ioredis) take positional arguments instead.
  const res = await opts.redis.eval(LUA_TOKEN_BUCKET, {
    keys: [opts.key],
    arguments: [String(capacity), String(refillPerSec), String(nowMs)],
  });

  // The script returns [allowed (0 or 1), remaining tokens, retry-after in ms].
  const allowed = Number(res[0]) === 1;
  const remaining = Number(res[1] ?? 0);
  const retryAfterMs = Number(res[2] ?? 0);
  return { allowed, remaining, retryAfterMs };
}
```

If your Redis library uses a different eval signature, adapt only that part. The logic stays the same.
## Step 3: Apply It in a Next.js Route Handler

Example route: `app/api/search/route.ts`

```typescript
// app/api/search/route.ts
import { NextResponse } from "next/server";
import { tokenBucketLimit } from "@/lib/rateLimitRedis";
import { redis } from "@/lib/redis";

export const runtime = "nodejs";

// Only trust x-forwarded-for when a proxy you control sets it (see the warning above).
// Prefer a verified client IP from your platform when one is available.
function getClientIp(req: Request) {
  return req.headers.get("x-forwarded-for")?.split(",")[0]?.trim() ?? "unknown";
}

export async function GET(req: Request) {
  const ip = getClientIp(req);
  const key = `rl:search:ip:${ip}`;

  const limit = await tokenBucketLimit({
    redis,
    key,
    config: { capacity: 60, refillPerSec: 0.5 }, // 30/min sustained, 60 burst
  });

  if (!limit.allowed) {
    return new NextResponse("Too Many Requests", {
      status: 429,
      headers: {
        // Retry-After is expressed in whole seconds.
        "Retry-After": String(Math.ceil(limit.retryAfterMs / 1000)),
      },
    });
  }

  // Continue with search logic...
  return NextResponse.json({ ok: true, remaining: limit.remaining });
}
```

This is “good enough” for many public endpoints, but you’ll want better keys for authenticated users.
# Next.js Rate Limiting for Server Actions
Server Actions are powerful because they run server-side and can be called from forms and components. They’re also attractive to abuse because they often trigger mutations and expensive downstream calls.
## Pattern: Wrap Server Actions with a Limiter
You can build a wrapper that:
- Derives a stable key from user ID or session.
- Falls back to IP when unauthenticated.
- Returns a safe error that your UI can handle.
```typescript
// lib/withRateLimit.ts
// No "use server" directive here: files marked "use server" may only export
// async functions, and withRateLimit is a synchronous factory.
import { headers } from "next/headers";
import { tokenBucketLimit } from "@/lib/rateLimitRedis";
import { redis } from "@/lib/redis";

type LimiterOpts = {
  name: string;
  capacity: number;
  refillPerSec: number;
  keySuffix?: string;
};

async function getIpFromHeaders(): Promise<string> {
  // headers() returns a Promise in Next.js 15+; awaiting is also safe on older versions.
  const h = await headers();
  return h.get("x-forwarded-for")?.split(",")[0]?.trim() ?? "unknown";
}

export function withRateLimit<TArgs extends any[], TResult>(
  action: (...args: TArgs) => Promise<TResult>,
  opts: LimiterOpts
) {
  return async (...args: TArgs): Promise<TResult> => {
    // This minimal version keys by IP only; swap in a user or session ID when you have one.
    const ip = await getIpFromHeaders();
    const key = `rl:action:${opts.name}:ip:${ip}:${opts.keySuffix ?? "default"}`;

    const limit = await tokenBucketLimit({
      redis,
      key,
      config: { capacity: opts.capacity, refillPerSec: opts.refillPerSec },
    });

    if (!limit.allowed) {
      throw new Error("RATE_LIMITED");
    }
    return action(...args);
  };
}
```

Then use it:
```typescript
// app/actions/submitContact.ts
"use server";
import { withRateLimit } from "@/lib/withRateLimit";

async function submitContactImpl(formData: FormData) {
  // validate, store, notify
  return { ok: true };
}

export const submitContact = withRateLimit(submitContactImpl, {
  name: "submitContact",
  capacity: 5, // burst
  refillPerSec: 2 / 60, // 2/min sustained
});
```

## Edge vs Node for Server Actions
In practice, most Server Actions run in Node runtime on platforms like Vercel, and that’s usually what you want for Redis clients and auth SDKs. If you force Edge, verify all dependencies are Edge-compatible and that your Redis access is via HTTP.
If you want a deeper runtime comparison, use our Edge vs Node guide.
# Edge Runtime Patterns: Fast Rejection, Limited State
Edge runtime is excellent for early, cheap denial and for shaping traffic before it hits origin. The tradeoff is limited library compatibility and state management.
## Option A: Edge Middleware “Gate”
Use middleware to block obvious abuse for entire route groups, such as /api/ or sensitive pages.
- Reject known bad bots by User-Agent signature.
- Enforce simple per-IP limits using an edge KV or a vendor-specific rate limit service.
- Require a minimal header for internal calls.
Because Next.js Middleware runs on Edge, it’s a good place to add coarse-grained rules and redirect or block.
ℹ️ Note: Middleware is not a perfect place for strict quotas because it can add latency and you must avoid high-cardinality state calls on every request. Use it for shaping and cheap checks.
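The cheap checks described above can be kept in a pure function so they are easy to test outside of Next.js. This is a sketch under assumptions: the User-Agent patterns are illustrative placeholders rather than a vetted blocklist, and rejecting requests with no User-Agent is a policy choice you may not want.

```typescript
// Illustrative patterns only; maintain your own list based on observed traffic.
const BLOCKED_UA_PATTERNS: RegExp[] = [/python-requests/i, /scrapy/i, /curl\//i];

type GateDecision = "pass" | "block";

// Decide from the path and User-Agent alone: no network calls, no shared state.
function gateDecision(pathname: string, userAgent: string | null): GateDecision {
  // Only gate the API route group; everything else passes through.
  if (!pathname.startsWith("/api/")) return "pass";
  // Assumed policy: API clients must send a User-Agent.
  if (userAgent === null || userAgent.trim() === "") return "block";
  for (const pattern of BLOCKED_UA_PATTERNS) {
    if (pattern.test(userAgent)) return "block";
  }
  return "pass";
}

console.log(gateDecision("/api/search", "python-requests/2.31"));             // "block"
console.log(gateDecision("/api/search", "Mozilla/5.0 (Macintosh) Safari/17")); // "pass"
console.log(gateDecision("/pricing", null));                                  // "pass"
```

In `middleware.ts` you would call it with `request.nextUrl.pathname` and `request.headers.get("user-agent")`, and return a 403 response when the result is `"block"`.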
## Option B: Edge Route Handler With Redis HTTP
Some managed Redis providers offer an HTTP-based API that is Edge-friendly. The same token bucket logic applies, but client and latency characteristics change.
When you’re at the edge, be realistic about latency. If your Redis is far from your edge POP, you can add 30 to 100 milliseconds per request, which is too expensive for high-volume routes. In those cases, let the WAF or CDN handle most noise and keep Redis checks for authenticated or expensive operations.
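One way to keep that latency bounded is to give the limiter check a time budget and decide up front what happens when the budget is exceeded. The sketch below is an assumption-laden illustration: `checkWithBudget` is a hypothetical helper, `check` stands in for whatever Redis call you make, and whether to allow or deny on timeout depends on how sensitive the route is.

```typescript
type RateLimitResult = { allowed: boolean; remaining: number; retryAfterMs: number };

// Race the limiter check against a timer so a slow Redis cannot stall the request.
async function checkWithBudget(
  check: () => Promise<RateLimitResult>,
  budgetMs: number,
  onTimeout: "allow" | "deny"
): Promise<RateLimitResult & { timedOut: boolean }> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<"timeout">((resolve) => {
    timer = setTimeout(() => resolve("timeout"), budgetMs);
  });
  try {
    const result = await Promise.race([check(), timeout]);
    if (result === "timeout") {
      const allowed = onTimeout === "allow";
      // Surface timedOut so it can be counted in metrics and alerts.
      return { allowed, remaining: 0, retryAfterMs: allowed ? 0 : budgetMs, timedOut: true };
    }
    return { ...result, timedOut: false };
  } finally {
    clearTimeout(timer);
  }
}
```

A search endpoint might pass `"allow"` so users are not blocked by a slow Redis, while a login endpoint might pass `"deny"`. Errors thrown by `check` still propagate to the caller in this sketch.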
# Redis-Backed Limits: Consistency Across Serverless Instances
In serverless and autoscaled environments, in-memory limits reset frequently and are not shared between instances. Redis solves this by being a shared store.
## When Redis-backed limiting is worth it
- You have API keys and quotas.
- You need per-user throttles for Server Actions.
- You pay per request to third parties and want to control cost.
- You need consistent enforcement across regions or instances.
## When Redis-backed limiting is not worth it
- Purely static pages behind a CDN.
- Endpoints already protected by WAF rules and caching.
- Low-traffic internal tools.
A practical approach is to implement Redis-based limiting only for these categories:
| Category | Example endpoints | Recommended key | Why |
|---|---|---|---|
| Authentication | /api/login, /api/otp | submitted account identifier plus IP | Prevent stuffing and account lockouts |
| Expensive reads | /api/search, /api/reports | user ID or session | Protect DB and prevent scraping |
| Mutations | Server Actions that write | user ID | Prevent spam and abuse |
| Third-party cost | AI calls, SMS, email | user ID plus plan | Prevent bill shock |
# WAF and CDN Rules: Your Highest ROI Protection
Rate limiting at the application layer is more precise, but it’s more expensive. WAF and CDN rules should stop the majority of bad traffic before it touches your Next.js runtime.
## Practical WAF rules that work
- Request rate rules on `/api/login`, `/api/otp`, `/api/reset-password`.
- Bot score or managed bot protection for `/api/search`, `/products`, `/sitemap.xml`.
- Geo rules if your business is region-limited.
- Challenge on suspicious patterns rather than outright blocking.
Many teams see that the top 1 to 5 percent of abusive IPs generate a large share of requests. It’s common to reduce origin traffic significantly by blocking or challenging those IPs at the edge. Cloudflare has publicly shared case studies where bot management reduces bot traffic by large margins, and in practice we often see double-digit reductions in origin requests after enabling bot rules and tuning.
💡 Tip: Start with “challenge” for high-risk routes, not “block”. It reduces false positives while still making automated abuse expensive.
## Sample policy approach by route sensitivity
| Route | Default action | Escalation | Notes |
|---|---|---|---|
| `/api/login` | strict limit | challenge or block | Add per-account throttles in app |
| `/api/search` | moderate limit | challenge | Cache results and paginate |
| `/api/public/*` | moderate limit | block on obvious abuse | Encourage API keys |
| `/app/*` authenticated pages | light shaping | app-level user limits | Focus on abusive sessions |
# Response Design: 429, Retry-After, and UX
Always respond with:
- `429 Too Many Requests`
- `Retry-After` header in seconds
- A stable error code your client can interpret
For Server Actions, map the error to a UI message like “Too many attempts, try again in 30 seconds” and keep it consistent across the app.
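A sketch of that mapping, assuming your actions return a result object instead of throwing (thrown error messages may be redacted by Next.js in production builds, so a returned code tends to be easier for the UI to rely on). `ActionResult` and `rateLimited` are hypothetical names for this example.

```typescript
// Hypothetical result shape shared by all rate-limited actions.
type ActionResult<T> =
  | { ok: true; data: T }
  | { ok: false; code: "RATE_LIMITED"; retryAfterSec: number; message: string };

// Build the rate-limited result from the limiter's retryAfterMs value.
function rateLimited(retryAfterMs: number): ActionResult<never> {
  const retryAfterSec = Math.max(1, Math.ceil(retryAfterMs / 1000));
  const unit = retryAfterSec === 1 ? "second" : "seconds";
  return {
    ok: false,
    code: "RATE_LIMITED",
    retryAfterSec,
    message: `Too many attempts, try again in ${retryAfterSec} ${unit}`,
  };
}

const limitedResult = rateLimited(29500);
if (!limitedResult.ok) {
  console.log(limitedResult.message); // "Too many attempts, try again in 30 seconds"
}
```

The client can branch on `code` for behavior and show `message` as is, which keeps the wording consistent across forms.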
If you provide APIs to third parties, document limits and return headers like:
- `X-RateLimit-Limit`
- `X-RateLimit-Remaining`
- `X-RateLimit-Reset`
Even if you don’t implement all of them today, start with Retry-After.
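A minimal sketch of a response builder that sets these headers with the standard `Response` class. Two things are assumptions here: the `tooManyRequests` helper name, and the choice to express `X-RateLimit-Reset` as a Unix timestamp in seconds (providers differ on this, so document whichever convention you pick).

```typescript
// Build a 429 response with Retry-After and X-RateLimit-* headers.
function tooManyRequests(opts: {
  limit: number;        // bucket capacity advertised to clients
  remaining: number;    // tokens left (0 when rejecting)
  retryAfterMs: number; // from the limiter
  nowMs?: number;       // injectable clock for tests
}): Response {
  const nowMs = opts.nowMs ?? Date.now();
  // Retry-After is in whole seconds; never advertise 0 on a rejection.
  const retryAfterSec = Math.max(1, Math.ceil(opts.retryAfterMs / 1000));
  // Assumed convention: reset time as Unix epoch seconds.
  const resetEpochSec = Math.ceil((nowMs + opts.retryAfterMs) / 1000);

  return new Response(JSON.stringify({ error: "RATE_LIMITED", retryAfterSec }), {
    status: 429,
    headers: {
      "Content-Type": "application/json",
      "Retry-After": String(retryAfterSec),
      "X-RateLimit-Limit": String(opts.limit),
      "X-RateLimit-Remaining": String(opts.remaining),
      "X-RateLimit-Reset": String(resetEpochSec),
    },
  });
}

const res429 = tooManyRequests({ limit: 60, remaining: 0, retryAfterMs: 1500, nowMs: 10000 });
console.log(res429.status, res429.headers.get("Retry-After")); // 429 "2"
```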
# Monitoring and Alerting: Detect Abuse and Tuning Opportunities
Rate limiting without monitoring turns into silent user impact. Track both allowed and blocked traffic.
What to measure:
| Metric | Why it matters | Good alert trigger |
|---|---|---|
| Total 429 rate | Detect misconfiguration | Sudden increase over baseline |
| 429 by route | Find hot endpoints | One endpoint spikes |
| 429 by key type | Diagnose false positives | Many distinct IPs blocked |
| Top blocked keys | Identify abusive sources | Single key dominates |
| Latency added by limiter | Ensure limiter isn’t the bottleneck | P95 increases after rollout |
| Redis errors and timeouts | Prevent fail-open surprises | Error rate above 1 percent |
Log in a structured way. Include:
- route
- limiter name
- key type, not the full key when it contains personal data
- remaining tokens
- retryAfter
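As a sketch, the fields above can be emitted as one JSON line per limiter decision. The field names and the `limiterLogLine` helper are assumptions for illustration; note that the function accepts only the key type, so the raw key never reaches the log.

```typescript
type KeyType = "user" | "apiKey" | "session" | "ip";

// One structured log line per limiter decision. The raw key is deliberately not an input.
function limiterLogLine(entry: {
  requestId: string;
  route: string;
  limiter: string;
  keyType: KeyType;
  allowed: boolean;
  remaining: number;
  retryAfterMs: number;
}): string {
  return JSON.stringify({
    event: "rate_limit",
    requestId: entry.requestId,
    route: entry.route,
    limiter: entry.limiter,
    keyType: entry.keyType,
    allowed: entry.allowed,
    remaining: entry.remaining,
    retryAfterMs: entry.retryAfterMs,
  });
}

const logLine = limiterLogLine({
  requestId: "req_123",
  route: "/api/search",
  limiter: "search",
  keyType: "ip",
  allowed: false,
  remaining: 0,
  retryAfterMs: 1200,
});
console.log(logLine);
```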
If you’re building an observability baseline, follow our web app observability guide and ensure logs and metrics can be correlated with request IDs.
⚠️ Warning: Do not log raw IP plus user identifiers together if you don’t need them. Treat these logs as sensitive and apply retention policies.
# False-Positive Mitigation and Safe Rollouts
False positives are the main reason teams disable limits. Design for gradual rollout and quick mitigation.
## Use these mitigation levers
1. Shadow mode: calculate limits and log would-block events, but don’t block. Roll out for 24 to 72 hours and review.
2. Tiered thresholds: set higher limits for authenticated users and paying customers.
3. Allowlists: allowlist your office IPs, uptime monitors, CI, and trusted third-party services.
4. Key design: use user ID when available. Avoid IP-only for authenticated flows.
5. Grace for retries: token bucket burst capacity prevents accidental blocks from double submits and flaky mobile connections.
6. Escalation instead of blocking: challenge, add CAPTCHA, or require email verification before hard block.
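Shadow mode in particular is cheap to add if the limiter check goes through one wrapper. The following is a sketch under assumptions: `enforceOrShadow` and the `Mode` type are hypothetical, `check` stands in for your real limiter call, and `logWouldBlock` is whatever logging or metrics hook you already have.

```typescript
type RateLimitResult = { allowed: boolean; remaining: number; retryAfterMs: number };
type Mode = "shadow" | "enforce";

// In shadow mode, record what would have been blocked but let the request through.
async function enforceOrShadow(
  check: () => Promise<RateLimitResult>,
  mode: Mode,
  logWouldBlock: (result: RateLimitResult) => void
): Promise<RateLimitResult> {
  const result = await check();
  if (result.allowed) return result;
  if (mode === "shadow") {
    logWouldBlock(result);
    return { ...result, allowed: true };
  }
  return result;
}
```

Driving `mode` from configuration per limiter lets you move one route at a time from `"shadow"` to `"enforce"` after reviewing the would-block logs.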
## Common false-positive scenarios and fixes
| Scenario | Symptom | Fix |
|---|---|---|
| Corporate NAT | Many users share one IP | Key by user ID, increase IP burst |
| Mobile carrier NAT | Random blocks on mobile | Prefer session key, lower IP weighting |
| Aggressive prefetching | Extra GET traffic | Exempt prefetch headers or specific routes |
| Webhooks | Provider retries on failure | Higher burst, idempotency keys, allowlist provider IPs |
# Putting It Together: A Practical Baseline Configuration
If you want a configuration that works for most SaaS and content platforms, start here and tune.
| Component | Baseline setting | Purpose |
|---|---|---|
| WAF rule | 20 requests per 10 seconds per IP for /api/login | Stops credential stuffing bursts |
| Edge shaping | Block obvious bad bots by UA plus behavior | Cheap early rejection |
| Redis token bucket | Login: 5 per minute sustained, 10 burst per IP plus per account | Consistent enforcement |
| Redis token bucket | Search: 30 per minute sustained, 60 burst per session | Protects DB |
| Action wrapper | Mutations: 20 per minute sustained, 40 burst per user | Prevents spam |
| Observability | Alert on 429 rate and Redis errors | Prevents silent user impact |
Tie this back into your overall security program. Rate limiting and bot protection are core controls alongside input validation, auth hardening, and secure headers, all covered in our web application security checklist.
# Key Takeaways
- Use a layered strategy: WAF or CDN rules first, Edge shaping second, Redis-backed token bucket limits for consistent quotas in Next.js.
- Avoid IP-only keys for authenticated flows; prefer composite keys like user ID plus IP to reduce false positives behind NAT.
- Implement token bucket limits for better UX: allow short bursts while enforcing sustained rates, and always return `429` with `Retry-After`.
- Rate limit Server Actions by wrapping them with a reusable limiter and storing counters in Redis so limits hold across serverless instances.
- Monitor 429 rate, top blocked keys, Redis latency, and errors; roll out in shadow mode before enforcing blocks.
# Conclusion
Next.js apps are fast to ship, but they also expose high-value endpoints through APIs and Server Actions that bots love to exploit. Implement Next.js rate limiting as a layered system, start with token bucket limits for UX-friendly throttling, and use WAF or CDN rules to stop noisy traffic before it hits your origin.
If you want help designing thresholds, implementing Redis-backed limits, or tuning bot protection without hurting real users, reach out to Samioda. We’ll review your routes, recommend a rollout plan, and implement monitoring so you can enforce limits confidently.