Rate limiting strategies for form endpoints


June 30, 2026 · 12 min read

Your contact form goes live. A few days later, you open the submission log and find three hundred entries from the past hour. The email address is randomized in each one. The message body is marketing copy for something unrelated to your business. Nothing in the endpoint code is throwing errors: it accepted every request, stored every record, and attempted to send a notification for each one.

By the time you notice, your email provider has throttled you for the day.

Rate limiting is the layer that caps how many requests a single source can push through your endpoint in a given time window. Without it, your endpoint is an open pipe. With it, you define what legitimate volume looks like and reject everything above that threshold automatically.

There are four main rate limiting strategies, and they each handle the volume problem differently. Which one belongs in your stack depends on the form, the environment, and how much implementation complexity you want to take on.

Why form endpoints need rate limiting specifically

Most APIs require authentication. A rate limit on an authenticated endpoint is a layer of abuse protection on top of an already-controlled surface. You know who's calling; the question is whether they're calling too much.

Contact forms work differently. They are, by design, open to the public. You want anyone who finds your site to be able to reach you. That openness is the whole point, and it's exactly what makes rate limiting non-optional.

Without it, you're exposed to at least three categories of abuse:

Spam bots crawl the web, find <form> elements, and submit them repeatedly. They don't target you specifically. They target everything. A form that went live this morning can have bot traffic by afternoon.
Email enumeration happens when an attacker uses error responses to probe whether email addresses exist in your system. A rate limit narrows this window: after a few attempts, the source is blocked, which makes enumeration at any useful scale slow and expensive.
Accidental duplicate submissions from real users (a double-click, a network retry, an impatient re-submit) are lower-stakes but still worth handling. A rate limit is a natural backstop even when no abuse is involved.

All three look the same to your endpoint: POST requests arriving faster than a real person would send them. A rate limiter doesn't need to distinguish between them. It just needs to enforce a ceiling.

The four rate limiting strategies

Fixed window

The simplest implementation. Pick a time window (say, one hour) and count requests from each IP within that window. When the count exceeds your limit, reject further requests until the window resets.

The implementation is minimal. With Redis, you increment a counter keyed by the IP and the current hour, set an expiry on the first increment, and check the count on every request:

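// Simple fixed window with Redis
import { Redis } from "@upstash/redis"; // any client exposing incr/expire works similarly

async function checkFixedWindow(redis: Redis, ip: string, limit: number): Promise<boolean> {
  const windowKey = Math.floor(Date.now() / (60 * 60 * 1000)); // index of the current hour
  const key = `rate:${ip}:${windowKey}`;

  const count = await redis.incr(key);
  if (count === 1) {
    // First request in this window: let the key clean itself up after an hour.
    // INCR and EXPIRE are separate calls, so a crash between them leaves a stale
    // key behind; a pipeline or Lua script would make the pair atomic.
    await redis.expire(key, 3600);
  }

  return count <= limit; // true while the source is within the limit
}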

The tradeoff: fixed windows are vulnerable to bursting at window boundaries. A source can submit up to the limit in the final minutes of one window, then submit up to the limit again at the start of the next. For a limit of 10 per hour, that's potentially 20 requests in a few minutes while staying within the stated rule. For most contact forms, this is acceptable. For higher-value endpoints, it isn't.
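To make the boundary burst concrete, here is a minimal in-memory sketch of the same fixed-window logic with an injected clock. The names `createFixedWindow` and `allowFixed`, the IP address, and the timestamps are made up for illustration, and the in-memory `Map` stands in for Redis only so the example runs on its own.

```typescript
// In-memory fixed-window counter, for illustration only.
const HOUR_MS = 60 * 60 * 1000;

function createFixedWindow(limit: number, windowMs: number) {
  const counts = new Map<string, number>();

  // Returns true if the request is allowed. `nowMs` is injected so the
  // boundary case can be reproduced without waiting on a real clock.
  return function allow(ip: string, nowMs: number): boolean {
    const windowIndex = Math.floor(nowMs / windowMs);
    const key = `${ip}:${windowIndex}`;
    const count = (counts.get(key) ?? 0) + 1;
    counts.set(key, count);
    return count <= limit;
  };
}

// Hypothetical scenario: 10 requests just before the hour, 10 just after.
const allowFixed = createFixedWindow(10, HOUR_MS);
const boundaryMs = 5 * HOUR_MS; // an arbitrary window boundary
let acceptedCount = 0;

for (let i = 0; i < 10; i++) {
  // Last minute of the earlier window.
  if (allowFixed("203.0.113.7", boundaryMs - 60_000 + i)) acceptedCount++;
}
for (let i = 0; i < 10; i++) {
  // First moments of the next window.
  if (allowFixed("203.0.113.7", boundaryMs + i)) acceptedCount++;
}

console.log(acceptedCount); // 20
```

All 20 requests pass even though they arrive within about a minute of each other, because each batch of 10 is counted against a different window.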

Sliding window

A sliding window measures the actual trailing interval rather than snapping to a clock boundary. Instead of "10 per hour resetting at :00," it's "10 in any 60-minute period." A submission at 12:58 counts against the window until 1:58. The burst problem at window boundaries disappears.

The implementation is more involved if you build it manually, but Upstash's Ratelimit library handles it with a few lines of configuration:

import { Redis } from "@upstash/redis";
import { Ratelimit } from "@upstash/ratelimit";

const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(10, "1 h"),
});

export async function POST(req: Request) {
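  // x-forwarded-for may hold a comma-separated chain; key on the first (client) entry only
  const ip = req.headers.get("x-forwarded-for")?.split(",")[0]?.trim() || "anonymous";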
  const { success, reset } = await ratelimit.limit(ip);

  if (!success) {
    return new Response("Too many requests", {
      status: 429,
      headers: {
        "Retry-After": String(Math.ceil((reset - Date.now()) / 1000)),
      },
    });
  }

  // proceed with submission handling
}

Sliding window is the right default for most form endpoints. It's accurate without requiring meaningfully more complexity than fixed window when you're using a library that handles the sliding logic for you.
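For a sense of what "building it manually" involves, here is a minimal sketch of a sliding-window log kept in memory. It is an assumption-laden illustration, not a description of how the Upstash library works internally: the helper name `createSlidingWindow` and the minute-based timestamps are invented for the example, and a real version would keep the log in shared storage.

```typescript
// Sliding-window log kept in memory, for illustration only.
function createSlidingWindow(limit: number, windowMs: number) {
  const log = new Map<string, number[]>(); // ip -> timestamps of accepted requests

  return function allow(ip: string, nowMs: number): boolean {
    const cutoff = nowMs - windowMs;
    // Keep only the timestamps that still fall inside the trailing window.
    const recent = (log.get(ip) ?? []).filter((t) => t > cutoff);
    const allowed = recent.length < limit;
    if (allowed) recent.push(nowMs);
    log.set(ip, recent);
    return allowed;
  };
}

const MINUTE_MS = 60_000;
const allowSliding = createSlidingWindow(10, 60 * MINUTE_MS);

// Ten submissions at "12:58" (minute 58 of an arbitrary clock).
const firstTen = Array.from({ length: 10 }, () =>
  allowSliding("203.0.113.7", 58 * MINUTE_MS)
);
// "13:05" is still inside the trailing hour, so this one is rejected.
const at1305 = allowSliding("203.0.113.7", 65 * MINUTE_MS);
// By "13:58" the 12:58 entries have aged out of the window.
const at1358 = allowSliding("203.0.113.7", 118 * MINUTE_MS);

console.log(firstTen.every(Boolean), at1305, at1358); // true false true
```

The cost of the exact log is memory: one timestamp per accepted request per source, which is part of why a library is the more practical route.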

Token bucket

Each IP starts with a bucket of tokens. Submitting the form uses one token. Tokens refill at a fixed rate up to the bucket's maximum capacity.

The key difference from window-based approaches: token bucket allows bursting within the capacity limit. An IP that hasn't submitted in a while has a full bucket and can make several rapid submissions before being throttled. This models legitimate user behavior better. Someone who fills out a contact form and then submits a newsletter signup a few seconds later shouldn't be penalized for the burst.

Upstash supports token bucket as well:

const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.tokenBucket(5, "1 m", 10), // 5 tokens/min refill, 10 max
});

Use token bucket when your site has several forms on one page and legitimate users might submit to more than one in quick succession. The burst capacity accommodates them; the refill rate limits sustained abuse.
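The refill-and-spend mechanics can be sketched in memory as follows. This is a simplified model under stated assumptions: tokens are added in whole intervals, the parameter order mirrors the configuration above (refill amount, interval, capacity), and `createTokenBucket` is a hypothetical helper rather than part of any library.

```typescript
interface Bucket {
  tokens: number;
  lastRefillMs: number;
}

function createTokenBucket(refillPerInterval: number, intervalMs: number, capacity: number) {
  const buckets = new Map<string, Bucket>();

  return function allow(ip: string, nowMs: number): boolean {
    // A source seen for the first time starts with a full bucket.
    const bucket = buckets.get(ip) ?? { tokens: capacity, lastRefillMs: nowMs };

    // Add tokens for each whole interval elapsed, capped at capacity.
    const intervals = Math.floor((nowMs - bucket.lastRefillMs) / intervalMs);
    if (intervals > 0) {
      bucket.tokens = Math.min(capacity, bucket.tokens + intervals * refillPerInterval);
      bucket.lastRefillMs += intervals * intervalMs;
    }

    const allowed = bucket.tokens > 0;
    if (allowed) bucket.tokens -= 1;
    buckets.set(ip, bucket);
    return allowed;
  };
}

// 5 tokens per minute, 10 max.
const allowBucket = createTokenBucket(5, 60_000, 10);

// A burst of 11 at t=0: the first 10 drain the bucket, the 11th is refused.
const burst = Array.from({ length: 11 }, () => allowBucket("203.0.113.7", 0));
// One minute later, 5 tokens have been refilled: 5 pass, the 6th is refused.
const afterRefill = Array.from({ length: 6 }, () => allowBucket("203.0.113.7", 60_000));

console.log(burst.filter(Boolean).length, afterRefill.filter(Boolean).length); // 10 5
```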

Leaky bucket

A leaky bucket queues incoming requests and processes them at a fixed output rate, regardless of how fast they arrive. Requests that can't fit in the queue are rejected.

The name describes the behavior: water (requests) pours in at any rate, but drains out at a fixed rate. The bucket smooths spikes into a constant output.

Leaky bucket is rarely the right choice for a contact form endpoint. The latency it adds (requests wait in queue before being processed) is a poor fit for form submissions where you want an immediate response. It makes more sense for protecting downstream services with strict processing rate limits, not for a public-facing form handler.
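For comparison with the other strategies, a bare-bones sketch of the queue behavior might look like this. The helper `createLeakyBucket` and the item names are illustrative assumptions; in a real service a timer or worker would call `leak()` at the fixed output rate, which is left out here so the example stays synchronous.

```typescript
function createLeakyBucket<T>(capacity: number) {
  const queue: T[] = [];

  return {
    // Accepts the item if the queue has room; returns false when it overflows.
    offer(item: T): boolean {
      if (queue.length >= capacity) return false;
      queue.push(item);
      return true;
    },
    // Removes at most one item; meant to be called at the fixed drain rate.
    leak(): T | undefined {
      return queue.shift();
    },
    size(): number {
      return queue.length;
    },
  };
}

const leaky = createLeakyBucket<string>(3);

// Five requests arrive at once: three fit in the queue, two are rejected.
const offered = ["a", "b", "c", "d", "e"].map((id) => leaky.offer(id));
console.log(offered); // true, true, true, false, false

// One drain tick frees a slot, so the next arrival is accepted again.
const drained = leaky.leak();
const offeredAfterDrain = leaky.offer("f");
console.log(drained, offeredAfterDrain); // "a" true
```

Accepted requests sit in the queue until a drain tick reaches them, which is the latency cost described above.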

Rate limiting in serverless environments

This is where most implementations go wrong.

If you're running a traditional long-lived server, you can keep an in-memory counter (a plain object or Map) that persists between requests. In a serverless environment (Vercel Edge Functions, Cloudflare Workers, AWS Lambda), requests are spread across many short-lived instances that share no memory. An in-memory counter exists only inside one instance and disappears whenever that instance is recycled or a cold start spins up a new one. You can have 500 concurrent instances all starting their counter at zero while the same IP submits through all of them.
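A small simulation shows how much slips through when counters are not shared. The three "instances" below are plain closures standing in for isolated serverless instances, and the round-robin routing is an assumption made purely to keep the arithmetic easy to follow.

```typescript
// Each instance has its own private counter, like an isolated serverless instance.
function createInstance(limit: number) {
  const counts = new Map<string, number>();
  return (ip: string): boolean => {
    const count = (counts.get(ip) ?? 0) + 1;
    counts.set(ip, count);
    return count <= limit;
  };
}

// Intended limit: 5 requests per source. Three instances enforce it independently.
const instances = [createInstance(5), createInstance(5), createInstance(5)];

let passedCount = 0;
for (let i = 0; i < 30; i++) {
  // Hypothetical round-robin routing of 30 requests from one IP.
  const instance = instances[i % instances.length]!;
  if (instance("203.0.113.7")) passedCount++;
}

console.log(passedCount); // 15
```

With three instances the effective limit is three times the intended one, and it grows with every additional instance the platform starts.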

The fix is shared, persistent state that all invocations can read and write. Redis is the standard answer. For serverless specifically, Upstash is the practical choice: it's an HTTP-based Redis that works from edge runtimes without the TCP connection overhead a traditional Redis client requires.

// Next.js App Router route handler; the same pattern applies in Vercel Edge Functions and Cloudflare Workers
import { Redis } from "@upstash/redis";
import { Ratelimit } from "@upstash/ratelimit";
import { NextRequest } from "next/server";

const redis = Redis.fromEnv(); // UPSTASH_REDIS_REST_URL + UPSTASH_REDIS_REST_TOKEN

const ratelimit = new Ratelimit({
  redis,
  limiter: Ratelimit.slidingWindow(5, "1 h"),
  analytics: true, // optional: tracks usage in Upstash console
});

export async function POST(req: NextRequest) {
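  // NextRequest.ip was removed in Next.js 15, so read the forwarded header and key on its first entry
  const ip = req.headers.get("x-forwarded-for")?.split(",")[0]?.trim() || "anonymous";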
  const { success, reset } = await ratelimit.limit(ip);

  if (!success) {
    return Response.json(
      { error: "Too many requests. Please try again later." },
      {
        status: 429,
        headers: {
          "Retry-After": String(Math.ceil((reset - Date.now()) / 1000)),
        },
      }
    );
  }

  // validate, store, notify...
}

For Next.js App Router, place the rate limit check at the top of your route handler, before any database or email logic. Abusive requests should never touch your downstream services. "The hidden complexity of handling form submissions" covers the full backend stack this fits into.

Note

In-memory rate limiting is fine for local development and single-instance servers. Never use it in production serverless environments or behind a load balancer with multiple instances. Each instance keeps its own counter, and that counter is lost whenever the instance is recycled.

Setting the right limits

The right number depends on the form. The underlying principle is consistent: set a limit that stops bots but is invisible to a real user.

Personal contact forms (portfolio site, freelancer contact, small business): 3 to 5 submissions per hour per IP. A real person contacts you once, maybe twice if they didn't hear back. The limit is invisible to legitimate users and decisive against bots.
Marketing lead capture (newsletter signup, waitlist, gated content): 10 to 20 per hour. Many users behind one network (offices, universities, coffee shops) share a single public IP address, so a tighter limit risks blocking legitimate traffic. Consider layering an email-based limit alongside the IP limit: one confirmation email per address per 24 hours is a natural additional constraint.
High-traffic launch forms (product launch waitlists, event registration): higher IP limits are appropriate, but pair them with email-based deduplication. IP rate limiting alone is insufficient when you expect legitimate traffic spikes from a single corporate network.

A useful calibration check: look at your actual submission logs before setting a limit. If real users regularly submit more than once per hour, your limit needs to accommodate that. If they don't, anything above 5 per hour is mostly noise.
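One way to keep these numbers in a single place is a small per-form configuration plus a helper that derives the limiter keys. This is a sketch under assumptions: the form names, the key format, and the `limitKeys` helper are invented for illustration, and the launch-form IP limit of 100 is a placeholder, since the right value depends on your own traffic.

```typescript
type FormKind = "contact" | "leadCapture" | "launch";

interface LimitConfig {
  perIpPerHour: number;
  perEmailPerDay?: number; // only set where an email-based limit is layered on
}

// Illustrative values taken from the ranges above; the launch value is a placeholder.
const LIMITS: Record<FormKind, LimitConfig> = {
  contact: { perIpPerHour: 5 },
  leadCapture: { perIpPerHour: 20, perEmailPerDay: 1 },
  launch: { perIpPerHour: 100, perEmailPerDay: 1 },
};

// Builds the keys a limiter would be checked against for one submission.
function limitKeys(form: FormKind, ip: string, email?: string): string[] {
  const keys = [`ip:${form}:${ip}`];
  if (email && LIMITS[form].perEmailPerDay !== undefined) {
    // Normalize so "Ada@Example.com " and "ada@example.com" share one counter.
    keys.push(`email:${form}:${email.trim().toLowerCase()}`);
  }
  return keys;
}

console.log(limitKeys("contact", "203.0.113.7", "ada@example.com"));
// only the IP key: contact forms have no email-based limit configured
console.log(limitKeys("leadCapture", "203.0.113.7", " Ada@Example.com "));
// the IP key plus "email:leadCapture:ada@example.com"
```

Each key would then be passed to whichever limiter you use, with the request rejected if any of them is over its limit.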

How to respond to a rate-limited request

The HTTP status code matters. Use 429 Too Many Requests, not 403 Forbidden and not 200 OK.

200 misleads the sender into thinking the submission succeeded. Bots checking for success responses will keep trying. Legitimate users who hit the limit will believe their message went through when it didn't.

403 signals permanent access prohibition. A rate limit is temporary: the IP can try again after the window resets. 403 carries the wrong semantics.

429 is the correct code for "you've exceeded a limit; try again after this interval." Pair it with a Retry-After header that tells the client how many seconds to wait:

return Response.json(
  { error: "Too many requests. Please try again later." },
  {
    status: 429,
    headers: {
      "Retry-After": "3600",
      "Content-Type": "application/json",
    },
  }
);
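The hard-coded "3600" is fine for illustration, but the value is more useful when it reflects the time actually remaining. A minimal helper, assuming your limiter reports `reset` as a Unix timestamp in milliseconds (as in the earlier examples), might look like this. The name `retryAfterSeconds` is made up for the sketch.

```typescript
// Converts a reset timestamp (ms since epoch) into a Retry-After value in seconds.
// Clamped to at least 1 so a reset that has just passed never yields 0 or a negative number.
function retryAfterSeconds(resetMs: number, nowMs: number = Date.now()): string {
  const seconds = Math.ceil((resetMs - nowMs) / 1000);
  return String(Math.max(1, seconds));
}

console.log(retryAfterSeconds(3_610_000, 10_000)); // "3600"
console.log(retryAfterSeconds(10_500, 10_000));    // "1"
console.log(retryAfterSeconds(5_000, 10_000));     // "1" (reset already passed)
```

The result can be dropped straight into the headers object in place of the literal string.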

On the client side, check for 429 in your fetch handler and show a specific message rather than a generic error state:

const res = await fetch("/api/contact", { method: "POST", body: formData });

if (res.status === 429) {
  setError("You've submitted recently. Please wait a bit before trying again.");
  return;
}

if (!res.ok) {
  setError("Something went wrong. Please try again.");
  return;
}

Real users who hit the limit deserve an honest explanation, not a confusing failure state.
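If the server sends Retry-After, the client can turn it into a more specific message. The helper below is a sketch: `rateLimitMessage` is a hypothetical name, the wording is only an example, and it handles both forms the header can take (a number of seconds or an HTTP date), falling back to the generic message when the value is missing or unparseable.

```typescript
function rateLimitMessage(retryAfter: string | null, nowMs: number = Date.now()): string {
  const fallback = "You've submitted recently. Please wait a bit before trying again.";
  if (!retryAfter) return fallback;

  // Retry-After is either a delay in seconds or an HTTP date.
  const value = retryAfter.trim();
  const seconds = /^\d+$/.test(value)
    ? Number(value)
    : Math.ceil((Date.parse(value) - nowMs) / 1000);

  if (!Number.isFinite(seconds) || seconds <= 0) return fallback;

  const minutes = Math.ceil(seconds / 60);
  const unit = minutes === 1 ? "minute" : "minutes";
  return `You've submitted recently. Please try again in about ${minutes} ${unit}.`;
}

console.log(rateLimitMessage("3600"));
// "You've submitted recently. Please try again in about 60 minutes."
console.log(rateLimitMessage(null));
// "You've submitted recently. Please wait a bit before trying again."
```

In the fetch handler above, the 429 branch would become `setError(rateLimitMessage(res.headers.get("Retry-After")))`.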

What rate limiting doesn't cover

Rate limiting is a volume control. It doesn't evaluate the content of what comes through.

A bot staying under your limit can still submit spam. The submissions arrive slowly enough to pass the rate check but are still junk. For content-level filtering, you need honeypot fields and server-side content scoring. "How to prevent spam in contact forms" covers those techniques in detail.

Rate limiting also doesn't validate data. An IP under the limit can still submit malformed input, injection attempts, or oversized payloads. Server-side validation runs independently of the rate limiter: both layers need to be present. The guide "The best ways to secure a form endpoint" maps out the full security stack.

The right mental model: rate limiting handles volume, spam protection handles content, and validation handles data integrity. They're complementary layers, not substitutes for each other.
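The layering can be sketched as a single decision function that checks volume first, then data integrity, then content. Everything specific here is an assumption for illustration: the `website` honeypot field, the 5000-character cap, the deliberately loose email pattern, and the `evaluate` helper itself. How to respond to a submission flagged as spam is a separate design choice that this sketch leaves open.

```typescript
interface Submission {
  email: string;
  message: string;
  website?: string; // hypothetical honeypot field, hidden from real users
}

type Verdict = "rate_limited" | "invalid" | "spam" | "accepted";

const MAX_MESSAGE_LENGTH = 5000; // assumed cap on payload size

function evaluate(submission: Submission, underRateLimit: boolean): Verdict {
  // Layer 1: volume. Checked first so abusive traffic does no further work.
  if (!underRateLimit) return "rate_limited";

  // Layer 2: data integrity. A loose shape check, not full email validation.
  const emailOk = /^[^\s@]+@[^\s@]+\.[^\s@]+$/.test(submission.email);
  const messageOk =
    submission.message.trim().length > 0 &&
    submission.message.length <= MAX_MESSAGE_LENGTH;
  if (!emailOk || !messageOk) return "invalid";

  // Layer 3: content. A filled honeypot suggests an automated submission.
  if (submission.website) return "spam";

  return "accepted";
}

const sample: Submission = { email: "ada@example.com", message: "Hello there" };
console.log(evaluate(sample, false));                           // "rate_limited"
console.log(evaluate({ ...sample, email: "not-an-email" }, true)); // "invalid"
console.log(evaluate({ ...sample, website: "http://x" }, true));   // "spam"
console.log(evaluate(sample, true));                            // "accepted"
```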

Putting it together

For most form endpoints, the practical choice is sliding window via Upstash Redis, with a limit tuned to the form's expected usage. Fixed window is a reasonable starting point if you want minimal implementation complexity. Token bucket earns its added complexity if your site has multi-form sessions where legitimate users might hit several forms in quick succession.

Whatever strategy you choose: check the rate limit before any downstream work, respond with 429 and Retry-After when the limit is hit, and handle that response on the client so real users get a clear, specific message.

If you'd rather skip the implementation entirely, Formtorch handles rate limiting at the platform level on every endpoint by default. The full feature set includes configurable limits, spam filtering, and storage without any of the Redis setup. For teams building their own backend, "Why you probably shouldn't build your own form backend" lays out the honest accounting of what that actually involves.

FAQ

What's the difference between rate limiting and spam filtering?

Rate limiting is about volume: it caps how many requests any one source can make in a time window, regardless of what those requests contain. Spam filtering is about content: it evaluates the submission itself for signs of spam (filled honeypots, suspicious content patterns, known abusive senders). You need both. A low-volume spammer who stays under your rate limit will pass the rate check and needs to be caught by content filtering.

Does rate limiting work in serverless functions?

Yes, but not with in-memory counters. Serverless instances share no memory, so a counter stored in process memory is local to one instance and is lost whenever that instance is recycled. You need shared, persistent state: Redis is the standard solution. Upstash is designed specifically for this context: it's an HTTP-based Redis that works from edge runtimes and serverless functions without requiring a persistent TCP connection.

What HTTP status code should I return for a rate-limited request?

Return 429 Too Many Requests with a Retry-After header indicating how many seconds the client should wait before retrying. Don't return 200 (the sender thinks the submission worked) and don't return 403 (implies permanent prohibition rather than a temporary limit). The Retry-After header isn't required by the spec, but it's useful: it tells clients exactly when they can try again, which is more helpful than leaving them to guess.
