A comprehensive HTTP request throttling middleware with flexible configuration from JSON/YAML files and both rate limiting and in-flight request limiting capabilities.
- Rate Limiting: Controls the frequency of requests over time using leaky bucket (with burst support) or sliding window algorithms
- In-Flight Limiting: Controls the number of concurrent requests being processed
- Flexible Key Extraction: Global, per-client IP, per-header value, or custom identity-based throttling
- Nginx-style Route Matching: Location-based route selection with exact matches and patterns
- Request Backlogging: Queue requests when limits are reached with configurable timeouts
- Dry-Run Mode: Test throttling configurations without enforcement
- Comprehensive Metrics: Prometheus metrics for monitoring and alerting
- Tag-Based Rules: Apply different throttling rules to different middleware instances
- Key Filtering: Include/exclude specific keys with glob pattern support
- Auto Retry-After: Automatic calculation of retry intervals for rate limits
Please see the testable example to understand how to configure and use the middleware.
The throttling configuration is usually stored in a JSON or YAML file. The configuration consists of the following parts:
- rateLimitZones. Each zone has a rate limit, burst limit, and other parameters.
- inFlightLimitZones. Each zone has an in-flight limit, backlog limit, and other parameters.
- rules. Each rule contains a list of routes and the rate/in-flight limiting zones that should be applied to these routes.
Global throttling assesses all traffic coming into an API from all sources and ensures that the overall rate/concurrency limit is not exceeded. Overwhelming an endpoint with traffic is an easy and efficient way to carry out a denial-of-service attack. By using a global rate/concurrency limit, you can ensure that all incoming requests are within a specific limit.
rateLimitZones:
  rl_total:
    rateLimit: 5000/s
    burstLimit: 10000
    responseStatusCode: 503
    responseRetryAfter: 5s
inFlightLimitZones:
  ifl_total:
    inFlightLimit: 5000
    backlogLimit: 10000
    backlogTimeout: 30s
    responseStatusCode: 503
    responseRetryAfter: 1m
rules:
  - routes:
      - path: "/"
    rateLimits:
      - zone: rl_total
    inFlightLimits:
      - zone: ifl_total
With this configuration, all HTTP requests are limited by rate: no more than 5000 requests per second (rateLimit). Excessive requests within the burst limit (10000 here, burstLimit) will be served immediately regardless of the specified rate; requests above the burst limit will be rejected with a 503 error (responseStatusCode) and a "Retry-After: 5" HTTP header (responseRetryAfter).
Additionally, there is a concurrency limit. If 5000 requests (inFlightLimit) are being processed right now, new incoming requests will be backlogged (suspended).
If there are more than 10000 such backlogged requests (backlogLimit), the rest will be rejected immediately. A request can stay in the backlog for no more than 30 seconds (backlogTimeout); after that, it will be rejected. The response for a rejected request contains the 503 HTTP error code (responseStatusCode) and a "Retry-After: 60" HTTP header (responseRetryAfter).
backlogLimit and backlogTimeout can be specified for rate-limiting zones too.
Per-client throttling is focused on controlling traffic from individual sources and making sure that API clients stay within their prescribed limits. It helps avoid cases where one client exhausts the resources of the entire backend service (for example, by using all connections from the DB pool) while all other clients have to wait for their release.
To implement per-client throttling, the package uses the concept of "identity". If the client is identified by a unique key, the package can throttle requests per this key.
The MiddlewareOpts struct has a GetKeyIdentity callback that should return the key for the current request. It may be a user ID, a JWT "sub" claim, or any other unique identifier.
If any rate/in-flight limiting zone's key.type field is set to identity, the GetKeyIdentity callback must be implemented.
Example of per-client throttling configuration:
rateLimitZones:
  rl_identity:
    rateLimit: 50/s
    burstLimit: 100
    responseStatusCode: 429
    responseRetryAfter: auto
    key:
      type: identity
    maxKeys: 50000
inFlightLimitZones:
  ifl_identity:
    inFlightLimit: 64
    backlogLimit: 128
    backlogTimeout: 30s
    responseStatusCode: 429
    key:
      type: identity
    maxKeys: 50000
  ifl_identity_expensive_op:
    inFlightLimit: 4
    backlogLimit: 8
    backlogTimeout: 30s
    responseStatusCode: 429
    key:
      type: identity
    maxKeys: 50000
    excludedKeys:
      - "150853ab-322c-455d-9793-8d71bf6973d9" # Exclude root admin.
rules:
  - routes:
      - path: "/"
    rateLimits:
      - zone: rl_identity
    inFlightLimits:
      - zone: ifl_identity
    alias: per_identity
  - routes:
      - path: "= /api/v1/do_expensive_op_1"
        methods: POST
      - path: "= /api/v1/do_expensive_op_2"
        methods: POST
    inFlightLimits:
      - zone: ifl_identity_expensive_op
    alias: per_identity_expensive_ops
All throttling counters are stored in an in-memory LRU cache (maxKeys determines its size).
For the rate-limiting zone, responseRetryAfter may be specified as "auto". In this case, the time when a client may retry the request will be calculated automatically.
Each throttling rule may contain an unlimited number of rate/in-flight limiting zones. All of a rule's zones are applied to all of its routes. A route is described as a path plus a list of HTTP methods. To select a route, exactly the same algorithm is used as for selecting a location in Nginx (http://nginx.org/en/docs/http/ngx_http_core_module.html#location). A route may also have an alias that is used in the Prometheus metrics label (see the example below).
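The selection behavior can be illustrated with a minimal sketch. This is a deliberately simplified model of Nginx's rules, not the package's actual matcher: an entry prefixed with "= " matches exactly and wins outright; otherwise the longest matching prefix path is chosen.

```go
package main

import (
	"fmt"
	"strings"
)

// selectRoute is a simplified sketch of Nginx-style location selection:
// "= /path" entries match exactly and take priority; plain entries are
// treated as prefixes, and the longest matching prefix wins.
func selectRoute(routes []string, reqPath string) string {
	best := ""
	for _, route := range routes {
		if exact, found := strings.CutPrefix(route, "= "); found {
			if reqPath == exact {
				return route // an exact match always wins
			}
			continue
		}
		if strings.HasPrefix(reqPath, route) && len(route) > len(best) {
			best = route
		}
	}
	return best
}

func main() {
	routes := []string{"/", "/api/v1/", "= /api/v1/do_expensive_op_1"}
	fmt.Println(selectRoute(routes, "/api/v1/do_expensive_op_1")) // = /api/v1/do_expensive_op_1
	fmt.Println(selectRoute(routes, "/api/v1/items"))             // /api/v1/
	fmt.Println(selectRoute(routes, "/healthz"))                  // /
}
```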
rateLimitZones:
  rl_identity:
    alg: sliding_window
    rateLimit: 15/m
    responseStatusCode: 429
    responseRetryAfter: auto
    key:
      type: identity
    maxKeys: 50000
rules:
  - routes:
      - path: "/"
    rateLimits:
      - zone: rl_identity
    alias: per_identity
In this example, the sliding window algorithm is used for rate limiting (the alg parameter has the "leaky_bucket" value by default). It means only 15 requests are allowed per minute. They may even be sent simultaneously, but all exceeding requests received within the same minute will be rejected.
Example of throttling by an HTTP header value (here, known "bad" User-Agent values are rate-limited separately):
rateLimitZones:
  rl_bad_user_agents:
    rateLimit: 500/s
    burstLimit: 1000
    responseStatusCode: 503
    responseRetryAfter: 15s
    key:
      type: header
      headerName: "User-Agent"
      noBypassEmpty: true
    includedKeys:
      - ""
      - "Go-http-client/1.1"
      - "python-requests/*"
      - "Python-urllib/*"
    maxKeys: 1000
rules:
  - routes:
      - path: "/"
    rateLimits:
      - zone: rl_bad_user_agents
Example of throttling by client IP address (remote_addr key type):
rateLimitZones:
  rl_by_remote_addr:
    rateLimit: 100/s
    burstLimit: 1000
    responseStatusCode: 503
    responseRetryAfter: auto
    key:
      type: remote_addr
    maxKeys: 10000
rules:
  - routes:
      - path: "/"
    rateLimits:
      - zone: rl_by_remote_addr
The package collects several metrics in the Prometheus format:
- rate_limit_rejects_total. Type: counter. Labels: dry_run, rule.
- in_flight_limit_rejects_total. Type: counter. Labels: dry_run, rule, backlogged.
Tags are useful when different rules of the same configuration should be used by different middlewares. For example, suppose you want to have two different throttling rules:
- A rule for all requests.
- A rule for all identity-aware (authorized) requests.
Tags can be specified at two levels:
Tags can be specified at the rule level. This approach is useful when you want different middlewares to process completely different sets of rules:
# ...
rules:
  - routes:
      - path: "/hello"
        methods: GET
    rateLimits:
      - zone: rl_zone1
    tags: all_reqs
  - routes:
      - path: "/feedback"
        methods: POST
    inFlightLimits:
      - zone: ifl_zone1
    tags: all_reqs
  - routes:
      - path: /api/1/users
        methods: PUT
    rateLimits:
      - zone: rl_zone2
    tags: require_auth_reqs
# ...
In your code, you will have two middlewares that are executed at different steps of the HTTP request serving process. Each middleware should apply only its own throttling rules.
allMw := MiddlewareWithOpts(cfg, "my-app-domain", throttleMetrics, MiddlewareOpts{Tags: []string{"all_reqs"}})
requireAuthMw := MiddlewareWithOpts(cfg, "my-app-domain", throttleMetrics, MiddlewareOpts{Tags: []string{"require_auth_reqs"}})
You can specify tags per zone within a rule, which allows fine-grained control over which zones are applied by different middlewares. This approach avoids route duplication when the same routes need different zones for different middlewares:
# ...
rules:
  - routes:
      - path: "/"
    excludedRoutes:
      - path: "/healthz"
      - path: "/metrics"
    rateLimits:
      - zone: rl_total
        tags: all_reqs
      - zone: rl_identity
        tags: authn_reqs
    inFlightLimits:
      - zone: ifl_total
        tags: all_reqs
      - zone: ifl_identity
        tags: authn_reqs
# ...
Different middlewares can selectively apply zones based on their tags:
allMw := MiddlewareWithOpts(cfg, "my-app-domain", throttleMetrics, MiddlewareOpts{Tags: []string{"all_reqs"}})
authnMw := MiddlewareWithOpts(cfg, "my-app-domain", throttleMetrics, MiddlewareOpts{Tags: []string{"authn_reqs"}})
When both rule-level and zone-level tags are specified, rule-level tags take precedence:
- If the middleware's tags match the rule-level tags, all zones in that rule are applied (regardless of zone-level tags).
- If the middleware's tags don't match the rule-level tags, then zone-level tags are checked for each zone individually.
- If neither rule-level nor zone-level tags match, the rule is skipped entirely.
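The precedence rules above can be sketched as a small pure function. This is an illustration of the described behavior, not the package's actual code:

```go
package main

import "fmt"

// zoneApplies sketches the tag-precedence rules: middleware tags are matched
// against rule-level tags first; if they match, every zone in the rule
// applies regardless of zone-level tags. Otherwise the zone's own tags decide.
func zoneApplies(mwTags, ruleTags, zoneTags []string) bool {
	if intersects(mwTags, ruleTags) {
		return true // rule-level match: all zones in the rule apply
	}
	return intersects(mwTags, zoneTags) // fall back to zone-level tags
}

func intersects(a, b []string) bool {
	for _, x := range a {
		for _, y := range b {
			if x == y {
				return true
			}
		}
	}
	return false
}

func main() {
	mw := []string{"authn_reqs"}
	// Rule-level tags match: zone-level tags are ignored.
	fmt.Println(zoneApplies(mw, []string{"authn_reqs"}, []string{"all_reqs"})) // true
	// No rule-level match: zone-level tags decide.
	fmt.Println(zoneApplies(mw, nil, []string{"authn_reqs"})) // true
	// Neither level matches: the zone is skipped.
	fmt.Println(zoneApplies(mw, nil, []string{"all_reqs"})) // false
}
```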
Before configuring real-life throttling, it's usually a good idea to try the dry-run mode first. It doesn't affect the request processing flow; however, all excessive requests are still counted and logged. Dry-run mode allows you to better understand how your API is used and to determine the right throttling parameters.
The dry-run mode can be enabled using the dryRun configuration parameter. Example:
rateLimitZones:
  rl_identity:
    rateLimit: 50/s
    burstLimit: 100
    responseStatusCode: 429
    responseRetryAfter: auto
    key:
      type: identity
    maxKeys: 50000
    dryRun: true
inFlightLimitZones:
  ifl_identity:
    inFlightLimit: 64
    backlogLimit: 128
    backlogTimeout: 30s
    responseStatusCode: 429
    key:
      type: identity
    maxKeys: 50000
    dryRun: true
rules:
  - routes:
      - path: "/"
    rateLimits:
      - zone: rl_identity
    inFlightLimits:
      - zone: ifl_identity
    alias: per_identity
If the specified limits are exceeded, the corresponding messages will be logged.
For rate-limiting:
{"msg": "too many requests, serving will be continued because of dry run mode", "rate_limit_key": "ee9a0dd8-7396-5478-8b83-ab7402d6746b"}
For in-flight limiting:
{"msg": "too many in-flight requests, serving will be continued because of dry run mode", "in_flight_limit_key": "3c00e780-5721-59f8-acad-f0bf719777d4"}