Rate limiting helps you control costs and prevent abuse by limiting the number of requests that can be made in a given time period.

Configuring Rate Limits

Set rate limits in your project settings:
  1. Go to your project Settings
  2. Navigate to Gateway → Rate Limiting
  3. Configure your limits

Limit Types

Requests Per Minute (RPM)

Limit the total number of requests per minute:
RPM: 100  →  Max 100 requests per minute

Tokens Per Minute (TPM)

Limit the total tokens (input + output) per minute:
TPM: 100000  →  Max 100,000 tokens per minute

Daily Request Limit

Limit total requests per day:
Daily: 10000  →  Max 10,000 requests per day
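
The three limits above are enforced by the gateway, but a client can also track its own usage so that it rarely hits them. The following is a minimal sketch, not the gateway's implementation: `WindowBudget` is a hypothetical helper, and it assumes a rolling window, whereas this page does not say whether the gateway counts in rolling or fixed windows (a daily limit may well reset at a fixed time).

```python
import time
from collections import deque


class WindowBudget:
    """Track usage against a limit over a rolling window (client-side sketch)."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock          # injectable so the class can be tested
        self.events = deque()       # (timestamp, cost) pairs inside the window
        self.used = 0

    def _expire(self, now):
        # Drop events that have aged out of the window.
        while self.events and now - self.events[0][0] >= self.window:
            _, cost = self.events.popleft()
            self.used -= cost

    def remaining(self):
        """Return how much of the limit is still unused right now."""
        self._expire(self.clock())
        return self.limit - self.used

    def record(self, cost=1):
        """Record one request (cost=1) or a number of tokens (cost=tokens)."""
        self.events.append((self.clock(), cost))
        self.used += cost


# Budgets mirroring the example values on this page.
rpm = WindowBudget(limit=100, window_seconds=60)
tpm = WindowBudget(limit=100_000, window_seconds=60)
daily = WindowBudget(limit=10_000, window_seconds=86_400)
```

After each response, a caller would record one request against `rpm` and `daily`, and the response's total token count against `tpm`, then pause when any `remaining()` value reaches zero.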

Rate Limit Headers

When rate limiting is enabled, responses include headers:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 87
X-RateLimit-Reset: 1699900000
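
As an illustration of how these headers might be read, the sketch below turns them into a small summary. The function name is hypothetical, and it assumes that `X-RateLimit-Reset` is a Unix timestamp in seconds, which the example value suggests but this page does not state.

```python
import time


def parse_rate_limit_headers(headers, now=None):
    """Summarize X-RateLimit-* headers from a mapping of header names to strings."""
    now = time.time() if now is None else now
    # Header names are case-insensitive, so normalize the keys first.
    lowered = {key.lower(): value for key, value in headers.items()}
    limit = int(lowered["x-ratelimit-limit"])
    remaining = int(lowered["x-ratelimit-remaining"])
    # Assumption: the reset value is a Unix timestamp in seconds.
    reset_at = int(lowered["x-ratelimit-reset"])
    return {
        "limit": limit,
        "remaining": remaining,
        "used": limit - remaining,
        "seconds_until_reset": max(0.0, float(reset_at - now)),
    }
```

With the OpenAI Python SDK, response headers are available through `client.chat.completions.with_raw_response.create(...)`, whose result exposes `.headers` alongside `.parse()` for the usual completion object.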

Handling Rate Limits

When a rate limit is exceeded, the gateway returns a 429 Too Many Requests response:
{
  "error": {
    "message": "Rate limit exceeded. Please retry after 45 seconds.",
    "type": "rate_limit_error",
    "retry_after": 45
  }
}
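
For clients that call the gateway over plain HTTP rather than through an SDK, the wait time can be read from the body shown above. This is a defensive sketch: the function name and the 60-second fallback are assumptions, and the only field it relies on is `error.retry_after` from the example response.

```python
import json


def retry_after_from_body(raw_body, default=60.0):
    """Extract error.retry_after (in seconds) from a 429 response body."""
    try:
        payload = json.loads(raw_body)
    except (TypeError, ValueError):
        # Body was missing or not valid JSON.
        return default
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        return default
    value = error.get("retry_after")
    # Fall back when the field is missing, non-numeric, or negative.
    if isinstance(value, bool) or not isinstance(value, (int, float)) or value < 0:
        return default
    return float(value)
```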

Implementing Retries

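import time
from openai import OpenAI, RateLimitError

# The SDK reads the provider key from the OPENAI_API_KEY environment variable
# unless api_key is passed explicitly.
client = OpenAI(
    base_url="https://gateway.muxx.dev/v1",
    default_headers={"X-Muxx-Api-Key": "muxx_sk_live_xxxxxxxxxxxx"}
)

def get_retry_after(error, default=60):
    """Read retry_after (seconds) from the gateway's 429 error body."""
    # RateLimitError has no retry_after attribute; the parsed JSON is on error.body.
    body = error.body if isinstance(error.body, dict) else {}
    # The body may be the full payload or just the inner "error" object.
    details = body.get("error", body)
    if isinstance(details, dict) and isinstance(details.get("retry_after"), (int, float)):
        return details["retry_after"]
    return default

def call_with_retry(messages, max_retries=3):
    for attempt in range(max_retries):
        try:
            return client.chat.completions.create(
                model="gpt-4o",
                messages=messages
            )
        except RateLimitError as e:
            if attempt < max_retries - 1:
                # Wait as long as the gateway asks, or 60 seconds if it gives no hint.
                time.sleep(get_retry_after(e))
            else:
                raise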
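
When a 429 arrives without a usable wait time, a fixed 60-second sleep can be longer than necessary. One common alternative is exponential backoff with jitter; the sketch below is a generic technique rather than anything specific to the gateway, and `backoff_delay` and its parameters are hypothetical names.

```python
import random


def backoff_delay(attempt, retry_after=None, base=1.0, cap=60.0, rng=random.random):
    """Return seconds to sleep before retrying; `attempt` is 0 for the first retry."""
    if retry_after is not None:
        # Prefer the server's hint when one is available.
        return float(retry_after)
    # Full jitter: pick uniformly between 0 and the capped exponential ceiling.
    ceiling = min(cap, base * (2 ** attempt))
    return rng() * ceiling
```

In the retry loop above, the sleep could become `time.sleep(backoff_delay(attempt, retry_after))`, where `retry_after` is whatever wait time was recovered from the error, or `None`. Note that the OpenAI Python SDK also retries some failed requests on its own, controlled by the client's `max_retries` option; passing `max_retries=0` to `OpenAI(...)` leaves a hand-written loop as the only retry layer.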

Per-User Rate Limits

You can apply rate limits per user by including user metadata:
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.muxx.dev/v1",
    default_headers={
        "X-Muxx-Api-Key": "muxx_sk_live_xxxxxxxxxxxx",
        # Identifies the end user whose per-user limits apply to these requests.
        "X-Muxx-User-Id": "user_123"
    }
)
Then configure per-user limits in the dashboard.
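
The client above sends the same user ID on every request. In a server that handles many users with one shared client, it may be more convenient to set the header per request. The helper below is a hypothetical sketch that only builds the header; it assumes the gateway treats `X-Muxx-User-Id` the same way whether it is sent as a default header or per request.

```python
def user_headers(user_id):
    """Build the per-user header for a single request."""
    user_id = str(user_id).strip()
    if not user_id:
        # An empty ID would make every such request share one bucket, or none.
        raise ValueError("user_id must not be empty")
    return {"X-Muxx-User-Id": user_id}
```

The OpenAI Python SDK accepts per-request headers through the `extra_headers` argument, for example `client.chat.completions.create(model="gpt-4o", messages=messages, extra_headers=user_headers("user_123"))`.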

Rate Limit Alerts

Set up alerts to notify you when usage approaches a limit:
  1. Go to Settings → Alerts
  2. Add a rate limit alert
  3. Choose threshold (e.g., 80% of limit)
  4. Select notification channel (email, Slack, webhook)
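
Dashboard alerts are evaluated by the gateway. As a complement, an application can apply the same kind of threshold to the limit and remaining values it sees in the rate limit headers. This is a minimal sketch with a hypothetical function name, using the 80% figure from the example above as its default.

```python
def crossed_threshold(limit, remaining, threshold=0.8):
    """Return True when the used share of the limit is at or above `threshold`."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    used = limit - remaining
    return used / limit >= threshold
```

An application might log a warning or slow down its request rate when this returns True, rather than waiting for a 429.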