Caching

The Muxx Gateway can cache LLM responses, returning cached results for identical requests. This reduces both latency and costs.

How It Works

When caching is enabled:

The gateway generates a cache key from the request (model, messages, parameters)
If a cached response exists and hasn’t expired, it’s returned immediately
If not, the request goes to the provider and the response is cached

Request --> Cache Hit?
            |-- Yes --> Return cached response (fast, free)
            |-- No  --> Forward to provider --> Cache response --> Return

Enabling Caching

Caching is configured per-project in the dashboard:

Go to your project Settings
Navigate to Gateway → Caching
Toggle caching on
Set your preferred TTL (time-to-live)

Cache TTL

The TTL determines how long responses are cached:

TTL	Use case
1 hour	Frequently changing data
24 hours	Stable content, good for most use cases
7 days	Static content, maximum cost savings

Cache Keys

The cache key is generated from:

Model name
Messages/prompt content
Temperature (if set)
Other generation parameters

Requests with temperature > 0 are still cached, but you may want shorter TTLs since you might want varied responses.

Cache Headers

The gateway adds headers to indicate cache status:

X-Muxx-Cache: HIT    # Response from cache
X-Muxx-Cache: MISS   # Response from provider (now cached)

Bypassing Cache

To force a fresh response, add the header:

client = OpenAI(
    base_url="https://gateway.muxx.dev/v1",
    default_headers={
        "X-Muxx-Api-Key": "muxx_sk_live_xxxxxxxxxxxx",
        "X-Muxx-Cache-Control": "no-cache"
    }
)

Cost Savings

Cached responses are free—you only pay for the original request. For applications with repeated queries, caching can significantly reduce costs. Example savings:

1000 requests, 60% cache hit rate
Only 400 requests billed to the provider
60% cost reduction

Viewing Cache Stats

In the dashboard, you can see:

Cache hit rate over time
Cost savings from caching
Most frequently cached requests

Getting Started

Dashboard

Gateway

Python SDK

TypeScript SDK

Guides

API Reference

How It Works

Enabling Caching

Cache TTL

Cache Keys

Cache Headers

Bypassing Cache

Cost Savings

Viewing Cache Stats

Getting Started

Dashboard

Gateway

Python SDK

TypeScript SDK

Guides

API Reference

​How It Works

​Enabling Caching

​Cache TTL

​Cache Keys

​Cache Headers

​Bypassing Cache

​Cost Savings

​Viewing Cache Stats

How It Works

Enabling Caching

Cache TTL

Cache Keys

Cache Headers

Bypassing Cache

Cost Savings

Viewing Cache Stats