How Costs Are Calculated
Costs are calculated based on:
- Model - Each model has different pricing
- Input tokens - Tokens in your prompt/messages
- Output tokens - Tokens in the response
- Provider pricing - Current rates from each provider
Pricing Data
Muxx maintains up-to-date pricing for all supported models:
| Model | Input (per 1M tokens) | Output (per 1M tokens) |
|---|---|---|
| gpt-4o | $2.50 | $10.00 |
| gpt-4o-mini | $0.15 | $0.60 |
| claude-3-5-sonnet | $3.00 | $15.00 |
| claude-3-5-haiku | $0.80 | $4.00 |
| gemini-1.5-pro | $1.25 | $5.00 |
| gemini-1.5-flash | $0.075 | $0.30 |
Pricing is updated regularly to reflect provider changes.
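As a rough illustration of how the rates above turn into a per-request cost, here is a minimal sketch. It assumes cost is linear in token count (no cached-token or batch discounts) and that the rates copied from the table are still current. The `PRICING` dictionary and `estimate_cost` function are hypothetical names for this example, not part of Muxx.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the table above.
# These rates may be out of date; check the current table before relying on them.
PRICING = {
    "gpt-4o": (Decimal("2.50"), Decimal("10.00")),
    "gpt-4o-mini": (Decimal("0.15"), Decimal("0.60")),
    "claude-3-5-sonnet": (Decimal("3.00"), Decimal("15.00")),
    "claude-3-5-haiku": (Decimal("0.80"), Decimal("4.00")),
    "gemini-1.5-pro": (Decimal("1.25"), Decimal("5.00")),
    "gemini-1.5-flash": (Decimal("0.075"), Decimal("0.30")),
}

TOKENS_PER_PRICING_UNIT = Decimal(1_000_000)


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper)."""
    input_rate, output_rate = PRICING[model]
    input_cost = input_rate * input_tokens / TOKENS_PER_PRICING_UNIT
    output_cost = output_rate * output_tokens / TOKENS_PER_PRICING_UNIT
    return input_cost + output_cost


# 1,200 input tokens and 300 output tokens on gpt-4o:
# 1200 * 2.50 / 1e6 + 300 * 10.00 / 1e6 = 0.003 + 0.003 = 0.006 USD
print(estimate_cost("gpt-4o", 1200, 300))
```

`Decimal` is used instead of `float` so that small per-request amounts add up without rounding drift.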
Viewing Costs
Per-Request
Every request in the logs shows its cost:
- Input tokens and cost
- Output tokens and cost
- Total cost
Dashboard Analytics
The dashboard provides cost analytics:
- Daily/weekly/monthly spend - Track spending over time
- Cost by model - See which models cost the most
- Cost by user - Available if your requests include user metadata
- Cost by feature - Available if your requests include feature metadata
Spend Alerts
Set up alerts to avoid surprise bills:
- Go to Settings → Alerts
- Click Add Alert
- Choose Spend Alert
- Set threshold (e.g., $100/day)
- Select notification channel
Spend Caps
Automatically stop requests when spending exceeds a limit:
- Go to Settings → Spend Caps
- Set a daily or monthly cap
- Choose action: Block or Alert Only
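To make the difference between the two actions concrete, here is a small sketch of how a cap check might behave. It is an illustration only, not Muxx's implementation: `CapAction`, `SpendCap`, and `check_request` are hypothetical names, and the assumption that Block also sends a notification is a guess.

```python
from dataclasses import dataclass
from decimal import Decimal
from enum import Enum


class CapAction(Enum):
    BLOCK = "block"
    ALERT_ONLY = "alert_only"


@dataclass
class SpendCap:
    limit_usd: Decimal  # daily or monthly limit
    action: CapAction


def check_request(cap: SpendCap, spent_usd: Decimal) -> tuple[bool, bool]:
    """Return (allow_request, send_alert) given spend so far in the period."""
    if spent_usd <= cap.limit_usd:
        # Still within the cap: allow the request, no alert.
        return True, False
    # Spend exceeds the cap: Block refuses the request, Alert Only lets it through.
    # Assumption: an alert is raised in both cases.
    return cap.action is CapAction.ALERT_ONLY, True


daily_cap = SpendCap(limit_usd=Decimal("100"), action=CapAction.BLOCK)
print(check_request(daily_cap, Decimal("42.10")))   # under the cap
print(check_request(daily_cap, Decimal("100.01")))  # over the cap
```

With Alert Only, requests keep flowing after the limit is crossed, so it suits monitoring; Block is the option that actually bounds the bill.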
Cost Optimization Tips
Use smaller models when possible
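At the rates listed above, gpt-4o-mini is roughly 17x cheaper than gpt-4o ($0.15 vs. $2.50 per 1M input tokens, $0.60 vs. $10.00 per 1M output tokens) and is sufficient for many tasks.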
Enable caching
Cached responses are free. Enable caching for repeated queries.
Optimize prompts
Shorter prompts = fewer input tokens = lower costs.
Set max_tokens
Limit output length to control output token costs.
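Because output cost scales with the number of generated tokens, a `max_tokens` limit also puts a ceiling on what a single response can cost. The sketch below computes that ceiling, assuming linear per-token pricing; `max_output_cost` is a hypothetical helper, and the rate is the gpt-4o output price from the pricing table.

```python
from decimal import Decimal


def max_output_cost(output_rate_per_million: Decimal, max_tokens: int) -> Decimal:
    """Upper bound on output cost (USD) for one request capped at max_tokens."""
    return output_rate_per_million * max_tokens / Decimal(1_000_000)


# gpt-4o output rate from the pricing table: $10.00 per 1M tokens.
# With max_tokens=500: 500 * 10.00 / 1e6 = 0.005 USD at most.
bound = max_output_cost(Decimal("10.00"), 500)
print(f"Output cost per request is at most ${bound}")
```

This bounds only the output side; input cost still depends on the length of the prompt.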
Exporting Cost Data
Export cost data for accounting:
- Go to Analytics → Costs
- Select date range
- Click Export CSV
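Once exported, the CSV can be summarized with a few lines of Python. The sketch below totals cost per model. The column names `model` and `total_cost` are assumptions about the export format, and the sample rows are made up for illustration, so adjust both to match the actual file.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal


def cost_by_model(csv_file) -> dict[str, Decimal]:
    """Sum the cost column per model from an open CSV file object."""
    totals: dict[str, Decimal] = defaultdict(Decimal)
    for row in csv.DictReader(csv_file):
        # Assumed column names; rename to match the real export header.
        totals[row["model"]] += Decimal(row["total_cost"])
    return dict(totals)


# Made-up sample standing in for an exported file.
SAMPLE_EXPORT = """model,total_cost
gpt-4o,0.006
gpt-4o-mini,0.00036
gpt-4o,0.0125
"""

totals = cost_by_model(io.StringIO(SAMPLE_EXPORT))
# Print models from most to least expensive.
for model, total in sorted(totals.items(), key=lambda item: item[1], reverse=True):
    print(f"{model}: ${total}")
```

For a real export, replace the `io.StringIO(...)` argument with `open("costs.csv", newline="")`, where `costs.csv` is whatever filename the download was saved under.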