Anthropic's Hidden Token Cap by 5-Hour Window

Each data point is one 5-hour Anthropic window, which exposes how quota caps vary throughout the day. The implied cap is tokens burned ÷ utilization. A sudden jump suggests that Anthropic changed your limit.

Generated: 2026-04-26 12:45 UTC
Current Implied Cap
5-Hour Window: 170.03M (34.01M burned, 20.0% utilized)
7-Day (All Models): 2675.38M (1016.64M burned, 38.0% utilized)
7-Day (Sonnet): 684.68M (246.49M burned, 36.0% utilized)
What is the implied cap? Anthropic publishes utilization percentages but not the underlying token limits. Dividing tokens burned by the utilization fraction back-calculates the effective quota cap for each rate-limit window. A sudden jump in the chart below suggests that Anthropic changed your limit.
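As a minimal sketch of that arithmetic, the snippet below recomputes the implied cap from the burn and utilization figures shown in this snapshot. The function name `implied_cap` and the `snapshot` dictionary are illustrative names, not part of any Anthropic API.

```python
def implied_cap(tokens_burned: float, utilization_pct: float) -> float:
    """Back-calculate the quota cap from tokens burned and reported utilization (%)."""
    if utilization_pct <= 0:
        # With 0% utilization there is nothing to divide by, so no cap can be inferred.
        raise ValueError("utilization must be positive to infer a cap")
    return tokens_burned / (utilization_pct / 100.0)


# Figures taken from the snapshot on this page, in millions of tokens.
snapshot = {
    "5-Hour Window": (34.01, 20.0),
    "7-Day (All Models)": (1016.64, 38.0),
    "7-Day (Sonnet)": (246.49, 36.0),
}

caps = {name: implied_cap(burned, pct) for name, (burned, pct) in snapshot.items()}
for name, cap in caps.items():
    print(f"{name}: ~{cap:.2f}M")
```

The results land within a few hundredths of a million tokens of the caps displayed above. The small gap presumably comes from the displayed burn and utilization figures being rounded.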
Reading this chart: Multiple data points per calendar day mean multiple 5-hour windows. A drop between the 10:00 UTC window and the 15:00 UTC window suggests that Anthropic lowered the cap mid-day. Night windows (20:00 UTC onward) often show higher caps than business-hour windows (14:00–20:00 UTC). The 5-hour series (orange) is plotted on the right axis; the 7-day series are plotted on the left.
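One way to check the business-hours versus night pattern is to group windows by the UTC hour at which they start and compare the median implied cap per band. The sketch below assumes each window is labeled by its start time, and the sample values are made up for illustration; they are not readings from this dashboard. The names `band` and `median_cap_by_band` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import median


def band(window_start: datetime) -> str:
    """Classify a timezone-aware window start into the bands named above."""
    hour = window_start.astimezone(timezone.utc).hour
    if 14 <= hour < 20:
        return "business"
    if hour >= 20:
        return "night"
    return "other"


def median_cap_by_band(windows):
    """windows: iterable of (start_datetime, implied_cap) pairs."""
    grouped = defaultdict(list)
    for start, cap in windows:
        grouped[band(start)].append(cap)
    return {name: median(caps) for name, caps in grouped.items()}


# Made-up sample, in millions of tokens (illustration only).
windows = [
    (datetime(2026, 4, 25, 10, tzinfo=timezone.utc), 168.0),
    (datetime(2026, 4, 25, 15, tzinfo=timezone.utc), 150.0),
    (datetime(2026, 4, 25, 20, tzinfo=timezone.utc), 182.0),
    (datetime(2026, 4, 26, 15, tzinfo=timezone.utc), 154.0),
    (datetime(2026, 4, 26, 21, tzinfo=timezone.utc), 178.0),
]
result = median_cap_by_band(windows)
print(result)
```

The median is used rather than the mean so that a single outlier window does not dominate a band.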
Implied Token Cap — 5h Windows (30 Days)
Each data point is one 5-hour Anthropic window. Reveals intraday variation: business hours vs. evening vs. night caps.
Token Burn per 5h Window (30 Days)
Tokens burned within each 5-hour Anthropic window. A tall bar = heavy burn; when Anthropic adjusts the cap, burn patterns shift visibly.

Utilization
5-Hour Window 20.0%
34.01M burned  /  ~170.03M cap as of 2026-04-26 12:23 UTC
7-Day (All Models) 38.0%
1016.64M burned  /  ~2675.38M cap as of 2026-04-26 12:23 UTC
7-Day (Sonnet) 36.0%
246.49M burned  /  ~684.68M cap as of 2026-04-26 12:23 UTC
Utilization % — 5h Windows (30 Days)
How much of the quota is being consumed per 5-hour window.
Window              Utilized  Burned    Implied Cap  As Of
5-Hour Window       20.0%     34.01M    170.03M      2026-04-26 12:23 UTC
7-Day (All Models)  38.0%     1016.64M  2675.38M     2026-04-26 12:23 UTC
7-Day (Sonnet)      36.0%     246.49M   684.68M      2026-04-26 12:23 UTC

Anthropic doesn't publish rate limits directly. Every API response includes two values: tokens consumed in the current window and what percentage of the limit that represents. Dividing the tokens consumed by that percentage, expressed as a fraction, gives the implied cap, a real-time estimate of the actual limit.
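Because the estimate divides by a reported percentage, any rounding in that percentage carries into the implied cap. The sketch below bounds the cap under the assumption that utilization is rounded to the nearest `pct_step` percentage points. The actual reporting precision is not stated on this page, so the 0.1 default is only a guess based on the one decimal place displayed, and `cap_bounds` is a hypothetical helper.

```python
import math


def cap_bounds(tokens_burned: float, utilization_pct: float, pct_step: float = 0.1):
    """Return (lower, upper) bounds on the implied cap.

    Assumes the true utilization lies within half a rounding step of the
    reported value; pct_step is an assumption, not a documented precision.
    """
    half = pct_step / 2.0
    hi_pct = utilization_pct + half
    lo_pct = utilization_pct - half
    lower = tokens_burned / (hi_pct / 100.0)
    # If the low end of the utilization range reaches zero, the cap is unbounded above.
    upper = tokens_burned / (lo_pct / 100.0) if lo_pct > 0 else math.inf
    return lower, upper


def relative_width(bounds) -> float:
    """Width of the interval relative to its lower bound."""
    lower, upper = bounds
    return (upper - lower) / lower


# Snapshot 5-hour figures from this page, in millions of tokens.
bounds_at_20 = cap_bounds(34.01, 20.0)
# A hypothetical early-window reading at 2% utilization, for comparison.
bounds_at_2 = cap_bounds(3.40, 2.0)
print(bounds_at_20, relative_width(bounds_at_20))
print(bounds_at_2, relative_width(bounds_at_2))
```

Under this assumption the interval is much wider at low utilization, so readings taken early in a window are less reliable evidence of a cap change than readings taken later.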

A step-change in the chart means the underlying limit changed. A sudden jump indicates an increase; a drop indicates a reduction or a change in model mix. Three windows are tracked independently: a 5-hour rolling window, a 7-day window for all models, and a 7-day window for Sonnet specifically. Each resets on its own schedule.
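A step-change can also be flagged programmatically by comparing each window's implied cap with the previous one. The sketch below uses a simple relative-change threshold; the 10% value, the function name `find_step_changes`, and the sample series are all assumptions chosen for illustration, not the method this dashboard uses.

```python
def find_step_changes(caps, threshold=0.10):
    """Return (index, previous, current, relative_change) for each large move.

    caps is a chronological list of implied caps for one window type.
    threshold is the minimum absolute relative change to report (assumed 10%).
    """
    changes = []
    for i in range(1, len(caps)):
        prev, cur = caps[i - 1], caps[i]
        if prev <= 0:
            continue  # cannot compute a relative change from a non-positive cap
        rel = (cur - prev) / prev
        if abs(rel) >= threshold:
            changes.append((i, prev, cur, rel))
    return changes


# Made-up series in millions of tokens (illustration only).
series = [170.0, 171.2, 169.5, 204.0, 203.1, 150.0]
for index, prev, cur, rel in find_step_changes(series):
    direction = "increase" if rel > 0 else "decrease"
    print(f"window {index}: {prev:.1f}M -> {cur:.1f}M ({rel:+.1%}, {direction})")
```

A threshold well above the window-to-window noise avoids flagging small wobbles that come from rounding rather than from a real limit change.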

vmfarms operates on an Anthropic 20× Max plan — the caps shown here reflect our actual quota, which is substantially higher than the standard API tier. Your own limits will differ based on your plan.
