The default limit was raised from 100 to 200 requests per 60 seconds. A few high-volume endpoints have their own limits — see Per-endpoint limits.
How the limit works
The limit is a token bucket: you start with 200 tokens, each request spends one, and the bucket refills to full every 60 seconds. With OAuth 2.0 (Bearer tokens) the bucket is keyed per authorizing user, so if several apps use the same user’s token, they share that user’s bucket.
How the bucket is keyed
The counter key is built from your OAuth client, the authorizing user, and (on creator-scoped endpoints) the creator:
- Creator-scoped endpoints get a separate bucket per creator. Any endpoint under /creators/{creatorUserUuid}/* keys on the creator UUID as well, so each creator you act on behalf of has its own independent 200/60s budget. Your total throughput therefore scales with the number of creators you operate.
- Everything else shares one bucket per principal. Endpoints that are not creator-scoped, including the aggregate /agencies/* endpoints and your own /chats/*, /posts/*, etc., all key on just clientId:userUuid, so they draw from a single shared 200/60s bucket.
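The keying rules above can be sketched as a small function. This is an illustration only, not the server's implementation: the name `bucket_key`, the regular expression, the example paths, and the `clientId:userUuid:creatorUuid` layout for creator-scoped keys are all assumptions. The text only states that creator-scoped endpoints key on the creator UUID as well.

```python
import re

# Assumed pattern for creator-scoped paths: /creators/{creatorUserUuid}/...
_CREATOR_PATH = re.compile(r"^/creators/([^/]+)/")


def bucket_key(client_id: str, user_uuid: str, path: str) -> str:
    """Return an illustrative rate-limit bucket key for a request path."""
    match = _CREATOR_PATH.match(path)
    if match:
        # Creator-scoped: one bucket per creator (key layout is assumed).
        return f"{client_id}:{user_uuid}:{match.group(1)}"
    # Everything else shares one bucket per principal.
    return f"{client_id}:{user_uuid}"


# Hypothetical paths: two creators get separate buckets, while the
# /agencies/* and /chats/* calls fall into the same shared bucket.
keys = {
    bucket_key("app1", "u1", "/creators/c1/chats"),
    bucket_key("app1", "u1", "/creators/c2/chats"),
    bucket_key("app1", "u1", "/agencies/creators"),
    bucket_key("app1", "u1", "/chats/abc/messages"),
}
print(len(keys))  # 3 distinct buckets
```

Grouping your outgoing requests by a key like this on the client side makes it easier to see which budget a given call will draw from.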
Per-token, not per-IP
The counter is global per identity. The count is the same no matter which of your servers makes the request, so distributing traffic across multiple servers, processes, or IP addresses does not multiply your budget: they all decrement the same clientId:userUuid counter.
The API also applies separate protections against abusive traffic patterns by source IP. These are distinct from the rate limit documented here and won’t affect well-behaved clients that respect the headers below.
Headers on every response
Each response tells you where you stand:

| Header | Meaning |
|---|---|
| X-RateLimit-Limit | Requests allowed in the current window for this endpoint. |
| X-RateLimit-Remaining | Requests left in the current window. |
| X-RateLimit-Reset | Unix timestamp (seconds) when the bucket refills. |
| Retry-After | On a 429 only: seconds to wait before retrying. |
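As a rough sketch, these headers can be read into a small structure so the rest of your client can reason about the remaining budget. The names `RateLimitStatus` and `parse_rate_limit` and the sample header values are hypothetical; only the header names and their meanings come from the table above.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitStatus:
    limit: int                  # X-RateLimit-Limit
    remaining: int              # X-RateLimit-Remaining
    reset_at: int               # X-RateLimit-Reset, Unix timestamp in seconds
    retry_after: Optional[int]  # Retry-After, present on a 429 only

    def seconds_until_reset(self, now: Optional[float] = None) -> float:
        """Seconds until the bucket refills, never negative."""
        now = time.time() if now is None else now
        return max(0.0, float(self.reset_at) - now)


def parse_rate_limit(headers: Mapping[str, str]) -> RateLimitStatus:
    # HTTP header names are case-insensitive, so normalise before lookup.
    h = {k.lower(): v for k, v in headers.items()}
    retry_after = h.get("retry-after")
    return RateLimitStatus(
        limit=int(h["x-ratelimit-limit"]),
        remaining=int(h["x-ratelimit-remaining"]),
        reset_at=int(h["x-ratelimit-reset"]),
        retry_after=int(retry_after) if retry_after is not None else None,
    )


# Sample values are made up for illustration.
status = parse_rate_limit({
    "X-RateLimit-Limit": "200",
    "X-RateLimit-Remaining": "37",
    "X-RateLimit-Reset": "1700000060",
})
print(status.remaining, status.seconds_until_reset(now=1700000000))  # 37 60.0
```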
When you hit the limit
Exceed the limit and you get 429 Too Many Requests. Wait for the number of seconds given in Retry-After, then retry.
Per-endpoint limits
Most endpoints use the default 200/60s. A few have their own limits:

| Endpoint | Limit |
|---|---|
| POST /chats/statuses (batch online status) | 80 / 60s |
| GET /creators/{creatorUserUuid}/subscribers/online | 80 / 60s |
| Batch online-status (creator-scoped equivalent) | 80 / 60s |
Read X-RateLimit-Limit on the response rather than assuming a value; it reflects the actual limit for the endpoint you called.
Handle it in code
Respect Retry-After and back off, typically with a small retry wrapper around each request.
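One possible shape for such a wrapper is sketched below. It is not an official client: the name `call_with_retry`, its parameters, and the exponential fallback used when Retry-After is missing are assumptions. The wrapper takes any zero-argument callable that returns an object with `status_code` and `headers`, so it does not depend on a particular HTTP library.

```python
import time
from typing import Callable, Mapping, Protocol


class Response(Protocol):
    status_code: int
    headers: Mapping[str, str]


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    fallback_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(); on a 429, wait Retry-After seconds and try again."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        response = send()
        if response.status_code != 429:
            return response
        if attempt < max_attempts:
            # Plain dicts are case-sensitive; many HTTP libraries expose
            # case-insensitive header mappings instead.
            retry_after = response.headers.get("Retry-After")
            try:
                delay = float(retry_after)
            except (TypeError, ValueError):
                # Header missing or unparsable: assumed exponential fallback.
                delay = fallback_delay * 2 ** (attempt - 1)
            sleep(max(0.0, delay))
    return response  # still 429 after the final attempt
```

With the `requests` library, for example, this could be called as `call_with_retry(lambda: requests.get(url, headers=auth_headers))`, where `url` and `auth_headers` are whatever your client already uses. Returning the last 429 instead of raising leaves the decision about giving up to the caller.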
Capacity planning
The most common cause of throttling is re-polling chats for new messages. Polling every chat on a fixed interval scales with creators × chats × poll frequency and burns your budget on requests that almost always return nothing new. It is also unnecessary: Fanvue offers two cheaper paths.
- Stop polling: use webhooks. Register a message.received (and message.read) destination and let Fanvue push events to you in real time. Zero steady-state request cost.
- When you must pull, pull deltas, not the world. Use the bulk POST /chats/messages/batch endpoint with a sinceMessageUuid cursor to fetch only what changed across up to 50 chats in a single request.
Prefer creator-scoped endpoints where you can: each creator has its own bucket, while /agencies/* calls all share one. Reserve the shared bucket for genuinely cross-creator queries.
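To make the polling arithmetic concrete, the sketch below compares per-chat polling with batched delta pulls for a made-up workload. The function names and the workload numbers are hypothetical, and the batched estimate assumes any 50 chats can be grouped into one batch request, which this page does not confirm.

```python
import math

BUDGET_PER_MINUTE = 200  # default limit described on this page
BATCH_SIZE = 50          # max chats per POST /chats/messages/batch request


def naive_requests_per_minute(creators: int, chats_per_creator: int, polls_per_minute: int) -> int:
    """One request per chat per poll: creators x chats x poll frequency."""
    return creators * chats_per_creator * polls_per_minute


def batched_requests_per_minute(creators: int, chats_per_creator: int, polls_per_minute: int) -> int:
    """One request per group of up to 50 chats (grouping rule is assumed)."""
    total_chats = creators * chats_per_creator
    return math.ceil(total_chats / BATCH_SIZE) * polls_per_minute


# Hypothetical workload: 5 creators, 120 chats each, polled twice a minute.
naive = naive_requests_per_minute(5, 120, 2)
batched = batched_requests_per_minute(5, 120, 2)
print(naive, batched)  # 1200 24
print(naive > BUDGET_PER_MINUTE, batched > BUDGET_PER_MINUTE)  # True False
```

Under these assumptions, per-chat polling needs six times the default budget, while batched pulls fit comfortably inside it. Webhooks remove the steady-state cost altogether.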