Rate Limits

Rate limits protect the API from abuse and ensure fair usage across all accounts. Limits vary by plan and endpoint type.

Rate Limit Tiers

Each plan has different limits for requests per minute, daily messages, REPLR count, and knowledge documents. Upgrade your plan to increase your quotas.

Plan	Requests/min	Messages/day	REPLRs	Knowledge Docs
Free	30	50	3	5
Plus	120	500	25	50
Pro	300	2,000	100	200
Creator	600	10,000	Unlimited	500

Need higher limits? Upgrade your plan or contact us for custom enterprise quotas.

Rate Limit Headers

Every API response includes headers that tell you where you stand within the current rate limit window. Use these to proactively manage your request flow.

Header	Description
`X-RateLimit-Limit`	Maximum number of requests allowed in the current window.
`X-RateLimit-Remaining`	Number of requests remaining in the current window.
`X-RateLimit-Reset`	Unix timestamp (seconds) when the current window resets.

Example response headers

HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 120
X-RateLimit-Remaining: 117
X-RateLimit-Reset: 1741521600
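
As a minimal sketch of how a client might read these headers, the helper below turns them into numbers and computes the time left in the window. The function name `parseRateLimitHeaders` and the derived `resetInMs` field are illustrative and not part of the API. It assumes a `fetch`-style response whose `headers.get` returns `null` for a missing header.

```javascript
// Read the three rate limit headers from a fetch-style response and
// derive how long remains until the current window resets.
function parseRateLimitHeaders(res, nowMs = Date.now()) {
  const toInt = (name) => {
    const raw = res.headers.get(name);
    const value = raw === null ? NaN : Number.parseInt(raw, 10);
    return Number.isNaN(value) ? null : value; // null when missing or malformed
  };

  const limit = toInt("X-RateLimit-Limit");
  const remaining = toInt("X-RateLimit-Remaining");
  const resetAt = toInt("X-RateLimit-Reset"); // Unix timestamp in seconds

  // Milliseconds until the window resets; never negative.
  const resetInMs =
    resetAt === null ? null : Math.max(0, resetAt * 1000 - nowMs);

  return { limit, remaining, resetAt, resetInMs };
}
```

Calling `parseRateLimitHeaders(res)` on any API response returns an object you can log or feed into your own throttling logic.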

429 Too Many Requests

When you exceed the rate limit, the API responds with a 429 status code. The response body includes a retry_after field indicating how many seconds to wait before retrying. The same value is also available in the Retry-After response header.

{
  "error": "rate_limit_exceeded",
  "message": "You have exceeded the rate limit of 120 requests per minute. Please wait before retrying.",
  "retry_after": 32
}

Do not retry immediately. Always respect the retry_after value. Repeatedly hitting the rate limit may result in longer back-off periods or temporary suspension.

Best Practices

Follow these guidelines to avoid hitting rate limits and build a resilient integration.

•Implement exponential backoff. When a request fails with 429, wait for the retry_after period, then double the delay on each subsequent retry up to a maximum cap.
•Cache responses where possible. REPLR metadata, conversation lists, and configuration data change infrequently. Cache them locally to reduce redundant API calls.
•Use webhooks instead of polling. Instead of repeatedly polling for new messages, configure webhooks to receive real-time push notifications when events occur.
•Batch operations when available. Some endpoints support batch requests. Use them to combine multiple operations into a single API call.
•Monitor X-RateLimit-Remaining. Track the remaining quota from response headers and throttle your requests proactively before hitting the limit.
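
The last guideline, proactive throttling, could be sketched as follows. This is one possible approach and not an official client behavior. The function name `throttleDelayMs` and the `reserve` threshold (how many requests to keep in hand before slowing down) are assumptions made for illustration. The inputs are the numeric values of `X-RateLimit-Remaining` and `X-RateLimit-Reset`.

```javascript
// Decide how long to pause before the next request, based on the most
// recent X-RateLimit-Remaining and X-RateLimit-Reset values.
// `reserve` is a hypothetical safety margin, not an API parameter.
function throttleDelayMs({ remaining, resetAt }, nowMs = Date.now(), reserve = 5) {
  const msUntilReset = Math.max(0, resetAt * 1000 - nowMs);

  // Plenty of quota left: send immediately.
  if (remaining > reserve) return 0;

  // Quota exhausted: wait until the window resets.
  if (remaining <= 0) return msUntilReset;

  // Running low: spread the remaining requests evenly over the rest of the window.
  return Math.ceil(msUntilReset / remaining);
}
```

Sleeping for the returned number of milliseconds before each request slows the client down as it nears the limit, so it is less likely to receive a 429 in the first place.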

Retry Logic Examples

Here is a reusable pattern for handling rate limits with exponential backoff. The function retries up to 3 times, respecting the retry_after value from the response.

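async function fetchWithRetry(url, options = {}, maxRetries = 3, maxDelayMs = 60000) {
  // maxDelayMs is an illustrative default for the backoff cap; tune it for your use case
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await fetch(url, options);

    if (res.status !== 429) return res;

    // Out of retries: fail now instead of sleeping one more time
    if (attempt === maxRetries) break;

    // Parse retry delay (seconds) from the header, falling back to the body
    let delay = parseInt(res.headers.get("Retry-After"), 10);
    if (Number.isNaN(delay)) {
      // The body may not be valid JSON, so do not let parsing abort the retry
      const body = await res.json().catch(() => null);
      delay = body?.retry_after;
    }
    delay = delay > 0 ? delay : 1;

    // Exponential backoff: retry_after * 2^attempt, capped at maxDelayMs,
    // but never shorter than the retry_after the server asked for
    const capped = Math.min(delay * 2 ** attempt * 1000, maxDelayMs);
    const backoff = Math.max(delay * 1000, capped);
    console.warn(`Rate limited. Retrying in ${backoff / 1000}s...`);

    await new Promise((resolve) => setTimeout(resolve, backoff));
  }

  throw new Error("Max retries exceeded");
}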

// Usage
const res = await fetchWithRetry("https://api.replr.ai/v1/replrs", {
  headers: { Authorization: `Bearer ${API_KEY}` },
});
const data = await res.json();

Endpoint-Specific Limits

Some resource-intensive endpoints have their own rate limits, separate from the per-plan defaults shown above. These limits apply regardless of your plan unless noted otherwise.

Endpoint	Free	Pro+
Voice TTS	100/min	500/min
File uploads	10/min	10/min
Search	60/min	60/min

Endpoint-specific limits are tracked independently from the global per-minute limit. A request can fail with 429 even if you have remaining global quota.
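
Because a request counts against two independent limits, a client may want to track both locally. The sketch below is one way to do that with a client-side sliding-window counter. The class `WindowLimiter` and the function `tryAcquire` are hypothetical names, and the sliding window is a conservative approximation chosen for illustration. How the server itself counts requests is not specified here.

```javascript
// Client-side sliding-window counter: allows at most `limit` calls per `windowMs`.
class WindowLimiter {
  constructor(limit, windowMs = 60000) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.timestamps = []; // send times of recent requests, oldest first
  }

  // Drop timestamps that have aged out of the window.
  prune(nowMs) {
    while (this.timestamps.length > 0 && nowMs - this.timestamps[0] >= this.windowMs) {
      this.timestamps.shift();
    }
  }

  hasCapacity(nowMs) {
    this.prune(nowMs);
    return this.timestamps.length < this.limit;
  }

  record(nowMs) {
    this.timestamps.push(nowMs);
  }
}

// A request must fit within BOTH the global limit and its endpoint limit.
// Nothing is recorded unless both have capacity.
function tryAcquire(globalLimiter, endpointLimiter, nowMs = Date.now()) {
  if (!globalLimiter.hasCapacity(nowMs) || !endpointLimiter.hasCapacity(nowMs)) {
    return false;
  }
  globalLimiter.record(nowMs);
  endpointLimiter.record(nowMs);
  return true;
}

// Example values from the tables above: Pro plan (300/min) and file uploads (10/min).
const globalLimiter = new WindowLimiter(300);
const uploadLimiter = new WindowLimiter(10);
```

Before each upload, call `tryAcquire(globalLimiter, uploadLimiter)` and delay the request when it returns `false`. With these values the eleventh upload within a minute is held back even though most of the global quota is still unused.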
