Rate Limits
Understand API rate limits, headers, and error handling.
All API endpoints are rate-limited to ensure fair usage across all users. The general limit is the same for every tier; execution limits vary by subscription tier.
Rate Limit Headers
Every API response includes rate limit headers:
| Header | Description |
|---|---|
| X-RateLimit-Minute-Limit | Maximum requests per minute |
| X-RateLimit-Minute-Remaining | Requests remaining in current minute |
| X-RateLimit-Minute-Reset | Unix timestamp when the minute window resets |
Execution endpoints (running apps/workflows) also include hourly limits:
| Header | Description |
|---|---|
| X-RateLimit-Hour-Limit | Maximum executions per hour |
| X-RateLimit-Hour-Remaining | Executions remaining in current hour |
| X-RateLimit-Hour-Reset | Unix timestamp when the hour window resets |
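As a rough illustration of reading these headers on the client, the sketch below collects one window's three headers into a small record. The names `Window`, `parse_window`, and `seconds_until_reset` are hypothetical helpers, not part of the API, and the sketch assumes the header values are plain integers (with the reset value in whole seconds).

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Window:
    limit: int
    remaining: int
    reset: int  # Unix timestamp; assumed to be whole seconds


def parse_window(headers: Mapping[str, str], name: str) -> Optional[Window]:
    """Read the X-RateLimit-<name>-* headers; return None if any is missing or malformed."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    prefix = f"x-ratelimit-{name.lower()}-"
    try:
        return Window(
            limit=int(lowered[prefix + "limit"]),
            remaining=int(lowered[prefix + "remaining"]),
            reset=int(lowered[prefix + "reset"]),
        )
    except (KeyError, ValueError):
        return None


def seconds_until_reset(window: Window, now: float) -> float:
    """Seconds left until the window resets, never negative."""
    return max(0.0, window.reset - now)
```

Calling `parse_window(headers, "Minute")` works for any response; `parse_window(headers, "Hour")` returns `None` on endpoints that do not send the hourly headers.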
General Rate Limit
All endpoints share a general rate limit of 120 requests per minute per user, regardless of subscription tier.
Execution Rate Limits
Run endpoints (POST /api/v1/apps/:id and POST /api/v1/runs) have additional per-tier limits:
| Tier | Per Minute | Per Hour |
|---|---|---|
| Starter | 10 | 100 |
| Plus | 20 | 200 |
| Max | 30 | 300 |
| Ultra | 90 | 900 |
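One consequence of the table is that the hourly limit is ten times the per-minute limit on every tier, so running at the full per-minute rate exhausts the hourly allowance after ten minutes. The arithmetic can be sketched as follows; `EXECUTION_LIMITS` and the two helper functions are illustrative names, and the even-spacing figure is simply 3600 divided by the hourly limit, not a rule the API itself states.

```python
# (per_minute, per_hour) execution limits copied from the table above.
EXECUTION_LIMITS = {
    "Starter": (10, 100),
    "Plus": (20, 200),
    "Max": (30, 300),
    "Ultra": (90, 900),
}


def sustained_spacing_seconds(tier: str) -> float:
    """Gap between executions that spreads the hourly allowance evenly over the hour."""
    _, per_hour = EXECUTION_LIMITS[tier]
    return 3600 / per_hour


def burst_minutes(tier: str) -> float:
    """Minutes of running at the full per-minute rate before the hourly limit is used up."""
    per_minute, per_hour = EXECUTION_LIMITS[tier]
    return per_hour / per_minute


for tier in EXECUTION_LIMITS:
    print(tier, sustained_spacing_seconds(tier), burst_minutes(tier))
```

For a long-running batch job, pacing executions by `sustained_spacing_seconds` (36 seconds on Starter, 4 seconds on Ultra) is one way to avoid hitting the hourly limit partway through.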
Handling Rate Limits
When rate-limited, the API returns 429 Too Many Requests with a Retry-After header indicating how many seconds to wait:
{
  "error": {
    "code": "RATE_LIMITED",
    "message": "Rate limit exceeded",
    "retry_after": 30
  }
}
Best practice: Check the X-RateLimit-Minute-Remaining header on each response before sending the next request. When it reaches zero, wait until the X-RateLimit-Minute-Reset timestamp.
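The retry behaviour described above could be sketched roughly as follows. This is a minimal, transport-agnostic example: `send` is a hypothetical callable standing in for whatever HTTP client you use, returning `(status, headers, parsed_json_body)`, and `retry_delay` and `call_with_retry` are illustrative names. Falling back to the body's `retry_after` field when the header is absent is an assumption, as is the one-second default.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], dict]  # (status, headers, parsed JSON body)


def retry_delay(headers: Mapping[str, str], body: dict, default: float = 1.0) -> float:
    """Seconds to wait after a 429: Retry-After header first, then the body's retry_after."""
    lowered = {key.lower(): value for key, value in headers.items()}
    error = body.get("error") or {}
    for raw in (lowered.get("retry-after"), error.get("retry_after")):
        if raw is None:
            continue
        try:
            return max(0.0, float(raw))
        except (TypeError, ValueError):
            continue  # malformed value: try the next source
    return default


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(); on 429, wait the advertised delay and try again, up to max_attempts."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts - 1:
            break
        sleep(retry_delay(headers, body))
    return status, headers, body
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays. The final 429 is returned to the caller rather than raised, so the caller decides how to surface it.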
GPU Quota
In addition to rate limits, execution endpoints enforce GPU usage quotas. When the quota is exceeded, the API returns 429 Too Many Requests with the error code GPU_QUOTA_EXCEEDED.
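Because both conditions return 429, a client may want to tell them apart by error code before deciding what to do. The sketch below assumes the GPU quota response uses the same `error.code` envelope as the rate limit example, which the text above implies but does not show; `classify_429` is a hypothetical helper name.

```python
def classify_429(body: dict) -> str:
    """Map a 429 response body to 'rate_limit', 'gpu_quota', or 'unknown'."""
    # Assumption: GPU quota errors share the {"error": {"code": ...}} envelope.
    error = body.get("error") or {}
    code = error.get("code")
    if code == "RATE_LIMITED":
        return "rate_limit"
    if code == "GPU_QUOTA_EXCEEDED":
        return "gpu_quota"
    return "unknown"
```

A short wait and retry suits `rate_limit`. This page does not say when a GPU quota resets, so the sketch only separates the cases and leaves the handling of `gpu_quota` to the caller.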