Smart retry logic that distinguishes foreground user queries from background tasks to avoid amplifying capacity issues.
flowchart TD
A[API Call] --> B{Error?}
B -->|No| C[Return Result]
B -->|429 Rate Limit| D[Exponential Backoff]
B -->|529 Capacity| E{Foreground Query?}
E -->|Yes| D
E -->|No| F[Bail Immediately]
D --> G{Max Retries?}
G -->|No| A
G -->|Yes| H[Return Error]
The key insight is that not all API failures deserve the same treatment. A 429 is a temporary rate limit, so it is worth retrying with backoff until the retry cap is reached. A 529 signals a capacity problem, and retrying blindly amplifies that problem for everyone.
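As a rough illustration of the backoff step, here is a minimal sketch. The 500 ms base matches `BASE_DELAY_MS` in the snippet shown later on this page; the doubling schedule, the 32-second cap (`MAX_DELAY_MS`), the jitter factor, and the function names are assumptions for illustration, not details confirmed by the source.

```typescript
// Hypothetical sketch: schedule, cap, and jitter are assumptions.
const BASE_DELAY_MS = 500; // matches the constant shown in the source
const MAX_DELAY_MS = 32000; // assumed cap, not taken from the source

// Delay before retry number `attempt` (1-based): 500, 1000, 2000, ... capped.
function backoffDelayMs(attempt: number): number {
  const exponential = BASE_DELAY_MS * 2 ** (attempt - 1);
  return Math.min(exponential, MAX_DELAY_MS);
}

// Adds up to 25% extra delay so many clients do not retry in lockstep.
// `random` is injectable to keep the function deterministic in tests.
function withJitter(delayMs: number, random: () => number = Math.random): number {
  return delayMs + random() * 0.25 * delayMs;
}
```

With these assumed values the first three retries wait roughly 0.5 s, 1 s, and 2 s, and later retries level off at the cap.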
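Source-aware retry: only queries the user is actively waiting on (such as `repl_main_thread`, `sdk`, and the `agent:*` sources) retry on 529. Background queries (title generation, suggestions, summaries) bail immediately to reduce server load. The security classifier source `auto_mode` is the exception among classifiers: it stays in the retry set because auto mode needs its result to be correct.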
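The decision in the flowchart can be sketched as a small pure function. This is an illustrative sketch only: `DEFAULT_MAX_RETRIES`, `MAX_529_RETRIES`, and the three source strings come from the snippet on this page, while the `decide` function, the `Decision` type, the way the 529 cap is applied, and the background source name `title_generation` used in the usage note are hypothetical.

```typescript
// Hypothetical sketch of the flowchart decision; not the actual implementation.
type Decision = "retry" | "bail";

const DEFAULT_MAX_RETRIES = 10; // from the source
const MAX_529_RETRIES = 3; // from the source; how it is applied here is assumed

// Abbreviated subset of the foreground set shown in the source.
const FOREGROUND_SOURCES = new Set<string>([
  "repl_main_thread",
  "sdk",
  "agent:default",
]);

// `retriesSoFar` counts retries already made for this request (0 on first failure).
function decide(status: number, source: string, retriesSoFar: number): Decision {
  if (status === 429) {
    // Rate limits are retried for every source, up to the general cap.
    return retriesSoFar < DEFAULT_MAX_RETRIES ? "retry" : "bail";
  }
  if (status === 529) {
    // Capacity errors: background sources bail at once to avoid amplification.
    if (!FOREGROUND_SOURCES.has(source)) return "bail";
    return retriesSoFar < MAX_529_RETRIES ? "retry" : "bail";
  }
  // Other errors are outside the flowchart; this sketch simply bails.
  return "bail";
}
```

Under these assumptions, `decide(529, "title_generation", 0)` bails on the first failure, while `decide(529, "sdk", 0)` retries. Because unknown sources are not in the set, a new source defaults to no retry on 529, which matches the intent stated in the source comments.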
Keep-alive heartbeats (sent at 30-second intervals) prevent proxy servers from closing idle connections during long tool executions. Without them, a 60-second bash command could lose its response stream.
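One way to picture the heartbeat is a timer that fires while a long task runs and is cleared when the task settles. The sketch below assumes a hypothetical `sendPing` callback (for example, something that writes a no-op message to the open connection); the source gives the 30-second interval but does not say which side sends the heartbeat or what it contains, so treat the rest as illustration.

```typescript
// Hypothetical sketch: `sendPing` and both function names are assumptions.
const HEARTBEAT_INTERVAL_MS = 30000; // 30 s, per the text above

// Starts periodic pings and returns a function that stops them.
function startHeartbeat(
  sendPing: () => void,
  intervalMs: number = HEARTBEAT_INTERVAL_MS,
): () => void {
  const handle = setInterval(sendPing, intervalMs);
  return () => clearInterval(handle);
}

// Runs a long task with pings flowing, and always stops them afterwards,
// whether the task resolves or rejects.
async function withHeartbeat<T>(
  task: () => Promise<T>,
  sendPing: () => void,
  intervalMs: number = HEARTBEAT_INTERVAL_MS,
): Promise<T> {
  const stop = startHeartbeat(sendPing, intervalMs);
  try {
    return await task();
  } finally {
    stop();
  }
}
```

Stopping the timer in `finally` matters: a heartbeat left running after the task ends would keep the process alive and keep writing to a connection nobody is reading.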
Retry constants and foreground source set showing the source-aware retry decision.
const abortError = () => new APIUserAbortError()
const DEFAULT_MAX_RETRIES = 10
const FLOOR_OUTPUT_TOKENS = 3000
const MAX_529_RETRIES = 3
export const BASE_DELAY_MS = 500
// Foreground query sources where the user IS blocking on the result — these
// retry on 529. Everything else (summaries, titles, suggestions, classifiers)
// bails immediately: during a capacity cascade each retry is 3-10× gateway
// amplification, and the user never sees those fail anyway. New sources
// default to no-retry — add here only if the user is waiting on the result.
const FOREGROUND_529_RETRY_SOURCES = new Set<QuerySource>([
'repl_main_thread',
'repl_main_thread:outputStyle:custom',
'repl_main_thread:outputStyle:Explanatory',
'repl_main_thread:outputStyle:Learning',
'sdk',
'agent:custom',
'agent:default',
'agent:builtin',
'compact',
'hook_agent',
'hook_prompt',
'verification_agent',
'side_question',
// Security classifiers — must complete for auto-mode correctness.
// yoloClassifier.ts uses 'auto_mode' (not 'yolo_classifier' — that's
// type-only). bash_classifier is ant-only; feature-gate so the string
// tree-shakes out of external builds (excluded-strings.txt).
'auto_mode',