
Rate Limiting & Exponential Backoff

Smart retry logic that distinguishes foreground user queries from background tasks to avoid amplifying capacity issues.

Architecture Diagram
flowchart TD
    A[API Call] --> B{Error?}
    B -->|No| C[Return Result]
    B -->|429 Rate Limit| D[Exponential Backoff]
    B -->|529 Capacity| E{Foreground Query?}
    E -->|Yes| D
    E -->|No| F[Bail Immediately]
    D --> G{Max Retries?}
    G -->|No| A
    G -->|Yes| H[Return Error]


Deep Dive

The key insight is that not all API failures deserve the same treatment. A 429 is a temporary rate limit, so it is worth retrying with backoff until the retry budget runs out. A 529 is a capacity issue, and retrying it blindly amplifies the problem for everyone.

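🔑 Key Insight

Source-aware retry: only queries the user is actively waiting on (for example `repl_main_thread`, `sdk`, and the `agent:` variants) retry on 529. Background queries (title generation, suggestions, summaries) bail immediately to reduce server load. Security classifiers such as `auto_mode` are the exception among classifiers: they stay in the retry set because auto-mode correctness depends on them completing.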
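The decision described above can be sketched as a small pure function. This is a minimal illustration, not the project's actual code: `decideRetry`, `Decision`, and the abbreviated `FOREGROUND_SOURCES` set are hypothetical names, and only the values of `DEFAULT_MAX_RETRIES` and `MAX_529_RETRIES` come from the source excerpt. How those budgets are applied, and the choice to bail on every other status, are assumptions made for brevity.

```typescript
// Hypothetical sketch of a source-aware retry decision.
type QuerySource = string
type Decision = 'retry' | 'bail'

const DEFAULT_MAX_RETRIES = 10 // value from the source excerpt
const MAX_529_RETRIES = 3 // value from the source excerpt

// Abbreviated for illustration; the real foreground set is longer.
const FOREGROUND_SOURCES = new Set<QuerySource>([
  'repl_main_thread',
  'sdk',
  'agent:default',
])

// `attempt` is the number of retries already made (0 on the first failure).
function decideRetry(status: number, source: QuerySource, attempt: number): Decision {
  if (status === 429) {
    // Rate limit: retry regardless of source until the general budget is spent.
    return attempt < DEFAULT_MAX_RETRIES ? 'retry' : 'bail'
  }
  if (status === 529) {
    // Capacity: background sources bail at once to avoid amplifying load.
    if (!FOREGROUND_SOURCES.has(source)) return 'bail'
    return attempt < MAX_529_RETRIES ? 'retry' : 'bail'
  }
  // Assumption: every other status is treated as non-retryable in this sketch.
  return 'bail'
}
```

Keeping the decision separate from the sleep-and-retry loop makes the policy easy to unit test, and an unknown source falls through to `'bail'`, which matches the "new sources default to no-retry" rule in the source comments.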

💡 Tip

Keep-alive heartbeats (sent at 30-second intervals) prevent proxy servers from closing idle connections during long tool executions. Without them, a 60-second bash command could lose its response stream.
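A heartbeat of this kind can be sketched with a plain interval timer. The helper below is a hypothetical illustration: `startHeartbeat` and `sendPing` are assumed names, and what a "ping" actually writes to the connection depends on the transport, which the source does not describe.

```typescript
// Hypothetical sketch: emit a keep-alive ping on a fixed interval until stopped.
const HEARTBEAT_INTERVAL_MS = 30000 // 30s interval, as stated in the tip above

// Starts the heartbeat and returns a function that stops it.
function startHeartbeat(
  sendPing: () => void,
  intervalMs: number = HEARTBEAT_INTERVAL_MS,
): () => void {
  const timer = setInterval(sendPing, intervalMs)
  return () => clearInterval(timer)
}

// Typical usage around a long-running tool call (runTool is hypothetical):
//   const stop = startHeartbeat(() => stream.write(':keep-alive\n\n'))
//   try { await runTool() } finally { stop() }
```

Stopping the timer in a `finally` block matters: a heartbeat that outlives its tool call would keep the connection, and the process, alive indefinitely.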

KEY TAKEAWAYS
  • Not all errors deserve retries — error type AND query source matter
  • Retrying capacity errors amplifies the problem; bail for background queries
  • Exponential backoff prevents thundering herd on recovery
  • Keep-alive heartbeats are essential for long-running tool calls
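To make the backoff takeaway concrete, here is a minimal sketch of a delay schedule. Only `BASE_DELAY_MS = 500` comes from the source; the function name `backoffDelayMs`, the 32-second cap, and the 25% jitter are assumptions chosen for illustration. Jitter spreads clients out so they do not all retry at the same instant when the service recovers.

```typescript
// Hypothetical sketch of exponential backoff with a cap and jitter.
const BASE_DELAY_MS = 500 // value from the source excerpt
const MAX_DELAY_MS = 32000 // assumed cap, not from the source
const JITTER_FRACTION = 0.25 // assumed: add up to 25% random extra delay

// `attempt` is 0 for the first retry. `random` is injectable so the
// schedule can be tested deterministically.
function backoffDelayMs(attempt: number, random: () => number = Math.random): number {
  // Double the delay on each attempt, but never exceed the cap.
  const exponential = Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS)
  // Add a random fraction on top so concurrent clients desynchronize.
  const jitter = exponential * JITTER_FRACTION * random()
  return Math.round(exponential + jitter)
}
```

With jitter disabled, the first four delays under these assumptions are 500, 1000, 2000, and 4000 ms.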

Source Code

Retry constants and foreground source set showing the source-aware retry decision.

const abortError = () => new APIUserAbortError()

const DEFAULT_MAX_RETRIES = 10
const FLOOR_OUTPUT_TOKENS = 3000
const MAX_529_RETRIES = 3
export const BASE_DELAY_MS = 500

// Foreground query sources where the user IS blocking on the result — these
// retry on 529. Everything else (summaries, titles, suggestions, classifiers)
// bails immediately: during a capacity cascade each retry is 3-10× gateway
// amplification, and the user never sees those fail anyway. New sources
// default to no-retry — add here only if the user is waiting on the result.
const FOREGROUND_529_RETRY_SOURCES = new Set<QuerySource>([
  'repl_main_thread',
  'repl_main_thread:outputStyle:custom',
  'repl_main_thread:outputStyle:Explanatory',
  'repl_main_thread:outputStyle:Learning',
  'sdk',
  'agent:custom',
  'agent:default',
  'agent:builtin',
  'compact',
  'hook_agent',
  'hook_prompt',
  'verification_agent',
  'side_question',
  // Security classifiers — must complete for auto-mode correctness.
  // yoloClassifier.ts uses 'auto_mode' (not 'yolo_classifier' — that's
  // type-only). bash_classifier is ant-only; feature-gate so the string
  // tree-shakes out of external builds (excluded-strings.txt).
  'auto_mode',