ClawPane

Model Fallback: How to Keep AI Agents Running When Providers Go Down

Every major LLM provider has had outages in the past year. OpenAI, Anthropic, Google — none are immune. If your AI agents depend on a single provider, an outage means downtime. Model fallback chains solve this.

What Is Model Fallback?

Model fallback is the practice of automatically trying alternative models when your primary model fails. Instead of returning an error to the user, the system routes the request to the next best available model.

Request → Primary Model (down) → Fallback #1 → Fallback #2 → Response

The user never knows there was a problem. The request completes, just with a different model behind it.

Why You Need Fallback Chains

Provider Outages Are Frequent

In 2025, major providers experienced:

  • OpenAI: 12+ documented outages affecting API availability
  • Anthropic: 8+ incidents with degraded performance
  • Google AI: 6+ Vertex AI service disruptions

That's more than two dozen incidents — roughly one every two weeks across the ecosystem.

Rate Limits Hit Without Warning

Even without outages, rate limits can block requests during traffic spikes. If your agents are rate-limited on OpenAI, a fallback to Anthropic or Google keeps things moving.

Regional Failures

Provider issues often affect specific regions. A fallback to a provider with different regional infrastructure adds geographic resilience.

Designing Fallback Chains

Good fallback chains consider three factors:

1. Quality Parity

Your fallback model should produce comparable quality. Falling back from GPT-5 to GPT-5-nano might not be acceptable for complex tasks. Falling back to Claude Sonnet 4.5 or Gemini 2.5 Pro might be fine.

2. Provider Diversity

Fallbacks should span multiple providers. If OpenAI is down, falling back to another OpenAI model doesn't help. You need cross-provider fallbacks:

GPT-5 → Claude Sonnet 4.5 → Gemini 2.5 Pro → Llama 4 Maverick

3. Speed of Detection

The faster you detect a failure, the less time users wait. Key signals:

  • HTTP 5xx errors → immediate fallback
  • Rate limit (429) errors → immediate fallback
  • Timeout exceeded → fallback after threshold
  • Quality degradation → fallback based on scoring
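The signals above can be sketched as a simple classifier. This is an illustrative example, not a real SDK's error shape: `shouldFallback`, the `{ status, timedOut, qualityScore }` fields, and the `QUALITY_FLOOR` threshold are all assumed names.

```javascript
// Assumed scoring threshold below which output counts as degraded.
const QUALITY_FLOOR = 0.5;

// Map a failure signal to a fallback decision.
function shouldFallback(err) {
  if (err.status >= 500) return "immediate";         // provider-side failure
  if (err.status === 429) return "immediate";        // rate limited
  if (err.timedOut) return "after-threshold";        // slow or hung request
  if (err.qualityScore !== undefined &&
      err.qualityScore < QUALITY_FLOOR) return "scored"; // degraded output
  return "none"; // client errors (4xx) won't succeed elsewhere either
}
```

Note the last line: a malformed request fails identically on every provider, so retrying it against a fallback model only adds latency.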

Manual vs. Automatic Fallback

Manual Fallback (Application Level)

try {
  // Primary: try OpenAI first
  response = await openai.chat(request);
} catch (error) {
  // Fallback: try Anthropic; if this also fails,
  // the error propagates to the caller
  response = await anthropic.chat(request);
}

Problems:

  • Every service needs its own fallback logic
  • You maintain multiple provider SDKs
  • Fallback models are hardcoded
  • No coordination across services

Automatic Fallback (Router Level)

The router handles fallback transparently. Your application sends one request; the router tries models in order until one succeeds.

Benefits:

  • Zero application code changes
  • Centralized fallback configuration
  • Dynamic model ordering based on current availability
  • Coordinated across all services
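The core loop a router runs looks roughly like this. It is a minimal sketch, not ClawPane's actual implementation: `chain`, `route`, and the `callModel` adapter are illustrative names.

```javascript
// Cross-provider chain, ordered by preference. Lives in the router,
// not in application code.
const chain = [
  { provider: "openai",    model: "gpt-5" },
  { provider: "anthropic", model: "claude-sonnet-4.5" },
  { provider: "google",    model: "gemini-2.5-pro" },
];

// Try each model in order until one succeeds; `callModel` is a
// hypothetical adapter that dispatches to the right provider SDK.
async function route(request, callModel) {
  const errors = [];
  for (const entry of chain) {
    try {
      const output = await callModel(entry, request);
      // Attach metadata so callers can see which model actually answered.
      return { output, servedBy: entry, fellBack: errors.length > 0 };
    } catch (err) {
      errors.push({ entry, err }); // record the failure, try the next model
    }
  }
  throw new Error(`All ${chain.length} models failed`);
}
```

Because the chain is data rather than code, reordering it or swapping a provider requires no change to the applications calling `route`.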

Fallback with ClawPane

ClawPane's fallback chains are built into the routing layer. When you create a router:

  1. The router scores all available models for each request
  2. If the top-scored model fails, it automatically tries the next best
  3. Fallback decisions happen in milliseconds
  4. Every response includes metadata about which model was selected (even if it was a fallback)

Fallbacks are enabled by default when you create a router — no additional configuration needed. Your OpenClaw agents stay operational even when individual providers fail.

Set up resilient routing →