Gateway routing

Every CrabCode model call goes through the acosmi gateway: unified accounts, unified billing, automatic fallback.

What it is

CrabCode does not call upstream model providers directly. The client sends every request to the acosmi gateway (acosmi.com or acosmi.ai), and the gateway routes it to the appropriate upstream model.

From the user's perspective there are only three things:

  1. Pick a model in /model
  2. Send your request — the gateway handles routing
  3. Your acosmi balance is debited by tokens used × the per-model coefficient
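
As a rough illustration of the billing arithmetic in step 3, the sketch below multiplies a token count by a per-model coefficient. The coefficient values and the function name `debit_for_request` are hypothetical, chosen only for illustration; the real coefficients and balance units are defined by acosmi.

```python
from decimal import Decimal

# Hypothetical coefficients for illustration only; real values are set by acosmi.
MODEL_COEFFICIENT = {
    "deepseek-v4-flash": Decimal("1.0"),
    "qwen3.6-plus": Decimal("2.5"),
}


def debit_for_request(model: str, tokens_used: int) -> Decimal:
    """Return the balance debit: tokens used times the model's coefficient."""
    return Decimal(tokens_used) * MODEL_COEFFICIENT[model]


# Same token count, different models: the debit scales with the coefficient.
print(debit_for_request("deepseek-v4-flash", 1200))
print(debit_for_request("qwen3.6-plus", 1200))
```

Under these assumed coefficients, the same 1200-token request costs 2.5 times as much on the second model as on the first.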

The specific routing policy, upstream provider mapping, and wire-protocol adaptation are internal gateway implementation details that the CrabCode client does not need to know about.

Automatic fallback

CrabCode ships with a default main model and a fallback model. When the main model hits a transient failure, the client automatically switches to the fallback for the next request:

  • Default main model: deepseek-v4-flash
  • Default fallback model: qwen3.6-plus

Common fallback triggers:

  • Main model upstream is transiently unavailable (5xx / timeout)
  • Sustained 429 with no recovery in the budget window
  • Main model is not enabled on your account / has been deprecated

Fallback is request-granular — the switch only takes effect on the next request, never mid-response. CrabCode prints a system message in the TUI telling you which model it switched to.
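
The request-granular behavior can be sketched as follows. This is a minimal illustration, not CrabCode's actual implementation: `ModelSelector`, `TransientModelError`, `send`, and the `call_model` callback are hypothetical names, and the sketch assumes the failed request surfaces its error rather than being replayed on the fallback.

```python
from dataclasses import dataclass
from typing import Callable, Optional

MAIN_MODEL = "deepseek-v4-flash"
FALLBACK_MODEL = "qwen3.6-plus"


class TransientModelError(Exception):
    """Hypothetical error standing in for any fallback trigger."""


@dataclass
class ModelSelector:
    active: str = MAIN_MODEL

    def record_failure(self) -> Optional[str]:
        """Switch to the fallback for the NEXT request; return a notice for the UI."""
        if self.active == MAIN_MODEL:
            self.active = FALLBACK_MODEL
            return f"Switched model: {MAIN_MODEL} -> {FALLBACK_MODEL}"
        return None  # already on the fallback, nothing to switch


def send(selector: ModelSelector, call_model: Callable[[str, str], str], prompt: str) -> str:
    model = selector.active  # chosen once, fixed for the whole request
    try:
        return call_model(model, prompt)
    except TransientModelError:
        notice = selector.record_failure()
        if notice:
            print(notice)  # stand-in for the TUI system message
        raise  # assumption: this request fails; only later requests use the fallback
```

The model is read once at the top of `send`, so a switch made while handling a failure can only affect later calls, never a response already in progress.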

Error codes — user-facing behavior

| Error | Meaning | CrabCode behavior |
|---|---|---|
| HTTP 402 [overloaded_error] insufficient quota | The subscription bucket for the current model is exhausted | Does not retry. Top up or switch models |
| HTTP 429 short backoff | Rate-limited, transient congestion | Short backoff retry, up to a small built-in budget |
| HTTP 429 long backoff (Retry-After ≥ 60s) | Rate-limited, quota-style | Does not retry; fail fast. Consider switching models |
| Upstream 5xx / network blip | Transient failure | Auto-retries a few times; then falls back |
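
The rows above can be read as a small decision function. The sketch below is one way to express that mapping; the `Action` enum and `classify` function are hypothetical names, and the only threshold taken from the table is the 60-second Retry-After cutoff. Treating a 429 with no Retry-After header as the short-backoff case is an assumption.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    FAIL_FAST = "fail_fast"
    RETRY_SHORT_BACKOFF = "retry_short_backoff"
    RETRY_THEN_FALLBACK = "retry_then_fallback"


# From the table: Retry-After >= 60s is treated as quota-style rate limiting.
LONG_BACKOFF_SECONDS = 60


def classify(status: int, retry_after: Optional[float] = None) -> Action:
    """Map an HTTP status (and optional Retry-After seconds) to a client action."""
    if status == 402:
        return Action.FAIL_FAST  # quota exhausted: top up or switch models
    if status == 429:
        if retry_after is not None and retry_after >= LONG_BACKOFF_SECONDS:
            return Action.FAIL_FAST  # quota-style limit: do not retry
        # Assumption: a missing Retry-After is handled as transient congestion.
        return Action.RETRY_SHORT_BACKOFF
    if 500 <= status <= 599:
        return Action.RETRY_THEN_FALLBACK  # retry a few times, then fall back
    raise ValueError(f"status {status} is not covered by the table")
```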

Concurrency budget

CrabCode applies a built-in concurrency budget on the client side (in particular for sub-agent fan-out), to avoid self-induced congestion against the gateway. This is a contract between client and gateway — you should not and do not need to raise it manually.
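
To illustrate what a client-side concurrency budget means in practice, the sketch below caps the number of in-flight calls with a semaphore. It is a generic pattern, not CrabCode's internal code: `MAX_IN_FLIGHT`, `run_with_budget`, and the simulated `sub_agent_call` are hypothetical, and the limit of 4 is an arbitrary example value.

```python
import asyncio

MAX_IN_FLIGHT = 4  # hypothetical value; the real budget is built into CrabCode


async def run_with_budget(factories, limit=MAX_IN_FLIGHT):
    """Run coroutine factories with at most `limit` of them in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(factory):
        async with semaphore:  # wait here while the budget is fully used
            return await factory()

    return await asyncio.gather(*(guarded(f) for f in factories))


async def demo():
    in_flight = 0
    peak = 0

    async def sub_agent_call():
        # Simulated gateway call that tracks how many calls overlap.
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)
        in_flight -= 1
        return "ok"

    results = await run_with_budget([sub_agent_call] * 10)
    return peak, results


peak, results = asyncio.run(demo())
print(f"peak concurrency: {peak}, completed: {len(results)}")
```

Ten fan-out calls are submitted together, but the semaphore keeps the observed peak at or below the budget, which is the self-throttling behavior the paragraph above describes.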

Custom models

CrabCode supports configuring custom models that bypass the acosmi gateway and call third-party chat-completion endpoints directly. This requires a CrabCode Pro / Max / Team / Enterprise subscription. Custom models do not use the routing described in this doc — pick the endpoint protocol when prompted during the /login → "Custom model configuration" flow.

Troubleshooting

  • Not sure which model is active → /model
  • Not sure why a turn was slow or errored → check the TUI system messages and /cost
  • Suspect a specific model is unavailable → /model <slug> to switch and try another

See also