
Costs

Check session usage, inspect remaining quota, and keep token spend in check.

What it is

CrabCode bills by token usage. Every model call is metered through the acosmi gateway — there is no direct billing relationship with upstream providers:

  • Subscribers — debit against your entitlement bucket; each model has its own remaining quota
  • Prepaid-balance users — debit against your balance at the active model's per-token price
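The two debit paths above can be sketched roughly as follows. This is an illustration only, not the gateway's actual logic: the `Account` shape, the `debit` function, the assumption that entitlement buckets are counted in tokens, and the "balance insufficient" message are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Account:
    # Hypothetical account shape; the real gateway model is not documented here.
    subscriber: bool
    balance_usd: float = 0.0
    buckets: dict = field(default_factory=dict)  # model name -> remaining tokens


def debit(account: Account, model: str, tokens: int, price_per_token: float) -> None:
    """Debit one model call from whichever source applies to this account."""
    if account.subscriber:
        # Subscribers: draw down the per-model entitlement bucket.
        remaining = account.buckets.get(model, 0)
        if tokens > remaining:
            raise RuntimeError(f"quota insufficient for {model}")
        account.buckets[model] = remaining - tokens
    else:
        # Prepaid users: charge the balance at the active model's per-token price.
        cost = tokens * price_per_token
        if cost > account.balance_usd:
            raise RuntimeError("balance insufficient")  # assumed message
        account.balance_usd -= cost
```

Note that a subscriber running out of quota on one model still has the other models' buckets available, which is why switching models is a valid response to a quota error.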

Each session accumulates token counts and an estimated USD spend locally; when that estimate crosses a built-in threshold, a "cost threshold" dialog pops up once.

When you'll see this doc

  • The "Learn more" link at the bottom of the cost-threshold dialog (fires once per session)
  • The "Read more" link from /cost

Current session spend

/cost

Subscribers see the current quota status (allowed / running low / exhausted).

Other users see a session breakdown:

  • Input / output / cache-read / cache-write token counts, broken down per model
  • Cumulative estimated USD (priced at each model's current rate)
  • Total API duration and wall-clock duration
  • Total lines added / removed

The numbers come from a client-side accumulator and may drift slightly from the real bill on acosmi.com — the dashboard is authoritative.
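A minimal sketch of how a client-side accumulator of this kind might produce the per-model breakdown and the estimated USD figure. The class name, model names, and prices are hypothetical placeholders, not CrabCode's real internals or rates.

```python
from collections import defaultdict

KINDS = ("input", "output", "cache_read", "cache_write")

# Hypothetical USD prices per million tokens, for illustration only.
RATES = {
    "model-a": {"input": 3.0, "output": 15.0, "cache_read": 0.3, "cache_write": 3.75},
    "model-b": {"input": 0.5, "output": 2.0, "cache_read": 0.05, "cache_write": 0.6},
}


class SessionCosts:
    def __init__(self) -> None:
        # model name -> token counts per kind, all starting at zero
        self.by_model = defaultdict(lambda: dict.fromkeys(KINDS, 0))

    def record(self, model: str, **counts: int) -> None:
        """Add the token counts reported for one model call."""
        for kind, n in counts.items():
            self.by_model[model][kind] += n  # raises KeyError on an unknown kind

    def estimated_usd(self) -> float:
        """Price every accumulated count at its model's current rate."""
        return sum(
            n * RATES[model][kind] / 1_000_000
            for model, usage in self.by_model.items()
            for kind, n in usage.items()
        )


session = SessionCosts()
session.record("model-a", input=1_000_000, output=100_000)
session.record("model-b", input=200_000, cache_read=1_000_000)
print(f"estimated spend: ${session.estimated_usd():.2f}")
```

Because the estimate is repriced from local counts, a rate change or a metering difference at the gateway would explain the drift from the billed amount.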

Per-model remaining quota

In the model picker (/model), each model shows its remaining %. This is the gateway-aggregated ratio of remaining quota to total quota for that model's bucket. If you see "quota insufficient" for a model, switch to another model, or top up or upgrade your plan at acosmi.com.
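As a rough illustration of the remaining % figure, assuming it is simply remaining quota divided by total quota for the bucket. The function name and the round-down behaviour are assumptions, not documented gateway behaviour.

```python
def remaining_percent(remaining: int, quota: int) -> int:
    """Whole-number percentage of a model's bucket that is still available."""
    if quota <= 0:
        return 0  # no entitlement for this model
    # Round down so the display does not overstate what is left; clamp to 0..100.
    return max(0, min(100, 100 * remaining // quota))
```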

Cost-threshold reminder

CrabCode pops the cost dialog once when the estimated spend crosses a built-in threshold, nudging you to review your spend pace. The threshold is fixed and fires at most once per session; it isn't configurable in settings.json.
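The once-per-session behaviour can be sketched like this. The class name is hypothetical, and the threshold is passed in as a parameter only because the real built-in value is not stated in this doc.

```python
class CostThresholdReminder:
    """Fires at most once per session when the estimate crosses the threshold."""

    def __init__(self, threshold_usd: float) -> None:
        self.threshold_usd = threshold_usd  # placeholder; the real value is built in
        self.shown = False

    def should_show(self, estimated_usd: float) -> bool:
        if self.shown or estimated_usd < self.threshold_usd:
            return False
        self.shown = True  # never fire again in this session
        return True
```

Checking after each model call with the running estimate yields `True` exactly once, on the first call at or above the threshold.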

Acknowledge the dialog to keep going. For longer-term savings, use the tactics below.

Saving money

Tactic | Why
/clear to drop unrelated context | Long context means more tokens on every request
Use /model to switch to a smaller / cheaper model for routine work | Lower per-token price
Split tasks: exploratory Q&A on a small model, key edits on a larger one | Save expensive tokens for the critical path
Lean on prompt caching (gateway-enabled by default) | Highly repetitive prompts save on read tokens
Use subagents for bulk reading work | Keeps the main transcript free of tool-output noise
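To see why dropping context helps, here is a back-of-the-envelope calculation. It rests on simplifying assumptions that are not from this doc: every request resends the whole transcript, every turn adds the same number of tokens, and prompt caching is ignored.

```python
def cumulative_input_tokens(turns: int, tokens_per_turn: int) -> int:
    """Total input tokens sent when request k carries k turns of context."""
    return sum(k * tokens_per_turn for k in range(1, turns + 1))


# 20 turns in one long conversation, 2,000 tokens added per turn.
without_clear = cumulative_input_tokens(20, 2_000)

# Same 20 turns, but with /clear after turn 10 (two fresh 10-turn runs).
with_clear = 2 * cumulative_input_tokens(10, 2_000)

print(without_clear, with_clear)  # 420000 220000
```

Under these assumptions input tokens grow quadratically with conversation length, so one /clear halfway through cuts the input total nearly in half.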

Limits and caveats

  • The local cost estimate is a client-side back-calculation at each model's rate and may differ from the acosmi.com bill. The gateway's metering is what bills, and the dashboard is authoritative
  • MCP / WebFetch tokens count toward the session total
  • Subagent spend rolls up to the parent session
  • Single billing entry: the China region debits the acosmi.com balance, Global debits acosmi.ai (see providers/routing)
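The subagent roll-up can be pictured as a simple tree sum. This is a sketch only: the `Session` type is hypothetical, and whether subagents can themselves spawn subagents is an assumption made to keep the example general.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    own_tokens: int = 0
    subagents: list = field(default_factory=list)  # child Session objects

    def total_tokens(self) -> int:
        """Own usage plus everything spent by subagents, recursively."""
        return self.own_tokens + sum(s.total_tokens() for s in self.subagents)


parent = Session(1_000, [Session(300), Session(200, [Session(50)])])
print(parent.total_tokens())  # 1550
```

Subagents keep bulk output out of the main transcript, but their tokens still appear in the parent session's total.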