
Cut OpenClaw Output Costs by 30%: Enable Concise Mode

Your agent writes three paragraphs when one sentence would do. Every extra word costs output tokens, and output tokens run roughly 5× the price of input tokens on most models.

The Problem: AI Loves to Be Verbose

By default, large language models are trained to be helpful and thorough. Unfortunately, "thorough" often means:

  • A 200-word explanation when a 20-word answer would suffice
  • Repeating the question back to you before answering
  • Adding disclaimers, caveats, and "let me know if you need anything else" to every response
  • Formatting simple answers as multi-section documents

For an agent running autonomously, this verbosity is pure waste. Nobody is reading a beautifully formatted 500-token reply to "what's today's date?"

How ClawBridge Detects This (Diagnostic A09)

The Cost Control Center analyzes your recent conversations and calculates:

  • Output-to-input ratio: How many tokens of output does your agent produce per input token? Industry baseline is ~0.5–1.0. If your ratio exceeds 1.5, the agent is likely verbose.
  • Average response length: How many tokens per response? Compared against similar use cases.
  • Potential savings: If response length were reduced by 30%, what would you save?
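The diagnostic above can be sketched in a few lines of Python. This is a simplified illustration, not ClawBridge's actual implementation: the function name, the `(input_tokens, output_tokens)` data shape, and the default price are all assumptions for the example.

```python
def verbosity_diagnostic(conversations, reduction=0.30, price_per_mtok=15.0):
    """Flag likely verbosity from recent conversation token counts.

    `conversations` is a list of (input_tokens, output_tokens) pairs --
    a simplified stand-in for real conversation records.
    """
    total_in = sum(i for i, _ in conversations)
    total_out = sum(o for _, o in conversations)
    ratio = total_out / total_in
    avg_response = total_out / len(conversations)
    # Savings if output shrank by `reduction` (e.g. 30%)
    savings_usd = total_out * reduction / 1_000_000 * price_per_mtok
    return {
        "ratio": round(ratio, 2),
        "avg_response_tokens": round(avg_response),
        "verbose": ratio > 1.5,  # industry baseline is ~0.5-1.0
        "potential_savings_usd": round(savings_usd, 2),
    }

# An agent answering 2K input tokens with 3.6K output tokens: ratio 1.8
print(verbosity_diagnostic([(800, 1600), (1200, 2000)]))
```

The 1.5 threshold mirrors the baseline quoted above; in practice you would tune it per use case.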

One-Tap Fix

Tap Apply and ClawBridge appends a one-line conciseness instruction to your SOUL.md:

Be concise. Prefer short, direct answers.

This single line consistently reduces output verbosity by 25–35% across all major models, without degrading answer quality.

Trade-offs

  • User-facing conversations: If your agent talks to users via Telegram or Discord, overly concise responses may feel curt or unhelpful. Consider this optimization mainly for autonomous/background agents.
  • Documentation tasks: If your agent is supposed to write detailed reports or documentation, concise mode may not be appropriate.
  • Easily reversible: Remove the line from SOUL.md to restore full verbosity. Or use Undo in ClawBridge.
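Manual reversal is just deleting that one line. A minimal sketch of the idea, operating on the file contents as a string (the exact wording of the instruction line is taken from above; everything else is illustrative):

```python
def remove_concise_line(text, line="Be concise. Prefer short, direct answers."):
    """Return SOUL.md contents with the concise-mode line removed,
    leaving every other line untouched."""
    return "\n".join(l for l in text.splitlines() if l.strip() != line)

before = "You are a helpful agent.\nBe concise. Prefer short, direct answers.\n"
print(remove_concise_line(before))  # -> "You are a helpful agent."
```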

Real Numbers

An agent producing 100K output tokens/day on Claude Sonnet ($15 per 1M output tokens), assuming a 30-day month:

| Scenario            | Daily Output Tokens | Monthly Cost | Savings   |
|---------------------|---------------------|--------------|-----------|
| Default (verbose)   | 100K                | $45.00       |           |
| Concise mode (-30%) | 70K                 | $31.50       | $13.50/mo |

On Claude Opus ($75 per 1M output tokens):

| Scenario     | Daily Output Tokens | Monthly Cost | Savings   |
|--------------|---------------------|--------------|-----------|
| Default      | 100K                | $225.00      |           |
| Concise mode | 70K                 | $157.50      | $67.50/mo |
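The figures in both tables follow from simple arithmetic over a 30-day month:

```python
def monthly_cost(daily_output_tokens, price_per_mtok, days=30):
    """Monthly output spend in USD for a fixed daily token volume."""
    return daily_output_tokens * days / 1_000_000 * price_per_mtok

sonnet_default = monthly_cost(100_000, 15.0)  # $45.00
sonnet_concise = monthly_cost(70_000, 15.0)   # $31.50
opus_default = monthly_cost(100_000, 75.0)    # $225.00
print(sonnet_default - sonnet_concise)        # $13.50/mo saved on Sonnet
```

Plug in your own daily volume and model price to estimate savings before applying the fix.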

FAQ

Q: Will "Be concise" actually work? A: Yes. Major LLMs are highly responsive to system prompt instructions about response style. This is one of the most well-documented prompt engineering techniques.

Q: Won't this affect the agent's personality? A: Minimally. The agent will still follow your SOUL.md personality directives—it just won't pad every response with unnecessary filler.

Q: What about code generation? Will it shorten code too? A: "Be concise" primarily affects natural language explanations, not code blocks. Code output length is usually driven by the task, not verbosity settings.


ClawBridge is free and open source (MIT License) — install it in seconds, own it forever. Get ClawBridge Free →



Ready to fix this?

Install ClawBridge in 30 seconds and gain total visibility over your OpenClaw agents — from your phone.