Your agent writes three paragraphs when one sentence would do. Every extra word is an output token, and output tokens cost about 5× as much as input tokens on most models.
## The Problem: AI Loves to Be Verbose
By default, large language models are trained to be helpful and thorough. Unfortunately, "thorough" often means:
- A 200-word explanation when a 20-word answer would suffice
- Repeating the question back to you before answering
- Adding disclaimers, caveats, and "let me know if you need anything else" to every response
- Formatting simple answers as multi-section documents
For an agent running autonomously, this verbosity is pure waste. Nobody is reading a beautifully formatted 500-token reply to "what's today's date?"
## How ClawBridge Detects This (Diagnostic A09)
The Cost Control Center analyzes your recent conversations and calculates:
- Output-to-input ratio: How many output tokens does your agent produce per input token? A typical baseline is ~0.5–1.0; if your ratio exceeds 1.5, the agent is likely verbose.
- Average response length: How many tokens per response, compared against similar use cases?
- Potential savings: If response length were reduced by 30%, what would you save?
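The three metrics above reduce to simple arithmetic. Here is a rough sketch of such a check in Python — the function name, thresholds, and pricing are illustrative assumptions, not ClawBridge's actual implementation:

```python
def verbosity_diagnostic(input_tokens, output_tokens, num_responses,
                         baseline_ratio=1.5, reduction=0.30,
                         price_per_million=15.0):
    """Sketch of a verbosity check in the spirit of diagnostic A09.

    Thresholds and pricing are illustrative, not ClawBridge's code.
    """
    ratio = output_tokens / input_tokens         # output-to-input ratio
    avg_len = output_tokens / num_responses      # avg tokens per response
    # Potential savings if output were cut by `reduction` (e.g. 30%)
    saved_tokens = output_tokens * reduction
    saved_cost = saved_tokens / 1_000_000 * price_per_million
    return {
        "ratio": ratio,
        "verbose": ratio > baseline_ratio,
        "avg_response_tokens": avg_len,
        "potential_savings_usd": saved_cost,
    }

# An agent that turned 50K input tokens into 100K output tokens over
# 500 responses has a ratio of 2.0 — well past the 1.5 threshold.
report = verbosity_diagnostic(50_000, 100_000, 500)
```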
## One-Tap Fix
Tap Apply and ClawBridge appends a concise instruction to your SOUL.md:
Be concise. Prefer short, direct answers.
In practice, this single line typically reduces output verbosity by 25–35% on major models without degrading answer quality.
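Under the hood, the fix is just a file append. A minimal sketch of an equivalent apply step, assuming a `SOUL.md` in the working directory (the function name and idempotency check are illustrative, not ClawBridge's actual implementation):

```python
from pathlib import Path

CONCISE_RULE = "Be concise. Prefer short, direct answers."

def apply_concise_fix(soul_path="SOUL.md"):
    """Append the concise instruction unless it is already present."""
    path = Path(soul_path)
    text = path.read_text() if path.exists() else ""
    if CONCISE_RULE in text:
        return False  # already applied; keep the append idempotent
    with path.open("a") as f:
        if text and not text.endswith("\n"):
            f.write("\n")
        f.write(CONCISE_RULE + "\n")
    return True
```

The idempotency check mirrors why Undo is safe: the change is a single known line, so it can be detected and removed cleanly.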
## Trade-offs
- User-facing conversations: If your agent talks to users via Telegram or Discord, overly concise responses may feel curt or unhelpful. Consider this optimization mainly for autonomous/background agents.
- Documentation tasks: If your agent is supposed to write detailed reports or documentation, concise mode may not be appropriate.
- Easily reversible: Remove the line from SOUL.md to restore full verbosity. Or use Undo in ClawBridge.
## Real Numbers
Agent producing 100K output tokens/day on Claude Sonnet ($15/1M output tokens):
| Scenario | Daily Output Tokens | Monthly Cost | Savings |
|---|---|---|---|
| Default (verbose) | 100K | $45.00 | — |
| Concise mode (-30%) | 70K | $31.50 | $13.50/mo |
On Claude Opus ($75/1M output tokens):
| Scenario | Daily Output Tokens | Monthly Cost | Savings |
|---|---|---|---|
| Default | 100K | $225.00 | — |
| Concise mode | 70K | $157.50 | $67.50/mo |
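Both tables come from one formula: daily output tokens × 30 days × the per-million output price. A quick sketch, using the prices quoted above:

```python
def monthly_cost(daily_output_tokens, price_per_million, days=30):
    """Monthly spend on output tokens at a given per-1M-token price."""
    return daily_output_tokens * days * price_per_million / 1_000_000

# Sonnet at $15/1M output tokens, 100K tokens/day
sonnet_default = monthly_cost(100_000, 15.0)   # $45.00
sonnet_concise = monthly_cost(70_000, 15.0)    # $31.50

# Opus at $75/1M output tokens
opus_default = monthly_cost(100_000, 75.0)     # $225.00
opus_concise = monthly_cost(70_000, 75.0)      # $157.50
```

Note how the same 30% cut saves 5× more in dollars on Opus: savings scale linearly with the output price.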
## FAQ
Q: Will "Be concise" actually work? A: Yes. Major LLMs are highly responsive to system prompt instructions about response style. This is one of the most well-documented prompt engineering techniques.
Q: Won't this affect the agent's personality? A: Minimally. The agent will still follow your SOUL.md personality directives—it just won't pad every response with unnecessary filler.
Q: What about code generation? Will it shorten code too? A: "Be concise" primarily affects natural language explanations, not code blocks. Code output length is usually driven by the task, not verbosity settings.
ClawBridge is free and open source (MIT License) — install it in seconds, own it forever. Get ClawBridge Free →