Chop: Stop Wasting Tokens on Verbose CLI Output

A couple of weeks ago I started paying more attention to something that had been quietly bothering me: my Claude Code sessions were burning through context faster than they should.

I started researching. Read dozens of posts about token optimization. Almost all of them pointed in the same direction: persist memory across sessions, use cross-context storage, summarize previous conversations.

I tried it. I built memory files, set up structured summaries, tuned the prompts.

The result wasn't what I expected.


🤔 The real question

After a few frustrating experiments, I did what I probably should have done from the start and asked Claude directly:

What's actually consuming most of my tokens in a typical session?

The answer was not memory. It wasn't prompt length. It wasn't even the code context.

It was the CLI output.

Every time Claude runs a command (git status, docker ps, npm test, kubectl get pods), the raw terminal output lands in the context window verbatim. All of it: the headers, the padding, the hints, the help text. The stuff a human skims past in half a second, but that costs dozens or hundreds of tokens every single time.

git status alone: 247 tokens for something that could be communicated in 12. docker ps with a handful of containers: 850+ tokens for a table that is mostly whitespace.

And it happens on every command. Silently. Constantly.


💡 The insight that changed things

The posts about memory persistence weren't wrong; they just weren't addressing the real bottleneck. Cross-session memory helps with context continuity. But it doesn't do anything about the firehose of verbose output that fills up your window during an active session.

The fix isn't about what Claude remembers. It's about what Claude sees in the first place.


🔧 Building chop

So I built chop, a CLI output compressor designed specifically for Claude Code.

The name comes from chop chop: the sound of something eating through all that verbosity, bite by bite, before it ever reaches the context window.

The idea is simple: intercept every command Claude runs and compress the output before it enters the context window. Not summarizing with AI, just structural compression. Strip the noise, keep the signal.

# Without chop (247 tokens)
$ git status
On branch main
Your branch is up to date with 'origin/main'.

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   src/app.ts
        modified:   src/auth/login.ts
        modified:   config.json

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        src/utils/helpers.ts

no changes added to commit (use "git add" and/or "git commit")

# With chop (12 tokens, 95% savings)
$ chop git status
modified(3): src/app.ts, src/auth/login.ts, config.json
untracked(1): src/utils/helpers.ts

Same information. A fraction of the tokens.
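To make the idea concrete, here is a minimal sketch of what a structural filter for git could look like. This is my own illustration, not chop's actual implementation; it assumes the machine-readable `git status --porcelain` format, and the function name and output shape are invented for the example:

```python
def compress_git_status(porcelain: str) -> str:
    """Collapse `git status --porcelain` output into one dense line per category.

    Illustration only: handles just modified and untracked files.
    """
    modified, untracked = [], []
    for line in porcelain.splitlines():
        if not line.strip():
            continue
        # Porcelain format: a two-character status code, a space, then the path.
        status, path = line[:2], line[3:]
        if status == "??":
            untracked.append(path)
        elif "M" in status:
            modified.append(path)
    parts = []
    if modified:
        parts.append(f"modified({len(modified)}): {', '.join(modified)}")
    if untracked:
        parts.append(f"untracked({len(untracked)}): {', '.join(untracked)}")
    return "\n".join(parts)
```

The pattern generalizes: parse the tool's most stable output format, drop the decoration, and emit one compact line per fact.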


βš™οΈ How it works

You can use chop in two ways:

Manual: prefix any command yourself:

chop git status
chop docker ps
chop npm test
chop kubectl get pods
chop terraform plan

Automatic: install a Claude Code hook that wraps every Bash command transparently:

chop init --global

After that, every command Claude runs gets compressed automatically. You don't have to think about it.
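Conceptually, the hook's job is tiny: rewrite each Bash invocation so its output passes through the compressor before Claude ever sees it. A toy sketch of that rewrite (my own illustration, not the hook's real code):

```python
def wrap_command(cmd: str) -> str:
    """Prefix a shell command with `chop` unless it is already wrapped."""
    return cmd if cmd.startswith("chop ") else f"chop {cmd}"
```

Everything else (which filter applies, how aggressive the compression is) happens inside chop itself, so the hook stays a thin pass-through.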

chop supports 52+ commands across Git, Docker, Kubernetes, JavaScript tooling, .NET, Go, Rust, Python, Java, Ruby, Terraform, cloud CLIs, and more. Anything not on the list still gets auto-detected and compressed via structural pattern matching: JSON, CSV, tables, log lines.
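The fallback for unrecognized commands can be pictured as generic table squeezing: padded CLI tables separate columns with runs of spaces, so collapsing those runs keeps every cell while discarding the padding. A hedged sketch of that idea (mine, not chop's code):

```python
import re

def compress_table(raw: str) -> str:
    """Collapse a whitespace-padded table into compact pipe-delimited rows.

    Assumes columns are separated by two or more spaces, as in `docker ps`.
    """
    rows = []
    for line in raw.splitlines():
        if not line.strip():
            continue  # drop blank lines entirely
        cells = re.split(r"\s{2,}", line.strip())
        rows.append("|".join(cells))
    return "\n".join(rows)
```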


📊 The actual numbers

chop tracks every run in a local SQLite database:

chop gain             # overall stats
chop gain --history   # last 20 commands
chop gain --summary   # per-command breakdown

After a few regular workdays of Claude sessions:

today: 42 commands, 12,847 tokens saved
total: 318 commands, 89,234 tokens saved (73.2% avg)

That's a real number. Not a benchmark, not a synthetic test, just normal development work with chop in the loop.
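Savings figures like these are simple aggregates over the run log. A sketch against an assumed schema (chop's real table layout may well differ):

```python
import sqlite3

# Hypothetical schema for illustration; chop's actual database may differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (cmd TEXT, tokens_in INTEGER, tokens_out INTEGER)")
conn.executemany(
    "INSERT INTO runs VALUES (?, ?, ?)",
    [("git status", 247, 12), ("docker ps", 850, 95)],
)
# Total tokens saved and the average savings percentage across all runs.
saved, pct = conn.execute(
    "SELECT SUM(tokens_in - tokens_out),"
    " 100.0 * SUM(tokens_in - tokens_out) / SUM(tokens_in) FROM runs"
).fetchone()
print(f"{saved} tokens saved ({pct:.1f}% avg)")  # prints: 990 tokens saved (90.2% avg)
```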


🎯 What this changes

The memory posts weren't useless. Persistent context across sessions is genuinely useful for long-running projects. But if you're trying to reduce token consumption during a session, the biggest lever by far is what goes into the context window from command output.

chop doesn't require changing how you work. It doesn't require rewriting prompts or restructuring projects. You install it, run chop init --global, and the compression happens in the background.

The conversations get tighter. Claude holds more of the actual work in context. And the token count stops growing as fast.


🙃 The plot twist

I started building chop on a Monday. On Thursday night, a friend sent me a message mentioning he'd found a tool that reduces token usage when working with AI coding assistants. It was RTK, a Rust-based CLI proxy that does essentially the same thing.

I visited the repo and read through it. I didn't test it, because at that point chop was already working.

Honestly? If my friend had mentioned it on Sunday, I probably would have just used RTK and saved myself the week. But I hadn't heard of it, so I built my own, and in the process learned a lot about how Claude Code hooks work, how to structure per-command filters, and how much variance there is in CLI verbosity across different tools.

Both tools exist now and they solve the same problem. I'll keep using my own implementation, mostly because I can improve it to fit my workflow as I go, and maybe make it generic enough that someone else finds it useful too.


✳️ Where to go from here

If you use Claude Code regularly for development, this is probably worth trying.

It's free, open source, MIT licensed, and installs in one line:

curl -fsSL https://raw.githubusercontent.com/AgusRdz/chop/main/install.sh | sh

👉 chop project page


Sometimes the simplest question leads to the most useful answer. I spent days looking at memory strategies. The real fix was just reducing the noise, and apparently at least two people figured that out around the same time.