
All because the team passed a 500-row customer table to the model as plain JSON. The same payload in TOON would have cost roughly a third of that.
That’s when it hits you: JSON wasn’t built for this world.
It came from 2001, a time of web round-trips and browser consoles. Every brace, quote, comma, and repeated key made sense back then.
In 2025, those characters are tokens. Tokens are money. And every repeated "id": and "name": is a tax you pay for no extra information. TOON is a format built to remove that tax.
It keeps the full JSON data model but strips away the syntax models don’t need.
It replaces braces with indentation, turns repeated keys into a single header row, and makes array sizes explicit so the model can’t hallucinate extra entries.
- Same data.
- Less noise.
- Fewer tokens.
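Concretely, here's a tiny version of that customer table in both forms. This is a sketch of TOON's tabular layout as I understand it; the exact quoting and escaping rules live in the spec, and the field values are invented for illustration:

```
JSON (keys repeated on every row):
[
  {"id": 1, "name": "Ada"},
  {"id": 2, "name": "Linus"}
]

TOON (keys declared once, length explicit):
customers[2]{id,name}:
  1,Ada
  2,Linus
```

The `[2]` is the explicit array length, `{id,name}` is the one-time header, and every row after that is just values.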
In real workloads, the difference is big.
We saw 61 percent token savings on common datasets. Accuracy jumped as well, because the clearer structure is harder for the model to misinterpret.
TOON isn’t a new database. It isn’t compression. It’s simply a way to present structured data in a form that LLMs read more efficiently than JSON. For APIs, logs, and storage systems, JSON is still perfect. Inside prompts, it quietly becomes the most expensive part of your pipeline.
If you care about tokens, or if your context often includes tables, logs, or structured objects, this is worth a look.
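If you want to sanity-check the savings on your own data first, a rough comparison takes a few lines. The `to_toon` helper below is my own minimal sketch of the tabular layout described above, not the official encoder; it assumes a flat list of objects that all share the same keys and contain no commas or newlines in their values:

```python
import json
import tiktoken  # pip install tiktoken

def to_toon(name, rows):
    # Sketch of TOON's tabular array: one header with the length and
    # field names, then one comma-separated line of values per row.
    keys = list(rows[0])
    header = f"{name}[{len(rows)}]{{{','.join(keys)}}}:"
    lines = ["  " + ",".join(str(r[k]) for k in keys) for r in rows]
    return "\n".join([header, *lines])

customers = [
    {"id": 1, "name": "Ada", "plan": "pro"},
    {"id": 2, "name": "Linus", "plan": "free"},
]

enc = tiktoken.get_encoding("cl100k_base")
print(len(enc.encode(json.dumps(customers))), "tokens as JSON")
print(len(enc.encode(to_toon("customers", customers))), "tokens as TOON")
```

JSON pays for the keys on every row while TOON pays for them once, so the gap widens as the row count grows.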
I wrote up the full notes and benchmarks here.
Happy to answer questions or share examples if anyone wants to test TOON on their own datasets.
