# Introduction
JSON is nice for APIs, storage, and software logic. However inside giant language mannequin (LLM) pipelines, it usually carries loads of token overhead that doesn’t add a lot worth to the mannequin: braces, quotes, commas, and repeated subject names on each row. TOON, brief for Token-Oriented Object Notation, is a more recent format designed particularly to maintain the identical JSON knowledge mannequin whereas utilizing fewer tokens and giving fashions clearer structural cues. The official TOON docs describe it as a compact, lossless illustration of JSON for LLM enter, particularly sturdy on uniform arrays of objects.
On this article, you’ll study what TOON is, when it is smart to make use of it, and how you can begin utilizing it step-by-step in your personal LLM workflow. We may even hold the tradeoffs trustworthy, as a result of TOON is beneficial in some circumstances, not all of them.
# Why JSON Wastes Tokens in LLM Pipelines
JSON turns into costly in prompts as a result of it repeats construction time and again. LLMs don’t care that JSON is a regular. They solely see tokens.
For those who ship 100 assist tickets, product rows, or person information to a mannequin, the identical subject names seem in each object. TOON reduces that repetition by declaring fields as soon as after which streaming row values in a compact tabular kind. Right here is a straightforward instance.
JSON:
{
"customers": [
{ "id": 1, "name": "Alice", "role": "admin" },
{ "id": 2, "name": "Bob", "role": "user" },
{ "id": 3, "name": "Charlie", "role": "user" }
]
}
TOON:
customers[3]{id,identify,position}:
1,Alice,admin
2,Bob,person
3,Charlie,person
Identical knowledge, much less litter.
The construction continues to be clear, however the repeated keys are gone. That’s the place TOON will get most of its worth.
# What TOON Really Is and When It Is Value Utilizing
TOON is a serialization format for the JSON knowledge mannequin. Which means it will possibly symbolize objects, arrays, strings, numbers, booleans, and null values — however in a means that’s extra compact for mannequin enter. The TOON mission presents it as lossless relative to JSON, which suggests you possibly can convert JSON to TOON and again with out shedding info. The vital factor to know is that this:
You do not want to switch JSON in your app.
A greater method is to maintain JSON in your backend, APIs, and storage, then convert it to TOON solely if you end up about to ship structured knowledge into an LLM.
TOON is most helpful when your immediate accommodates repeated structured information with the identical fields. Good examples embrace retrieved assist tickets, catalog rows, analytics information, device outputs, CRM entries, or reminiscence snapshots for agent programs. Nevertheless, in case your construction is deeply nested, extremely irregular, purely flat, or very small, the advantages can shrink or disappear.
# Getting Began with TOON
// Step 1: Putting in the TOON Command-Line Interface
The best method to strive TOON is with the official command-line interface (CLI) from the TOON mission. The TOON web site hyperlinks on to its CLI, and the primary repository presents the format as a part of a broader SDK and tooling ecosystem.
Set up the package deal:
npm set up -g @toon-format/cli
// Step 2: Changing a JSON File into TOON
Let’s create a folder first:
mkdir toon-test
cd toon-test
Now, run the next command to create the JSON file:
Paste this:
[
{ "id": 1, "name": "Alice", "role": "admin" },
{ "id": 2, "name": "Bob", "role": "user" },
{ "id": 3, "name": "Charlie", "role": "user" }
]
Now convert it:
npx @toon-format/cli customers.json -o customers.toon
You must get a compact outcome much like this:
[3]{id,identify,position}:
1,Alice,admin
2,Bob,person
3,Charlie,person
That is the core TOON sample: declare the form as soon as, then record the values row by row. That aligns with the official design aim of tabular arrays for uniform objects.
// Step 3: Utilizing TOON as Mannequin Enter
The very best place to make use of TOON is on the enter facet of your pipeline. As a substitute of pasting a big JSON blob right into a immediate, go the TOON model and hold the instruction easy.
For instance:
The next knowledge is in TOON format.
customers[3]{id,identify,position}:
1,Alice,admin
2,Bob,person
3,Charlie,person
Summarize the person roles and level out something uncommon.
This works effectively as a result of TOON is designed to assist the mannequin learn repeated construction with much less overhead. That can be how the official mission frames its benchmarks: as a take a look at of comprehension throughout totally different structured enter codecs.
// Step 4: Holding JSON for Outputs
This is likely one of the most vital sensible selections. TOON may be very helpful for enter, however JSON continues to be often the higher alternative for output when one other system must parse the mannequin response. That’s as a result of JSON has a lot stronger tooling assist, and fashionable APIs can implement structured JSON output with schemas.
In follow, the most secure sample is:
- JSON in your app.
- TOON for giant structured immediate context.
- JSON once more for machine-parseable mannequin responses.
This offers you effectivity on the enter facet and reliability on the output facet.
// Step 5: Benchmarking in Your Personal Pipeline
Don’t change codecs based mostly on hype alone.
Run a small benchmark in your personal workflow:
- Depend enter tokens for JSON.
- Depend enter tokens for TOON.
- Examine latency.
- Examine reply high quality.
- Examine complete price.
The official TOON mission positions token financial savings as one of many most important advantages, and third-party protection repeats these claims, however neighborhood dialogue additionally reveals that outcomes rely closely on the form of the information. That’s the reason the most effective query just isn’t “Is TOON higher than JSON?”
The higher query is: “Is TOON higher for this particular LLM step?”
# Remaining Ideas
TOON just isn’t one thing you’ll want to use in every single place.
It’s a focused optimization for one particular downside: losing tokens on repeated JSON construction inside LLM prompts. In case your pipeline passes plenty of repeated structured information right into a mannequin, TOON is value testing. In case your payloads are small, irregular, or closely nested, JSON should be the higher alternative.
The neatest method to undertake it’s easy: hold JSON the place JSON already works effectively, use TOON the place you might be packing giant structured inputs into prompts, and benchmark the outcomes by yourself duties earlier than committing to it.
Kanwal Mehreen is a machine studying engineer and a technical author with a profound ardour for knowledge science and the intersection of AI with drugs. She co-authored the e book “Maximizing Productiveness with ChatGPT”. As a Google Era Scholar 2022 for APAC, she champions variety and tutorial excellence. She’s additionally acknowledged as a Teradata Range in Tech Scholar, Mitacs Globalink Analysis Scholar, and Harvard WeCode Scholar. Kanwal is an ardent advocate for change, having based FEMCodes to empower ladies in STEM fields.







