Package dev.toonformat.jtoon
Overview
This package provides a Java implementation of the TOON (Token-Oriented Object Notation) format, a compact, human-readable data format optimized for Large Language Model (LLM) contexts. TOON achieves 30-60% token reduction compared to JSON while maintaining readability and structure.
Core Components
Public API
- JToon: Main entry point for encoding Java objects to TOON format
- EncodeOptions: Configuration options for encoding (indent, delimiter, length marker)
- Delimiter: Enum for array/tabular delimiter options (comma, tab, pipe)
Encoding Pipeline
- JsonNormalizer: Converts Java objects to Jackson JsonNode representation
- ValueEncoder: Core encoder that converts JsonNode to TOON format
- PrimitiveEncoder: Handles encoding of primitive values and object keys
- HeaderFormatter: Formats array and tabular structure headers
- LineWriter: Accumulates indented lines for output
Utility Classes
- StringValidator: Validates when strings can be unquoted
- StringEscaper: Escapes special characters in quoted strings
- Constants: Shared constants used throughout the package
Usage Examples
Basic Encoding
import dev.toonformat.jtoon.JToon;
import java.util.*;
record User(int id, String name, boolean active) {}
User user = new User(123, "Ada", true);
String jtoon = JToon.encode(user);
// Output:
// id: 123
// name: Ada
// active: true
Tabular Arrays
record Item(String sku, int qty, double price) {
}
record Order(List<Item> items) {
}
Order order = new Order(List.of(
    new Item("A1", 2, 9.99),
    new Item("B2", 1, 14.5)));
String jtoon = JToon.encode(order);
// Output:
// items[2]{sku,qty,price}:
//   A1,2,9.99
//   B2,1,14.5
Custom Options
import dev.toonformat.jtoon.*;
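// Use tab delimiters and length markers (arguments: indent, delimiter, length marker)
EncodeOptions options = new EncodeOptions(2, Delimiter.TAB, true);
String jtoon = JToon.encode(data, options); // data: any object to encode, e.g. the order above
// Or use the static factory methods, each setting a single option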
EncodeOptions opts1 = EncodeOptions.withIndent(4);
EncodeOptions opts2 = EncodeOptions.withDelimiter(Delimiter.PIPE);
EncodeOptions opts3 = EncodeOptions.withLengthMarker(true);
Type Conversions
The library automatically normalizes Java-specific types for LLM-safe output:
- Numbers: Finite values in decimal form; NaN/Infinity → null; -0 → 0
- BigInteger: Converted to Long if within range, otherwise string
- BigDecimal: Preserved as decimal number
- Temporal types: Converted to ISO-8601 strings (LocalDateTime, Instant, etc.)
- Optional: Unwrapped to value or null
- Stream: Materialized to array
- Collections: Converted to arrays
- Maps: Converted to objects with string keys
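The numeric rules above can be illustrated with a small standalone sketch. It is not the library's JsonNormalizer; the class and method names (NumberNormalizationSketch, normalizeDouble, normalizeBigInteger) are hypothetical and only mirror the rules listed here using the JDK.

```java
import java.math.BigInteger;

// Hypothetical sketch of the number rules; not the library's JsonNormalizer.
final class NumberNormalizationSketch {
    private static final BigInteger LONG_MIN = BigInteger.valueOf(Long.MIN_VALUE);
    private static final BigInteger LONG_MAX = BigInteger.valueOf(Long.MAX_VALUE);

    /** NaN and Infinity become null; -0.0 becomes 0.0; other values pass through. */
    static Double normalizeDouble(double value) {
        if (Double.isNaN(value) || Double.isInfinite(value)) {
            return null;
        }
        // -0.0 == 0.0 is true in Java, so this branch drops the sign of negative zero.
        return value == 0.0 ? 0.0 : value;
    }

    /** Returns a Long when the value fits in 64 bits, otherwise its decimal string. */
    static Object normalizeBigInteger(BigInteger value) {
        if (value.compareTo(LONG_MIN) >= 0 && value.compareTo(LONG_MAX) <= 0) {
            return value.longValueExact();
        }
        return value.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalizeDouble(Double.NaN));
        System.out.println(normalizeDouble(-0.0));
        System.out.println(normalizeBigInteger(BigInteger.TWO.pow(80)));
    }
}
```

Returning null rather than throwing keeps the output valid for downstream consumers, at the cost of losing the distinction between NaN and a missing value.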
Format Features
Indentation-Based Structure
Uses YAML-like indentation (default 2 spaces) for nested objects.
Tabular Arrays
Arrays of uniform objects with primitive values are encoded in CSV-like tabular format, declaring field names once in the header and then streaming rows.
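As a rough illustration of the uniformity check, the sketch below works on plain maps rather than Jackson nodes. Everything in it (TabularDetectionSketch, tabularHeader, renderTable) is a hypothetical stand-in, and it deliberately skips quoting and escaping of cell values.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.StringJoiner;

// Hypothetical sketch of tabular detection; not the library's ValueEncoder.
final class TabularDetectionSketch {
    /** Returns the shared field names if all rows have the same keys and only primitive values. */
    static Optional<List<String>> tabularHeader(List<Map<String, Object>> rows) {
        if (rows.isEmpty()) return Optional.empty();
        List<String> header = new ArrayList<>(rows.get(0).keySet());
        if (header.isEmpty()) return Optional.empty();
        for (Map<String, Object> row : rows) {            // n rows
            if (row.size() != header.size() || !row.keySet().containsAll(header)) {
                return Optional.empty();
            }
            for (Object value : row.values()) {            // m fields per row
                if (!isPrimitive(value)) return Optional.empty();
            }
        }
        return Optional.of(header);
    }

    static boolean isPrimitive(Object v) {
        return v == null || v instanceof String || v instanceof Number || v instanceof Boolean;
    }

    /** Writes the header once, then one comma-separated row per object (no quoting in this sketch). */
    static String renderTable(String key, List<Map<String, Object>> rows, List<String> header) {
        StringBuilder out = new StringBuilder();
        out.append(key).append('[').append(rows.size()).append("]{")
           .append(String.join(",", header)).append("}:");
        for (Map<String, Object> row : rows) {
            StringJoiner cells = new StringJoiner(",");
            for (String field : header) cells.add(String.valueOf(row.get(field)));
            out.append("\n  ").append(cells);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> a = new LinkedHashMap<>();
        a.put("sku", "A1"); a.put("qty", 2); a.put("price", 9.99);
        Map<String, Object> b = new LinkedHashMap<>();
        b.put("sku", "B2"); b.put("qty", 1); b.put("price", 14.5);
        List<Map<String, Object>> rows = List.of(a, b);
        tabularHeader(rows).ifPresent(h -> System.out.println(renderTable("items", rows, h)));
    }
}
```

When the check fails (for example, one row carries a nested list), an encoder would fall back to a non-tabular list representation; that fallback is outside this sketch.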
Smart Quoting
Strings are quoted only when necessary (contains delimiters, special characters, looks like keywords/numbers, etc.). This minimizes token usage.
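A minimal sketch of such a quoting decision follows. The exact rule set used by StringValidator and StringEscaper is not spelled out here, so the conditions below (empty string, surrounding whitespace, keyword, number-like, active delimiter, structural characters) are assumptions, and QuotingSketch is a hypothetical name.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of smart quoting; the real rules may differ.
final class QuotingSketch {
    // Compiled once and reused, since validation runs for every string value.
    private static final Pattern NUMERIC =
            Pattern.compile("-?\\d+(\\.\\d+)?([eE][+-]?\\d+)?");
    // Characters assumed to be structurally significant in this sketch.
    private static final String STRUCTURAL = ":\"\\[]{}\n\r\t";

    static boolean needsQuotes(String s, char delimiter) {
        if (s.isEmpty() || !s.equals(s.trim())) return true;        // empty or padded
        if (s.equals("true") || s.equals("false") || s.equals("null")) return true;
        if (NUMERIC.matcher(s).matches()) return true;               // would read back as a number
        if (s.indexOf(delimiter) >= 0) return true;                  // would split the row
        for (int i = 0; i < s.length(); i++) {
            if (STRUCTURAL.indexOf(s.charAt(i)) >= 0) return true;
        }
        return false;
    }

    static String encode(String s, char delimiter) {
        if (!needsQuotes(s, delimiter)) return s;
        String escaped = s.replace("\\", "\\\\")   // backslash first, so later escapes survive
                .replace("\"", "\\\"")
                .replace("\n", "\\n")
                .replace("\r", "\\r")
                .replace("\t", "\\t");
        return "\"" + escaped + "\"";
    }

    public static void main(String[] args) {
        for (String s : new String[] {"Ada", "true", "42", "a,b", "hello world"}) {
            System.out.println(encode(s, ','));
        }
    }
}
```

Note that the check depends on the active delimiter: in this sketch a comma inside a value only forces quotes when the comma is the delimiter in use.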
Delimiter Options
Arrays and tabular rows support three delimiters:
- Comma (default): Implicit in headers - items[3]: a,b,c
- Pipe: Explicit in headers - items[3|]: a|b|c
- Tab: Explicit in headers - items[3\t]: a\tb\tc
Length Markers
Optional # prefix for array lengths to emphasize count vs index: items[#3] instead of items[3].
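The header shapes described under Delimiter Options and Length Markers can be reproduced with a few lines of string building. This is only a sketch: HeaderSketch and its Delim enum are hypothetical stand-ins for HeaderFormatter and Delimiter, and combining the marker with a non-comma delimiter (as in items[#3|]) is an assumption about how the two options compose.

```java
import java.util.List;

// Hypothetical sketch of array header formatting; not the library's HeaderFormatter.
final class HeaderSketch {
    enum Delim {
        COMMA(','), TAB('\t'), PIPE('|');

        final char ch;

        Delim(char ch) { this.ch = ch; }
    }

    /** Builds key[len]: with an optional # marker and an explicit non-comma delimiter. */
    static String header(String key, int length, Delim delim, boolean lengthMarker) {
        StringBuilder sb = new StringBuilder(key).append('[');
        if (lengthMarker) sb.append('#');
        sb.append(length);
        if (delim != Delim.COMMA) sb.append(delim.ch);   // comma stays implicit
        return sb.append("]:").toString();
    }

    /** Joins already-encoded primitive values after the header. */
    static String inlineArray(String key, List<String> values, Delim delim, boolean lengthMarker) {
        return header(key, values.size(), delim, lengthMarker)
                + " " + String.join(String.valueOf(delim.ch), values);
    }

    public static void main(String[] args) {
        List<String> values = List.of("a", "b", "c");
        System.out.println(inlineArray("items", values, Delim.COMMA, false));
        System.out.println(inlineArray("items", values, Delim.PIPE, false));
        System.out.println(inlineArray("items", values, Delim.COMMA, true));
    }
}
```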
Architecture
Encoding Pipeline
- Normalization: JsonNormalizer converts Java objects to JsonNode
- Encoding: ValueEncoder recursively encodes JsonNode to TOON
- Output: LineWriter accumulates formatted lines
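The encode and output stages can be pictured with a toy version that walks nested maps and hands each line to an accumulator. It assumes input that is already normalized to maps and primitives, skips arrays and quoting entirely, and uses hypothetical names (PipelineSketch, Lines) rather than the package's ValueEncoder and LineWriter.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of recursive encoding plus line accumulation.
final class PipelineSketch {
    /** Accumulates lines, prefixing each with depth * indent spaces. */
    static final class Lines {
        private final List<String> lines = new ArrayList<>();
        private final String unit;

        Lines(int indent) { this.unit = " ".repeat(indent); }

        void push(int depth, String content) { lines.add(unit.repeat(depth) + content); }

        @Override
        public String toString() { return String.join("\n", lines); }
    }

    /** Emits "key: value" for primitives and recurses one level deeper for nested objects. */
    static void encodeObject(Map<String, ?> object, Lines out, int depth) {
        for (Map.Entry<String, ?> entry : object.entrySet()) {
            Object value = entry.getValue();
            if (value instanceof Map<?, ?>) {
                out.push(depth, entry.getKey() + ":");
                @SuppressWarnings("unchecked") // sketch assumes string keys after normalization
                Map<String, ?> child = (Map<String, ?>) value;
                encodeObject(child, out, depth + 1);
            } else {
                out.push(depth, entry.getKey() + ": " + value);
            }
        }
    }

    static String encode(Map<String, ?> root, int indent) {
        Lines out = new Lines(indent);   // one accumulator per call, never shared
        encodeObject(root, out, 0);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> user = new LinkedHashMap<>();
        user.put("id", 123);
        user.put("name", "Ada");
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("user", user);
        root.put("active", true);
        System.out.println(encode(root, 2));
    }
}
```

Creating the accumulator inside each encode call keeps the mutable state local to that call, which is one simple way to keep an encoder safe to call from several threads.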
Design Principles
- Single Responsibility: Each class has one clear purpose
- Immutability: Configuration objects are immutable records
- Utility Classes: Static utility classes with private constructors
- Modern Java: Leverages Java 17 features (records, switch expressions)
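To make the immutability and Modern Java points concrete, here is a small sketch of a configuration record with static factories and a switch expression. It requires Java 17 (or at least 16 for records). The Options record and Delim enum are hypothetical and are not the package's EncodeOptions or Delimiter; the defaults (indent 2, comma, no length marker) follow the defaults described in this document.

```java
import java.util.Objects;

// Hypothetical sketch of an immutable options record; not the library's EncodeOptions.
final class OptionsSketch {
    enum Delim { COMMA, TAB, PIPE }

    /** Record components are final and there are no setters, so instances never change. */
    record Options(int indent, Delim delimiter, boolean lengthMarker) {
        static final Options DEFAULT = new Options(2, Delim.COMMA, false);

        // Compact constructor: validate once, at construction time.
        Options {
            if (indent < 0) throw new IllegalArgumentException("indent must be >= 0");
            Objects.requireNonNull(delimiter, "delimiter");
        }

        static Options withIndent(int indent) {
            return new Options(indent, DEFAULT.delimiter, DEFAULT.lengthMarker);
        }

        static Options withDelimiter(Delim delimiter) {
            return new Options(DEFAULT.indent, delimiter, DEFAULT.lengthMarker);
        }
    }

    /** Switch expression: exhaustive over the enum, so no default branch is needed. */
    static String symbol(Delim delimiter) {
        return switch (delimiter) {
            case COMMA -> ",";
            case TAB -> "\t";
            case PIPE -> "|";
        };
    }

    public static void main(String[] args) {
        System.out.println(Options.withIndent(4));
        System.out.println(symbol(Delim.PIPE));
    }
}
```

Because such a record has no mutable state, a single instance can be shared freely between threads and reused across calls.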
Performance Considerations
- Tabular format detection is O(n×m) where n = rows, m = fields
- String validation uses precompiled regex patterns
- Output is assembled with StringBuilder to avoid repeated string concatenation
- No reflection in the encoder's own hot paths; object introspection is delegated to Jackson's ObjectMapper
Thread Safety
All public API methods are thread-safe. Internal classes are stateless utility classes or immutable configuration objects. The Jackson ObjectMapper instance is shared and thread-safe.
Since: 0.1.0
Classes
- Configuration options for decoding TOON format to Java objects.
- Delimiter: Delimiter options for tabular array rows and inline primitive arrays.
- EncodeOptions: Configuration options for encoding data to JToon format.
- JToon: Main entry point for encoding and decoding TOON (Token-Oriented Object Notation) format.
- Path expansion mode for decoding dotted keys.