Package dev.toonformat.jtoon.encoder


package dev.toonformat.jtoon.encoder
Encoding engine for converting normalized JsonNode values to TOON format.

Overview

This package contains the core encoding logic that transforms Jackson JsonNode representations into TOON (Token-Oriented Object Notation) format strings. The architecture follows a delegation pattern with specialized encoders for different data types.

Core Components

ValueEncoder

The main orchestrator (42 lines) that determines node type and delegates to specialized encoders:

  • Primitives → PrimitiveEncoder
  • Arrays → ArrayEncoder
  • Objects → ObjectEncoder

ObjectEncoder

Handles encoding of JSON objects. Iterates through object fields and recursively encodes key-value pairs. Delegates arrays to ArrayEncoder and primitives to PrimitiveEncoder.

ArrayEncoder

Orchestrates array encoding by detecting array type and delegating appropriately:

  • Primitive arrays → inline format (tags[3]: a,b,c)
  • Uniform objects → TabularArrayEncoder for efficient tabular format
  • Mixed/nested → ListItemEncoder for list format

TabularArrayEncoder

Detects and encodes uniform arrays of objects in efficient tabular format. Analyzes array structure to determine if all objects have the same primitive fields. When possible, produces:

users[3]{id,name,age}:
  1,Alice,30
  2,Bob,25
  3,Charlie,35

ListItemEncoder

Handles encoding of objects as list items in non-uniform arrays. Implements the complex logic for placing the first field on the "- " line and indenting remaining fields.

PrimitiveEncoder

Handles encoding of primitive values (strings, numbers, booleans, null) and object keys. Implements smart quoting logic: quotes strings only when necessary to avoid ambiguity.

HeaderFormatter

Formats array and tabular structure headers following TOON syntax:

  • items[5]: - Simple array with 5 elements
  • users[3]{id,name,age}: - Tabular array with 3 rows and specified fields
  • data[#10|]: - Array with length marker and pipe delimiter

LineWriter

Accumulates formatted output lines with proper indentation. Manages indentation depth and joins lines with newlines.

Encoding Process

  1. Entry: ValueEncoder receives a JsonNode and EncodeOptions
  2. Type Detection: ValueEncoder determines if node is primitive, array, or object
  3. Delegation: Delegates to ObjectEncoder, ArrayEncoder, or PrimitiveEncoder
  4. Array Optimization: ArrayEncoder detects if arrays can use tabular format
  5. Specialized Encoding: TabularArrayEncoder or ListItemEncoder handle specifics
  6. Recursive Processing: Each encoder recursively handles nested structures
  7. Output Building: LineWriter accumulates formatted lines with proper indentation

Architecture Benefits

Single Responsibility

Each encoder has one clear responsibility:

  • ValueEncoder: Route to appropriate encoder (42 lines)
  • ObjectEncoder: Handle object key-value pairs (62 lines)
  • ArrayEncoder: Orchestrate array encoding (207 lines)
  • TabularArrayEncoder: Detect and encode tabular format (126 lines)
  • ListItemEncoder: Handle list item formatting (146 lines)

Maintainability

The refactored architecture improves maintainability:

  • Each class under 210 lines (vs. original 401-line monolith)
  • Clear separation of concerns
  • Easier to test in isolation
  • Easier to understand and modify

Tabular Format Optimization

Arrays of uniform objects with primitive values are encoded in CSV-like tabular format:

// Instead of:
items[2]:
  - id: 1
    name: Alice
  - id: 2
    name: Bob

// TOON produces:
items[2]{id,name}:
  1,Alice
  2,Bob

Delimiter Support

Supports three delimiters for arrays and tabular rows:

  • Comma: Default, implicit in headers
  • Tab: Explicit in headers, often better tokenization
  • Pipe: Explicit in headers, visual clarity
Since:
0.1.0
See Also: