Package dev.toonformat.jtoon.decoder


Decoding engine for converting TOON format to Java objects.

Overview

This package contains the core decoding logic that parses TOON (Token-Oriented Object Notation) format strings into Java objects (Maps, Lists, and primitives). The architecture follows a line-by-line parsing strategy with depth tracking based on indentation.

Core Components

ValueDecoder

The main orchestrator that manages the parsing state and routes lines to appropriate decoders. Contains an inner Parser class that maintains:

  • Current line index
  • Array of input lines
  • Indentation depth tracking
  • Delimiter configuration

PrimitiveDecoder

Handles parsing of scalar values with type inference:

  • "null" → null
  • "true"/"false" → Boolean
  • Numeric strings → Long or Double
  • Quoted strings → String (with unescaping)
  • Bare strings → String
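The inference order above can be sketched as follows. This is a minimal illustration, not the library's PrimitiveDecoder: the class name, the method name infer, and the two numeric patterns are assumptions, and unescaping of quoted strings is left out.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of scalar type inference; not the actual PrimitiveDecoder.
final class PrimitiveInferenceSketch {

    // Assumed numeric shapes: plain integers, and decimals with an optional exponent.
    private static final Pattern INTEGER = Pattern.compile("-?\\d+");
    private static final Pattern DECIMAL = Pattern.compile("-?\\d+(\\.\\d+)?([eE][+-]?\\d+)?");

    static Object infer(String token) {
        String s = token.trim();
        if (s.equals("null")) return null;
        if (s.equals("true")) return Boolean.TRUE;
        if (s.equals("false")) return Boolean.FALSE;
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            // Quoted strings stay strings even if they look numeric.
            // A real decoder would also unescape the content here.
            return s.substring(1, s.length() - 1);
        }
        if (INTEGER.matcher(s).matches()) {
            try {
                return Long.parseLong(s);
            } catch (NumberFormatException tooLarge) {
                return Double.parseDouble(s); // falls back to Double on overflow
            }
        }
        if (DECIMAL.matcher(s).matches()) {
            return Double.parseDouble(s);
        }
        return s; // bare string
    }
}
```

Checking quotes before the numeric patterns is what keeps "42" (quoted) a String while 42 (bare) becomes a Long.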

ObjectDecoder

Parses key-value pairs and nested objects:

  • Detects unquoted colons in key: value format
  • Handles quoted keys like "order:id": value
  • Uses lookahead to detect nested objects (depth increase)
  • Recursively processes nested structures

ArrayDecoder

Detects array type from header and delegates parsing:

  • Tabular: items[2]{id,name}: → parses rows into Maps
  • List: items[2]: with - prefixed lines
  • Primitive: tags[3]: a,b,c → inline or multiline

Parsing Strategy

Pattern Matching

Uses regex patterns to detect structure:

  • \[(#?)\d+[\t|]?] - Standalone array header
  • \[(#?)\d+[\t|]?]\{(.+)\}: - Tabular array header with fields
  • ^(.+?)\[(#?)\d+[\t|]?](\{[^}]+\})?:.*$ - Keyed array pattern
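To show how such a pattern might be applied, the sketch below uses a variant of the keyed array pattern with extra capture groups for the length, delimiter marker, and trailing content. The extra groups, the class name, and the parse method are assumptions made for illustration; the decoder's own patterns may be organized differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: applying a keyed-array-header regex with java.util.regex.
final class ArrayHeaderSketch {

    // Groups: 1 = key, 2 = optional '#', 3 = length, 4 = delimiter marker,
    //         5 = field list (tabular only), 6 = content after the colon.
    static final Pattern KEYED = Pattern.compile(
            "^(.+?)\\[(#?)(\\d+)([\\t|]?)\\](?:\\{([^}]+)\\})?:(.*)$");

    /** Returns {key, length, delimiterMarker, fields, rest}, or null if the line is not a keyed array header. */
    static String[] parse(String line) {
        Matcher m = KEYED.matcher(line.trim());
        if (!m.matches()) {
            return null;
        }
        // fields (index 3) is null when the header has no {…} field list
        return new String[] {
            m.group(1), m.group(3), m.group(4), m.group(5), m.group(6).trim()
        };
    }
}
```

A plain key-value line such as name: Ada does not match because it has no bracketed length, so it would fall through to object parsing.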

Depth Tracking

Indentation determines nesting level:

user:              // depth 0
  id: 123          // depth 1
  contact:         // depth 1
    email: a@b.c   // depth 2
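The depth of a line can be derived by counting leading spaces. The sketch below assumes a fixed indent size (2 in the example above) and rejects indentation that is not a multiple of it; the class and method names are hypothetical, and the actual decoder may obtain the indent size differently.

```java
// Hypothetical sketch of indentation-based depth calculation.
final class DepthSketch {

    /** Returns the nesting depth of a line, given the number of spaces per level. */
    static int depthOf(String line, int indentSize) {
        int spaces = 0;
        while (spaces < line.length() && line.charAt(spaces) == ' ') {
            spaces++;
        }
        if (spaces % indentSize != 0) {
            // Assumed strict behaviour: inconsistent indentation is an error.
            throw new IllegalArgumentException(
                    "Indentation of " + spaces + " spaces is not a multiple of " + indentSize);
        }
        return spaces / indentSize;
    }
}
```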

Delimiter Awareness

Delimiter is detected from array headers or configured via DecodeOptions:

  • [2] → comma (implicit)
  • [2\t] → tab (a literal tab character before the closing bracket, written here as \t)
  • [2|] → pipe (explicit)
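Header-based detection can be sketched by inspecting the character just before the closing bracket. The class and method names are hypothetical, and the sketch ignores the DecodeOptions fallback and quoted keys that contain brackets.

```java
// Hypothetical sketch of delimiter detection from an array header.
final class DelimiterSketch {

    /** Returns the delimiter implied by the header: tab, pipe, or comma by default. */
    static char detect(String header) {
        int open = header.indexOf('[');
        int close = open < 0 ? -1 : header.indexOf(']', open + 1);
        if (close > open + 1) {
            char marker = header.charAt(close - 1);
            if (marker == '\t' || marker == '|') {
                return marker; // explicit marker inside the brackets
            }
        }
        return ','; // no marker: comma is implied
    }
}
```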

Decoding Process

  1. Entry: ValueDecoder receives TOON string and DecodeOptions
  2. Line Splitting: Input split into array of lines
  3. Pattern Detection: Each line analyzed for array headers, key-value pairs
  4. Depth Calculation: Leading spaces determine nesting level
  5. Delegation: Route to ObjectDecoder, ArrayDecoder, or PrimitiveDecoder
  6. Recursive Parsing: Nested structures processed recursively with depth tracking
  7. Object Assembly: Maps and Lists built from parsed components
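The steps above can be illustrated with a deliberately small recursive parser that handles nested objects only. It is a sketch under several assumptions: two-space indentation, no arrays, no quoted keys, and values kept as raw strings. The class name is hypothetical, and nesting is detected from an empty value rather than through the lookahead the real ObjectDecoder uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical miniature decoder: line splitting, depth calculation,
// recursive parsing, and Map assembly for nested objects only.
final class MiniObjectDecoderSketch {

    private final String[] lines;
    private int index = 0; // current line index, shared across recursive calls

    private MiniObjectDecoderSketch(String input) {
        this.lines = input.split("\n");
    }

    static Map<String, Object> decode(String input) {
        return new MiniObjectDecoderSketch(input).parseObject(0);
    }

    private static int depthOf(String line) {
        int spaces = 0;
        while (spaces < line.length() && line.charAt(spaces) == ' ') {
            spaces++;
        }
        return spaces / 2; // assumes two spaces per level
    }

    private Map<String, Object> parseObject(int depth) {
        Map<String, Object> result = new LinkedHashMap<>();
        while (index < lines.length) {
            String line = lines[index];
            if (line.trim().isEmpty()) {
                index++;
                continue;
            }
            int lineDepth = depthOf(line);
            if (lineDepth < depth) {
                break; // line belongs to an enclosing object
            }
            if (lineDepth > depth) {
                throw new IllegalArgumentException("Unexpected indentation at line " + (index + 1));
            }
            String content = line.trim();
            int colon = content.indexOf(':');
            if (colon < 0) {
                throw new IllegalArgumentException("Expected key: value at line " + (index + 1));
            }
            String key = content.substring(0, colon).trim();
            String value = content.substring(colon + 1).trim();
            index++;
            if (value.isEmpty()) {
                result.put(key, parseObject(depth + 1)); // nested object
            } else {
                result.put(key, value); // raw string; no type inference here
            }
        }
        return result;
    }
}
```

Keeping the line index as shared state is what lets each recursive call resume where the nested call stopped, which matches the Parser state described for ValueDecoder.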

Array Format Detection

Tabular Arrays

Header with field specification:

users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
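Turning tabular rows into Maps amounts to zipping the header's field names with each row's cells. The sketch below assumes a comma delimiter, performs a naive split that ignores quoting, and keeps cell values as raw strings; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: pairing tabular field names with row cells.
final class TabularSketch {

    static List<Map<String, String>> rows(List<String> fields, List<String> lines) {
        List<Map<String, String>> out = new ArrayList<>();
        for (String line : lines) {
            // Naive split: a real decoder must respect quoted cells.
            String[] cells = line.trim().split(",", -1);
            if (cells.length != fields.size()) {
                throw new IllegalArgumentException(
                        "Expected " + fields.size() + " cells but found " + cells.length);
            }
            Map<String, String> row = new LinkedHashMap<>(); // preserves field order
            for (int i = 0; i < cells.length; i++) {
                row.put(fields.get(i), cells[i].trim());
            }
            out.add(row);
        }
        return out;
    }
}
```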

List Arrays

Next line after header starts with "- ":

items[2]:
  - id: 1
    name: First
  - id: 2
    name: Second

Primitive Arrays

Inline or multiline values without field spec or list markers:

tags[3]: reading,gaming,coding

// or multiline:
tags[3]:
  reading,gaming,coding
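The three formats can be told apart from the header and, when the header carries no inline values, the line that follows it. The sketch below is one possible classification routine; the enum, class, and method names are hypothetical and the checks are simplified.

```java
// Hypothetical sketch of array format detection.
final class ArrayKindSketch {

    enum Kind { TABULAR, LIST, PRIMITIVE }

    /** Classifies an array from its header line and the following line (may be null). */
    static Kind classify(String header, String nextLine) {
        int close = header.indexOf(']');
        if (close < 0) {
            throw new IllegalArgumentException("Not an array header: " + header);
        }
        String afterBracket = header.substring(close + 1);
        if (afterBracket.startsWith("{")) {
            return Kind.TABULAR; // field specification follows the length
        }
        String inline = afterBracket.substring(afterBracket.indexOf(':') + 1).trim();
        if (inline.isEmpty() && nextLine != null && nextLine.trim().startsWith("- ")) {
            return Kind.LIST; // next line carries a list marker
        }
        return Kind.PRIMITIVE; // inline values, or a plain multiline row
    }
}
```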

Error Handling

Strict Mode (default)

  • Throws IllegalArgumentException on malformed input
  • Validates indentation consistency
  • Requires valid array headers

Lenient Mode

  • Best-effort parsing
  • Returns null on invalid input
  • Skips malformed lines

Special Parsing Cases

Quoted Keys

findUnquotedColon() ignores colons inside quoted keys, so only an unquoted colon separates key from value:

"order:id": 7       // Colon inside quotes is literal
name: Ada           // Unquoted colon separates key/value
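A quote-aware colon search can be sketched as a single pass that tracks whether the scan is inside double quotes. Only the method name findUnquotedColon comes from this package; the signature, return convention (-1 when absent), and escape handling below are assumptions.

```java
// Hypothetical sketch of a quote-aware colon search.
final class ColonSketch {

    /** Returns the index of the first colon outside double quotes, or -1 if there is none. */
    static int findUnquotedColon(String line) {
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (c == '\\' && inQuotes) {
                i++; // skip the escaped character, e.g. \" inside a quoted key
            } else if (c == '"') {
                inQuotes = !inQuotes;
            } else if (c == ':' && !inQuotes) {
                return i;
            }
        }
        return -1;
    }
}
```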

Escaped Strings

Delegates to StringEscaper.unescape() for:

  • \n → newline
  • \t → tab
  • \" → quote
  • \\ → backslash
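The four escapes listed above can be handled with a single left-to-right pass. This sketch is not StringEscaper itself; the class name is hypothetical, and keeping unknown escapes verbatim is an assumption (a strict decoder might reject them instead).

```java
// Hypothetical sketch of unescaping; not the library's StringEscaper.
final class UnescapeSketch {

    static String unescape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '\\' || i + 1 == s.length()) {
                out.append(c); // ordinary character, or a trailing lone backslash
                continue;
            }
            char next = s.charAt(++i);
            switch (next) {
                case 'n':  out.append('\n'); break;
                case 't':  out.append('\t'); break;
                case '"':  out.append('"');  break;
                case '\\': out.append('\\'); break;
                default:
                    // Assumption: unknown escapes are kept as written.
                    out.append('\\').append(next);
            }
        }
        return out.toString();
    }
}
```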

Delimiter Parsing

Respects quotes when splitting values:

// Input: a,"b,c",d
// Splits to: ["a", "b,c", "d"]  (comma inside quotes preserved)
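A quote-aware split can be sketched by toggling a flag on each double quote and splitting only while the flag is off. The class and method names are hypothetical. Note that this sketch drops the surrounding quotes, so a real decoder would need to remember which values were quoted before applying type inference.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of delimiter splitting that respects double quotes.
final class SplitSketch {

    static List<String> split(String row, char delimiter) {
        List<String> out = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < row.length(); i++) {
            char c = row.charAt(i);
            if (c == '\\' && inQuotes && i + 1 < row.length()) {
                // Keep the escape pair intact for a later unescaping step.
                current.append(c).append(row.charAt(++i));
            } else if (c == '"') {
                inQuotes = !inQuotes; // quotes delimit the value but are not part of it
            } else if (c == delimiter && !inQuotes) {
                out.add(current.toString());
                current.setLength(0);
            } else {
                current.append(c);
            }
        }
        out.add(current.toString()); // last value has no trailing delimiter
        return out;
    }
}
```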

Architecture Benefits

Single Responsibility

Each decoder has one clear responsibility:

  • ValueDecoder: Parse state management and routing
  • PrimitiveDecoder: Type inference for scalars
  • ObjectDecoder: Key-value parsing with nesting
  • ArrayDecoder: Array type detection and delegation

Maintainability

  • Clear separation between array types
  • Recursive descent mirrors TOON's indentation structure
  • Regex patterns document expected formats
  • Testable in isolation
Since:
0.2.0
See Also: