Advanced Features
This guide covers advanced features and use cases for TokenOrientedObjectNotation.jl.
Key Folding and Path Expansion
Key folding and path expansion are complementary features for working with deeply nested objects.
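To make the two operations concrete, here is a minimal sketch in plain Julia that does not use the package at all. The helper names `fold_keys` and `expand_paths` are hypothetical and exist only for illustration. The sketch also ignores whatever safety checks the package's "safe" mode applies and any depth limit, so treat it as a picture of the idea rather than the library's implementation.

```julia
# Illustrative helpers only; these are NOT part of TokenOrientedObjectNotation.jl.

# Flatten nested Dicts into a single Dict keyed by dotted paths.
function fold_keys(data::AbstractDict, prefix::String="")
    flat = Dict{String,Any}()
    for (key, value) in data
        path = isempty(prefix) ? String(key) : string(prefix, ".", key)
        if value isa AbstractDict && !isempty(value)
            merge!(flat, fold_keys(value, path))   # recurse into nested objects
        else
            flat[path] = value                     # leaf value (or empty object)
        end
    end
    return flat
end

# Rebuild nested Dicts from dotted paths, raising an error on overlapping keys.
function expand_paths(flat::AbstractDict)
    nested = Dict{String,Any}()
    for (path, value) in flat
        segments = String.(split(path, "."))
        node = nested
        for segment in segments[1:end-1]
            child = get!(node, segment, Dict{String,Any}())
            child isa AbstractDict ||
                error("Cannot expand path '$path': segment '$segment' is not an object")
            node = child
        end
        leaf = segments[end]
        haskey(node, leaf) &&
            error("Cannot expand path '$path': '$leaf' already exists as an object")
        node[leaf] = value
    end
    return nested
end

original = Dict("a" => Dict("b" => Dict("c" => 42)))
flat = fold_keys(original)                 # one entry: "a.b.c" => 42
@assert expand_paths(flat) == original     # folding then expanding round-trips
```

The two error branches in `expand_paths` cover both iteration orders of an overlapping pair such as `a` and `a.b`, since `Dict` iteration order is not guaranteed.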
Key Folding (Encoding)
Flatten nested objects into dotted keys:
data = Dict(
    "database" => Dict(
        "host" => "localhost",
        "port" => 5432,
        "credentials" => Dict(
            "username" => "admin",
            "password" => "secret"
        )
    )
)
# Without folding
TOON.encode(data)
# database:
#   host: localhost
#   port: 5432
#   credentials:
#     username: admin
#     password: secret
# With folding
options = TOON.EncodeOptions(keyFolding="safe")
TOON.encode(data, options=options)
# database.host: localhost
# database.port: 5432
# database.credentials.username: admin
# database.credentials.password: secret
Path Expansion (Decoding)
Expand dotted keys back into nested objects:
input = """
database.host: localhost
database.port: 5432
database.credentials.username: admin
database.credentials.password: secret
"""
options = TOON.DecodeOptions(expandPaths="safe")
data = TOON.decode(input, options=options)
# Dict("database" => Dict(
#     "host" => "localhost",
#     "port" => 5432,
#     "credentials" => Dict(
#         "username" => "admin",
#         "password" => "secret"
#     )
# ))
Round-trip Compatibility
Key folding and path expansion are designed to work together:
original = Dict("a" => Dict("b" => Dict("c" => 42)))
# Encode with folding
encode_opts = TOON.EncodeOptions(keyFolding="safe")
encoded = TOON.encode(original, options=encode_opts)
# a.b.c: 42
# Decode with expansion
decode_opts = TOON.DecodeOptions(expandPaths="safe")
decoded = TOON.decode(encoded, options=decode_opts)
# Dict("a" => Dict("b" => Dict("c" => 42)))
# original == decoded ✓
Conflict Detection
Path expansion detects conflicts when keys overlap:
# Conflict: 'a' is both a primitive and an object
input = """
a: 1
a.b: 2
"""
# Strict mode: error
options = TOON.DecodeOptions(expandPaths="safe", strict=true)
try
    TOON.decode(input, options=options)
catch e
    println(e)  # "Cannot expand path 'a.b': segment 'a' already exists as non-object"
end
# Non-strict mode: last-write-wins
options = TOON.DecodeOptions(expandPaths="safe", strict=false)
data = TOON.decode(input, options=options)
# Dict("a" => Dict("b" => 2))  # 'a: 1' is overwritten
Depth Limiting
Control how deep folding goes:
data = Dict("a" => Dict("b" => Dict("c" => Dict("d" => 42))))
# Fold only 2 levels
options = TOON.EncodeOptions(keyFolding="safe", flattenDepth=2)
TOON.encode(data, options=options)
# a.b:
#   c:
#     d: 42
Delimiter Selection
Choose the right delimiter for your use case.
Comma (Default)
Best for general purpose use:
users = [Dict("name" => "Alice", "age" => 30)]
TOON.encode(Dict("users" => users))
# users[1]{name,age}:
#   Alice,30
Pros:
- Most compact
- Familiar to JSON users
- Works well with most data
Cons:
- Requires quoting if values contain commas
Tab
Best for TSV-like data:
options = TOON.EncodeOptions(delimiter=TOON.TAB)
users = [Dict("name" => "Alice", "age" => 30)]
TOON.encode(Dict("users" => users), options=options)
# users[1 ]{name age}:
#   Alice 30
# (the gaps after "1", "name", and "Alice" are tab characters)
Pros:
- Easy to parse programmatically
- Natural for spreadsheet data
- Rarely needs quoting
Cons:
- Less readable in some contexts
- Invisible character
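Because tabs are invisible, it can help to render them explicitly when inspecting output. The sketch below uses only `escape_string` and `count` from Julia's Base; the row string is written by hand here rather than produced by the package.

```julia
# A tab-delimited row, written by hand for illustration.
row = "Alice\t30"

# escape_string renders control characters as visible escape sequences.
visible = escape_string(row)
println(visible)   # prints Alice\t30

# Count the tab delimiters in the row.
ntabs = count(==('\t'), row)
@assert ntabs == 1
```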
Pipe
Best for visual separation:
options = TOON.EncodeOptions(delimiter=TOON.PIPE)
users = [Dict("name" => "Alice", "age" => 30)]
TOON.encode(Dict("users" => users), options=options)
# users[1|]{name|age}:
#   Alice|30
Pros:
- Very readable
- Clear visual separation
- Database/SQL-like
Cons:
- Requires quoting if values contain pipes
- Slightly less compact
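One rough way to apply the trade-offs above is to check which delimiter collides least with your actual values, since every collision forces quoting. The helper `pick_delimiter` below is hypothetical and not part of the package. It returns the delimiter as a plain string and leaves the mapping to the package's delimiter constants (such as `TOON.TAB` or `TOON.PIPE`) to the caller.

```julia
# Hypothetical helper, NOT part of TokenOrientedObjectNotation.jl:
# count how many string values contain each candidate delimiter and
# return the candidate with the fewest collisions.
function pick_delimiter(records; candidates=[",", "\t", "|"])
    hits = Dict(d => 0 for d in candidates)
    for record in records
        for value in values(record)
            value isa AbstractString || continue   # only strings can collide
            for d in candidates
                if occursin(d, value)
                    hits[d] += 1
                end
            end
        end
    end
    # argmin on a vector returns the first minimal index, so ties are
    # resolved in favor of the earlier candidate (comma first by default).
    counts = [hits[d] for d in candidates]
    return candidates[argmin(counts)]
end

records = [
    Dict("name" => "Smith, Alice", "note" => "a|b"),
    Dict("name" => "Bob", "note" => "x, y"),
]
@assert pick_delimiter(records) == "\t"   # commas collide twice, pipes once, tabs never
```

This only counts delimiter collisions; it does not model any other quoting rule the format may impose.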
Working with Large Data
Streaming Considerations
For very large datasets, consider:
- Chunking: Process data in smaller batches
- Tabular format: Use tabular arrays for uniform data
- Delimiter choice: Tabs are fastest to parse
# Process in chunks
function encode_large_dataset(records, chunk_size=1000)
    chunks = String[]  # one encoded TOON document per chunk
    for i in 1:chunk_size:length(records)
        chunk = records[i:min(i+chunk_size-1, length(records))]
        push!(chunks, TOON.encode(Dict("data" => chunk)))
    end
    return chunks
end
Memory Efficiency
TokenOrientedObjectNotation.jl prioritizes correctness over performance, but you can still reduce output size and parsing cost:
# Use tabular format for uniform data (most compact)
users = [Dict("id" => i, "name" => "User$i") for i in 1:10000]
toon_str = TOON.encode(Dict("users" => users))
# Use appropriate delimiter (tabs are fastest)
options = TOON.EncodeOptions(delimiter=TOON.TAB)
toon_str = TOON.encode(Dict("users" => users), options=options)
Custom Indentation
Match your team's style preferences:
# 2 spaces (default, most compact)
options = TOON.EncodeOptions(indent=2)
# 4 spaces (common in many languages)
options = TOON.EncodeOptions(indent=4)
# 8 spaces (very readable)
options = TOON.EncodeOptions(indent=8)
Error Recovery
Handle errors gracefully in production:
function safe_decode(input::String)
    try
        # Try strict mode first
        return TOON.decode(input, options=TOON.DecodeOptions(strict=true))
    catch e
        @warn "Strict decoding failed, trying lenient mode" exception=e
        try
            # Fall back to lenient mode
            return TOON.decode(input, options=TOON.DecodeOptions(strict=false))
        catch e2
            @error "Decoding failed completely" exception=e2
            return nothing
        end
    end
end
Integration with Other Formats
From JSON
using JSON
# JSON to TOON
json_str = """{"name": "Alice", "age": 30}"""
data = JSON.parse(json_str)
toon_str = TOON.encode(data)
# TOON to JSON
toon_str = "name: Alice\nage: 30"
data = TOON.decode(toon_str)
json_str = JSON.json(data)
From CSV/TSV
using CSV, DataFrames
# CSV to TOON
df = CSV.read("data.csv", DataFrame)
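# Column names are Symbols; convert them to String keys before encoding
records = [Dict(string(k) => v for (k, v) in pairs(row)) for row in eachrow(df)]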
toon_str = TOON.encode(Dict("data" => records))
# Use tab delimiter for TSV-like output
options = TOON.EncodeOptions(delimiter=TOON.TAB)
toon_str = TOON.encode(Dict("data" => records), options=options)
Performance Tips
- Use tabular format - Most compact for uniform data
- Choose appropriate delimiter - Tabs are fastest to parse
- Limit nesting depth - Flatter structures are faster
- Batch operations - Process multiple records together
- Reuse options - Create options once, reuse many times
# Good: reuse options
options = TOON.EncodeOptions(delimiter=TOON.TAB)
for data in datasets
    toon_str = TOON.encode(data, options=options)
    # process...
end
# Bad: create options every time
for data in datasets
    toon_str = TOON.encode(data, options=TOON.EncodeOptions(delimiter=TOON.TAB))
    # process...
end
Next Steps
- See Examples for real-world use cases
- Check API Reference for complete function documentation
- Review Compliance for specification details