Commit 2f73996

Copilot and streamich committed
Add comprehensive README documentation for Smile codec
Co-authored-by: streamich <9773803+streamich@users.noreply.github.com>
1 parent 00a933c commit 2f73996

1 file changed: +106 -0 lines changed

src/smile/README.md

Lines changed: 106 additions & 0 deletions
@@ -0,0 +1,106 @@
# Smile Codec

A high-performance JavaScript implementation of the Smile binary format, compatible with the JSON data model.

## Overview

Smile is an efficient JSON-compatible binary data format developed by the Jackson JSON processor project team. It provides significant space and parsing efficiency improvements over JSON while maintaining full compatibility with the JSON data model.

## Features

- **Full Smile Format Support**: Implements Smile format specification v1.0.6
- **High Performance**: Optimized encoder and decoder for maximum throughput
- **Shared String Optimization**: Automatic deduplication of repeated strings and property names
- **Safe Binary Encoding**: 7-bit encoding to avoid reserved byte values
- **Comprehensive Type Support**: Covers the JSON value types plus binary data (`Uint8Array`)
- **Round-trip Integrity**: Perfect fidelity for all supported data types

## Usage

### Basic Encoding/Decoding

```typescript
import {SmileEncoder, SmileDecoder} from '@jsonjoy.com/json-pack/smile';

// Encode JavaScript value to Smile binary format
const encoder = SmileEncoder.create();
const encoded = encoder.encode({name: 'John', age: 30});

// Decode Smile binary format back to JavaScript value
const decoder = SmileDecoder.create(encoded);
const decoded = decoder.decode();
console.log(decoded); // {name: 'John', age: 30}
```

### Configuration Options

```typescript
// Encoder options
const encoder = SmileEncoder.create({
  sharedStringValues: true, // Enable shared value string optimization
  sharedPropertyNames: true, // Enable shared property name optimization
  rawBinaryEnabled: false // Allow raw binary data (may contain reserved bytes)
});

// Decoder options
const decoder = SmileDecoder.create(encoded, {
  maxSharedReferences: 1024 // Maximum size for shared string tables
});
```

## Data Type Support

| JavaScript Type | Smile Encoding | Notes |
|-----------------|----------------|-------|
| `null` | Null token | Single byte |
| `boolean` | Boolean tokens | Single byte each |
| `number` (integer) | Variable-length integer | ZigZag encoded, 1-10 bytes |
| `number` (float) | IEEE 754 float/double | 7-bit encoded, 5/10 bytes |
| `string` | UTF-8 with length prefix | Optimized for ASCII, shared references |
| `Array` | Structure markers + elements | Nested encoding |
| `Object` | Structure markers + key/value pairs | Shared property names |
| `Uint8Array` | Safe binary encoding | 7-bit encoding by default |

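To make the "ZigZag encoded" variable-length integer row concrete, here is a minimal sketch limited to 32-bit values. It assumes the VInt layout as commonly described for Smile: 7 data bits per byte, most significant group first, with the final byte carrying the low 6 bits and the `0x80` marker. The function names are hypothetical and are not taken from this codec, and the type token that precedes the VInt on the wire is omitted.

```typescript
// ZigZag maps signed to unsigned so small magnitudes stay small:
// 0, -1, 1, -2 become 0, 1, 2, 3.
function zigzag32(n: number): number {
  return ((n << 1) ^ (n >> 31)) >>> 0;
}

// Inverse of zigzag32.
function unzigzag32(z: number): number {
  return (z >>> 1) ^ -(z & 1);
}

// Assumed VInt layout: leading bytes hold 7 bits each (high bit clear),
// the last byte holds the low 6 bits with 0x80 set as the end marker.
function writeVInt(value: number): Uint8Array {
  const last = 0x80 | (value & 0x3f);
  let rest = Math.floor(value / 64); // division avoids 32-bit shift overflow
  const head: number[] = [];
  while (rest > 0) {
    head.unshift(rest & 0x7f);
    rest = Math.floor(rest / 128);
  }
  return new Uint8Array(head.concat(last));
}

function readVInt(bytes: Uint8Array): number {
  let value = 0;
  for (let i = 0; i < bytes.length; i++) {
    const b = bytes[i];
    if (b & 0x80) return value * 64 + (b & 0x3f); // end marker reached
    value = value * 128 + b;
  }
  throw new Error('Unterminated VInt');
}
```

Under these assumptions, `readVInt(writeVInt(zigzag32(n)))` followed by `unzigzag32` returns `n` for any signed 32-bit integer, and the VInt payload never exceeds 5 bytes for that range.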
## Performance Characteristics

- **Small Integers (-16 to +15)**: Single byte encoding
- **ASCII Strings (1-64 chars)**: Length-prefixed, no end markers
- **Shared Strings**: Automatic deduplication reduces size
- **Binary Data**: Safe 7-bit encoding avoids parsing issues

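The single-byte small-integer case can be sketched as follows. This assumes the token layout usually given for Smile, where small integers occupy the range `0xC0` to `0xDF` and the low 5 bits hold the ZigZag-encoded value. The helper names are hypothetical and do not come from this codec.

```typescript
// Assumed base of the small-integer token range (0xC0-0xDF).
const SMALL_INT_BASE = 0xc0;

// Encode an integer in [-16, 15] as one token byte.
function encodeSmallInt(n: number): number {
  if (!Number.isInteger(n) || n < -16 || n > 15) {
    throw new RangeError('Value does not fit the single-byte form: ' + n);
  }
  const zigzag = (n << 1) ^ (n >> 31); // maps -16..15 onto 0..31
  return SMALL_INT_BASE + zigzag;
}

// Decode a small-integer token byte back to its value.
function decodeSmallInt(token: number): number {
  const zigzag = token & 0x1f;
  return (zigzag >>> 1) ^ -(zigzag & 1);
}
```

Values outside the range need the longer variable-length form, which is why the sketch throws instead of silently truncating.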
## Format Details

Smile format uses a 4-byte header followed by token-based encoding:

```
Header: 0x3A 0x29 0x0A [config byte]
        :    )    \n   version + flags
```
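
As an illustration of the header layout above, the following sketch validates the three signature bytes and unpacks the config byte. The bit assignments (version in the high 4 bits, `0x01` for shared property names, `0x02` for shared string values, `0x04` for raw binary) are my reading of the Smile specification and should be checked against it. `parseSmileHeader` and `SmileHeader` are hypothetical names, not part of this codec's API.

```typescript
interface SmileHeader {
  version: number;
  sharedPropertyNames: boolean;
  sharedStringValues: boolean;
  rawBinaryEnabled: boolean;
}

function parseSmileHeader(bytes: Uint8Array): SmileHeader {
  // Signature bytes spell ":)\n".
  if (bytes.length < 4 || bytes[0] !== 0x3a || bytes[1] !== 0x29 || bytes[2] !== 0x0a) {
    throw new Error('Not a Smile document: missing ":)\\n" signature');
  }
  const config = bytes[3];
  return {
    version: config >> 4, // assumed: high nibble carries the version
    sharedPropertyNames: (config & 0x01) !== 0,
    sharedStringValues: (config & 0x02) !== 0,
    rawBinaryEnabled: (config & 0x04) !== 0,
  };
}
```

For example, a document starting with `0x3A 0x29 0x0A 0x01` would be reported as version 0 with only shared property names enabled, under the assumed bit layout.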

The format distinguishes between:
- **Value Mode**: For encoding data values
- **Key Mode**: For encoding object property names

## Limitations

- Very large integers beyond JavaScript's safe integer range are converted to strings
- Maximum shared reference table size is 1024 entries
- Some byte values (0xFE, 0xFF) are avoided in shared references as per specification

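The first limitation can be pictured with a small sketch. It only illustrates the idea of falling back to a string when a value cannot be represented exactly as a JavaScript `number`; the helper `toJsInteger` and its decimal-string input are assumptions made for the example and do not describe how the decoder is actually implemented.

```typescript
// Hypothetical helper: return a number when the integer is exactly
// representable, otherwise keep the decimal text so no digits are lost.
function toJsInteger(decimal: string): number | string {
  const n = Number(decimal);
  return Number.isSafeInteger(n) ? n : decimal;
}
```

Callers that need arithmetic on the string results would have to convert them themselves, for instance with `BigInt`.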
## Specification Compliance

This implementation follows the official Smile format specification v1.0.6, including:
- Proper header validation and version checking
- Correct token encoding for all value types
- Shared string reference management with table rotation
- Safe binary encoding to avoid reserved byte values
- Long string handling with end markers

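The "safe binary encoding" item refers to repacking 8-bit bytes into 7-bit groups so that no output byte has its high bit set. The sketch below shows one way to do that, emitting groups most significant bit first and right-aligning any leftover bits in the final byte. That padding choice is an assumption about the wire format, and `encode7Bit` and `decode7Bit` are hypothetical names rather than this codec's functions.

```typescript
// Pack bytes into 7-bit groups; every output byte is <= 0x7F.
function encode7Bit(data: Uint8Array): Uint8Array {
  const out: number[] = [];
  let acc = 0; // bit accumulator
  let bits = 0; // number of pending bits in acc
  for (let i = 0; i < data.length; i++) {
    acc = (acc << 8) | data[i];
    bits += 8;
    while (bits >= 7) {
      bits -= 7;
      out.push((acc >> bits) & 0x7f);
    }
    acc &= (1 << bits) - 1; // keep only the pending bits
  }
  if (bits > 0) out.push(acc); // leftover bits, right-aligned
  return new Uint8Array(out);
}

// Reverse the packing; the original byte length must be known,
// since the final group may carry fewer than 7 bits.
function decode7Bit(encoded: Uint8Array, byteLength: number): Uint8Array {
  const out = new Uint8Array(byteLength);
  let acc = 0;
  let bits = 0;
  let pos = 0;
  let remaining = byteLength * 8; // payload bits still to read
  for (let i = 0; i < encoded.length && pos < byteLength; i++) {
    const width = Math.min(7, remaining);
    acc = (acc << width) | encoded[i];
    bits += width;
    remaining -= width;
    while (bits >= 8) {
      bits -= 8;
      out[pos++] = (acc >> bits) & 0xff;
    }
    acc &= (1 << bits) - 1;
  }
  return out;
}
```

The cost of this scheme is roughly one extra byte for every seven bytes of input, which is the trade-off against the `rawBinaryEnabled` option shown earlier.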
## Testing

The implementation includes comprehensive tests:
- Round-trip integrity tests
- Fuzzing tests with random data structures
- Edge case handling for large numbers and Unicode
- Shared string optimization verification
- Binary data encoding/decoding validation
