Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
164 changes: 164 additions & 0 deletions ISSUES_FOUND.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,164 @@
# Issues Found in JSON Encoding Fuzzer Test

## Summary

This document describes the issues identified and the changes made to expose them more consistently.

## Issues Identified

### 1. Floating Point Precision Loss in Random Number Generation

**Location**: `packages/json-type/src/random/Random.ts`, lines 118-122

**Problem**: When generating random numbers for integer types (u8, u16, u32, etc.), the `Random.num()` method adds/subtracts a tiny value (`0.000000000000001`) to avoid exact boundary values:

```typescript
if (gte !== undefined)
if (gte === lte) return gte;
else min = gte + 0.000000000000001; // ← Adds tiny float
if (lte !== undefined) max = lte - 0.000000000000001; // ← Subtracts tiny float
```

This results in values like `4.000000000000001` for the `Configuration.database.port` field (which is defined as `u16` with `gte: 1, lte: 65535`).

When this value is encoded to JSON and decoded back, JSON's number-to-string conversion loses the tiny decimal part, resulting in `4`. This causes the test assertion `expect(decoded).toEqual(json)` to fail because `4.000000000000001 !== 4`.

**Impact**: Intermittent test failures when random generation produces these edge-case floating point values for integer types.

### 2. Stale Buffer References in Generated Code

**Location**: `packages/json-type/src/codegen/binary/AbstractBinaryCodegen.ts`, lines 43-45

**Problem**: The generated encoder code caches references to the Writer's `uint8` array and `view` at the start:

```javascript
var writer = encoder.writer;
writer.ensureCapacity(capacityEstimator(r0));
var uint8 = writer.uint8, view = writer.view; // ← Cached references
```

If the buffer grows during encoding (e.g., when `encoder.writeStr()` calls `writer.ensureCapacity()` internally), these cached references become stale and point to the old, discarded buffer.

The generated code directly uses these cached references:
```javascript
uint8[writer.x - ...] = ...; // ← Uses stale reference if buffer grew
```

**Impact**: Potential data corruption or encoding errors when the initial capacity estimate is insufficient and the buffer needs to grow during encoding.

### 3. Potential Capacity Estimation Underestimation

**Location**: `packages/json-type/src/codegen/capacity/CapacityEstimatorCodegen.ts`

**Problem**: The capacity estimator uses conservative overhead estimates from `MaxEncodingOverhead` constants. However, these may not account for all edge cases:

- Unicode strings that require more bytes than ASCII
- Escaped characters in strings (`, \, newlines, etc.)
- The multiplier for string length (`StringLengthMultiplier = 5`) may be insufficient for worst-case scenarios

**Impact**: When combined with a small initial buffer size, underestimation can lead to multiple buffer reallocations and expose issue #2 (stale buffer references).

## Changes Made

### 1. Reduced Writer Buffer Sizes

**Modified Files**:
- `packages/json-type/src/codegen/binary/json/__tests__/automated.spec.ts`: Changed from `new Writer(16)` to `new Writer(1)`
- `packages/json-type/src/codegen/binary/writer.ts`: Changed default from `new Writer()` (64KB) to `new Writer(1)`

**Purpose**: By using extremely small initial buffer sizes (1 byte), we force the buffer to grow during encoding. This exposes:
- Stale buffer reference bugs
- Capacity estimation inaccuracies
- Edge cases in buffer management

### 2. Added Comprehensive Test Suites

**New Test Files**:

1. **`precision-edge-case.spec.ts`**: Tests floating point precision issues
- Demonstrates values like `4.000000000000001` becoming `4` after JSON round-trip
- Tests with minimal buffers

2. **`buffer-capacity.spec.ts`**: Stress tests capacity estimation
- Tests with buffer sizes from 1 to 128 bytes
- Runs 1000 iterations with random data
- Exposes underestimation issues

3. **`stale-buffer-bug.spec.ts`**: Targets the stale reference bug
- Uses 1-byte buffer with long strings to force growth
- Tests with many fields
- Tests with unicode strings that need more bytes

4. **`inspect-codegen.spec.ts`**: Debugging helper
- Prints generated encoder code for inspection

5. **`comprehensive-issues.spec.ts`**: Combined stress test
- Tests all three issues together
- Runs 100 iterations with random data and minimal buffer
- Categorizes failures into float precision vs other issues

## Expected Outcomes

With these changes, the tests should now:

1. **Fail more consistently** when the floating point precision issue occurs
2. **Expose the stale buffer reference bug** when encoding with insufficient initial capacity
3. **Identify specific cases** where capacity estimation is insufficient

## Recommendations for Fixes

### Fix #1: Floating Point Precision in Random.num()

Change lines 118-122 in `Random.ts` to not add/subtract tiny values for integer formats:

```typescript
if (gte !== undefined) {
if (gte === lte) return gte;
else {
min = gte;
// For integer formats, ensure we don't add floating point offsets
if (!schema.format || floats.has(schema.format)) {
min += 0.000000000000001;
}
}
}
```

Or better yet, for integer formats, ensure the final result is always an integer.

### Fix #2: Stale Buffer References

Option A: Don't cache `uint8` and `view` references. Access them through `writer` each time:
```javascript
writer.uint8[writer.x - ...] = ...;
```

Option B: Refresh cached references after any operation that might grow the buffer:
```javascript
if (/* some operation that might have grown buffer */) {
uint8 = writer.uint8;
view = writer.view;
}
```

Option C: Ensure capacity is always sufficient by being more conservative in estimation.

### Fix #3: Improve Capacity Estimation

Review and increase the overhead constants in `MaxEncodingOverhead` to be more conservative, especially:
- `StringLengthMultiplier`: Consider increasing from 5 to 6 or 7
- Account for worst-case unicode (4 bytes per char + JSON escaping)
- Add safety margin to prevent underestimation

## Testing Strategy

The new tests are designed to:
1. **Always fail** if the issues exist (by using minimal buffers)
2. **Clearly identify** which specific issue caused the failure
3. **Log details** about problematic values for debugging
4. **Stress test** the system with random data to find edge cases

Run these tests to validate any fixes:
```bash
yarn test packages/json-type/src/codegen/binary/json/__tests__/
```
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,9 @@ import {JsonCodegen} from '../JsonCodegen';
import {Random} from '../../../../random';
import {allSerializableTypes} from '../../../../__tests__/fixtures';

const encoder = new JsonEncoder(new Writer(16));
// Reduced buffer size from 16 to 1 byte to stress-test capacity estimation
// This will expose any bugs where the capacity estimator underestimates the required buffer size
const encoder = new JsonEncoder(new Writer(1));
const decoder = new JsonDecoder();

for (const [name, type] of Object.entries(allSerializableTypes)) {
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,106 @@
import {Writer} from '@jsonjoy.com/buffers/lib/Writer';
import {JsonDecoder} from '@jsonjoy.com/json-pack/lib/json/JsonDecoder';
import {JsonEncoder} from '@jsonjoy.com/json-pack/lib/json/JsonEncoder';
import {JsonCodegen} from '../JsonCodegen';
import {Random} from '../../../../random';
import {Configuration} from '../../../../__tests__/fixtures';

describe('Buffer capacity estimation stress tests', () => {
test('should handle Configuration encoding with very small initial buffer', () => {
// Use extremely small buffer to force many reallocations
// This will expose bugs in capacity estimation
const encoder = new JsonEncoder(new Writer(1));
const decoder = new JsonDecoder();

// Run multiple iterations to increase chance of hitting edge cases
for (let i = 0; i < 1000; i++) {
const json = Random.gen(Configuration);

try {
const fn = JsonCodegen.get(Configuration);
fn(json, encoder);
const encoded = encoder.writer.flush();
const decoded = decoder.decode(encoded);

// The test should pass, but might fail due to precision issues
// or buffer estimation bugs
expect(decoded).toEqual(json);
} catch (error) {
console.log('Failed at iteration', i);
console.log('Generated JSON:', JSON.stringify(json, null, 2));
throw error;
}
}
});

test('should expose floating point precision issues in u16 port numbers', () => {
const encoder = new JsonEncoder(new Writer(1));
const decoder = new JsonDecoder();

// Manually create a case that's likely to trigger the precision issue
// When port has a tiny decimal part, JSON encoding/decoding will lose it
const problematicJson = {
environment: 'development' as const,
database: {
host: 'localhost',
port: 4.000000000000001, // This will become 4 after JSON round-trip
name: 'testdb',
},
features: {},
secrets: {},
};

const fn = JsonCodegen.get(Configuration);
fn(problematicJson, encoder);
const encoded = encoder.writer.flush();
const text = Buffer.from(encoded).toString('utf-8');
const decoded = decoder.decode(encoded);

// Log the actual values to see the precision loss
console.log('Original port:', problematicJson.database.port);
console.log('Encoded text:', text);
console.log('Decoded port:', (decoded as any).database.port);

// This will fail because 4.000000000000001 !== 4 after JSON round-trip
// This demonstrates the precision issue
expect(decoded).toEqual(problematicJson);
});

test('should test all buffer sizes from 1 to 128', () => {
const decoder = new JsonDecoder();

// Test with various small buffer sizes to find which ones work
for (let bufferSize = 1; bufferSize <= 128; bufferSize *= 2) {
const encoder = new JsonEncoder(new Writer(bufferSize));

const json = {
environment: 'production' as const,
database: {
host: 'db.example.com',
port: 5432,
name: 'production_db',
},
features: {
feature1: true,
feature2: false,
feature3: true,
},
secrets: {
secret1: 'value1',
secret2: 'value2',
},
logging: {
level: 'info' as const,
output: '/var/log/app.log',
},
};

const fn = JsonCodegen.get(Configuration);
fn(json, encoder);
const encoded = encoder.writer.flush();
const decoded = decoder.decode(encoded);

expect(decoded).toEqual(json);
}
});
});
Loading
Loading