Bab in TypeScript.
Bab is a cryptographic hash function that lets you incrementally verify parts of a download as they stream in.
I made a kind of elaborate demo page for this.
Installation instructions
```sh
npm i -S @substrate-system/bab-ts
```

See the demo page for an interactive version.
This example shows both sides of streaming verification: the file provider (who has the data) and the file downloader (who wants to verify it).
Root Digest
A cryptographic hash of the entire file, shared upfront via a trusted channel.
Chunk Metadata
Verification data for each chunk, including sibling labels from the Merkle tree path.
The provider has data to share. It should:
- Build metadata for all chunks
- Share the root digest via a trusted channel (out-of-band)
- Stream chunks with their metadata
```ts
import {
    buildVerificationMetadata,
    BabDigest,
    type ChunkVerificationData
} from '@substrate-system/bab-ts'

// The file provider has some data to share
const fileData = new TextEncoder().encode(
    'This is a large file that will be streamed in chunks...'
)

// Build verification metadata for all chunks upfront
const { rootDigest, chunks } = buildVerificationMetadata(fileData)

// Share the root digest via a trusted channel
// (e.g., signed message, QR code)
// The downloader will use this to verify chunks
console.log('Root digest (share this):', rootDigest.toHex())

// Stream each chunk with its metadata
for (const chunk of chunks) {
    // IRL, you would send this over the network
    // The chunk includes both data and verification metadata
    sendToDownloader({
        index: chunk.chunkIndex,
        data: chunk.chunkData,
        // Metadata includes siblingLabels, siblingDirections, and mergeLengths
        metadata: chunk
    })
}
```

The downloader receives:
- The root digest, received via a trusted channel
- Chunks with metadata (streamed incrementally)
```ts
import {
    verifyChunk,
    BabDigest,
    type ChunkVerificationData
} from '@substrate-system/bab-ts'

// Downloader receives the root digest via a trusted channel
// (e.g., from a signed message, a secure webpage, or scanned QR code)
const trustedRootDigest = BabDigest.fromHex(
    'a1b2c3d4...'  // The hex string from the provider
)

// Downloader also needs to know the total number of chunks
const totalChunks = 5  // Communicated by provider

// As each chunk arrives with its metadata, verify it immediately
function onChunkReceived (
    chunkData:Uint8Array,
    metadata:ChunkVerificationData,
    chunkIndex:number
) {
    // Verify this chunk with the trusted root digest
    const isValid = verifyChunk(
        chunkData,
        metadata,
        totalChunks,
        trustedRootDigest
    )

    if (isValid) {
        console.log(`Chunk ${chunkIndex} verified successfully`)
        // Can immediately use/save this chunk
        processVerifiedChunk(chunkData, chunkIndex)
    } else {
        console.error(`Chunk ${chunkIndex} failed verification!`)
        // Reject this chunk; it may be corrupted or malicious
    }
}

// Simulate receiving chunks
onChunkReceived(
    receivedChunkData,  // Uint8Array
    receivedMetadata,  // ChunkVerificationData
    0  // chunk index
)
```

- Provider: Builds all metadata upfront using `buildVerificationMetadata()`
- Provider: Shares root digest upfront via trusted channel
- Provider: For each chunk during streaming, sends:
  - The chunk data (`chunkData`)
  - The verification metadata (`siblingLabels`, `siblingDirections`, `mergeLengths`)
- Downloader: Receives root digest first (trusted)
- Downloader: For each chunk received, immediately verifies it using the chunk data + verification metadata + trusted root digest
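The verification in the steps above boils down to recomputing a Merkle path from the chunk up to the root. Here is a simplified, self-contained sketch of that idea. It uses plain SHA-256 in place of Bab's actual hash function and domain separation, so the labels here will not match real Bab digests; it only illustrates the role of the `siblingLabels` and `siblingDirections` metadata.

```ts
import { createHash } from 'node:crypto'

// Stand-in hash; Bab uses its own hash function and domain separation
function sha256 (...parts:Uint8Array[]):Uint8Array {
    const h = createHash('sha256')
    for (const p of parts) h.update(p)
    return new Uint8Array(h.digest())
}

function verifyPath (
    chunkData:Uint8Array,
    siblingLabels:Uint8Array[],  // one label per tree level, leaf to root
    siblingDirections:('left'|'right')[],  // which side each sibling is on
    rootDigest:Uint8Array
):boolean {
    let label = sha256(chunkData)  // label of the leaf for this chunk
    siblingLabels.forEach((sibling, i) => {
        label = siblingDirections[i] === 'left'
            ? sha256(sibling, label)  // sibling is the left child
            : sha256(label, sibling)  // sibling is the right child
    })
    // Chunk is valid only if the recomputed root matches the trusted one
    return label.length === rootDigest.length &&
        label.every((byte, i) => byte === rootDigest[i])
}
```

A corrupted chunk, a wrong sibling label, or a wrong direction all change the recomputed root, so verification fails without needing the rest of the file.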
The metadata size scales with the depth of the Merkle tree, which is roughly
`log2(number_of_chunks)`. Each chunk's metadata includes an array of sibling
labels (32 bytes each), one for each level of the tree.
Default chunk size: 1024 bytes (1 KB)
Examples of metadata overhead:
- 1 MB file (1,024 chunks): ~10 sibling labels = ~320 bytes per chunk (~32% overhead)
- 1 GB file (1,048,576 chunks): ~20 sibling labels = ~640 bytes per chunk (~62% overhead)
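These numbers can be reproduced with a back-of-the-envelope calculation. The helper below is illustrative only (not part of the bab-ts API); it assumes 32-byte sibling labels and one label per level of a binary Merkle tree:

```ts
// Rough estimate of per-chunk metadata overhead for a given file size and
// chunk size. Illustrative only, not part of the bab-ts API.
function metadataOverhead (fileSize:number, chunkSize:number = 1024) {
    const chunks = Math.ceil(fileSize / chunkSize)
    const levels = Math.ceil(Math.log2(Math.max(chunks, 2)))
    const metadataBytes = levels * 32  // one 32-byte sibling label per level
    return { chunks, levels, metadataBytes, overhead: metadataBytes / chunkSize }
}

console.log(metadataOverhead(1024 * 1024))
// -> { chunks: 1024, levels: 10, metadataBytes: 320, overhead: 0.3125 }
```

The same calculation gives 640 bytes per chunk (~62%) for a 1 GB file at the default 1 KB chunk size.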
Important
If you use very small chunks, the metadata can exceed the chunk size, defeating the purpose of streaming verification. For example, 64-byte chunks would have ~448 bytes of metadata per chunk (7x overhead) for a 1 MB file.
You can customize the chunk size:
```ts
buildVerificationMetadata(data, 4096)  // 4KB chunks
```

Choose a chunk size that balances:
- Smaller chunks: More frequent verification, better for slow/unreliable networks
- Larger chunks: Less metadata overhead, better for fast/reliable networks
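Building on that tradeoff, a hypothetical helper (not part of the bab-ts API) could pick the smallest power-of-two chunk size that keeps the estimated metadata overhead under a target fraction:

```ts
// Hypothetical helper: pick the smallest power-of-two chunk size whose
// estimated metadata overhead stays under maxOverhead. Assumes 32-byte
// sibling labels; not part of the bab-ts API.
function pickChunkSize (fileSize:number, maxOverhead:number = 0.1):number {
    for (let chunkSize = 256; chunkSize <= 65536; chunkSize *= 2) {
        const chunks = Math.ceil(fileSize / chunkSize)
        const metadataBytes = Math.ceil(Math.log2(Math.max(chunks, 2))) * 32
        if (metadataBytes / chunkSize <= maxOverhead) return chunkSize
    }
    return 65536  // cap at 64 KB chunks
}

console.log(pickChunkSize(1024 * 1024))  // 1 MB file -> 4096
```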
This exposes ESM and CommonJS via the package.json `exports` field.

ESM:

```js
import '@substrate-system/bab-ts'
```

CommonJS:

```js
require('@substrate-system/bab-ts')
```

This package exposes minified JS files too. Copy them to a location that is accessible to your web server, then link to them in HTML.
```sh
cp ./node_modules/@substrate-system/bab-ts/dist/module.min.js ./public
```

```html
<script type="module" src="./module.min.js"></script>
```

`npm run compare`

Create output from the Rust version, and compare it to the output from this module. This command runs the file `run-comparison.ts`, which calls the Rust version of bab.
That means this module depends on the Rust module for the tests; that's what `./test-comparison/rust/Cargo.toml` is for.
You need Rust to do this.
```sh
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```

That's it. Then you can run the `npm run compare` test.
When you call `npm run compare`, it executes:

```sh
cargo run --release --bin bab-comparison
```

Then Cargo (Rust's package manager):

- Downloads `bab_rs` from https://codeberg.org/worm-blossom/bab_rs.git
- Compiles it
- Runs the comparison binary
Execute these steps:

- Run the TypeScript implementation (lines 63-99 in `run-comparison.ts`):
  - Execute `batchHash()` and `buildVerificationMetadata()` from this module on test cases
  - Write results to `test-comparison/ts-output.json`
- Run the Rust implementation (lines 102-111):
  - Execute `cargo run --release --bin bab-comparison`
  - This compiles and runs the Rust binary, which generates output using the `bab_rs` library
  - Write results to `test-comparison/rust-output.json`
- Compare outputs (lines 122-161):
  - Load both JSON files
  - Run 17 tests comparing:
    - Batch hashes for "hello world", empty string, single chunk, and multiple chunks
    - Input byte lengths
    - Number of chunks
  - Use `tapzero` to report results
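The comparison step can be sketched roughly like this. The JSON field names below are hypothetical and may not match the actual shape of `ts-output.json` and `rust-output.json`:

```ts
// Hypothetical shape of the two output files; the real field names in
// ts-output.json / rust-output.json may differ.
type ComparisonOutput = {
    batchHashes:Record<string, string>  // test case name -> hex digest
    chunkCounts:Record<string, number>  // test case name -> number of chunks
}

// Return a list of mismatches between the two implementations' outputs
function compareOutputs (
    ts:ComparisonOutput,
    rust:ComparisonOutput
):string[] {
    const failures:string[] = []
    for (const [name, hash] of Object.entries(ts.batchHashes)) {
        if (rust.batchHashes[name] !== hash) {
            failures.push(`batch hash mismatch: ${name}`)
        }
    }
    for (const [name, count] of Object.entries(ts.chunkCounts)) {
        if (rust.chunkCounts[name] !== count) {
            failures.push(`chunk count mismatch: ${name}`)
        }
    }
    return failures
}
```

An empty return value means the two implementations agree on every test case.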