bfl-asic

Communication layer and statistical analysis tools for the Butterfly Labs BF0005G Jalapeno SHA-256 ASIC miner.

This project repurposes a Butterfly Labs BitForce SHA-256 ASIC cryptocurrency miner as a general-purpose cryptographic research platform. Rather than mining, it provides tools for statistical analysis of SHA-256 hash output, iterated hash dynamics exploration, and direct hardware interaction through a layered Python API.

Writeup: Teaching a Dead Mining ASIC to Measure Nothing, Carefully — the story & the negative results (also published as a Hugging Face Article — link added on publish)
Dataset: bshepp/round-reduced-sha256-learnability

Hardware

Device: Butterfly Labs BF0005G Jalapeno (5 GH/s)
ASIC: BitForce SHA256 SC 1.0 (Single Chip)
Interface: USB serial via FTDI (VID 0x0403, PID 0x6014), 115200 8N1
Protocol: ASCII commands (ZGX, ZLX, ZTX, ZDX, ZFX, ZPX) with 60-byte binary work packets
Power: 13V DC adapter

Figure: The rig in situ, with the BF0005G Jalapeno on the left. The sourdough starter really does live right next to it.

Installation

pip install -e .

Requires Python 3.10+. Dependencies: pyserial, pyserial-asyncio, click, numpy, scipy, matplotlib.

Quick Start

With hardware connected

# Discover devices
bfl-asic discover

# Probe device on COM3
bfl-asic -p COM3 probe

# Read temperature and voltages
bfl-asic -p COM3 temperature

# Benchmark work submission throughput
bfl-asic -p COM3 benchmark --duration 10

Without hardware (simulator)

# All commands work with --simulate or with no --port flag
bfl-asic --simulate probe
bfl-asic --simulate identify
bfl-asic --simulate hash "hello world"

Statistical analysis

# Run SHA-256 statistical analysis (software engine)
bfl-asic stats run --samples 100000 -o snapshot.json --plot

# View saved results
bfl-asic stats report snapshot.json

# Animated visualisation of per-bit bias shrinking as N grows
# (great for an intuitive feel for the law of large numbers)
bfl-asic stats animate-convergence --samples 100000 --frames 60
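
For a feel of what the animation plots, here is a minimal stdlib-only sketch. It is not the toolkit's accumulator: `max_bit_bias` is a hypothetical helper, and hashing the counters 0..N-1 is an assumption made for reproducibility. It reports the largest per-bit deviation from 0.5, which for an unbiased source should shrink roughly as 1/sqrt(N).

```python
import hashlib


def max_bit_bias(samples: int) -> float:
    """Largest |P(bit = 1) - 0.5| across the 256 SHA-256 output bits."""
    ones = [0] * 256
    for i in range(samples):
        # Hash an 8-byte big-endian counter (an assumption of this sketch).
        digest = int.from_bytes(hashlib.sha256(i.to_bytes(8, "big")).digest(), "big")
        for bit in range(256):
            ones[bit] += (digest >> bit) & 1
    return max(abs(count / samples - 0.5) for count in ones)


if __name__ == "__main__":
    for n in (100, 1_000, 4_000):
        print(f"N={n:>5}  max bias={max_bit_bias(n):.4f}")
```

The pure-Python loop is slow for large N; it is meant only to make the trend visible at small sample counts.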

Iterated hash dynamics

# Run orbit/convergence analysis
bfl-asic dynamics run --seeds 5 --max-iterations 50000 -o dynamics.json

# Generate plots from saved results
bfl-asic dynamics plot dynamics.json
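
As a rough illustration of constant-memory cycle detection on an iterated hash, the sketch below implements Brent's algorithm from scratch. The function names are hypothetical and this is not the toolkit's `brent_detect`; the 2-byte truncation is an assumption chosen so a cycle is reachable in a fraction of a second.

```python
import hashlib
from typing import Callable, Tuple


def brent(f: Callable, x0) -> Tuple[int, int]:
    """Return (cycle_length, tail_length) for the sequence x0, f(x0), f(f(x0)), ..."""
    # Phase 1: find the cycle length by letting the hare run in power-of-two windows.
    power = cycle_length = 1
    tortoise, hare = x0, f(x0)
    while tortoise != hare:
        if power == cycle_length:
            tortoise = hare
            power *= 2
            cycle_length = 0
        hare = f(hare)
        cycle_length += 1
    # Phase 2: start the hare cycle_length steps ahead, then walk both in lockstep
    # until they meet at the first element of the cycle.
    tortoise = hare = x0
    for _ in range(cycle_length):
        hare = f(hare)
    tail_length = 0
    while tortoise != hare:
        tortoise, hare = f(tortoise), f(hare)
        tail_length += 1
    return cycle_length, tail_length


def truncated_sha256(value: bytes) -> bytes:
    """SHA-256 truncated to 2 bytes, giving a 65,536-element state space."""
    return hashlib.sha256(value).digest()[:2]


if __name__ == "__main__":
    length, tail = brent(truncated_sha256, b"\x00\x00")
    print(f"cycle length {length}, tail length {tail}")
```

Only two states are held at any time, which is why this approach scales to orbits far too long to store.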

Randomness validation (NIST SP 800-22)

# Harvest hashes and run the NIST randomness battery
bfl-asic randomness run --hashes 1000 -o randomness.json

# View saved results
bfl-asic randomness report randomness.json

Six tests are included: frequency (monobit), block frequency, runs, longest run of ones in a block, DFT spectral, and cumulative sums (forward + reverse). The battery consumes any HashSource, so an ASIC-backed source plugs in unchanged.
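
To make the first of these tests concrete, here is a minimal stdlib-only sketch of the frequency (monobit) test as defined in NIST SP 800-22. The helper names are hypothetical and the toolkit's own implementation in `randomness/tests.py` may be structured differently; hashing a counter is likewise an assumption of this sketch.

```python
import hashlib
import math
from typing import List, Sequence


def monobit_p_value(bits: Sequence[int]) -> float:
    """Frequency (monobit) test: p-value for a sequence of 0/1 integers."""
    n = len(bits)
    s_n = sum(2 * b - 1 for b in bits)      # map 0/1 to -1/+1 and sum
    s_obs = abs(s_n) / math.sqrt(n)         # normalised test statistic
    return math.erfc(s_obs / math.sqrt(2))


def sha256_bits(hash_count: int) -> List[int]:
    """Concatenate the bits (MSB first) of SHA-256 over an 8-byte counter."""
    bits: List[int] = []
    for i in range(hash_count):
        digest = hashlib.sha256(i.to_bytes(8, "big")).digest()
        for byte in digest:
            bits.extend((byte >> shift) & 1 for shift in range(7, -1, -1))
    return bits


if __name__ == "__main__":
    p = monobit_p_value(sha256_bits(1000))  # 256,000 bits
    print(f"monobit p-value: {p:.4f}  ({'PASS' if p >= 0.01 else 'FAIL'})")
```

The 0.01 threshold is the significance level the NIST document uses by default; a sequence passes when its p-value is at or above it.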

Sustained device work (SC queued path)

The naive ZDX/ZFX path stalls after roughly 42 submissions because the firmware queue fills and never gets drained — not a hardware ceiling. QueuedWorkSession speaks the SC queued protocol (ZNX/ZWX + continuous ZOX result-drain + ZCX JOBS IN QUEUE backpressure) exactly as cgminer/bfgminer do, and runs unbounded with no power-cycling required. NonceSource wraps QueuedWorkSession as the honest device surface: it yields nonces (mining winners), not full digests — it is not a HashSource and cannot feed the statistical/randomness battery directly.
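
The stall described above is a generic bounded-queue effect. The toy model below uses hypothetical classes throughout: it is neither the BitForce firmware nor `QueuedWorkSession`, the capacity of 42 only mirrors the observed figure, and the high-water mark of 20 is arbitrary. It shows a submit-only loop stopping at capacity while a loop that drains results keeps running.

```python
from collections import deque
from typing import Deque, List


class ToyFirmwareQueue:
    """Toy stand-in for a device queue whose slots free up only when results are read."""

    def __init__(self, capacity: int = 42) -> None:
        self.capacity = capacity
        self._slots: Deque[int] = deque()

    @property
    def pending(self) -> int:
        return len(self._slots)

    def submit(self, job: int) -> bool:
        """Accept a job if a slot is free; return False when the queue is full."""
        if self.pending >= self.capacity:
            return False
        self._slots.append(job)
        return True

    def drain(self) -> List[int]:
        """Read back every queued result, freeing all slots."""
        results = list(self._slots)
        self._slots.clear()
        return results


def run_naive(queue: ToyFirmwareQueue, jobs: int) -> int:
    """Submit without ever draining; stops at the first rejected job."""
    accepted = 0
    for job in range(jobs):
        if not queue.submit(job):
            break
        accepted += 1
    return accepted


def run_with_drain(queue: ToyFirmwareQueue, jobs: int, high_water: int = 20) -> int:
    """Drain whenever the queue reaches high_water, then submit."""
    accepted = 0
    for job in range(jobs):
        if queue.pending >= high_water:
            queue.drain()
        if queue.submit(job):
            accepted += 1
    return accepted


if __name__ == "__main__":
    print("naive:", run_naive(ToyFirmwareQueue(), 1000))
    print("with drain:", run_with_drain(ToyFirmwareQueue(), 1000))
```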

# Fan control — thermal safety caveat applies
bfl-asic -p COM3 fan auto       # restore firmware thermal management (default)
bfl-asic -p COM3 fan 2          # fixed level 0-4 (0 = off, 4 = full)

Warning: a low fixed fan level during active hashing can cause thermal damage to the ASIC. The setting is persistent until changed or the device is power-cycled. Always restore with fan auto when done.
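
One way to make sure `fan auto` is always restored is to wrap the two CLI calls in a context manager. This is a sketch built only on the commands shown above; `fixed_fan` is a hypothetical helper, not part of the toolkit, and the `run` parameter exists so the logic can be exercised without hardware.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def fixed_fan(port: str, level: int, run=subprocess.run):
    """Hold a fixed fan level for the duration of a block, then restore 'fan auto'."""
    if level not in range(5):
        raise ValueError("fan level must be 0-4")
    base = ["bfl-asic", "-p", port, "fan"]
    run(base + [str(level)], check=True)
    try:
        yield
    finally:
        # Runs even if the body raises, so firmware thermal management comes back.
        run(base + ["auto"], check=True)
```

Typical use would be `with fixed_fan("COM3", 4): ...` around a benchmark. It cannot help if the Python process is killed outright, so the warning above still applies.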

ML learnability instrument (optional [ml] extra)

```bash
pip install -e ".[ml]"

# Where does SHA-256 become unlearnable? (round-reduced sweep)
bfl-asic ml sweep --rounds 1,2,4,8,16,32,64 --plot

# Rigorous "is there ANY structure in full SHA-256?" bounded null
bfl-asic ml run full_structure

# Iterated-hash orbit learnability vs truncation width
bfl-asic ml run dynamics
```

Requires PyTorch (installed only via the optional `[ml]` extra). The
rest of the toolkit runs without it.
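
For readers unfamiliar with round reduction, the following pure-Python sketch shows one common way to define it: keep the full message schedule and the final feed-forward addition, but run only the first `rounds` of the 64 compression rounds. That definition is an assumption here, and `sha256_reduced` is a hypothetical name; the toolkit's NumPy-vectorized `roundreduced.py` may differ in detail.

```python
import hashlib
import math
import struct

MASK = 0xFFFFFFFF


def _primes(count):
    found, candidate = [], 2
    while len(found) < count:
        if all(candidate % p for p in found):
            found.append(candidate)
        candidate += 1
    return found


def _icbrt(n):
    """Floor of the cube root of n, by binary search on integers."""
    lo, hi = 0, 1 << (n.bit_length() // 3 + 1)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if mid ** 3 <= n:
            lo = mid
        else:
            hi = mid - 1
    return lo


_PRIMES = _primes(64)
# Constants from FIPS 180-4: fractional bits of square and cube roots of primes.
H0 = [math.isqrt(p << 64) & MASK for p in _PRIMES[:8]]
K = [_icbrt(p << 96) & MASK for p in _PRIMES]


def _rotr(x, n):
    return ((x >> n) | (x << (32 - n))) & MASK


def sha256_reduced(message: bytes, rounds: int = 64) -> bytes:
    """SHA-256 with only the first `rounds` compression rounds per block."""
    if not 0 <= rounds <= 64:
        raise ValueError("rounds must be between 0 and 64")
    padded = message + b"\x80"
    padded += b"\x00" * ((56 - len(padded)) % 64)
    padded += struct.pack(">Q", len(message) * 8)
    state = list(H0)
    for offset in range(0, len(padded), 64):
        w = list(struct.unpack(">16I", padded[offset:offset + 64]))
        for t in range(16, 64):
            s0 = _rotr(w[t - 15], 7) ^ _rotr(w[t - 15], 18) ^ (w[t - 15] >> 3)
            s1 = _rotr(w[t - 2], 17) ^ _rotr(w[t - 2], 19) ^ (w[t - 2] >> 10)
            w.append((w[t - 16] + s0 + w[t - 7] + s1) & MASK)
        a, b, c, d, e, f, g, h = state
        for t in range(rounds):
            big_s1 = _rotr(e, 6) ^ _rotr(e, 11) ^ _rotr(e, 25)
            ch = (e & f) ^ (~e & g)
            t1 = (h + big_s1 + ch + K[t] + w[t]) & MASK
            big_s0 = _rotr(a, 2) ^ _rotr(a, 13) ^ _rotr(a, 22)
            maj = (a & b) ^ (a & c) ^ (b & c)
            t2 = (big_s0 + maj) & MASK
            h, g, f, e, d, c, b, a = g, f, e, (d + t1) & MASK, c, b, a, (t1 + t2) & MASK
        state = [(s + v) & MASK for s, v in zip(state, (a, b, c, d, e, f, g, h))]
    return struct.pack(">8I", *state)


if __name__ == "__main__":
    assert sha256_reduced(b"abc") == hashlib.sha256(b"abc").digest()
    for r in (1, 2, 4, 8, 16, 32, 64):
        print(f"{r:>2} rounds: {sha256_reduced(b'abc', r).hex()}")
```

With `rounds=64` the function agrees with `hashlib.sha256`, which gives a quick sanity check before trusting any reduced-round output.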

Where outputs go

When you do not pass -o, results land under a runs/ folder in the current directory, organised by command:

runs/
  stats/<timestamp>/snapshot.json + dashboard.png   (stats run --plot)
  animations/convergence-<timestamp>.gif            (stats animate-convergence)

An explicit -o paths/file.ext is always honoured verbatim. Two writes to the same path never overwrite each other: the second one is suffixed with a timestamp. To change the output root, set the BFL_ASIC_OUTPUT_DIR environment variable.
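
The rules above could be implemented along the following lines. This is only a sketch: `output_root` and `non_clobbering` are hypothetical names, and the timestamp format and the hyphen separator are assumptions, not the toolkit's actual naming.

```python
import os
from datetime import datetime
from pathlib import Path


def output_root() -> Path:
    """Root folder for default outputs: BFL_ASIC_OUTPUT_DIR if set, else ./runs."""
    return Path(os.environ.get("BFL_ASIC_OUTPUT_DIR", "runs"))


def non_clobbering(path: Path) -> Path:
    """Return path unchanged if it is free, else a timestamp-suffixed sibling."""
    if not path.exists():
        return path
    stamp = datetime.now().strftime("%Y%m%dT%H%M%S%f")  # assumed format
    return path.with_name(f"{path.stem}-{stamp}{path.suffix}")


if __name__ == "__main__":
    target = non_clobbering(output_root() / "stats" / "snapshot.json")
    print(f"would write to {target}")
```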

Architecture

bfl_asic/
  protocol/       # Pure encoding/decoding — no I/O
    constants.py   # Baud rate, commands, timing, response tokens
    commands.py    # Build ZGX, ZLX, ZTX, ZDX, ZFX, ZPX byte sequences
    responses.py   # Parse identify, temperature, voltage, work results
    work.py        # SHA-256 midstate computation, synthetic work generation
  transport/       # I/O abstraction
    base.py        # BaseTransport ABC (sync + async)
    serial.py      # Real hardware via pyserial
    simulator.py   # In-process simulated device with thermal model
    discovery.py   # FTDI device scanning
  stats/           # SHA-256 statistical analysis pipeline
    engine.py      # HashSource ABC, SoftwareHashEngine
    accumulators.py # Bit frequency, avalanche, correlation, entropy, etc.
    spectral.py    # FFT-based periodicity detection
    snapshot.py    # JSON-serializable results
    pipeline.py    # Orchestrator wiring engine → accumulators → snapshot
    visualization.py # Matplotlib plots: heatmaps, histograms, dashboards
  dynamics/        # Iterated hash dynamics (x → SHA-256(x) → ...)
    orbit.py       # Orbit computation with sampled trajectories
    rho.py         # Floyd's and Brent's cycle detection (O(1) memory)
    convergence.py # Multi-seed convergence analysis
    visualization.py # Orbit, convergence, and distribution plots
  randomness/      # NIST SP 800-22 randomness test battery
    tests.py       # Pure-function tests over uint8 bit arrays
    battery.py     # Orchestrator over any HashSource
    snapshot.py    # JSON-serializable results
  ml/              # Optional learnability instrument (torch behind [ml])
    roundreduced.py # Numpy-vectorized round-reduced SHA-256
    datasets.py     # Feature extractors + distinguisher/orbit datasets
    models.py       # TinyCNN + LinearProbe
    harness.py      # Deterministic train/eval + pos/neg controls
    experiments.py  # The four named experiments
    snapshot.py     # JSON-serializable results
    visualization.py # Learnability curve + saliency map
    publish.py      # Optional HF model-card upload
  device.py        # BFLDevice — sync high-level API
  async_device.py  # AsyncBFLDevice — async API with stream iterators
  cli.py           # Click CLI: identify, temperature, probe, discover,
                   #   benchmark, hash, fan, stats, dynamics, randomness, ml
  exceptions.py    # BFLError hierarchy

Layer separation

Protocol is pure Python — no I/O, no state, fully testable
Transport abstracts serial vs simulator vs future backends
Device combines transport + protocol into a clean API
Applications (stats, dynamics, randomness) are independent of the device layer
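
The layering can be pictured with a deliberately tiny model. Every class and function name below is hypothetical; the real `BaseTransport` and `BFLDevice` interfaces are not reproduced here. Only the `ZGX` command and its response string come from the protocol reference.

```python
from abc import ABC, abstractmethod


# Protocol layer: pure functions, no I/O and no state.
def build_identify() -> bytes:
    return b"ZGX"


def parse_identify(raw: bytes) -> str:
    return raw.decode("ascii").strip()


# Transport layer: the only place where bytes actually move.
class ToyTransport(ABC):
    @abstractmethod
    def exchange(self, request: bytes) -> bytes:
        """Send a request and return the raw response."""


class ToySimulator(ToyTransport):
    def exchange(self, request: bytes) -> bytes:
        if request == b"ZGX":
            return b"BitForce SHA256 SC 1.0\n"
        raise ValueError(f"unsupported command: {request!r}")


# Device layer: combines a transport with the protocol functions.
class ToyDevice:
    def __init__(self, transport: ToyTransport) -> None:
        self._transport = transport

    def identify(self) -> str:
        return parse_identify(self._transport.exchange(build_identify()))


if __name__ == "__main__":
    print(ToyDevice(ToySimulator()).identify())
```

Because the protocol functions never touch a port, they can be unit-tested on plain bytes, and swapping the simulator for a serial transport leaves the device layer untouched.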

Protocol Reference

| Command | Bytes | Response |
| --- | --- | --- |
| Identify | `ZGX` | `BitForce SHA256 SC 1.0\n` |
| Temperature | `ZLX` | `Temp1: 30, Temp2: 30\n` |
| Voltages | `ZTX` | `3564,1011,11420\n` (mV: VCC1, VCC2, VMAIN) |
| Submit work | `ZDX` + 60-byte packet | `OK\n` |
| Poll result | `ZFX` | `IDLE\n` / `B\n` / `NONCE-FOUND:<hex>\n` / `NO-NONCE\n` |
| Nonce range | `ZPX` + 68-byte packet | `OK\n` |
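
As an illustration of how the temperature and voltage responses in this table might be decoded, here is a small sketch. The function names are hypothetical, and it assumes integer readings in exactly the formats shown; the toolkit's `responses.py` may accept more variants.

```python
import re
from typing import Dict, Tuple


def parse_temperature(line: str) -> Tuple[int, int]:
    """Parse a ZLX response such as 'Temp1: 30, Temp2: 30'."""
    match = re.fullmatch(r"Temp1:\s*(\d+),\s*Temp2:\s*(\d+)", line.strip())
    if match is None:
        raise ValueError(f"unexpected temperature response: {line!r}")
    return int(match.group(1)), int(match.group(2))


def parse_voltages(line: str) -> Dict[str, int]:
    """Parse a ZTX response such as '3564,1011,11420' into millivolts."""
    fields = line.strip().split(",")
    if len(fields) != 3:
        raise ValueError(f"unexpected voltage response: {line!r}")
    vcc1, vcc2, vmain = (int(field) for field in fields)
    return {"VCC1": vcc1, "VCC2": vcc2, "VMAIN": vmain}


if __name__ == "__main__":
    print(parse_temperature("Temp1: 30, Temp2: 30\n"))
    print(parse_voltages("3564,1011,11420\n"))
```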

Work packet format (60 bytes): >>>>>>>> [32-byte midstate] [12-byte tail] >>>>>>>>
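
The framing above is simple enough to sketch directly. `build_work_packet` and `split_work_packet` are hypothetical helpers, not the toolkit's API, and computing a real midstate is out of scope here; the inputs are treated as opaque bytes.

```python
from typing import Tuple

FRAME = b">" * 8  # eight '>' bytes open and close every work packet


def build_work_packet(midstate: bytes, tail: bytes) -> bytes:
    """Assemble the 60-byte packet: frame, 32-byte midstate, 12-byte tail, frame."""
    if len(midstate) != 32:
        raise ValueError("midstate must be exactly 32 bytes")
    if len(tail) != 12:
        raise ValueError("tail must be exactly 12 bytes")
    return FRAME + midstate + tail + FRAME


def split_work_packet(packet: bytes) -> Tuple[bytes, bytes]:
    """Inverse of build_work_packet; returns (midstate, tail)."""
    if len(packet) != 60 or not (packet.startswith(FRAME) and packet.endswith(FRAME)):
        raise ValueError("not a well-formed 60-byte work packet")
    return packet[8:40], packet[40:52]


if __name__ == "__main__":
    packet = build_work_packet(bytes(32), bytes(12))
    print(len(packet), packet[:8], packet[-8:])
```

Per the table above, a submission is the `ZDX` command followed by this packet.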

Testing

python -m pytest tests/ -q

783 tests. All tests run against the simulator — no hardware needed. Test coverage includes protocol encoding/decoding, transport lifecycle, simulator state machine, device API round-trips, CLI smoke tests, statistical accumulators, dynamics algorithms, NIST SP 800-22 tests (with reference p-values from the spec as regression anchors), and visualization. Heavy ML training tests are marked slow; the default fast run (pytest -m "not slow", 781 tests) excludes them. The ML subsystem requires pip install -e ".[ml]"; its tests skip cleanly when torch is absent.

Python API

from bfl_asic import BFLDevice
from bfl_asic.transport.serial import SerialTransport
from bfl_asic.transport.simulator import SimulatorTransport

# Real hardware
with BFLDevice(SerialTransport(port="COM3")) as dev:
    info = dev.identify()
    temp = dev.get_temperature()
    volts = dev.get_voltage()
    nonces = dev.hash_data(b"hello world")

# Simulator
with BFLDevice(SimulatorTransport()) as dev:
    info = dev.identify()

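# Async ("async with" must run inside a coroutine)
import asyncio
from bfl_asic import AsyncBFLDevice

async def main() -> None:
    async with AsyncBFLDevice(SimulatorTransport()) as dev:
        async for nonces in dev.hash_stream(count=100):
            print(nonces)

asyncio.run(main())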

# Statistical analysis
from bfl_asic.stats import StatsPipeline

pipeline = StatsPipeline()
snapshot = pipeline.run(samples=100_000)
snapshot.save("results.json")
print(f"Max bias: {snapshot.bit_frequency['max_bias']}")
print(f"Mean Hamming: {snapshot.avalanche['mean']}")
print(f"Entropy: {snapshot.entropy['shannon_entropy']}")

# Iterated hash dynamics
from bfl_asic.dynamics import brent_detect, compute_orbit
from bfl_asic.dynamics.orbit import sha256_iterate

# Truncated hash for reachable cycles
def toy_hash(v: bytes) -> bytes:
    import hashlib
    return hashlib.sha256(v).digest()[:3].ljust(32, b'\x00')

cycle = brent_detect(b'\x00' * 32, max_steps=1_000_000, hash_fn=toy_hash)
if cycle:
    print(f"Cycle length: {cycle.cycle_length}, Tail: {cycle.tail_length}")

# NIST SP 800-22 randomness validation
from bfl_asic.randomness import RandomnessBattery
from bfl_asic.stats.engine import SoftwareHashEngine

battery = RandomnessBattery(engine=SoftwareHashEngine())
snapshot = battery.run(hash_count=1000)  # 256,000 bits
print(f"Passed: {snapshot.pass_count}/{len(snapshot.results)}")
for r in snapshot.results:
    print(f"  {r['name']:<28} p={r['p_value']:.4f}  "
          f"{'PASS' if r['passed'] else 'FAIL'}")

Learning

If you're using this toolkit to learn SHA-256 and cryptography from first principles, see LEARNING.md for a six-week study path that pairs each subsystem with free Coursera and YouTube lecture material.

License

MIT — see LICENSE. The published results dataset (huggingface.co/datasets/bshepp/round-reduced-sha256-learnability) is MIT as well.
