Communication layer and statistical analysis tools for the Butterfly Labs BF0005G Jalapeno SHA-256 ASIC miner.
This project repurposes a Butterfly Labs BitForce SHA-256 ASIC cryptocurrency miner as a general-purpose cryptographic research platform. Rather than mining, it provides tools for statistical analysis of SHA-256 hash output, iterated hash dynamics exploration, and direct hardware interaction through a layered Python API.
- Writeup: Teaching a Dead Mining ASIC to Measure Nothing, Carefully — the story & the negative results (also published as a Hugging Face Article — link added on publish)
- Dataset: bshepp/round-reduced-sha256-learnability
- Device: Butterfly Labs BF0005G Jalapeno (5 GH/s)
- ASIC: BitForce SHA256 SC 1.0 (Single Chip)
- Interface: USB serial via FTDI (VID `0x0403`, PID `0x6014`), 115200 8N1
- Protocol: ASCII commands (ZGX, ZLX, ZTX, ZDX, ZFX, ZPX) with 60-byte binary work packets
- Power: 13V DC adapter
The rig in situ: the BF0005G Jalapeno (left). The sourdough starter really does live right next to it.
```bash
pip install -e .
```

Requires Python 3.10+. Dependencies: pyserial, pyserial-asyncio, click, numpy, scipy, matplotlib.
```bash
# Discover devices
bfl-asic discover

# Probe device on COM3
bfl-asic -p COM3 probe

# Read temperature and voltages
bfl-asic -p COM3 temperature

# Benchmark work submission throughput
bfl-asic -p COM3 benchmark --duration 10
```

```bash
# All commands work with --simulate or with no --port flag
bfl-asic --simulate probe
bfl-asic --simulate identify
bfl-asic --simulate hash "hello world"
```

```bash
# Run SHA-256 statistical analysis (software engine)
bfl-asic stats run --samples 100000 -o snapshot.json --plot

# View saved results
bfl-asic stats report snapshot.json

# Animated visualisation of per-bit bias shrinking as N grows
# (great for an intuitive feel for the law of large numbers)
bfl-asic stats animate-convergence --samples 100000 --frames 60
```

```bash
# Run orbit/convergence analysis
bfl-asic dynamics run --seeds 5 --max-iterations 50000 -o dynamics.json

# Generate plots from saved results
bfl-asic dynamics plot dynamics.json
```

```bash
# Harvest hashes and run the NIST randomness battery
bfl-asic randomness run --hashes 1000 -o randomness.json

# View saved results
bfl-asic randomness report randomness.json
```

Six tests are included: frequency (monobit), block frequency, runs,
longest run of ones in a block, DFT spectral, and cumulative sums
(forward + reverse). The battery consumes any HashSource, so an
ASIC-backed source plugs in unchanged.
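To make the simplest of these tests concrete, here is a minimal sketch of the frequency (monobit) test as defined in NIST SP 800-22, section 2.1. The names `monobit_p_value` and `digest_bits` are hypothetical and are not the toolkit's API; the project's own implementation lives in `randomness/tests.py` and may differ in detail.

```python
import hashlib
import math


def monobit_p_value(bits):
    """Frequency (monobit) test from NIST SP 800-22, section 2.1.

    `bits` is a sequence of 0/1 integers. Returns the p-value; the spec
    treats p >= 0.01 as consistent with randomness.
    """
    n = len(bits)
    if n == 0:
        raise ValueError("need at least one bit")
    # Map 0 -> -1 and 1 -> +1, then sum the whole sequence.
    s = sum(2 * b - 1 for b in bits)
    s_obs = abs(s) / math.sqrt(n)
    return math.erfc(s_obs / math.sqrt(2))


def digest_bits(data: bytes):
    """Unpack SHA-256(data) into a list of 256 bits, most significant first."""
    digest = hashlib.sha256(data).digest()
    return [(byte >> shift) & 1 for byte in digest for shift in range(7, -1, -1)]


# Worked example from the spec: the 10-bit sequence 1011010101.
spec_example = [1, 0, 1, 1, 0, 1, 0, 1, 0, 1]
print(f"spec example: p = {monobit_p_value(spec_example):.6f}")

# 1000 digests of a simple counter give 256,000 bits.
bits = []
for i in range(1000):
    bits.extend(digest_bits(i.to_bytes(8, "big")))
print(f"n = {len(bits)}, p = {monobit_p_value(bits):.4f}")
```

The spec's worked example doubles as a regression anchor: its published p-value for that sequence is 0.527089.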
The naive ZDX/ZFX path stalls after roughly 42 submissions because
the firmware queue fills and never gets drained — not a hardware ceiling.
QueuedWorkSession speaks the SC queued protocol (ZNX/ZWX + continuous
ZOX result-drain + ZCX JOBS IN QUEUE backpressure) exactly as
cgminer/bfgminer do, and runs unbounded with no power-cycling required.
NonceSource wraps QueuedWorkSession as the honest device surface: it
yields nonces (mining winners), not full digests — it is not a
HashSource and cannot feed the statistical/randomness battery directly.
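The difference between the two submission paths can be illustrated with a toy model. This is a sketch under stated assumptions, not the ZNX/ZWX/ZOX/ZCX wire protocol: `ToyQueueDevice`, its methods, and the capacity of 20 are all hypothetical, chosen only to show why a queue that is never drained stalls while a drain-and-check loop does not.

```python
from collections import deque


class ToyQueueDevice:
    """Hypothetical stand-in for a device with a bounded firmware job queue."""

    def __init__(self, capacity: int = 20):
        self.capacity = capacity   # illustrative value, not the real queue depth
        self._pending = deque()    # jobs accepted but not yet processed
        self._results = deque()    # finished jobs awaiting a result drain

    def jobs_in_queue(self) -> int:
        # In this toy model, undrained results still occupy queue slots.
        return len(self._pending) + len(self._results)

    def submit(self, job: int) -> bool:
        """Accept a job if there is room; return False when the queue is full."""
        if self.jobs_in_queue() >= self.capacity:
            return False
        self._pending.append(job)
        return True

    def tick(self) -> None:
        """Process one pending job, moving it to the results buffer."""
        if self._pending:
            self._results.append(self._pending.popleft())

    def drain_results(self) -> list:
        """Hand back all finished jobs and free their slots."""
        done = list(self._results)
        self._results.clear()
        return done


def run_naive(device: ToyQueueDevice, total_jobs: int) -> int:
    """Submit without ever draining: stalls once the queue fills."""
    accepted = 0
    for job in range(total_jobs):
        if not device.submit(job):
            break
        accepted += 1
        device.tick()
    return accepted


def run_with_backpressure(device: ToyQueueDevice, total_jobs: int) -> int:
    """Drain results continuously and only submit while there is room."""
    completed = []
    job = 0
    while len(completed) < total_jobs:
        while job < total_jobs and device.jobs_in_queue() < device.capacity:
            device.submit(job)
            job += 1
        device.tick()
        completed.extend(device.drain_results())
    return len(completed)


naive_done = run_naive(ToyQueueDevice(), 1000)
queued_done = run_with_backpressure(ToyQueueDevice(), 1000)
print(f"naive accepted {naive_done} jobs, queued completed {queued_done}")
```

In the toy, the naive loop stops at exactly the queue capacity, while the backpressure loop completes every job with the same device.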
```bash
# Fan control (thermal safety caveat applies)
bfl-asic -p COM3 fan auto   # restore firmware thermal management (default)
bfl-asic -p COM3 fan 2      # fixed level 0-4 (0 = off, 4 = full)
```

Warning: a low fixed fan level during active hashing can cause thermal
damage to the ASIC. The setting is persistent until changed or the device
is power-cycled. Always restore with `fan auto` when done.
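One way to make the "always restore" rule hard to forget is to wrap the two CLI calls shown above in a context manager. This is a minimal sketch: `fixed_fan` is a hypothetical helper, not part of the toolkit, and it assumes the `bfl-asic` executable is on `PATH`. The `run` parameter is injectable so the example below can do a dry run without hardware.

```python
import subprocess
from contextlib import contextmanager


@contextmanager
def fixed_fan(port: str, level: int, run=subprocess.run):
    """Hold a fixed fan level, then always restore `fan auto`.

    Hypothetical helper: shells out to the documented CLI commands.
    """
    if not 0 <= level <= 4:
        raise ValueError("fan level must be in 0-4")
    base = ["bfl-asic", "-p", port, "fan"]
    run(base + [str(level)], check=True)
    try:
        yield
    finally:
        # Runs on normal exit and on exceptions, including Ctrl-C.
        run(base + ["auto"], check=True)


# Dry run: record the commands instead of executing them.
calls = []


def record(cmd, check=True):
    calls.append(cmd)


with fixed_fan("COM3", 2, run=record):
    pass  # timed experiment would go here

for cmd in calls:
    print(" ".join(cmd))
```

Dropping the `run=record` argument makes the same block invoke the real CLI.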
```bash
pip install -e ".[ml]"
# Where does SHA-256 become unlearnable? (round-reduced sweep)
bfl-asic ml sweep --rounds 1,2,4,8,16,32,64 --plot
# Rigorous "is there ANY structure in full SHA-256?" bounded null
bfl-asic ml run full_structure
# Iterated-hash orbit learnability vs truncation width
bfl-asic ml run dynamics
```
Requires PyTorch (installed only via the optional `[ml]` extra). The
rest of the toolkit runs without it.

When you do not pass `-o`, results land under a `runs/` folder in the
current directory, organised by command:

```
runs/
  stats/<timestamp>/snapshot.json + dashboard.png   (stats run --plot)
  animations/convergence-<timestamp>.gif            (stats animate-convergence)
```

An explicit `-o path/file.ext` is always honoured verbatim. Two writes to
the same path never overwrite each other: the second one is suffixed with a
timestamp. Override the output root with the `BFL_ASIC_OUTPUT_DIR`
environment variable.
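The output rules above can be sketched as follows. This is an illustration only: `output_root`, `non_clobbering`, and `write_result` are hypothetical names, and both the timestamp format and the assumption that `BFL_ASIC_OUTPUT_DIR` replaces `runs/` outright (rather than containing it) are guesses, not the toolkit's documented behaviour.

```python
import os
import tempfile
from datetime import datetime
from pathlib import Path


def output_root(environ=os.environ) -> Path:
    """Output root: BFL_ASIC_OUTPUT_DIR if set, else ./runs (assumed semantics)."""
    return Path(environ.get("BFL_ASIC_OUTPUT_DIR", "runs"))


def non_clobbering(path: Path) -> Path:
    """Return `path` if it is free, else a timestamp-suffixed sibling."""
    if not path.exists():
        return path
    stamp = datetime.now().strftime("%Y%m%dT%H%M%S%f")  # assumed format
    return path.with_name(f"{path.stem}-{stamp}{path.suffix}")


def write_result(path: Path, text: str) -> Path:
    """Write `text` without overwriting an existing file; return the path used."""
    target = non_clobbering(path)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text)
    return target


with tempfile.TemporaryDirectory() as tmp:
    root = output_root({"BFL_ASIC_OUTPUT_DIR": tmp})
    first = write_result(root / "stats" / "snapshot.json", "{}")
    second = write_result(root / "stats" / "snapshot.json", "{}")
    print(first.name)   # the requested name
    print(second.name)  # same stem plus a timestamp suffix
```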
```
bfl_asic/
  protocol/              # Pure encoding/decoding, no I/O
    constants.py         # Baud rate, commands, timing, response tokens
    commands.py          # Build ZGX, ZLX, ZTX, ZDX, ZFX, ZPX byte sequences
    responses.py         # Parse identify, temperature, voltage, work results
    work.py              # SHA-256 midstate computation, synthetic work generation
  transport/             # I/O abstraction
    base.py              # BaseTransport ABC (sync + async)
    serial.py            # Real hardware via pyserial
    simulator.py         # In-process simulated device with thermal model
    discovery.py         # FTDI device scanning
  stats/                 # SHA-256 statistical analysis pipeline
    engine.py            # HashSource ABC, SoftwareHashEngine
    accumulators.py      # Bit frequency, avalanche, correlation, entropy, etc.
    spectral.py          # FFT-based periodicity detection
    snapshot.py          # JSON-serializable results
    pipeline.py          # Orchestrator wiring engine → accumulators → snapshot
    visualization.py     # Matplotlib plots: heatmaps, histograms, dashboards
  dynamics/              # Iterated hash dynamics (x → SHA-256(x) → ...)
    orbit.py             # Orbit computation with sampled trajectories
    rho.py               # Floyd's and Brent's cycle detection (O(1) memory)
    convergence.py       # Multi-seed convergence analysis
    visualization.py     # Orbit, convergence, and distribution plots
  randomness/            # NIST SP 800-22 randomness test battery
    tests.py             # Pure-function tests over uint8 bit arrays
    battery.py           # Orchestrator over any HashSource
    snapshot.py          # JSON-serializable results
  ml/                    # Optional learnability instrument (torch behind [ml])
    roundreduced.py      # Numpy-vectorized round-reduced SHA-256
    datasets.py          # Feature extractors + distinguisher/orbit datasets
    models.py            # TinyCNN + LinearProbe
    harness.py           # Deterministic train/eval + pos/neg controls
    experiments.py       # The four named experiments
    snapshot.py          # JSON-serializable results
    visualization.py     # Learnability curve + saliency map
    publish.py           # Optional HF model-card upload
  device.py              # BFLDevice: sync high-level API
  async_device.py        # AsyncBFLDevice: async API with stream iterators
  cli.py                 # Click CLI: identify, temperature, probe, discover,
                         # benchmark, hash, stats, dynamics, randomness
  exceptions.py          # BFLError hierarchy
```
- Protocol is pure Python — no I/O, no state, fully testable
- Transport abstracts serial vs simulator vs future backends
- Device combines transport + protocol into a clean API
- Applications (stats, dynamics, randomness) are independent of the device layer
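The layering described above can be sketched in a few lines. Everything here is a simplified stand-in: `Transport`, `CannedTransport`, and `Device` are hypothetical names used for illustration (the project's real classes are `BaseTransport` and `BFLDevice`, whose signatures may differ). Only the `ZGX` command and its response string come from the protocol table.

```python
from abc import ABC, abstractmethod


# --- protocol layer: pure functions, no I/O, no state ---
def build_identify() -> bytes:
    return b"ZGX"


def parse_identify(raw: bytes) -> str:
    return raw.decode("ascii").strip()


# --- transport layer: moves bytes, knows nothing about commands ---
class Transport(ABC):
    @abstractmethod
    def write(self, data: bytes) -> None: ...

    @abstractmethod
    def readline(self) -> bytes: ...


class CannedTransport(Transport):
    """Hypothetical in-memory transport that answers from a lookup table."""

    def __init__(self, replies: dict):
        self._replies = replies
        self._next = b""

    def write(self, data: bytes) -> None:
        self._next = self._replies[data]

    def readline(self) -> bytes:
        return self._next


# --- device layer: combines transport + protocol ---
class Device:
    def __init__(self, transport: Transport):
        self._transport = transport

    def identify(self) -> str:
        self._transport.write(build_identify())
        return parse_identify(self._transport.readline())


dev = Device(CannedTransport({b"ZGX": b"BitForce SHA256 SC 1.0\n"}))
print(dev.identify())
```

Because the protocol functions never touch a transport, they can be unit-tested with plain byte strings, and swapping the serial port for a simulator only changes the object passed to the device.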
| Command | Bytes | Response |
|---|---|---|
| Identify | `ZGX` | `BitForce SHA256 SC 1.0\n` |
| Temperature | `ZLX` | `Temp1: 30, Temp2: 30\n` |
| Voltages | `ZTX` | `3564,1011,11420\n` (mV: VCC1, VCC2, VMAIN) |
| Submit work | `ZDX` + 60-byte packet | `OK\n` |
| Poll result | `ZFX` | `IDLE\n` / `B\n` / `NONCE-FOUND:<hex>\n` / `NO-NONCE\n` |
| Nonce range | `ZPX` + 68-byte packet | `OK\n` |
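The ASCII responses in the table are simple enough to parse with the standard library. The sketch below is illustrative and is not the code in `protocol/responses.py`; the function names are hypothetical. It also makes two assumptions: that `B` means the device is still busy, and that multiple nonces, if present, are comma-separated after `NONCE-FOUND:`.

```python
import re


def parse_temperature(line: str) -> tuple:
    """Parse 'Temp1: 30, Temp2: 30' into a pair of floats."""
    match = re.fullmatch(
        r"Temp1:\s*(\d+(?:\.\d+)?),\s*Temp2:\s*(\d+(?:\.\d+)?)", line.strip()
    )
    if match is None:
        raise ValueError(f"unexpected temperature response: {line!r}")
    return float(match.group(1)), float(match.group(2))


def parse_voltages(line: str) -> dict:
    """Parse '3564,1011,11420' (millivolts) into volts keyed by rail."""
    fields = line.strip().split(",")
    if len(fields) != 3:
        raise ValueError(f"unexpected voltage response: {line!r}")
    vcc1, vcc2, vmain = (int(field) / 1000 for field in fields)
    return {"VCC1": vcc1, "VCC2": vcc2, "VMAIN": vmain}


def parse_poll(line: str) -> tuple:
    """Classify a ZFX poll response as (state, nonces)."""
    text = line.strip()
    if text == "IDLE":
        return "idle", []
    if text == "B":
        return "busy", []          # assumption: 'B' signals work in progress
    if text == "NO-NONCE":
        return "done", []
    if text.startswith("NONCE-FOUND:"):
        payload = text[len("NONCE-FOUND:"):]
        # Assumption: several nonces may be listed, separated by commas.
        return "done", [int(item, 16) for item in payload.split(",")]
    raise ValueError(f"unexpected poll response: {line!r}")


print(parse_temperature("Temp1: 30, Temp2: 30\n"))
print(parse_voltages("3564,1011,11420\n"))
print(parse_poll("NONCE-FOUND:0000abcd\n"))
```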
Work packet format (60 bytes): `>>>>>>>> [32-byte midstate] [12-byte tail] >>>>>>>>`
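The framing adds up as 8 + 32 + 12 + 8 = 60 bytes. A minimal sketch of assembling such a packet follows; `build_work_packet` is a hypothetical name (the project's own builder lives in `protocol/`), and the all-zero midstate and tail are placeholders rather than real work.

```python
FRAME = b">>>>>>>>"  # 8-byte marker at both ends of the packet


def build_work_packet(midstate: bytes, tail: bytes) -> bytes:
    """Frame a 32-byte midstate and 12-byte tail into a 60-byte work packet."""
    if len(midstate) != 32:
        raise ValueError(f"midstate must be 32 bytes, got {len(midstate)}")
    if len(tail) != 12:
        raise ValueError(f"tail must be 12 bytes, got {len(tail)}")
    return FRAME + midstate + tail + FRAME


# Placeholder payload: real work would carry an actual SHA-256 midstate.
packet = build_work_packet(bytes(32), bytes(12))
print(len(packet), packet[:8], packet[-8:])
```

Validating the lengths up front turns a malformed submission into an immediate Python error instead of a silent device stall.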
```bash
python -m pytest tests/ -q
```

783 tests. All tests run against the simulator, so no hardware is needed. Test coverage includes protocol encoding/decoding, transport lifecycle, simulator state machine, device API round-trips, CLI smoke tests, statistical accumulators, dynamics algorithms, NIST SP 800-22 tests (with reference p-values from the spec as regression anchors), and visualization. Heavy ML training tests are marked slow; the default fast run (`pytest -m "not slow"`, 781 tests) excludes them. The ML subsystem requires `pip install -e ".[ml]"`; its tests skip cleanly when torch is absent.
```python
import asyncio

from bfl_asic import AsyncBFLDevice, BFLDevice
from bfl_asic.transport.serial import SerialTransport
from bfl_asic.transport.simulator import SimulatorTransport

# Real hardware
with BFLDevice(SerialTransport(port="COM3")) as dev:
    info = dev.identify()
    temp = dev.get_temperature()
    volts = dev.get_voltage()
    nonces = dev.hash_data(b"hello world")

# Simulator
with BFLDevice(SimulatorTransport()) as dev:
    info = dev.identify()

# Async: `async with` must run inside a coroutine
async def main():
    async with AsyncBFLDevice(SimulatorTransport()) as dev:
        async for nonces in dev.hash_stream(count=100):
            print(nonces)

asyncio.run(main())
```

```python
# Statistical analysis
from bfl_asic.stats import StatsPipeline

pipeline = StatsPipeline()
snapshot = pipeline.run(samples=100_000)
snapshot.save("results.json")
print(f"Max bias: {snapshot.bit_frequency['max_bias']}")
print(f"Mean Hamming: {snapshot.avalanche['mean']}")
print(f"Entropy: {snapshot.entropy['shannon_entropy']}")
```

```python
# Iterated hash dynamics
import hashlib

from bfl_asic.dynamics import brent_detect, compute_orbit
from bfl_asic.dynamics.orbit import sha256_iterate

# Truncated hash (3 bytes, zero-padded) so that cycles are reachable
def toy_hash(v: bytes) -> bytes:
    return hashlib.sha256(v).digest()[:3].ljust(32, b'\x00')

cycle = brent_detect(b'\x00' * 32, max_steps=1_000_000, hash_fn=toy_hash)
if cycle:
    print(f"Cycle length: {cycle.cycle_length}, Tail: {cycle.tail_length}")
```

```python
# NIST SP 800-22 randomness validation
from bfl_asic.randomness import RandomnessBattery
from bfl_asic.stats.engine import SoftwareHashEngine

battery = RandomnessBattery(engine=SoftwareHashEngine())
snapshot = battery.run(hash_count=1000)  # 256,000 bits
print(f"Passed: {snapshot.pass_count}/{len(snapshot.results)}")
for r in snapshot.results:
    print(f"  {r['name']:<28} p={r['p_value']:.4f} "
          f"{'PASS' if r['passed'] else 'FAIL'}")
```

If you're using this toolkit to learn SHA-256 and cryptography from first principles, see LEARNING.md for a six-week study path that pairs each subsystem with free Coursera and YouTube lecture material.
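For readers on that study path, the cycle detection used in the dynamics example can be sketched without the toolkit. This is a textbook version of Brent's algorithm, not the code in `dynamics/rho.py`: the function `brent` and its return shape are hypothetical, and the hash here is truncated to 2 bytes without padding, so its cycle will not match the 3-byte padded `toy_hash` used with `brent_detect`.

```python
import hashlib


def toy_hash(value: bytes, width: int = 2) -> bytes:
    """SHA-256 truncated to `width` bytes, giving a small state space."""
    return hashlib.sha256(value).digest()[:width]


def brent(f, x0, max_steps: int = 1_000_000):
    """Brent's cycle detection with O(1) memory.

    Returns (cycle_length, tail_length), or None if no cycle is found
    within `max_steps` evaluations in the first phase.
    """
    # Phase 1: find the cycle length by searching in power-of-two windows.
    power = lam = 1
    tortoise = x0
    hare = f(x0)
    steps = 1
    while tortoise != hare:
        if steps >= max_steps:
            return None
        if power == lam:       # start a new, twice as long window
            tortoise = hare
            power *= 2
            lam = 0
        hare = f(hare)
        lam += 1
        steps += 1

    # Phase 2: find the tail length with two pointers `lam` steps apart.
    tortoise = hare = x0
    for _ in range(lam):
        hare = f(hare)
    mu = 0
    while tortoise != hare:
        tortoise = f(tortoise)
        hare = f(hare)
        mu += 1
    return lam, mu


x0 = b"\x00\x00"
result = brent(toy_hash, x0)
if result is not None:
    cycle_length, tail_length = result
    print(f"Cycle length: {cycle_length}, Tail: {tail_length}")
```

With a 16-bit state space the orbit must repeat within 65,536 steps, which is why truncation makes cycles reachable while full 256-bit SHA-256 does not.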
MIT — see LICENSE. The published results dataset (huggingface.co/datasets/bshepp/round-reduced-sha256-learnability) is MIT as well.