(Also known as @screwyprof on GitHub)
Specialties: distributed systems, reliability engineering, system scalability, blockchain infrastructure, and fault-tolerant architecture.
I design and lead the build of fault-tolerant systems that keep critical platforms online when everything else fails. With over two decades in finance, blockchain, and high-scale e-commerce, I help organizations eliminate fragility and protect revenue through resilient engineering.
I eliminate the million-dollar risks that keep CTOs awake at 3 AM: system failures that lose revenue, infrastructure that can't scale, and teams that can't deliver reliably.
I focus on making reliability systematic — not reactive. Over 20 years, I've led transformations that made financial, e-commerce, and blockchain platforms scale seamlessly and recover automatically, blending technical leadership with deep engineering discipline.
Consistent pattern of turning fragile systems into resilient infrastructure:
- ✅ ConsenSys MetaMask Staking ($2+ billion assets) → Helped build a fault-tolerant backend for MetaMask Staking and institutional validators, focusing on observability, latency reduction, and safe operations.
- ✅ Lazada (SEA e-commerce, Alibaba Group) → Helped design and optimize a product-catalog microservice in Go serving 16k+ RPS per instance; promoted Clean Architecture and TDD practices.
- ✅ Regulated trading system → Transformed a fragile, monolithic trading system into an event-driven, 99.99% uptime platform using DDD, CQRS, and Event Sourcing — establishing reliability standards adopted across the team.
- ✅ Distributed RNG protocol → Pioneered on-chain random number generation using a cryptographic protocol with ElGamal encryption and Tendermint validators.
Resolved critical OOM failures that caused validator restarts every 2 hours and blocked institutional upgrades. Root cause analysis revealed an architectural flaw in the validator management code: every single-validator update triggered an O(n) decryption of all keys. I reduced this to O(1), boosting attestation success rates to ~99% and unblocking operations at scale for tier-1 staking providers. Problem analysis and solution.
Solved a fundamental scaling limitation that prevented multi-client operations. Introduced a per-validator relay architecture with OpenTelemetry instrumentation, delivering a ~26% latency improvement. Proposal and solution.
Integrated Tendermint Proof-of-Stake consensus into go-ethereum before Ethereum's PoS transition. Contributed to consensus engine architecture for decentralized risk markets with delegated PoS and 1-second block times.
Built backend infrastructure for self-custodial Ethereum Staking powering MetaMask and institutional clients — $2+ billion in assets across 33,000+ validators. Documentation and API reference.
Contributed to the migration of the product-catalog domain from a legacy PHP monolith to distributed Go microservices handling 16k+ RPS per instance across six markets. Advocated Clean Architecture and TDD practices across teams, improving reliability and maintainability at scale.
Transformed a fragile, monolithic trading system into an event-driven, 99.99% uptime platform using DDD, CQRS, and Event Sourcing. Introduced CI/CD pipelines and structured on-call processes, establishing reliability standards for the engineering organization.
Personal projects and experiments showcasing systematic approaches to reliability, architecture, and testing patterns.
High-performance Tezos delegation service with CQRS architecture achieving 15s full sync of 760k+ delegations and <1ms API responses. Demonstrates reliability patterns with 92% test coverage.
Complete DDD/CQRS/Clean Architecture implementation preventing costly financial errors. Features event-sourced aggregates, invariant enforcement, and full test coverage.
Rust CLI tool for managing Finder favorites on macOS, with ADR-based architectural reasoning and reproducible builds via Nix.
Event-sourced CQRS library featuring Given/When/Then DSL for business-readable domain specifications — a rare pattern in Go.
Form3 API client with GitHub-style HATEOAS pagination and executable documentation as acceptance tests.
Currently writing about reliability, architecture, and building systems that must not fail.
More to come — deep dives into scaling, process design, and the invisible layers of infrastructure.
📘 Articles
💬 Join the discussion on LinkedIn:
Building Infrastructure That Must Not Fail — LinkedIn Article
Exploring Principal Engineer, Staff Engineer, or Engineering Manager roles focused on architecture, reliability, and engineering excellence.
I guide teams through technical direction and process design — not management hierarchy — ensuring systems and practices meet the highest quality standards.
My passion is building infrastructure that must not fail and cultivating the engineering discipline that makes that possible.
If you're scaling critical systems or modernizing legacy infrastructure, let's discuss how I can help make them reliable and scalable.





