@txnlab/seedguard

Detect Algorand mnemonic passphrases in source code to prevent accidental secret exposure.

Why this exists

Algorand encodes account secret keys as 25-word mnemonic phrases drawn from the BIP-39 wordlist. Unlike random strings, these phrases look like plain English and are easy to overlook in code review. A mnemonic that is accidentally committed to a repository can lead to catastrophic loss of funds.

Seedguard scans your codebase and detects these phrases with near-zero false positives by validating against the BIP-39 wordlist and verifying the Algorand checksum via algosdk.

Quick start

Scan your project in seconds:

npx @txnlab/seedguard .

Or with Bun:

bunx @txnlab/seedguard .

Requirements

  • Node.js >= 20
  • algosdk >= 3.0.0 (optional peer dependency — enables checksum verification for highest-confidence detection)

Installation

Package manager

# Install as a dev dependency
npm install -D @txnlab/seedguard
# or
bun add -d @txnlab/seedguard
# or
pnpm add -D @txnlab/seedguard

GitHub Action

Add this workflow to your repository:

name: Seedguard
on: [push, pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: TxnLab/seedguard@v1

The action runs on Node.js 20 and respects .guardignore files. When detections are found, it creates inline file annotations and a job summary table showing file, line, confidence, and a redacted preview.

Action inputs

Input              Description                                  Default
path               Path to scan                                 .
fail-on-detection  Fail the workflow if a mnemonic is detected  true

Action outputs

Output           Description
detection-count  Number of mnemonic detections found
results-json     JSON array of scan results

Warn-only mode

To report detections without failing the workflow:

- uses: TxnLab/seedguard@v1
  with:
    fail-on-detection: 'false'

Pre-commit hook

Add to your .pre-commit-config.yaml:

repos:
  - repo: https://github.com/TxnLab/seedguard
    rev: v1.0.0 # replace with latest version tag
    hooks:
      - id: seedguard

The hook requires Node.js on the system. It scans staged changes and blocks commits containing mnemonics.

Husky setup

Set up a pre-commit hook with husky automatically:

bunx @txnlab/seedguard init

This detects your package manager, installs husky (if needed), and creates a pre-commit hook that runs <exec-prefix> @txnlab/seedguard --staged --fail-on-detection (where <exec-prefix> is bunx, npx, or pnpm dlx based on your lockfile).

The init command appends to an existing .husky/pre-commit hook rather than overwriting it, and skips setup if seedguard is already configured.
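
The lockfile-based choice of exec prefix described above can be sketched as a pure function. This is an illustration only, not seedguard's source: the helper names pickExecPrefix and hookLine are hypothetical, and the lockfile names and their precedence are assumptions.

```typescript
// Hypothetical helpers: map lockfiles found in a project root to an exec prefix.
// Lockfile names and precedence are assumptions, not documented seedguard behavior.
type ExecPrefix = 'bunx' | 'pnpm dlx' | 'npx'

function pickExecPrefix(rootFiles: string[]): ExecPrefix {
  const files = new Set(rootFiles)
  if (files.has('bun.lock') || files.has('bun.lockb')) return 'bunx'
  if (files.has('pnpm-lock.yaml')) return 'pnpm dlx'
  return 'npx' // fall back to npm when no other lockfile is present
}

// Build the command line that the generated pre-commit hook would run.
function hookLine(rootFiles: string[]): string {
  return `${pickExecPrefix(rootFiles)} @txnlab/seedguard --staged --fail-on-detection`
}
```

Taking the file list as an argument keeps the sketch free of filesystem access, so it is easy to test in isolation.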

CLI reference

# Scan current directory
seedguard .

# Scan a specific path
seedguard ./src

# Scan only staged git changes (for use in hooks)
seedguard --staged

# Output as JSON (for CI consumption)
seedguard . --json

# Exit with 0 even if mnemonics are found (just print warnings)
seedguard . --warn-only

# Default behavior: exit with 1 if mnemonics are found
seedguard . --fail-on-detection

# Initialize husky pre-commit hook
seedguard init

# Show help
seedguard --help

# Show version
seedguard --version

Exit codes

Code  Meaning
0     No mnemonics detected (or --warn-only used)
1     Mnemonic(s) detected, or an error occurred
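
A minimal sketch of the exit-code table as a function. The name exitCodeFor is hypothetical, and the sketch assumes that an error still yields exit code 1 when --warn-only is set, which the table does not state explicitly.

```typescript
// Hypothetical helper mirroring the exit-code table above.
// Assumption: errors exit with 1 regardless of --warn-only.
function exitCodeFor(detectionCount: number, warnOnly: boolean, hadError: boolean): 0 | 1 {
  if (hadError) return 1
  if (detectionCount > 0 && !warnOnly) return 1
  return 0
}
```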

Output

Terminal output shows:

  • File path and line number
  • Detection confidence level (checksum verified vs. wordlist match)
  • A redacted preview of the mnemonic (first 3 words ... last word)
  • Summary count of detections

Seedguard never prints a full mnemonic, so the tool itself cannot leak secrets into CI logs.

Configuration

.guardignore

Create a .guardignore file to exclude files or directories from scanning. It uses the same syntax as .gitignore and can be placed at the project root or in any subdirectory — patterns apply cumulatively from each directory level:

# Exclude test fixtures
tests/fixtures/

# Exclude a specific file
src/wordlist.ts

# Exclude by pattern
*.generated.ts

Seedguard also respects your .gitignore patterns and automatically skips these directories: .git/, .hg/, .svn/, node_modules/, dist/, build/, .next/, vendor/, __pycache__/, coverage/.

Files larger than 1 MB and binary files are also skipped.
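
The built-in skip rules can be pictured with a small predicate. This is a sketch under stated assumptions, not seedguard's implementation: shouldSkip is a hypothetical name, "1 MB" is read as 1,048,576 bytes, skipped directory names are assumed to match at any depth, and .gitignore, .guardignore, and binary detection are left out.

```typescript
// Directory names come from the list above; the helper itself is hypothetical.
const SKIPPED_DIRS = new Set([
  '.git', '.hg', '.svn', 'node_modules', 'dist',
  'build', '.next', 'vendor', '__pycache__', 'coverage',
])

// Assumption: "1 MB" means 1024 * 1024 bytes.
const MAX_FILE_BYTES = 1024 * 1024

function shouldSkip(relativePath: string, sizeInBytes: number): boolean {
  // Only directory segments are checked, so a file named "build.ts" is still scanned.
  const dirs = relativePath.split('/').slice(0, -1)
  if (dirs.some((dir) => SKIPPED_DIRS.has(dir))) return true
  return sizeInBytes > MAX_FILE_BYTES
}
```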

How detection works

Seedguard uses a multi-stage pipeline to minimize false positives:

  1. Tokenization - Extracts candidate sequences of 25 words from input text. Handles mnemonics that are split across lines, quoted, comma-separated, or embedded in various formats (.env, JSON, template literals), and normalizes Unicode homoglyphs (Cyrillic/Greek look-alike characters).

  2. Wordlist check - Verifies that all 25 words appear in the 2,048-word BIP-39 list used by Algorand. A random 25-word English sentence in which every word happens to be on that list is astronomically unlikely.

  3. Checksum validation - Verifies the Algorand checksum (the 25th word) via algosdk. A valid checksum means this is almost certainly a real mnemonic.

  4. Context boosting - If nearby text contains keywords like mnemonic, passphrase, secret, seed, private, .env, process.env, secret_key, private_key, or seed_phrase (case-insensitive), confidence is boosted even for partial matches.
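
The tokenization, wordlist, and context stages above can be sketched roughly as follows. This is not seedguard's code: the function names are hypothetical, STAND_IN_WORDLIST holds only the first five BIP-39 words in place of the full 2,048-word list, and homoglyph normalization and checksum validation are omitted.

```typescript
// Tiny stand-in for the 2,048-word BIP-39 list, for illustration only.
const STAND_IN_WORDLIST = new Set(['abandon', 'ability', 'able', 'about', 'above'])

// Keywords taken from the context-boosting step above.
const CONTEXT_KEYWORDS = [
  'mnemonic', 'passphrase', 'secret', 'seed', 'private',
  '.env', 'process.env', 'secret_key', 'private_key', 'seed_phrase',
]

// Split on anything that is not a letter, so quotes, commas, and newlines all act as separators.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z]+/).filter((word) => word.length > 0)
}

// Slide a fixed-size window over the tokens and count wordlist hits in each window.
function countWordlistHits(tokens: string[], windowSize: number, wordlist: Set<string>): number[] {
  const hits: number[] = []
  for (let start = 0; start + windowSize <= tokens.length; start++) {
    const window = tokens.slice(start, start + windowSize)
    hits.push(window.filter((word) => wordlist.has(word)).length)
  }
  return hits
}

// Case-insensitive substring search for any context keyword.
function hasContextKeyword(text: string): boolean {
  const lower = text.toLowerCase()
  return CONTEXT_KEYWORDS.some((keyword) => lower.includes(keyword))
}
```

A real scan would use a window size of 25 and the full wordlist; the smaller window in a test simply keeps sample input short.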

Confidence levels

Level              Meaning
checksum-verified  All 25 words match BIP-39 AND the Algorand checksum is valid. Almost certainly a real mnemonic.
wordlist-match     All 25 words match BIP-39 but checksum wasn't verified (algosdk not installed or checksum failed).
partial-match      24/25 words match with a context keyword nearby.
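
Read as a decision rule, the confidence table might look like the sketch below. The CandidateFacts shape and the classify function are hypothetical names introduced for illustration; only the three confidence values come from the library's types.

```typescript
type Confidence = 'checksum-verified' | 'wordlist-match' | 'partial-match'

// Hypothetical summary of what earlier pipeline stages learned about one candidate.
interface CandidateFacts {
  wordlistMatches: number        // how many of the 25 words are on the BIP-39 list
  checksumValid: boolean | null  // null when algosdk is not installed
  contextNearby: boolean         // whether a context keyword was found nearby
}

// Returns null when the candidate does not qualify as a detection at all.
function classify(facts: CandidateFacts): Confidence | null {
  if (facts.wordlistMatches === 25) {
    return facts.checksumValid === true ? 'checksum-verified' : 'wordlist-match'
  }
  if (facts.wordlistMatches === 24 && facts.contextNearby) return 'partial-match'
  return null
}
```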

Library usage

Use seedguard programmatically in your own tools:

import { detectMnemonics, scanDirectory, scanGitDiff } from '@txnlab/seedguard'

// Detect mnemonics in a string
const detections = await detectMnemonics(someText)
for (const detection of detections) {
  console.log(`Line ${detection.line}: ${detection.confidence}`)
  console.log(`  ${detection.redacted}`) // never logs the full mnemonic
}

// Scan a directory
const { results, guardignoreFiles } = await scanDirectory('./my-project')
for (const file of results) {
  console.log(`${file.file}: ${file.detections.length} detection(s)`)
}

// Scan a git diff (returns ScanResult[] directly)
const diffResults = await scanGitDiff(gitDiffString)

API

Types

type Confidence = 'checksum-verified' | 'wordlist-match' | 'partial-match'

interface DetectionResult {
  line: number              // Line number (1-indexed)
  confidence: Confidence
  wordlistMatches: number   // Words matching BIP-39 (out of 25)
  redacted: string          // Safe preview: "word1 word2 word3 ... word25"
  contextBoosted: boolean   // Whether context keywords were found nearby
}

interface ScanResult {
  file: string              // Relative file path
  detections: DetectionResult[]
}

interface ScanSummary {
  results: ScanResult[]
  guardignoreFiles: string[] // .guardignore files found during scan
}

Note: DetectionResult does not include the raw mnemonic phrase by design — only the redacted preview is ever returned.

detectMnemonics(text: string): Promise<DetectionResult[]>

Detects Algorand mnemonics in a text string.

scanDirectory(dir: string): Promise<ScanSummary>

Scans all text files in a directory tree. Respects .gitignore and .guardignore patterns.

scanGitDiff(diff: string): Promise<ScanResult[]>

Scans only the added lines in a git diff string. Returns a flat array since .guardignore does not apply to diff scanning.
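
To illustrate what "only the added lines" means, here is a sketch that pulls added lines and their new-file line numbers out of a diff. It is an assumption-laden illustration rather than seedguard's parser: addedLines and AddedLine are hypothetical names, and the input is assumed to be standard git diff output with "diff --git" headers.

```typescript
interface AddedLine {
  file: string
  line: number // 1-indexed line number in the post-change file
  text: string
}

function addedLines(diff: string): AddedLine[] {
  const out: AddedLine[] = []
  let file = ''
  let newLine = 0
  let inHunk = false
  for (const raw of diff.split('\n')) {
    if (raw.startsWith('diff --git ')) {
      inHunk = false // a new file section begins
      continue
    }
    if (!inHunk && raw.startsWith('+++ ')) {
      // "+++ b/path" names the post-change file; strip git's default "b/" prefix
      file = raw.slice(4).replace(/^b\//, '')
      continue
    }
    const hunk = /^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@/.exec(raw)
    if (hunk) {
      newLine = Number(hunk[1]) // first new-file line number covered by this hunk
      inHunk = true
      continue
    }
    if (!inHunk) continue
    if (raw.startsWith('+')) {
      out.push({ file, line: newLine, text: raw.slice(1) })
      newLine++
    } else if (raw.startsWith(' ')) {
      newLine++ // context lines exist in both versions
    }
    // lines starting with "-" exist only in the old file and do not advance newLine
  }
  return out
}
```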

redactMnemonic(phrase: string): string

Returns a redacted version showing only the first 3 words and last word ("word1 word2 word3 ... word25").
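
The redaction rule is simple enough to sketch. The function below is a hypothetical stand-in named redactPreview, not the library's redactMnemonic; its handling of inputs with four or fewer words is an assumption.

```typescript
// Hypothetical stand-in for the redaction rule: first 3 words, "...", last word.
function redactPreview(phrase: string): string {
  const words = phrase.trim().split(/\s+/)
  // Assumption: with 4 or fewer words a preview would reveal everything, so show nothing.
  if (words.length <= 4) return '...'
  return `${words.slice(0, 3).join(' ')} ... ${words[words.length - 1]}`
}
```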

formatTerminal(results: ScanResult[], guardignoreFiles?: string[]): string

Formats scan results as a colored terminal string with file paths, line numbers, confidence levels, and a summary count.

formatJSON(results: ScanResult[], guardignoreFiles?: string[]): string

Returns a JSON string with { results, guardignoreFiles }.
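
A CI script consuming this JSON might tally detections by confidence, as in the sketch below. The type declarations repeat the shapes listed under Types; countByConfidence and JsonReport are hypothetical names, and the sketch assumes the JSON matches the { results, guardignoreFiles } shape exactly.

```typescript
type Confidence = 'checksum-verified' | 'wordlist-match' | 'partial-match'

interface DetectionResult {
  line: number
  confidence: Confidence
  wordlistMatches: number
  redacted: string
  contextBoosted: boolean
}

interface ScanResult {
  file: string
  detections: DetectionResult[]
}

// Assumed shape of the JSON output: { results, guardignoreFiles }.
interface JsonReport {
  results: ScanResult[]
  guardignoreFiles: string[]
}

// Hypothetical helper: tally detections per confidence level.
function countByConfidence(json: string): Record<Confidence, number> {
  const report = JSON.parse(json) as JsonReport
  const counts: Record<Confidence, number> = {
    'checksum-verified': 0,
    'wordlist-match': 0,
    'partial-match': 0,
  }
  for (const file of report.results) {
    for (const detection of file.detections) counts[detection.confidence]++
  }
  return counts
}
```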

formatGitHubAnnotations(results: ScanResult[]): string[]

Returns an array of GitHub Actions ::error annotation strings for inline PR annotations.
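
For orientation, a GitHub Actions error annotation is a workflow command of the form ::error file=...,line=...::message. The sketch below builds such strings; toAnnotations is a hypothetical name, the message wording is an assumption, and the escaping follows the rules GitHub's workflow commands use for property and message values.

```typescript
interface Detection {
  line: number
  confidence: string
  redacted: string
}

interface FileResult {
  file: string
  detections: Detection[]
}

// Escape message data for workflow commands.
function escapeData(value: string): string {
  return value.replace(/%/g, '%25').replace(/\r/g, '%0D').replace(/\n/g, '%0A')
}

// Property values additionally need ":" and "," escaped.
function escapeProperty(value: string): string {
  return escapeData(value).replace(/:/g, '%3A').replace(/,/g, '%2C')
}

// Hypothetical formatter: one ::error command per detection. Message text is an assumption.
function toAnnotations(results: FileResult[]): string[] {
  const lines: string[] = []
  for (const result of results) {
    for (const detection of result.detections) {
      const message = `Possible Algorand mnemonic (${detection.confidence}): ${detection.redacted}`
      lines.push(`::error file=${escapeProperty(result.file)},line=${detection.line}::${escapeData(message)}`)
    }
  }
  return lines
}
```

Printing each returned string to standard output inside a workflow step is what makes GitHub attach the annotation to the given file and line.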

Troubleshooting

Why does confidence show wordlist-match instead of checksum-verified?

Either the algosdk package is not installed or the phrase failed checksum validation. Most Algorand projects already depend on algosdk. If yours doesn't, install it to enable checksum verification.

How do I suppress a false positive?

Add the file or pattern to a .guardignore file. It uses .gitignore syntax.

Why was a mnemonic not detected?

A partial match (24/25 words) requires a context keyword nearby (e.g., mnemonic, secret, seed) to trigger detection. Encoded mnemonics (Base64, hex, ROT13) are also not detected.

Does this work with monorepos?

Yes. Scan specific paths: seedguard ./packages/my-app

Which integration should I use?

Use the CLI (npx/bunx) for one-off scans, Husky or pre-commit for local commit hooks, the GitHub Action for CI, and the library API for custom tooling.

Contributing

  1. Fork the repository
  2. Create a feature branch (git checkout -b feature/my-feature)
  3. Make your changes
  4. Run tests (bun test)
  5. Run the build (bun run build)
  6. Commit and push
  7. Open a pull request

Development

# Install dependencies
bun install

# Run tests
bun test

# Build
bun run build

# Lint
bun run lint

# Format
bun run format

License

MIT
