Voicefield

Phone-powered voice input for any desktop text field. Turn your phone into a wireless microphone for any web application.

Scan a QR code → speak into your phone → text appears in the web field. Real-time, open source, self-hostable.

How it works

┌─────────────────────────────┐
│  voicefield.dev / your host │  Static phone page (no data stored)
└────────────┬────────────────┘
             │ loads phone page
             ▼
┌────────────────────┐         ┌──────────────────────┐
│  Phone browser     │  POST   │  Your server         │
│  STT runs here     │────────▶│  @voicefield/server  │
│  (client-side)     │  text   │  (relay only)        │
└────────────────────┘         └──────────┬───────────┘
                                          │ SSE
                                          ▼
                               ┌──────────────────────┐
                               │  Desktop browser     │
                               │  @voicefield/react   │
                               └──────────────────────┘

Works out of the box — uses the browser's built-in Web Speech API, no API key needed
Upgrade to Soniox — for higher accuracy, add a Soniox API key (or bring your own STT provider)
No audio leaves the phone — STT runs client-side, your server only relays text
Phone page is static — defaults to voicefield.dev, or self-host on your own domain
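
The relay role described above can be pictured as a tiny publish/subscribe hub held in memory. The sketch below is illustrative only and is not the code inside @voicefield/server; the `TextRelay` class and its method names are assumptions made up for this example. The desktop side subscribes to a session, and the phone side publishes text to it.

```typescript
// Hypothetical in-memory relay. Names are illustrative, not Voicefield's API.
type TranscriptListener = (text: string) => void

class TextRelay {
  // sessionId -> listeners for that session (the desktop side)
  private listeners = new Map<string, Set<TranscriptListener>>()

  // Desktop subscribes. In a real server this would back an SSE stream.
  // Returns a function that removes the subscription again.
  subscribe(sessionId: string, listener: TranscriptListener): () => void {
    let set = this.listeners.get(sessionId)
    if (!set) {
      set = new Set<TranscriptListener>()
      this.listeners.set(sessionId, set)
    }
    const owned = set
    owned.add(listener)
    return () => {
      owned.delete(listener)
      // Drop the session entry once nobody is listening any more.
      if (owned.size === 0 && this.listeners.get(sessionId) === owned) {
        this.listeners.delete(sessionId)
      }
    }
  }

  // Phone publishes text. In a real server this would be the POST handler.
  // Returns how many listeners received the text.
  publish(sessionId: string, text: string): number {
    const set = this.listeners.get(sessionId)
    if (!set) return 0
    set.forEach((listener) => listener(text))
    return set.size
  }
}
```

Nothing is written to disk in this sketch, which mirrors the "relay only" idea: text that arrives while no desktop is subscribed is simply dropped.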

Quick Start

npm install @voicefield/react @voicefield/server

No API key needed — works immediately with the browser's built-in speech recognition.

1. Add API route (Next.js App Router)

// app/api/voice/[...voicefield]/route.ts
import { createVoicefieldHandler } from '@voicefield/server'

const { GET, POST, OPTIONS } = createVoicefieldHandler({
  cors: { origins: ['https://voicefield.dev'] },
})

export { GET, POST, OPTIONS }

Want higher accuracy? Add Soniox or another cloud STT provider. See Upgrading to a cloud STT provider.
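
The `cors.origins` entry in the route above matters because the phone page is served from voicefield.dev while its API calls go to your server, which makes them cross-origin requests. As a rough sketch of what an origin allowlist does (the helper name `corsHeadersFor` and the exact header set are assumptions, not part of @voicefield/server):

```typescript
// Hypothetical helper showing how an origin allowlist can be applied.
// It is not the implementation used by createVoicefieldHandler.
function corsHeadersFor(
  requestOrigin: string | null,
  allowedOrigins: string[],
): Record<string, string> {
  // Unknown or missing origins get no CORS headers, so the browser blocks them.
  if (requestOrigin === null || allowedOrigins.indexOf(requestOrigin) === -1) {
    return {}
  }
  return {
    // Echo the matched origin rather than using "*".
    'Access-Control-Allow-Origin': requestOrigin,
    'Access-Control-Allow-Methods': 'GET, POST, OPTIONS',
    'Access-Control-Allow-Headers': 'Content-Type',
    // Tell caches that the response depends on the Origin header.
    Vary: 'Origin',
  }
}
```

If you self-host the phone page, the allowlist would need to name that origin instead of https://voicefield.dev.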

2. Mount the phone page (for local dev)

// app/mic/page.tsx
"use client"
export { Mic as default } from "@voicefield/react/phone"

3. Use in your component

import { useRef } from 'react'
import { useVoicefield, QRPopup } from '@voicefield/react'

function MyComponent() {
  const inputRef = useRef<HTMLInputElement>(null)

  const vf = useVoicefield({
    serverUrl: '/api/voice',
    language: 'en',
  })

  vf.register('search', 'Search', inputRef)

  return (
    <>
      <input ref={inputRef} />
      <button onClick={() => vf.showQR()}>🎤</button>
      <QRPopup
        pairingCode={vf.pairingCode}
        secret={vf.secret}
        serverUrl={vf.serverUrl}
        phoneUrl={vf.phoneUrl}
        isVisible={vf.isQRVisible}
        onClose={vf.hideQR}
      />
    </>
  )
}

That's it. 3 files, and any web field has voice input.

Example App

A working example lives in apps/example/:

pnpm install && pnpm build
cd apps/example && pnpm dev

Works immediately with Web Speech API. For Soniox, copy .env.local.example and add your key.

Testing with a phone (ngrok)

Phones need HTTPS for microphone access. Use ngrok to expose your local dev server:

# Terminal 1: start the example app
cd apps/example && pnpm dev   # runs on http://localhost:3000

# Terminal 2: expose via ngrok
ngrok http 3000

Open the ngrok HTTPS URL on your desktop, scan the QR code with your phone, and speak.

Upgrading to a Cloud STT Provider

The built-in Web Speech API is enough for many use cases. For higher accuracy or more language support, add a cloud provider like Soniox:

npm install @soniox/node

// app/api/voice/[...voicefield]/route.ts
import { createVoicefieldHandler } from '@voicefield/server'
import { SonioxNodeClient } from '@soniox/node'

const soniox = new SonioxNodeClient({ api_key: process.env.SONIOX_API_KEY! })

const { GET, POST, OPTIONS } = createVoicefieldHandler({
  generateSttKey: async () => {
    const result = await soniox.auth.createTemporaryKey({
      usage_type: 'transcribe_websocket',
      expires_in_seconds: 1800,
    })
    return { temporaryApiKey: result.api_key, expiresAt: Date.now() + 1800_000 }
  },
  cors: { origins: ['https://voicefield.dev'] },
})

export { GET, POST, OPTIONS }

The provider is selected automatically — if generateSttKey is configured, the phone uses Soniox. Otherwise, it falls back to the browser's Web Speech API. You can also build your own provider.
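
The selection rule in the paragraph above is simple enough to write down. The following is a sketch under assumptions: the type names, `pickProvider`, and `isKeyUsable` are invented for illustration, and only the `generateSttKey`, `temporaryApiKey`, and `expiresAt` names come from the example route.

```typescript
// Shapes taken from the route example; everything else here is hypothetical.
type SttKey = { temporaryApiKey: string; expiresAt: number }
type HandlerConfig = { generateSttKey?: () => Promise<SttKey> }

// A configured key generator means the cloud provider is used;
// otherwise the phone falls back to the browser's Web Speech API.
function pickProvider(config: HandlerConfig): 'soniox' | 'web-speech' {
  return typeof config.generateSttKey === 'function' ? 'soniox' : 'web-speech'
}

// Temporary keys expire, so a client would likely want to check the
// remaining lifetime (with a small safety margin) before reusing one.
function isKeyUsable(
  key: SttKey,
  now: number = Date.now(),
  safetyMarginMs: number = 30_000,
): boolean {
  return key.expiresAt - now > safetyMarginMs
}
```

With the 1800-second lifetime from the route example, a key issued at time 0 would stop counting as usable 30 seconds before the 30-minute mark under this margin.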

Packages

Package	Description
`@voicefield/core`	Types and utilities (zero deps)
`@voicefield/react`	React hook + QR popup + phone page
`@voicefield/server`	Next.js API route handler (relay)

Deployment Modes

Mode	Phone page	Server	HTTPS	Setup effort	Notes
Local (LAN)	Your `/mic` page	localhost	Not needed	Zero	Desktop mic only — phones need HTTPS
ngrok	voicefield.dev	ngrok tunnel	Automatic	1 command	Phone mic works, best for dev
mkcert	Your `/mic` page	localhost + cert	Manual	Phone CA install	Phone mic works
Production	voicefield.dev	Your domain	Let's Encrypt	Standard deploy	Phone mic works
Self-hosted	Your domain	Your domain	Let's Encrypt	Deploy both	Phone mic works

Local development (no tunnel needed)

For local dev, mount the phone page in your app and let Voicefield auto-detect your LAN IP:

const vf = useVoicefield({
  serverUrl: '/api/voice',
  phoneUrl: '',        // local mode — uses your server's /mic page
  language: 'en',
})

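The QR code points to http://192.168.x.x:PORT/mic, which the phone can open over WiFi.

Important: browsers only expose the microphone in a secure context (HTTPS or localhost), so a phone that loads the page over plain HTTP on a LAN address cannot record audio. This mode therefore only works for desktop-to-desktop testing (mic in the same browser). To test with a phone, use ngrok or the default production mode instead: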

ngrok http 3000

Then open the ngrok HTTPS URL on your desktop. The QR code will automatically point the phone to the HTTPS tunnel.
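
To see why the LAN URL fails on a phone while localhost works on the desktop, it helps to spell out the secure-context rule. In a browser, `window.isSecureContext` is the authoritative check; the function below is only an approximation for reasoning about URLs, and its name is an assumption for this example.

```typescript
// Rough approximation of the browser's secure-context rule for a page URL.
// In real client code, prefer checking window.isSecureContext directly.
function isPotentiallySecureOrigin(pageUrl: string): boolean {
  const { protocol, hostname } = new URL(pageUrl)
  if (protocol === 'https:') return true
  // Loopback hosts are treated as trustworthy even over plain HTTP.
  return (
    hostname === 'localhost' ||
    hostname.endsWith('.localhost') ||
    hostname === '127.0.0.1' ||
    hostname === '[::1]'
  )
}
```

A LAN address such as http://192.168.1.20:3000/mic fails this check, which is why the tunnel or an HTTPS phone page is needed for phone microphones.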

Production (hosted phone page)

const vf = useVoicefield({
  serverUrl: '/api/voice',
  // phoneUrl defaults to https://voicefield.dev
  language: 'en',
})

The phone loads voicefield.dev/mic (static, open source); all API calls go to your server.

Security

Audio stays on the phone — STT runs client-side, only text is relayed
In-memory sessions — no database, no persistence, 30-min TTL
Cryptographic pairing — 256-bit secret in QR, 384-bit session token
Single-use codes — 6-digit pairing code deleted after use
Your server controls everything — STT keys generated on your infra, provider of your choice
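
The numbers in the list above (256-bit secret, 384-bit token, 6-digit single-use code, 30-minute TTL) can be made concrete with Node's `crypto` module. This is a minimal sketch, not Voicefield's session code: `PairingStore`, its methods, and the hex encoding are assumptions chosen for illustration.

```typescript
import { randomBytes, randomInt } from 'node:crypto'

const SESSION_TTL_MS = 30 * 60 * 1000 // 30-minute TTL

interface PairingSession {
  secret: string // 256 bits, hex-encoded (64 chars)
  token: string // 384 bits, hex-encoded (96 chars)
  expiresAt: number // epoch milliseconds
}

// Hypothetical in-memory store: no database, nothing persisted.
class PairingStore {
  private byCode = new Map<string, PairingSession>()

  create(now: number = Date.now()): { code: string; session: PairingSession } {
    // Draw 6-digit codes until one is not currently in use.
    let code: string
    do {
      code = randomInt(0, 1_000_000).toString().padStart(6, '0')
    } while (this.byCode.has(code))

    const session: PairingSession = {
      secret: randomBytes(32).toString('hex'),
      token: randomBytes(48).toString('hex'),
      expiresAt: now + SESSION_TTL_MS,
    }
    this.byCode.set(code, session)
    return { code, session }
  }

  // Single use: the code is deleted on the first lookup, valid or not.
  consume(code: string, now: number = Date.now()): PairingSession | undefined {
    const session = this.byCode.get(code)
    this.byCode.delete(code)
    if (!session || session.expiresAt <= now) return undefined
    return session
  }
}
```

Because the store is a plain `Map`, restarting the process discards every session, which is the trade-off that comes with having no persistence.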

See Security Model for the full threat model and design.

Documentation

Document	Description
Architecture	System design, data flow, design decisions
API Reference	All endpoints, request/response shapes, error codes
Security	Threat model, auth flow, crypto primitives
Deployment	Detailed setup for all deployment modes
Troubleshooting	Common issues and fixes
Contributing	Dev setup, branching, code style, testing

How-To Guides

Guide	Description
Add voice to Next.js	Step-by-step integration
Multi-field forms	Register multiple fields, field switching
Controlled inputs	Setter function pattern for React state
Custom STT provider	Replace Soniox with another STT
Self-host phone page	Deploy your own phone page

Why this architecture?

Why not just use the browser's SpeechRecognition API? That's exactly what Voicefield does by default — but with a twist: it runs on the phone's browser (better mic hardware) and relays only text to the desktop. For higher accuracy, you can upgrade to a cloud STT provider like Soniox without changing any client code.

Why a relay server? The phone needs a way to send transcripts to the desktop. The relay is minimal — in-memory, no persistence, only text passes through. When using a cloud STT provider, the server also generates temporary API keys.
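
On the desktop side the relay speaks Server-Sent Events, whose wire format is plain text: optional `event:` and `data:` lines followed by a blank line. The sketch below shows how one transcript might be framed; the `TranscriptEvent` shape and the `transcript` event name are assumptions for illustration, not the documented Voicefield payload.

```typescript
// Hypothetical payload shape for one relayed transcript.
interface TranscriptEvent {
  fieldId: string
  text: string
  isFinal: boolean
}

// Encode one SSE frame. JSON.stringify escapes newlines inside strings,
// so the payload always fits on a single "data:" line.
function formatSseEvent(event: string, data: TranscriptEvent): string {
  return `event: ${event}\ndata: ${JSON.stringify(data)}\n\n`
}
```

A browser `EventSource` listening for that event name would receive the JSON string as `event.data` and could parse it before writing the text into the target field.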

Why voicefield.dev? The phone page needs HTTPS for microphone access. Rather than making every developer set up HTTPS locally, the phone loads its UI from voicefield.dev (static, open source) while making all API calls to your server. For production, you can self-host the phone page.

Development

# Clone and install
git clone https://github.com/tatargabor/voicefield.git
cd voicefield
pnpm install

# Build all packages
pnpm build

# Run example app (works immediately, no API key needed)
cd apps/example && pnpm dev

Testing & Linting

pnpm test           # unit tests (vitest)
pnpm lint           # eslint
pnpm format         # prettier
pnpm format:check   # check formatting

# E2E tests
cd apps/example && npx playwright test

Publishing

./scripts/publish.sh patch   # bump all → build → npm publish → git tag → GitHub release
./scripts/publish.sh minor
./scripts/publish.sh major
./scripts/publish.sh --dry-run patch  # preview without changes

All packages use lockstep versioning. Publishing requires a clean working tree, the gh CLI, and npm auth.

License

MIT
