Browser-based AI Speaking Practice
Speako is a local-first application for practicing exam-style English speaking tests. It prioritizes user privacy, low latency, and a premium user experience by running AI models directly in your browser.
- 🔒 Privacy First: Voice data is processed locally on your device using Transformers.js.
- 🎨 Premium Design: A beautiful, distraction-free "Dark Glass" interface built with Pure CSS.
- 🧠 Smart Analysis:
  - CEFR Level Detection: ML-powered proficiency assessment using a fine-tuned DeBERTa model (robg/speako-cefr-deberta).
  - Grammar Check: Detects hedging, passive voice, and weak vocabulary.
  - Clarity Score: Real-time evaluation of speaking clarity.
  - Positive Reinforcement: Highlights strong vocabulary usage.
- ⚡️ Ultra-Low Latency: Instant feedback without server round-trips.
- 🚀 WebGPU Optimized: Uses hardware acceleration for fast in-browser inference, with automatic WASM fallback.
- 📱 PWA Support: Installable as a Progressive Web App with offline model caching.
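The WebGPU-with-WASM-fallback behaviour mentioned in the feature list can be sketched roughly as follows. This is a minimal illustration, not Speako's actual loader: the names `Backend`, `NavigatorLike`, `hasWebGpuApi`, and `selectBackend` are hypothetical, and the only real API assumed is `navigator.gpu.requestAdapter()` from the WebGPU specification, which resolves to `null` when no suitable adapter is available.

```typescript
// Hypothetical names throughout; only gpu.requestAdapter() mirrors a real browser API.
type Backend = "webgpu" | "wasm";

interface GpuLike {
  requestAdapter(): Promise<unknown | null>;
}

// Minimal structural view of `navigator`, so the logic can run outside a browser.
interface NavigatorLike {
  gpu?: GpuLike;
}

// Cheap synchronous check: is the WebGPU entry point exposed at all?
function hasWebGpuApi(nav: NavigatorLike): boolean {
  return nav.gpu !== undefined;
}

// Prefer WebGPU, but fall back to WASM if the API is missing,
// no adapter is returned, or the adapter request throws.
async function selectBackend(nav: NavigatorLike): Promise<Backend> {
  if (!nav.gpu) return "wasm";
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm";
  } catch {
    return "wasm";
  }
}
```

In a browser, the global `navigator` would be passed in (a cast may be needed if the project's DOM typings do not include `gpu`). Taking the navigator as a parameter keeps the decision easy to unit test.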
Speako is a pure frontend application with no backend server.
- Frontend: Vite + Preact + TypeScript
- Styling: Zero-dependency Pure CSS
- AI Models:
  - Speech Recognition: Xenova/whisper-base (running locally via ONNX)
  - CEFR Classification: robg/speako-cefr-deberta (fine-tuned DeBERTa)
- NLP: Compromise for grammar analysis
- State Management: Preact Signals for high-performance reactivity
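To give a feel for the kind of rule-based check the grammar analysis performs, here is a rough sketch of hedging detection. It deliberately uses plain regular expressions instead of Compromise so that it runs without dependencies; the phrase list and the names `HEDGE_PHRASES`, `HedgeMatch`, and `findHedges` are assumptions made for illustration and are not taken from the Speako source.

```typescript
// Illustrative phrase list only; the real checker's vocabulary is not documented here.
const HEDGE_PHRASES = ["i think", "maybe", "sort of", "kind of", "i guess", "probably"];

interface HedgeMatch {
  phrase: string;
  index: number; // character offset in the lower-cased transcript
}

// Finds whole-word occurrences of each hedge phrase, ordered by position.
function findHedges(transcript: string): HedgeMatch[] {
  const text = transcript.toLowerCase();
  const matches: HedgeMatch[] = [];
  for (const phrase of HEDGE_PHRASES) {
    // \b keeps "maybe" from matching inside longer words.
    const pattern = new RegExp(`\\b${phrase}\\b`, "g");
    let m: RegExpExecArray | null;
    while ((m = pattern.exec(text)) !== null) {
      matches.push({ phrase, index: m.index });
    }
  }
  return matches.sort((a, b) => a.index - b.index);
}
```

A library such as Compromise can match on part-of-speech tags rather than fixed strings, which is why it is better suited to passive-voice detection than a phrase list like this one.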
speako/
├── src/
│ ├── components/ # UI components (split by feature)
│ │ ├── session/ # Recording session components
│ │ └── validation/ # Validation interface components
│ ├── hooks/ # Custom hooks (useSessionManager, useValidation, etc.)
│ ├── logic/ # Pure TS business logic
│ │ ├── local-transcriber.ts # Whisper integration
│ │ ├── model-loader.ts # Model singleton with WebGPU/WASM
│ │ ├── cefr-classifier.ts # CEFR ML prediction
│ │ ├── grammar-checker.ts # Grammar analysis
│ │ └── metrics-calculator.ts # Speaking metrics
│ └── types/ # TypeScript type definitions
├── ml/ # CEFR classifier training scripts
├── scripts/ # Helper scripts
└── public/ # Static assets and local models
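The tree describes model-loader.ts as a model singleton. One common way to get that behaviour is to cache the in-flight load promise so that concurrent callers share a single download. The sketch below shows the general pattern under that assumption; `createSingletonLoader` is a hypothetical helper, and the actual file may be organised differently.

```typescript
// Hypothetical helper, not taken from the Speako source.
// Wraps an expensive async loader so it runs at most once per successful load.
function createSingletonLoader<T>(load: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | null = null;

  return () => {
    if (cached !== null) return cached;

    const pending = load().catch((err) => {
      // Clear the cache on failure so a later call can retry the load.
      cached = null;
      throw err;
    });
    cached = pending;
    return pending;
  };
}
```

Caching the promise rather than the resolved model means a second caller that arrives while the first load is still running waits on the same request instead of starting another one.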
- Node.js 20+ (check with node -v)
- Python 3.11+ with uv for ML training (optional)
# Install dependencies
npm install
# Start development server
npm run dev

Open http://localhost:5173.
| Script | Description |
|---|---|
| npm run dev | Start development server |
| npm run build | Build for production |
| npm run preview | Preview production build |
| npm run test | Run unit tests |
| npm run lint | Run ESLint |
| npm run format | Format code with Prettier |
| npm run prepare:models | Download models locally for offline testing |
| npm run prepare:data | Convert corpus audio to WAV for validation |
| npm run cefr:verify | Verify CEFR model is working |
| npm run deploy | Build and deploy to Cloudflare Pages |
For testing with real L2 learner audio, we use the Speak & Improve Corpus 2025 from Cambridge University Press & Assessment.
- Visit ELiT Datasets - Speak & Improve Corpus 2025
- Complete the free registration and accept the license
- Download and extract sandi-corpus-2025.zip
The audio files are hosted separately on S3. Download the dev set (smaller, for testing):
cd /path/to/sandi-corpus-2025
mkdir -p data && cd data
# Dev set (~2.7GB total)
curl -LO "https://speak-and-improve-corpus-2025.s3.eu-west-1.amazonaws.com/audio/data.flac.dev.01.zip"
curl -LO "https://speak-and-improve-corpus-2025.s3.eu-west-1.amazonaws.com/audio/data.flac.dev.02.zip"
# Unzip into data/flac/dev/
unzip data.flac.dev.01.zip
unzip data.flac.dev.02.zip

cd /path/to/speako
ln -s /path/to/sandi-corpus-2025 ./test-data

# Requires ffmpeg: brew install ffmpeg
npm run prepare:data

| Property | Value |
|---|---|
| Duration | ~315 hours of L2 learner audio |
| Format | 16kHz FLAC |
| CEFR Levels | A2–C1 |
| Manual Transcriptions | ~55 hours with disfluency annotations |
| License | Non-commercial research only |
Caution
Do not share the corpus publicly or include it in any repository. See the license agreement for full terms.
Validation is performed through the web interface:
- Start the development server: npm run dev
- Navigate to http://localhost:5173/#validate
- Use the validation controls to run tests on the corpus
Results are saved to validation-results.json.
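This README does not say which metrics the validation run records. A standard way to compare a Whisper transcript with a manual reference transcript is word error rate (WER), so the sketch below shows that calculation purely as an illustration. The function names and the normalisation rules (lower-casing, stripping punctuation) are assumptions, and nothing here reflects the actual contents of validation-results.json.

```typescript
// Illustrative only: splits text into lower-cased word tokens, dropping punctuation.
function tokenize(text: string): string[] {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9'\s]/g, "")
    .split(/\s+/)
    .filter(Boolean);
}

// Word error rate = word-level edit distance / number of reference words.
function wordErrorRate(reference: string, hypothesis: string): number {
  const ref = tokenize(reference);
  const hyp = tokenize(hypothesis);
  // Convention chosen for this sketch: an empty reference scores 0 only if the hypothesis is empty too.
  if (ref.length === 0) return hyp.length === 0 ? 0 : 1;

  // Levenshtein distance over words, keeping only the previous row.
  let prev = Array.from({ length: hyp.length + 1 }, (_, j) => j);
  for (let i = 1; i <= ref.length; i++) {
    const curr = [i];
    for (let j = 1; j <= hyp.length; j++) {
      const cost = ref[i - 1] === hyp[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // deletion
        curr[j - 1] + 1, // insertion
        prev[j - 1] + cost, // substitution or match
      );
    }
    prev = curr;
  }
  return prev[hyp.length] / ref.length;
}
```

Note that learner speech with disfluency annotations usually needs a deliberate decision about whether fillers and false starts count as errors; this sketch treats every token alike.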
For information on training the CEFR classifier, see docs/ml.md.
Note
The CEFR model is trained on UniversalCEFR (CC-BY-NC-4.0) to ensure license compliance. The S&I Corpus is used for validation only.
See AGENTS.md for coding standards and agent instructions.
To build for production:
npm run build

This produces a static output in dist/ which can be deployed to any static host (Cloudflare Pages, Vercel, Netlify).
To build and deploy to Cloudflare Pages:

npm run deploy

- Transformers.js – Run Transformers in the browser
- Preact – Fast 3kB React alternative
- Vite – Next Generation Frontend Tooling
- Compromise – Modest natural-language processing
- Xenova/whisper-base – Speech recognition model
- robg/speako-cefr-deberta – CEFR classification model
- WebGPU Implementation Status – Browser support tracker
- WebGPU Explainer – Introduction to WebGPU
- Speak & Improve Corpus 2025 – L2 learner speech corpus
- Corpus Paper (DOI) – Academic citation
MIT
