Babel — Real-Time AI Translation Platform

Any room. Any language. Any input. Zero barriers.

Abstract

Language barriers fragment human communication — in hospitals, classrooms, emergency situations, and everyday life. Babel is a real-time AI translation platform built to dissolve those barriers, regardless of how someone communicates. A single shared room can hold multiple participants using completely different input methods simultaneously: one person speaks aloud, another types on a phone, a third texts in from WhatsApp without ever opening a browser. Every message, regardless of its origin, is translated by Claude AI and delivered to every participant in their own language within seconds.

Built at NJIT Hackathon, Babel explores a core idea: a room is not a chat — it's a shared understanding. The input method is an implementation detail. The language is an implementation detail. What matters is that meaning crosses the gap.

The project also includes an experimental attempt at ASL (American Sign Language) recognition using Claude's vision API, representing a push toward making Babel accessible to users who cannot speak or type at all — signing into the same room as everyone else.

What Babel Does

A Babel room is identified by a short 4-character code. Anyone who joins that code enters the same live conversation. The key insight is that participants don't all have to use the same interface — one person can be on the web app using voice, another using text, and a third texting from their iPhone via WhatsApp. They're all in the same room, and every message is translated for every participant in their own language.

Person A speaks in English via the web app
Person B types in Spanish from their phone browser
Person C texts in French from WhatsApp on their iPhone
Everyone sees and hears every message — translated, in real time

This works across any language pair Claude supports, including English ↔ Spanish, English ↔ Japanese, and Arabic ↔ French.
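The room model described above can be sketched roughly as follows. This is a minimal illustration, not Babel's actual server code: `RoomRegistry`, `Participant`, `Channel`, the code alphabet, and the case-insensitive lookup are all hypothetical choices. The README only states that room codes are 4 characters long.

```typescript
// Hypothetical room registry; names and shapes are illustrative only.
type Channel = "voice" | "text" | "whatsapp" | "device";

interface Participant {
  userId: string;
  lang: string; // e.g. "en-US", "es-ES"
  channel: Channel;
}

// Assumption: letters and digits with look-alikes (0/O, 1/I) removed.
const CODE_ALPHABET = "ABCDEFGHJKLMNPQRSTUVWXYZ23456789";

function generateRoomCode(random: () => number = Math.random): string {
  let code = "";
  for (let i = 0; i < 4; i++) {
    code += CODE_ALPHABET.charAt(Math.floor(random() * CODE_ALPHABET.length));
  }
  return code;
}

class RoomRegistry {
  private rooms = new Map<string, Participant[]>();

  createRoom(): string {
    let code = generateRoomCode();
    while (this.rooms.has(code)) code = generateRoomCode(); // retry on collision
    this.rooms.set(code, []);
    return code;
  }

  // Returns false when the room does not exist.
  join(code: string, participant: Participant): boolean {
    const room = this.rooms.get(code.toUpperCase());
    if (!room) return false;
    room.push(participant);
    return true;
  }

  participants(code: string): Participant[] {
    return this.rooms.get(code.toUpperCase()) ?? [];
  }
}
```

The point of the sketch is that the input channel is just a field on the participant: a voice user, a text user, and a WhatsApp user all end up in the same list for the same code.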

Modes

Voice Mode (default)

The classic experience. Both users speak into their microphones. The browser uses the Web Speech API for speech-to-text and text-to-speech. Claude handles translation. The animated orb shows who is speaking, thinking, or idle.
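As a rough illustration of the text-to-speech side, the helper below picks a synthesis voice for a target language. It is a sketch under assumptions: `pickVoice` and `VoiceLike` are hypothetical names, and the real `useSpeech.ts` hook may select voices differently. Only `speechSynthesis.getVoices()`, `SpeechSynthesisUtterance`, and `speechSynthesis.speak()` in the trailing comment are actual Web Speech API calls.

```typescript
// Minimal shape shared with the browser's SpeechSynthesisVoice.
interface VoiceLike {
  name: string;
  lang: string; // BCP 47 tag such as "es-ES"
}

// Prefer an exact tag match, then any voice with the same primary language.
function pickVoice<T extends VoiceLike>(voices: T[], targetLang: string): T | undefined {
  const target = targetLang.toLowerCase();
  const primary = target.split("-")[0];
  // Some platforms report tags with underscores ("es_ES"), so normalise first.
  const norm = (v: T): string => v.lang.replace("_", "-").toLowerCase();
  return (
    voices.find((v) => norm(v) === target) ??
    voices.find((v) => norm(v).split("-")[0] === primary)
  );
}

// Browser usage (not runnable in Node):
//   const utterance = new SpeechSynthesisUtterance(translatedText);
//   utterance.lang = "es-ES";
//   const voice = pickVoice(speechSynthesis.getVoices(), "es-ES");
//   if (voice) utterance.voice = voice;
//   speechSynthesis.speak(utterance);
```

Falling back to the primary language means a room participant who asked for `es-AR` still hears Spanish when only an `es-MX` voice is installed.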

Text Mode (iMessage-style)

Can't speak? Join a room in text mode from the web app. You get an iMessage-style chat interface — blue bubbles for your messages, gray for theirs — with real-time translation happening automatically on every send. A typing indicator appears while translation is in progress.

SMS / WhatsApp Bridge

Join a Babel room directly from WhatsApp or your phone's native SMS app, with no browser required. The other person uses the web app while you text from your phone:

Text JOIN <ROOM_CODE> <language> to the Twilio number
Receive a confirmation and start texting normally
Your messages get translated and appear in the web UI as chat bubbles
When the web user speaks or types, you get a translated SMS back

This uses Twilio's WhatsApp Sandbox for demo / development.
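The `JOIN <ROOM_CODE> <language>` command from the steps above could be parsed along these lines. This is a sketch, not the project's parser: `parseJoinCommand` is a hypothetical name, and the assumptions that room codes are alphanumeric and that the language is a BCP 47 style tag such as `es-ES` are mine.

```typescript
type JoinCommand = { roomCode: string; lang: string };

// Returns the parsed command, or null when the text is an ordinary chat message.
function parseJoinCommand(body: string): JoinCommand | null {
  // JOIN, a 4-character code, then a language tag like "es" or "es-ES".
  const pattern = /^\s*join\s+([a-z0-9]{4})\s+([a-z]{2,3}(?:-[a-z0-9]{2,8})*)\s*$/i;
  const match = pattern.exec(body);
  if (!match) return null;
  return { roomCode: match[1].toUpperCase(), lang: match[2] };
}
```

Anything that does not match is treated as a normal message, so a user who is already in a room can keep texting without a command prefix.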

Solo Practice Mode

A single-user mode for language learners. Practice conversations with an AI partner in a target language. The AI adapts to your skill level and gives feedback on your responses.

ASL / Sign Language Mode (experimental attempt)

An attempt to extend Babel to users who cannot speak or type. This mode uses Claude's vision API to analyze webcam frames and recognize American Sign Language gestures, translating signs into text that gets spoken aloud for the other participant. It works for basic signs but real-time ASL recognition at scale is a hard problem — this is a proof-of-concept showing the direction, not a polished feature.
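For illustration, a webcam frame can be packaged for a vision request roughly like this. The image content block shape (`type: "image"` with a base64 `source`) follows the Anthropic Messages API; everything else is an assumption, including the function names, the prompt wording, and the `max_tokens` value. The project's actual prompt is not shown in this README.

```typescript
type ImageBlock = {
  type: "image";
  source: { type: "base64"; media_type: "image/jpeg" | "image/png"; data: string };
};
type TextBlock = { type: "text"; text: string };

// canvas.toDataURL("image/jpeg") yields "data:image/jpeg;base64,<payload>";
// the API wants the media type and the bare base64 payload separately.
function dataUrlToImageBlock(dataUrl: string): ImageBlock {
  const match = /^data:(image\/jpeg|image\/png);base64,(.+)$/.exec(dataUrl);
  if (!match) throw new Error("Expected a base64 JPEG or PNG data URL");
  return {
    type: "image",
    source: {
      type: "base64",
      media_type: match[1] as "image/jpeg" | "image/png",
      data: match[2],
    },
  };
}

// Builds the JSON body for POST https://api.anthropic.com/v1/messages.
function buildSignRequest(frameDataUrl: string, model: string) {
  // Hypothetical prompt wording.
  const prompt: TextBlock = {
    type: "text",
    text: "Which American Sign Language sign, if any, is shown in this frame? Reply with one English word, or 'none'.",
  };
  return {
    model,
    max_tokens: 50, // assumption: a single word needs very few tokens
    messages: [
      { role: "user" as const, content: [dataUrlToImageBlock(frameDataUrl), prompt] },
    ],
  };
}
```

A single still frame cannot capture signs that depend on motion, which is one reason this mode is limited to basic signs.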

ESP32 Hardware Mode (optional)

A physical companion device (ESP32 microcontroller with an OLED or TFT screen) that connects to the room and displays an animated mascot reacting to the conversation state — bouncing when someone is speaking, pulsing while thinking, sleeping when idle.

Key Features

Feature	Details
Real-time translation	Sub-3-second turnaround via Claude Sonnet
Mixed input rooms	Voice, text, and SMS users can all share the same room simultaneously
Tone preservation	Preserves AAVE, slang, code-switching, formality
Distress detection	Flags medical emergency keywords — shows a red alert banner
Transcript	Full scrollable conversation history, downloadable as PDF
Language auto-detection	Claude detects the source language automatically
Typing indicator	Animated dots while translation is processing
Room system	4-character room codes, no sign-up needed
Multi-platform	Works on desktop and mobile browsers
SMS bridge	Join via WhatsApp without opening a browser
ASL attempt	Experimental webcam sign-language recognition via Claude Vision

Tech Stack

Layer	Technology
Frontend	React 18 + TypeScript
Styling	Tailwind CSS
Animations	Framer Motion
Build tool	Vite
Backend	Node.js + TypeScript (`tsx`)
Real-time comms	WebSockets (`ws`)
AI / Translation	Anthropic Claude API (`claude-sonnet-4-6`)
SMS / WhatsApp	Twilio
PDF export	jsPDF
Hardware	ESP32 (PlatformIO), OLED / TFT display

Architecture

                    ┌─────────────────────────────────────────┐
                    │            Babel Server (Node.js)        │
                    │                                          │
  Browser (Voice)──►│  Room Manager                           │
  Browser (Text) ──►│       │                                 │
  WhatsApp/SMS   ──►│       ▼                                 │
  ESP32 Device   ──►│  Claude API (translate + analyze)       │
                    │       │                                 │
                    │       ▼                                 │
                    │  Broadcast to all room participants      │
                    └──────────────────┬──────────────────────┘
                                       │
                    ┌──────────────────▼──────────────────────┐
                    │  Twilio (SMS/WhatsApp)                   │
                    │  POST /sms webhook ← cloudflared tunnel  │
                    └─────────────────────────────────────────┘
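The flow in the diagram (receive an utterance, translate it, broadcast to the room) might look roughly like the sketch below. It is an illustration under assumptions, not the code in `server/index.ts`: the `translate` and `send` callbacks are injected stand-ins for the Claude call and the WebSocket or Twilio delivery, the `type` field and numeric timestamps are guesses, and translating once per target language is my choice for rooms with more than two languages.

```typescript
interface Peer {
  userId: string;
  lang: string;
}

// Both callbacks are injected so the routing logic can run without a network.
type Translate = (text: string, targetLang: string) => Promise<string>;
type Send = (userId: string, message: Record<string, unknown>) => void;

// Group every recipient except the sender by language, so each target
// language is translated only once.
function groupRecipientsByLang(peers: Peer[], senderId: string): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const peer of peers) {
    if (peer.userId === senderId) continue;
    const ids = groups.get(peer.lang) ?? [];
    ids.push(peer.userId);
    groups.set(peer.lang, ids);
  }
  return groups;
}

async function handleUtterance(
  peers: Peer[],
  senderId: string,
  text: string,
  translate: Translate,
  send: Send,
): Promise<void> {
  // Echo first so the sender's own bubble appears immediately.
  send(senderId, { type: "utterance_echo", original_text: text, timestamp: Date.now() });

  const groups = Array.from(groupRecipientsByLang(peers, senderId).entries());
  await Promise.all(
    groups.map(async ([lang, userIds]) => {
      const translated_text = await translate(text, lang);
      for (const userId of userIds) {
        send(userId, {
          type: "utterance",
          from_user: senderId,
          original_text: text,
          translated_text,
          timestamp: Date.now(),
        });
      }
    }),
  );
}
```

In a two-person room this reduces to a single translation per utterance, which matches the one-call-per-utterance description in the Translation section.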

WebSocket Message Protocol

Message	Direction	Payload
`join_room`	client → server	`{ room_code, user_lang, is_device? }`
`utterance`	client → server	`{ original_text }`
`utterance`	server → client	`{ from_user, original_text, translated_text, source_lang, distress_flag, tone_note, timestamp }`
`utterance_echo`	server → sender	`{ original_text, timestamp }`
`state_change`	both directions	`{ state: idle|listening|thinking|speaking|error }`
`peers_update`	server → client	`{ peers: [{ userId, lang, isDevice }] }`
`distress_alert`	server → room	`{ from_user, message }`
`request_peers`	client → server	`{}`
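The client-to-server half of the table can be expressed as a TypeScript discriminated union with a small validator. This is a sketch, not the contents of `lib/types.ts`: the table does not say how the message name is carried on the wire, so the `type` field used here is an assumption, as are the names `ClientMessage` and `parseClientMessage`.

```typescript
type ConversationState = "idle" | "listening" | "thinking" | "speaking" | "error";

// Assumption: each frame is a JSON object whose "type" field names the message.
type ClientMessage =
  | { type: "join_room"; room_code: string; user_lang: string; is_device?: boolean }
  | { type: "utterance"; original_text: string }
  | { type: "state_change"; state: ConversationState }
  | { type: "request_peers" };

const STATES: readonly string[] = ["idle", "listening", "thinking", "speaking", "error"];

// Returns a typed message, or null for malformed or unknown input.
function parseClientMessage(raw: string): ClientMessage | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const msg = data as Record<string, unknown>;

  switch (msg.type) {
    case "join_room":
      if (typeof msg.room_code !== "string" || typeof msg.user_lang !== "string") return null;
      return {
        type: "join_room",
        room_code: msg.room_code,
        user_lang: msg.user_lang,
        is_device: msg.is_device === true,
      };
    case "utterance":
      return typeof msg.original_text === "string"
        ? { type: "utterance", original_text: msg.original_text }
        : null;
    case "state_change":
      return typeof msg.state === "string" && STATES.indexOf(msg.state) !== -1
        ? { type: "state_change", state: msg.state as ConversationState }
        : null;
    case "request_peers":
      return { type: "request_peers" };
    default:
      return null;
  }
}
```

Validating at the socket boundary keeps the rest of the server free of `unknown` checks, and the ESP32 client can be held to the same contract as the browser.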

Translation (Claude API)

One Claude call per utterance with a structured prompt that:

Auto-detects source language
Preserves tone, dialect, AAVE, code-switching, formality
Detects medical/safety distress keywords → sets distress_flag: true
Returns JSON: { source_lang, translated_text, distress_flag, tone_note }
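The prompt requirements listed above, and the handling of the JSON reply, could be sketched as follows. The prompt wording is hypothetical (the project's real prompt is not shown here), as are the function names and the defaults applied when `distress_flag` or `tone_note` is missing.

```typescript
interface TranslationResult {
  source_lang: string;
  translated_text: string;
  distress_flag: boolean;
  tone_note: string;
}

// Hypothetical prompt wording covering the four requirements in the list.
function buildTranslationPrompt(text: string, targetLang: string): string {
  return [
    `Translate the message below into ${targetLang}.`,
    "Detect the source language yourself.",
    "Preserve tone, dialect, slang, code-switching, and formality.",
    "Set distress_flag to true if the message signals a medical or safety emergency.",
    "Reply with JSON only, using the keys source_lang, translated_text, distress_flag, tone_note.",
    "",
    `Message: ${JSON.stringify(text)}`,
  ].join("\n");
}

// Models sometimes wrap JSON in extra prose, so extract the outermost object.
function parseTranslationReply(reply: string): TranslationResult | null {
  const start = reply.indexOf("{");
  const end = reply.lastIndexOf("}");
  if (start === -1 || end <= start) return null;

  let data: unknown;
  try {
    data = JSON.parse(reply.slice(start, end + 1));
  } catch {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const r = data as Record<string, unknown>;
  if (typeof r.source_lang !== "string" || typeof r.translated_text !== "string") return null;

  return {
    source_lang: r.source_lang,
    translated_text: r.translated_text,
    distress_flag: r.distress_flag === true, // assumption: anything else counts as false
    tone_note: typeof r.tone_note === "string" ? r.tone_note : "",
  };
}
```

Returning `null` on a malformed reply lets the caller decide whether to retry or to fall back to showing the untranslated text.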

Quick Start

1. Server

cd babel/server
npm install

# Create your .env file
cp .env.example .env

Edit .env:

ANTHROPIC_API_KEY=sk-ant-...
PORT=8080

# Optional: Twilio (for SMS/WhatsApp bridge)
TWILIO_ACCOUNT_SID=ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TWILIO_AUTH_TOKEN=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
TWILIO_PHONE_NUMBER=whatsapp:+14155238886

npm run dev
# → Babel server listening on ws://localhost:8080
# → SMS bridge: POST http://localhost:8080/sms
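A server might validate the `.env` values at startup along these lines. This is a sketch under assumptions: `loadConfig` and the config shapes are hypothetical, and treating the Twilio bridge as enabled only when all three Twilio variables are set is my reading of "optional", not documented behavior.

```typescript
interface TwilioConfig {
  accountSid: string;
  authToken: string;
  phoneNumber: string;
}

interface ServerConfig {
  anthropicApiKey: string;
  port: number;
  twilio?: TwilioConfig; // undefined when the SMS bridge is not configured
}

type Env = Record<string, string | undefined>;

function loadConfig(env: Env): ServerConfig {
  const anthropicApiKey = env.ANTHROPIC_API_KEY;
  if (!anthropicApiKey) throw new Error("ANTHROPIC_API_KEY is required");

  // PORT falls back to 8080, the value used in the example .env above.
  const port = Number(env.PORT ?? "8080");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }

  const { TWILIO_ACCOUNT_SID, TWILIO_AUTH_TOKEN, TWILIO_PHONE_NUMBER } = env;
  const twilio =
    TWILIO_ACCOUNT_SID && TWILIO_AUTH_TOKEN && TWILIO_PHONE_NUMBER
      ? {
          accountSid: TWILIO_ACCOUNT_SID,
          authToken: TWILIO_AUTH_TOKEN,
          phoneNumber: TWILIO_PHONE_NUMBER,
        }
      : undefined;

  return { anthropicApiKey, port, twilio };
}

// In the server entry point: const config = loadConfig(process.env);
```

Failing fast on a missing API key gives a clearer error than a rejected Claude request on the first utterance.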

2. Web App

cd babel/web
npm install
npm run dev
# → http://localhost:5173

For phones on the same Wi-Fi network, use your laptop's local IP address instead of localhost. Find it with ipconfig (Windows), ifconfig (macOS), or ip addr (Linux).

3. Twilio SMS Bridge (optional)

To accept WhatsApp/SMS messages, expose the local server with a tunnel:

# Download cloudflared (Windows)
# https://github.com/cloudflare/cloudflared/releases

cloudflared tunnel --url http://localhost:8080
# → https://your-unique-id.trycloudflare.com

Then in Twilio Console → Messaging → Try it out → Send a WhatsApp message → Sandbox settings, set the webhook to:

https://your-unique-id.trycloudflare.com/sms
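For orientation, the `/sms` webhook has to do two things: read Twilio's form-encoded request and answer with TwiML. The sketch below shows both as pure functions. `From` and `Body` are standard Twilio webhook fields, and WhatsApp senders arrive with a `whatsapp:` prefix; the function names and the `InboundSms` shape are hypothetical and not taken from `server/index.ts`.

```typescript
interface InboundSms {
  from: string; // e.g. "whatsapp:+15551234567" or "+15551234567"
  body: string;
  isWhatsApp: boolean;
}

// Twilio posts webhooks as application/x-www-form-urlencoded.
function parseTwilioWebhook(formBody: string): InboundSms | null {
  const params = new URLSearchParams(formBody);
  const from = params.get("From");
  const body = params.get("Body");
  if (from === null || body === null) return null;
  return { from, body, isWhatsApp: from.startsWith("whatsapp:") };
}

// User text must be escaped before it is embedded in XML.
function escapeXml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// TwiML reply: Twilio delivers the <Message> text back to the sender.
function twimlReply(message: string): string {
  return (
    '<?xml version="1.0" encoding="UTF-8"?>' +
    `<Response><Message>${escapeXml(message)}</Message></Response>`
  );
}
```

The reply should be served with a `text/xml` content type. A deployment beyond a demo would also want to verify the `X-Twilio-Signature` header before trusting the request, which this sketch omits.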

iPhone user flow:

Text the sandbox join keyword to +1 415 523 8886 on WhatsApp (get keyword from Twilio Console)
Text: JOIN ROOMCODE es-ES (replace with actual room code and your language)
Start texting — messages are auto-translated both ways

4. ESP32 Hardware (optional)

cd babel/firmware
cp src/config.example.h src/config.h
# Edit config.h: WiFi SSID, password, server IP, room code
# Uncomment USE_OLED or USE_TFT in platformio.ini

pio run --target upload
pio device monitor

Running a Demo

Start the server and web app (steps 1–2 above)
Open http://localhost:5173 on two devices (or two browser tabs)
On Device A: click New Room, pick English
On Device B: click Join Room, enter the room code, pick Spanish
Device A speaks → Device B hears it in Spanish
Device B speaks → Device A hears it in English

Classic demo pair: English ↔ Spanish
Maximum wow factor: English ↔ Japanese

Project Structure

babel/
├── server/
│   ├── index.ts          # WebSocket server, translation, SMS webhook
│   ├── roomArchive.ts    # Transcript persistence
│   └── .env              # API keys (not committed)
├── web/
│   └── src/
│       ├── App.tsx
│       ├── components/
│       │   ├── HeroScreen.tsx            # Landing page + room join
│       │   ├── ConversationScreen.tsx    # Voice mode UI
│       │   ├── TextConversationScreen.tsx # iMessage-style text UI
│       │   ├── SoloPracticeScreen.tsx    # Solo language learning
│       │   ├── LessonScreen.tsx          # Structured lesson mode
│       │   ├── TranscriptView.tsx        # Scrollable transcript
│       │   └── StatusOrb.tsx             # Animated state indicator
│       ├── hooks/
│       │   ├── useWebSocket.ts           # WS connection + message bus
│       │   └── useSpeech.ts              # STT + TTS via Web Speech API
│       └── lib/
│           └── types.ts                  # Shared TypeScript types
└── firmware/
    └── src/
        ├── main.cpp                      # ESP32 entry point
        └── config.example.h             # Hardware config template

Notes

HTTPS on phones: Mobile browsers block microphone access on non-localhost origins without HTTPS. Use ngrok, cloudflared, or deploy to Vercel/Railway for phone testing.
WhatsApp Sandbox: Twilio's WhatsApp Sandbox only delivers messages to phones that have opted in by sending the sandbox join keyword. For production SMS to US numbers, a registered 10DLC number or a verified toll-free number is required.
Trial account limits: Twilio trial accounts can only send SMS to verified phone numbers. Upgrade to a paid account for unrestricted sending.
