Transcriptr is a modern web application that converts audio files to text using artificial intelligence. It provides a clean, intuitive interface for uploading audio files and receiving high-quality transcriptions powered by Replicate's Incredibly Fast Whisper model.
## Features

- Audio Transcription: Convert audio to text with high accuracy
- Multiple Format Support: Download transcriptions in TXT, MD, PDF, and DOCX formats
- Language Selection: Choose from multiple languages for better accuracy
- Speaker Diarization: Optionally identify different speakers in the transcription
- Batch Processing: Handle large files efficiently with optimized processing
- Export Options: Download individual formats or all formats as a ZIP
## Tech Stack

- Frontend: React with TypeScript, powered by Vite for fast development
- UI: Tailwind CSS with shadcn/ui components for a modern interface
- Backend: Express.js server for handling API requests
- AI Integration: Replicate API for accessing the Incredibly Fast Whisper model
- Document Handling:
  - Printerz for high-quality PDF template rendering
  - Libraries for generating DOCX and ZIP files
- Storage: Firebase Storage for saving generated documents
## Getting Started

### Prerequisites

- Node.js (v16 or later)
- npm or yarn
- Replicate API token (for AI transcription)
### Installation

1. Clone the repository and change into the project directory: `git clone https://github.com/yourusername/transcriptr.git`, then `cd transcriptr`
2. Install dependencies: `npm install`
3. Create a `.env` file in the root directory with your Replicate API token: `VITE_REPLICATE_API_TOKEN=your_replicate_api_token_here`
4. Start the development server: `npm run dev`
5. Open your browser to http://localhost:5173 to see the application.
## Environment Variables

Transcriptr requires several environment variables to function properly. Create a `.env` file in the project root with the following variables:
### Required Variables

| Variable | Description |
|---|---|
| `VITE_REPLICATE_API_TOKEN` | Your Replicate API token for accessing the Incredibly Fast Whisper model |
| `VITE_FIREBASE_API_KEY` | Firebase API key for storage services |
| `VITE_FIREBASE_AUTH_DOMAIN` | Firebase auth domain |
| `VITE_FIREBASE_PROJECT_ID` | Firebase project ID |
| `VITE_FIREBASE_STORAGE_BUCKET` | Firebase storage bucket for storing transcriptions and PDFs |
| `VITE_FIREBASE_MESSAGING_SENDER_ID` | Firebase messaging sender ID |
| `VITE_FIREBASE_APP_ID` | Firebase application ID |
| `VITE_PRINTERZ_API_KEY` | API key for Printerz PDF generation services |
| `VITE_LARGE_FILE_THRESHOLD` | Threshold in MB for large file warnings |
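One way to catch a missing variable early is to check the required names at startup. The sketch below is an illustration, not code from the project: `findMissingEnvVars` is a hypothetical helper, and it takes the environment as a plain object so the same check could be fed from `import.meta.env` in the Vite client or `process.env` on the server.

```typescript
// Names of the required variables, copied from the table above.
const REQUIRED_ENV_VARS: string[] = [
  "VITE_REPLICATE_API_TOKEN",
  "VITE_FIREBASE_API_KEY",
  "VITE_FIREBASE_AUTH_DOMAIN",
  "VITE_FIREBASE_PROJECT_ID",
  "VITE_FIREBASE_STORAGE_BUCKET",
  "VITE_FIREBASE_MESSAGING_SENDER_ID",
  "VITE_FIREBASE_APP_ID",
  "VITE_PRINTERZ_API_KEY",
  "VITE_LARGE_FILE_THRESHOLD",
];

type EnvMap = { [variableName: string]: string | undefined };

// Hypothetical helper: returns the required names that are unset or blank.
function findMissingEnvVars(env: EnvMap): string[] {
  return REQUIRED_ENV_VARS.filter((variableName) => {
    const value = env[variableName];
    return value === undefined || value.trim() === "";
  });
}

// Example: only the Replicate token is set, so the other eight are reported.
const missingVars = findMissingEnvVars({ VITE_REPLICATE_API_TOKEN: "example-token" });
console.log(missingVars.length); // 8
```

Returning the full list, rather than throwing on the first gap, lets a startup message name every variable that still needs a value.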
### Optional Variables

| Variable | Description | Default |
|---|---|---|
| `VITE_CLOUDCONVERT_API_KEY` | API key for CloudConvert services (for additional file format support) | None |
| `PORT` | Port for the server to listen on | `3001` |
| `NODE_ENV` | Environment mode (`development` or `production`) | `development` |
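The defaults in this table can be applied in one place when the server starts. The following is a minimal sketch under stated assumptions: `resolveServerSettings` and the `ServerSettings` shape are hypothetical names, and falling back to the default on an unparsable `PORT` is a design choice made here, not documented project behavior.

```typescript
type EnvMap = { [variableName: string]: string | undefined };

interface ServerSettings {
  port: number;
  nodeEnv: string;
  cloudConvertApiKey: string | null; // null means the optional key is not configured
}

// Hypothetical helper: applies the documented defaults (PORT 3001, NODE_ENV development).
function resolveServerSettings(env: EnvMap): ServerSettings {
  const rawPort = env["PORT"];
  const parsedPort = Number(rawPort);
  const portIsValid =
    rawPort !== undefined &&
    Number.isInteger(parsedPort) &&
    parsedPort > 0 &&
    parsedPort <= 65535;

  return {
    // Assumption: an invalid PORT value falls back to the default instead of failing.
    port: portIsValid ? parsedPort : 3001,
    nodeEnv: env["NODE_ENV"] ?? "development",
    cloudConvertApiKey: env["VITE_CLOUDCONVERT_API_KEY"] ?? null,
  };
}

console.log(resolveServerSettings({}).port); // 3001
console.log(resolveServerSettings({ PORT: "8080" }).port); // 8080
```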
Example `.env` file:

```
VITE_REPLICATE_API_TOKEN=your_replicate_token_here
VITE_FIREBASE_API_KEY=your_firebase_api_key
VITE_FIREBASE_AUTH_DOMAIN=your-project.firebaseapp.com
VITE_FIREBASE_PROJECT_ID=your-project-id
VITE_FIREBASE_STORAGE_BUCKET=your-project.appspot.com
VITE_FIREBASE_MESSAGING_SENDER_ID=123456789012
VITE_FIREBASE_APP_ID=1:123456789012:web:abcdef1234567890
VITE_PRINTERZ_API_KEY=your_printerz_api_key
VITE_LARGE_FILE_THRESHOLD=1
VITE_CLOUDCONVERT_API_KEY=your_cloudconvert_api_key
```
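As an illustration of how `VITE_LARGE_FILE_THRESHOLD` might be used, the sketch below compares a file size against the threshold. The helper name `exceedsLargeFileThreshold` is hypothetical, and two details are assumptions rather than documented behavior: one MB is treated as 1,048,576 bytes, and an unset or invalid threshold disables the warning.

```typescript
// Assumption: 1 MB = 1024 * 1024 bytes. The project may use 1,000,000 instead.
const BYTES_PER_MB = 1024 * 1024;

// Hypothetical helper: true when the file is larger than the configured threshold (in MB).
function exceedsLargeFileThreshold(
  fileSizeBytes: number,
  thresholdSetting: string | undefined
): boolean {
  if (thresholdSetting === undefined || thresholdSetting.trim() === "") {
    return false; // no threshold configured, so never warn
  }
  const thresholdMb = Number(thresholdSetting);
  if (!Number.isFinite(thresholdMb) || thresholdMb <= 0) {
    return false; // unusable value, treated like "not configured"
  }
  return fileSizeBytes > thresholdMb * BYTES_PER_MB;
}

// With VITE_LARGE_FILE_THRESHOLD=1 as in the example file:
console.log(exceedsLargeFileThreshold(5 * BYTES_PER_MB, "1")); // true
console.log(exceedsLargeFileThreshold(512 * 1024, "1")); // false
```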
### Obtaining API Keys

- Replicate API Token: Sign up at Replicate and create an API token
- Firebase: Set up a project in Firebase Console and get your credentials
- Printerz: Create an account at Printerz and get your API key
- CloudConvert (optional): Register at CloudConvert for additional file format conversion capabilities
## Build and Deployment
### Building for Production
To build the application for production:
```bash
npm run build
```
This command creates optimized production builds for both client and server:
- Client files are generated in `dist/client`
- Server files are generated in server
### Running in Production

1. Build the application as described above
2. Set the environment variable `NODE_ENV` to `production`
3. Start the server: `npm run start`

The server runs on port 3001 by default; you can override this by setting the `PORT` environment variable.
### Docker

Create a `Dockerfile` in the root directory:

```dockerfile
FROM node:18-alpine
WORKDIR /app
COPY package*.json ./
RUN npm install
COPY . .
RUN npm run build
ENV NODE_ENV=production
ENV PORT=3001
EXPOSE 3001
CMD ["npm", "run", "start"]
```
Build and run the Docker container:

```bash
docker build -t transcriptr .
docker run -p 3001:3001 -e VITE_REPLICATE_API_TOKEN=your_token_here transcriptr
```
### Local Development

For local development, the app uses an Express.js server to handle API requests. Start the development environment with `npm run dev`.
## Project Structure

```
transcriptr/
├── public/                 # Static assets
├── src/                    # Source code
│   ├── assets/             # Images and other assets
│   ├── components/         # React components
│   │   ├── ui/             # UI components based on shadcn/ui
│   │   ├── TranscriptionOptions.tsx  # Language and diarization options
│   │   ├── TranscriptionResult.tsx   # Display and download results
│   │   └── UploadAudio.tsx           # File upload component
│   ├── hooks/              # Custom React hooks
│   ├── lib/                # Utility functions
│   ├── server/             # Express server for API handling
│   ├── App.tsx             # Main application component
│   ├── index.css           # Global CSS
│   └── main.tsx            # Entry point
├── index.html              # HTML template
├── tailwind.config.js      # Tailwind CSS configuration
├── tsconfig.json           # TypeScript configuration
├── vite.config.ts          # Vite configuration
└── package.json            # Dependencies and scripts
```
## API Endpoints

### Transcribe Audio

- Method: POST
- Description: Upload an audio file for transcription
- Request Body:

```json
{
  "audioData": "base64-encoded-audio-data",
  "options": {
    "modelId": "vaibhavs10/incredibly-fast-whisper:3ab86df6c8f54c11309d4d1f930ac292bad43ace52d10c80d87eb258b3c9f79c",
    "task": "transcribe",
    "batch_size": 64,
    "return_timestamps": true,
    "language": "english",
    "diarize": false
  }
}
```

- Response: JSON object with prediction ID or immediate transcription
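The request body above can be assembled with a small typed helper. This is a sketch for illustration only: `buildTranscribeRequest` and the interface names are hypothetical, the default values are simply the ones shown in the example body, and the endpoint URL is deliberately left out.

```typescript
interface TranscribeOptions {
  modelId: string;
  task: string;
  batch_size: number;
  return_timestamps: boolean;
  language: string;
  diarize: boolean;
}

interface TranscribeRequest {
  audioData: string; // base64-encoded audio
  options: TranscribeOptions;
}

// Defaults copied from the example request body.
const DEFAULT_TRANSCRIBE_OPTIONS: TranscribeOptions = {
  modelId:
    "vaibhavs10/incredibly-fast-whisper:3ab86df6c8f54c11309d4d1f930ac292bad43ace52d10c80d87eb258b3c9f79c",
  task: "transcribe",
  batch_size: 64,
  return_timestamps: true,
  language: "english",
  diarize: false,
};

// Hypothetical helper: merges caller overrides into the default options.
function buildTranscribeRequest(
  audioData: string,
  overrides: Partial<TranscribeOptions> = {}
): TranscribeRequest {
  return { audioData, options: { ...DEFAULT_TRANSCRIBE_OPTIONS, ...overrides } };
}

// Example: enable speaker diarization and keep every other default.
const exampleRequest = buildTranscribeRequest("base64-encoded-audio-data", { diarize: true });
console.log(exampleRequest.options.diarize); // true
console.log(exampleRequest.options.language); // "english"
```

The result can be passed to `JSON.stringify` and sent as the POST body.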
### Check Transcription Status

- Method: GET
- Description: Check the status of a transcription in progress
- Parameters:
  - `id`: The prediction ID returned from the transcribe endpoint
- Response: JSON object with prediction status and results (if complete)
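A client typically calls this endpoint repeatedly until the transcription finishes. The sketch below shows one possible polling loop. It rests on assumptions: the response is taken to mirror a Replicate prediction object (a `status` of `starting`, `processing`, `succeeded`, `failed`, or `canceled`, plus `output`), and the actual HTTP call is injected as `fetchStatus` because the endpoint URL is not given here. `waitForTranscription` is a hypothetical name.

```typescript
// Assumed response shape, modeled on Replicate's prediction object.
interface PredictionStatus {
  status: "starting" | "processing" | "succeeded" | "failed" | "canceled";
  output?: unknown;
  error?: string | null;
}

// The caller supplies the function that performs the GET request for a given id.
type StatusFetcher = (predictionId: string) => Promise<PredictionStatus>;

// Hypothetical helper: polls until the prediction reaches a terminal state.
async function waitForTranscription(
  predictionId: string,
  fetchStatus: StatusFetcher,
  intervalMs: number = 2000,
  maxAttempts: number = 60
): Promise<unknown> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const prediction = await fetchStatus(predictionId);
    if (prediction.status === "succeeded") {
      return prediction.output;
    }
    if (prediction.status === "failed" || prediction.status === "canceled") {
      throw new Error(`Transcription ${prediction.status}: ${prediction.error ?? "no details"}`);
    }
    // Still starting or processing: wait before asking again.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Transcription did not finish after ${maxAttempts} checks`);
}
```

Capping the number of attempts keeps a stuck prediction from polling forever; the interval and cap shown are arbitrary starting values.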
### Render PDF

- Method: POST
- Description: Proxy endpoint for rendering PDFs with Printerz
- Request Body:

```json
{
  "templateId": "your-printerz-template-id",
  "printerzData": {
    "variables": {
      "title": "Document Title",
      "content": "Document Content",
      "timestamp": "Formatted Date"
    },
    "options": {
      "printBackground": true
    }
  }
}
```
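For illustration, the payload above could be built from a finished transcription as follows. `buildRenderRequest` and the interface name are hypothetical, and the ISO 8601 string used for `timestamp` is only a stand-in, since the date format the template expects is not specified here.

```typescript
interface PrinterzRenderRequest {
  templateId: string;
  printerzData: {
    variables: { title: string; content: string; timestamp: string };
    options: { printBackground: boolean };
  };
}

// Hypothetical helper: fills in the template variables shown in the example body.
function buildRenderRequest(
  templateId: string,
  title: string,
  content: string,
  createdAt: Date
): PrinterzRenderRequest {
  return {
    templateId,
    printerzData: {
      variables: {
        title,
        content,
        // Assumption: ISO 8601; the real template may expect another format.
        timestamp: createdAt.toISOString(),
      },
      options: { printBackground: true },
    },
  };
}

const exampleRender = buildRenderRequest(
  "your-printerz-template-id",
  "Document Title",
  "Document Content",
  new Date(Date.UTC(2024, 0, 15, 12, 0, 0))
);
console.log(exampleRender.printerzData.variables.timestamp); // "2024-01-15T12:00:00.000Z"
```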
## Supported Audio Formats

Transcriptr currently supports the following audio formats:
- MP3 (.mp3)
- WAV (.wav)
- FLAC (.flac)
- OGG (.ogg)
For other formats like M4A, AAC, or WMA, please convert your files to one of the supported formats before uploading. You can use online tools like CloudConvert for this purpose.
We're working on adding native support for more audio formats. Contributions are welcome!
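A client-side check before upload can reject unsupported files early. The sketch below is an assumption about how that might look, not the project's own validation: `isSupportedAudioFile` is a hypothetical helper, and it inspects only the file extension, not the file contents.

```typescript
// Extensions from the supported formats list above.
const SUPPORTED_AUDIO_EXTENSIONS: string[] = ["mp3", "wav", "flac", "ogg"];

// Hypothetical helper: true when the file name ends in a supported extension.
function isSupportedAudioFile(fileName: string): boolean {
  const dotIndex = fileName.lastIndexOf(".");
  // Reject names with no extension, a trailing dot, or nothing before the dot.
  if (dotIndex <= 0 || dotIndex === fileName.length - 1) {
    return false;
  }
  const extension = fileName.slice(dotIndex + 1).toLowerCase();
  return SUPPORTED_AUDIO_EXTENSIONS.indexOf(extension) !== -1;
}

console.log(isSupportedAudioFile("interview.MP3")); // true
console.log(isSupportedAudioFile("voice-memo.m4a")); // false
```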
## Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

1. Fork the repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
## License

This project is licensed under the MIT License; see the LICENSE file for details.
## Acknowledgments

- Replicate for providing the Incredibly Fast Whisper model
- shadcn/ui for the component library
- Tailwind CSS for styling
- React for the UI framework
- Vite for the build tool
- Printerz for PDF template rendering and generation
Developed by Abdur-Rahman Bilal (aramb-dev)