🎙️ Text Reader - Advanced Text-to-Speech Platform

A modern, feature-rich Text-to-Speech web application with multi-format file support, voice customization, and real-time audio controls.

Report Bug • Request Feature

Table of Contents

- About The Project
  - Key Features
- Tech Stack
- Getting Started
  - Prerequisites
  - Installation
- Usage
- Architecture
- Roadmap
- Contributing
- License
- Contact
- Acknowledgments

🌟 About The Project

This project demonstrates full-stack development expertise with a focus on modern web technologies, user experience, and real-time audio processing. It is built as a portfolio piece to showcase:

🎨 Modern UI/UX Design - Clean, intuitive interface with smooth animations powered by Framer Motion.
🏗️ Full-Stack Architecture - Robust React/TypeScript frontend paired with a high-performance Python FastAPI backend.
🎵 Real-time Audio Processing - Custom waveform visualization and seamless playback controls.
📄 Multi-Format Support - Intelligent parsing of PDF, DOCX, TXT, and Markdown files.
🎭 Voice Customization - Diverse voice options with adjustable pitch, rate, and presets.
⚡ Performance Optimized - Lightning-fast development and production builds using Vite.

✨ Key Features

🎤 Voice & Audio

Voice Gallery - Browse and select from a wide range of TTS voices.
Custom Voice Presets - Pre-configured celebrity/character voices with optimized settings.
Real-time Controls - Adjust speech rate and pitch on the fly.
Waveform Visualization - Visual feedback during playback.
Audio Download - Export generated speech as MP3 files.

📁 File Processing

Multi-Format Support - PDF, DOCX, TXT, Markdown.
Smart Text Extraction - Preserves formatting and structure.
Drag & Drop Upload - Intuitive file handling.
Direct Text Input - Type or paste text directly.
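
As a rough illustration of how an upload could be routed to the right parser, the sketch below maps a filename to one of the four supported formats. It is written in Python for brevity and is not the project's code (the real parsers live in `src/utils/fileParsers.ts`). The names `SUPPORTED_FORMATS` and `detect_format` are hypothetical, and the exact set of accepted extensions is an assumption.

```python
from pathlib import Path

# Hypothetical mapping from file extension to the formats listed above.
# Assumption: the app may accept a different set of extensions.
SUPPORTED_FORMATS = {
    ".pdf": "pdf",
    ".docx": "docx",
    ".txt": "text",
    ".md": "markdown",
    ".markdown": "markdown",
}


def detect_format(filename: str) -> str:
    """Return the format name for a filename, ignoring extension case."""
    suffix = Path(filename).suffix.lower()
    if suffix not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported file type: {suffix or '(none)'}")
    return SUPPORTED_FORMATS[suffix]
```

Rejecting unknown extensions up front keeps the failure close to the upload step instead of surfacing later inside a parser.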

🎨 User Experience

Responsive Design - Works seamlessly on desktop and mobile.
Modern UI Components - Glassmorphism, smooth animations, and vibrant colors.
Real-time Feedback - Loading states and progress indicators.
Keyboard Shortcuts - Efficient workflow for power users.

🛠️ Tech Stack

Frontend

React 19 - Latest React features with functional components
TypeScript - Type-safe development
Vite - Next-generation build tool
TailwindCSS - Utility-first CSS framework
Framer Motion - Production-ready animation library
Lucide React - Beautiful icon system

Backend

FastAPI - High-performance Python web framework
Edge-TTS - Python client for Microsoft Edge's online text-to-speech service
Uvicorn - Lightning-fast ASGI server

Libraries & Tools

PDF.js - PDF parsing and rendering
Mammoth.js - DOCX to HTML conversion
Marked - Markdown parser
ESLint - Code quality and consistency

🚀 Getting Started

Prerequisites

Node.js (v18 or higher)
Python (v3.8 or higher)
npm or pnpm

Installation

Clone the repository

```
git clone https://github.com/VijayAdithyaBK/text-reader.git
cd text-reader
```

Install frontend dependencies
```
npm install
```

Install backend dependencies

```
cd backend
pip install -r requirements.txt
```

Running the Application

Option 1: Using the startup script (Windows)

```
.\start_server.bat
```

Option 2: Manual startup

Start the backend server from the project root (skip `cd backend` if you are already in that directory):
```
cd backend
uvicorn main:app --reload
```
In a new terminal, start the frontend:
```
npm run dev
```
Open your browser to http://localhost:5173

⚡ Usage

🚦 API Endpoints

| Method | Endpoint | Description |
|--------|----------|-------------|
| `GET` | `/voices` | Fetch available TTS voices |
| `POST` | `/tts` | Generate speech from text |
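
A minimal sketch of consuming `GET /voices` from Python follows. This README does not document the response shape, so the code assumes the endpoint returns a JSON array of objects carrying `ShortName` and `Locale` keys (the field names used by the edge-tts voice list). The base URL is uvicorn's default address and is also an assumption. The helper names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict, List

# Assumption: uvicorn's default host and port; adjust to match your setup.
BASE_URL = "http://127.0.0.1:8000"


def parse_voices(body: bytes) -> List[Dict[str, Any]]:
    """Decode the response body, assuming it is a JSON array of voice objects."""
    voices = json.loads(body)
    if not isinstance(voices, list):
        raise ValueError("expected a JSON array of voices")
    return voices


def filter_by_locale(voices: List[Dict[str, Any]], prefix: str) -> List[str]:
    """Return sorted short names of voices whose locale starts with prefix."""
    return sorted(
        v["ShortName"] for v in voices if v.get("Locale", "").startswith(prefix)
    )


def fetch_voices(base_url: str = BASE_URL) -> List[Dict[str, Any]]:
    """Call GET /voices on a running backend and parse the result."""
    with urllib.request.urlopen(f"{base_url}/voices") as response:
        return parse_voices(response.read())
```

With the backend running, `filter_by_locale(fetch_voices(), "en-")` would list the English voices, provided the response matches the assumed shape.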

Request Body for /tts:

```
{
  "text": "Hello, world!",
  "voice": "en-US-GuyNeural",
  "rate": "+0%",
  "pitch": "+0Hz"
}
```
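
The request body above can be built and sent with Python's standard library, as in the sketch below. Several details are assumptions rather than documented behavior: the base URL is uvicorn's default address, the response body is taken to be the raw audio bytes, and the helper names (`format_rate`, `format_pitch`, `build_tts_request`, `synthesize`) are hypothetical.

```python
import json
import urllib.request

# Assumption: uvicorn's default host and port; adjust to match your setup.
BASE_URL = "http://127.0.0.1:8000"


def format_rate(percent: int) -> str:
    """Render a rate offset as a signed percentage, e.g. 10 -> '+10%'."""
    return f"{percent:+d}%"


def format_pitch(hertz: int) -> str:
    """Render a pitch offset as a signed hertz value, e.g. -5 -> '-5Hz'."""
    return f"{hertz:+d}Hz"


def build_tts_request(text, voice="en-US-GuyNeural", rate=0, pitch=0,
                      base_url=BASE_URL):
    """Build (but do not send) a POST /tts request with a JSON body."""
    if not text.strip():
        raise ValueError("text must not be empty")
    payload = {
        "text": text,
        "voice": voice,
        "rate": format_rate(rate),
        "pitch": format_pitch(pitch),
    }
    return urllib.request.Request(
        f"{base_url}/tts",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path="speech.mp3", **options):
    """Send the request and save the response body to out_path.

    Assumption: the endpoint responds with the audio bytes directly.
    """
    request = build_tts_request(text, **options)
    with urllib.request.urlopen(request) as response:
        audio = response.read()
    with open(out_path, "wb") as handle:
        handle.write(audio)
    return out_path
```

With the backend running, `synthesize("Hello, world!", rate=10)` would request slightly faster speech and write the result to `speech.mp3`. Keeping request construction separate from sending makes the payload easy to inspect without a live server.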

🔧 Configuration

The application can be customized through:

Voice Presets (src/data/voicePresets.ts) - Add custom voice configurations
Backend URL (src/App.tsx) - Configure API endpoint
Tailwind Config - Customize design tokens

🏗️ Architecture

📂 Project Structure

```
text-reader/
├── src/
│   ├── components/          # React components
│   │   ├── Controls.tsx     # Audio control components
│   │   ├── FileUploader.tsx # File upload handling
│   │   ├── TextInput.tsx    # Text input component
│   │   ├── VoiceGallery.tsx # Voice selection UI
│   │   └── WaveformPlayer.tsx # Audio visualization
│   ├── utils/
│   │   └── fileParsers.ts   # Multi-format file parsers
│   ├── data/
│   │   └── voicePresets.ts  # Voice configuration
│   └── App.tsx              # Main application
├── backend/
│   ├── main.py              # FastAPI server
│   └── requirements.txt     # Python dependencies
├── package.json
└── vite.config.ts
```

🎯 Technical Achievements

Separation of Concerns - Clean frontend/backend architecture.
Type Safety - Full TypeScript coverage for maintainability.
API Design - RESTful API with clear endpoints.
State Management - React hooks for efficient state handling.
Performance Optimizations - Lazy loading components, efficient blob handling, and tree-shaking.
Code Quality - ESLint integration, modular component architecture, and graceful error handling.

📊 Performance Metrics

Build Size: Optimized production bundle
First Contentful Paint: <1s
Time to Interactive: <2s
Lighthouse Score: not yet recorded

🔜 Roadmap

Multi-language support with i18n
User authentication and saved preferences
Cloud storage integration
Batch processing for multiple files
Advanced audio effects (reverb, echo, etc.)
Voice cloning capabilities
Progressive Web App (PWA) support
Real-time collaboration features

🤝 Contributing

Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.

Fork the Project
Create your Feature Branch (git checkout -b feature/AmazingFeature)
Commit your Changes (git commit -m 'Add some AmazingFeature')
Push to the Branch (git push origin feature/AmazingFeature)
Open a Pull Request

📝 License

This project is available for portfolio demonstration purposes.

👨‍💻 Contact

Vijay Adithya B K

📧 Email: vijayadithyak@gmail.com
💼 LinkedIn: linkedin.com/in/vijayadithyabk
🌐 Portfolio: vijayadithyabk.github.io/data-nexus/
🐙 GitHub: @VijayAdithyaBK

🙏 Acknowledgments

Microsoft Edge TTS for voice synthesis
The React and FastAPI communities

⭐ If you find this project interesting, please consider giving it a star! ⭐

⚡ Crafted by Vijay Adithya B K
