A cutting-edge text extraction and summarization solution built with React, TypeScript, and Python, featuring a modern tech stack and intelligent document processing capabilities. Transform your physical documents into digital format with ease and precision.
- AI Text Summarization - Added intelligent text summarization with OpenRouter Mistral integration
- PDF Support - Upload and process PDF documents for text extraction and summarization
- Smart Response Cleaning - AI responses automatically formatted for human readability
- Multilanguage Support - Added support for 13+ languages with seamless language switching
- UI Improvements - Enhanced user interface for better user experience
- Backend Optimization - Improved text detection accuracy and processing speed
- Drag-and-drop file upload support
- Real-time camera-based document scanning
- Intelligent frame capture and processing
- Support for multiple image formats (JPG, PNG, BMP)
- Advanced image preprocessing for optimal results
- High-accuracy text extraction using Tesseract OCR
- Smart document layout analysis
- Automatic document deskewing and orientation correction
- Noise reduction and image enhancement
- Confidence-based text filtering
- Multi-language support for text recognition in 13+ languages
- Multiple export formats:
- Text files (TXT) with UTF-8 encoding
- Word documents (DOCX) with formatting preservation
- Excel spreadsheets (XLSX) for tabular data
- Smart paragraph and layout detection
- Header/footer identification
- Table structure recognition
- Live camera feed processing
- Instant text extraction feedback
- Progress indicators and status updates
- Performance optimization for smooth operation
- Frame rate and resolution control
- Comprehensive input validation
- Detailed error messaging
- Automatic retry mechanisms
- Fallback strategies
- Secure file handling
- Smart Summarization Options:
- Local Smart Summary - Advanced algorithms for intelligent text condensation
- AI Summary (OpenRouter Mistral) - High-quality AI-powered summarization
- Multiple Summary Formats:
- Paragraph summaries for natural reading
- Bullet point summaries for quick scanning
- Key phrase extraction for topic identification
- Customizable Length Control:
- Short (25%) - Quick overview in 2-3 sentences
- Medium (50%) - Balanced summary in 3-4 sentences
- Long (75%) - Detailed summary in 5-7 sentences
- PDF Support - Upload and summarize PDF documents directly
- Rate Limit Management - Real-time API usage monitoring with cooldown display
- Clean Output - AI responses automatically cleaned for human readability
- Support for 13+ languages including:
- English, French, Spanish, German
- Italian, Portuguese, Dutch
- Chinese, Japanese, Korean
- Russian, Arabic, Hindi
- Language-specific text recognition optimization
- Easy language selection via dropdown interface
- Real-time language switching
VisionScript/
├── src/ # React + TypeScript frontend
│ ├── components/ # Reusable UI components
│ │ ├── ImageUpload.tsx # Image upload component
│ │ ├── RealTimeDetection.tsx # Camera-based detection
│ │ ├── LanguageSelector.tsx # Language selection component
│ │ └── TextSummarizer.tsx # Text summarization component
│ └── styles/ # Tailwind CSS styling
├── backend/ # Python Flask backend
│ ├── main_test.py # Main application server
│ └── .env # Environment configuration
├── docs/ # Documentation
└── node_modules/ # Dependencies
-
Clone the repository:
git clone git@github.com:imadnan4/VisionScript.git
-
Navigate to the project directory:
cd VisionScript
-
Install dependencies:
npm install cd backend pip install -r requirements.txt
-
Start the development server:
npm run dev
-
(Optional) Set up AI Text Summarization:
- Get a free API key from OpenRouter
- Add to
backend/.env
file:OPENROUTER_API_KEY=your_api_key_here
- Restart the backend server to enable AI summarization
-
Using the multilanguage feature:
- Look for the globe icon in the interface
- Click to open the language dropdown
- Select your preferred language for OCR processing
- The system will automatically optimize text recognition for the selected language
-
Using the text summarizer:
- Upload images/PDFs or use camera to extract text
- Click "Summarize Text" button on extracted text
- Or go directly to "Text Summarizer" from the home screen
- Choose between Local Smart Summary or AI Summary (OpenRouter)
- Select format (paragraph, bullets, key phrases) and length
- Export summaries in TXT, DOCX, or Excel format
- React 18+ with TypeScript
- Vite for fast development
- Tailwind CSS for modern styling
- Framer Motion for smooth animations
- React Context API for state management
- Python Flask for API endpoints
- Tesseract OCR for text extraction
- EasyOCR for multi-language support
- OpenCV for image processing
- python-docx for Word document generation
- PyPDF2 for PDF text extraction
- Text Summarization Stack:
- SUMY library for local extractive summarization
- NLTK for natural language processing
- OpenRouter API integration for AI-powered summarization
- Mistral AI model for high-quality text generation
- Custom response cleaning algorithms
- Text Translation - Integrate Google Translate API for multi-language text translation
- Voice-to-Text - Add speech recognition for audio file transcription
- Smart Text Comparison - Compare multiple documents and highlight differences
- Text Analytics Dashboard - Word count, reading time, sentiment analysis, and readability scores
- Custom Summary Templates - User-defined summary formats (executive summary, meeting notes, etc.)
- Batch Processing - Process multiple images/PDFs in one operation
- Text Search & Highlight - Search within extracted text and highlight matches
- Export to More Formats - Add support for Markdown, HTML, and plain text exports
- OCR Confidence Scoring - Display confidence levels for extracted text with editing suggestions
- Dark/Light Theme Toggle - Customizable UI themes for better user experience
This project is licensed under the MIT License - see the LICENSE file for details.
- Adnan
- GitHub: @imadnan4
Contributions, issues, and feature requests are welcome! Feel free to check the issues page.