- Calls API
This project was developed for the DevChallenge IT XXI Backend category. It processes and analyzes telephone conversations to extract structured datasets for analysis: names, locations, emotional tone, and a content-based category for each conversation. It runs without an internet connection and supports local file processing. The detailed task description is in task.md.
- Submit audio files via a URL for processing.
- Extract key information, including names and locations mentioned in conversations.
- Determine the emotional tone of conversations.
- Categorize conversations into relevant groups.
- Support for multiple audio formats (e.g., WAV, MP3).
- RESTful API accessible through a user-friendly documentation interface.
- Local file handling and offline processing capabilities.
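As a rough illustration of the first feature, the sketch below builds a JSON request that submits an audio URL for processing. The base URL matches the docs address used in this README, but the `/api/call` path and the `audio_url` field name are assumptions made for the example; check the documentation interface for the actual routes and schema.

```python
import json
import urllib.request

# The base URL matches the docs address used in this README.
# Assumption: the "/api/call" path and the "audio_url" field are hypothetical.
BASE_URL = "http://localhost:8080"
SUBMIT_PATH = "/api/call"


def build_submit_request(audio_url: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request that submits an audio URL."""
    body = json.dumps({"audio_url": audio_url}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + SUBMIT_PATH,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(audio_url: str) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_submit_request(audio_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Separating request construction from sending keeps the payload easy to inspect before the server is running.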
- FastAPI: A modern, fast (high-performance) web framework for building APIs with Python 3.6+ based on standard Python type hints.
- PostgreSQL: A powerful, open-source object-relational database system.
- SQLAlchemy: SQL toolkit and ORM for database interaction.
- Alembic: A lightweight database migration tool for SQLAlchemy.
- Whisper: OpenAI’s automatic speech recognition model, used for transcribing audio to text.
- ffmpeg: A multimedia framework for handling audio and video processing.
- spaCy: Used for extracting names and locations from transcriptions.
- TextBlob: Provides tools for text analysis, such as sentiment analysis.
- Docker: To containerize the application for consistent deployment across environments.
- Docker Compose: For defining and running multi-container Docker applications.
- aiohttp: Used for asynchronous HTTP requests.
- Poetry: Python packaging and dependency management tool.
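The way these tools could fit together can be sketched as a small pipeline: transcribe, extract entities, score the tone, then categorize. This is not the project's implementation. The Whisper, spaCy, and TextBlob calls are replaced by stand-in callables so the example runs with the standard library only. The `CallAnalysis` record, the 0.1 tone threshold, and the keyword-based categorizer are all assumptions. The only library fact relied on is that TextBlob reports polarity as a float in [-1, 1].

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple


@dataclass
class CallAnalysis:
    """Hypothetical record for one analyzed call (not the project's schema)."""
    text: str
    names: List[str] = field(default_factory=list)
    locations: List[str] = field(default_factory=list)
    emotional_tone: str = "Neutral"
    categories: List[str] = field(default_factory=list)


def tone_from_polarity(polarity: float, threshold: float = 0.1) -> str:
    """Map a polarity score in [-1, 1] to a label; the threshold is an assumption."""
    if polarity > threshold:
        return "Positive"
    if polarity < -threshold:
        return "Negative"
    return "Neutral"


def categorize(text: str, keywords: Dict[str, Set[str]]) -> List[str]:
    """Return every category whose keyword set overlaps the words in the text."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return [category for category, terms in keywords.items() if words & terms]


def analyze(
    audio_path: str,
    transcribe: Callable[[str], str],
    extract_entities: Callable[[str], Tuple[List[str], List[str]]],
    polarity_of: Callable[[str], float],
    keywords: Dict[str, Set[str]],
) -> CallAnalysis:
    """Run the four stages in order and collect the results in one record."""
    text = transcribe(audio_path)
    names, locations = extract_entities(text)
    return CallAnalysis(
        text=text,
        names=names,
        locations=locations,
        emotional_tone=tone_from_polarity(polarity_of(text)),
        categories=categorize(text, keywords),
    )


# Stand-ins for the real speech-to-text, NER, and sentiment stages.
result = analyze(
    "call.wav",
    transcribe=lambda path: "Hello, this is Anna from Kyiv. I am happy with the refund.",
    extract_entities=lambda text: (["Anna"], ["Kyiv"]),
    polarity_of=lambda text: 0.8,
    keywords={"Billing": {"refund", "invoice"}, "Technical support": {"error", "crash"}},
)
print(result)
```

Passing each stage in as a callable keeps the pipeline testable without loading any models.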
- Docker Desktop installed on your machine.
Run the following commands in the app directory:
docker-compose up -d --build
On the first run, the database and additional resources (e.g., whisper/base.pt) are set up, which may take some time. To restart the server:
docker-compose down
docker-compose up -d
Access the API via the documentation interface at:
http://localhost:8080/docs#/
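Because the first start can take a while, a script that depends on the API may want to wait until the documentation page responds. A minimal sketch, assuming the docs page is served at `http://localhost:8080/docs` as stated above; the `wait_for_docs` helper and its timeout values are illustrative, not part of the project.

```python
import time
import urllib.request


def wait_for_docs(
    url: str = "http://localhost:8080/docs",
    timeout_s: float = 120.0,
    interval_s: float = 2.0,
) -> bool:
    """Poll the docs URL until it returns HTTP 200 or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except OSError:
            # Covers refused connections, timeouts, and HTTP error statuses
            # (urllib.error.URLError and HTTPError are subclasses of OSError).
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Example usage (not executed here):
#   if not wait_for_docs():
#       raise SystemExit("API did not become ready in time")
```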
See details in readme.md
Access the API at:
http://localhost:8080/docs#/
From the app directory:
poetry shell
uvicorn main:app --port 8080 --reload
- Add more detailed error handling and logging.
- Add support for more audio file formats.
- Improve the accuracy of name and location extraction.
- Enhance the emotional tone detection algorithm.
- Add more categories and improve category detection.
- Implement a web-based user interface for easier interaction with the API.
- Fork the repository.
- Create a new feature branch:
git checkout -b feature-name
- Commit your changes:
git commit -m 'Add some feature'
- Push to the branch:
git push origin feature-name
- Open a pull request.
This project is licensed under the MIT License. See the LICENSE file for more information.