This project takes podcast episodes from the Podcast Index, transcribes the audio, summarizes the transcript, generates an image from the summary, translates the summary into French, and lets users ask questions about the episode. ElevenLabs generates an audio version of the summary.
- Audio to Text: Convert podcast episodes into text using Hugging Face models.
- Summarization: Create concise summaries of podcast episodes.
- Translation: Translate summarized content into French.
- Image Generation: Generate images from the summary.
- Q&A: Ask questions about the episode and get answers based on its content.
- Audio Creation: Generate audio content with ElevenLabs.
- User Authentication: Secure authentication and user management with ClerkJs.
- Next.js: Frontend framework for building fast and scalable web applications.
- Hugging Face: Provides models for transcription, summarization, and translation.
- ElevenLabs: Generates audio content based on summaries.
- LangChain: Orchestrates the workflow through a chain that connects each step.
- ClerkJs: User authentication and management.
- Axios: Handles API requests.
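Requests to the Podcast Index need authentication headers. As a minimal sketch, assuming the scheme described in the Podcast Index API documentation (a SHA-1 hash of the key, secret, and current Unix time), the headers could be built as follows. The function name `podcastIndexHeaders` and the `User-Agent` value are hypothetical and not taken from this project.

```typescript
import { createHash } from "node:crypto";

// Hypothetical helper: builds the auth headers the Podcast Index API expects.
// Assumption: Authorization = hex SHA-1 of apiKey + apiSecret + Unix time in seconds.
function podcastIndexHeaders(
  apiKey: string,
  apiSecret: string,
  now: Date = new Date(),
): Record<string, string> {
  // The API works with whole seconds since the epoch, sent as a string.
  const authDate = Math.floor(now.getTime() / 1000).toString();

  const authorization = createHash("sha1")
    .update(apiKey + apiSecret + authDate)
    .digest("hex");

  return {
    "User-Agent": "podcast-summarizer/0.1", // placeholder app name (assumption)
    "X-Auth-Key": apiKey,
    "X-Auth-Date": authDate,
    Authorization: authorization,
  };
}
```

The returned object can be passed as the `headers` option of an Axios request, for example `axios.get(url, { headers: podcastIndexHeaders(key, secret) })`, where `url` is whichever Podcast Index endpoint the app calls.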
- Install dependencies: `npm install`
- Start the development server: `npm run dev`
- Visit http://localhost:3000 to access the app.
- Fetch Podcast: Axios is used to retrieve podcast audio from the Podcast Index.
- Audio Transcription: Hugging Face models convert the audio into text.
- Summarization: The transcribed text is summarized using Hugging Face models.
- Translation: The summary is translated into French using Hugging Face translation models.
- Image Generation: An image is generated from the summary using AI tools.
- Audio Creation: ElevenLabs generates audio from the summarized content.
- Q&A: Users can ask questions about the episode, and LangChain coordinates the response process.
- Authentication: ClerkJs handles user login and account management.
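The workflow above can be sketched as a single function that runs the steps in order. This is an illustration under stated assumptions, not the project's actual code: the `PipelineSteps` interface, its method names, and `processEpisode` are hypothetical, and each step is injected so that it could be backed by Axios, Hugging Face, or ElevenLabs calls. Q&A and authentication are left out because they are interactive rather than part of the per-episode processing.

```typescript
// Hypothetical step signatures; a real app would implement these with
// Axios (fetch), Hugging Face (transcribe, summarize, translate),
// an image model, and ElevenLabs (speech).
interface PipelineSteps {
  fetchAudio(episodeUrl: string): Promise<Uint8Array>;
  transcribe(audio: Uint8Array): Promise<string>;
  summarize(text: string): Promise<string>;
  translateToFrench(text: string): Promise<string>;
  generateImage(prompt: string): Promise<string>; // e.g. an image URL
  synthesizeSpeech(text: string): Promise<Uint8Array>;
}

interface EpisodeResult {
  transcript: string;
  summary: string;
  summaryFr: string;
  imageUrl: string;
  audio: Uint8Array;
}

async function processEpisode(
  episodeUrl: string,
  steps: PipelineSteps,
): Promise<EpisodeResult> {
  // Fetching, transcription, and summarization are sequential:
  // each step needs the previous step's output.
  const episodeAudio = await steps.fetchAudio(episodeUrl);
  const transcript = await steps.transcribe(episodeAudio);
  const summary = await steps.summarize(transcript);

  // Translation, image generation, and speech synthesis depend only on
  // the summary, so they can run concurrently.
  const [summaryFr, imageUrl, audio] = await Promise.all([
    steps.translateToFrench(summary),
    steps.generateImage(summary),
    steps.synthesizeSpeech(summary),
  ]);

  return { transcript, summary, summaryFr, imageUrl, audio };
}
```

Injecting the steps keeps the orchestration testable with stubs and makes it easy to swap one provider for another without touching the control flow.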
To deploy the app:
- Build the app for production: `npm run build`
- Start the production server: `npm start`
Feel free to open issues or submit pull requests to improve the project. Contributions are welcome!
This project is licensed under the MIT License.