- Second Prize at the Artificial Intelligence Competition 2025
Organized by the Faculty of Information Technology, NTT University.
🔗 Official News
This project consists of two main components:
- Data Pipeline Architecture
- AI Chatbot Architecture
Both systems are containerized using Docker for easy deployment and scalability.
- Client UI: Frontend interface with WebHook integration
- API Controller: Handles incoming requests and orchestrates data flow
- MinIO: Object storage for raw data
- Data Pipeline Worker: Main processing unit including:
  - Data Preprocessor
  - Text Parser
  - Extract Doc Information
  - Dense/Sparse Embedding generators
  - Integration with AI Services:
    - LangChain
    - Microsoft MarkItDown
    - OpenAI LLMs
    - Hugging Face models
- Message Queue System: Kafka-based with:
  - SUCCESS QUEUE
  - FAILED QUEUE
  - JOB QUEUE
- Vector Storage: Qdrant for vector data storage
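To illustrate the difference between the dense and sparse embedding generators listed above, here is a minimal sketch using only the standard library. It is a toy stand-in, not the project's implementation: the real worker presumably calls OpenAI or Hugging Face models. The function names `tokenize`, `sparse_embedding`, and `dense_embedding`, and the vector size of 8, are assumptions made for this example.

```python
import hashlib
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def _bucket(token: str, size: int) -> int:
    """Map a token to a stable bucket index (built-in hash() is salted per process)."""
    digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
    return int(digest, 16) % size


def sparse_embedding(text: str) -> dict[str, int]:
    """Sparse view: only the terms that occur, each with its count."""
    return dict(Counter(tokenize(text)))


def dense_embedding(text: str, size: int = 8) -> list[float]:
    """Dense view: a fixed-length, L2-normalised vector built with the hashing trick."""
    vector = [0.0] * size
    for token in tokenize(text):
        vector[_bucket(token, size)] += 1.0
    norm = math.sqrt(sum(v * v for v in vector))
    # An empty text has no tokens, so the zero vector is returned unchanged.
    return [v / norm for v in vector] if norm else vector
```

The sparse form keeps exact term matches, while the dense form has a fixed length that suits similarity search. Qdrant can store both kinds of vector for the same point.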
- Data enters through the Client UI or a webhook
- The API Controller processes and routes requests
- Raw data is stored in MinIO
- The Pipeline Worker processes documents through the preprocessing, parsing, extraction, and embedding stages
- Results are stored in the vector database (Qdrant)
- Status updates are managed through the Kafka queues
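The flow above can be sketched with in-memory stand-ins: a dict in place of MinIO, deques in place of the Kafka job, success, and failed queues, and a list in place of Qdrant. This is a simplified illustration under those assumptions, not the project's code. The class name `InMemoryPipeline` and its methods are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class InMemoryPipeline:
    """Toy stand-ins: dict for MinIO, deques for Kafka queues, list for Qdrant."""

    object_store: dict = field(default_factory=dict)
    job_queue: deque = field(default_factory=deque)
    success_queue: deque = field(default_factory=deque)
    failed_queue: deque = field(default_factory=deque)
    vector_store: list = field(default_factory=list)

    def submit(self, doc_id: str, raw: bytes) -> None:
        """API Controller step: store the raw bytes, then enqueue a job."""
        self.object_store[doc_id] = raw
        self.job_queue.append(doc_id)

    def run_worker(self) -> None:
        """Worker step: drain the job queue and report each outcome."""
        while self.job_queue:
            doc_id = self.job_queue.popleft()
            try:
                text = self.object_store[doc_id].decode("utf-8").strip()
                if not text:
                    raise ValueError("empty document")
                # A real worker would parse and embed here; this sketch stores text only.
                self.vector_store.append({"id": doc_id, "text": text})
                self.success_queue.append(doc_id)
            except ValueError as exc:  # also covers UnicodeDecodeError
                self.failed_queue.append((doc_id, str(exc)))


pipeline = InMemoryPipeline()
pipeline.submit("a.txt", b"hello world")
pipeline.submit("b.txt", b"   ")
pipeline.run_worker()
```

After the run, `a.txt` appears on the success queue and in the vector store, while the blank `b.txt` lands on the failed queue with its error message.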
- Client UI: Web-based interface
- API Controller: Request handling and routing
- Chat Engine: Core conversational processing
- Knowledge Service: Information management
- Storage Systems:
  - Store History: Chat history storage
  - MetaStore: Metadata management
  - MinIO: Object storage
- External Services:
  - Telegram integration
  - Gmail integration
  - Internet Search (BCP)
  - LangChain Agents AI
  - OpenAI LLMs
- Multi-channel support (Telegram, Gmail)
- Admin Portal for system management
- Integrated knowledge base
- Real-time chat capabilities
- External service integrations
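One way to picture multi-channel support is to normalise every inbound message into a common shape before it reaches the chat engine. The sketch below assumes that design. `InboundMessage`, `from_telegram`, `from_gmail`, and `ChatEngine` are hypothetical names, the Gmail dict layout is a simplification invented for the example, and the reply is a placeholder rather than an LLM call.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class InboundMessage:
    channel: str       # "telegram" or "gmail"
    conversation: str  # chat id or sender address
    text: str


def from_telegram(update: dict) -> InboundMessage:
    """Read the chat id and text from a Telegram Bot API style update."""
    message = update["message"]
    return InboundMessage("telegram", str(message["chat"]["id"]), message["text"])


def from_gmail(mail: dict) -> InboundMessage:
    """Read a simplified, hypothetical mail dict with 'from' and 'body' keys."""
    return InboundMessage("gmail", mail["from"], mail["body"])


class ChatEngine:
    """Keeps per-conversation history; the reply is a placeholder."""

    def __init__(self) -> None:
        self.history = defaultdict(list)

    def handle(self, msg: InboundMessage) -> str:
        # History is keyed by channel and conversation so channels never mix.
        key = (msg.channel, msg.conversation)
        self.history[key].append(msg.text)
        return f"[{msg.channel}] message {len(self.history[key])} received"
```

For example, `ChatEngine().handle(from_telegram({"message": {"chat": {"id": 42}, "text": "hi"}}))` records the text under the key `("telegram", "42")`.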
- Containerization: Docker
- Storage: MinIO, Qdrant, Redis
- Message Queue: Kafka
- AI/ML: OpenAI, Huggingface, LangChain
- External APIs: Telegram, Gmail
- Docker and Docker Compose
- MinIO credentials
- API keys for external services
- Sufficient storage and computing resources
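A small preflight check can confirm the prerequisites above before starting the containers. This is a sketch only: the environment variable names below are assumptions chosen for illustration, since the project's actual configuration keys are not documented here.

```python
import os
import shutil
from collections.abc import Mapping

# Hypothetical variable names; substitute the keys the deployment really uses.
REQUIRED_ENV_VARS = (
    "MINIO_ACCESS_KEY",
    "MINIO_SECRET_KEY",
    "OPENAI_API_KEY",
    "TELEGRAM_BOT_TOKEN",
)


def missing_settings(env: Mapping) -> list:
    """Return the required variables that are unset or empty."""
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


def docker_available() -> bool:
    """Check whether a `docker` executable is on the PATH."""
    return shutil.which("docker") is not None


if __name__ == "__main__":
    missing = missing_settings(os.environ)
    if missing:
        print("Missing settings:", ", ".join(missing))
    if not docker_available():
        print("Docker was not found on PATH")
```

Running the script before `docker compose up` surfaces missing credentials early instead of at container start.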
This project is private and not publicly distributed.
All source code and model details are owned by the author.














