A modern Python web application for aggregating and browsing AI tools. Built with FastHTML for the frontend and featuring AI-powered search capabilities.
🌐 Live at: drose.io/aitools
The AI Tools Website aggregates various AI tools and presents them in a responsive, searchable interface. Recent refactoring has separated core functionalities into distinct modules—web, search, logging, data management, and storage—to better organize and scale the application.
- Modular Architecture: Separation of concerns across data processing, logging, search, and storage.
- Modern Tech Stack: Built with FastHTML for server-side rendering.
- Enhanced Search: AI-powered search with support for OpenAI and Tavily integrations.
- Flexible Storage: Supports both local storage and Minio S3-compatible storage.
- Robust Logging: Improved logging configuration for easier debugging and monitoring.
- Efficient Dependency Management: Uses UV for dependency synchronization and task execution.
```
.
├── ai_tools_website/        # Main application package
│   ├── __init__.py          # Package initializer
│   ├── config.py            # Application configuration
│   ├── data_manager.py      # Data processing and validation
│   ├── logging_config.py    # Logging configuration
│   ├── search.py            # AI-powered search implementation
│   ├── storage.py           # Storage interfaces (local/Minio)
│   ├── web.py               # FastHTML web server
│   ├── utils/               # Utility functions
│   └── static/              # Client-side assets
│
├── scripts/                 # Automation scripts
│   ├── crontab              # Scheduled task configuration
│   └── run-update.sh        # Tool update script
│
├── data/                    # Data storage directory
├── logs/                    # Application logs
│
├── docker-compose.yml       # Docker Compose configuration
├── Dockerfile               # Web service container
├── Dockerfile.updater       # Update service container
├── pyproject.toml           # Python project configuration
└── uv.lock                  # UV dependency lock file
```
- Web Interface:
  - The FastHTML server (web.py) renders a responsive UI with real-time client-side search.
- Data Management & Search:
  - Data is processed and validated in data_manager.py.
  - search.py leverages AI integrations to provide enhanced search functionality.
- Storage & Logging:
  - storage.py handles file storage, supporting local and Minio backends.
  - logging_config.py sets up comprehensive logging for monitoring and debugging.
The system uses a multi-stage pipeline for discovering and validating AI tools:
- Search Integration (see the sketch below)
  - Uses Tavily API for initial tool discovery
  - Focuses on high-quality domains (github.com, producthunt.com, huggingface.co, replicate.com)
  - Implements caching in development mode for faster iteration
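To make the discovery step concrete, here is a minimal sketch of what a Tavily-backed discovery call could look like. The query handling, cache path, and helper name are assumptions for illustration, not the project's actual code:

```python
import json
from pathlib import Path

from tavily import TavilyClient  # tavily-python package

# Domains the pipeline treats as high-quality sources.
QUALITY_DOMAINS = ["github.com", "producthunt.com", "huggingface.co", "replicate.com"]
CACHE_FILE = Path("data/search_cache.json")  # hypothetical dev-mode cache location


def discover_tools(query: str, api_key: str, use_cache: bool = False) -> list[dict]:
    """Query Tavily for candidate tools, optionally caching results in dev mode."""
    if use_cache and CACHE_FILE.exists():
        return json.loads(CACHE_FILE.read_text()).get(query, [])

    client = TavilyClient(api_key=api_key)
    response = client.search(query, include_domains=QUALITY_DOMAINS, max_results=20)
    results = response.get("results", [])

    if use_cache:
        CACHE_FILE.parent.mkdir(parents=True, exist_ok=True)
        CACHE_FILE.write_text(json.dumps({query: results}))
    return results
```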
- Validation Pipeline (see the sketch below)
  - Multi-stage verification using LLMs:
    - Initial filtering of search results (confidence threshold: 80%)
    - Page content analysis and verification (confidence threshold: 90%)
    - Category assignment based on existing tool context
  - URL validation to filter out listing/search pages
  - Async processing for improved performance
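A condensed sketch of how the staged, async LLM verification could be wired up. The model name, prompt, and helper functions are illustrative assumptions; the project configures its model via SEARCH_MODEL:

```python
import asyncio

from openai import AsyncOpenAI

client = AsyncOpenAI()  # reads OPENAI_API_KEY from the environment

FILTER_THRESHOLD = 0.80  # stage 1: quick filtering of raw search results
VERIFY_THRESHOLD = 0.90  # stage 2: full page-content verification


async def tool_confidence(text: str, model: str = "gpt-4o-mini") -> float:
    """Ask the LLM for a 0-1 confidence that the text describes a single AI tool."""
    response = await client.chat.completions.create(
        model=model,  # illustrative default; the project reads SEARCH_MODEL instead
        messages=[{
            "role": "user",
            "content": f"Rate from 0 to 1 how likely this text describes a single "
                       f"AI tool. Reply with only the number.\n\n{text}",
        }],
    )
    return float(response.choices[0].message.content.strip())


async def filter_results(candidates: list[dict]) -> list[dict]:
    """Stage 1: score snippets concurrently, keep those above 80% confidence."""
    scores = await asyncio.gather(*(tool_confidence(c["content"]) for c in candidates))
    return [c for c, score in zip(candidates, scores) if score >= FILTER_THRESHOLD]

# Stage 2 would fetch each surviving page and re-score the full content
# against VERIFY_THRESHOLD before category assignment.
```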
- Deduplication System (see the sketch below)
  - Two-pass deduplication:
    - Quick URL-based matching
    - LLM-based semantic comparison for similar tools
  - Confidence-based decision making for updates vs. new entries
  - Smart merging of tool information when duplicates found
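A rough sketch of the two-pass idea: pass one normalizes and compares URLs, pass two falls back to semantic comparison (an LLM in the real pipeline; a cheap string ratio stands in here). All helper names are hypothetical:

```python
from difflib import SequenceMatcher
from urllib.parse import urlparse


def normalize_url(url: str) -> str:
    """Normalize a URL so trivially different forms compare equal (pass 1)."""
    parsed = urlparse(url.lower())
    host = parsed.netloc.removeprefix("www.")
    return f"{host}{parsed.path}".rstrip("/")


def name_similarity(a: dict, b: dict) -> float:
    """Cheap stand-in for the LLM semantic comparison used by the real pipeline."""
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()


def find_duplicate(new_tool: dict, existing: list[dict]) -> dict | None:
    # Pass 1: quick, exact URL-based matching.
    key = normalize_url(new_tool["url"])
    for tool in existing:
        if normalize_url(tool["url"]) == key:
            return tool
    # Pass 2: semantic comparison for near-matches (LLM-based in the real system).
    for tool in existing:
        if name_similarity(new_tool, tool) >= 0.9:
            return tool
    return None
```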
- Data Models (see the sketch below)
  - ToolUpdate: Tracks tool verification decisions
  - SearchAnalysis: Manages search result analysis
  - DuplicateStatus: Handles deduplication decisions
  - Strong typing with Pydantic for data validation
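The model names below come from the list above, but the README does not document their fields, so the attributes here are illustrative guesses:

```python
from pydantic import BaseModel


class ToolUpdate(BaseModel):
    """Outcome of verifying a single candidate tool (fields are assumptions)."""
    name: str
    url: str
    category: str
    confidence: float  # 0-1 score from the LLM verification stage
    approved: bool


class SearchAnalysis(BaseModel):
    """LLM analysis of a raw search result (fields are assumptions)."""
    is_ai_tool: bool
    confidence: float
    reasoning: str


class DuplicateStatus(BaseModel):
    """Deduplication decision for a candidate vs. an existing entry (assumed fields)."""
    is_duplicate: bool
    confidence: float
    merge_fields: list[str] = []
```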
- Categorization (see the sketch below)
  - Dynamic category management
  - LLM-powered category suggestions
  - Supported categories:
    - Language Models
    - Image Generation
    - Audio & Speech
    - Video Generation
    - Developer Tools
    - Other
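One way an LLM-powered suggestion could be constrained to the supported categories, falling back to Other on an unexpected answer. The model and prompt are illustrative, not the project's actual code:

```python
from openai import OpenAI

CATEGORIES = [
    "Language Models", "Image Generation", "Audio & Speech",
    "Video Generation", "Developer Tools", "Other",
]

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def suggest_category(name: str, description: str) -> str:
    """Ask an LLM to pick one supported category, falling back to Other."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative; the project configures SEARCH_MODEL
        messages=[{
            "role": "user",
            "content": f"Pick the best category for '{name}': {description}\n"
                       f"Choose exactly one of: {', '.join(CATEGORIES)}",
        }],
    )
    answer = response.choices[0].message.content.strip()
    return answer if answer in CATEGORIES else "Other"
```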
The updater service (Dockerfile.updater) implements:
- Scheduled tool discovery using supercronic
- Automatic deduplication of new entries
- Health monitoring of the update process
- Configurable update frequency via crontab
- A weekly supercronic job calls `run-enhancement.sh`, which executes `uv run python -m ai_tools_website.v1.content_enhancer_v2` inside the updater container.
- The V2 enhancer uses a multi-stage pipeline (Tavily search + LLM analysis) to enrich tool records with detailed information, installation commands, and feature lists.
- Quality Tiering: Tools are automatically assigned to tiers (Tier 1, Tier 2, Tier 3, or noindex) based on importance signals like GitHub stars and HuggingFace downloads, ensuring resources are focused on high-value tools (see the tiering sketch below).
- Regeneration limits (`CONTENT_ENHANCER_MAX_PER_RUN`) are configurable. `CONTENT_ENHANCER_MODEL` must be set in the environment.
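A sketch of how tier assignment could work. The thresholds and signal field names are invented for illustration and are not the project's actual cutoffs:

```python
def assign_tier(tool: dict) -> str:
    """Map importance signals to a quality tier (thresholds are illustrative)."""
    stars = tool.get("github_stars", 0)
    downloads = tool.get("huggingface_downloads", 0)
    if stars >= 10_000 or downloads >= 1_000_000:
        return "tier1"
    if stars >= 1_000 or downloads >= 100_000:
        return "tier2"
    if stars >= 100 or downloads >= 10_000:
        return "tier3"
    return "noindex"  # excluded from search-engine indexing
```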
The application implements a flexible storage layer:
- Minio Integration (see the sketch below)
  - S3-compatible object storage
  - Automatic bucket creation and management
  - LRU caching for improved read performance
  - Graceful handling of initialization (empty data)
  - Content-type aware storage (application/json)
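A minimal sketch of a Minio-backed read/write path, assuming the MINIO_* variables listed under Configuration. The object name, the MINIO_SECURE toggle, and the helper names are assumptions:

```python
import io
import json
import os
from functools import lru_cache

from minio import Minio

client = Minio(
    os.environ["MINIO_ENDPOINT"],
    access_key=os.environ["MINIO_ACCESS_KEY"],
    secret_key=os.environ["MINIO_SECRET_KEY"],
    secure=os.environ.get("MINIO_SECURE", "true") == "true",  # hypothetical toggle
)
BUCKET = os.environ["MINIO_BUCKET_NAME"]

if not client.bucket_exists(BUCKET):  # automatic bucket creation
    client.make_bucket(BUCKET)


@lru_cache(maxsize=1)
def load_tools() -> dict:
    """Read tools.json from object storage; fall back to empty data on first run."""
    try:
        response = client.get_object(BUCKET, "tools.json")
        return json.loads(response.read())
    except Exception:
        return {"tools": [], "last_updated": ""}


def save_tools(data: dict) -> None:
    """Write tools.json with an explicit content type and invalidate the cache."""
    payload = json.dumps(data).encode()
    client.put_object(BUCKET, "tools.json", io.BytesIO(payload), len(payload),
                      content_type="application/json")
    load_tools.cache_clear()  # readers see fresh data on the next call
```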
- Data Format
  - JSON-based storage for flexibility
  - Schema:

    ```json
    {
      "tools": [
        {
          "name": "string",
          "description": "string",
          "url": "string",
          "category": "string"
        }
      ],
      "last_updated": "string"
    }
    ```

  - Atomic updates with cache invalidation
  - Error handling for storage operations
- Development Features
  - Local filesystem fallback
  - Development mode caching
  - Configurable secure/insecure connections
  - Comprehensive logging of storage operations
The frontend is built with FastHTML for efficient server-side rendering:
- Architecture (see the sketch below)
  - Server-side rendering with FastHTML components
  - Async request handling with uvicorn
  - In-memory caching with background refresh
  - Health check endpoint for monitoring
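A skeletal sketch of the service described above: FastHTML rendering, a background cache-refresh thread, and a health endpoint. The route paths, refresh interval, and stub loader are assumptions; the real web.py is more involved:

```python
import threading
import time

from fasthtml.common import Div, H1, P, fast_app, serve

app, rt = fast_app()
_cache = {"tools": []}  # in-memory cache served on every request


def load_tools() -> dict:
    """Stand-in loader; the real app reads via storage.py (local or Minio)."""
    return {"tools": [{"name": "Example Tool"}]}


def refresh_loop(interval: int = 300) -> None:
    """Background thread that periodically reloads tool data into the cache."""
    while True:
        _cache["tools"] = load_tools().get("tools", [])
        time.sleep(interval)


@rt("/")
def get():
    return Div(
        H1(f"AI Tools ({len(_cache['tools'])})"),  # dynamic tool count
        *[P(t["name"]) for t in _cache["tools"]],
    )


@rt("/health")
def health():
    return "OK"  # used by the Docker health check


threading.Thread(target=refresh_loop, daemon=True).start()
serve()  # runs uvicorn under the hood
```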
- UI Components
  - Responsive grid layout for tool cards
  - Real-time client-side search filtering
  - Category-based organization
  - Dynamic tool count display
  - GitHub integration corner
- Performance Features (see the sketch below)
  - Background cache refresh mechanism
  - Efficient DOM updates via client-side JS
  - Static asset serving (CSS, JS, images)
  - Optimized search with data attributes
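To illustrate the data-attribute optimization, a card component can embed pre-lowercased search text so client-side JS can filter cards without re-rendering. The attribute names here are illustrative:

```python
from fasthtml.common import A, Div, H3, P


def tool_card(tool: dict):
    """Render one tool card; data-search lets client JS filter without re-rendering."""
    return Div(
        H3(A(tool["name"], href=tool["url"])),
        P(tool["description"]),
        cls="tool-card",
        data_search=f"{tool['name']} {tool['description']} {tool['category']}".lower(),
        data_category=tool["category"],
    )
```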
- Development Mode
  - Hot reload support
  - Configurable port via environment
  - Static file watching
  - Detailed request logging
```bash
# Install UV if you haven't already
pip install uv

# Install dependencies
uv sync

# Set up environment variables (copy from .env.example)
cp .env.example .env

# Run the web server
uv run python -m ai_tools_website.web

# Run the background search/updater
uv run python -m ai_tools_website.search
```

Visit https://drose.io/aitools or http://localhost:8000 (for local development) in your browser.
The application uses environment variables for configuration. Copy .env.example to .env and configure the following:
- `WEB_PORT`: Web server port (default: 8000)
- `LOG_LEVEL`: Logging verbosity (default: INFO)
When running the search module, the following flags and environment variables apply:

- `--cache-searches`: Cache Tavily search results for faster iteration
- `--dry-run`: Run without saving any changes
- `OPENAI_API_KEY`: OpenAI API key for enhanced search
- `TAVILY_API_KEY`: Tavily API key for additional search features
- `CONTENT_ENHANCER_MODEL`: Model for content enhancement (required, no default)
- `SEARCH_MODEL`: Model for search and deduplication (required, no default)
- `MAINTENANCE_MODEL`: Model for maintenance tasks (required, no default)
- `WEB_SEARCH_MODEL`: Model for web search API calls (required, no default)
- `LANGCHAIN_API_KEY`: Optional LangChain integration
- `LANGCHAIN_TRACING_V2`: Enable LangChain tracing (default: false)
- `LANGCHAIN_PROJECT`: LangChain project name
- `TOOLS_FILE`: Path to tools data file (default: "data/tools.json")
If using Minio for storage, configure:
- `MINIO_ENDPOINT`: Minio server endpoint
- `MINIO_ACCESS_KEY`: Minio access key
- `MINIO_SECRET_KEY`: Minio secret key
- `MINIO_BUCKET_NAME`: Bucket name for tool storage
See .env.example for a template with default values.
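Because several model variables are required with no default, a startup check along these lines can fail fast on misconfiguration. This is a sketch, not the project's actual config.py:

```python
import os

REQUIRED_VARS = [
    "CONTENT_ENHANCER_MODEL", "SEARCH_MODEL",
    "MAINTENANCE_MODEL", "WEB_SEARCH_MODEL",
]


def load_config() -> dict:
    """Validate required model variables and collect settings with defaults."""
    missing = [v for v in REQUIRED_VARS if not os.environ.get(v)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    return {
        "web_port": int(os.environ.get("WEB_PORT", "8000")),
        "log_level": os.environ.get("LOG_LEVEL", "INFO"),
        "tools_file": os.environ.get("TOOLS_FILE", "data/tools.json"),
        **{v.lower(): os.environ[v] for v in REQUIRED_VARS},
    }
```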
- Refactored the codebase to separate concerns:
  - data_manager.py now handles data processing and validation.
  - search.py is refactored for clarity and integration with AI services.
- Improved logging configuration in logging_config.py.
- Enhanced storage interface in storage.py to support multiple backends.
- Adopted UV for dependency management and task execution best practices.
The application is containerized using Docker with two services:
- Web Service
  - Serves the main web application
  - Built from `Dockerfile`
  - Exposes the configured web port
  - Includes health checks for reliability
- Updater Service
  - Runs scheduled tool updates using supercronic
  - Built from `Dockerfile.updater`
  - Automatically keeps tool data fresh
  - Includes health monitoring
- Nightly sitemap exports run inside the updater container via `run-sitemaps.sh` (scheduled in `scripts/crontab` at 05:00 UTC).
- You can generate the XML bundle manually with `uv run python -m ai_tools_website.v1.sitemap_builder --dry-run`, or omit `--dry-run` to publish directly to MinIO (see the sketch below).
- Sitemaps are stored under the `sitemaps/` prefix in object storage and served through `/sitemap.xml` plus `/sitemaps/<file>.xml` routes.
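For a sense of what the XML bundle contains, a minimal sitemap builder using only the standard library might look like this; the URL list and function name are illustrative, not the sitemap_builder module's actual code:

```python
from xml.etree.ElementTree import Element, SubElement, tostring


def build_sitemap(urls: list[str]) -> bytes:
    """Build a sitemap.org-compliant XML document from a list of page URLs."""
    urlset = Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    for url in urls:
        SubElement(SubElement(urlset, "url"), "loc").text = url
    return tostring(urlset, encoding="utf-8", xml_declaration=True)


# Example: bytes suitable for upload under the sitemaps/ prefix in object storage.
xml_bytes = build_sitemap(["https://drose.io/aitools"])
```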
To deploy using Docker Compose:
```bash
# Build and start all services
docker compose up -d

# View logs
docker compose logs -f

# Stop services
docker compose down
```

Make sure to configure your `.env` file before deployment. See Configuration section above for required variables.
This project is licensed under the Apache License 2.0.