Advanced dual-mode web scraping tool with intelligent exhibition directory extraction
- 🎯 Enhanced Selenium Scraper with intelligent "Load More" button handling
- 🏢 Exhibition Directory Specialization optimized for InfoSecurity Europe, Milipol, etc.
- 🔄 Advanced Pagination with scroll detection and content stability checks
- 📊 Comprehensive Coverage ensuring 95%+ exhibitor extraction
- 🎪 Preset Management with 25+ pre-configured exhibition scrapers
- 🚀 Executable Build - Ready-to-use .exe application
- ⚡ Simple Mode: Fast scraping for static HTML sites (requests + BeautifulSoup)
- 🌐 Selenium Mode: Full JavaScript rendering for modern websites
- 🔄 Auto Mode: Automatically detects the best scraping method
- Intelligent "Load More" handling with multiple selector fallbacks
- Scroll-triggered content loading detection
- Pagination automation for multi-page directories
- 25+ Pre-configured exhibition presets (InfoSecurity Europe, Milipol, Eurosatory, etc.)
- Company name extraction with UI element filtering
- Modern GUI with tkinter
- Real-time progress tracking
- Preset management system
- CSV preview before export
- One-click executable
- CSV export with timestamps
- Duplicate removal
- Data filtering and cleaning
- Batch processing support
- Download the latest WebTagContentExtractor.exe from Releases
- Run the executable - no installation required!
- Load presets and start scraping
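
For readers curious how the data-processing features listed above (UI element filtering, duplicate removal, timestamped CSV export) might fit together, here is a minimal sketch using only the Python standard library. It is not the tool's actual code: the function names, the `UI_NOISE` label set, and the `company_name` column header are all assumptions made for illustration.

```python
import csv
import io
from datetime import datetime

# Hypothetical set of UI labels to discard; the tool's real filter list
# is not documented in this README.
UI_NOISE = {"load more", "show more", "next", "previous", "view profile", "read more"}


def clean_names(raw_names):
    """Normalize whitespace, drop empty strings and UI labels, and remove
    case-insensitive duplicates while keeping first-seen order."""
    seen = set()
    cleaned = []
    for name in raw_names:
        text = " ".join(name.split())  # collapse runs of whitespace
        key = text.casefold()
        if not text or key in UI_NOISE or key in seen:
            continue
        seen.add(key)
        cleaned.append(text)
    return cleaned


def timestamped_filename(prefix="exhibitors", now=None):
    """Build a file name such as exhibitors_20240102_030405.csv."""
    now = now or datetime.now()
    return f"{prefix}_{now.strftime('%Y%m%d_%H%M%S')}.csv"


def write_csv(names, stream):
    """Write one company name per row under an assumed header."""
    writer = csv.writer(stream)
    writer.writerow(["company_name"])
    for name in names:
        writer.writerow([name])


if __name__ == "__main__":
    names = clean_names(["  Acme  Corp ", "Load More", "acme corp", "", "Beta Ltd"])
    # newline="" lets the csv module control line endings itself.
    with open(timestamped_filename(), "w", newline="", encoding="utf-8") as fh:
        write_csv(names, fh)
```

Keeping the cleaning step separate from the file writing makes it possible to preview the cleaned rows before anything is written to disk.
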
# Clone the repository
git clone https://github.com/yourusername/WebTagContentExtractor.git
cd WebTagContentExtractor
# Install dependencies
pip install -r requirements.txt
# Run the application
python main_window.py
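
As an illustration of what Auto Mode could be doing when it picks a scraping method, the sketch below applies one simple heuristic: if a page's raw HTML contains very little visible text but does include scripts, it probably needs JavaScript rendering. This README does not describe the real detection logic, so the `choose_mode` function, the `_PageStats` helper, and the 200-character threshold are assumptions, not the tool's implementation.

```python
from html.parser import HTMLParser


class _PageStats(HTMLParser):
    """Count <script> tags and visible text characters in raw HTML."""

    def __init__(self):
        super().__init__()
        self.script_count = 0
        self.text_chars = 0
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self.script_count += 1
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Ignore script and style bodies; count only visible text.
        if not self._skip_depth:
            self.text_chars += len(data.strip())


def choose_mode(html, min_text_chars=200):
    """Return "selenium" for script-driven pages with sparse text,
    otherwise "simple". The threshold is an arbitrary assumption."""
    stats = _PageStats()
    stats.feed(html)
    stats.close()
    if stats.script_count > 0 and stats.text_chars < min_text_chars:
        return "selenium"
    return "simple"


if __name__ == "__main__":
    static_page = "<html><body>" + "<p>Exhibitor name here</p>" * 20 + "</body></html>"
    js_page = '<html><body><div id="root"></div><script src="app.js"></script></body></html>'
    print(choose_mode(static_page))
    print(choose_mode(js_page))
```

A heuristic like this can misclassify pages, so Simple Mode and Selenium Mode remain available as explicit choices.
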