This high-performance, multi-threaded Python application concurrently retrieves public data from multiple social media platforms using a single target username. It is optimized for speed via parallel execution, intentionally leveraging high system resources (CPU and memory) as a trade-off for maximum performance. Advanced anti-detection measures and configurable auto-termination ensure controlled and stealthy operation.
- 🔍 Multi-Platform Concurrency: Simultaneously collects data from YouTube, Instagram, X, Threads, Quora, and Reddit.
- 👤 Human-Like Behavior: Randomized, time-delayed actions simulate human activity to reduce the risk of bot detection and IP bans.
- 💾 Unified Logging: Aggregates all collected data into a structured text file (`storing.txt`).
- 🛑 Auto-Termination: Automatically terminates the process once a pre-defined item count is reached.
- 🔑 Credential Support: Supports dummy/burner accounts for authenticated scraping sessions on specific platforms.
- Execution: Uses `ThreadPoolExecutor` or an equivalent to run platform-specific scrapers in parallel.
- Resource Trade-Off: High parallelism increases CPU and memory usage to drastically reduce total scraping time.
- Shared Resources: Output file (`storing.txt`) and a global collected-item counter.
- Synchronization: Threading locks ensure safe updates to the counter, file writes, and limit checks.
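The executor-plus-lock pattern described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the platform list, `fetch_profile()` stub, and worker count are assumptions made for the example.

```python
import os
import threading
from concurrent.futures import ThreadPoolExecutor

PLATFORMS = ["youtube", "instagram", "x", "threads", "quora", "reddit"]

lock = threading.Lock()   # guards both the counter and the output file
collected_count = 0

# Start from a clean output file for this demonstration run.
if os.path.exists("storing.txt"):
    os.remove("storing.txt")

def fetch_profile(platform, username):
    # Placeholder: real code would perform the platform-specific request here.
    return f"{platform}|{username}|stub-data"

def scrape(platform, username):
    global collected_count
    record = fetch_profile(platform, username)
    with lock:  # serialize counter updates and appends to the shared file
        collected_count += 1
        with open("storing.txt", "a", encoding="utf-8") as f:
            f.write(record + "\n")

# One worker per platform; the context manager waits for all tasks to finish.
with ThreadPoolExecutor(max_workers=len(PLATFORMS)) as pool:
    for p in PLATFORMS:
        pool.submit(scrape, p, "target_username")
```

Holding a single lock around both the counter increment and the file append keeps the two consistent, at the cost of briefly serializing writers.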
- Introduces random `time.sleep()` delays before each request, mimicking human behavior.
- Allows dummy/burner account credentials for authenticated sessions, increasing rate limits, data visibility, and IP safety.
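The randomized delay is a one-liner in practice. A small sketch, where the bounds are illustrative assumptions rather than the project's actual values:

```python
import random
import time

def humanized_delay(min_s=1.5, max_s=6.0):
    """Sleep for a random interval so requests don't arrive at a fixed cadence.

    min_s and max_s are example bounds; tune them per platform's tolerance.
    Returns the chosen pause so callers can log it.
    """
    pause = random.uniform(min_s, max_s)
    time.sleep(pause)
    return pause
```

A jittered, non-uniform cadence is harder to fingerprint than a fixed `time.sleep(2)` between requests.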
- Limit Variable: `MAX_FETCH_COUNT` sets the operational lifespan.
- Termination Logic: The main thread monitors the thread-safe counter and executes `sys.exit(0)` once the limit is reached.
- Post-execution, verify no orphaned threads or helper processes remain to prevent unnecessary resource usage.
- Data is appended to `storing.txt` in a pipe-delimited format:
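The exact column order is not specified in this excerpt; as a hypothetical illustration, a record might join its fields with `|` and occupy one line per item:

```python
# Hypothetical record layout -- the actual field order and field set used by
# the tool are not shown here; these four fields are illustrative only.
fields = ["reddit", "target_username", "2024-01-01T12:00:00", "post title"]
record = "|".join(fields)

with open("storing.txt", "a", encoding="utf-8") as f:
    f.write(record + "\n")

# Reading back: split each line on the delimiter to recover the fields.
parsed = record.split("|")
```

One caveat with pipe-delimited text: any `|` appearing inside a field (e.g. in a post title) must be escaped or stripped before writing, or the line will not round-trip.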