Enterprise-grade data intelligence API with 9 enrichment tools + bulk processing
Status: Production-Ready | SaaS-Ready | Fully Deployed
Live Endpoint: https://scaile--g-mcp-tools-fast-api.modal.run
Interactive Docs: Swagger UI | ReDoc
A complete data enrichment API built on Modal.com, combining AI-powered web scraping with 8 specialized intelligence tools. Perfect for sales intelligence, market research, lead enrichment, and data validation.
- 9 Enrichment Tools - Web scraping, email intel, company data, phone validation, and more
- Bulk Processing - Process 100s-1000s of records in parallel with auto-detection
- Smart Auto-Detection - Automatically detect data types and apply appropriate tools
- Multi-Tool Enrichment - Combine multiple tools on a single record
- AI-Powered Extraction - Uses Gemini 2.5 Flash for intelligent data extraction
- Production-Ready - Authentication, health checks, comprehensive error handling
- Auto-Scaling - Serverless architecture handles traffic spikes automatically
- 24-Hour Cache - Reduces costs and improves response times
- OpenAPI Docs - Swagger/ReDoc for easy integration
- Type-Safe - Pydantic models for all inputs/outputs
NEW: Process multiple records in parallel with intelligent auto-detection!
Enrich a single record with multiple tools at once:
curl -X POST https://scaile--g-mcp-tools-fast-api.modal.run/enrich \
-H 'Content-Type: application/json' \
-d '{
"data": {
"phone": "+14155552671",
"email": "john@anthropic.com"
},
"tools": ["phone-validation", "email-intel", "email-pattern"]
}'
Automatically detect data types and apply appropriate tools:
curl -X POST https://scaile--g-mcp-tools-fast-api.modal.run/enrich/auto \
-H 'Content-Type: application/json' \
-d '{
"data": {
"contact_phone": "+14155552671",
"work_email": "john@anthropic.com",
"company_domain": "anthropic.com"
}
}'
Response: the record is automatically detected and enriched with 5 tools (phone validation, email intel, email pattern, WHOIS, tech stack).
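The auto-detection step described above can be pictured as a mapping from value shapes to tools. The following is only a minimal sketch: the regular expressions, the `detect_type` and `plan_tools` helpers, and the type-to-tool mapping are assumptions made for illustration, not the service's actual detection rules. Only the tool names come from this README.

```python
import re

# Hypothetical detectors; the real service's rules are not documented here.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")
PHONE_RE = re.compile(r"^\+?[0-9][0-9\-\s().]{6,}$")
DOMAIN_RE = re.compile(r"^(?!-)[a-z0-9-]+(\.[a-z0-9-]+)+$", re.IGNORECASE)

# Tool names appear in this README; the mapping itself is an assumption.
TOOLS_BY_TYPE = {
    "email": ["email-intel", "email-pattern"],
    "phone": ["phone-validation"],
    "domain": ["whois", "tech-stack"],
}


def detect_type(value):
    """Return 'email', 'phone', 'domain', or None for an unrecognized value."""
    if not isinstance(value, str):
        return None
    value = value.strip()
    if EMAIL_RE.match(value):
        return "email"
    if PHONE_RE.match(value):
        return "phone"
    if DOMAIN_RE.match(value):
        return "domain"
    return None


def plan_tools(record):
    """Map each recognized field of a record to the tools that would run on it."""
    plan = {}
    for field, value in record.items():
        kind = detect_type(value)
        if kind is not None:
            plan[field] = TOOLS_BY_TYPE[kind]
    return plan
```

Under these assumptions, calling `plan_tools` on the three-field record from the request above selects five tool runs in total, which matches the count in the response summary. Field names play no role in this sketch; only the values are inspected.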
Process multiple records in parallel with explicit tools:
curl -X POST https://scaile--g-mcp-tools-fast-api.modal.run/bulk \
-H 'Content-Type: application/json' \
-d '{
"rows": [
{"name": "Alice Johnson", "email": "alice@example.com"},
{"name": "Bob Smith", "email": "bob@example.com"}
],
"tools": ["email-intel", "email-pattern"]
}'
Response:
{
"success": true,
"batch_id": "batch_1761503726531_7AzCBh1nHak",
"status": "completed",
"total_rows": 2,
"successful": 2,
"failed": 0,
"processing_time_seconds": 1.24,
"results": [ /* enriched rows */ ]
}
Process multiple records with automatic tool detection:
curl -X POST https://scaile--g-mcp-tools-fast-api.modal.run/bulk/auto \
-H 'Content-Type: application/json' \
-d '{
"rows": [
{"name": "Alice", "email": "alice@example.com", "website": "example.com"},
{"name": "Bob", "phone": "+14155551234"}
]
}'
Smart Features:
- Automatically detects emails, phones, domains, companies, GitHub usernames
- Applies appropriate tools (email-intel, email-pattern, whois, tech-stack, etc.)
- Processes rows in parallel using asyncio
- Handles up to 10,000 rows per batch
- Returns detailed success/error stats
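The parallel processing and success/error stats listed above could look roughly like the sketch below. It assumes a hypothetical `enrich_row` coroutine standing in for the real per-row enrichment, and an arbitrary concurrency cap of 50. Only the 10,000-row limit and the summary field names are taken from this README.

```python
import asyncio

MAX_ROWS_PER_BATCH = 10_000  # limit stated in this README


async def enrich_row(row, tools):
    """Hypothetical stand-in for the real enrichment of one row."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    if "email" not in row:
        raise ValueError("row has no email field")
    return {**row, "tools_applied": list(tools)}


async def process_batch(rows, tools, concurrency=50):
    """Enrich rows concurrently and summarize successes and failures."""
    if len(rows) > MAX_ROWS_PER_BATCH:
        raise ValueError(f"batch exceeds {MAX_ROWS_PER_BATCH} rows")

    semaphore = asyncio.Semaphore(concurrency)

    async def guarded(row):
        # The semaphore bounds how many rows are in flight at once.
        async with semaphore:
            return await enrich_row(row, tools)

    # return_exceptions=True keeps one failing row from aborting the batch.
    outcomes = await asyncio.gather(
        *(guarded(row) for row in rows), return_exceptions=True
    )

    results = []
    for row, outcome in zip(rows, outcomes):
        if isinstance(outcome, Exception):
            results.append({"success": False, "error": str(outcome), "input": row})
        else:
            results.append({"success": True, "data": outcome})

    failed = sum(1 for r in results if not r["success"])
    return {
        "total_rows": len(rows),
        "successful": len(rows) - failed,
        "failed": failed,
        "results": results,
    }
```

A caller would run it with `asyncio.run(process_batch(rows, ["email-intel"]))`. Collecting exceptions per row instead of raising is one way to produce per-batch stats like those shown in the `/bulk` response.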
Extract structured data from any website using natural language prompts.
Capabilities:
- AI-powered extraction with Gemini 2.5 Flash
- Multi-page scraping with auto-discovery
- Custom JSON schema support
- Link extraction
- 24-hour intelligent caching
Example:
curl -X POST https://scaile--g-mcp-tools-fast-api.modal.run/scrape \
-H 'Content-Type: application/json' \
-d '{
"url": "https://anthropic.com",
"prompt": "Extract the company mission and product names",
"max_pages": 1
}'
Response:
{
"success": true,
"data": {
"company_mission": "Build safe, beneficial AI...",
"product_names": ["Claude", "Claude Code", "Opus", "Sonnet", "Haiku"]
},
"metadata": {
"extraction_time": 10.31,
"pages_scraped": 1,
"model": "gemini-2.5-flash"
}
}
Check which platforms an email is registered on (holehe).
curl -X POST https://scaile--g-mcp-tools-fast-api.modal.run/email-intel \
-H 'Content-Type: application/json' \
-d '{"email": "user@example.com"}'
Find email addresses associated with a domain (theHarvester).
curl -X POST https://scaile--g-mcp-tools-fast-api.modal.run/email-finder \
-H 'Content-Type: application/json' \
-d '{"domain": "anthropic.com", "limit": 10}'
Get company registration and corporate information (OpenCorporates).
curl -X POST https://scaile--g-mcp-tools-fast-api.modal.run/company-data \
-H 'Content-Type: application/json' \
-d '{"companyName": "Anthropic", "domain": "anthropic.com"}'
Validate phone numbers with carrier, location, and line type info.
curl -X POST https://scaile--g-mcp-tools-fast-api.modal.run/phone-validation \
-H 'Content-Type: application/json' \
-d '{"phoneNumber": "+14155552671", "defaultCountry": "US"}'
Response:
{
"success": true,
"data": {
"valid": true,
"formatted": {
"e164": "+14155552671",
"international": "+1 415-555-2671",
"national": "(415) 555-2671"
},
"country": "San Francisco, CA",
"carrier": "Unknown",
"lineType": "FIXED_LINE_OR_MOBILE",
"lineTypeCode": 2
}
}
Detect technologies and frameworks used by a website.
curl -X POST https://scaile--g-mcp-tools-fast-api.modal.run/tech-stack \
-H 'Content-Type: application/json' \
-d '{"domain": "anthropic.com"}'
Response:
{
"success": true,
"data": {
"domain": "anthropic.com",
"technologies": [
{"name": "Next.js", "category": "Framework"},
{"name": "cloudflare", "category": "Web Server"}
],
"totalFound": 2
}
}
Generate common email patterns for a domain.
curl -X POST https://scaile--g-mcp-tools-fast-api.modal.run/email-pattern \
-H 'Content-Type: application/json' \
-d '{"domain": "anthropic.com", "firstName": "John", "lastName": "Doe"}'
Look up domain registration information.
curl -X POST https://scaile--g-mcp-tools-fast-api.modal.run/whois \
-H 'Content-Type: application/json' \
-d '{"domain": "anthropic.com"}'
Response:
{
"success": true,
"data": {
"domain": "anthropic.com",
"registrar": "MarkMonitor, Inc.",
"creationDate": "2001-10-02",
"expirationDate": "2033-10-02",
"nameServers": ["ISLA.NS.CLOUDFLARE.COM", "RANDY.NS.CLOUDFLARE.COM"]
}
}
Analyze GitHub user profiles and repositories.
curl -X POST https://scaile--g-mcp-tools-fast-api.modal.run/github-intel \
-H 'Content-Type: application/json' \
-d '{"username": "anthropics"}'
Response:
{
"success": true,
"data": {
"username": "anthropics",
"name": "Anthropic",
"location": "United States of America",
"publicRepos": 54,
"followers": 14565,
"languages": {
"Python": 6,
"TypeScript": 3,
"JavaScript": 1
}
}
}
The API supports optional API key authentication via the x-api-key header.
- Create Modal secret:
modal secret create modal-api-key MODAL_API_KEY=your-secret-key-here
- Redeploy the API:
./DEPLOY_G_MCP_TOOLS.sh
- Include API key in requests:
curl -X POST https://scaile--g-mcp-tools-fast-api.modal.run/scrape \
-H 'Content-Type: application/json' \
-H 'x-api-key: your-secret-key-here' \
-d '{"url": "https://example.com", "prompt": "Extract data"}'
Note: If MODAL_API_KEY is not set, the API is publicly accessible (useful for development).
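For clients written in Python, the optional header can be attached conditionally. The sketch below builds the same request as the curl example using only the standard library. `build_request` is a hypothetical helper name, and nothing is sent until the request is passed to `urllib.request.urlopen`.

```python
import json
import urllib.request

BASE_URL = "https://scaile--g-mcp-tools-fast-api.modal.run"


def build_request(path, payload, api_key=None):
    """Build a POST request, adding x-api-key only when a key is supplied."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Header names are case-insensitive on the wire; urllib stores this
        # one internally as "X-api-key".
        headers["x-api-key"] = api_key
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

A call such as `urllib.request.urlopen(build_request("/scrape", payload, api_key), timeout=120)` would then perform the request. Leaving `api_key` as `None` matches the unauthenticated development setup described in the note above.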
- Install Modal CLI:
pip install modal
- Authenticate:
modal setup
- Create Gemini API secret:
modal secret create gemini-secret GOOGLE_GENERATIVE_AI_API_KEY=your-gemini-key
chmod +x DEPLOY_G_MCP_TOOLS.sh
./DEPLOY_G_MCP_TOOLS.sh
Or manually:
modal deploy g-mcp-tools-complete.py
Monitor API status:
curl https://scaile--g-mcp-tools-fast-api.modal.run/health
Response:
{
"status": "healthy",
"service": "g-mcp-tools-fast",
"version": "1.0.0",
"tools": 9,
"timestamp": "2025-10-26T17:30:00.000000Z"
}
All endpoints follow a consistent response format:
{
"success": true,
"data": { ... },
"metadata": {
"source": "tool-name",
"timestamp": "2025-10-26T17:30:00.000000Z"
}
}
{
"success": false,
"error": "Error message",
"metadata": {
"source": "tool-name",
"timestamp": "2025-10-26T17:30:00.000000Z"
}
}
The API includes several cost-saving features:
- 24-Hour Cache - Repeated requests return cached results
- Timeouts - Prevent runaway processes (30s default, 120s max)
- Container Idle Timeout - Containers shut down after 120s of inactivity
- Efficient Resource Usage - Only runs when needed
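The caching and timeout behavior listed above can be illustrated with a small in-memory sketch. `TTLCache` and `clamp_timeout` are hypothetical names, and the real service may store entries and enforce limits quite differently. Only the 24-hour TTL and the 30s/120s timeout values come from this README.

```python
import time

DEFAULT_TIMEOUT_S = 30   # default timeout stated in this README
MAX_TIMEOUT_S = 120      # maximum timeout stated in this README
CACHE_TTL_S = 24 * 60 * 60


class TTLCache:
    """Minimal in-memory cache whose entries expire after a fixed TTL."""

    def __init__(self, ttl_seconds=CACHE_TTL_S, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without waiting
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._store[key]  # drop the stale entry on access
            return None
        return value

    def set(self, key, value):
        self._store[key] = (self.clock() + self.ttl, value)


def clamp_timeout(requested=None):
    """Apply the default when no timeout is requested and cap it at the maximum."""
    if requested is None:
        return DEFAULT_TIMEOUT_S
    return max(1, min(requested, MAX_TIMEOUT_S))
```

Keying the cache on the tool name plus the normalized request payload would let identical requests within 24 hours skip tool execution entirely.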
Estimated costs (Modal pricing):
- Web scraping: ~$0.001 per request
- Other tools: ~$0.0001 per request
- Cache hits: $0 (served from memory)
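As a rough worked example of the estimates above, the helper below combines the per-request figures with an assumed cache hit rate. `estimate_cost` is a hypothetical function, the rates are the approximate values quoted in this README, and actual Modal billing depends on compute time rather than request counts.

```python
SCRAPE_COST_USD = 0.001   # approximate per-request figure from this README
OTHER_COST_USD = 0.0001   # approximate per-request figure from this README


def estimate_cost(scrape_requests, other_requests, cache_hit_rate=0.0):
    """Estimate spend in USD, treating cache hits as free."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    billable_fraction = 1.0 - cache_hit_rate
    uncached_total = (
        scrape_requests * SCRAPE_COST_USD + other_requests * OTHER_COST_USD
    )
    return billable_fraction * uncached_total
```

For instance, 10,000 scrapes plus 100,000 other tool calls with a 30% cache hit rate comes to roughly $14 under these figures: 0.7 × ($10 + $10).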
# Test all 9 endpoints
./test-all-endpoints.sh
# Email pattern
curl -X POST https://scaile--g-mcp-tools-fast-api.modal.run/email-pattern \
-H 'Content-Type: application/json' \
-d '{"domain": "anthropic.com"}'
# Phone validation
curl -X POST https://scaile--g-mcp-tools-fast-api.modal.run/phone-validation \
-H 'Content-Type: application/json' \
-d '{"phoneNumber": "+14155552671"}'
# GitHub intel
curl -X POST https://scaile--g-mcp-tools-fast-api.modal.run/github-intel \
-H 'Content-Type: application/json' \
-d '{"username": "anthropics"}'
- Health Check Endpoint - /health for monitoring
- API Authentication - Optional x-api-key header
- OpenAPI Documentation - Swagger UI + ReDoc
- Error Handling - Comprehensive error responses
- Input Validation - Pydantic models
- Rate Limiting - Handled by Modal platform
- Monitoring - Modal dashboard + logs
- Auto-Scaling - Serverless architecture
- Cost Optimization - Caching + timeouts
- Type Safety - Pydantic models with Python type hints
- B2B SaaS API
- Data Enrichment Service
- Lead Intelligence Platform
- Market Research Tool
modal app logs g-mcp-tools-fast --follow
modal app list | grep g-mcp-tools
modal secret list
Client Request
    ↓
FastAPI (Modal ASGI)
    ↓
Authentication Check (optional)
    ↓
Input Validation (Pydantic)
    ↓
Cache Check (24h TTL)
    ↓ (cache miss)
Tool Execution
    ├─ Web Scraper (crawl4ai + Gemini)
    ├─ Email Intel (holehe)
    ├─ Email Finder (theHarvester)
    ├─ Company Data (OpenCorporates API)
    ├─ Phone Validation (libphonenumber)
    ├─ Tech Stack (custom detection)
    ├─ Email Pattern (pattern generation)
    ├─ WHOIS (python-whois)
    └─ GitHub Intel (GitHub API)
    ↓
Cache Result
    ↓
JSON Response
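The request flow in the diagram can be traced in a few lines of plain Python. This is a simplified sketch under several assumptions: `handle` and `envelope` are hypothetical names, a dict stands in for the 24h cache (expiry omitted), a basic type check stands in for Pydantic validation, and the error messages are invented. The envelope fields follow the response format shown earlier in this README.

```python
import json
from datetime import datetime, timezone


def envelope(source, data=None, error=None):
    """Build the success/error envelope used by all endpoints."""
    body = {"success": error is None}
    if error is None:
        body["data"] = data
    else:
        body["error"] = error
    body["metadata"] = {
        "source": source,
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%fZ"),
    }
    return body


def handle(tool, payload, tools, cache, expected_key=None, provided_key=None):
    """Walk one request through auth, validation, cache, and tool execution."""
    # 1. Authentication check (skipped when no key is configured).
    if expected_key is not None and provided_key != expected_key:
        return envelope(tool, error="Invalid or missing API key")
    # 2. Input validation (stand-in for the Pydantic models).
    if tool not in tools:
        return envelope(tool, error=f"Unknown tool: {tool}")
    if not isinstance(payload, dict) or not payload:
        return envelope(tool, error="Request body must be a non-empty object")
    # 3. Cache check, keyed on the tool and the normalized payload.
    cache_key = (tool, json.dumps(payload, sort_keys=True))
    if cache_key in cache:
        return envelope(tool, data=cache[cache_key])
    # 4. Tool execution on a cache miss; failures become error envelopes.
    try:
        result = tools[tool](payload)
    except Exception as exc:
        return envelope(tool, error=str(exc))
    # 5. Cache the result, then respond.
    cache[cache_key] = result
    return envelope(tool, data=result)
```

Here `tools` is a dict mapping a tool name such as `"whois"` to a callable. Serializing the payload with `sort_keys=True` makes the cache key independent of field order.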
See parent repository for license information.
- Documentation: Swagger UI
- Issues: Report via GitHub Issues
- Modal Support: modal.com/docs
- Enrich lead data with company info
- Find contact emails and phone numbers
- Validate contact information
- Scrape competitor websites
- Analyze tech stacks
- Track company changes via WHOIS
- Analyze GitHub profiles
- Detect technologies used
- Research developer ecosystems
- Validate phone numbers
- Verify email patterns
- Check domain registrations
Built with: Modal.com | FastAPI | Gemini 2.5 Flash | crawl4ai