An automated system that analyzes AI case studies to identify and document enterprise-level AI implementations using the Claude 3.5 Sonnet API. You can change the use case by updating the instructions.md file and the prompts.
It starts by reading URLs from a CSV file and uses web scraping (either through WebLoader or Firecrawl) to extract the content from each case study.
The extracted content is then sent to Claude 3.5 Sonnet, which analyzes whether the case study represents a genuine enterprise AI implementation based on specific criteria like company maturity, implementation scale, and measurable business outcomes.
For each URL, the system first saves the raw content and then performs this initial qualification analysis.
If Claude determines that a case study qualifies as an enterprise AI implementation, the system proceeds to generate a detailed analysis.
It creates three types of reports:
- an individual case study report with sections such as Executive Summary, AI Strategy Analysis, and Business Impact Assessment
- a cross-case analysis that identifies patterns and trends across multiple case studies
- an executive dashboard summarizing key metrics and insights
All of these reports are saved in structured formats (markdown for individual reports, JSON for cross-case analysis and dashboard) in their respective directories.
If a case study doesn't qualify as an enterprise AI implementation, the system logs the reason and moves on to the next URL.
The entire process is asynchronous and provides detailed terminal feedback about its progress and decisions.
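
The asynchronous flow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `fetch_content`, `qualifies`, `process_url`, and `run` are hypothetical names, and the scraping and Claude calls are replaced with placeholders so the example runs on its own.

```python
import asyncio

# Hypothetical stand-ins: the real project would scrape the page and call Claude here.
async def fetch_content(url: str) -> str:
    await asyncio.sleep(0)  # placeholder for the HTTP request
    return f"content of {url}"

async def qualifies(content: str) -> bool:
    await asyncio.sleep(0)  # placeholder for the Claude qualification call
    return "enterprise" in content

async def process_url(url: str, limit: asyncio.Semaphore) -> dict:
    # The semaphore caps how many URLs are processed at the same time.
    async with limit:
        content = await fetch_content(url)
        if not await qualifies(content):
            print(f"skipped {url}: not an enterprise AI implementation")
            return {"url": url, "qualified": False}
        print(f"qualified {url}: generating detailed analysis")
        return {"url": url, "qualified": True}

async def run(urls: list[str], max_concurrent: int = 3) -> list[dict]:
    limit = asyncio.Semaphore(max_concurrent)
    # gather returns results in the same order as the input URLs.
    return await asyncio.gather(*(process_url(u, limit) for u in urls))

if __name__ == "__main__":
    demo_urls = ["https://example.com/enterprise-case", "https://example.com/blog"]
    print(asyncio.run(run(demo_urls)))
```

Bounding concurrency with a semaphore is one common way to stay within API rate limits; whether the project does this is an assumption.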
- Extracts content from AI case study URLs
- Analyzes them to identify enterprise AI implementations
- Generates detailed reports and insights
- Creates cross-case analysis and executive dashboards

Content extraction:
- Web scraping with BeautifulSoup
- Structured data extraction
- Automatic content cleaning and organization
- Support for various page layouts

AI analysis:
- Enterprise AI qualification check
- Confidence scoring
- Detailed multi-section analysis
- Business impact assessment

Report generation:
- Individual Case Study Reports
  - Executive Summary
  - AI Strategy Analysis
  - Technical Implementation Details
  - Business Impact Assessment
  - Key Success Factors
  - Lessons Learned
- Cross-Case Analysis
  - Common patterns
  - Success factors
  - Implementation challenges
  - Technology trends
  - ROI patterns
- Executive Dashboard
  - Company profiles
  - Technology stacks
  - Success metrics
  - Implementation scales
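
To illustrate what a cross-case analysis might compute, here is a small sketch that counts how often technologies and challenges recur across cases. The function name `summarize_cases` and the record fields (`technologies`, `challenges`) are assumptions made for this example; the project's real JSON schema may differ.

```python
import json
from collections import Counter

def summarize_cases(cases: list[dict]) -> dict:
    """Count how often each technology and challenge appears across cases."""
    tech = Counter(t for c in cases for t in c.get("technologies", []))
    challenges = Counter(ch for c in cases for ch in c.get("challenges", []))
    return {
        "case_count": len(cases),
        "technology_trends": dict(tech.most_common()),
        "implementation_challenges": dict(challenges.most_common()),
    }

# Illustrative records only; not data from real case studies.
cases = [
    {"company": "A", "technologies": ["NLP", "computer vision"], "challenges": ["data quality"]},
    {"company": "B", "technologies": ["NLP"], "challenges": ["data quality", "change management"]},
]
print(json.dumps(summarize_cases(cases), indent=2))
```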

Installation:

- Clone the repository:

      git clone https://github.com/yourusername/ai-case-study-analyzer.git
      cd ai-case-study-analyzer

- Create a virtual environment:

      python -m venv venv
      source venv/bin/activate  # On Windows: venv\Scripts\activate

- Install dependencies:

      pip install -r requirements.txt

- Create a .env file with your API keys:

      ANTHROPIC_API_KEY=your_claude_api_key
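
Once the key is in the environment, a fail-fast check like the one below can make a missing key obvious at startup. This is a sketch: `get_api_key` is a hypothetical helper, and it assumes the .env file has already been loaded into the environment (for example with python-dotenv's `load_dotenv()`), which is not called here so the example stays standard-library only.

```python
import os

def get_api_key(env=None) -> str:
    """Return ANTHROPIC_API_KEY, or raise a clear error if it is unset."""
    # Accepting a mapping makes the function easy to test without touching os.environ.
    env = os.environ if env is None else env
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY is not set; add it to your .env file")
    return key
```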

Project structure:

    project/
    ├── src/
    │   ├── scrapers/
    │   │   └── web_loader.py
    │   ├── processors/
    │   │   └── claude_processor.py
    │   ├── config.py
    │   ├── main.py
    │   └── test_setup.py
    ├── input/
    │   └── urls.csv
    ├── raw_content/
    │   └── case_[id]/
    │       ├── raw_content.txt
    │       ├── structured_content.json
    │       └── metadata.json
    ├── sections/
    │   └── case_[id]/
    │       ├── company_context.md
    │       ├── business_challenge.md
    │       └── [...].md
    ├── reports/
    │   ├── individual/
    │   │   └── case_[id].md
    │   ├── cross_case_analysis/
    │   │   └── cross_case_analysis.json
    │   └── executive_dashboard/
    │       └── executive_dashboard.json
    └── logs/
        ├── processing_log.json
        └── validation_log.json
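
The per-case layout above can be derived from a case identifier with a small helper. This is a sketch only: `case_paths` and its dictionary keys are hypothetical, and the format of the identifier itself is not specified by this README.

```python
from pathlib import Path

def case_paths(case_id: str, root: Path = Path(".")) -> dict[str, Path]:
    """Map a case id to the files and directories used for that case."""
    name = f"case_{case_id}"
    return {
        "raw_content": root / "raw_content" / name / "raw_content.txt",
        "structured_content": root / "raw_content" / name / "structured_content.json",
        "metadata": root / "raw_content" / name / "metadata.json",
        "sections_dir": root / "sections" / name,
        "report": root / "reports" / "individual" / f"{name}.md",
    }

paths = case_paths("7")
print(paths["report"].as_posix())
```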

Usage:

- Prepare input:
  - Place your case study URLs in `input/urls.csv`
  - Format: single column with header `url`
- Run tests:

      python -m src.test_setup

- Run analysis:

      python -m src.main
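
As an illustration of the expected input format, the sketch below reads a single `url` column and skips blank rows. It uses the standard-library `csv` module rather than pandas to stay dependency-free; `read_urls` is a hypothetical helper, not necessarily how the project loads its input.

```python
import csv
import io

def read_urls(csv_file) -> list[str]:
    """Read the 'url' column from an open CSV file, skipping blank values."""
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or "url" not in reader.fieldnames:
        raise ValueError("input CSV must have a 'url' header")
    return [row["url"].strip() for row in reader if row["url"] and row["url"].strip()]

# In-memory sample standing in for input/urls.csv.
sample = io.StringIO("url\nhttps://example.com/case-1\n\nhttps://example.com/case-2\n")
print(read_urls(sample))
```

With a real file, the call would look like `with open("input/urls.csv", newline="") as f: urls = read_urls(f)`.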

How it works:

- Content Extraction
  - Web scraping of case study URLs
  - Content cleaning and structuring
  - Metadata extraction
- AI Analysis
  - Enterprise AI qualification check
  - Detailed section analysis
  - Report generation
- Report Generation
  - Individual case study reports
  - Cross-case analysis
  - Executive dashboard updates

Qualification criteria:
- Established company (not startup)
- Business AI implementation
- Enterprise-scale deployment
- Clear business outcomes

Detailed analysis covers:
- AI/ML technology details
- Enterprise integration aspects
- Business process transformation
- ROI metrics
- Change management approach
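
One way to represent the outcome of the qualification check (established company, business AI implementation, enterprise-scale deployment, clear business outcomes) is sketched below. The JSON shape, the criterion keys, and the names `Qualification` and `parse_qualification` are all assumptions for illustration; the actual prompt and response format used with Claude are not documented here.

```python
import json
from dataclasses import dataclass

# Hypothetical keys, one per qualification criterion.
CRITERIA = (
    "established_company",
    "business_ai_implementation",
    "enterprise_scale",
    "clear_business_outcomes",
)

@dataclass
class Qualification:
    qualifies: bool
    confidence: float
    unmet: list[str]

def parse_qualification(raw: str) -> Qualification:
    """Parse an assumed JSON verdict; a case qualifies only if every criterion is met."""
    data = json.loads(raw)
    unmet = [c for c in CRITERIA if not data.get(c, False)]
    confidence = float(data.get("confidence", 0.0))
    return Qualification(qualifies=not unmet, confidence=confidence, unmet=unmet)
```

Keeping the list of unmet criteria makes it straightforward to log why a case study was rejected.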
The system includes robust error handling for:
- Web scraping failures
- API timeouts
- Content parsing errors
- File system operations
- JSON parsing issues
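
For transient failures such as API timeouts, a retry with exponential backoff is a common pattern. The sketch below is an assumption about how such handling could look, not a description of the project's code; `with_retries` and its parameters are hypothetical.

```python
import asyncio

async def with_retries(make_call, attempts: int = 3, base_delay: float = 0.5):
    """Await make_call(), retrying on timeouts and connection errors with backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return await make_call()
        except (asyncio.TimeoutError, ConnectionError) as exc:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            delay = base_delay * 2 ** (attempt - 1)  # 0.5s, 1s, 2s, ...
            print(f"attempt {attempt} failed ({exc!r}); retrying in {delay:.1f}s")
            await asyncio.sleep(delay)
```

Here `make_call` is any zero-argument function returning an awaitable, for example a lambda wrapping the scraping or API request.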
Detailed logging is provided in:
- processing_log.json: Processing status and errors
- validation_log.json: Content validation results
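
A JSON log of this kind could be maintained as a list of entries that is read, extended, and rewritten on each update. The entry fields (`timestamp`, `url`, `status`, `detail`) and the `append_log` helper are assumptions for illustration; the real log schema may differ.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

def append_log(log_path: Path, url: str, status: str, detail: str = "") -> list[dict]:
    """Append one entry to a JSON log file and return all entries."""
    entries = json.loads(log_path.read_text(encoding="utf-8")) if log_path.exists() else []
    entries.append({
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "url": url,
        "status": status,
        "detail": detail,
    })
    # Create the logs directory on first use.
    log_path.parent.mkdir(parents=True, exist_ok=True)
    log_path.write_text(json.dumps(entries, indent=2), encoding="utf-8")
    return entries
```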
Key settings in config.py:
- API configurations
- Model parameters
- File paths
- Processing options
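
The contents of config.py are not shown here, so the following is only a guess at how such settings could be grouped. Every field name and default value is a placeholder, not the project's actual configuration.

```python
from dataclasses import dataclass
from pathlib import Path

@dataclass(frozen=True)
class Settings:
    # All values below are placeholders, not the project's actual defaults.
    model: str = "your-claude-model-id"    # API configuration
    request_timeout_s: float = 60.0
    max_tokens: int = 4096                 # model parameters
    temperature: float = 0.0
    input_csv: Path = Path("input/urls.csv")  # file paths
    reports_dir: Path = Path("reports")
    logs_dir: Path = Path("logs")
    max_concurrent_requests: int = 3       # processing options

settings = Settings()
print(settings.input_csv.as_posix())
```

A frozen dataclass keeps settings read-only at runtime; overrides can be passed explicitly, for example `Settings(max_tokens=1024)`.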

Dependencies:
- anthropic: Claude API client
- beautifulsoup4: Web scraping
- aiohttp: Async HTTP requests
- pandas: Data processing
- python-dotenv: Environment variables

Contributing:
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request

This project is licensed under the MIT License - see the LICENSE file for details.
- Claude 3.5 Sonnet by Anthropic for AI analysis
- BeautifulSoup4 for web scraping
- The open-source community for various tools and libraries
For support, please open an issue in the GitHub repository or contact the maintainers.