A Python-based system that processes and analyzes compliance-related keywords using AI, organizing them by country and providing detailed analysis.
This system takes a list of flagged keywords and their compliance information, processes them through an AI analysis pipeline, and generates structured outputs for compliance checking across different international markets.
- AI-Powered Analysis: Uses Brain API for intelligent keyword analysis
- Batch Processing: Efficiently processes keywords in batches of 10
- Parallel Processing: Uses ThreadPoolExecutor for improved performance
- Error Handling: Includes retry logic and fallback mechanisms
- International Compliance: Organizes keywords by country/region
- Structured Outputs: Generates both JSON and CSV formats
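
The batching and parallel-processing features can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: `make_batches`, `analyze_batch`, `process_all`, and the `max_workers` default are hypothetical names and values, and `analyze_batch` is a placeholder standing in for the real Brain API call. Only the batch size of 10 and the use of `ThreadPoolExecutor` come from the list above.

```python
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 10  # batch size stated in the feature list


def make_batches(items, size=BATCH_SIZE):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def analyze_batch(batch):
    """Hypothetical stand-in for the Brain API call: one record per keyword."""
    return [{"keyword": kw, "analysis": "placeholder"} for kw in batch]


def process_all(keywords, worker=analyze_batch, max_workers=4):
    """Run `worker` on every batch in a thread pool and flatten the results."""
    batches = list(make_batches(keywords))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in the order the batches were submitted,
        # so the flattened output keeps the input keyword order.
        per_batch = list(pool.map(worker, batches))
    return [record for result in per_batch for record in result]
```

Threads suit this kind of workload because each batch spends most of its time waiting on a network response rather than using the CPU.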
- Python 3.x
- Brain API client (brain_platform_client)
- Required Python packages (install via pip):
  pip install pandas python-dotenv pydantic brain-platform-client
- Clone the repository:
  git clone [repository-url]
- Install dependencies:
  pip install -r requirements.txt
- Set up environment variables: create a .env file with:
  BRAIN_API_KEY=your_api_key_here
- Prepare your input file:
  - Create a CSV file named keywords.csv
  - Required columns:
    - Keyword
    - Reason_to_Flag
    - Country_Code
    - Valid_Country_Codes
    - Country
    - Compliance_Region
- Run the processor:
  python process_keywords.py
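
Before running the processor, it can help to confirm that keywords.csv has the required columns. The check below is a small sketch under stated assumptions: `missing_columns` is a hypothetical helper, not part of process_keywords.py, and it uses the standard-library `csv` module rather than pandas so that it has no dependencies.

```python
import csv
import io

# Column names taken from the "Required columns" list above.
REQUIRED_COLUMNS = [
    "Keyword",
    "Reason_to_Flag",
    "Country_Code",
    "Valid_Country_Codes",
    "Country",
    "Compliance_Region",
]


def missing_columns(csv_file):
    """Return the required columns that are absent from the CSV header row."""
    header = next(csv.reader(csv_file), [])
    present = {name.strip() for name in header}
    return [col for col in REQUIRED_COLUMNS if col not in present]


# Demonstration on an in-memory file; for a real file, pass
# open("keywords.csv", newline="", encoding="utf-8") instead.
sample = io.StringIO("Keyword,Country\nexample,United States\n")
print(missing_columns(sample))
```

An empty list means the header is complete; anything else names the columns that still need to be added.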
{
"US": ["keyword1", "keyword2"],
"UK": ["keyword3", "keyword4"],
"ALL": ["global_keyword1", "global_keyword2"]
}
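
A structure like the one above could be built by grouping keywords under each country code they apply to. The following is a sketch only: `group_by_country` is a hypothetical helper, and the input shape (pairs of a keyword and a list of country codes, with "ALL" used for global keywords) is an assumption made for the example rather than the script's actual internal format.

```python
import json
from collections import defaultdict


def group_by_country(rows):
    """Map each country code to its keywords, in first-seen order, without duplicates.

    `rows` is an iterable of (keyword, [country_code, ...]) pairs.
    """
    grouped = defaultdict(list)
    for keyword, codes in rows:
        for code in codes:
            if keyword not in grouped[code]:
                grouped[code].append(keyword)
    return dict(grouped)


rows = [
    ("keyword1", ["US"]),
    ("keyword3", ["UK"]),
    ("keyword2", ["US"]),
    ("global_keyword1", ["ALL"]),
]
print(json.dumps(group_by_country(rows), indent=2))
```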
A detailed spreadsheet containing:
- Keyword information
- Reasons for flagging
- AI analysis
- Country-specific applicability
- process_keywords.py: Main processing script
  - KeywordProcessor: Main class handling the processing
  - KeywordAnalysis: Pydantic model for structured responses
  - Batch processing and analysis functions
- Input Processing
  - Reads keywords.csv
  - Validates input data
  - Handles missing values and data cleanup
- AI Analysis
  - Batches keywords for efficient processing
  - Sends to Brain API for analysis
  - Handles structured responses and fallbacks
- Output Generation
  - Creates country-organized JSON
  - Generates detailed CSV with analysis
  - Handles special cases (ALL countries, region-specific)
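
The input cleanup stage might look something like the sketch below. The helper names (`is_missing`, `normalize_country_code`, `clean_rows`) are hypothetical, and the specific rules are assumptions for illustration: rows without a keyword are dropped, and a missing or NaN country code becomes `None` so that the caller can decide how to treat it. The README does not specify what the real script does in these cases.

```python
import math


def is_missing(value):
    """True for None, float NaN (as pandas produces for empty cells), or blank text."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return str(value).strip() == ""


def normalize_country_code(value):
    """Return an upper-cased, trimmed country code, or None if the value is missing."""
    if is_missing(value):
        return None
    return str(value).strip().upper()


def clean_rows(rows):
    """Drop rows with no keyword and normalize the remaining rows (assumed policy)."""
    cleaned = []
    for row in rows:
        if is_missing(row.get("Keyword")):
            continue  # invalid row: nothing to analyze
        cleaned.append({
            **row,
            "Keyword": str(row["Keyword"]).strip(),
            "Country_Code": normalize_country_code(row.get("Country_Code")),
        })
    return cleaned
```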
The system processes:
- More than 10,000 keywords
- Multiple country codes
- Global and region-specific patterns
- Compliance reasons and analysis
- Retry mechanism for API calls
- Fallback to JSON mode if structured output fails
- NaN value handling in country codes
- Invalid data filtering
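
The retry-then-fallback behavior can be illustrated with a small wrapper. This is a sketch under assumptions, not the project's implementation: `call_with_retry` is a hypothetical name, the attempt count and exponential backoff are illustrative choices the README does not specify, and `primary` and `fallback` stand in for the structured-output call and the JSON-mode call respectively.

```python
import time


def call_with_retry(primary, fallback=None, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call `primary` up to `attempts` times, then try `fallback` if all attempts fail.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... between attempts
    (an assumed backoff policy). Re-raises the last error if no fallback is given.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return primary()
        except Exception as exc:  # broad on purpose: the real client's errors are unknown
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    if fallback is not None:
        return fallback()
    raise last_error
```

Passing `sleep` as a parameter keeps the wrapper easy to test, since a test can record the delays instead of actually waiting.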
- Fork the repository
- Create your feature branch
- Commit your changes
- Push to the branch
- Create a new Pull Request