Getting Started
This guide will help you set up and run the GEX LLM Patterns validation framework.
- Python: 3.9 or higher
- OS: Linux, macOS, or Windows (WSL recommended for Windows)
- Memory: 4GB RAM minimum
- Storage: 2GB for code + cache
- OpenAI API Key: For GPT-4 LLM calls
  - Sign up at: https://platform.openai.com
  - Need credits for API usage (~$0.03/validation day with GPT-4)
- Options Data Source (Optional):
  - Currently uses yfinance (free, limited historical data)
  - For production: Consider HistoricalOptionData.com, OptionMetrics, etc.
git clone https://github.com/iAmGiG/gex-llm-patterns.git
cd gex-llm-patterns

# Using pip
pip install -r requirements.txt
# Or using conda
conda create -n gex-llm python=3.9
conda activate gex-llm
pip install -r requirements.txt

Key Dependencies:
- openai - LLM API client
- pandas - Data manipulation
- numpy - Numerical computing
- yfinance - Options data fetching (free tier)
- pyyaml - Validation report generation
# Set Python path (required for imports)
export PYTHONPATH=$(pwd):$PYTHONPATH
# Set OpenAI API key
export OPENAI_API_KEY="sk-your-key-here"
# Optional: Configure LLM model
export LLM_MODEL="gpt-4o-mini" # Default: gpt-4o-mini (cheap)
# export LLM_MODEL="gpt-4" # More accurate but expensive

Tip: Add these to your ~/.bashrc or ~/.zshrc for persistence.
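The environment settings above can also be checked from Python before a long run. The following is a minimal sketch, not part of the framework: `check_environment` and `resolve_model` are hypothetical helper names, and the fallback to `gpt-4o-mini` simply mirrors the default stated above rather than the framework's actual lookup logic.

```python
import os
import sys


def check_environment(env=None, version_info=None):
    """Return a list of problems; an empty list means the basics are in place."""
    env = os.environ if env is None else env
    version_info = sys.version_info if version_info is None else version_info
    problems = []
    # The guide lists Python 3.9 or higher as a requirement.
    if tuple(version_info[:2]) < (3, 9):
        problems.append("Python 3.9 or higher is required")
    if not env.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    if not env.get("PYTHONPATH"):
        problems.append("PYTHONPATH is not set (imports from src/ may fail)")
    return problems


def resolve_model(env=None):
    """Assumed behaviour: use LLM_MODEL if set, otherwise gpt-4o-mini."""
    env = os.environ if env is None else env
    return env.get("LLM_MODEL", "gpt-4o-mini")


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
    print("Model:", resolve_model())
```

Passing `env` and `version_info` explicitly keeps the helpers easy to exercise without touching the real environment.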
# Check imports work
python -c "from src.agents.market_mechanics_agent import MarketMechanicsAgent; print('✅ Imports OK')"
# Check API key configured
python -c "import os; print('✅ API key set' if os.getenv('OPENAI_API_KEY') else '❌ No API key')"

python scripts/validation/validate_pattern_taxonomy.py \
--pattern gamma_positioning \
--symbol SPY \
--start-date 2024-01-02 \
--end-date 2024-03-29 \
--confidence 60.0

Expected Output:
- Processing bar: Processing dates: 100%|████████████| 53/53
- Validation report: reports/validation/pattern_taxonomy/gamma_positioning_SPY_2024Q1.yaml
- Summary: Detection rate, predictive accuracy, net alpha

Time: ~5-10 minutes for 53 days (with GPT-4o-mini)
Cost: ~$1-2 in API calls
python scripts/validation/validate_all_patterns.py \
--patterns gamma_positioning stock_pinning 0dte_hedging \
--start-date 2024-01-02 \
--end-date 2024-03-29 \
--skip-completed

Expected Output:
- 3 YAML reports (one per pattern)
- Summary table comparing detection rates

Time: ~15-30 minutes
Cost: ~$3-6 in API calls
# reports/validation/pattern_taxonomy/gamma_positioning_SPY_2024Q1.yaml
pattern_name: gamma_positioning
symbol: SPY
date_range: 2024-01-02 to 2024-03-29
total_days: 53
# Aggregate Results
detection_rate_pct: 100.0 # LLM detected constraint on 100% of days
predictive_accuracy_pct: 96.2 # 96.2% of predictions materialized
avg_return_pct: 0.26 # Average daily return
net_alpha_pct: 0.21 # Return above risk-free rate
sample_size: 53 # Number of test days
# Per-Day Results
results:
  - test_date: 2024-01-02
    obfuscated_date: "Day T+0"
    detected: true # LLM detected constraint
    confidence: 85.0 # LLM confidence (0-100)
    predicted_direction: "UP" # LLM prediction
    forward_return_t1: 0.45 # Actual T+1 return (%)
    prediction_correct: true # Did prediction materialize?
    net_gex_usd: -8950000000.0 # -$8.95B (negative gamma)
    spot_price: 474.60
  - test_date: 2024-01-03
    # ... (52 more days)

Detection Rate:
- 100%: LLM detected constraint on every test day
- 60-80%: Strong detection (pattern is mechanical)
- <60%: Weak detection (pattern may be narrative)
Predictive Accuracy:
- 96%: LLM predictions materialized 96% of time
- High accuracy = LLM understands causal mechanism
- Low accuracy = pattern detected but doesn't drive price
Net Alpha:
- +0.21%: Strategy outperformed risk-free rate by 21 bps/day
- Note: Q1 2024 was profitable, but Q3/Q4 declined to near-zero
- Detection remains stable despite alpha decline (key finding!)
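To make the aggregate metrics concrete, here is a minimal sketch of how detection rate and predictive accuracy could be derived from per-day entries shaped like the `results` list in the sample report. The `summarize` function is hypothetical, and the choice to measure accuracy over detected days only is an assumption; the framework's own aggregation may differ.

```python
def summarize(results):
    """Aggregate per-day result dicts into percentage metrics.

    Each dict is expected to carry the boolean fields `detected` and
    `prediction_correct`, as in the sample report's `results` entries.
    """
    total = len(results)
    if total == 0:
        return {"sample_size": 0, "detection_rate_pct": 0.0, "predictive_accuracy_pct": 0.0}
    detected = [r for r in results if r.get("detected")]
    correct = [r for r in detected if r.get("prediction_correct")]
    detection_rate = 100.0 * len(detected) / total
    # Assumption: accuracy is measured only over days where a constraint was detected.
    accuracy = 100.0 * len(correct) / len(detected) if detected else 0.0
    return {
        "sample_size": total,
        "detection_rate_pct": round(detection_rate, 1),
        "predictive_accuracy_pct": round(accuracy, 1),
    }


if __name__ == "__main__":
    # Illustrative data only: 53 detected days, 51 of them predicted correctly.
    days = [{"detected": True, "prediction_correct": i < 51} for i in range(53)]
    print(summarize(days))
```

Under that assumption, 51 correct predictions out of 53 detected days rounds to the 96.2% shown in the sample report, although the report itself does not state the underlying counts.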
Run full 2024 validation (Q1, Q3, Q4) for all 3 patterns:
# Q1 2024 (Jan-Mar)
python scripts/validation/validate_all_patterns.py \
--patterns gamma_positioning stock_pinning 0dte_hedging \
--start-date 2024-01-02 \
--end-date 2024-03-29
# Q3 2024 (Jul-Sep)
python scripts/validation/validate_all_patterns.py \
--patterns gamma_positioning stock_pinning 0dte_hedging \
--start-date 2024-07-01 \
--end-date 2024-09-30
# Q4 2024 (Oct-Dec)
python scripts/validation/validate_all_patterns.py \
--patterns gamma_positioning stock_pinning 0dte_hedging \
--start-date 2024-10-01 \
--end-date 2024-12-31

Total: 9 validation reports (3 patterns x 3 quarters), matching Paper #1 results
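The three quarterly commands differ only in their date range, so they can be generated from a small table. This is a sketch of one way to do that: the script path, flags, patterns, and dates are copied from the commands above, while `QUARTERS` and `build_commands` are hypothetical names introduced for illustration.

```python
import shlex

# Date ranges copied from the three commands in this guide.
QUARTERS = {
    "Q1": ("2024-01-02", "2024-03-29"),
    "Q3": ("2024-07-01", "2024-09-30"),
    "Q4": ("2024-10-01", "2024-12-31"),
}
PATTERNS = ["gamma_positioning", "stock_pinning", "0dte_hedging"]


def build_commands(quarters=QUARTERS, patterns=PATTERNS):
    """Return one argv list per quarter for validate_all_patterns.py."""
    commands = []
    for start, end in quarters.values():
        commands.append([
            "python", "scripts/validation/validate_all_patterns.py",
            "--patterns", *patterns,
            "--start-date", start,
            "--end-date", end,
        ])
    return commands


if __name__ == "__main__":
    for argv in build_commands():
        # Print for review; swap in subprocess.run(argv, check=True) to execute.
        print(shlex.join(argv))
```

Printing the commands first makes it easy to review them before spending API credits.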
Define a new pattern in src/validation/pattern_taxonomy.py:
PATTERNS = {
    # ... existing patterns ...
    "my_new_pattern": {
        "name": "My New Pattern",
        "status": "MECHANICAL",
        "description": "Clear description of constraint",
        "who": "Market participants",
        "whom": "Who is forced?",
        "what": "What are they forced to do?",
        "constraint_mechanism": "Why can't they avoid it?",
        "academic_basis": "Published research citation"
    }
}

Run validation:
python scripts/validation/validate_pattern_taxonomy.py \
--pattern my_new_pattern \
--symbol SPY \
--start-date 2024-01-02 \
--end-date 2024-03-29

# Validate gamma positioning on QQQ instead of SPY
python scripts/validation/validate_pattern_taxonomy.py \
--pattern gamma_positioning \
--symbol QQQ \
--start-date 2024-01-02 \
--end-date 2024-03-29

Note: Requires options data for that ticker (may need premium data source)
# Unbiased (default)
python scripts/validation/validate_pattern_taxonomy.py \
--pattern gamma_positioning \
--symbol SPY \
--start-date 2024-01-02 \
--end-date 2024-03-29
# Biased (assumes pattern exists)
python scripts/validation/validate_pattern_taxonomy.py \
--pattern gamma_positioning \
--symbol SPY \
--start-date 2024-01-02 \
--end-date 2024-03-29 \
--biased

Compare the detection rates of the two runs: the biased run should report 100%, while the unbiased run gives a more realistic figure.
Available Models:
- gpt-4o-mini: Fast, cheap (~$0.03/day), good accuracy
- gpt-4: Slower, expensive (~$0.15/day), highest accuracy
- gpt-4-turbo: Balanced performance
How to Switch:
# Via environment variable
export LLM_MODEL="gpt-4"
# Or edit config/config.json
{
  "llm": {
    "model": "gpt-4",
    "temperature": 0.0,
    "max_tokens": 2000
  }
}

Default: Obfuscation enabled (recommended for research)
Disable (for debugging only):
python scripts/validation/validate_pattern_taxonomy.py \
--pattern gamma_positioning \
--symbol SPY \
--start-date 2024-01-02 \
--end-date 2024-03-29 \
--no-obfuscate

Warning: Disabling obfuscation may allow the LLM to use temporal context, which invalidates the methodology.
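As an illustration of the idea, the sketch below replaces calendar dates with labels relative to the test date, in the style of the `"Day T+0"` value shown in the sample report. The framework's real obfuscation logic is not shown in this guide: `obfuscate_dates` is a hypothetical helper, and counting offsets by position in the sorted date list is an assumption.

```python
from datetime import date


def obfuscate_dates(test_date, context_dates):
    """Map each calendar date to a label relative to the test date.

    Offsets count positions in the sorted list of dates (e.g. trading days
    supplied by the caller), not calendar days.
    """
    ordered = sorted(set(context_dates) | {test_date})
    anchor = ordered.index(test_date)
    return {d: f"Day T{i - anchor:+d}" for i, d in enumerate(ordered)}


if __name__ == "__main__":
    labels = obfuscate_dates(
        date(2024, 1, 3),
        [date(2024, 1, 2), date(2024, 1, 4)],
    )
    for day, label in sorted(labels.items()):
        print(label)  # the calendar date itself is deliberately not shown
```

With labels like these, a prompt carries the ordering of events without the absolute dates that a model might recognize.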
Default: Options data cached in .cache/
Clear cache (force fresh data fetch):
rm -rf .cache/options_data_cache.db

Rebuild historical GEX database:
python scripts/data/rebuild_historical_gex.py \
--symbol SPY \
--start-date 2024-01-01 \
--end-date 2024-12-31

Cause: PYTHONPATH not set
Fix:
export PYTHONPATH=$(pwd):$PYTHONPATH

Cause: API key not in environment
Fix:
export OPENAI_API_KEY="sk-your-key-here"

Cause: yfinance doesn't have data for that date (weekends, holidays, or too old)
Fix:
- Use business days only (skip weekends)
- Check if date is a market holiday
- Consider premium data source for complete history
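A quick way to pre-filter dates is to drop weekends and any known market holidays before launching a run. The sketch below uses only the standard library. `business_days` is a hypothetical helper, and the holiday list is supplied by the caller (the three dates in the example are US market holidays in Q1 2024); it does not consult any exchange calendar.

```python
from datetime import date, timedelta


def business_days(start, end, holidays=()):
    """Return weekdays between start and end (inclusive), minus given holidays."""
    skip = set(holidays)
    days = []
    day = start
    while day <= end:
        # weekday() is 0 for Monday through 6 for Sunday.
        if day.weekday() < 5 and day not in skip:
            days.append(day)
        day += timedelta(days=1)
    return days


if __name__ == "__main__":
    # Example holiday list for Q1 2024; extend it for other periods.
    q1_holidays = [date(2024, 1, 15), date(2024, 2, 19), date(2024, 3, 29)]
    days = business_days(date(2024, 1, 2), date(2024, 3, 29), q1_holidays)
    print(len(days), "candidate trading days")
```

Dates that pass this filter can still be missing from yfinance if they are too old, so a failed fetch on a valid trading day points to the data source rather than the calendar.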
Symptoms: Validation takes >1 hour for 50 days
Causes & Fixes:
- Using GPT-4 → Switch to gpt-4o-mini (10x faster)
- Fresh data fetches → Enable caching (default)
- Serial processing → Use batch mode (experimental)
# Faster: Use gpt-4o-mini + ensure caching
export LLM_MODEL="gpt-4o-mini"
python scripts/validation/validate_pattern_taxonomy.py --pattern gamma_positioning ...

Symptoms: Validation costs $10+ for 50 days
Cause: Using expensive model (GPT-4)
Fix:
# Switch to gpt-4o-mini (10x cheaper, similar accuracy)
export LLM_MODEL="gpt-4o-mini"

Cost Comparison (50 days):
- GPT-4: ~$7.50
- GPT-4o-mini: ~$0.75
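The 50-day figures above can be scaled linearly to other run lengths for a rough budget. This sketch assumes cost grows in proportion to the number of validation days, which ignores caching and prompt-size variation. `estimate_cost` is a hypothetical helper and the reference numbers are the approximate ones from the comparison above.

```python
REFERENCE_DAYS = 50
# Approximate 50-day costs taken from the comparison above (USD).
REFERENCE_COST_USD = {"gpt-4": 7.50, "gpt-4o-mini": 0.75}


def estimate_cost(model, days):
    """Linearly scale the 50-day reference cost to `days` validation days."""
    per_day = REFERENCE_COST_USD[model] / REFERENCE_DAYS
    return round(per_day * days, 2)


if __name__ == "__main__":
    for model in REFERENCE_COST_USD:
        print(f"{model}: ~${estimate_cost(model, 100):.2f} for 100 days")
```

Treat the result as an order-of-magnitude guide only, since actual API pricing and token usage vary.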
- Methodology - Understand obfuscation testing framework
- Pattern Taxonomy - See all validated patterns
- Key Results - Detailed Paper #1 findings
- API Reference - Code documentation
- Reproduce Paper #1: Validate all 3 patterns across full 2024
- Test new patterns: Define and validate your own dealer constraints
- Compare assets: Run on QQQ, IWM, or individual stocks
- Open GitHub Issues for bugs/questions
- Review Research Roadmap for future directions
- Contact author for collaboration
Issues: https://github.com/iAmGiG/gex-llm-patterns/issues
Documentation: https://github.com/iAmGiG/gex-llm-patterns/tree/development/docs
Contact: See Publications page
Last Updated: October 25, 2025