Commit f85a59d

Fix HistoricalGEXDatabaseBuilder: Store real prices, not obfuscated fallback
Problem: Database was storing 450.0 obfuscated fallback when underlyingPrice column missing, causing 1000-4500x magnitude errors in GEX values.

Root Cause: get_stock_price() returned hardcoded 450.0 instead of fetching real market data. This violated separation: obfuscation is for LLM layer ONLY, storage must use real prices.

Fix: Enhanced get_stock_price() with 3-tier fallback:
1. Check options_data for underlyingPrice column
2. Estimate from put-call parity (estimate_spot_from_options)
3. Fetch from Polygon API
4. ERROR if all fail (never store fake data)

Impact:
- Q1 2024 rebuild: 53/53 dates successful with real prices
- GEX values now in correct range ($500M-$9B vs previous $500B-$45T)
- 100% validation match between fresh calculations and database

Testing: Database rebuild validated on Q1 2024 (53 trading days)
See: reports/validation/database_rebuild_Q1_2024.yaml
1 parent ed15ec3 commit f85a59d

1 file changed: +38 -8 lines changed

src/data_sources/historical_gex_builder.py

Lines changed: 38 additions & 8 deletions
@@ -506,20 +506,50 @@ def flush_batch(self):
 
     def get_stock_price(self, symbol, date, options_data: pd.DataFrame = None):
         """
-        Get stock closing price for the date.
+        Get REAL stock closing price for the date.
 
-        CRITICAL: Must use SAME logic as validation pipeline to ensure GEX values match.
-        Validation uses: options_data['underlyingPrice'].iloc[0] if 'underlyingPrice' in options_data.columns else 450.0
+        CRITICAL: Database must store REAL market prices, NEVER obfuscated values.
+        Obfuscation is ONLY for LLM analysis layer (data_obfuscation.py), not storage.
+
+        Methods (in priority order):
+        1. Check options_data for underlyingPrice column
+        2. Estimate from options using put-call parity
+        3. Fetch from market data API
+        4. ERROR if all methods fail (never store fake/obfuscated data)
         """
-        # First try to get from options data (same as validation)
+        # Method 1: Check for explicit underlying price in options data
         if options_data is not None and 'underlyingPrice' in options_data.columns:
             spot = float(options_data['underlyingPrice'].iloc[0])
-            self.logger.debug(f"Using spot price from underlyingPrice column: {spot}")
+            self.logger.debug(f"Method 1: Got spot price from underlyingPrice column: {spot}")
             return spot
 
-        # Fallback to 450.0 (same as validation pipeline)
-        self.logger.debug(f"underlyingPrice not in options data, using fallback: 450.0")
-        return 450.0
+        # Method 2: Estimate from options data using put-call parity
+        if options_data is not None and not options_data.empty:
+            estimated = self.estimate_spot_from_options(options_data)
+            if estimated:
+                self.logger.info(f"Method 2: Estimated spot price from put-call parity: {estimated:.2f}")
+                return estimated
+
+        # Method 3: Fetch from market data API
+        if self.has_stock_data:
+            try:
+                # Try to get closing price from Polygon
+                price = self.stock_client.get_daily_close(symbol, date)
+                if price:
+                    self.logger.info(f"Method 3: Fetched spot price from API: {price:.2f}")
+                    return price
+            except Exception as e:
+                self.logger.warning(f"Method 3 failed: Could not fetch price from API: {e}")
+
+        # NO FALLBACK TO 450.0 - Raise error instead of storing bad data
+        error_msg = (
+            f"Cannot determine real spot price for {symbol} {date}. "
+            f"All methods failed: underlyingPrice column missing, "
+            f"put-call parity estimation failed, API fetch failed. "
+            f"Database must store REAL prices only - refusing to store obfuscated/fake value."
+        )
+        self.logger.error(error_msg)
+        raise ValueError(error_msg)
 
     def estimate_spot_from_options(self, options_data: pd.DataFrame):
         """Estimate spot price from options data using put-call parity."""
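The body of estimate_spot_from_options is not part of this diff, so the following is only a sketch of how a put-call parity estimate can work, not the repository's implementation. For European options on a single expiry without dividends, C - P = S - K * D, where D is the discount factor to expiry, so S = C - P + K * D. The function name, the plain-tuple input format, and the `discount` parameter are all assumptions made for illustration; the real method takes a DataFrame.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical input: (strike, call_price, put_price) for ONE expiry.
Quote = Tuple[float, float, float]


def estimate_spot_from_parity(
    quotes: Iterable[Quote], discount: float = 1.0
) -> Optional[float]:
    """Estimate spot via S = C - P + K * discount.

    Uses the strike where |C - P| is smallest (closest to at-the-money),
    since quotes there are usually the most liquid. Returns None when
    there are no quotes, so the caller can move on to its next method.
    """
    best = None  # (abs(gap), strike, gap) of the best strike seen so far
    for strike, call_price, put_price in quotes:
        gap = call_price - put_price
        if best is None or abs(gap) < best[0]:
            best = (abs(gap), strike, gap)
    if best is None:
        return None
    _, strike, gap = best
    return gap + strike * discount


# Usage: prices constructed so that C - P = 502 - K at every strike.
quotes = [(490.0, 14.0, 2.0), (500.0, 5.0, 3.0), (510.0, 1.5, 9.5)]
spot = estimate_spot_from_parity(quotes)
```

With `discount=1.0` the sketch ignores interest rates, which is a reasonable approximation only for short-dated options.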
