Open
Description
Feature Request: Migrate GSOD Fetching to Meteostat
Summary
Replace all raw GSOD (NOAA) data ingestion code with the Meteostat Python library.
This eliminates manual fixed‑width parsing, removes local file management overhead, and provides a cleaner, more reliable data interface for engineering workflows.
Why Migration Makes Sense
Comparison: Current GSOD Pipeline vs. Meteostat
| Aspect | Current (Raw GSOD) | Proposed (Meteostat) |
|---|---|---|
| Output Format | Manual fixed‑width parsing → CSV | Direct Pandas DataFrame |
| Data Quality | Missing data encoded as sentinel values (e.g., 99.99, 999.9, 9999.9) | Interpolated, merged GSOD + METAR + ISD |
| Storage | Downloads & local cache mgmt | Optional built‑in local cache, managed by the library |
| Units | Mostly imperial (°F, knots, inches, miles); pressure in mbar | Metric by default (°C, km/h, hPa) |
| Location Resolution | NOAA Station IDs required | Address → Lat/Lon via geocoding |
| Wind / Pressure Reliability | Often sparse or missing | Multi‑source merged, higher fidelity |
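To make the "Data Quality" and "Units" rows concrete, here is a minimal sketch of the kind of cleanup the raw GSOD path needs. The helper names and the `GSOD_MISSING` mapping are hypothetical and not taken from the existing code. The sentinel values are the ones NOAA documents for the temperature, wind speed, and precipitation fields.

```python
from typing import Optional

# Missing-data sentinels documented for GSOD fields (illustrative subset).
GSOD_MISSING = {"temp": 9999.9, "wdsp": 999.9, "prcp": 99.99}


def clean(field: str, raw: float) -> Optional[float]:
    """Return None when raw equals the field's missing-data sentinel."""
    return None if raw == GSOD_MISSING[field] else raw


def fahrenheit_to_celsius(temp_f: Optional[float]) -> Optional[float]:
    """Convert °F to °C, passing missing values through unchanged."""
    return None if temp_f is None else (temp_f - 32.0) * 5.0 / 9.0


def knots_to_kmh(speed_kn: Optional[float]) -> Optional[float]:
    """Convert knots to km/h (1 knot = 1.852 km/h exactly)."""
    return None if speed_kn is None else speed_kn * 1.852
```

With Meteostat, this layer should no longer be needed, since the library returns metric values with missing entries already represented as NaN in the DataFrame.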
Engineering Advantages
1. Accuracy
- Meteostat fuses GSOD + METAR + ISD, producing a more reliable ambient dataset.
- Built‑in interpolation can fill the data gaps typical of raw GSOD (e.g., wind speed missing for multiple days).
2. Maintainability
- Removes parsing logic, column remapping, unit conversion, and NOAA folder iteration.
- Significantly reduces boilerplate and long‑term technical debt.
3. UX Improvements
- Allows users to search by address or city name, not station ID.
- Automatically selects the nearest high‑quality weather station.
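The nearest-station idea in the UX section can be illustrated with a small standard-library sketch. This is not Meteostat's actual selection logic, which also has to account for data availability. The station list, its approximate coordinates, and the function names are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometres."""
    p1, p2 = radians(lat1), radians(lat2)
    dlat = p2 - p1
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(p1) * cos(p2) * sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_station(lat, lon, stations):
    """Pick the (name, lat, lon) tuple closest to the query point."""
    return min(stations, key=lambda s: haversine_km(lat, lon, s[1], s[2]))


# Hypothetical station list; coordinates are approximate.
STATIONS = [
    ("Houston area", 29.98, -95.36),
    ("Dallas area", 32.90, -97.04),
    ("New Orleans area", 29.99, -90.25),
]
```

Calling `nearest_station(29.76, -95.37, STATIONS)` with a geocoded Houston coordinate picks the Houston entry. In the proposed design the library performs this lookup, so none of this code would live in the project.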
Implementation Concept
```python
from datetime import datetime

from geopy.geocoders import Nominatim
from meteostat import Point, Daily

# 1. Convert the user's location into a geospatial point
loc = Nominatim(user_agent="fluids").geocode("Houston, TX")
if loc is None:
    raise ValueError("Location could not be geocoded")
point = Point(loc.latitude, loc.longitude)

# 2. Fetch daily weather data (temperature, wind, pressure, etc.)
start = datetime(2023, 1, 1)  # example date range
end = datetime(2023, 12, 31)
df = Daily(point, start, end).fetch()

# DataFrame includes:
# tavg, tmin, tmax, prcp, snow, wdir, wspd, pres, tsun
```
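One caveat worth noting: Meteostat's daily columns are metric rather than SI base units (`tavg` in °C, `wspd` in km/h, `pres` in hPa). If downstream calculations expect kelvin, m/s, and pascals, a thin conversion step would still be needed. A minimal sketch, with hypothetical helper names and output keys:

```python
def celsius_to_kelvin(temp_c):
    """Convert °C to K."""
    return temp_c + 273.15


def kmh_to_ms(speed_kmh):
    """Convert km/h to m/s."""
    return speed_kmh / 3.6


def hpa_to_pa(pressure_hpa):
    """Convert hPa to Pa."""
    return pressure_hpa * 100.0


def to_si_base(record):
    """Convert one daily record (a dict keyed by Meteostat column names)."""
    return {
        "tavg_K": celsius_to_kelvin(record["tavg"]),
        "wspd_ms": kmh_to_ms(record["wspd"]),
        "pres_Pa": hpa_to_pa(record["pres"]),
    }
```

Because these are plain arithmetic operations, NaN values from the DataFrame propagate unchanged.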