SunspotDataScraper is a Python Jupyter Notebook-based tool designed to scrape, analyze, and visualize historical sunspot group data from observatories. This project aims to provide insights into solar activity trends by collecting sunspot data across specified years, calculating the average lifespans of sunspot groups, and generating visual representations of these patterns over time. It’s ideal for researchers, students, and anyone interested in exploring long-term solar cycle trends and the impact of sunspot activity on space weather.
The primary objectives of SunspotDataScraper are:
- To retrieve historical sunspot data from multiple observatory databases.
- To calculate the average lifespan of sunspot groups, showing how long groups remain visible.
- To analyze and predict the Zurich classification of sunspots in order to identify patterns in lifespan.
- To generate time-series visualizations of sunspot data, allowing users to observe solar cycle patterns and trends.
- To enable customizable data collection ranges: users can specify particular years or periods of interest in the list_of_years data structure.

This project can help provide context for studies of solar behavior, potential impacts on Earth's geomagnetic environment, and historical solar cycle analysis.
- Data Retrieval: Automatically scrapes sunspot data from multiple observatories.
- Data Processing: Cleans and organizes raw data into structured formats for easy analysis.
- Visualization: Generates plots for average sunspot lifespan and trends across specific timeframes.
- Customizable: Easily modify the years or range of data being retrieved for tailored analysis.
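As an illustration of the average-lifespan calculation mentioned above, here is a minimal sketch using pandas. It assumes one row per day on which a group was observed; the column names (`group_id`, `date`), the sample rows, and the `average_lifespan_by_year` helper are hypothetical and are not taken from the notebooks, whose actual schema may differ.

```python
import pandas as pd

# Hypothetical observations: one row per (group, day) on which a group was seen.
# Column names and values are illustrative, not the notebooks' actual schema.
observations = pd.DataFrame(
    {
        "group_id": ["A1", "A1", "A1", "B2", "B2", "C3"],
        "date": pd.to_datetime(
            [
                "1950-03-01", "1950-03-02", "1950-03-05",
                "1950-07-10", "1950-07-11", "1951-01-20",
            ]
        ),
    }
)


def average_lifespan_by_year(obs: pd.DataFrame) -> pd.Series:
    """Mean lifespan in days of sunspot groups, keyed by year of first sighting."""
    # First and last day on which each group was observed.
    spans = obs.groupby("group_id")["date"].agg(first_seen="min", last_seen="max")
    # Count both endpoints, so a group seen on a single day has a lifespan of 1.
    spans["lifespan_days"] = (spans["last_seen"] - spans["first_seen"]).dt.days + 1
    spans["year"] = spans["first_seen"].dt.year
    return spans.groupby("year")["lifespan_days"].mean()


average_lifespan = average_lifespan_by_year(observations)
print(average_lifespan)
```

Attributing each group to the year of its first sighting is one possible convention; a group that spans a year boundary could equally be attributed to the year of its last sighting.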
- Clone the repository:
  git clone https://github.com/daxpatel2/SunspotDataScraper.git
  cd SunspotDataScraper
- Install dependencies: This project uses Python libraries including requests, pandas, and matplotlib. Install them with:
  pip install -r requirements.txt
- Set the Desired Years: In the scripts, modify list_of_years to specify the years you want to retrieve data for.
- Run the Scrapers:
  - Specola Database Scraper: Run each cell in the Jupyter Notebook.
  - Fenyi Scraper: Runs automatically inside the Specola Database Scraper. A separate file is provided in case you want to run only the web scraper code.
- View Data: All graphs are displayed inside the Jupyter Notebook cells.
- Generate Visualizations: After running the scrapers, SunspotDataScraper generates plots of sunspot lifespans and activity trends and saves them as PNG files using the plt.savefig() method.
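The year selection and plot-saving steps above might look roughly like the following sketch. Only the names list_of_years and plt.savefig() come from this README; the year range, the placeholder lifespan values, and the output file name are assumptions made for illustration and do not represent real measurements.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend for scripts; omit this line inside Jupyter
import matplotlib.pyplot as plt

# Years to retrieve and plot; edit this list to change the range of interest.
list_of_years = list(range(1950, 1956))

# Placeholder values standing in for scraped results (NOT real measurements).
average_lifespan_days = {year: 3.0 + 0.5 * (year - 1950) for year in list_of_years}

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(
    list_of_years,
    [average_lifespan_days[year] for year in list_of_years],
    marker="o",
)
ax.set_xlabel("Year")
ax.set_ylabel("Average group lifespan (days)")
ax.set_title("Average sunspot group lifespan by year (placeholder data)")

output_path = "average_lifespan.png"  # hypothetical output file name
plt.savefig(output_path, dpi=150, bbox_inches="tight")
plt.close(fig)
```

Non-contiguous selections work the same way, for example `list_of_years = [1947, 1958, 1989]`.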
- Specola_Database_Scrapper.ipynb: Scrapes sunspot data from the Specola Observatory database.
- Fenyi_Scrapper.ipynb: Scrapes data from the Fenyi Observatory database (already implemented inside the Specola Database Scraper).
- sunspot_data_excel.csv: Example output data file for sunspot group appearances and lifespans downloaded from the Specola Database Archive.
- Find_sunspots_switching_zurich_classifications.ipynb: Notebook that, as its name indicates, finds sunspot groups that switch Zurich classifications.
Contributions are welcome! Please submit an issue or pull request if you have suggestions.
- Fork the repository
- Create a feature branch (git checkout -b feature-branch)
- Commit your changes (git commit -m 'Add feature')
- Push to the branch (git push origin feature-branch)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
Created by daxpatel2 – feel free to reach out with questions or feedback!
A big thank you to my research advisors, Dr. Asif Ud-Doula and Dr. Gillian Pearce, for their guidance in this research.