A Yahoo Finance stock crawler for financial data analysis and visualization with Excel/XLSX export.
StockPyrate is a Python script that gathers stock information (prices, dividends, volumes, market caps and corporate statistics) from Yahoo Finance, with a focus on the components of DOW JONES, NASDAQ 100, DAX 30, MDAX, SDAX, IBEX 35, CAC 40 and FTSE 100. You enter a list of stocks or indices to be crawled, parsed, analyzed and exported. The script returns an xlsx file for each stock and/or index component with metrics and visualizations, e.g. current dividend yields or percentage deltas of highs and lows in predefined time periods. It can also concatenate all parsed stocks into a single xlsx file to give a bird's-eye view of all stocks in one place. The purpose is to enrich the data, automate the time-consuming work of screening stock information and thereby make investment decisions easier and faster for individual investors.
- Latest updates can be found here.
- Example of Console Output: Go to txt file.
- Example of XLSX Output: Go to xlsx file.
- Screenshot of merged xlsx files for all parsed stocks:
- Screenshot of single xlsx files for each stock:
- The script relies on the slow, old-fashioned approach of web scraping instead of rapid API calls.
- The current implementation sends requests to the German subdomain of Yahoo Finance, which requires your consent to its data policy; technically, this consent is stored in a cookie. You therefore have to visit Yahoo Finance once with a regular browser to set the consent values, then enter and save those values before starting the script (cf. setup.md).
- Most of the captions and labels in the XLSX content are (currently) in German.
Developed with Python 3.7.6 (Python 3.5+ may also work). To run StockPyrate you need:
- Requests (for crawling)
- BeautifulSoup (for parsing)
- Pandas (for exporting)
Other modules, such as datetime, random or os, are part of the Python standard library (cf. requirements.txt).
- Set up your cookie in crawler.py (/functions)
- Define stock names to parse in stockpyrate.py (root).
- Hit play.
crawler.py:
# Define your cookie values by replacing the empty strings. Done!
def create_cookie():
    cookie = {
        'EuConsent': '',
        'UIDR': '', [...]
stockpyrate.py:
# Define your stock names to parse as a list of strings. Done!
custom_filter["whitelist_stocks"] = ["intel", "qualcomm"]
custom_filter["whitelist_sectors"] = ["semiconductors", "gaming"]
Apart from module requirements you initially need to set up your individual cookie - and you are ready to go (cf. setup.md).
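To illustrate how filter lists like the ones in stockpyrate.py could be applied, here is a minimal sketch. It is not the script's actual implementation: the function name `apply_filters`, the `"blacklist_stocks"` key, the `"name"`/`"sector"` record keys and the rule that a stock passes if it matches either whitelist are all assumptions made for this example.

```python
def apply_filters(stocks, custom_filter):
    """Return the stocks that pass the whitelist and blacklist filters.

    `stocks` is a list of dicts with the hypothetical keys "name" and
    "sector". An empty or missing filter list means "no restriction".
    """
    whitelist_stocks = {s.lower() for s in custom_filter.get("whitelist_stocks", [])}
    whitelist_sectors = {s.lower() for s in custom_filter.get("whitelist_sectors", [])}
    blacklist_stocks = {s.lower() for s in custom_filter.get("blacklist_stocks", [])}

    selected = []
    for stock in stocks:
        name = stock["name"].lower()
        sector = stock["sector"].lower()
        # A blacklisted stock is always dropped.
        if name in blacklist_stocks:
            continue
        # Assumption: if any whitelist is set, the stock must match
        # at least one of them (by name or by sector).
        if whitelist_stocks or whitelist_sectors:
            if name not in whitelist_stocks and sector not in whitelist_sectors:
                continue
        selected.append(stock)
    return selected


# Made-up sample records, for illustration only.
sample_stocks = [
    {"name": "intel", "sector": "semiconductors"},
    {"name": "qualcomm", "sector": "semiconductors"},
    {"name": "nvidia", "sector": "semiconductors"},
    {"name": "coca-cola", "sector": "beverages"},
]
sample_filter = {
    "whitelist_stocks": ["intel", "qualcomm"],
    "whitelist_sectors": ["gaming"],
    "blacklist_stocks": ["qualcomm"],
}
selected_names = [s["name"] for s in apply_filters(sample_stocks, sample_filter)]
```

With these sample values only "intel" remains: "qualcomm" is blacklisted, and the other two stocks match neither whitelist.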
Crawling and Parsing
- Whitelist filter for stock symbols
- Whitelist filter for sectors
- Blacklist filter for stock symbols
- Whitelist filter for stock indices
- Blacklist filter for stock indices
- Timer in seconds for average crawling delay
- Estimated time needed for crawling
- Status messages steadily commenting on what is going on
- Error handling primarily to avoid stalling ("in case of misunderstanding, read on!")
- Automated user-friendly file naming convention for xls export
- Custom filename and folder for xlsx concatenation procedure
- Predefined list of stocks now with more than 600 stocks with sector information
- Once started, crawl, parse, analyze and export in one shot
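The crawling delay and the time estimate listed above can be sketched roughly as follows. This is an illustration under assumptions, not the script's code: the helper names, the `jitter` parameter and the idea of multiplying stocks by pages per stock are hypothetical, and the estimate ignores download and parsing time.

```python
import random


def random_delay(avg_delay, jitter=0.5):
    """Return a delay in seconds drawn uniformly around avg_delay.

    With jitter=0.5 the delay lies between 50% and 150% of avg_delay,
    so the long-run average stays at avg_delay.
    """
    low = avg_delay * (1 - jitter)
    high = avg_delay * (1 + jitter)
    return random.uniform(low, high)


def estimate_crawl_time(num_stocks, pages_per_stock, avg_delay):
    """Estimate the total waiting time in seconds for a crawl run."""
    return num_stocks * pages_per_stock * avg_delay


def format_duration(seconds):
    """Format a number of seconds as hours, minutes and seconds."""
    minutes, secs = divmod(int(round(seconds)), 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:d}h {minutes:02d}m {secs:02d}s"


# Example: 30 stocks, 4 pages each (assumed), 5 seconds average delay.
estimate = estimate_crawl_time(num_stocks=30, pages_per_stock=4, avg_delay=5)
print("Estimated crawling time:", format_duration(estimate))
```

In a crawl loop, the value returned by `random_delay` would be passed to `time.sleep` between two requests.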
Financial Data Analysis and Visualization
- Current stock price
- Sector information
- Historical stock prices
- Current dividend
- Current dividend yield
- Current ex date
- Daily stock prices with line chart
- Avg. weekly stock prices with line chart
- Avg. quarterly stock prices with line chart
- Dividend history with column chart
- Transaction volumes in number of traded shares
- Transaction amounts (traded shares multiplied by closing prices)
- Free float market capitalization history
- Percentage delta of daily/weekly/quarterly changes
- Visualization of Ups and Downs with arrow symbols
- High, mean, low stock quote, date and percent delta for 1 week, ... up to 15 years
- Number of days a stock needs to recover its before-ex-date-price after dividend payment
- Key facts based on statistics section at Yahoo Finance
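A few of the metrics above can be expressed in a handful of lines. The sketch below uses only the standard library and made-up prices; the function names and the exact definitions (for example, measuring recovery in calendar days against the last close before the ex date) are assumptions and may differ from what StockPyrate actually computes.

```python
from datetime import date


def dividend_yield(dividend, price):
    """Dividend yield in percent of the current price."""
    return dividend / price * 100


def pct_delta(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


def high_mean_low(prices):
    """Return (high, mean, low) for a list of (date, close) tuples.

    high and low are returned as (date, close) tuples so the date of
    the extreme value is kept; mean is a plain number.
    """
    high = max(prices, key=lambda p: p[1])
    low = min(prices, key=lambda p: p[1])
    mean = sum(close for _, close in prices) / len(prices)
    return high, mean, low


def days_to_recover(prices, ex_date):
    """Calendar days from ex_date until the close is back at or above
    the last close before ex_date. Returns None if it never recovers.

    prices must be (date, close) tuples sorted by ascending date.
    """
    before = [close for day, close in prices if day < ex_date]
    if not before:
        return None
    target = before[-1]
    for day, close in prices:
        if day >= ex_date and close >= target:
            return (day - ex_date).days
    return None


# Made-up sample data, for illustration only.
sample_prices = [
    (date(2020, 3, 2), 100.0),
    (date(2020, 3, 3), 98.0),   # ex date: price drops after the dividend
    (date(2020, 3, 4), 99.0),
    (date(2020, 3, 5), 100.5),  # back above the pre-ex-date close
]
high, mean, low = high_mean_low(sample_prices)
recovery = days_to_recover(sample_prices, ex_date=date(2020, 3, 3))
```

For the sample data, the price takes two calendar days after the ex date to return to its previous level.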
Excel/XLSX file export
- Includes all financial data cf. above
- One XLSX file for each stock
- Exported sheets contain:
1. Key data
2. Daily Stock Prices
3. Weekly Stock Prices
4. Quarterly Stock Prices
5. Dividends History
6. Transaction Volume 12 Months
7. Transaction Volume History
8. Market Capitalization
- Optional output of one single xlsx file containing key facts of all stocks.
- Refactoring (especially parser).
- Translation.
- Write documentation.
- Change from German subsite to US site.
- More stocks/indices.
- Integrate financial data such as income statements (SEC filings).
- Make it more versatile (untie the Yahoo-biased parsing, other sources).
- Rebuild or additional feature: API calls.
- Build web frontend.
- Find and report bugs.
- Give ideas on data analysis.
This project is licensed under the terms of the GPL-3.0.