This project is a data engineering solution that streamlines the collection, transformation, and analysis of financial data for gold as a commodity. It aims to give investment portfolio managers timely and accurate insights for making informed decisions about precious metals investments, specifically gold priced against the US dollar.
What I learned
- Handling market data (OHLC)
- Web scraping using Python
- Sentiment analysis using LLMs
- Text summarization
The news data pipeline
- Scrapes news articles from a website.
- Performs sentiment analysis on the articles.
- Uploads the resulting datasets to AWS S3.
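As a rough illustration of the steps above, the following sketch labels scraped articles with a sentiment and serializes the result to CSV text that could then be written to S3. It is not the project's actual code: the function names `label_articles`, `to_csv`, and `keyword_classifier` are hypothetical, the sample articles are made up, and the keyword-based classifier is only a stand-in for the LLM call.

```python
import csv
import io
from typing import Callable


def label_articles(articles: list[dict], classify: Callable[[str], str]) -> list[dict]:
    """Return a copy of each article with a 'sentiment' field added."""
    return [{**article, "sentiment": classify(article["text"])} for article in articles]


def to_csv(records: list[dict]) -> str:
    """Serialize a non-empty list of records to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


def keyword_classifier(text: str) -> str:
    """Placeholder classifier (assumption): stands in for the LLM call."""
    lowered = text.lower()
    if "rally" in lowered or "gain" in lowered:
        return "positive"
    if "fall" in lowered or "drop" in lowered:
        return "negative"
    return "neutral"


# Made-up sample articles, for illustration only.
articles = [
    {"title": "Gold rallies", "text": "Gold prices rally as the dollar weakens."},
    {"title": "Gold slips", "text": "Gold prices drop after strong jobs data."},
]

labeled = label_articles(articles, keyword_classifier)
csv_text = to_csv(labeled)
print(csv_text)
```

Passing the classifier in as a callable keeps the labeling step independent of any particular model, so the placeholder can be swapped for a real LLM client without touching the rest of the flow.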
The market data pipeline
- Fetches OHLC market data via the Twelve Data API.
- Adds a new column holding the difference between the open price and the close price.
- Uploads the resulting datasets to AWS S3.
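The derived-column step can be sketched as follows. This is a minimal example under stated assumptions, not the project's implementation: the function name `add_price_change` and the column name `price_change` are hypothetical, the sign convention (close minus open) is assumed because the description does not specify one, prices are assumed to arrive as strings, and the sample rows are made up.

```python
from decimal import Decimal


def add_price_change(rows: list[dict]) -> list[dict]:
    """Return copies of OHLC rows with a 'price_change' field (close - open).

    Decimal is used so that string prices are subtracted exactly,
    without binary floating-point rounding noise.
    """
    return [
        {**row, "price_change": Decimal(row["close"]) - Decimal(row["open"])}
        for row in rows
    ]


# Made-up sample rows, for illustration only.
rows = [
    {"datetime": "2023-09-01", "open": "1940.10", "high": "1952.00",
     "low": "1935.40", "close": "1945.60"},
    {"datetime": "2023-09-04", "open": "1945.60", "high": "1947.30",
     "low": "1936.00", "close": "1938.20"},
]

enriched = add_price_change(rows)
for row in enriched:
    print(row["datetime"], row["price_change"])
```

With this convention a positive value marks a period in which gold closed above its open, and a negative value marks a decline.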
To run both pipelines, execute the following command from the project root:
kedro run
To learn how to set up a Kedro project, visit https://docs.kedro.org/en/stable/get_started/install.html
This project is licensed under the MIT License - see the LICENSE file for details.