Projects and studies in the data engineering area
Updated May 27, 2024 - HTML
This is a web scraping project that collects data from the Houzz e-commerce platform, the CNN YouTube channel, and the official TED Talks website. It uses the Apache Beam framework to build an ETL pipeline that writes the results to an Elasticsearch index. As a final step, the crawler results are visualized in Kibana.
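The repository's actual pipeline code is not shown here, so the following is only a minimal sketch of what the transform and load-preparation steps of such an ETL flow might look like. It uses the standard library only. The function names (`normalize`, `doc_id`, `to_bulk_lines`), the record fields (`title`, `url`), and the index name are all hypothetical. The output follows the newline-delimited JSON layout that the Elasticsearch bulk API expects: an action line followed by the document line.

```python
import hashlib
import json
from typing import Any, Dict, Iterable, List


def normalize(record: Dict[str, Any], source: str) -> Dict[str, Any]:
    """Strip whitespace from string fields and tag the record with its source."""
    cleaned = {k: (v.strip() if isinstance(v, str) else v) for k, v in record.items()}
    cleaned["source"] = source  # hypothetical field: which site was crawled
    return cleaned


def doc_id(doc: Dict[str, Any]) -> str:
    """Derive a stable id from source and URL so a re-crawl overwrites the same document."""
    key = f"{doc['source']}|{doc.get('url', '')}"
    return hashlib.sha1(key.encode("utf-8")).hexdigest()


def to_bulk_lines(docs: Iterable[Dict[str, Any]], index: str) -> str:
    """Render documents as a bulk-API body: one action line, then one source line, per doc."""
    lines: List[str] = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id(doc)}}))
        lines.append(json.dumps(doc))
    # Every line, including the last, is newline-terminated.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    raw = {"title": "  Oak Dining Table ", "url": "https://example.com/p/1 "}
    doc = normalize(raw, "houzz")
    print(to_bulk_lines([doc], "crawler-results"), end="")
```

In an Apache Beam pipeline, a per-record function like `normalize` could plausibly be applied with `beam.Map`, with the write to Elasticsearch handled in a later step. How the project itself wires these stages together is an assumption here.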