Data Engineering Project with Hadoop HDFS and Kafka
Updated Nov 4, 2023 - Python
The Scan to Model Pipeline (SMP) is an open-source tool that automatically generates mesh model files (PLY) by filtering and clustering data in LAS, a point cloud format. With it, you can automate the extraction of buildings, ground, and trees from PCD. It is structured as a pipeline, so you can easily adjust parameters for each data processing
Including pre-trained language models for fine-tuning on other NLP tasks
My small cheatsheets for tf2onnx, Git commands, Linux commands, and evaluations
batch machine learning
hGMNet : host Genetics and Microbe interaction Networks
Sparkify - Data Pipelines with Airflow - Udacity Data Engineering Expert Track.
Deploying Flask Application on Azure Web Apps
The simplest application for planning your tasks.
Delta Lake tutorial with Spark, Hive, and Hadoop
💻🔧 Scrapy and Selenium Projects 🪛 ⚙💻