datalake-ingestion

Here are 5 public repositories matching this topic...

KeeplerIO / de-identification-framework

Application of our De-identification Framework with open source technologies, enabling enterprises to take ownership of the de-identification process and deploy it in trusted environments.

data privacy-protection data-security pii de-identification datalake-ingestion

Updated Nov 15, 2021
Python

ac-gomes / data_engineer_with_airflow

This project is an adaptation based on a real test for a Junior Data Engineer position.

postgres airflow json-api aws-s3 python3 azure-storage datalake datalake-ingestion

Updated Jun 16, 2023
Python

makz17 / youtube-video-data-analytics-aws

python cli aws sql etl lambda-functions pyspark parquet-files glue-etl datalake-ingestion

Updated Jun 9, 2025
Python

Akshay8147 / Data-Engineering-Youtube-Analysis-Project

A cloud-based ETL testing and data analysis pipeline for YouTube trending video data using AWS services including Lambda, Glue, Athena, S3, and QuickSight. This project focuses on ingesting, transforming, storing, and analyzing structured and semi-structured data to generate insights based on video categories and trending metrics.

python aws sql aws-lambda athena etl aws-s3 glue datalake iam-role quicksight datalake-ingestion

Updated Jul 14, 2025
Python

BurakCakan / gcs-data-ingestion

This repo shows how to read data from and write data to Google Cloud Storage with PySpark. The raw data is ingested, transformed, and stored in the data lake in snapshot format.

unit-testing spark ci-cd python3 google-cloud-platform datalake-ingestion

Updated Feb 27, 2023
Python
