Csv-to-Parquet-and-data-reporting-pipeline

Problem Statement

The task was to extract 800k records from a CSV file, convert them to Parquet, store the result in a data lake, and then build a report in Power BI. The main problem I faced was that some column names in the CSV data contained spaces. The pipeline kept crashing at the conversion step because the Parquet output does not support spaces in column names. After I removed the spaces from the column names, the pipeline ran successfully. I then loaded the Parquet data into Power BI and created the report.
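The column-name fix can be illustrated outside the pipeline with a minimal Python sketch. It is not the pipeline's own implementation, which lives in the ADF JSON definition. The function names `sanitize_column_names` and `rewrite_header` and the sample columns are hypothetical and chosen only for illustration. The sketch assumes the first row of the CSV is the header.

```python
import csv
import io
import re


def sanitize_column_names(names):
    """Remove all whitespace from each column name (hypothetical helper)."""
    return [re.sub(r"\s+", "", name) for name in names]


def rewrite_header(src, dst):
    """Copy CSV rows from src to dst, replacing the header with sanitized names.

    Assumes the first row is the header; data rows are copied unchanged.
    """
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    header = next(reader)
    writer.writerow(sanitize_column_names(header))
    writer.writerows(reader)


# Illustrative sample data, not taken from the project's dataset.
sample = io.StringIO("Order ID,Customer Name,Total\n1,Ann Lee,9.50\n")
cleaned = io.StringIO()
rewrite_header(sample, cleaned)
print(cleaned.getvalue())
```

Only the header row is changed, so spaces inside data values are preserved. If two columns differ only by whitespace they would collide after cleaning, which this sketch does not handle.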

The JSON files

The ThirdPortfolioProject.json file contains information about the ADF pipeline, including the pipeline name, description, and the resources that make up the pipeline. The manifest.json file describes the dependencies and structure of the pipeline's ARM template in Azure Data Factory.

MuhammadHasaanWahid/Csv-To-Parquet-And-Data-Reporting-Pipeline

Folders and files

Latest commit

History

Repository files navigation

Csv-to-Parquet-and-data-reporting-pipeline

Problem Statement

The Json files

PowerBI Report

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Packages