
🔹 Clean & Explore with Missing and Categorical Data 🔹 Python script to clean and preprocess Titanic dataset using Pandas & NumPy. Handles missing values and encodes categorical data. Produces a clean, numeric dataset ready for analysis and machine learning models.

Abdullah321Umar/CodeSentinel_DataAnalytics-Task2

📊 CodeSentinel_DataAnalytics-Task2

🧠 Task Overview

In this task, I worked with the Titanic dataset to practice data cleaning and preprocessing using Pandas & NumPy. The main goal was to handle missing values and transform categorical data into numeric form, making the dataset ready for further analysis or modeling. This exercise is an essential step in preparing real-world datasets, ensuring they are clean, structured, and machine-learning friendly.

📊 Key Steps Performed:

  • Missing Value Handling → Filled missing values in Age and Fare with their median values, and dropped the Cabin column due to excessive nulls.
  • Categorical Encoding → Converted categorical columns (Sex and Embarked) into numeric form using One-Hot Encoding.
  • Final Clean Dataset → Produced a structured DataFrame with no missing values and all categorical features transformed into numeric columns.
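
The steps above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual script: the tiny `sample` DataFrame and the `clean_titanic` helper are hypothetical, and `drop_first=True` is an assumption inferred from the encoded column names reported in this README (Sex_male, Embarked_Q, Embarked_S).

```python
import numpy as np
import pandas as pd

# Hypothetical hand-made sample with Titanic-style columns (illustrative values only).
sample = pd.DataFrame({
    "Age": [22.0, np.nan, 26.0, 35.0],
    "Fare": [7.25, 71.28, np.nan, 8.05],
    "Cabin": [np.nan, "C85", np.nan, np.nan],
    "Sex": ["male", "female", "female", "male"],
    "Embarked": ["S", "C", "Q", "S"],
})

def clean_titanic(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: median-fill, drop Cabin, one-hot encode."""
    out = df.copy()
    # Fill missing Age and Fare values with each column's median.
    for col in ("Age", "Fare"):
        out[col] = out[col].fillna(out[col].median())
    # Cabin is mostly null, so drop the whole column.
    out = out.drop(columns=["Cabin"])
    # One-hot encode Sex and Embarked; drop_first=True (an assumption) removes
    # the first category of each, leaving Sex_male, Embarked_Q and Embarked_S.
    out = pd.get_dummies(out, columns=["Sex", "Embarked"], drop_first=True, dtype=int)
    return out

cleaned = clean_titanic(sample)
print(cleaned)
```

The median is used rather than the mean because it is less sensitive to extreme values, such as a handful of very high fares.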

📈 Console-Based Insights Generated:

  • ✅ Dataset Info → Displays rows, columns, data types, and missing values.
  • 📊 Encoded Features → New numeric columns created: Sex_male, Embarked_Q, Embarked_S.
  • 📌 Cleaned Data → Ready-to-use dataset free from nulls and categorical text.
  • 🧑‍💻 Quick Preview → Printed the first 5 rows to confirm transformations.
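
Console checks of this kind might look like the sketch below. The small `cleaned` frame is a hypothetical stand-in for the preprocessed dataset, and the variable names are illustrative, not taken from the repository.

```python
import pandas as pd

# Hypothetical stand-in for the cleaned dataset (illustrative values only).
cleaned = pd.DataFrame({
    "Age": [22.0, 26.0, 26.0, 35.0],
    "Fare": [7.25, 71.28, 8.05, 8.05],
    "Sex_male": [1, 0, 0, 1],
    "Embarked_Q": [0, 0, 1, 0],
    "Embarked_S": [1, 0, 0, 1],
})

# Dataset info: row/column counts, dtypes, and non-null counts.
cleaned.info()

# Missing values per column, plus a single overall total.
print(cleaned.isna().sum())
total_missing = int(cleaned.isna().sum().sum())

# Columns created by one-hot encoding, identified by their prefixes.
encoded_cols = [c for c in cleaned.columns if c.startswith(("Sex_", "Embarked_"))]
print("Encoded features:", encoded_cols)

# Any column that is still non-numeric (should be empty after encoding).
non_numeric = cleaned.select_dtypes(exclude="number").columns.tolist()
print("Non-numeric columns:", non_numeric)

# Quick preview of the first 5 rows.
print(cleaned.head())
```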

🛠 Tools & Techniques Used

The preprocessing script was built using the following tools and technologies:

  • Python (Jupyter Notebook / Script) → for data preprocessing
  • Pandas → handling missing data & encoding categorical variables
  • NumPy → numerical operations during preprocessing

🚀 Learning Impact

  • 📊 Strengthened skills in real-world data cleaning and preprocessing.
  • 💡 Learned strategies for handling missing values (median filling, dropping).
  • ⚡ Practiced categorical variable encoding for machine learning readiness.
  • Built a reusable preprocessing script for structured datasets.

🔗 Connect

📧 Email: umerabdullah048@gmail.com

Screenshots / Demos

Code Preview and Output Preview (screenshots of the code and its console output).
