Commit c77b302

Update CaseStudy.md
1 parent ea665eb commit c77b302

1 file changed: +150 -0 lines changed

35 Day Pandas Basic/CaseStudy.md

@@ -74,3 +74,153 @@ This case study demonstrates common Pandas operations:
8. Customer behavior analysis
9. Filtering for specific conditions
10. Exporting results to CSV files

---------------------------------


# Pandas Step-by-Step Example Guide
*Updated: June 25, 2025*

## 1. What is Pandas?
Pandas is a Python library for data manipulation and analysis. It provides two core data structures, Series and DataFrame, which suit structured data such as tables and time series.

```python
import pandas as pd
import numpy as np
```

## 2. Series and DataFrames
- **Series**: A one-dimensional, array-like object that can hold data of any type (integers, strings, floats, etc.). Each value is labeled by an index.
- **DataFrame**: A two-dimensional, tabular data structure with labeled rows and columns, similar to a spreadsheet or SQL table.

```python
# Creating a Series
series = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
print("Series:\n", series)

# Creating a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
print("\nDataFrame:\n", df)
```
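
To make the relationship between the two structures concrete, the sketch below shows that selecting a single column of a DataFrame yields a Series that shares the DataFrame's row index. The names `people`, `ages`, and `scores` are illustrative and not part of the original guide.

```python
import pandas as pd

people = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]})

# Selecting a single column returns a Series
ages = people['Age']
print(type(ages).__name__)  # Series

# The Series keeps the same row index as the DataFrame it came from
print(ages.index.equals(people.index))

# A Series with a custom index supports label-based access
scores = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])
print(scores['b'])
```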

## 3. Creating DataFrames
DataFrames can be created from dictionaries, lists, or other data structures.

```python
# From a dictionary
data = {'Product': ['Apple', 'Banana', 'Orange'], 'Price': [1.0, 0.5, 0.75]}
df = pd.DataFrame(data, index=['P1', 'P2', 'P3'])
print("DataFrame from dictionary:\n", df)

# From a list of dictionaries
list_data = [{'Product': 'Apple', 'Price': 1.0}, {'Product': 'Banana', 'Price': 0.5}]
df_list = pd.DataFrame(list_data)
print("\nDataFrame from list:\n", df_list)
```
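
Plain lists of rows work as well. The following is a small sketch, with made-up variable names, that builds a DataFrame from a list of lists by supplying column names explicitly. It also shows that a key missing from one dictionary in a list of dictionaries becomes a missing value.

```python
import pandas as pd

# Each inner list is one row; column names are passed separately
rows = [['Apple', 1.0], ['Banana', 0.5], ['Orange', 0.75]]
df_rows = pd.DataFrame(rows, columns=['Product', 'Price'])
print(df_rows)

# The second dictionary has no 'Price' key, so that cell is NaN
partial = [{'Product': 'Apple', 'Price': 1.0}, {'Product': 'Banana'}]
df_partial = pd.DataFrame(partial)
print(df_partial)
```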

## 4. Reading and Writing Data
Pandas supports reading from and writing to various file formats like CSV, Excel, and JSON.

```python
# Writing DataFrame to CSV
df.to_csv('products.csv', index=True)

# Reading from CSV
df_read = pd.read_csv('products.csv', index_col=0)
print("Read from CSV:\n", df_read)
```
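
JSON follows the same write-then-read pattern. The sketch below keeps everything in memory rather than touching the file system. The `orient='records'` layout and the variable names are choices made for this example. Excel works similarly through `to_excel` and `read_excel`, but needs an extra engine package such as `openpyxl`, so it is not shown here.

```python
import io
import pandas as pd

products = pd.DataFrame({'Product': ['Apple', 'Banana'], 'Price': [1.0, 0.5]})

# With no path argument, to_json returns the JSON text as a string
json_text = products.to_json(orient='records')
print(json_text)

# read_json accepts a file-like object, so wrap the string in StringIO
restored = pd.read_json(io.StringIO(json_text), orient='records')
print(restored)
```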

## 5. Data Types and Missing Values
Pandas infers a data type for each column automatically. You can inspect those types with `dtypes` and detect missing values with `isna()`.

```python
# Checking data types
print("Data types:\n", df.dtypes)

# Introducing missing values
df_with_na = df.copy()
df_with_na.loc['P1', 'Price'] = np.nan
print("\nDataFrame with missing values:\n", df_with_na)

# Checking for missing values
print("\nMissing values:\n", df_with_na.isna())
```
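
Two follow-up steps often come next: counting the missing values per column and converting a column whose inferred type is not the one you want. The sketch below assumes a hypothetical `Units` column that arrived as text.

```python
import numpy as np
import pandas as pd

stock = pd.DataFrame({
    'Product': ['Apple', 'Banana', 'Orange'],
    'Price': [1.0, np.nan, 0.75],
    'Units': ['100', '200', '150'],  # numbers stored as strings
})

# isna() gives a boolean table; summing it counts missing values per column
missing_counts = stock.isna().sum()
print(missing_counts)

# Convert the text column to integers explicitly
stock['Units'] = stock['Units'].astype(int)
print(stock.dtypes)
```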

## 6. Indexing Methods: loc and iloc
- `loc`: Access data by row and column labels.
- `iloc`: Access data by integer position.

```python
# Using loc
print("Using loc (select P1):\n", df.loc['P1'])

# Using iloc
print("\nUsing iloc (select first row):\n", df.iloc[0])
```
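
One difference that is easy to miss is how the two accessors slice: `loc` includes the end label, while `iloc` excludes the end position, as ordinary Python slicing does. The sketch below uses an illustrative `df_demo` frame to show this, along with single-cell access.

```python
import pandas as pd

df_demo = pd.DataFrame(
    {'Product': ['Apple', 'Banana', 'Orange'], 'Price': [1.0, 0.5, 0.75]},
    index=['P1', 'P2', 'P3'],
)

# Label slice: both endpoints are included, so this returns P1 and P2
by_label = df_demo.loc['P1':'P2']
print(by_label)

# Position slice: the end is excluded, so this returns only the first row
by_position = df_demo.iloc[0:1]
print(by_position)

# Single cell, addressed by labels and by positions
price_label = df_demo.loc['P2', 'Price']
price_pos = df_demo.iloc[1, 1]
print(price_label, price_pos)
```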

## 7. Boolean Indexing
Filter rows based on conditions using boolean masks.

```python
# Select rows where Price > 0.6
print("Boolean indexing (Price > 0.6):\n", df[df['Price'] > 0.6])
```
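
It can help to look at the mask on its own. In the sketch below (variable names are illustrative), the comparison produces a boolean Series, and passing that mask to `loc` together with a column label filters rows and picks a column in one step.

```python
import pandas as pd

prices = pd.DataFrame(
    {'Product': ['Apple', 'Banana', 'Orange'], 'Price': [1.0, 0.5, 0.75]},
    index=['P1', 'P2', 'P3'],
)

# The comparison yields one True/False value per row
mask = prices['Price'] > 0.6
print(mask)

# Rows where the mask is True, restricted to the Product column
expensive_names = prices.loc[mask, 'Product']
print(expensive_names)
```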

## 8. Selection Based on Conditions
Combine conditions for more complex filtering.

```python
# Select rows where Product is 'Apple' or Price > 0.6
condition = (df['Product'] == 'Apple') | (df['Price'] > 0.6)
print("Complex condition:\n", df[condition])
```
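
Besides `|` (or), conditions can be combined with `&` (and) and inverted with `~` (not). Each comparison needs its own parentheses because these operators bind more tightly than `==` or `>`. The sketch below, using an illustrative `items` frame, also shows `isin` for matching against a list of values.

```python
import pandas as pd

items = pd.DataFrame({'Product': ['Apple', 'Banana', 'Orange'],
                      'Price': [1.0, 0.5, 0.75]})

# AND: not Apple and priced above 0.6
both = items[(items['Product'] != 'Apple') & (items['Price'] > 0.6)]
print(both)

# NOT: invert a condition with ~
not_cheap = items[~(items['Price'] < 0.6)]
print(not_cheap)

# Membership test against a list of allowed values
selected = items[items['Product'].isin(['Apple', 'Banana'])]
print(selected)
```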

## 9. Adding and Deleting Columns
Modify DataFrames by adding or removing columns.

```python
# Adding a new column
df['Stock'] = [100, 200, 150]
print("After adding Stock column:\n", df)

# Deleting a column
df = df.drop('Stock', axis=1)
print("\nAfter deleting Stock column:\n", df)
```
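
New columns are often computed from existing ones rather than typed in. The sketch below derives a hypothetical `Value` column from `Price` and `Stock`, then drops two columns at once. Note that `drop` returns a new DataFrame and leaves the original unchanged unless you reassign it.

```python
import pandas as pd

inventory = pd.DataFrame({'Product': ['Apple', 'Banana', 'Orange'],
                          'Price': [1.0, 0.5, 0.75],
                          'Stock': [100, 200, 150]})

# Derive a column element-wise from two existing columns
inventory['Value'] = inventory['Price'] * inventory['Stock']
print(inventory)

# Drop several columns; the result is a new DataFrame
trimmed = inventory.drop(columns=['Stock', 'Value'])
print(trimmed)
```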

## 10. Handling Missing Data
Handle missing values by dropping the affected rows or by filling them in.

```python
# Dropping rows with missing values (returns a new DataFrame; df_with_na is unchanged)
df_dropped = df_with_na.dropna()
print("After dropping missing values:\n", df_dropped)

# Filling missing values with the column mean
df_with_na['Price'] = df_with_na['Price'].fillna(df_with_na['Price'].mean())
print("\nAfter filling missing values:\n", df_with_na)
```
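
Filling and dropping can both be targeted at specific columns. The sketch below uses an illustrative `orders` frame with gaps in two columns. It passes `fillna` a dictionary that gives a different fill value per column, and restricts `dropna` to the `Price` column with `subset`.

```python
import numpy as np
import pandas as pd

orders = pd.DataFrame({'Product': ['Apple', None, 'Orange'],
                       'Price': [1.0, 0.5, np.nan]})

# Per-column fill values: a placeholder name and a zero price
filled = orders.fillna({'Product': 'Unknown', 'Price': 0.0})
print(filled)

# Drop a row only if Price is missing; a missing Product is tolerated
kept = orders.dropna(subset=['Price'])
print(kept)
```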

## 11. Grouping and Aggregation
Group data by a column and perform aggregations like sum, mean, etc.

```python
# Adding a Category column for grouping
df['Category'] = ['Fruit', 'Fruit', 'Fruit']
grouped = df.groupby('Category').agg({'Price': ['mean', 'sum']})
print("Grouped by Category:\n", grouped)
```
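
Grouping is easier to see when there is more than one group. The sketch below uses made-up sales data with two categories and named aggregation, which gives each output column a flat, readable name instead of a two-level header.

```python
import pandas as pd

sales = pd.DataFrame({
    'Category': ['Fruit', 'Fruit', 'Vegetable', 'Vegetable'],
    'Product': ['Apple', 'Banana', 'Carrot', 'Pepper'],
    'Price': [1.0, 0.5, 0.25, 0.75],
})

# Named aggregation: output_name=(source_column, function)
summary = sales.groupby('Category').agg(
    mean_price=('Price', 'mean'),
    total_price=('Price', 'sum'),
)
print(summary)
```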

## 12. Merging and Joining DataFrames
Combine DataFrames using merge or join operations.

```python
# Creating another DataFrame
df2 = pd.DataFrame({'Product': ['Apple', 'Banana'], 'Region': ['North', 'South']})

# Merging DataFrames
# A left merge keeps every row of df; products missing from df2 get NaN in Region.
# The result has a fresh integer index, so the P1..P3 labels are not carried over.
merged = pd.merge(df, df2, on='Product', how='left')
print("Merged DataFrame:\n", merged)
```
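
The `how` argument decides which rows survive a merge. The sketch below uses two illustrative frames that only partly overlap and compares an inner merge with an outer merge. Passing `indicator=True` adds a `_merge` column that records where each row came from.

```python
import pandas as pd

left = pd.DataFrame({'Product': ['Apple', 'Banana', 'Orange'],
                     'Price': [1.0, 0.5, 0.75]})
right = pd.DataFrame({'Product': ['Apple', 'Banana', 'Grape'],
                      'Region': ['North', 'South', 'East']})

# Inner merge: only products present in both frames
inner = pd.merge(left, right, on='Product', how='inner')
print(inner)

# Outer merge: every product from either frame, with gaps filled by NaN
outer = pd.merge(left, right, on='Product', how='outer', indicator=True)
print(outer)
```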