Efficient Image Data Loading and Preprocessing for Deep Learning
Loading and preprocessing image data efficiently is critical for training performant deep learning models. This project demonstrates how to load, preprocess, batch, and visualize image datasets using TensorFlow/Keras utilities, ensuring the pipeline is optimized for GPU training and scalable datasets.
The notebook is organized into key steps:
- Directory-based dataset loading – Use `tf.keras.utils.image_dataset_from_directory` to automatically label and split datasets.
- Exploring dataset properties – View shapes, class names, and sample counts.
- Data preprocessing – Resize, normalize, and prepare images for model ingestion.
- Performance optimization – Apply `cache()`, `shuffle()`, and `prefetch()` for efficient training throughput.
- Visualization – Display batches of images with their labels for inspection.
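The steps above can be sketched end to end. The snippet below writes a few dummy JPEGs to a temporary directory so it runs anywhere; the directory layout and the two class names (`cat`, `dog`) are illustrative stand-ins for a real dataset:

```python
import pathlib
import tempfile
import tensorflow as tf

# Build a tiny throwaway dataset: two classes, four images each.
root = pathlib.Path(tempfile.mkdtemp()) / "dataset"
for cls in ("cat", "dog"):
    (root / cls).mkdir(parents=True)
    for i in range(4):
        img = tf.image.encode_jpeg(
            tf.cast(tf.random.uniform((64, 64, 3), maxval=256), tf.uint8))
        tf.io.write_file(str(root / cls / f"img{i}.jpg"), img)

# Steps 1-2: load, label from subfolder names, resize on the fly.
ds = tf.keras.utils.image_dataset_from_directory(
    root, image_size=(180, 180), batch_size=4, shuffle=True, seed=42)
class_names = ds.class_names  # inferred from subfolder names
print(class_names)

# Step 3: normalize pixel values to [0, 1].
ds = ds.map(lambda x, y: (x / 255.0, y))

# Step 4: optimize throughput.
AUTOTUNE = tf.data.AUTOTUNE
ds = ds.cache().shuffle(100).prefetch(buffer_size=AUTOTUNE)

for images, labels in ds.take(1):
    print(images.shape, labels.shape)  # (4, 180, 180, 3) (4,)
```

Note that `class_names` must be read before chaining `map()` or other transformations, since the attribute lives on the dataset object returned by `image_dataset_from_directory`, not on derived datasets.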
Libraries used (from the code):
- TensorFlow / Keras – Dataset loading, preprocessing, and pipeline optimization.
- Matplotlib – Visualizing sample images and labels.
- NumPy – Basic numerical handling (if used).
Dataset: not provided explicitly – the notebook expects image data in a local directory structure, where subfolder names correspond to class labels.
Requirements:
pip install tensorflow matplotlib numpy
Run the notebook:
jupyter notebook image_data_loader.ipynb
or in JupyterLab:
jupyter lab image_data_loader.ipynb
Ensure the dataset is organized in a directory with subfolders for each class:
dataset/
├── class1/
│   ├── image1.jpg
│   └── image2.jpg
└── class2/
    ├── image3.jpg
    └── image4.jpg
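One detail worth knowing about this layout: `image_dataset_from_directory` assigns integer labels by sorting the subfolder names alphabetically, not by creation or listing order. A small self-contained check (folder names here are illustrative):

```python
import pathlib
import tempfile
import tensorflow as tf

# Create classes deliberately out of alphabetical order.
root = pathlib.Path(tempfile.mkdtemp()) / "dataset"
for cls in ("dogs", "cats", "birds"):
    (root / cls).mkdir(parents=True)
    img = tf.image.encode_jpeg(tf.zeros((8, 8, 3), tf.uint8))
    tf.io.write_file(str(root / cls / "img0.jpg"), img)

ds = tf.keras.utils.image_dataset_from_directory(
    root, image_size=(8, 8), batch_size=1)

# Labels follow sorted folder names: birds -> 0, cats -> 1, dogs -> 2.
print(ds.class_names)  # ['birds', 'cats', 'dogs']
```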
- Successfully loaded and labeled images directly from directory structure.
- Normalized and resized images to a consistent shape for model compatibility.
- Optimized pipeline with caching, shuffling, and prefetching to reduce training bottlenecks.
- Verified correct label assignment through visualization.
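Normalization can be done in two common ways: dividing inside the `tf.data` pipeline with `map()`, or using a `tf.keras.layers.Rescaling` layer so the scaling ships with the model and is applied automatically at inference. A minimal sketch with a random stand-in batch:

```python
import tensorflow as tf

# Stand-in batch of uint8 pixels in [0, 255].
images = tf.cast(
    tf.random.uniform((4, 180, 180, 3), maxval=256, dtype=tf.int32),
    tf.uint8)

# 1) Inside the tf.data pipeline via map():
scaled_a = tf.cast(images, tf.float32) / 255.0

# 2) As a Keras layer, typically the first layer of the model:
rescale = tf.keras.layers.Rescaling(1.0 / 255)
scaled_b = rescale(images)

print(float(tf.reduce_max(scaled_a)), float(tf.reduce_max(scaled_b)))
```

The layer-based approach is often preferable because the preprocessing cannot be accidentally omitted when the model is served.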
Example dataset info:
Image shape: (180, 180, 3)
Number of classes: 5
Class names: ['cat', 'dog', 'bird', 'fish', 'horse']
Visualization sample:
[Image of class 'cat'] [Image of class 'dog'] [Image of class 'bird'] ...
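A visualization grid like the one above can be produced with Matplotlib. This sketch substitutes a random batch for `dataset.take(1)` so it runs standalone; the class names are the ones listed above:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so this runs without a display
import matplotlib.pyplot as plt
import tensorflow as tf

class_names = ["cat", "dog", "bird", "fish", "horse"]
# Stand-in batch; in the notebook, images/labels come from dataset.take(1).
images = tf.random.uniform((9, 180, 180, 3))
labels = tf.random.uniform((9,), maxval=len(class_names), dtype=tf.int32)

fig = plt.figure(figsize=(6, 6))
for i in range(9):
    ax = plt.subplot(3, 3, i + 1)
    ax.imshow(images[i].numpy())        # expects floats in [0, 1]
    ax.set_title(class_names[int(labels[i])])
    ax.axis("off")
fig.savefig("sample_batch.png")
```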
Prefetch optimization:
AUTOTUNE = tf.data.AUTOTUNE
dataset = (dataset
           .cache()                          # keep decoded images in memory after the first epoch
           .shuffle(1000)                    # shuffle within a 1000-element buffer
           .prefetch(buffer_size=AUTOTUNE))  # overlap preprocessing with training
- `image_dataset_from_directory` simplifies loading while handling labeling automatically.
- Proper caching and prefetching significantly improve GPU utilization.
- Visual inspection ensures the dataset is loaded correctly before training.
- A well-prepared data pipeline prevents downstream model performance issues.
💡 Some interactive outputs (e.g., plots, widgets) may not display correctly on GitHub. If so, please view this notebook via nbviewer.org for full rendering.
Mehran Asgari Email: imehranasgari@gmail.com GitHub: https://github.com/imehranasgari
This project is licensed under the Apache 2.0 License – see the LICENSE file for details.