This repository provides a tutorial series for learning computer vision, progressing from the basics to more advanced techniques. It is aimed at beginners and at readers who want to deepen their understanding of computer vision and image processing.
- Image Processing: Learn how to manipulate and process images using various techniques like thresholding, edge detection, and contour finding.
- Object Detection: Understand how to detect and recognize shapes and objects in images.
- Real-Time Video Manipulation: Work with live video feeds, perform image transformations, and integrate real-time camera inputs.
- Thresholding and Color Transformation: Learn about color spaces, including converting between RGB, HSV, and grayscale images.
- Contour Detection: Explore techniques for identifying and extracting contours from images.
- OpenCV: The primary library for image processing, object detection, and video manipulation.
- NumPy: Used for matrix and array operations required in image manipulation.
- Hands-On Examples: Includes practical examples of basic and advanced computer vision tasks.
- Interactive Tutorials: Code snippets and tutorials for users to implement on their own systems.
- Live Camera Feed Processing: Real-time video processing and manipulation using a webcam.
- Computer Vision Tutorials
- Skills and Concepts Covered
- Tutorials
- 📷1. Load and Inspect an Image
- 🎯2. Access and Modify Pixels in an Image
- ✨3. Compare Blurring Techniques
- 🎨4. Grayscale Conversion and Cropping
- 🔍5. Extracting Region of Interest (ROI) from an Image
- 🖍️6. Drawing Shapes and Text on an Image
- 🔍7. Contour Detection
- 🔁8. Image Translation and Rotation
- 🌫9. Gaussian Blur Effect
- 🔧10. Bitwise Operations on Shapes
- 🔀11. Merging Two Images
- 📌12. Erosion and Dilation
- 🎯13. Real-Time Red Color Detection
- 🖼️14. Image Resizer GUI (Python + Tkinter + OpenCV)
- 🟢15. Real-Time Shape Detection
- 16. Image Viewer using Tkinter and OpenCV
- 17. Image Thresholding
- 18. Live Camera with GrayScale
Shows how to load an image with OpenCV using different read flags (color, grayscale, or unchanged). The script prints the raw pixel data and the image shape, then displays the image in a window.
Shows how to read an image with OpenCV, access and modify individual pixel values, and change a specific region of interest (ROI). Both the original and the modified image are displayed for comparison.
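Because OpenCV images are NumPy arrays, pixel and ROI edits are plain array indexing. The sketch below uses a synthetic black image in place of a loaded file; the coordinates and colors are arbitrary choices for illustration.

```python
import numpy as np

# Synthetic 100x100 black BGR image standing in for a loaded photo
img = np.zeros((100, 100, 3), dtype=np.uint8)
original = img.copy()             # untouched copy kept for comparison

# Indexing is img[row, column], i.e. img[y, x]; each pixel holds (B, G, R)
print("before:", img[10, 20])
img[10, 20] = (255, 0, 0)         # set one pixel to pure blue
print("after: ", img[10, 20])

# Slicing selects a rectangular ROI: rows 30..59, columns 40..79
img[30:60, 40:80] = (0, 255, 0)   # fill the region with green

# Count how many pixel positions differ from the original
changed = np.count_nonzero(np.any(img != original, axis=2))
print("pixels changed:", changed)
```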
Shows how to apply and compare three blurring techniques (average, Gaussian, and median) on an image. The results are resized and displayed side by side for visual comparison.
Loads a color image, converts it to grayscale, and crops a specific region from it. The cropped grayscale portion is then displayed in a window using OpenCV.
Loads an image and extracts a specific Region of Interest (ROI) from it using slicing. It then displays the cropped section in a separate window, helping you focus on a particular portion of the image for further analysis or processing.
Shows how to draw basic shapes (a circle, a line, and a rectangle) and how to overlay custom text on an image using OpenCV.
- Useful for image annotation, object highlighting, or creating overlays for tutorials and projects.
Shows how to detect and draw contours in an image using OpenCV. Contours are useful for object detection, shape analysis, and image segmentation. The code first converts the image to grayscale, applies binary thresholding, and then finds and visualizes the external contours.
Shows how to translate (move) and rotate an image using affine transformations in OpenCV.
- The image is first moved in the X and Y directions using a translation matrix.
- Then, it's rotated around its center by a specified angle using a rotation matrix.
- These operations are useful in image augmentation and geometric transformations.
Applies a Gaussian blur to the image using a 5x5 kernel and a sigma value of 8. Gaussian blur reduces image noise and fine detail and is often used as a preprocessing step before edge detection.
Shows how to use bitwise operations (AND, OR, NOT) on two geometric shapes (a rectangle and a circle). These operations are useful for image masking, blending, and region-of-interest filtering in computer vision.
- bitwise_and: shows the overlapping area between the shapes.
- bitwise_or: shows the combined area of both shapes.
- bitwise_not: inverts the rectangle image (white becomes black and vice versa).
Shows how to merge two images using OpenCV by:
- Resizing them to the same dimensions (required for arithmetic operations).
- Using cv2.add() to perform pixel-wise addition of the two images. The addition saturates, so any sum above 255 is clipped to 255.
This script demonstrates Morphological Transformations in image processing using OpenCV:
- Erosion reduces white regions (foreground), shrinking object boundaries.
- Dilation increases white regions, expanding object boundaries.
This script detects red-colored objects from a webcam feed in real-time using the HSV (Hue, Saturation, Value) color space. Key Concepts:
- HSV Color Space is better for color detection than BGR because it separates color info (hue) from brightness (value).
- Red Hue wraps around the HSV color wheel, so two ranges are needed to detect it effectively.
A simple Python desktop application that allows users to open an image from their computer, view it, and resize it dynamically using a graphical interface. The app uses Tkinter for the GUI and OpenCV for image processing and display.
This Python script uses OpenCV to detect and classify basic geometric shapes (Triangle, Square, Rectangle, Pentagon, Hexagon, Circle) in real time using your webcam.
It processes each frame by applying thresholding, contour detection, and polygon approximation to recognize the shape based on the number of sides and other features like aspect ratio and circularity.
This is a simple GUI-based image viewer built with Python's Tkinter and OpenCV. It allows users to open and view image files (.jpg, .jpeg, .png, .bmp) using a file menu.
This project demonstrates binary and inverse binary thresholding using OpenCV. It reads an image, converts it to grayscale, and applies both thresholding techniques to display the results.
This Python script captures live video from the webcam and shows both the original feed and a grayscale version in real time. Press 'q' to exit the video feed and close the windows.