
🚗 Autonomous EV Perception System

A ROS 2-Based Real-Time AI Perception Pipeline for Autonomous Electric Vehicles



🧠 Project Overview

This project implements the perception stack of an Autonomous Electric Vehicle (AEV) using ROS 2 as the integration backbone. It combines classical computer vision techniques with deep learning models to achieve real-time lane detection, vehicle detection, and traffic sign recognition: three core components of any autonomous driving perception system.

The system is built from scratch on Ubuntu (running under WSL2 on Windows) and demonstrates end-to-end AI system integration: raw camera frames enter the pipeline, pass through a sequence of modular processing stages, and emerge as structured ROS 2 topic messages that downstream planning and control nodes can consume.


🏗️ System Architecture

┌─────────────────────────────────────────────────────────────────────────────┐
│                        AUTONOMOUS EV PERCEPTION PIPELINE                    │
│                          (ROS 2 Node: LaneDetectionNode)                    │
└─────────────────────────────────────────────────────────────────────────────┘

  📷 RAW CAMERA FRAME (Video Input / Live Feed)
          │
          ▼
  ┌───────────────────┐
  │  CAMERA           │   ALD_CAMERA_CAL.py
  │  CALIBRATION      │   -  Chessboard pattern (9×6 grid)
  │                   │   -  cv2.calibrateCamera()
  │  Removes lens     │   -  Computes mtx, dist, newcameramtx
  │  distortion       │   -  cv2.undistort() applied to each frame
  └────────┬──────────┘
           │ Undistorted Frame
           ▼
  ┌───────────────────┐
  │  COLOR &          │   ALD_COLORANDGRAD.py
  │  GRADIENT         │   -  HLS color space → S-channel binary
  │  THRESHOLDING     │   -  RGB → R-channel binary
  │                   │   -  Canny edge detection (low=200, high=140)
  │  Isolates lane    │   -  Bitwise OR fusion of all three masks
  │  markings         │
  └────────┬──────────┘
           │ Binary Combined Image
           ▼
  ┌───────────────────┐
  │  BIRD'S EYE VIEW  │   ALD_BEV.py
  │  (BEV) TRANSFORM  │   -  src: [(600,450),(750,450),(1100,700),(250,700)]
  │                   │   -  dst: [(200,0),(1200,0),(900,720),(200,720)]
  │  Top-down road    │   -  cv2.getPerspectiveTransform()
  │  perspective      │   -  cv2.warpPerspective()
  └────────┬──────────┘
           │ Warped Binary Image
           ▼
  ┌───────────────────┐
  │  HISTOGRAM &      │   ALD_HISTORGRAM.py
  │  SLIDING WINDOW   │   -  Column histogram on bottom half
  │  LANE DETECTION   │   -  9 sliding windows (margin=100px, minpix=50)
  │                   │   -  Tracks left & right lane pixel clusters
  │  Detects curved   │   -  np.polyfit(y, x, deg=2) → 2nd order polynomial
  │  lane geometry    │   -  Overlays green/yellow fitted curves
  └────────┬──────────┘
           │ Lane-Annotated Image
           ▼
  ┌────────────────────────────────────────────┐
  │         ROS 2 PUBLISHER NODE               │   ALD.py
  │         (LaneDetectionNode @ 20 Hz)        │
  │                                            │
  │  Topic: /lane_detected_image  → Image msg  │
  │  Topic: /lane_status          → String msg │
  └────────────────────────────────────────────┘


  ═══ PARALLEL AI MODULES (Run Independently) ══════════════════════

  📷 RAW FRAME ──► Object_classification.py
                    -  Model: SSD MobileNet v2 (TF Hub, CPU-optimized)
                    -  Input: 300×300 RGB tensor
                    -  Classes: car, bus, truck, bicycle, person, motorcycle
                    -  Confidence threshold: 0.50
                    -  Frame skip: every 2nd frame (performance optimization)
                    -  Bounding box overlay with class-specific colors

  📷 32×32 SIGN IMAGE ──► traffic_signs_cnn.py
                    -  Custom CNN trained on GTSRB dataset
                    -  43-class traffic sign classification
                    -  Architecture: LeNet-5 inspired (details below)
                    -  Normalization: (pixel - 128) / 128.0

🤖 Custom CNN Architecture — Traffic Sign Classifier

This is the custom Neural Network built entirely from scratch in TensorFlow/Keras. Inspired by the LeNet-5 architecture, modified for 43-class traffic sign recognition.

Input Image: 32×32×3 (RGB, normalized to [-1, 1])
          │
          ▼
┌─────────────────────────────────────────────────┐
│  CONV LAYER 1                                   │
│  Conv2D(filters=6, kernel=5×5, stride=1)        │
│  Activation: ReLU                               │
│  Output: 28×28×6                                │
│  MaxPooling2D(pool=2×2, stride=2)               │
│  Output: 14×14×6                                │
└────────────────────┬────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────┐
│  CONV LAYER 2                                   │
│  Conv2D(filters=16, kernel=5×5, stride=1)       │
│  Activation: ReLU                               │
│  Output: 10×10×16                               │
│  MaxPooling2D(pool=2×2, stride=2)               │
│  Output: 5×5×16 = 400 neurons                   │
└────────────────────┬────────────────────────────┘
                     │
                     ▼
              Flatten → 400
                     │
                     ▼
┌─────────────────────────────────────────────────┐
│  FULLY CONNECTED LAYERS                         │
│  Dense(120, activation=ReLU)                    │
│  Dense(80,  activation=ReLU)                    │
│  Dense(43,  activation=Softmax)  ← 43 classes   │
└─────────────────────────────────────────────────┘
          │
          ▼
   Predicted Traffic Sign Class
   (e.g., "Stop Sign", "Speed Limit 50", ...)

Training Config:
  -  Optimizer : Adam
  -  Loss      : Categorical Cross-Entropy
  -  Epochs    : 10
  -  Batch Size: 256
  -  Dataset   : GTSRB (German Traffic Sign Recognition Benchmark)
                Train: 34,799 | Valid: 4,410 | Test: 12,630 images
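The feature-map sizes in the diagram follow directly from "valid" 5×5 convolutions (output side = input side minus 4) and 2×2 max-pooling (side halved). A quick arithmetic check of the flattened size, together with the per-pixel normalization applied to inputs:

```python
# Verify the feature-map sizes claimed in the architecture diagram.
def conv_valid(n, k=5):
    """Output side length of a 'valid' (no-padding) k×k convolution."""
    return n - k + 1

def pool(n, p=2):
    """Output side length of non-overlapping p×p max-pooling."""
    return n // p

n = 32                    # 32×32 input
n = pool(conv_valid(n))   # Conv1: 32 → 28, MaxPool: 28 → 14
n = pool(conv_valid(n))   # Conv2: 14 → 10, MaxPool: 10 → 5
flattened = n * n * 16    # 5 × 5 × 16 feature maps
print(flattened)          # → 400, matching the Flatten layer

# Per-pixel normalization, mapping [0, 255] into [-1, 1):
normalize = lambda pixel: (pixel - 128) / 128.0
print(normalize(0), normalize(255))   # → -1.0 0.9921875
```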

📁 Repository Structure

Autonomous_EV/
│
├── ALD.py                  # ROS 2 main node — integrates full pipeline
├── ALD_CAMERA_CAL.py       # Camera calibration using chessboard images
├── ALD_COLORANDGRAD.py     # Color (HLS/RGB) + Canny gradient thresholding
├── ALD_BEV.py              # Bird's Eye View perspective transform
├── ALD_HISTORGRAM.py       # Sliding window lane pixel detection + polynomial fit
├── lane_detection.py       # Standalone Hough Transform lane detection
├── Object_classification.py # SSD MobileNet v2 vehicle detection (CPU mode)
├── traffic_signs_cnn.py    # Custom CNN — 43-class traffic sign classifier
└── README.md

⚙️ Technology Stack

| Layer | Technology | Purpose |
| --- | --- | --- |
| System Integration | ROS 2 (Humble/Foxy) | Node communication, topic publishing |
| Deep Learning | TensorFlow 2.x + Keras | Custom CNN training & inference |
| Object Detection | TensorFlow Hub (SSD MobileNet v2) | Vehicle detection |
| Computer Vision | OpenCV 4.x | Frame processing, calibration, transforms |
| Numerical Computing | NumPy | Matrix ops, polynomial fitting |
| Visualization | Matplotlib | Training curves, lane overlays |
| Message Bridge | cv_bridge | ROS Image ↔ OpenCV conversion |
| Platform | Ubuntu (WSL2 on Windows) | ROS 2 runtime environment |

🚦 Module Deep-Dive

1. Camera Calibration (ALD_CAMERA_CAL.py)

Uses a 9×6 chessboard pattern to compute the camera intrinsic matrix and distortion coefficients. cv2.calibrateCamera() solves for mtx and dist, while cv2.getOptimalNewCameraMatrix() computes a refined matrix; with alpha=0 the result is cropped so no black borders remain after undistortion. Each frame is passed through cv2.undistort() before entering the pipeline.

2. Color & Gradient Thresholding (ALD_COLORANDGRAD.py)

Three detection strategies are fused into a single binary mask:

  • S-channel (HLS): robust to lighting changes, threshold range [220, 250]
  • R-channel (RGB): strong on white/yellow lane markings, threshold range [220, 250]
  • Canny edges: thresholds 200 and 140 (OpenCV treats the smaller value as the hysteresis low)

The three masks are combined with cv2.bitwise_or() for maximum lane visibility.

3. Bird's Eye View (ALD_BEV.py)

Applies a perspective warp that transforms a trapezoidal road region into a top-down rectangle. Source trapezoid: (600,450)→(750,450)→(1100,700)→(250,700). Destination rectangle: (200,0)→(1200,0)→(900,720)→(200,720). This eliminates perspective distortion and enables accurate polynomial curve fitting of lane lines.

4. Histogram + Polynomial Fit (ALD_HISTORGRAM.py)

A column histogram on the bottom half of the BEV image identifies lane base positions. A 9-window sliding algorithm tracks left and right lane pixel clusters upward through the frame. np.polyfit(y, x, deg=2) fits a 2nd-order polynomial (x = Ay^2 + By + C) to each lane, enabling smooth curve representation.
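The base-finding and curve-fitting steps can be sketched independently of the 9-window tracking loop (omitted here for brevity); function names are illustrative:

```python
import numpy as np

def lane_bases(binary_warped):
    """Locate left/right lane starting x-positions from a column histogram
    computed over the bottom half of the bird's-eye-view image."""
    h = binary_warped.shape[0]
    histogram = binary_warped[h // 2:, :].sum(axis=0)
    midpoint = histogram.shape[0] // 2
    left_base = int(np.argmax(histogram[:midpoint]))
    right_base = int(midpoint + np.argmax(histogram[midpoint:]))
    return left_base, right_base

def fit_lane(ys, xs):
    """2nd-order fit x = A*y**2 + B*y + C, as used in the pipeline."""
    return np.polyfit(ys, xs, deg=2)   # returns (A, B, C)
```

From these bases, the sliding windows (margin=100 px, minpix=50) step upward collecting lane pixels, and fit_lane() is applied to each pixel cluster.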

5. Vehicle Detection (Object_classification.py)

Loads SSD MobileNet v2 from TensorFlow Hub (CPU-optimized, 300×300 input). Detects six COCO classes (car, bus, truck, bicycle, motorcycle, person) at ≥0.50 confidence, processing every 2nd frame to keep CPU load manageable. Color-coded bounding boxes: 🟢 car, 🔵 truck, 🔴 bicycle, 🟡 person.
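The logic around the detector is easy to sketch without loading the model itself: a frame-skip rule plus a confidence filter over the target classes. The class-ID mapping below follows the standard COCO label map and is an assumption about the repo's configuration:

```python
# Illustrative subset of the standard COCO class IDs → labels.
KEEP_CLASSES = {1: "person", 2: "bicycle", 3: "car", 4: "motorcycle",
                6: "bus", 8: "truck"}

def filter_detections(classes, scores, boxes, threshold=0.50):
    """Keep only target classes at or above the confidence threshold."""
    return [(KEEP_CLASSES[c], s, b)
            for c, s, b in zip(classes, scores, boxes)
            if c in KEEP_CLASSES and s >= threshold]

def frames_to_process(frame_indices, skip=2):
    """Run the detector only on every `skip`-th frame (CPU optimization)."""
    return [i for i in frame_indices if i % skip == 0]
```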

6. Traffic Sign CNN (traffic_signs_cnn.py)

Custom LeNet-5 inspired CNN built layer-by-layer in Keras with no pre-built model. Trained on GTSRB (German Traffic Sign Recognition Benchmark) — 43 classes, ~51,000 total images. Input normalized as (pixel - 128) / 128.0 to range [-1, 1]. Adam optimizer with categorical cross-entropy loss. 10 epochs, batch size 256.

7. ROS 2 Integration Node (ALD.py)

LaneDetectionNode extends rclpy.node.Node and runs the complete pipeline at 20 Hz via a ROS timer. Publishes processed frames to /lane_detected_image (sensor_msgs/Image via cv_bridge) and lane status strings to /lane_status (std_msgs/String). Declares video_file as a ROS 2 parameter for runtime configurability.


📡 ROS 2 Topics

| Topic | Message Type | Publisher | Description |
| --- | --- | --- | --- |
| /lane_detected_image | sensor_msgs/Image | LaneDetectionNode | BGR8 processed frame with lane overlay |
| /lane_status | std_msgs/String | LaneDetectionNode | Lane detection status ("Lane detected") |

🛠️ Setup & Installation

Prerequisites

  • Ubuntu 22.04 (native or WSL2 on Windows)
  • ROS 2 Humble Hawksbill
  • Python 3.10+
  • GTSRB dataset (for CNN training)

Step 1 — Clone Repository

mkdir -p ~/ros2_ws/src
cd ~/ros2_ws/src
git clone https://github.com/Zaidfarooqui01/Autonomous_EV.git
cd ~/ros2_ws

Step 2 — Install Dependencies

pip install tensorflow tensorflow-hub opencv-python numpy matplotlib
sudo apt install ros-humble-cv-bridge python3-colcon-common-extensions

Step 3 — Build ROS 2 Workspace

cd ~/ros2_ws
colcon build
source install/setup.bash

Step 4 — Download GTSRB Dataset

Download the pickled GTSRB dataset from Kaggle and place train.p, valid.p, test.p, and signnames.csv in AEV_Datasets/

Step 5 — Train the CNN (One-Time)

python traffic_signs_cnn.py

Step 6 — Run Vehicle Detection

python Object_classification.py

Step 7 — Launch ROS 2 Lane Detection Node

ros2 run autonomous_ev lane_detection_node \
  --ros-args -p video_file:=/path/to/your/video.mp4

Step 8 — Monitor ROS 2 Topics

# In separate terminals:
ros2 topic echo /lane_status
ros2 topic hz  /lane_detected_image

📊 Model Performance

| Model | Dataset | Metric | Value |
| --- | --- | --- | --- |
| Custom CNN (traffic_signs_cnn.py) | GTSRB test set | Accuracy | ~95% |
| SSD MobileNet v2 | COCO | mAP (COCO metric) | 22.0 (published benchmark) |
| Lane detection | Highway video | Visual | Stable polynomial fit at 20 Hz |

CNN test accuracy was reached after 10 epochs on the GTSRB benchmark. LeNet-5-style networks on GTSRB typically achieve 93–97% accuracy.


🔮 Future Enhancements

  • LiDAR point cloud integration (sensor_msgs/PointCloud2)
  • Camera + LiDAR sensor fusion for 3D obstacle detection
  • Path planning node (A* / RRT algorithm)
  • PID controller node for steering angle output
  • Replace video input with live USB/IP camera feed
  • Deployment on NVIDIA Jetson Nano / Raspberry Pi 5
  • ONNX model export for edge inference optimization
  • Docker containerization for reproducible ROS 2 environment

📦 Requirements

tensorflow>=2.12.0
tensorflow-hub>=0.14.0
opencv-python>=4.8.0
numpy>=1.24.0
matplotlib>=3.7.0
pandas>=2.0.0
rclpy                    # via ROS 2 installation
sensor_msgs              # via ROS 2 installation
cv_bridge                # via ROS 2 installation

👤 Author

Mohammad Zaid
B.Tech, Artificial Intelligence and Machine Learning | United College of Engineering & Research, Prayagraj
GitHub: @Zaidfarooqui01

Contributions, issues and feature requests are welcome!


⚠️ Disclaimer

This project is intended for academic, research, and learning purposes only. It does not represent a production-ready autonomous driving system. Never deploy perception-only systems for real vehicle control.


⭐ Star this repo if you found it helpful!
