This project implements the perception stack of an Autonomous Electric Vehicle (AEV) using ROS 2 as the integration backbone. It combines classical computer vision techniques with deep learning models to achieve real-time lane detection, vehicle detection, and traffic sign recognition, three core tasks of an autonomous driving perception system.
The system is built from scratch on Ubuntu (running under WSL2 on Windows) and demonstrates end-to-end AI system integration: raw camera frames enter the pipeline, pass through a sequence of modular processing stages, and emerge as structured ROS 2 topic messages consumable by downstream planning and control nodes.
┌─────────────────────────────────────────────────────────────────────────────┐
│ AUTONOMOUS EV PERCEPTION PIPELINE │
│ (ROS 2 Node: LaneDetectionNode) │
└─────────────────────────────────────────────────────────────────────────────┘
📷 RAW CAMERA FRAME (Video Input / Live Feed)
│
▼
┌───────────────────┐
│ CAMERA │ ALD_CAMERA_CAL.py
│ CALIBRATION │ - Chessboard pattern (9×6 grid)
│ │ - cv2.calibrateCamera()
│ Removes lens │ - Computes mtx, dist, newcameramtx
│ distortion │ - cv2.undistort() applied to each frame
└────────┬──────────┘
│ Undistorted Frame
▼
┌───────────────────┐
│ COLOR & │ ALD_COLORANDGRAD.py
│ GRADIENT │ - HLS color space → S-channel binary
│ THRESHOLDING │ - RGB → R-channel binary
│ │ - Canny edge detection (low=200, high=140)
│ Isolates lane │ - Bitwise OR fusion of all three masks
│ markings │
└────────┬──────────┘
│ Binary Combined Image
▼
┌───────────────────┐
│ BIRD'S EYE VIEW │ ALD_BEV.py
│ (BEV) TRANSFORM │ - src: [(600,450),(750,450),(1100,700),(250,700)]
│ │ - dst: [(200,0),(1200,0),(900,720),(200,720)]
│ Top-down road │ - cv2.getPerspectiveTransform()
│ perspective │ - cv2.warpPerspective()
└────────┬──────────┘
│ Warped Binary Image
▼
┌───────────────────┐
│ HISTOGRAM & │ ALD_HISTORGRAM.py
│ SLIDING WINDOW │ - Column histogram on bottom half
│ LANE DETECTION │ - 9 sliding windows (margin=100px, minpix=50)
│ │ - Tracks left & right lane pixel clusters
│ Detects curved │ - np.polyfit(y, x, deg=2) → 2nd order polynomial
│ lane geometry │ - Overlays green/yellow fitted curves
└────────┬──────────┘
│ Lane-Annotated Image
▼
┌────────────────────────────────────────────┐
│ ROS 2 PUBLISHER NODE │ ALD.py
│ (LaneDetectionNode @ 20 Hz) │
│ │
│ Topic: /lane_detected_image → Image msg │
│ Topic: /lane_status → String msg │
└────────────────────────────────────────────┘
═══ PARALLEL AI MODULES (Run Independently) ══════════════════════
📷 RAW FRAME ──► Object_classification.py
- Model: SSD MobileNet v2 (TF Hub, CPU-optimized)
- Input: 300×300 RGB tensor
- Classes: car, bus, truck, bicycle, person, motorcycle
- Confidence threshold: 0.50
- Frame skip: every 2nd frame (performance optimization)
- Bounding box overlay with class-specific colors
📷 32×32 SIGN IMAGE ──► traffic_signs_cnn.py
- Custom CNN trained on GTSRB dataset
- 43-class traffic sign classification
- Architecture: LeNet-5 inspired (details below)
- Normalization: (pixel - 128) / 128.0
This custom neural network is built from scratch in TensorFlow/Keras. It is inspired by the LeNet-5 architecture and modified for 43-class traffic sign recognition.
Input Image: 32×32×3 (RGB, normalized to [-1, 1])
│
▼
┌─────────────────────────────────────────────────┐
│ CONV LAYER 1 │
│ Conv2D(filters=6, kernel=5×5, stride=1) │
│ Activation: ReLU │
│ Output: 28×28×6 │
│ MaxPooling2D(pool=2×2, stride=2) │
│ Output: 14×14×6 │
└────────────────────┬────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ CONV LAYER 2 │
│ Conv2D(filters=16, kernel=5×5, stride=1) │
│ Activation: ReLU │
│ Output: 10×10×16 │
│ MaxPooling2D(pool=2×2, stride=2) │
│ Output: 5×5×16 = 400 neurons │
└────────────────────┬────────────────────────────┘
│
▼
Flatten → 400
│
▼
┌─────────────────────────────────────────────────┐
│ FULLY CONNECTED LAYERS │
│ Dense(120, activation=ReLU) │
│ Dense(80, activation=ReLU) │
│ Dense(43, activation=Softmax) ← 43 classes │
└─────────────────────────────────────────────────┘
│
▼
Predicted Traffic Sign Class
(e.g., "Stop Sign", "Speed Limit 50", ...)
Training Config:
- Optimizer : Adam
- Loss : Categorical Cross-Entropy
- Epochs : 10
- Batch Size: 256
- Dataset : GTSRB (German Traffic Sign Recognition Benchmark)
Train: 34,799 | Valid: 4,410 | Test: 12,630 images
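The architecture above can be sketched in Keras as follows. This is a minimal reconstruction from the layer descriptions in this README, not the project's exact code; layer names and ordering follow the diagram:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(num_classes=43):
    """LeNet-5-style CNN for 43-class GTSRB sign classification."""
    return models.Sequential([
        layers.Input(shape=(32, 32, 3)),               # normalized RGB input
        layers.Conv2D(6, (5, 5), activation="relu"),   # -> 28x28x6
        layers.MaxPooling2D((2, 2)),                   # -> 14x14x6
        layers.Conv2D(16, (5, 5), activation="relu"),  # -> 10x10x16
        layers.MaxPooling2D((2, 2)),                   # -> 5x5x16
        layers.Flatten(),                              # -> 400
        layers.Dense(120, activation="relu"),
        layers.Dense(80, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ])

model = build_model()
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

With valid padding and stride 1, each 5×5 conv shrinks the spatial size by 4, and each 2×2 pool halves it, which yields the 400-neuron flatten shown in the diagram.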
Autonomous_EV/
│
├── ALD.py # ROS 2 main node — integrates full pipeline
├── ALD_CAMERA_CAL.py # Camera calibration using chessboard images
├── ALD_COLORANDGRAD.py # Color (HLS/RGB) + Canny gradient thresholding
├── ALD_BEV.py # Bird's Eye View perspective transform
├── ALD_HISTORGRAM.py # Sliding window lane pixel detection + polynomial fit
├── lane_detection.py # Standalone Hough Transform lane detection
├── Object_classification.py # SSD MobileNet v2 vehicle detection (CPU mode)
├── traffic_signs_cnn.py # Custom CNN — 43-class traffic sign classifier
└── README.md
| Layer | Technology | Purpose |
|---|---|---|
| System Integration | ROS 2 (Humble/Foxy) | Node communication, topic publishing |
| Deep Learning | TensorFlow 2.x + Keras | Custom CNN training & inference |
| Object Detection | TensorFlow Hub (SSD MobileNet v2) | Vehicle detection |
| Computer Vision | OpenCV 4.x | Frame processing, calibration, transforms |
| Numerical Computing | NumPy | Matrix ops, polynomial fitting |
| Visualization | Matplotlib | Training curves, lane overlays |
| Message Bridge | cv_bridge | ROS Image ↔ OpenCV conversion |
| Platform | Ubuntu (WSL2 on Windows) | ROS 2 runtime environment |
Uses a 9×6 chessboard pattern to compute the camera intrinsic matrix and distortion
coefficients. cv2.calibrateCamera() solves for mtx and dist, while
cv2.getOptimalNewCameraMatrix() computes a refined matrix that crops away the black
borders introduced by undistortion. cv2.undistort() is applied to every frame before
it enters the pipeline.
Three detection strategies are fused:
- S-channel (HLS): robust to lighting changes, threshold range [220, 250]
- R-channel (RGB): strong on white/yellow lane markings, threshold range [220, 250]
- Canny edges: structural edges with thresholds 200 and 140 (OpenCV treats the larger value as the high threshold)
All three masks are combined with cv2.bitwise_or for maximum lane visibility.
Applies a perspective warp that transforms a trapezoidal road region into a top-down
rectangle. Source trapezoid: (600,450)→(750,450)→(1100,700)→(250,700). Destination
rectangle: (200,0)→(1200,0)→(900,720)→(200,720). This eliminates perspective distortion
and enables accurate polynomial curve fitting of lane lines.
A column histogram on the bottom half of the BEV image identifies lane base positions.
A 9-window sliding algorithm tracks left and right lane pixel clusters upward through the
frame. np.polyfit(y, x, deg=2) fits a 2nd-order polynomial (x = Ay^2 + By + C)
to each lane, enabling smooth curve representation.
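The base-finding and polynomial-fit steps can be condensed as follows (a sketch of the approach, omitting the full sliding-window loop):

```python
import numpy as np

def lane_bases(warped_binary):
    """Find left/right lane starting columns from a bottom-half histogram."""
    hist = np.sum(warped_binary[warped_binary.shape[0] // 2:, :], axis=0)
    mid = hist.shape[0] // 2
    left_base = int(np.argmax(hist[:mid]))
    right_base = int(np.argmax(hist[mid:])) + mid
    return left_base, right_base

def fit_lane(ys, xs):
    """Fit x = A*y^2 + B*y + C; fitting x as a function of y
    handles near-vertical lane lines cleanly."""
    return np.polyfit(ys, xs, deg=2)
```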
Loads SSD MobileNet v2 from TensorFlow Hub (CPU-optimized, 300×300 input). Detects
6 COCO classes: car, bus, truck, bicycle, motorcycle, person at ≥0.50 confidence.
Processes every 2nd frame for CPU performance. Color-coded bounding boxes: 🟢 car,
🔵 truck, 🔴 bicycle, 🟡 person.
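The post-processing described above (confidence filter, class filter, frame skipping) can be sketched independently of the detector itself; the TF Hub SSD call is assumed and not shown, and class IDs follow the COCO label map:

```python
# COCO IDs: 1=person, 2=bicycle, 3=car, 4=motorcycle, 6=bus, 8=truck
KEEP_CLASSES = {1, 2, 3, 4, 6, 8}

def filter_detections(boxes, class_ids, scores, threshold=0.50):
    """Keep only detections of the six target classes above the threshold."""
    return [(b, c, s) for b, c, s in zip(boxes, class_ids, scores)
            if s >= threshold and c in KEEP_CLASSES]

def should_process(frame_index, skip=2):
    """Process every 2nd frame to stay real-time on CPU."""
    return frame_index % skip == 0
```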
Custom LeNet-5 inspired CNN built layer-by-layer in Keras with no pre-built model.
Trained on GTSRB (German Traffic Sign Recognition Benchmark) — 43 classes,
~51,000 total images. Input normalized as (pixel - 128) / 128.0 to range [-1, 1].
Adam optimizer with categorical cross-entropy loss. 10 epochs, batch size 256.
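The normalization step stated above maps raw [0, 255] pixel values into [-1, 1):

```python
import numpy as np

def preprocess(img):
    """Normalize uint8 pixels via (pixel - 128) / 128."""
    return (img.astype(np.float32) - 128.0) / 128.0
```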
LaneDetectionNode extends rclpy.node.Node and runs the complete pipeline at
20 Hz via a ROS timer. Publishes processed frames to /lane_detected_image
(sensor_msgs/Image via cv_bridge) and lane status strings to /lane_status
(std_msgs/String). Declares video_file as a ROS 2 parameter for runtime configurability.
| Topic | Message Type | Publisher | Description |
|---|---|---|---|
| `/lane_detected_image` | `sensor_msgs/Image` | `LaneDetectionNode` | BGR8 processed frame with lane overlay |
| `/lane_status` | `std_msgs/String` | `LaneDetectionNode` | Lane detection status ("Lane detected") |
- Ubuntu 22.04 (native or WSL2 on Windows)
- ROS 2 Humble Hawksbill
- Python 3.10+
- GTSRB dataset (for CNN training)
```bash
# Create a ROS 2 workspace and clone the repo
mkdir -p ~/ros2_ws/src
cd ~/ros2_ws/src
git clone https://github.com/Zaidfarooqui01/Autonomous_EV.git
cd ~/ros2_ws

# Install Python and ROS dependencies
pip install tensorflow tensorflow-hub opencv-python numpy matplotlib
sudo apt install ros-humble-cv-bridge python3-colcon-common-extensions

# Build and source the workspace
colcon build
source install/setup.bash
```

Download from Kaggle GTSRB and place train.p, valid.p, test.p, signnames.csv in AEV_Datasets/

```bash
# Train the traffic sign CNN and run vehicle detection
python traffic_signs_cnn.py
python Object_classification.py

# Launch the lane detection node
ros2 run autonomous_ev lane_detection_node \
  --ros-args -p video_file:=/path/to/your/video.mp4

# In separate terminals:
ros2 topic echo /lane_status
ros2 topic hz /lane_detected_image
```

| Model | Dataset | Metric | Value |
|---|---|---|---|
| Custom CNN (traffic_signs_cnn.py) | GTSRB Test Set | Accuracy | ~95% |
| SSD MobileNet v2 | COCO | mAP@0.5 | 22.0 (standard benchmark) |
| Lane Detection | Highway Video | Visual | Stable polynomial fit at 20Hz |
CNN test accuracy achieved after 10 epochs on GTSRB benchmark dataset. Standard LeNet-5 variants on GTSRB typically achieve 93–97% accuracy.
- LiDAR point cloud integration (sensor_msgs/PointCloud2)
- Camera + LiDAR sensor fusion for 3D obstacle detection
- Path planning node (A* / RRT algorithm)
- PID controller node for steering angle output
- Replace video input with live USB/IP camera feed
- Deployment on NVIDIA Jetson Nano / Raspberry Pi 5
- ONNX model export for edge inference optimization
- Docker containerization for reproducible ROS 2 environment
tensorflow>=2.12.0
tensorflow-hub>=0.14.0
opencv-python>=4.8.0
numpy>=1.24.0
matplotlib>=3.7.0
pandas>=2.0.0
rclpy # via ROS 2 installation
sensor_msgs # via ROS 2 installation
cv_bridge # via ROS 2 installation
Mohammad Zaid
B.Tech Artificial Intelligence and Machine Learning | United College of Engineering & Research, Prayagraj
GitHub: @Zaidfarooqui01
Contributions, issues and feature requests are welcome!
This project is intended for academic, research, and learning purposes only. It does not represent a production-ready autonomous driving system. Never deploy perception-only systems for real vehicle control.