This project presents a framework for adaptive video bitrate allocation in teleoperated vehicles. It optimizes video quality under the available bandwidth while accounting for the importance of each camera view: video resolution and bitrate are adjusted dynamically based on the spatiotemporal complexity of the content and the criticality of detected objects, ensuring efficient and effective video transmission for teleoperation.
The solution comprises two main chains:
- Video Processing Chain.
- Object Detection Chain.
- Input Video Segment: The input video is processed in segments to enable real-time adaptation.
- Spatiotemporal Complexity Feature Extraction: This step extracts features describing the spatial and temporal complexity of the video content. These features are critical for predicting the optimal resolution and quality parameters.
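As an illustration, the complexity features can be sketched as ITU-T P.910-style spatial information (SI) and temporal information (TI) measures. The exact features used by the framework are not specified here, so the functions below are an assumption (and they use a plain gradient rather than P.910's Sobel filter):

```python
import numpy as np

def spatial_info(frame: np.ndarray) -> float:
    """Std-dev of the gradient magnitude of a luma frame
    (simplified stand-in for the Sobel-based SI of ITU-T P.910)."""
    gy, gx = np.gradient(frame.astype(np.float64))
    return float(np.std(np.hypot(gx, gy)))

def temporal_info(prev: np.ndarray, cur: np.ndarray) -> float:
    """Std-dev of the inter-frame luma difference (P.910-style TI)."""
    return float(np.std(cur.astype(np.float64) - prev.astype(np.float64)))

def segment_features(frames: list) -> dict:
    """Per-segment features: the maximum SI and TI over the segment."""
    si = max(spatial_info(f) for f in frames)
    ti = max(temporal_info(a, b) for a, b in zip(frames, frames[1:]))
    return {"SI": si, "TI": ti}
```

A busy, fast-moving segment yields high SI/TI and would steer the predictors toward lower resolutions or higher QPs.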
- Optimized Resolution Prediction: Based on the extracted features and the constraints of maximum encoding/decoding time, the system predicts an optimal set of resolutions.
- Optimized QP Prediction: The system predicts an optimal Quantization Parameter (QP) for each resolution to maintain video quality while reducing bitrate.
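A minimal sketch of these two prediction steps, assuming a hypothetical encoding ladder and a toy linear QP model (the framework's actual predictors would be learned from the extracted features):

```python
# Hypothetical ladder: (width, height, estimated encode time per segment in ms).
LADDER = [(1920, 1080, 45.0), (1280, 720, 25.0), (960, 540, 15.0), (640, 360, 8.0)]

def predict_resolutions(max_encode_ms: float) -> list:
    """Keep only the rungs whose estimated encoding time fits the latency budget."""
    return [(w, h) for w, h, t in LADDER if t <= max_encode_ms]

def predict_qp(si: float, ti: float, height: int,
               qp_min: int = 18, qp_max: int = 42) -> int:
    """Toy linear model: more complex content and smaller resolutions
    tolerate a higher QP at the same perceived quality.  The coefficients
    are placeholders, not values from the framework."""
    complexity = 0.5 * si + 0.5 * ti
    qp = qp_min + 0.15 * complexity + 6.0 * (1.0 - height / 1080.0)
    return int(min(qp_max, max(qp_min, round(qp))))
```

In the real system both predictions would come from models trained on encode logs rather than fixed tables and coefficients.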
- JND-based Representation Elimination: A Just Noticeable Difference (JND) model is used to eliminate redundant video representations whose differences would not be perceptible to the human eye, reducing unnecessary data transmission.
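For illustration, a one-JND pruning pass over a quality-annotated ladder might look like the following. It uses the commonly cited value of roughly 6 VMAF points per JND; the framework's actual JND model is not specified here:

```python
def eliminate_below_jnd(reps: list, jnd_vmaf: float = 6.0) -> list:
    """Greedy JND pruning: keep a representation only if it improves quality
    by at least one JND over the last kept one.

    reps: non-empty list of (bitrate_kbps, vmaf), sorted by ascending bitrate.
    """
    kept = [reps[0]]
    for rep in reps[1:]:
        if rep[1] - kept[-1][1] >= jnd_vmaf:
            kept.append(rep)
    return kept
```

Rungs that spend extra bitrate on an imperceptible quality gain are dropped, so the encoder never produces representations the operator cannot distinguish.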
- cVBR Encoder: The capped Variable Bitrate (cVBR) encoder encodes the video at the selected resolution, QP, and bitrate cap to optimize video transmission.
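As a sketch of this step, capped VBR can be realized with x264's constant-quality mode plus a VBV bitrate cap. The helper below only builds the FFmpeg argument list; the preset, zero-latency tuning, and small VBV buffer are assumptions chosen for low-latency teleoperation, not values from the framework:

```python
def cvbr_ffmpeg_cmd(src: str, dst: str, width: int, height: int,
                    qp: int, max_kbps: int) -> list:
    """Build a capped-VBR libx264 command: constant-quality encoding
    (CRF used as a stand-in for the predicted QP) with a hard bitrate
    cap enforced via -maxrate/-bufsize."""
    return [
        "ffmpeg", "-y", "-i", src,
        "-vf", f"scale={width}:{height}",
        "-c:v", "libx264", "-preset", "veryfast", "-tune", "zerolatency",
        "-crf", str(qp),
        "-maxrate", f"{max_kbps}k",
        "-bufsize", f"{max_kbps // 2}k",   # small VBV buffer keeps latency low
        dst,
    ]
```

The command would then be run per segment, e.g. via `subprocess.run(cvbr_ffmpeg_cmd(...), check=True)`.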
- Sensor Data: Sensor data from the vehicle's multiple cameras is fed into the object detection module.
- Object Detection: The system detects objects in the video frames captured by the different cameras (L1 to L6).
- Prioritizing and Importance Weighting: Detected objects are prioritized by importance, and corresponding weights (W1 to W6) are assigned to each camera view.
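One way to turn detections into per-view weights is a class-criticality table; the classes, scores, and base weight below are hypothetical placeholders, not values from the framework:

```python
# Hypothetical criticality of object classes for teleoperation safety.
CRITICALITY = {"pedestrian": 1.0, "cyclist": 0.9, "vehicle": 0.7, "traffic_sign": 0.5}

def camera_weight(detections: list, base: float = 0.1) -> float:
    """Weight a camera view by the most critical object it currently sees.

    detections: list of (class_name, confidence) pairs from the detector.
    The base weight keeps empty views from being starved entirely.
    """
    most_critical = max(
        (CRITICALITY.get(cls, 0.0) * conf for cls, conf in detections),
        default=0.0,
    )
    return base + most_critical
```

The resulting raw weights (W1 to W6) would then be normalized across all six views before bitrate allocation.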
- Acceptable Bitrates Estimation: Using the camera reference and importance weights, the system estimates an acceptable bitrate for each camera view, ensuring critical information is transmitted at higher quality.
- The system integrates with a bandwidth prediction model to ensure that the allocated bitrates are within the available bandwidth, allowing for real-time adjustments and preventing video quality degradation or buffering.
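The weighting and bandwidth-capped allocation described above can be sketched as a proportional split of the predicted bandwidth with a per-view floor; the floor value and the exact allocation rule are assumptions for illustration:

```python
def allocate_bitrates(weights: list, predicted_bw_kbps: float,
                      floor_kbps: float = 250.0) -> list:
    """Split the predicted bandwidth across camera views in proportion to
    their importance weights, guaranteeing every view a minimum floor
    bitrate so no feed goes dark."""
    budget = predicted_bw_kbps - len(weights) * floor_kbps
    if budget < 0:
        raise ValueError("predicted bandwidth cannot cover the per-view floor")
    total = sum(weights)
    return [floor_kbps + budget * w / total for w in weights]
```

Because the allocation always sums to the predicted bandwidth, a drop in the bandwidth forecast immediately scales every view's bitrate down, with the floor protecting low-priority views.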
- Input: Video segments and sensor data from multiple cameras.
- Output: Optimally encoded video stream for teleoperation, with dynamically allocated bitrates based on object importance and available bandwidth.
This framework is inspired by the Quality-Aware Dynamic Resolution Adaptation Framework for Adaptive Video Streaming, which emphasizes efficient video transmission by dynamically adjusting resolution and bitrate based on content complexity and network conditions. The original paper can be accessed here.