A comprehensive computer vision system for analyzing soccer videos using deep learning techniques. The system performs real-time detection of players, ball, and referees, tracks them across frames, assigns team colors, and provides tactical field analysis with coordinate transformations.
- Key Features
- Demo
- Project Structure
- How to Get Started
- In-Depth Pipelines
- In-Depth main.py
- Quick Links to Models and Datasets
- Object Detection: YOLO-based detection of players, ball, and referees
- Multi-Object Tracking: ByteTrack for consistent ID assignment across frames
- Team Assignment: SigLIP embeddings with UMAP + K-means clustering for automated team color detection
- Field Analysis: 29-keypoint field detection and homography transformations for tactical analysis
- Tactical Overlay: Real-time tactical view with a pitch coordinate system
- Video Processing: Comprehensive video analysis with interpolation and annotation
- Keypoint Demo: DRIVE
- Tracking Demo: DRIVE
- Tactical Demo: DRIVE
- Complete Demo: DRIVE
The project follows a modular architecture with strict separation of concerns, where independent core modules are coordinated through specialized pipelines.
**player_detection/** — Core YOLO detection functionality
- detect_players.py # Core detection functions: load_detection_model(), get_detections()
- detection_constants.py # Detection-specific configuration
- training/ # YOLO model training utilities

Classes Detected: 0=Players, 1=Ball, 2=Referee
**player_tracking/** — ByteTrack tracking functionality
- tracking.py # TrackerManager class for consistent ID assignment

Key Features: ByteTrack integration, configurable thresholds
**player_clustering/** — SigLIP + UMAP + K-means for team detection
- embeddings.py # EmbeddingExtractor using SigLIP model
- clustering.py # ClusteringManager with UMAP + K-means

Algorithm: SigLIP embeddings → UMAP reduction → K-means clustering (k=2)
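The reduce-then-cluster step can be sketched as follows. This is an illustrative sketch only: synthetic vectors stand in for real SigLIP embeddings, and scikit-learn's PCA stands in for UMAP (umap-learn's `UMAP` exposes the same `fit_transform` interface, so it can be swapped in directly). `fit_team_clusters` is a hypothetical helper, not the project's `ClusteringManager` API.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

def fit_team_clusters(embeddings, n_components=3, n_teams=2, seed=0):
    """Reduce high-dimensional player-crop embeddings, then cluster into teams.

    PCA is a lightweight stand-in for UMAP here; umap.UMAP can be
    substituted directly since it has the same fit_transform interface.
    """
    reducer = PCA(n_components=n_components, random_state=seed)
    reduced = reducer.fit_transform(embeddings)
    kmeans = KMeans(n_clusters=n_teams, n_init=10, random_state=seed)
    labels = kmeans.fit_predict(reduced)
    return reducer, kmeans, labels

# Synthetic stand-in for SigLIP embeddings: two well-separated groups
rng = np.random.default_rng(0)
team_a = rng.normal(0.0, 0.1, size=(20, 512))
team_b = rng.normal(1.0, 0.1, size=(20, 512))
embeddings = np.vstack([team_a, team_b])

reducer, kmeans, labels = fit_team_clusters(embeddings)
# Each synthetic team should land entirely in one cluster
print(len(set(labels[:20].tolist())), len(set(labels[20:].tolist())))  # 1 1
```

At inference time the fitted `reducer` and `kmeans` are reused per frame: crop each tracked player, embed the crop, then `kmeans.predict(reducer.transform(embedding))` yields the team label.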
**player_annotations/** — Comprehensive annotation system
- annotators.py # AnnotatorManager for drawing detections, tracks, and teams

Supports: Bounding boxes, ellipses, labels, team colors, keypoints
**keypoint_detection/** — 29-point soccer field analysis
- detect_keypoints.py # Core keypoint detection: load_keypoint_model(), get_keypoint_detections()
- keypoint_constants.py # Field specification and keypoint mappings
- training/ # Keypoint model training utilities

Field Points: Corner flags, penalty boxes, center circle, goal areas (29 points total)
**tactical_analysis/** — Homography and pitch coordinate mapping
- homography.py # HomographyTransformer for frame-to-pitch coordinates

Features: ViewTransformer integration, tactical overlay generation
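To make the frame-to-pitch mapping concrete, here is a minimal NumPy sketch of the underlying math: a direct linear transform that fits a 3x3 homography from four keypoint correspondences and projects pixel coordinates into pitch coordinates. This is not the project's `HomographyTransformer` (in practice a library routine such as OpenCV's `findHomography` would typically be used), and the pixel values and 105 m x 68 m pitch dimensions below are illustrative.

```python
import numpy as np

def fit_homography(src_pts, dst_pts):
    """Estimate the 3x3 homography H mapping src_pts -> dst_pts from four
    point correspondences (direct linear transform, h33 fixed to 1)."""
    A, b = [], []
    for (x, y), (u, v) in zip(src_pts, dst_pts):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = np.linalg.solve(np.array(A, float), np.array(b, float))
    return np.append(h, 1.0).reshape(3, 3)

def to_pitch(H, points):
    """Project frame-pixel points into pitch coordinates with H."""
    pts = np.hstack([np.asarray(points, float), np.ones((len(points), 1))])
    mapped = pts @ H.T                       # homogeneous coordinates
    return mapped[:, :2] / mapped[:, 2:3]    # divide out the scale

# Hypothetical example: four detected field keypoints (frame pixels)
# matched to their known pitch positions (metres, 105 x 68 pitch)
frame_pts = [(100, 80), (1180, 90), (1240, 640), (40, 650)]
pitch_pts = [(0, 0), (105, 0), (105, 68), (0, 68)]
H = fit_homography(frame_pts, pitch_pts)

print(to_pitch(H, [(640, 360)]))  # a player's position on the pitch, in metres
```

With 29 detected keypoints rather than four, the system is overdetermined and the homography is fit by least squares, which is more robust to individual keypoint errors.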
Pipelines coordinate between independent modules without creating dependencies:
class TrackingPipeline:
    """End-to-end tracking: Detection → Tracking → Team Assignment → Annotation"""
    # Key Methods:
    - initialize_models()            # Load all required models
    - collect_training_crops()       # Extract player crops for team training
    - train_team_assignment_models() # Train clustering models
    - track_in_video()               # Process complete video with tracking

class DetectionPipeline:
    """Object detection workflows for various input sources"""
    # Key Methods:
    - detect_in_video()      # Video object detection
    - detect_realtime()      # Live detection from webcam
    - detect_frame_objects() # Single frame detection

class KeypointPipeline:
    """Field keypoint detection and analysis"""
    # Key Methods:
    - detect_in_video()           # Video keypoint detection
    - detect_keypoints_in_frame() # Single frame keypoint detection
    - annotate_keypoints()        # Visualize field keypoints

class TacticalPipeline:
    """Complete tactical analysis with field coordinate transformations"""
    # Key Methods:
    - analyze_video()                # Complete tactical video analysis
    - transform_keypoints_to_pitch() # Homography transformations
    - create_tactical_view()         # Generate pitch-view representation
    - create_overlay_frame()         # Overlay tactical view on original

class ProcessingPipeline:
    """Video processing, interpolation, and I/O utilities"""
    # Key Methods:
    - read_video_frames()       # Video input handling
    - write_video_output()      # Video output generation
    - interpolate_ball_tracks() # Ball tracking interpolation
    - generate_output_path()    # Smart output path generation

Soccer_Analysis/
├── Configuration & Entry Points
│   ├── main.py                      # Complete end-to-end analysis pipeline
│   └── constants.py                 # Global configuration and model paths
│
├── Core Modules (Independent)
│   ├── player_detection/            # YOLO object detection
│   ├── player_tracking/             # ByteTrack multi-object tracking
│   ├── player_clustering/           # SigLIP + UMAP + K-means team assignment
│   ├── player_annotations/          # Comprehensive visualization system
│   ├── keypoint_detection/          # 29-point field keypoint detection
│   └── tactical_analysis/           # Homography and coordinate transformations
│
├── Pipeline Coordination Layer
│   └── pipelines/                   # Module coordination (no inter-module dependencies)
│
├── Utilities & Data Processing
│   ├── utils/                       # Video I/O utilities
│   └── Data_utils/                  # Dataset preparation and processing
│       ├── External_Detections/     # COCO/YOLO conversion utilities
│       ├── SoccerNet_Detections/    # SoccerNet detection data processing
│       └── SoccerNet_Keypoints/     # Field keypoint data processing
│
└── Models & Training Data
    └── Models/
        ├── Pretrained/              # Base YOLO models
        └── Trained/                 # Fine-tuned models
git clone <repository-url>
cd Soccer_Analysis
# Install required packages
pip install ultralytics supervision torch torchvision transformers scikit-learn umap-learn pandas numpy opencv-python tqdm more-itertools pillow huggingface_hub

# Using huggingface_hub (Recommended)
python -c "
from huggingface_hub import hf_hub_download
import os, shutil

# Download the object detection model
model_file = hf_hub_download(
    repo_id='Adit-jain/soccana',
    filename='best.pt'
)

# Create the target directory and copy the model into place
os.makedirs('Models/Trained/yolov11_sahi_1280/Model/weights', exist_ok=True)
shutil.copy(model_file, 'Models/Trained/yolov11_sahi_1280/Model/weights/best.pt')
print('Object detection model downloaded!')
"

# Download the keypoint detection model
python -c "
from huggingface_hub import hf_hub_download
import os, shutil

# Download the keypoint model
model_file = hf_hub_download(
    repo_id='Adit-jain/Soccana_Keypoint',
    filename='best.pt'
)

# Create the target directory and copy the model into place
os.makedirs('Models/Trained/yolov11_keypoints_29/Model/weights', exist_ok=True)
shutil.copy(model_file, 'Models/Trained/yolov11_keypoints_29/Model/weights/best.pt')
print('Keypoint detection model downloaded!')
"

# Update the model path to point to your downloaded model
model_path = r"Models\Trained\yolov11_sahi_1280\Model\weights\best.pt"
model_path = PROJECT_DIR / model_path

# Update the keypoint model path
keypoint_model_path = PROJECT_DIR / "Models/Trained/yolov11_keypoints_29/Model/weights/best.pt"

# Input test video path - UPDATE THIS
test_video = r"path\to\your\input\video.mp4"
# Output video path - UPDATE THIS
test_video_output = r"path\to\your\output\video.mp4"

python main.py

# Object detection only
python pipelines/detection_pipeline.py
# Keypoint detection only
python pipelines/keypoint_pipeline.py
# Tactical analysis
python pipelines/tactical_pipeline.py
# Complete tracking with team assignment
python pipelines/tracking_pipeline.py

The system operates through a pipeline architecture in which each stage builds on the previous one:
class CompleteSoccerAnalysisPipeline:
    """8-Stage End-to-End Analysis"""

    # Stage 1: Model Initialization
    def initialize_models():
        # Load YOLO detection model
        # Load YOLO keypoint model
        # Initialize ByteTracker
        # Initialize SigLIP embedding extractor
        # Initialize UMAP + K-means models

    # Stage 2: Team Assignment Training
    def train_team_assignment():
        # Extract video frames (stride=12, first 120*24 frames)
        # Detect players in frames
        # Extract player crops from detections
        # Generate SigLIP embeddings (batch_size=24)
        # Train UMAP dimensionality reduction
        # Train K-means clustering (k=2 teams)

    # Stages 3-7: Frame-by-Frame Processing
    for each_frame:
        # Stage 3: Object Detection (players, ball, referees)
        # Stage 4: Keypoint Detection (29 field points)
        # Stage 5: Multi-Object Tracking (ByteTrack)
        # Stage 6: Team Assignment (crop → embedding → cluster)
        # Stage 7: Tactical Analysis (homography transformation)

    # Stage 8: Post-Processing & Output
    def finalize_output():
        # Ball track interpolation (30-frame limit)
        # Frame annotation with team colors
        # Tactical overlay generation
        # Video output writing

# Object Detection Process
Frame Input → YOLO Model → [
    Class 0: Players (with bounding boxes)
    Class 1: Ball (with confidence scores)
    Class 2: Referees (with positions)
] → Supervision Detections Format

# Multi-Object Tracking Chain
Player Detections → ByteTrack → [
    Consistent Track IDs
    Motion Prediction
    Re-identification
] → Tracked Detections → Team Assignment → [
    Player Crop Extraction
    SigLIP Embedding (512-dim)
    UMAP Reduction (3-dim)
    K-means Clustering (2 teams)
] → Team-Labeled Players

# Field Analysis Process
Frame → YOLO Pose Model → 29 Keypoints → [
    Corner flags (4 points)
    Penalty areas (8 points)
    Goal areas (4 points)
    Center circle (3 points)
    Side touchlines (6 points)
    Goal lines (4 points)
] → Homography Matrix → Pitch Coordinates → Tactical View

┌─────────────────┐  ┌─────────────────┐  ┌─────────────────┐
│ player_detection│  │ player_tracking │  │player_clustering│
│                 │  │                 │  │                 │
│ • YOLO models   │  │ • ByteTrack     │  │ • SigLIP embeds │
│ • Detection API │  │ • Track IDs     │  │ • UMAP + K-means│
└─────────────────┘  └─────────────────┘  └─────────────────┘
                   ▲
                   │
┌──────────────────────────────────────┐
│              pipelines/              │
│                                      │
│  TrackingPipeline coordinates:       │
│  1. Detection → 2. Tracking →        │
│  3. Clustering → 4. Annotation       │
│                                      │
│  NO direct module-to-module calls    │
└──────────────────────────────────────┘
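The "no direct module-to-module calls" rule above is simple dependency inversion: the pipeline owns the modules and passes plain data between them, so modules never import one another. A hypothetical sketch (class and parameter names are illustrative, not the project's API; the lambdas stand in for the real modules):

```python
# Sketch of the coordination pattern: the pipeline receives its modules as
# plain callables and shuttles data through them in a fixed order.
class TrackingPipelineSketch:
    def __init__(self, detector, tracker, clusterer, annotator):
        self.detector = detector      # e.g. wraps player_detection
        self.tracker = tracker        # e.g. wraps player_tracking
        self.clusterer = clusterer    # e.g. wraps player_clustering
        self.annotator = annotator    # e.g. wraps player_annotations

    def process_frame(self, frame):
        detections = self.detector(frame)            # 1. Detection
        tracks = self.tracker(detections)            # 2. Tracking
        teams = self.clusterer(frame, tracks)        # 3. Clustering
        return self.annotator(frame, tracks, teams)  # 4. Annotation

# Stub callables standing in for the real modules
pipeline = TrackingPipelineSketch(
    detector=lambda frame: ["detection"],
    tracker=lambda dets: ["track"],
    clusterer=lambda frame, tracks: ["team"],
    annotator=lambda frame, tracks, teams: ("annotated", tracks, teams),
)
result = pipeline.process_frame("frame")
print(result)  # ('annotated', ['track'], ['team'])
```

Because each module only sees plain data, any one of them can be swapped or unit-tested in isolation.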
main.py is the primary entry point and exposes the CompleteSoccerAnalysisPipeline class:
class CompleteSoccerAnalysisPipeline:
    """Integrates 5 specialized pipelines for complete analysis"""

    def __init__(self, detection_model_path, keypoint_model_path):
        # Initialize all pipeline components
        self.detection_pipeline = DetectionPipeline()    # Object detection
        self.keypoint_pipeline = KeypointPipeline()      # Field keypoints
        self.tracking_pipeline = TrackingPipeline()      # Tracking + teams
        self.tactical_pipeline = TacticalPipeline()      # Tactical analysis
        self.processing_pipeline = ProcessingPipeline()  # Video I/O

- Model Initialization: Load all YOLO models and initialize tracking components
- Team Training: Collect player crops and train team assignment models
- Video Reading: Load video frames for processing
- Frame Analysis:
  - Detect keypoints and objects (players/ball/referees)
  - Update tracking with ByteTrack
  - Assign team colors through clustering
  - Generate tactical coordinates
- Ball Interpolation: Fill missing ball detections using linear interpolation
- Annotation: Draw bounding boxes, IDs, team colors on frames
- Tactical Overlay: Combine original video with tactical field view
- Output Generation: Write final analyzed video
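The ball-interpolation step can be sketched with pandas. This is an illustrative sketch, not the project's `interpolate_ball_tracks` implementation: `interpolate_ball_track` and the sample centres are hypothetical, and `max_gap=30` mirrors the 30-frame limit mentioned in the pipeline stages.

```python
import numpy as np
import pandas as pd

def interpolate_ball_track(centers, max_gap=30):
    """Linearly fill missing ball centres (None), leaving gaps longer
    than max_gap frames unfilled. Illustrative helper, not the project API."""
    rows = [c if c is not None else (np.nan, np.nan) for c in centers]
    df = pd.DataFrame(rows, columns=["x", "y"], dtype=float)
    # limit_area="inside" avoids extrapolating before the first
    # or after the last real detection
    return df.interpolate(method="linear", limit=max_gap, limit_area="inside")

# Hypothetical per-frame ball centres; None marks missed detections
centers = [(10.0, 50.0), None, None, (40.0, 20.0), None]
track = interpolate_ball_track(centers)
print(track.values.tolist())
# [[10.0, 50.0], [20.0, 40.0], [30.0, 30.0], [40.0, 20.0], [nan, nan]]
```

Linear interpolation works well for the ball because its motion is near-linear over short gaps; the frame limit prevents inventing positions across long occlusions.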
Performance:
- Real-time Processing: ~30 FPS on modern GPUs
- Accuracy: >95% player detection, >90% tracking consistency
- Team Assignment: >88% accuracy on standard soccer videos
| Model Type | HuggingFace Repository | Description |
|---|---|---|
| Object Detection | Adit-jain/soccana | YOLO model trained for soccer player, ball, and referee detection |
| Keypoint Detection | Adit-jain/Soccana_Keypoint | YOLO pose model for 29-point soccer field keypoint detection |

| Dataset Type | HuggingFace Repository | Description |
|---|---|---|
| Keypoint Detection | Adit-jain/Soccana_Keypoint_detection_v1 | Annotated soccer field keypoint dataset with 29 field reference points |
| Object Detection | Adit-jain/Soccana_player_ball_detection_v1 | Soccer player, ball, and referee detection dataset with bounding box annotations |
Object Detection Model:
- Classes: Players, Ball, Referee
- Architecture: YOLOv11 with SAHI optimization
- Input Resolution: 1280x1280
- mAP: 0.91 (validation set)
Keypoint Detection Model:
- Keypoints: 29 field reference points
- Architecture: YOLOv11 pose estimation
- Field Coverage: Full FIFA-standard soccer field
- Accuracy: 94.2% keypoint detection rate
Visit the linked repositories for detailed model documentation, training procedures, dataset specifications, and performance benchmarks.
For support or questions, please refer to the model and dataset documentation in the linked HuggingFace repositories.