PATS (Proficiency-Aware Temporal Sampling) is a novel video sampling strategy designed specifically for automated sports skill assessment. Unlike traditional methods that randomly sample frames or use uniform intervals, PATS preserves complete fundamental movements within continuous temporal segments, maintaining the temporal coherence essential for accurate proficiency evaluation.
Paper accepted at: 2025 4th IEEE Sport Technology and Research Workshop
When applied to SkillFormer, PATS achieves:
- +3.05% accuracy improvement in Egocentric views
- +0.65% accuracy improvement in Exocentric views
- +1.05% accuracy improvement in Ego+Exo combined views
- +26.22% accuracy improvement in Bouldering
- +2.39% accuracy improvement in Music
- +1.13% accuracy improvement in Basketball
Key properties:
- Zero computational overhead: operates as a preprocessing step
- Architecture-agnostic: works with any temporal modeling framework
- Adaptive: automatically adjusts to diverse activity characteristics
Athletic proficiency manifests through structured temporal patterns that require observing complete, uninterrupted movements. PATS addresses this fundamental challenge by:
- 📹 Extracting continuous temporal segments rather than isolated frames
- 🎭 Preserving natural movement flow essential for distinguishing expert from novice performance
- 🔄 Distributing multiple segments across the video timeline to maximize information coverage
PATS is controlled by three key parameters:
- Ntarget: Total number of frames to extract from the input video
- Ns: Number of temporal segments to divide the video into (typically 2-12)
- ds: Duration in seconds for each temporal segment (typically 1-3 seconds)
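To make the parameters concrete, the following rough calculation compares the time gap between consecutive sampled frames under uniform sampling and under segment-based sampling. It is only an illustration: the helper name `frame_spacing_seconds`, the 64-second video length, and the assumption that the frame budget is split evenly across segments are not taken from the paper or from app.py.

```python
def frame_spacing_seconds(video_seconds, n_target, n_segments, segment_seconds):
    """Return (uniform_gap, in_segment_gap): seconds between consecutive sampled frames."""
    # Uniform sampling spreads all n_target frames over the whole video.
    uniform_gap = video_seconds / n_target
    # Assumption: the frame budget is split evenly across segments, and each
    # share is packed into one short continuous window of segment_seconds.
    frames_per_segment = n_target / n_segments
    in_segment_gap = segment_seconds / frames_per_segment
    return uniform_gap, in_segment_gap


# Hypothetical 64 s video, Ntarget = 32, Ns = 8, ds = 1.0 s
uniform_gap, in_segment_gap = frame_spacing_seconds(64.0, 32, 8, 1.0)
print(uniform_gap, in_segment_gap)  # 2.0 0.25
```

Under these assumptions, consecutive frames inside a segment are 0.25 s apart instead of 2 s apart, which is what lets a single movement be observed without interruption.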
The algorithm adaptively segments videos to ensure each analyzed portion contains the full execution of critical performance components, repeating this process across multiple non-overlapping segments to maximize information coverage while maintaining temporal coherence.
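The description above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the implementation in app.py: the function name `sketch_segment_indices` is hypothetical, it takes the frame count and frame rate directly instead of a PyAV container, it centers one segment in each equal slice of the timeline, and it splits the frame budget as evenly as possible across segments.

```python
def sketch_segment_indices(total_frames, fps, num_frames, segment_duration, num_segments):
    """Illustrative segment-based sampling; NOT the implementation in app.py."""
    # Length of one continuous segment, in frames.
    seg_len = max(1, min(total_frames, round(segment_duration * fps)))
    # Keep segments non-overlapping: cap the count at what fits in the video.
    num_segments = max(1, min(num_segments, total_frames // seg_len))
    # Split the frame budget as evenly as possible across segments.
    base, extra = divmod(num_frames, num_segments)
    # Divide the timeline into equal slots and center one segment in each slot.
    slot = total_frames / num_segments
    indices = []
    for s in range(num_segments):
        k = base + (1 if s < extra else 0)
        start = int(s * slot + (slot - seg_len) / 2)
        # Spread k frames evenly inside the continuous window.
        for j in range(k):
            indices.append(start + int(j * seg_len / k))
    return indices


# Hypothetical 60 s video at 30 fps: 32 frames from 8 one-second segments
idx = sketch_segment_indices(1800, 30.0, 32, 1.0, 8)
print(len(idx), idx[:4])  # 32 [97, 104, 112, 119]
```

The sketch always returns exactly `num_frames` indices; if a segment is asked for more frames than it contains, some indices repeat. With PyAV, the frame count and rate can usually be read from the video stream (`stream.frames` and `stream.average_rate`), though some containers report a frame count of 0, so a real implementation needs a fallback.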
To use PATS in your project, integrate the sample_frame_indices_efficient_segments function into your video processing pipeline:
import av


def sample_frame_indices_efficient_segments(num_frames, segment_duration, num_segments, container):
    """
    PATS sampling strategy for proficiency-aware temporal sampling.

    Args:
        num_frames (int): Total number of frames to sample (Ntarget)
        segment_duration (float): Duration of each segment in seconds (ds)
        num_segments (int): Number of segments to sample from (Ns)
        container (av.container): PyAV container object

    Returns:
        list: Exactly num_frames frame indices maintaining temporal coherence
    """
    # See app.py for full implementation
    ...


# Example usage
container = av.open("your_video.mp4")

# Sample 32 frames using 8 segments of 1 second each
frame_indices = sample_frame_indices_efficient_segments(
    num_frames=32,
    segment_duration=1.0,
    num_segments=8,
    container=container,
)
# Use these indices to extract frames from your video
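One possible way to turn the returned indices into frames is to walk the decoded frames once and keep only the requested positions. The helper below is a sketch with a hypothetical name (`select_frames`); it works on any iterable, so it can be tried without a video file, and it is not taken from app.py.

```python
def select_frames(frames, indices):
    """Return the items of `frames` at the given positions, in the order of `indices`."""
    wanted = set(indices)
    if not wanted:
        return []
    last = max(wanted)
    picked = {}
    for i, frame in enumerate(frames):
        if i > last:
            break  # stop iterating once the last requested frame has been passed
        if i in wanted:
            picked[i] = frame
    # A repeated index yields the same frame again; indices past the end are skipped.
    return [picked[i] for i in indices if i in picked]


# Stand-in data; with PyAV the iterable would be container.decode(video=0):
# frames = select_frames(container.decode(video=0), frame_indices)
print(select_frames("abcdefgh", [1, 3, 3, 6]))  # ['b', 'd', 'd', 'g']
```

Decoding sequentially avoids seeking, which keeps the sketch simple at the cost of reading every frame up to the last requested index.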
@INPROCEEDINGS{Bian2510:PATS,
AUTHOR="Edoardo Bianchi and Antonio Liotta",
TITLE="{PATS:} {Proficiency-Aware} Temporal Sampling for {Multi-View} Sports Skill
Assessment",
BOOKTITLE="2025 IEEE International Workshop on Sport, Technology and Research (STAR)
(IEEE STAR 2025)",
ADDRESS="Trento, Italy",
PAGES=6,
DAYS=29,
MONTH=oct,
YEAR=2025,
ABSTRACT="Automated sports skill assessment requires capturing fundamental movement
patterns that distinguish expert from novice performance, yet current video
sampling methods disrupt the temporal continuity essential for proficiency
evaluation. To this end, we introduce Proficiency-Aware Temporal Sampling
(PATS), a novel sampling strategy that preserves complete fundamental
movements within continuous temporal segments for multi-view skill
assessment. PATS adaptively segments videos to ensure each analyzed portion
contains full execution of critical performance components, repeating this
process across multiple segments to maximize information coverage while
maintaining temporal coherence. Evaluated on the EgoExo4D benchmark with
SkillFormer, PATS surpasses the state-of-the-art accuracy across all
viewing configurations (+0.65\% to +3.05\%) and delivers substantial gains
in challenging domains (+26.22\% bouldering, +2.39\% music, +1.13\%
basketball). Systematic analysis reveals that PATS successfully adapts to
diverse activity characteristics-from high-frequency sampling for dynamic
sports to fine-grained segmentation for sequential skills-demonstrating its
effectiveness as an adaptive approach to temporal sampling that advances
automated skill assessment for real-world applications."
}
For questions or collaborations, open an issue or contact us at edbianchi@unibz.it or edoardobianchi98@gmail.com.