chaotics-labs/Slice

Chaotics Slice ✂

AI-powered silence remover for video editors
Detects speech with Silero VAD and cuts silences. Runs locally, free forever, always open source.

Python 3.10+ | License: GPL | Active Development


✨ What It Does

Upload a video → Pick an aggression level → Slice removes all the silence.

Give it a video file, choose how aggressive the silence removal should be (Chill → Savage), and Chaotics Slice detects every stretch of speech and cuts everything else. Preview your cuts on an interactive timeline, then either:

  • Export the sliced video as a new file, or
  • Export a cut list (EDL / FCPXML / Premiere XML) to edit in your NLE

All processing happens locally on your machine. Download once, work offline forever.


🎯 Features

  • Local processing: Everything runs on your machine. No uploads, no cloud, no tracking.
  • AI-powered detection: Uses Silero VAD (Voice Activity Detection) to find speech, not just audio levels.
  • Flexible aggression levels: Chill, Normal, Tight, or Savage presets, plus full manual control over thresholds.
  • NLE-ready exports: Cut lists compatible with Final Cut Pro, Premiere Pro, and DaVinci Resolve.
  • Optional GPU acceleration: Auto-detects CUDA (NVIDIA) and MPS (Apple Silicon); falls back to CPU seamlessly.
  • Works offline: After the model downloads once, the app is fully offline-capable.
  • Free and open source: GPL licensed. No paywalls, no ads, no feature gates.

📋 Requirements (All Platforms)

| Dependency | Version | Notes |
| --- | --- | --- |
| Python | 3.10 – 3.12 | 3.13 not yet tested |
| FFmpeg + FFprobe | 6+ | Must be on PATH |
| PyTorch | 2.x | CPU works; GPU optional |
| torchaudio | any | Auto-detected at startup |

GPU acceleration is optional. The app auto-detects CUDA and MPS at startup and falls back to CPU silently.
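
The requirements above can be checked before launching the app. The following is a minimal sketch, not part of the project: the `preflight` helper and its messages are hypothetical, and it only checks the Python version and that the two binaries are on PATH (it does not verify the FFmpeg version).

```python
import shutil
import sys

# Inclusive range taken from the requirements table (3.13 is untested).
SUPPORTED_PYTHON = ((3, 10), (3, 12))


def preflight(version=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means OK.

    `version` and `which` are injectable so the check can be exercised
    without changing the real interpreter or PATH.
    """
    version = tuple(version or sys.version_info[:2])
    low, high = SUPPORTED_PYTHON
    problems = []
    if not (low <= version <= high):
        problems.append(
            f"Python {version[0]}.{version[1]} is outside the tested range 3.10-3.12"
        )
    for tool in ("ffmpeg", "ffprobe"):
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("WARNING:", problem)
```

Run it inside the activated virtual environment; no output means the basic requirements are met.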


🚀 Quick Start (Pick Your Platform)


macOS

1. Install system dependencies

# Install Homebrew if you don't have it
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

brew install python@3.11 ffmpeg

2. Clone, create virtual environment, and install

git clone https://github.com/yourname/chaotics-slice.git
cd chaotics-slice

python3.11 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install torch torchvision torchaudio flask waitress

3. Run

python app.py

Browser opens automatically at http://127.0.0.1:5000. Done!


Windows

1. Install Python

Download Python 3.11 from python.org.
✅ Check "Add Python to PATH" during install.

2. Install FFmpeg

  1. Download a build from ffmpeg.org/download.html (e.g. gyan.dev full build)
  2. Extract to C:\ffmpeg
  3. Add C:\ffmpeg\bin to your System PATH:
    Control Panel → System → Advanced → Environment Variables → Path → Edit → New
  4. Verify: open a fresh terminal and run ffmpeg -version

3. Clone, create virtual environment, and install

git clone https://github.com/yourname/chaotics-slice.git
cd chaotics-slice

python -m venv .venv
.venv\Scripts\activate
pip install --upgrade pip

Choose one based on your GPU:

# NVIDIA (CUDA 12.1):
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# CPU only:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

# Then, for either option:
pip install flask waitress

4. Run

python app.py

Browser opens automatically at http://127.0.0.1:5000. Done!


Linux (Ubuntu / Debian)

1. Install system dependencies

sudo apt update
sudo apt install -y python3.11 python3.11-venv python3-pip ffmpeg git

On Ubuntu 22.04, Python 3.11 may need the deadsnakes PPA:

sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install -y python3.11 python3.11-venv

2. Clone, create virtual environment, and install

git clone https://github.com/yourname/chaotics-slice.git
cd chaotics-slice

python3.11 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip

Choose one based on your GPU:

# NVIDIA (CUDA 12.1):
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

# CPU only:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu

# Then, for either option:
pip install flask waitress

3. Run

python app.py

⚠️ The app won't auto-open a browser on Linux. Visit http://127.0.0.1:5000 manually or Ctrl+Click the URL in the terminal.


Verify GPU Detection

After the app starts, check the terminal for one of the following lines, depending on your hardware:

[Chaotics Slice] torch=2.10.0  device=cuda
[Chaotics Slice] torch=2.10.0  device=mps
[Chaotics Slice] torch=2.10.0  device=cpu

If you expected GPU but see cpu:

  • CUDA: Run nvidia-smi to see the highest CUDA version your driver supports, and confirm it is at least the version in the --index-url you installed from (cu121 needs CUDA 12.1 or newer)
  • Apple Silicon: Ensure you're running native Python: python3 -c "import platform; print(platform.machine())" should output arm64
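
To check detection outside the app, a small script along these lines can help. It is a sketch of the usual PyTorch selection order (CUDA, then MPS, then CPU), not the code in app.py; `pick_device` is a hypothetical name, and it reports `cpu` when PyTorch is not installed at all.

```python
def pick_device():
    """Return "cuda", "mps", or "cpu", preferring a GPU backend when usable."""
    try:
        import torch
    except ImportError:
        return "cpu"  # PyTorch missing: nothing to accelerate
    if torch.cuda.is_available():
        return "cuda"
    # torch.backends.mps is absent on older PyTorch builds.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


if __name__ == "__main__":
    print("device =", pick_device())
```

If this prints `cpu` on a machine with a supported GPU, the installed PyTorch wheel is the most likely cause.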

📹 Using Chaotics Slice

Workflow

  1. Upload: Choose a video file (.mp4, .mkv, .mov, .avi, .webm, .m4v, .flv; up to 8 GB)
  2. Configure: Pick an aggression level or adjust thresholds manually:
    • Chill: Keeps more breathing room; fewer cuts
    • Normal: Balanced; good for most content
    • Tight: Aggressive; minimal silence
    • Savage: Maximum cuts; speech-only edit
  3. Preview: See cuts on the interactive timeline before rendering
  4. Export: Either:
    • Video: Download the sliced video file
    • Cut List: Export EDL, FCPXML, or Premiere XML to edit in your NLE
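
For readers curious what a cut list contains, here is a rough sketch of turning kept segments into EDL-style events. It loosely follows the common CMX3600 layout (event number, reel, track, transition, then source in/out and record in/out). The function names, the `AX` reel name, the title, and the integer non-drop frame rate are all assumptions; the project's own exporter in jobs.py may differ.

```python
def frames_to_timecode(total_frames, fps):
    """Format a frame count as non-drop-frame HH:MM:SS:FF."""
    seconds, frames = divmod(total_frames, fps)
    return (
        f"{seconds // 3600:02d}:{seconds // 60 % 60:02d}:"
        f"{seconds % 60:02d}:{frames:02d}"
    )


def seconds_to_timecode(seconds, fps=25):
    """Round seconds to the nearest frame, then format as a timecode."""
    return frames_to_timecode(round(seconds * fps), fps)


def edl_lines(keep_segments, fps=25, title="Chaotics Slice export"):
    """Build EDL text lines from (start_s, end_s) segments to keep."""
    lines = [f"TITLE: {title}", "FCM: NON-DROP FRAME", ""]
    record_in = 0  # running position on the output timeline, in frames
    for number, (start, end) in enumerate(keep_segments, start=1):
        src_in, src_out = round(start * fps), round(end * fps)
        record_out = record_in + (src_out - src_in)
        timecodes = " ".join(
            frames_to_timecode(f, fps)
            for f in (src_in, src_out, record_in, record_out)
        )
        lines.append(f"{number:03d}  AX       V     C        {timecodes}")
        record_in = record_out
    return lines


if __name__ == "__main__":
    print("\n".join(edl_lines([(1.0, 3.0), (5.0, 6.0)])))
```

Working in whole frames keeps the record timeline contiguous: each event starts exactly where the previous one ended.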

Supported Formats

Video: .mp4, .mkv, .mov, .avi, .webm, .m4v, .flv
Maximum upload: 8 GB
Export: Native sliced video or NLE-compatible cut lists
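
The limits above amount to a simple check on the file name and size. A minimal sketch follows, assuming "8 GB" means 8 GiB and using a hypothetical `validate_upload` helper; the real check in app.py is not shown in this README.

```python
from pathlib import Path

ALLOWED_EXTENSIONS = {".mp4", ".mkv", ".mov", ".avi", ".webm", ".m4v", ".flv"}
MAX_UPLOAD_BYTES = 8 * 1024**3  # assumption: "8 GB" interpreted as 8 GiB


def validate_upload(filename, size_bytes):
    """Return None if the upload is acceptable, else a short reason."""
    suffix = Path(filename).suffix.lower()  # case-insensitive match
    if suffix not in ALLOWED_EXTENSIONS:
        return f"unsupported extension: {suffix or '(none)'}"
    if size_bytes > MAX_UPLOAD_BYTES:
        return "file exceeds the 8 GB upload limit"
    return None


if __name__ == "__main__":
    print(validate_upload("Interview.MOV", 2 * 1024**3))
    print(validate_upload("notes.txt", 1024))
```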

First Run

On your first upload, Silero VAD downloads its model weights (~2 MB) from PyTorch Hub. This requires an internet connection once. After that, the model is cached locally and the app works fully offline.


🏗️ Architecture (For Developers)

Project Structure

chaotics-slice/
├── app.py                    # Flask app + HTTP routes
├── config.py                 # Constants & VAD mode presets
├── ffmpeg.py                 # FFmpeg wrappers (encode, extract audio)
├── vad.py                    # Silero VAD inference + speech detection
├── jobs.py                   # Job queue, cut computation, EDL/XML export
├── test.py                   # Unit tests
├── requirements.txt          # Python dependencies
├── build.bat / build.sh      # PyInstaller bundling scripts
├── chaotics_slice.spec       # PyInstaller spec file
├── static/
│   ├── index.html            # Single-page UI
│   ├── css/ style.css        # Styling
│   ├── js/                   # Frontend logic (app.js, player.js, etc.)
│   └── res/                  # Logo, icons, assets
├── uploads/                  # Temporary upload directory (auto-cleared)
├── outputs/                  # Temporary output directory (auto-deleted)
└── silero_vad/               # Silero VAD submodule & tuning tools
    ├── src/                  # VAD model loading & inference
    └── tuning/               # Threshold optimization utilities

How It Works

  1. Audio Extraction: FFmpeg extracts PCM audio (16 kHz, mono) from the video
  2. VAD Inference: The Silero model identifies speech chunks using configurable thresholds
  3. Cut Computation: Speech chunks are combined with padding and a minimum silence duration to decide which segments to keep
  4. Rendering: FFmpeg re-encodes the video, keeping only the retained speech segments
  5. Export: Generate EDL/FCPXML/Premiere XML for NLE import, or output the final video
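
Step 1 can be pictured as a single FFmpeg invocation. The sketch below builds such a command with standard FFmpeg options (`-vn`, `-ac`, `-ar`, `-c:a`). The helper names are hypothetical and the 16-bit sample format (`pcm_s16le`) is an assumption, since the README only says "PCM"; the actual wrapper lives in ffmpeg.py.

```python
import subprocess


def build_extract_command(video_path, wav_path):
    """Return an FFmpeg argument list that writes 16 kHz mono PCM audio."""
    return [
        "ffmpeg", "-y",          # overwrite the output if it exists
        "-i", str(video_path),   # input video
        "-vn",                   # drop the video stream
        "-ac", "1",              # mono
        "-ar", "16000",          # 16 kHz sample rate
        "-c:a", "pcm_s16le",     # assumption: 16-bit little-endian PCM
        str(wav_path),
    ]


def extract_audio(video_path, wav_path):
    """Run the command; raises CalledProcessError if FFmpeg fails."""
    subprocess.run(
        build_extract_command(video_path, wav_path),
        check=True,
        capture_output=True,
    )


if __name__ == "__main__":
    print(" ".join(build_extract_command("input.mp4", "audio.wav")))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.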

Key Parameters (in config.py)

MODE_PRESETS = {
    "chill":  {"padding": 350, "min_silence": 600},    # More breathing room
    "normal": {"padding": 200, "min_silence": 300},    # Balanced
    "tight":  {"padding": 80,  "min_silence": 150},    # Aggressive
    "savage": {"padding": 30,  "min_silence": 80},     # Minimal silence
}
  • Padding: Milliseconds of audio to keep around each speech chunk
  • Min Silence: Minimum silence duration (ms) before cutting
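
To make the two parameters concrete, here is one plausible way they could combine. This is a sketch, not the implementation in jobs.py. It assumes each speech chunk is padded first, and that any gap still shorter than `min_silence` after padding is kept rather than cut. The `keep_segments` name and the millisecond tuple format are hypothetical.

```python
MODE_PRESETS = {
    "chill":  {"padding": 350, "min_silence": 600},
    "normal": {"padding": 200, "min_silence": 300},
    "tight":  {"padding": 80,  "min_silence": 150},
    "savage": {"padding": 30,  "min_silence": 80},
}


def keep_segments(speech_ms, duration_ms, padding, min_silence):
    """Turn raw speech chunks into padded, merged (start, end) spans to keep.

    All values are in milliseconds. Gaps shorter than `min_silence`
    (measured after padding) are absorbed into the neighbouring span.
    """
    kept = []
    for start, end in sorted(speech_ms):
        start = max(0, start - padding)          # clamp to the file start
        end = min(duration_ms, end + padding)    # clamp to the file end
        if kept and start - kept[-1][1] < min_silence:
            kept[-1][1] = max(kept[-1][1], end)  # gap too short to cut
        else:
            kept.append([start, end])
    return [tuple(span) for span in kept]


if __name__ == "__main__":
    speech = [(1000, 2000), (2500, 4000), (6000, 7000)]
    for mode, params in MODE_PRESETS.items():
        print(mode, keep_segments(speech, 8000, **params))
```

With the sample chunks above, Chill merges the first two chunks into one span, while Savage keeps all three separate with only 30 ms of padding on each side.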

🔧 Development & Contributing

Setting up for development

# Clone and navigate
git clone https://github.com/yourname/chaotics-slice.git
cd chaotics-slice

# Create virtual environment
python3.11 -m venv .venv
source .venv/bin/activate

# Install dev dependencies
pip install -r requirements.txt

# Run tests
python test.py

Building a standalone executable

Uses PyInstaller to bundle the app:

# macOS / Linux:
bash build.sh

# Windows:
build.bat

Output: dist/Chaotics-Slice.app (macOS) or dist/Chaotics-Slice.exe (Windows)

Contributing

We welcome bug reports, feature requests, and pull requests! If you're interested in contributing:

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/your-feature
  3. Make your changes and write/update tests
  4. Submit a pull request with a clear description

Areas we're looking for help with:

  • Performance optimizations (VAD inference, FFmpeg encoding)
  • Additional NLE export formats (Avid AAF, Media Composer, etc.)
  • Batch processing mode
  • GUI improvements and accessibility
  • Language/localization support
  • Platform-specific installers (DMG, MSI, deb/rpm packages)

For major features or architectural changes, please open an issue for discussion first.


🐛 Troubleshooting

| Error | Solution |
| --- | --- |
| ffmpeg: command not found | FFmpeg is not on your PATH. Re-check the install step for your platform and open a fresh terminal. |
| No speech detected | Try Chill mode or lower the VAD Threshold slider. Noisy audio or non-speech content (music, B-roll) may cause misdetection. |
| FFmpeg render failed | Check the Activity log for details. Common causes: corrupted file, unsupported codec, or disk full. |
| torchaudio requires torchcodec | You have torchaudio ≥ 2.9. The app reads audio with the standard-library wave module, so this is handled automatically. If you still see the error, update to the latest app.py. |
| Port 5000 already in use | Edit the last line of app.py to use a different port: serve(app, host="127.0.0.1", port=5001) |
| GPU not detected when expected | Verify CUDA version matches your PyTorch --index-url. On Apple Silicon, confirm native Python: python -c "import platform; print(platform.machine())" → arm64 |
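
For the port conflict, a quick way to find an open port before editing app.py is to try binding to it. This is a standalone sketch using only the standard library; `port_is_free` and `first_free_port` are hypothetical helpers, and another process could still claim the port between the check and the app starting.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if host:port can be bound right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def first_free_port(start=5000, attempts=20, host="127.0.0.1"):
    """Return the first bindable port in [start, start + attempts)."""
    for port in range(start, start + attempts):
        if port_is_free(port, host):
            return port
    raise RuntimeError(f"no free port in {start}-{start + attempts - 1}")


if __name__ == "__main__":
    print("Try port", first_free_port())
```

Use the printed number as the `port` argument in the `serve(...)` call.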

📄 Notes & Caveats

  • Supported audio codecs: AAC, MP3, FLAC, PCM, Opus, Vorbis. Unusual codecs may cause FFmpeg errors.
  • Silero VAD language: Trained on multilingual data; works best with clear speech (English, Ukrainian, Russian, and other languages with similar phonetics).
  • GPU memory: VAD inference is memory-intensive; GPU acceleration is most beneficial on audio files with long, continuous segments.
  • Output file size: Sliced output is typically 60–80% of the original on podcasts; less on videos with substantial B-roll.

📜 License

GPL License. See LICENSE for details.


🙏 Credits & Attribution

  • Silero VAD: VAD model and framework (github.com/snakers4/silero-vad)
  • Flask: Web framework
  • FFmpeg: Audio/video processing
  • PyTorch: ML inference backend

💬 Questions or Feedback?

  • 💡 Feature request? Open a GitHub issue.
  • 🐛 Bug? Describe steps to reproduce; include terminal output and OS/GPU details.
  • 💭 General question? Start a discussion or check existing issues.

About

A lightweight, cross-platform Flask app for automatic speech-based video slicing. Users can upload videos, detect speech segments with Silero VAD, and render only the spoken parts. Works out of the box on Windows and macOS with CPU-only PyTorch, no GPU setup required.
