A command-line tool for backing up Canvas LMS courses and Zoom cloud recordings.
u_crawler automates the backup of your educational content from Canvas Learning Management System, including:
- Course content: Module pages, assignment instructions, and announcements exported as Markdown
- Attachments: PDFs, documents, images, and other files linked in your courses
- Zoom recordings: Cloud recordings from Zoom meetings integrated with Canvas
The tool supports resumable downloads, rate limiting, and incremental syncs to efficiently maintain up-to-date backups.
- Features
- Prerequisites
- Installation
- Quick Start
- Commands
- Configuration
- Zoom Recording Workflow
- Troubleshooting
- Exit Codes
- License
- Canvas course backup: Export module pages and assignments as Markdown files
- Attachment downloads: Automatically download linked files (PDF, DOCX, PNG, etc.)
- Zoom integration: Download cloud recordings from Zoom-enabled courses
- Incremental sync: Only download new or modified content
- Resumable downloads: Interrupted downloads resume from where they stopped
- Rate limiting: Configurable request throttling to avoid API limits
- Dry-run mode: Preview changes before writing files
- Course filtering: Include or exclude specific courses from sync operations
Before installing u_crawler, ensure you have:
| Requirement | Version | Purpose |
|---|---|---|
| Rust toolchain | 1.70+ | Building from source |
| ffmpeg | Any recent | Downloading Zoom recordings |
| Chromium or Edge | Any recent | Zoom authentication via Chrome DevTools Protocol |
-
Install Rust
Download and run the installer from rustup.rs, then restart your terminal.
-
Install ffmpeg
Download from ffmpeg.org, extract to a folder (e.g.,
C:\ffmpeg), and addC:\ffmpeg\binto your system PATH. -
Build u_crawler
git clone https://github.com/yourusername/u_crawler.git cd u_crawler cargo build --release
-
Add to PATH (optional)
copy target\release\u_crawler.exe C:\Windows\System32\
-
Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh source "$HOME/.cargo/env"
-
Install ffmpeg
brew install ffmpeg
-
Build u_crawler
git clone https://github.com/yourusername/u_crawler.git cd u_crawler cargo build --release -
Add to PATH (optional)
# Add to your shell profile (.zshrc or .bash_profile) export PATH="$HOME/path/to/u_crawler/target/release:$PATH"
-
Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh source "$HOME/.cargo/env"
-
Install ffmpeg
# Ubuntu/Debian sudo apt update && sudo apt install ffmpeg # Fedora sudo dnf install ffmpeg # Arch Linux sudo pacman -S ffmpeg
-
Build u_crawler
git clone https://github.com/yourusername/u_crawler.git cd u_crawler cargo build --release -
Install system-wide (optional)
sudo cp target/release/u_crawler /usr/local/bin/
Confirm all components are installed correctly:
rustc --version # Should show 1.70.0 or later
ffmpeg -version # Should display ffmpeg version info
cargo run -- --help # Should show u_crawler helpCreate the default configuration file:
cargo run -- initThis creates ~/.config/u_crawler/config.toml (or %APPDATA%\u_crawler\config.toml on Windows).
Using a Personal Access Token (PAT):
cargo run -- auth canvas --base-url https://your-school.instructure.com --token YOUR_TOKENOr retrieve the token from a password manager:
cargo run -- auth canvas --base-url https://your-school.instructure.com \
--token-cmd "pass show canvas/pat"cargo run -- scanPreview what would be downloaded:
cargo run -- sync --dry-runDownload all courses:
cargo run -- syncDownload a specific course:
cargo run -- sync --course-id 123456First, launch a browser with remote debugging enabled:
chromium --remote-debugging-port=9222 --user-data-dir=/tmp/u_crawler-profileThen run the Zoom backup:
cargo run -- zoom flow --course-id 123456Creates a default configuration file.
cargo run -- initConfigures authentication credentials for Canvas.
# Using a token directly
cargo run -- auth canvas --base-url URL --token TOKEN
# Using a command to retrieve the token
cargo run -- auth canvas --base-url URL --token-cmd "command"Lists courses and inspects their content.
# List all active courses
cargo run -- scan
# Inspect a specific course
cargo run -- scan --course-id 123456Downloads course content to the local filesystem.
| Flag | Description |
|---|---|
--course-id ID |
Sync only the specified course |
--dry-run |
Preview changes without downloading |
--verbose |
Show skipped items and additional details |
# Sync all courses
cargo run -- sync
# Sync one course with verbose output
cargo run -- sync --course-id 123456 --verboseManages Zoom recording downloads. The primary command is zoom flow, which handles the entire process automatically.
| Flag | Description |
|---|---|
--course-id ID |
Target course (required) |
--debug-port PORT |
CDP port (default: 9222) |
--keep-tab |
Keep the browser tab open after capture |
--concurrency N |
Number of parallel downloads (default: 1) |
--since DATE |
Only download recordings after this date (YYYY-MM-DD) |
cargo run -- zoom flow --course-id 123456 --since 2024-01-01For advanced use cases, individual subcommands are available:
zoom sniff-cdp- Capture authentication credentialszoom list- List available recordingszoom fetch-urls- Retrieve download URLszoom dl- Download recordings
Configuration is stored in ~/.config/u_crawler/config.toml (Linux/macOS) or %APPDATA%\u_crawler\config.toml (Windows).
# General settings
download_root = "~/Documents/Canvas-Backup"
concurrency = 4 # Parallel downloads
max_rps = 2 # API requests per second
user_agent = "" # Custom user agent (optional)
# Canvas LMS settings
[canvas]
base_url = "https://your-school.instructure.com"
token = "" # Leave empty if using token_cmd
token_cmd = "pass show canvas/pat"
ignored_courses = ["153095", "153607"]
# Logging settings
[logging]
level = "info" # trace | debug | info | warn | error
file = "~/.config/u_crawler/u_crawler.log"
# Zoom settings
[zoom]
enabled = true
ffmpeg_path = "ffmpeg"
cookie_file = "~/.config/u_crawler/zoom_cookies.txt"
user_agent = "Mozilla/5.0"
external_tool_id = 187| Option | Description | Default |
|---|---|---|
download_root |
Directory for downloaded files | Required |
concurrency |
Number of parallel downloads | 4 |
max_rps |
Maximum API requests per second | 2 |
canvas.base_url |
Your Canvas instance URL | Required |
canvas.token |
Personal Access Token | - |
canvas.token_cmd |
Command to retrieve token | - |
canvas.ignored_courses |
Course IDs to skip | [] |
logging.level |
Log verbosity | info |
zoom.enabled |
Enable Zoom features | true |
zoom.ffmpeg_path |
Path to ffmpeg binary | ffmpeg |
zoom.external_tool_id |
Zoom LTI tool ID in Canvas | - |
The zoom flow command automates the complete process of downloading Zoom cloud recordings:
-
Launch a Chromium-based browser with remote debugging enabled:
chromium --remote-debugging-port=9222 --user-data-dir=/tmp/u_crawler-profile
-
Log into Canvas in that browser instance and complete any SSO authentication.
-
Ensure ffmpeg is available (check with
ffmpeg -version).
-
Credential Capture: Opens the Zoom external tool in Canvas via Chrome DevTools Protocol (CDP), capturing authentication cookies and API headers.
-
Recording Discovery: Queries the Zoom API to enumerate available meetings and their download URLs.
-
URL Resolution: Opens each recording page in an ephemeral browser tab to capture the signed download headers.
-
Download: Attempts to download using
ffmpeg -c copy. If that fails, falls back to direct HTTP download with resume support.
Recordings are saved to:
<download_root>/Zoom/<course_id>/<meeting_title>_<date>.mp4
Downloads use .part files and HTTP Range requests, allowing safe resumption if interrupted.
Symptoms: Error "ffmpeg missing" or exit code 13.
Solutions:
- Verify installation:
ffmpeg -version - On Windows: Add ffmpeg to PATH or set
zoom.ffmpeg_pathto the full path - On Linux/macOS: Install via package manager or set absolute path in config
Symptoms: "auth error" or exit code 11.
Solutions:
- Verify your Personal Access Token is valid and not expired
- Confirm
base_urlmatches your Canvas instance exactly - Test your
token_cmdmanually to ensure it returns the token - Re-run:
cargo run -- auth canvas --base-url URL --token TOKEN
Symptoms: CDP flow times out or fails to capture credentials.
Solutions:
- Ensure browser is launched with
--remote-debugging-port=9222 - Log into Canvas in that browser before running
zoom flow - Complete SSO prompts when they appear
- Use
--debug-portif your browser uses a different port
Symptoms: Network errors or exit code 12.
Solutions:
- Reduce
max_rpsin config (e.g., from 2 to 1) - Reduce
concurrency(e.g., from 4 to 2) - Wait a few minutes before retrying
Symptoms: Some files fail to download (exit code 15).
Solutions:
- Re-run the command; downloads are resumable
- Check available disk space
- Verify write permissions for
download_root - Use
--verboseto identify specific failures - Check logs with
level = "debug"
Symptoms: Recordings are listed but fail to download.
Solutions:
- Verify you have download permissions in Zoom
- Confirm ffmpeg works:
ffmpeg -version - Exit code 14 indicates insufficient permissions
- Try the manual flow:
zoom sniff-cdp,zoom list,zoom dl - Check logs for specific error messages
Symptoms: Tool can't find config.toml.
Solutions:
- Run
cargo run -- initto create the default config - Verify the config directory exists
- Check file permissions
For detailed diagnostics, enable debug logging:
[logging]
level = "debug"Then check ~/.config/u_crawler/u_crawler.log after running commands.
| Code | Meaning |
|---|---|
| 0 | Success |
| 10 | Configuration error |
| 11 | Authentication error |
| 12 | Network or rate limit error |
| 13 | ffmpeg missing or failed |
| 14 | Permission denied (no download rights) |
| 15 | Partial failure (some items failed) |
- Incremental sync: The sync command only downloads new or modified content.
- File naming: Names are sanitized to ASCII with underscores; repeated separators are collapsed.
- Idempotent operations: Commands can be safely re-run; they resume from where they stopped.
- Ignored courses: Use
ignored_coursesto exclude specific courses from bulk operations. - Dry-run mode: Always preview with
--dry-runbefore large sync operations.
See LICENSE for details.