Assaf Singer† · Noam Rotstein† · Amir Mann · Ron Kimmel · Or Litany
† Equal contribution
Time-to-Move (TTM) is a plug-and-play technique that can be integrated into any image-to-video diffusion model. We provide implementations for Wan 2.2, CogVideoX, and Stable Video Diffusion (SVD). As expected, the stronger the base model, the better the resulting videos. Adapting TTM to new models and pipelines is straightforward and can typically be done in just a few hours. We recommend using Wan, which generally produces higher‑quality results and adheres more faithfully to user‑provided motion signals.
For each model, you can use the included examples or create your own as described in Generate Your Own Cut-and-Drag Examples.
TTM is controlled by two hyperparameters, t_weak and t_strong, which determine the noise depth at which denoising begins in different regions of the frame. In practice, we do not pass t_weak and t_strong as raw timesteps. Instead, we pass tweak-index and tstrong-index, which specify the denoising iteration, out of the total num_inference_steps (50 for all models), at which each phase begins. A minimal, illustrative sketch of how the two indices gate the denoising loop follows the list below.
Constraints: 0 ≤ tweak-index ≤ tstrong-index ≤ num_inference_steps.
- tweak-index — the iteration at which denoising begins outside the mask.
  - Too low: scene deformations, object duplication, or unintended camera motion.
  - Too high: regions outside the mask look static (e.g., non-moving backgrounds).
- tstrong-index — the iteration at which denoising begins within the mask. In our experience, the best value depends on the size and quality of the mask.
  - Too low: the object may drift from the intended path.
  - Too high: the object may look rigid or over-constrained.
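To make the two clocks concrete, here is a minimal, illustrative sketch of the idea, not the repository's actual code: all function and variable names (dual_clock_denoise, denoise_step, noised_reference, and the toy math inside them) are placeholders, and it assumes each region is held at an appropriately noised copy of the motion reference (e.g., the encoded cut-and-drag video) until its clock starts, then denoised freely.

# Illustrative sketch only (hypothetical names, toy math); not the TTM implementation.
import numpy as np

def noised_reference(reference, noise, level):
    # Toy stand-in for the scheduler's add_noise(): mix reference and noise.
    return (1.0 - level) * reference + level * noise

def denoise_step(latent, step, num_steps):
    # Stand-in for one diffusion-model prediction + scheduler update.
    return 0.98 * latent

def dual_clock_denoise(reference, mask, tweak_index, tstrong_index,
                       num_inference_steps=50, seed=0):
    assert 0 <= tweak_index <= tstrong_index <= num_inference_steps
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(reference.shape)
    latent = noise.copy()  # start from pure noise
    for step in range(num_inference_steps):
        latent = denoise_step(latent, step, num_inference_steps)
        level = 1.0 - (step + 1) / num_inference_steps  # toy remaining-noise level
        anchored = noised_reference(reference, noise, level)
        if step < tweak_index:
            latent = anchored                          # both clocks stopped: everything anchored
        elif step < tstrong_index:
            latent = np.where(mask, anchored, latent)  # only the masked region stays anchored
        # from tstrong_index onward, the whole latent is denoised freely
    return latent

# Toy usage with the recommended Wan 2.2 cut-and-drag indices.
reference = np.zeros((4, 1, 8, 8))   # 4 frames, 1 channel, 8x8 "latent"
mask = np.zeros_like(reference, dtype=bool)
mask[:, :, 2:6, 2:6] = True          # region covered by the intended object motion
video = dual_clock_denoise(reference, mask, tweak_index=3, tstrong_index=7)
print(video.shape)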
Wan 2.2
To set up the environment for running Wan 2.2, follow the installation instructions in the official Wan 2.2 repository. Our implementation builds on the 🤗 Diffusers Wan I2V pipeline, which we adapt for TTM using the Wan 2.2 I2V 14B backbone.
python run_wan.py \
--input-path "./examples/cutdrag_wan_Monkey" \
--output-path "./outputs/wan_monkey.mp4" \
--tweak-index 3 \
--tstrong-index 7

Recommended settings for Wan 2.2 (a camera-control example follows the list):
- Cut-and-Drag: tweak-index=3, tstrong-index=7
- Camera control: tweak-index=2, tstrong-index=5
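For the camera-control settings above, the invocation only differs in the indices and the input directory. A hedged usage sketch, assuming run_wan.py accepts camera-control inputs the same way and using a placeholder directory name:

# The example directory below is a placeholder; substitute one of the included
# camera-control examples or your own.
python run_wan.py \
--input-path "./examples/<your_camera_control_example>" \
--output-path "./outputs/wan_camera.mp4" \
--tweak-index 2 \
--tstrong-index 5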
CogVideoX
To set up the environment for running CogVideoX, follow the installation instructions in the official CogVideoX repository. Our implementation builds on the 🤗 Diffusers CogVideoX I2V pipeline, which we adapt for Time-to-Move (TTM) using the CogVideoX-I2V 5B backbone.
python run_cog.py \
--input-path "./examples/cutdrag_cog_Monkey" \
--output-path "./outputs/cog_monkey.mp4" \
--tweak-index 4 \
--tstrong-index 9

Stable Video Diffusion
To set up the environment for running SVD, follow the installation instructions in the official SVD repository.
Our implementation builds on the 🤗 Diffusers SVD I2V pipeline, which we adapt for Time-to-Move (TTM).
python run_svd.py \
--input-path "./examples/cutdrag_svd_Fish" \
--output-path "./outputs/svd_fish.mp4" \
--tweak-index 16 \
--tstrong-index 21

Generate Your Own Cut-and-Drag Examples
We provide an easy-to-use GUI for creating cut-and-drag examples that can later be used for video generation in Time-to-Move. We recommend reading the GUI guide before using it.
To get started quickly, create a new environment and run:
pip install PySide6 opencv-python numpy imageio imageio-ffmpeg
python GUIs/cut_and_drag.py

Community Resources
- ComfyUI – WanVideoWrapper by @kijai: native TTM nodes and an example Wan 2.2 I2V workflow.
- Wan 2.2 Time-To-Move ComfyUI Guide: YouTube tutorial by Benji’s AI Playground.
- ComfyUI – WanVideoWrapper spline editor by @siraxe: keyframe-based editor and input-video assembly tool.
If you are using TTM in your own project or product, feel free to open a PR to add it to this section.
- Wan 2.2 run code
- CogVideoX run code
- SVD run code
- Cut-and-Drag examples
- Camera-control examples
- Cut-and-Drag GUI
- Cut-and-Drag GUI guide
- Evaluation code
Citation
@misc{singer2025timetomovetrainingfreemotioncontrolled,
title={Time-to-Move: Training-Free Motion Controlled Video Generation via Dual-Clock Denoising},
author={Assaf Singer and Noam Rotstein and Amir Mann and Ron Kimmel and Or Litany},
year={2025},
eprint={2511.08633},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2511.08633},
}

