Jumpcut Smoothing

Making cuts disappear by regenerating the frames across them

Matthew Bendel, Xingzhe He, Stephen W. Bailey, Mithilesh Vaidya, Sumukh Badam, Geoffry Berlin, Keith Simmons, Vicki Anand, Jose Sotelo

Descript, Inc.


Cut a video and the picture jumps: trimming or re-timing a clip drops frames, and the head snaps to a new position. Jumpcut Smoothing fills that gap with regenerated frames so the join flows like a continuous take. Play both versions below and watch the boundary.

The problem

Editing the audio almost always changes its length. A re-recorded or trimmed line can run longer or shorter than the footage it replaces, and to keep picture and sound aligned, frames get removed or repeated. Dropping frames leaves a jumpcut: a hard break where the head suddenly snaps to a new position and the motion stutters.

It's the kind of thing you don't consciously register but always feel: the moment a clip stops looking like a continuous take.
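To make the timing mismatch concrete, here is a minimal sketch of the arithmetic, assuming a constant frame rate and rounding to whole frames. The function name `frame_delta` and the example durations are illustrative and do not come from the product.

```python
def frame_delta(original_seconds: float, edited_seconds: float, fps: float) -> int:
    """Frames to repeat (positive) or drop (negative) so picture matches the edited audio.

    Assumption: constant frame rate, durations rounded to the nearest whole frame.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    original_frames = round(original_seconds * fps)
    edited_frames = round(edited_seconds * fps)
    return edited_frames - original_frames


# Hypothetical example: a 4.0 s line re-recorded in 3.5 s, at 30 fps.
delta = frame_delta(4.0, 3.5, 30.0)
print(delta)  # negative: frames must be dropped, which is where the jumpcut comes from
```

A negative result means frames are dropped and the picture jumps at the join. A positive result means frames have to be repeated to fill the extra time.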

[Video comparison. Left: hard cut, no jumpcut smoothing, with the cut boundary marked. Right: with smoothing, with the regenerated bridge marked.]

Both clips play in sync. On the left, frames are dropped to hit the new timing, so the head snaps and the motion stutters at the boundary. On the right, a regenerated bridge spans the same boundary and the cut disappears.

How it works

We treat the cut as an inpainting problem in time. Given the frames before and after the break, a video model fills in a short bridge that connects them, carrying head pose, expression, and motion over from both sides. Instead of jumping, the head moves through the gap the way it would have on camera.
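The input to such an infill model can be sketched as a frame sequence plus a mask that marks which positions to generate. This is a minimal illustration, not the authors' implementation: the names `InfillInput` and `build_infill_input`, the `context` and `gap` parameters, and the use of `None` for masked frames are all assumptions.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Sequence


@dataclass
class InfillInput:
    frames: List[Optional[Any]]  # None marks a frame the generator must produce
    mask: List[bool]             # True where the frame is masked (the gap)


def build_infill_input(
    before: Sequence[Any], after: Sequence[Any], context: int, gap: int
) -> InfillInput:
    """Assemble context frames from both sides of a cut around a fully masked gap.

    Assumption: `context` frames are taken from each side and `gap` frames
    are generated; the real system may choose these differently.
    """
    if context < 1 or gap < 1:
        raise ValueError("context and gap must both be at least 1")
    left = list(before[-context:])   # last frames before the cut
    right = list(after[:context])    # first frames after the cut
    frames = left + [None] * gap + right
    mask = [False] * len(left) + [True] * gap + [False] * len(right)
    return InfillInput(frames=frames, mask=mask)


# Hypothetical usage with string stand-ins for image frames.
example = build_infill_input(["a0", "a1", "a2"], ["b0", "b1", "b2"], context=2, gap=3)
print(example.mask)
```

Only the masked positions are generated. The unmasked frames on either side are real footage, and they are what anchors pose, expression, and motion at both ends of the bridge.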

Because the bridge is generated from the surrounding footage, it inherits the same identity, lighting, and background, so there's no separate clip to match and nothing to line up by hand. The same approach covers the two common cases: shortening a clip (where a span of frames is removed) and looping a short clip to fill time (where the loop point would otherwise be obvious).
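The two cases differ only in which frames sit on either side of the join. The sketch below shows one way to express that, assuming frames are held in a simple list; the function names `sides_for_trim` and `sides_for_loop` are hypothetical.

```python
from typing import Any, List, Sequence, Tuple


def sides_for_trim(
    frames: Sequence[Any], start: int, end: int
) -> Tuple[List[Any], List[Any]]:
    """Shortening: frames[start:end] is removed; the join is between what remains."""
    if not 0 < start < end < len(frames):
        raise ValueError("removed span must leave frames on both sides")
    return list(frames[:start]), list(frames[end:])


def sides_for_loop(frames: Sequence[Any]) -> Tuple[List[Any], List[Any]]:
    """Looping: the join is between the end of the clip and its own start."""
    if len(frames) < 2:
        raise ValueError("need at least two frames to loop")
    # "before" is the clip playing out; "after" is the same clip starting over.
    return list(frames), list(frames)


# Hypothetical usage with integers standing in for frames.
before, after = sides_for_trim(list(range(10)), start=3, end=6)
print(before[-1], after[0])  # last frame before the join, first frame after it
```

In both cases the bridge would be generated between the tail of the first list and the head of the second, so the same infill step can serve either edit.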

Bridging the head motion is only half of it. The mouth still has to say the right thing across the join, so we always run Video Regenerate over the bridge, re-syncing the lips to the audio on the generated frames. The result is seamless in both motion and speech.

[Diagram: Frames in (context + fully-masked gap) → Generator (video infill) → Frames out (seamless bridge)]

Citation

If you reference this work, please cite:

@misc{bendel2026jumpcutsmoothing,
  title         = {Jumpcut Smoothing: Hiding Cuts in Talking-Head Video by Generative Frame Infill},
  author        = {Matthew Bendel and Xingzhe He and Stephen W. Bailey and
                   Mithilesh Vaidya and Sumukh Badam and Geoffry Berlin and
                   Keith Simmons and Vicki Anand and Jose Sotelo},
  year          = {2026},
  howpublished  = {Descript, Inc.},
  url           = {https://descriptinc.github.io/jumpcut-smoothing/}
}