Phantom is a unified video generation framework for single and multi-subject references, built on existing text-to-video and image-to-video architectures. It achieves cross-modal alignment using ...
Abstract: Video inpainting modifies local regions in video while ensuring spatial and temporal coherence. However, existing methods, both traditional and recent diffusion-based ones, face key ...
Abstract: Video deblurring remains challenging due to the difficulty of modeling long-range temporal dependencies and global spatial structures in a stable and efficient manner. Existing temporal ...