Understanding semantic intricacies and high-level concepts is essential in image sketch generation, and this challenge becomes even more formidable when applied to the domain of videos. To address this, we propose a novel optimization-based framework for sketching videos, in which each frame is represented by a set of Bézier curves. Specifically, we first propose a cross-frame stroke initialization approach to warm up the location and the width of each curve. Then, we optimize the locations of these curves using a semantic loss based on CLIP features and a newly designed consistency loss based on the self-decomposed 2D atlas network. Built upon these design elements, the resulting sketch video exhibits strong visual abstraction and temporal coherence. Furthermore, because the sketching process transforms a video into SVG lines, our method unlocks applications in sketch-based video editing and video doodling, enabled through video composition, as exemplified in the teaser.
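To make the frame-wise curve representation concrete, here is a minimal sketch of how a sketch video might be stored and sampled. It assumes cubic Bézier curves (four control points per stroke), which the text above does not specify; the names `Stroke`, `bezier_point`, and `video_sketch` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class Stroke:
    """One curve in one frame (assumption: cubic Bézier, 4 control points)."""
    control_points: List[Point]
    width: float


def bezier_point(stroke: Stroke, s: float) -> Point:
    """Evaluate the cubic Bézier curve at parameter s in [0, 1]."""
    p0, p1, p2, p3 = stroke.control_points
    # Bernstein basis weights for a cubic curve.
    a = (1 - s) ** 3
    b = 3 * (1 - s) ** 2 * s
    c = 3 * (1 - s) * s ** 2
    d = s ** 3
    x = a * p0[0] + b * p1[0] + c * p2[0] + d * p3[0]
    y = a * p0[1] + b * p1[1] + c * p2[1] + d * p3[1]
    return (x, y)


# A sketch video as one list of strokes per frame (toy data).
video_sketch: List[List[Stroke]] = [
    [Stroke([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)], width=1.5)],
    [Stroke([(0.0, 1.0), (1.0, 1.0), (2.0, 1.0), (3.0, 1.0)], width=1.5)],
]
```

In this layout, the stroke at index `i` in one frame corresponds to the stroke at index `i` in the next frame, which is one simple way to keep cross-frame correspondence explicit.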
Left, Layered Atlas: First, we train a layered atlas to decompose the video into separate 2D images (a foreground atlas and a background atlas). The atlas maps 3D video coordinates (pixel position and frame index) to 2D atlas coordinates, such that video points mapped to the same atlas coordinate share the same color.
Right, Video Sketching Optimization: Afterward, we propose a novel initialization method that uses the mapping network Mf to generate sketches with correspondence across video frames. We then optimize the locations of the generated strokes, maintaining temporal coherence and semantic alignment through novel consistency and semantic losses.
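The idea behind an atlas-based consistency term can be illustrated with a small sketch. This is not the loss used in the paper: it assumes a simple mean squared distance in atlas space between corresponding control points of consecutive frames, and `toy_atlas_map` is a hypothetical stand-in for the learned mapping network (it pretends the scene translates by a fixed offset per frame).

```python
from typing import Callable, List, Tuple

Point = Tuple[float, float]
AtlasMap = Callable[[float, float, int], Point]


def toy_atlas_map(x: float, y: float, t: int) -> Point:
    """Hypothetical mapping (x, y, t) -> (u, v).

    Assumes the content moves by (2, 1) pixels per frame, so undoing
    that motion yields a frame-independent atlas coordinate.
    """
    return (x - 2.0 * t, y - 1.0 * t)


def consistency_loss(frames: List[List[Point]], atlas_map: AtlasMap) -> float:
    """Mean squared atlas-space distance between corresponding
    control points in consecutive frames (illustrative only)."""
    total = 0.0
    count = 0
    for t in range(len(frames) - 1):
        for (x0, y0), (x1, y1) in zip(frames[t], frames[t + 1]):
            u0, v0 = atlas_map(x0, y0, t)
            u1, v1 = atlas_map(x1, y1, t + 1)
            total += (u0 - u1) ** 2 + (v0 - v1) ** 2
            count += 1
    return total / count if count else 0.0


# Control points that follow the assumed motion map to the same atlas
# location in every frame; points that drift do not.
coherent = [[(0.0, 0.0), (4.0, 4.0)], [(2.0, 1.0), (6.0, 5.0)]]
drifting = [[(0.0, 0.0)], [(5.0, 1.0)]]
```

Under these assumptions, `consistency_loss(coherent, toy_atlas_map)` is zero, while the drifting stroke is penalized, which is the behavior a temporal-coherence term is meant to encourage.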
@article{zheng2023sketch,
title={Sketch Video Synthesis},
author={Yudian Zheng and Xiaodong Cun and Menghan Xia and Chi-Man Pun},
year={2023},
eprint={2311.15306},
archivePrefix={arXiv},
primaryClass={cs.CV}
}