- VideoPoet, Google's new AI tool, creates videos from text inputs.
- In evaluations, VideoPoet was preferred over competing models for text fidelity and motion interestingness.
This new tool, based on a large language model (LLM), can perform a range of video generation tasks, including text-to-video, image-to-video, video stylization, and even video-to-audio conversions.
VideoPoet stands out in its field by integrating various video generation capabilities into a single LLM, unlike other models, which rely on separate components for each task.
This integration allows for more seamless and coherent video creation, especially in tasks involving large motions, which have been a challenge for current models.
One of the key features of VideoPoet is its ability to animate still images and edit videos for tasks like inpainting, outpainting, and stylization.
For example, it can take a static image of a ship at sea and animate it to show the ship navigating through a thunderstorm. This capability is enhanced by the use of text prompts, which guide the motion and style of the generated videos.
How the model handles inputs and outputs across different tasks, in both training and inference, is worth a closer look.
VideoPoet uses multiple tokenizers (MAGVIT-v2 for video and image, and SoundStream for audio) to convert the various modalities into tokens and back again.
This process enables the model to generate tokens based on context, which are then converted back into a viewable representation.
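The tokenize, generate, detokenize loop described above can be illustrated with a toy sketch. Everything in it is a hypothetical stand-in: the byte-level text tokenizer, the token-ID offsets, and the `toy_model_next_token` function are assumptions made for illustration, and none of them reflect the real MAGVIT-v2, SoundStream, or VideoPoet interfaces.

```python
from typing import Callable, List

# Hypothetical vocabulary layout: each modality gets its own token-ID range so
# one sequence model can tell them apart. This is NOT VideoPoet's real layout.
TEXT_OFFSET = 0       # text tokens occupy IDs 0..255
VIDEO_OFFSET = 1000   # video tokens occupy IDs 1000..1255


def toy_text_tokenize(prompt: str) -> List[int]:
    """Stand-in text tokenizer: one token per UTF-8 byte."""
    return [TEXT_OFFSET + b for b in prompt.encode("utf-8")]


def toy_video_detokenize(tokens: List[int]) -> List[int]:
    """Stand-in video detokenizer: maps token IDs back to values in 0..255."""
    return [t - VIDEO_OFFSET for t in tokens]


def toy_model_next_token(context: List[int]) -> int:
    """Stand-in for the language model: a deterministic function of the context."""
    return VIDEO_OFFSET + (sum(context) % 256)


def generate(
    prompt: str,
    num_video_tokens: int,
    next_token: Callable[[List[int]], int] = toy_model_next_token,
) -> List[int]:
    """Tokenize the prompt, generate video tokens one at a time, then detokenize."""
    context = toy_text_tokenize(prompt)
    generated: List[int] = []
    for _ in range(num_video_tokens):
        # Each new token is predicted from the prompt plus everything generated so far.
        generated.append(next_token(context + generated))
    return toy_video_detokenize(generated)


if __name__ == "__main__":
    print(generate("a ship in a thunderstorm", num_video_tokens=8))
```

The point of the sketch is only the shape of the pipeline: every modality is reduced to integer tokens, a single model predicts the next token from context, and a modality-specific decoder turns the result back into something viewable.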
VideoPoet has also shown promise in generating longer videos, keeping the appearance of objects consistent over several iterations. Additionally, the model can interactively edit existing video clips, allowing users to change the motion of objects within a video.
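The article does not spell out how the longer videos are produced. One plausible reading of "several iterations" is a chunked extension loop, in which each new chunk is predicted from the tail of the video generated so far. The sketch below illustrates that idea under that assumption only; `extend_video`, `toy_predict_chunk`, and the use of integers as "frames" are all hypothetical.

```python
from typing import Callable, List


def extend_video(
    seed_frames: List[int],
    predict_chunk: Callable[[List[int]], List[int]],
    context_len: int,
    iterations: int,
) -> List[int]:
    """Grow a video by repeatedly predicting a new chunk from the most recent frames."""
    if context_len < 1:
        raise ValueError("context_len must be at least 1")
    frames = list(seed_frames)
    for _ in range(iterations):
        # Only the tail of the video is used as context for the next chunk.
        context = frames[-context_len:]
        frames.extend(predict_chunk(context))
    return frames


def toy_predict_chunk(context: List[int]) -> List[int]:
    """Hypothetical predictor: continues counting up from the last context frame."""
    last = context[-1]
    return [last + i + 1 for i in range(len(context))]


if __name__ == "__main__":
    print(extend_video([0, 1], toy_predict_chunk, context_len=2, iterations=3))
```

Because each step sees only a short window of recent frames, keeping objects consistent across many iterations depends on how well the predictor carries appearance forward, which is the property the article highlights.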
The evaluation results of VideoPoet are equally impressive. In terms of text fidelity and motion interestingness, VideoPoet was preferred over competing models, showcasing its ability to follow prompts accurately and produce interesting motion.
For those interested in seeing more examples of VideoPoet’s capabilities, a demo is available on the project's website.