Google has introduced Lumiere, an AI-powered tool that generates videos from text prompts. Lumiere uses generative AI and a Space-Time U-Net architecture to create videos with realistic, diverse, and coherent motion. It leverages a pre-trained text-to-image diffusion model and learns to generate full-frame-rate, low-resolution videos. Lumiere can generate videos from descriptive text prompts, animate regions of reference images with text prompts, fill in missing or corrupted parts of videos, apply distinct styles or themes to visuals, integrate text-based image editing methods for video editing, and create cinemagraphs. The tool aims to preserve visual integrity and improve content quality.
Soon after announcing Gemini, its latest large language model, Google has introduced Lumiere, a new AI-powered tool that uses generative AI to generate videos from simple text prompts.
Lumiere is a new text-to-video diffusion model designed to generate videos with realistic, diverse and coherent motion. The model relies on a Space-Time U-Net architecture that generates the entire temporal duration of the video at once, through a single pass in the model.
With Lumiere, Google uses both spatial and temporal down-sampling and up-sampling, and leverages a pre-trained text-to-image diffusion model. Google says the new model learns to directly generate a full-frame-rate, low-resolution video by processing it at multiple space-time scales.
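To make the idea of spatial and temporal down-sampling and up-sampling more concrete, here is a minimal NumPy sketch. It is not Lumiere's actual code: the function names, the factor of 2, the use of average pooling and nearest-neighbour repetition, and the random test clip are all assumptions chosen for illustration. A real Space-Time U-Net would use learned layers at each scale.

```python
import numpy as np

def downsample_space_time(video, t_factor=2, s_factor=2):
    """Average-pool a (frames, height, width, channels) clip over time and space."""
    t, h, w, c = video.shape
    # Trim each axis so it divides evenly by its pooling factor.
    t, h, w = t - t % t_factor, h - h % s_factor, w - w % s_factor
    v = video[:t, :h, :w]
    # Split each pooled axis into (blocks, block_size) and average over block_size.
    v = v.reshape(t // t_factor, t_factor,
                  h // s_factor, s_factor,
                  w // s_factor, s_factor, c)
    return v.mean(axis=(1, 3, 5))

def upsample_space_time(video, t_factor=2, s_factor=2):
    """Nearest-neighbour upsample: repeat frames in time and pixels in space."""
    v = np.repeat(video, t_factor, axis=0)
    v = np.repeat(v, s_factor, axis=1)
    return np.repeat(v, s_factor, axis=2)

# Hypothetical 16-frame, 32x32 RGB clip filled with random values.
video = np.random.default_rng(0).random((16, 32, 32, 3))
coarse = downsample_space_time(video)    # fewer frames, fewer pixels
restored = upsample_space_time(coarse)   # back to the original shape
print(coarse.shape, restored.shape)
```

The coarse clip has half as many frames and half the height and width, which is the sense in which the clip is processed at a smaller space-time scale before being brought back to full size.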
Google has also shared samples of the types of videos Lumiere can generate:
Text-to-Video: Lumiere excels at generating videos based on descriptive text prompts, effectively bringing textual scenes to life with realistic motion.
Image-to-Video: Utilizing a single reference image and a specified text prompt, Lumiere generates videos that animate a designated region, enabling dynamic content creation from static visuals.
Video Inpainting: Lumiere seamlessly fills in missing or corrupted parts of input videos, ensuring a smooth and uninterrupted viewing experience.
Stylized Generation: This feature allows users to apply distinct styles or themes to input images or videos, fostering creativity and enabling a personalized visual aesthetic.
Video Stylization: Lumiere integrates off-the-shelf text-based image editing methods for consistent and coherent video editing, ensuring a harmonious visual narrative.
Cinemagraphs: Lumiere's innovative model animates specific user-defined regions within an image, creating captivating cinemagraphs that blend static and dynamic elements seamlessly.
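Features such as video inpainting and cinemagraphs both involve a user-defined region that changes while the rest of the frame is kept. The sketch below shows one common way to express that with a binary mask, assuming a generic mask-based compositing step. It is not a description of how Lumiere works internally, and the function name and toy arrays are hypothetical.

```python
import numpy as np

def composite_masked_video(original, generated, mask):
    """Use generated pixels where mask is True and original pixels elsewhere.

    original, generated: (frames, height, width, channels) arrays
    mask: (frames, height, width) boolean array marking the region to replace
    """
    # Add a trailing axis so the mask broadcasts over the colour channels.
    m = mask.astype(original.dtype)[..., None]
    return m * generated + (1.0 - m) * original

frames, height, width = 4, 8, 8
original = np.zeros((frames, height, width, 3))   # stand-in for the input clip
generated = np.ones((frames, height, width, 3))   # stand-in for model output
mask = np.zeros((frames, height, width), dtype=bool)
mask[:, 2:6, 2:6] = True                          # region to fill or animate

result = composite_masked_video(original, generated, mask)
print(result[:, 2:6, 2:6].min(), result[:, 0, 0].max())
```

For a cinemagraph, the original array would hold the same still image repeated in every frame, so only the masked region moves.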