Grok Imagine Video
Video with Audio, T2V & I2V
Grok Imagine Video is xAI's video generation model, and it can generate videos with audio. Use text-to-video or image-to-video, choose a duration of up to 15 seconds, and control aspect ratio and resolution, all with flexible pay-per-second pricing.
Key Features
Discover what makes Grok Imagine Video a powerful video generation model
Video with Audio Output
Generate MP4 videos that can include audio. Great for cinematic short clips and social content.
Text-to-Video & Image-to-Video
Create videos directly from a text prompt, or animate an input image into a video with a prompt.
Up to 15 Seconds
Choose durations from 1 to 15 seconds depending on your use case.
Pay-Per-Second
Flexible pricing at 60 credits per second. Pay only for the duration you need, from 1 to 15 seconds.
Resolution Control
Select 480p or 720p output resolution for speed and cost control.
Multi-Format Support
Supports multiple aspect ratios, including 16:9, 9:16, and 1:1, so output is ready for YouTube, TikTok, and Instagram.
See It In Action
Real examples created with Grok Imagine Video on OpenCreator
Talking Portrait
Lip-Sync Demo
Create realistic talking head videos with perfectly synced mouth movements for presentations and content.
Product Showcase
E-commerce Video
Generate engaging product videos with dynamic camera movements and atmospheric audio for marketing.
Playful Scene
Creative Content
Create fun and engaging short-form video content perfect for social media and entertainment.
Lion Roar
Audio Sync Demo
Experience native audio synthesis with a majestic lion roaring in sync with its movements.
Ocean Waves
Ambient Sound
Create peaceful nature scenes with synchronized ambient sound effects for relaxation content.
Tokyo Night
Cinematic Scene
Generate cinematic urban scenes with atmospheric audio perfect for storytelling and music videos.
Latte Art
Lifestyle Content
Create cozy lifestyle videos with ambient sounds perfect for social media and brand content.
Nature Macro
Wildlife Video
Generate beautiful nature macro shots with gentle ambient sounds for peaceful content.
Technical Specifications
Everything you need to know about Grok Imagine Video capabilities
Video Specs
- Duration: 1-15 seconds
- Resolution: 480p, 720p
- Frame Rate: Returned in output (fps)
- Aspect Ratios: auto, 16:9, 4:3, 3:2, 1:1, 2:3, 3:4, 9:16
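The limits listed above can be checked before a request is submitted. The following is a minimal sketch only: the function name `validate_video_params` and its parameter names are hypothetical and are not taken from any xAI or OpenCreator API. The allowed values are copied from the spec list, and whether fractional durations are accepted is not stated, so the check only tests the 1-15 second range.

```python
# Hypothetical pre-flight check; names are illustrative, not a real API.
ALLOWED_RESOLUTIONS = {"480p", "720p"}
ALLOWED_ASPECT_RATIOS = {"auto", "16:9", "4:3", "3:2", "1:1", "2:3", "3:4", "9:16"}
MIN_DURATION_S = 1
MAX_DURATION_S = 15


def validate_video_params(duration_s, resolution, aspect_ratio="auto"):
    """Return a list of problems; an empty list means the values fit the listed specs."""
    problems = []
    if not (MIN_DURATION_S <= duration_s <= MAX_DURATION_S):
        problems.append(
            f"duration must be {MIN_DURATION_S}-{MAX_DURATION_S} seconds, got {duration_s}"
        )
    if resolution not in ALLOWED_RESOLUTIONS:
        problems.append(f"unsupported resolution: {resolution}")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {aspect_ratio}")
    return problems


if __name__ == "__main__":
    print(validate_video_params(6, "720p", "16:9"))    # values within the listed specs
    print(validate_video_params(20, "1080p", "21:9"))  # three values outside the specs
```

Returning a list of problems rather than raising on the first one lets a caller report every out-of-range value at once.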
Audio Capabilities
- Native Audio
- Lip-Sync: Not specified
- Sound Effects: Not specified
Input Support
- Text to Video
- Image to Video
Pricing
- Cost Per Second: 60 credits
- Minimum Duration: 1 second
- Maximum Duration: 15 seconds
Flexible Pricing
Save more with a subscription
- 7,600 credits / month: $0.90 per 6s video
- 22,050 credits / month: $0.47 per 6s video
- 100,000 credits / month: $0.36 per 6s video
Credits scale with duration at 60 credits per second. A 6-second video costs 360 credits, and a 15-second video costs 900 credits.
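The credit arithmetic can be sketched in a few lines. This is an illustration under stated assumptions, not an official calculator: the helper names `credits_for` and `clips_per_month` are hypothetical, the 60 credits per second rate and the monthly credit amounts come from this page, and the sketch assumes whole-second durations and that a month's credits are spent only on clips of a single length.

```python
CREDITS_PER_SECOND = 60  # rate stated on this page


def credits_for(duration_s: int) -> int:
    """Credits for one clip of duration_s seconds (1-15, whole seconds assumed)."""
    if not 1 <= duration_s <= 15:
        raise ValueError("duration must be between 1 and 15 seconds")
    return CREDITS_PER_SECOND * duration_s


def clips_per_month(monthly_credits: int, duration_s: int) -> int:
    """Whole clips of one length that fit in a monthly credit allowance."""
    return monthly_credits // credits_for(duration_s)


if __name__ == "__main__":
    print(credits_for(6))   # 360
    print(credits_for(15))  # 900
    for allowance in (7_600, 22_050, 100_000):
        # 6-second clips per allowance: 21, 61, and 277 respectively
        print(allowance, clips_per_month(allowance, 6))
```

Integer division is used because a partially funded clip cannot be generated; any leftover credits are simply ignored in this sketch.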
FAQ
Grok Imagine Video is xAI's video generation model. It supports generating videos with audio from text prompts or from input images.