Generate Minutes-Long Videos with AI

Free & Open Source • 13.6B Parameters • MIT Licensed

⭐ 47 GitHub Stars • ❤️ 28 HuggingFace Likes • 13.6B Parameters • 720p 30fps Video Quality

Key Features

🎬 Three Tasks, One Model

Unified architecture supporting Text-to-Video, Image-to-Video, and Video-Continuation within a single framework. No need for multiple models.

Efficient Inference

Generate 720p 30fps videos within minutes, using a coarse-to-fine generation strategy and Block Sparse Attention for efficient high-resolution inference.

🎞️ Long Video Generation

Natively pretrained on the Video-Continuation task, enabling minutes-long videos without color drift or quality degradation.

🏆 Multi-Reward RLHF

Powered by Group Relative Policy Optimization (GRPO), achieving performance comparable to leading commercial solutions.
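
As a rough illustration only (not LongCat-Video's actual training code), GRPO samples a group of videos per prompt, scores each one, and normalizes its reward against the group statistics; with multiple reward models the scores are first combined. The Python sketch below uses hypothetical reward weights and a made-up group of scores:

# Generic sketch of multi-reward GRPO advantages (illustrative, not the official code)
import torch

def group_relative_advantages(rewards, weights, eps=1e-6):
    # rewards: (group_size, num_rewards) scores for one prompt's sampled videos
    # weights: (num_rewards,) hypothetical mixing weights for the reward models
    combined = rewards @ weights                                  # weighted total reward per sample
    return (combined - combined.mean()) / (combined.std() + eps)  # normalize within the group

# Example: 4 sampled videos scored by 3 reward models (values are made up)
rewards = torch.tensor([[0.8, 0.6, 0.7],
                        [0.5, 0.7, 0.6],
                        [0.9, 0.8, 0.8],
                        [0.4, 0.5, 0.5]])
print(group_relative_advantages(rewards, torch.tensor([0.4, 0.3, 0.3])))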

🔓 MIT Licensed

Completely free and open source. Use commercially, modify freely, and deploy anywhere without restrictions.

💪 Dense Architecture

13.6B dense parameters that outperform larger 28B MoE models on overall Text-to-Video quality. All 13.6B parameters are active for every generation, ensuring consistent quality.

Performance Benchmarks

Text-to-Video MOS Scores

| Model | Accessibility | Architecture | Total Params | Text-Alignment ↑ | Visual Quality ↑ | Motion Quality ↑ | Overall Quality ↑ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Veo3 | Proprietary | - | - | 3.99 | 3.23 | 3.86 | 3.48 |
| PixVerse-V5 | Proprietary | - | - | 3.81 | 3.13 | 3.81 | 3.36 |
| Wan 2.2-T2V-A14B | Open Source | MoE | 28B (14B activated) | 3.70 | 3.26 | 3.78 | 3.35 |
| LongCat-Video | Open Source | Dense | 13.6B | 3.76 | 3.25 | 3.74 | 3.38 |

Image-to-Video MOS Scores

| Model | Accessibility | Architecture | Total Params | Image-Alignment ↑ | Text-Alignment ↑ | Visual Quality ↑ | Motion Quality ↑ | Overall Quality ↑ |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Seedance 1.0 | Proprietary | - | - | 4.12 | 3.70 | 3.22 | 3.77 | 3.35 |
| Hailuo-02 | Proprietary | - | - | 4.18 | 3.85 | 3.18 | 3.80 | 3.27 |
| Wan 2.2-I2V-A14B | Open Source | MoE | 28B (14B activated) | 4.18 | 3.33 | 3.23 | 3.79 | 3.26 |
| LongCat-Video | Open Source | Dense | 13.6B | 4.04 | 3.49 | 3.27 | 3.59 | 3.17 |

Quick Start

1. Clone the Repository

git clone https://github.com/meituan-longcat/LongCat-Video
cd LongCat-Video

2. Install Dependencies

# Create conda environment
conda create -n longcat-video python=3.10
conda activate longcat-video

# Install torch
pip install torch==2.6.0+cu124 torchvision==0.21.0+cu124 --index-url https://download.pytorch.org/whl/cu124

# Install flash-attn-2
pip install ninja psutil packaging flash_attn==2.7.4.post1

# Install other requirements
pip install -r requirements.txt
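
Before downloading the weights, it can be worth a quick sanity check of the environment; the short Python snippet below (optional, not part of the official instructions) verifies the CUDA build of torch and the flash-attn install:

# Optional sanity check after installing dependencies
import torch

print("torch:", torch.__version__)                # expect 2.6.0+cu124
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
try:
    import flash_attn
    print("flash-attn:", flash_attn.__version__)  # expect 2.7.4.post1
except ImportError:
    print("flash-attn is missing or failed to build")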

3. Download Model

pip install "huggingface_hub[cli]"
huggingface-cli download meituan-longcat/LongCat-Video --local-dir ./weights/LongCat-Video
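
If you prefer to do this from Python instead of the CLI, huggingface_hub exposes an equivalent call (an optional alternative; same weights and destination as above):

# Equivalent download using the huggingface_hub Python API
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="meituan-longcat/LongCat-Video",
    local_dir="./weights/LongCat-Video",
)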

4. Run Text-to-Video

# Single-GPU inference
torchrun run_demo_text_to_video.py --checkpoint_dir=./weights/LongCat-Video --enable_compile

# Multi-GPU inference
torchrun --nproc_per_node=2 run_demo_text_to_video.py --context_parallel_size=2 --checkpoint_dir=./weights/LongCat-Video --enable_compile
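
After a run finishes, you can read the generated clip back to confirm its resolution and frame rate. The snippet below is optional: the output path is a hypothetical placeholder (the real filename depends on the demo script), and torchvision's read_video needs the PyAV backend (pip install av) if it is not already present:

# Optional: inspect a generated clip (output path is a hypothetical placeholder)
from torchvision.io import read_video

frames, _, info = read_video("outputs/sample.mp4", pts_unit="sec")
print("frames x H x W x C:", tuple(frames.shape))  # expect 720p resolution
print("fps:", info.get("video_fps"))               # expect ~30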

Use Cases

📱 Social Media Content

Create engaging videos for Instagram, TikTok, and YouTube from simple text prompts.

🎓 Educational Material

Generate educational videos and visual explanations for online courses and tutorials.

🛍️ Marketing Videos

Produce product demos and promotional content without expensive video production.

🎨 Creative Projects

Bring your artistic visions to life with AI-powered video generation.

🔬 Research

Experiment with video generation models and advance AI research.

📺 Content Production

Scale video content production for media companies and agencies.

What People Are Saying

Community Reactions

"MIT license foundation models changing the game for video generation accessibility"

- AI Community Member

"Weird right? Still if it performs as described at those parameter numbers this will be a banger."

- Developer on Twitter

"47 stars on GitHub already! The open-source AI community is moving fast."

- GitHub Community

Frequently Asked Questions

Is LongCat Video really free?
Yes! LongCat Video is completely free and open source under the MIT license. You can use it commercially, modify it freely, and deploy it anywhere without restrictions or licensing fees.
How long can the generated videos be?
LongCat Video can generate videos up to several minutes long without color drift or quality degradation, thanks to native pretraining on the Video-Continuation task. This is one of its key advantages over other models.
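
Conceptually, a long clip is built by repeatedly conditioning the model on the tail of what has already been generated. The sketch below only illustrates that loop; generate_continuation is a hypothetical stand-in, not the repository's API:

# Hedged sketch of chunked long-video generation via Video-Continuation
def generate_continuation(condition_frames, prompt):
    # Hypothetical stand-in: a real call would return new frames conditioned on the context
    return ["frame"] * 93

OVERLAP = 16                                                       # frames carried into the next segment
video = generate_continuation(None, "a cat walking on a beach")    # first segment
for _ in range(5):                                                 # five more continuation rounds
    context = video[-OVERLAP:]                                     # tail of the clip becomes the new condition
    video += generate_continuation(context, "a cat walking on a beach")
print(len(video), "frames total")
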
What hardware do I need to run LongCat Video?
LongCat Video supports both single-GPU and multi-GPU inference. For optimal performance, we recommend using modern NVIDIA GPUs with at least 24GB VRAM. The model uses FlashAttention-2 for efficient memory usage.
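
To check whether your GPUs meet the suggested 24GB threshold before launching a job, a short torch query works (an optional check, not part of the official setup):

# Optional: list each visible GPU and its total memory
import torch

for i in range(torch.cuda.device_count()):
    props = torch.cuda.get_device_properties(i)
    print(f"GPU {i}: {props.name}, {props.total_memory / 1024**3:.1f} GB VRAM")  # aim for >= 24 GB
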
How does LongCat compare to commercial solutions like Sora or Runway?
LongCat Video achieves performance comparable to leading commercial solutions (Overall Quality MOS score of 3.38), while being completely free and open source. With only 13.6B dense parameters, it outperforms larger 28B MoE models.
Can I use LongCat Video for commercial projects?
Absolutely! The MIT license allows you to use LongCat Video for any purpose, including commercial projects. You can modify the model, integrate it into your products, and even sell services based on it.
What makes LongCat Video's architecture unique?
LongCat Video uses a unified architecture that handles Text-to-Video, Image-to-Video, and Video-Continuation tasks within a single model. It employs Block Sparse Attention and a coarse-to-fine generation strategy for efficient 720p 30fps video generation.
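
As a rough, generic illustration of what block-sparse attention means (not LongCat-Video's actual kernels), the sketch below keeps only a local neighborhood of key/value blocks for each query block and passes the resulting mask to PyTorch's scaled_dot_product_attention; the block size and kept-block rule are hypothetical simplifications:

# Generic block-sparse attention sketch (illustrative only)
import torch
import torch.nn.functional as F

def block_sparse_attention(q, k, v, block_size=64, keep_local=1):
    # q, k, v: (batch, heads, seq_len, head_dim); seq_len assumed divisible by block_size
    num_blocks = q.shape[-2] // block_size
    blk = torch.arange(num_blocks)
    # Block-level mask: query block i may attend to key blocks within keep_local of i
    block_mask = (blk[:, None] - blk[None, :]).abs() <= keep_local
    # Expand the block mask to token resolution (True = attend)
    mask = block_mask.repeat_interleave(block_size, 0).repeat_interleave(block_size, 1)
    return F.scaled_dot_product_attention(q, k, v, attn_mask=mask.to(q.device))

# Tiny example: 256 tokens split into 4 blocks of 64
q = k = v = torch.randn(1, 8, 256, 64)
print(block_sparse_attention(q, k, v).shape)  # torch.Size([1, 8, 256, 64])
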
Is there an API or cloud service available?
Currently, LongCat Video is available for self-hosting. You can deploy it on your own infrastructure using the provided installation instructions. Cloud API services may be available in the future.
How can I contribute to the project?
You can contribute by submitting issues, pull requests, or improvements on the GitHub repository. The project welcomes contributions from the community under the MIT license.