Wan 2.1

Wan 2.1 is Alibaba's video model supporting text-to-video (T2V), image-to-video (I2V), multilingual text generation, and multimodal tasks.

Wan 2.1: Advanced video model excelling in generative tasks.

Model Overview Card for Wan 2.1

Basic Information

  • Model Name: Wan 2.1
  • Developer/Creator: Alibaba
  • Release Date: February 25, 2025
  • Version: 2.1
  • Model Type: AI Video Generation Model

Description

Overview:

Wan 2.1, developed by Alibaba's Wan AI team, is a state-of-the-art video foundation model designed for advanced generative video tasks. Supporting Text-to-Video (T2V), it incorporates groundbreaking innovations to deliver high-quality outputs with exceptional computational efficiency.

Key Features:
  • Visual text generation: Generates text in both Chinese and English within videos.
  • 3D Variational Autoencoder (Wan-VAE): Encodes and decodes unlimited-length 1080P videos with temporal precision.
  • High-quality outputs: Produces visually dynamic and temporally consistent videos at resolutions of up to 720P.

Intended Use:

Wan 2.1 is designed for applications in:

  • Creative industries (video production).
  • Content generation for social media and marketing campaigns.
  • Automated workflows involving multimedia processing.

Language Support:

The model supports multilingual text generation, including Chinese and English.

Technical Details

Architecture:

Wan 2.1 is built on the diffusion transformer paradigm with several innovative features:

  • 3D Variational Autoencoder (Wan-VAE): Enhances spatio-temporal compression and ensures temporal causality during video generation.
  • Video Diffusion DiT Framework: Uses Flow Matching with a T5 Encoder for text encoding and cross-attention layers embedded in transformer blocks.
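
To make the Flow Matching idea above concrete, here is a minimal sketch of one common formulation, the straight-line (linear interpolation) path between noise and data. The source does not say which variant Wan 2.1 uses, so the path convention, the function names `flow_matching_pair` and `euler_sample`, and the toy list-of-floats "latents" are all assumptions for illustration, not the model's actual implementation.

```python
from typing import Callable, List, Sequence, Tuple


def flow_matching_pair(
    data: Sequence[float], noise: Sequence[float], t: float
) -> Tuple[List[float], List[float]]:
    """Build one training example for linear-path flow matching.

    Assumed convention: t=0 is pure noise, t=1 is the data sample.
    Returns the interpolated point x_t and the velocity target the
    network would be trained to predict at (x_t, t).
    """
    x_t = [(1.0 - t) * n + t * d for n, d in zip(noise, data)]
    v_target = [d - n for n, d in zip(noise, data)]  # constant along the path
    return x_t, v_target


def euler_sample(
    velocity_fn: Callable[[List[float], float], List[float]],
    noise: Sequence[float],
    steps: int,
) -> List[float]:
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = list(noise)
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    data, noise = [1.0, 2.0], [0.0, 0.0]
    x_half, v = flow_matching_pair(data, noise, 0.5)
    print(x_half, v)
    # A "perfect" velocity field for this single pair recovers the data.
    print(euler_sample(lambda x, t: v, noise, steps=4))
```

In a real model the velocity function is the diffusion transformer, conditioned on the text embedding; the number of Euler steps here plays a role loosely comparable to a sampler's step count.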

Performance Metrics:

Wan 2.1 achieves a VBench score of 84.7%, excelling in dynamic scenes, spatial consistency, and aesthetics. It generates 1080p video at 30 FPS with realistic motion, thanks to its advanced space-time attention mechanism. As a leading open-source video generation model, it rivals proprietary alternatives like Sora, though those proprietary models may outperform it in certain areas.

Usage

Code Samples

The model is available on the AI/ML API platform as "Wan 2.1".

Params:
  • negative_prompt [str]: The negative prompt to use. Use it to describe details that you don't want in the video. These can be colors, objects, scenery, and even small details (e.g. moustache, blurry, low resolution).
  • seed [int]: Random seed for reproducibility. If None, a random seed is chosen.
  • aspect_ratio [9:16, 16:9]: Aspect ratio of the generated video.
  • inference_steps [int]: Number of inference steps for sampling. Higher values give better quality but take longer.
  • guidance_scale [number]: Classifier-free guidance scale. Controls the trade-off between prompt adherence and creativity.
  • shift [number]: Noise schedule shift parameter. Affects temporal dynamics.
  • sampler ['unipc', 'dpm+']: The sampler to use for generation.
  • enable_safety_checker [boolean]: If set to true, the safety checker will be enabled.
  • enable_prompt_expansion [boolean]: Whether to enable prompt expansion.
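
As a rough illustration of how the parameters above might be assembled into a request body, the sketch below validates the documented options and drops anything left unset. The parameter names and allowed values come from the list above; the `prompt` and `model` keys, the helper name `build_wan_request`, and the overall JSON shape are assumptions, so check the API documentation for the exact schema before relying on it.

```python
from typing import Any, Dict, Optional

# Allowed values copied from the parameter list above.
ALLOWED_ASPECT_RATIOS = {"9:16", "16:9"}
ALLOWED_SAMPLERS = {"unipc", "dpm+"}


def build_wan_request(
    prompt: str,
    *,
    negative_prompt: Optional[str] = None,
    seed: Optional[int] = None,
    aspect_ratio: Optional[str] = None,
    inference_steps: Optional[int] = None,
    guidance_scale: Optional[float] = None,
    shift: Optional[float] = None,
    sampler: Optional[str] = None,
    enable_safety_checker: Optional[bool] = None,
    enable_prompt_expansion: Optional[bool] = None,
    model: str = "Wan 2.1",  # assumption: the real API identifier may differ
) -> Dict[str, Any]:
    """Return a request body (hypothetical shape) for a video generation call."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if aspect_ratio is not None and aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ALLOWED_ASPECT_RATIOS)}")
    if sampler is not None and sampler not in ALLOWED_SAMPLERS:
        raise ValueError(f"sampler must be one of {sorted(ALLOWED_SAMPLERS)}")
    if inference_steps is not None and inference_steps < 1:
        raise ValueError("inference_steps must be a positive integer")

    optional = {
        "negative_prompt": negative_prompt,
        "seed": seed,
        "aspect_ratio": aspect_ratio,
        "inference_steps": inference_steps,
        "guidance_scale": guidance_scale,
        "shift": shift,
        "sampler": sampler,
        "enable_safety_checker": enable_safety_checker,
        "enable_prompt_expansion": enable_prompt_expansion,
    }
    payload: Dict[str, Any] = {"model": model, "prompt": prompt}
    # Only send parameters the caller actually set, so server defaults apply.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


if __name__ == "__main__":
    print(build_wan_request(
        "A red fox running through fresh snow",
        negative_prompt="blurry, low resolution",
        seed=42,
        aspect_ratio="16:9",
        sampler="unipc",
    ))
```

Fixing `seed` while varying one other parameter at a time is a simple way to see what `guidance_scale`, `shift`, or `inference_steps` change in the output.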
To get the generated video
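
Video generation is typically asynchronous: a request creates a job, and the client then checks on it until the video is ready. The sketch below shows that polling loop in isolation. The status strings (`"completed"`, `"failed"`), the `video_url` field, and the helper name `wait_for_video` are hypothetical; the actual HTTP call is passed in as `fetch_status` so that no endpoint or response format is assumed here.

```python
import time
from typing import Any, Callable, Dict


def wait_for_video(
    fetch_status: Callable[[], Dict[str, Any]],
    *,
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status() until the job finishes, fails, or times out.

    fetch_status is expected to perform the real API request and return
    the decoded JSON response as a dict with a "status" key (assumed).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch_status()
        status = result.get("status")
        if status == "completed":  # hypothetical terminal status
            return result
        if status == "failed":  # hypothetical error status
            raise RuntimeError(f"generation failed: {result}")
        if time.monotonic() >= deadline:
            raise TimeoutError("video was not ready before the timeout")
        sleep(interval_s)


if __name__ == "__main__":
    # Stand-in for real API responses, to show the control flow only.
    fake = iter([
        {"status": "queued"},
        {"status": "generating"},
        {"status": "completed", "video_url": "https://example.com/video.mp4"},
    ])
    done = wait_for_video(lambda: next(fake), interval_s=0.0)
    print(done["video_url"])
```

Injecting `fetch_status` and `sleep` keeps the loop easy to test and lets the same helper wrap whichever HTTP client the integration already uses.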
API Documentation

Detailed API Documentation is available here.

Ethical Guidelines

Alibaba emphasizes responsible usage of Wan 2.1 for ethical applications in content creation while discouraging misuse such as deepfake generation or inappropriate content creation.

Licensing

Wan 2.1 is licensed under Apache 2.0, allowing both commercial and research use with transparent terms.
