ScalingCache: Extreme Acceleration of DiT through Difference Scaling and Dynamic Interval Caching

¹Zhejiang University, ²Kuaishou Technology
* Work done during an internship at Kling AI Infra, Kuaishou Technology
‡ Contributed equally to this work
† Corresponding author

Demos

~2.5x Speedup on Wan2.1 14B Text-to-Video Task

Prompt: A person is air drumming.

Original

ScalingCache

Prompt: A sleek motorcycle shreds through neon-lit alleyways, bullets sparking off dumpsters.

Original

ScalingCache

Prompt: CG animation digital art, a sleek and advanced robot standing in the bustling center of Times Square.

Original

ScalingCache

Prompt: Two anthropomorphic cats in comfy boxing gear and bright gloves fight intensely on a spotlighted stage.

Original

ScalingCache

Prompt: A cat wearing sunglasses and working as a lifeguard at a pool.

Original

ScalingCache

Abstract

Here is the abstract.

Method

Here is the method.


Text-to-Image Visual Quality Comparison

Comparison of generation results across different methods

Awesome Related Works


TaylorSeer: From Reusing to Forecasting: Accelerating Diffusion Models with TaylorSeers.
EasyCache: Training-Free Video Diffusion Acceleration via Runtime-Adaptive Caching.
TeaCache: Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model.

BibTeX

BibTeX Code Here