Hello there!
Welcome to Kunjun Li's website!
I'm a final-year undergraduate student at National University of Singapore (NUS). Currently, I am a research intern at Princeton University, supervised by Prof. Zhuang Liu. I also collaborate closely with Prof. Jenq-Neng Hwang from UW and Prof. Xinchao Wang from NUS.
My research focuses on Efficient Deep Learning, particularly optimizing the training and inference of LLMs, Diffusion, and Multimodal Models. I have been working on sparse attention, network pruning, and efficient architectures. My work strives to achieve computational breakthroughs, making deep learning affordable and accessible to everyone, everywhere.
🔥 News
📝 Publications

Memory-Efficient Visual Autoregressive Modeling with Scale-Aware KV Cache
NeurIPS 2025
Kunjun Li, Zigeng Chen, Cheng-Yen Yang, Jenq-Neng Hwang
- Scale-aware KV cache tailored to the next-scale prediction paradigm in VAR.
- Lossless compression while achieving 90% memory reduction (85 GB → 8.5 GB) and substantial speedup.
- Enables scaling VAR models to ultra-high resolutions such as 4K.
[paper]
[code]
[abstract]
Visual Autoregressive (VAR) modeling has garnered significant attention for its innovative next-scale prediction approach, which yields substantial improvements in efficiency, scalability, and zero-shot generalization. Nevertheless, the coarse-to-fine methodology inherent in VAR results in exponential growth of the KV cache during inference, causing considerable memory consumption and computational redundancy. To address these bottlenecks, we introduce ScaleKV, a novel KV cache compression framework tailored for VAR architectures. ScaleKV leverages two critical observations: varying cache demands across transformer layers and distinct attention patterns at different scales. Based on these insights, ScaleKV categorizes transformer layers into two functional groups: drafters and refiners. Drafters exhibit dispersed attention across multiple scales, thereby requiring greater cache capacity. Conversely, refiners focus attention on the current token map to process local details, consequently necessitating substantially reduced cache capacity. ScaleKV optimizes the multi-scale inference pipeline by identifying scale-specific drafters and refiners, facilitating differentiated cache management tailored to each scale. Evaluation on the state-of-the-art text-to-image VAR model family, Infinity, demonstrates that our approach effectively reduces the required KV cache memory to 10% while preserving pixel-level fidelity.
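For intuition, here is a minimal sketch of the drafter/refiner idea described above: layers whose attention is dispersed get a large KV budget, while locally focused layers get a small one. The dispersion score, median-split rule, and budget ratios below are illustrative placeholders, not the released ScaleKV implementation.

```python
# Illustrative sketch only: scale-aware per-layer KV cache budgeting.
# Scoring rule, split threshold, and budget ratios are hypothetical.
import torch

def attention_dispersion(attn: torch.Tensor) -> torch.Tensor:
    """Entropy of the attention distribution, averaged over heads and queries.

    attn: [heads, queries, keys] attention probabilities for one layer.
    Higher entropy = attention spread widely ("drafter"-like behavior).
    """
    entropy = -(attn * attn.clamp_min(1e-9).log()).sum(dim=-1)
    return entropy.mean()

def assign_cache_budgets(per_layer_attn, full_budget,
                         drafter_ratio=1.0, refiner_ratio=0.1):
    """Give dispersed-attention layers (drafters) a large KV budget and
    locally focused layers (refiners) a small one."""
    scores = torch.stack([attention_dispersion(a) for a in per_layer_attn])
    threshold = scores.median()          # simple split; placeholder rule
    return [int(full_budget * (drafter_ratio if s > threshold else refiner_ratio))
            for s in scores]

if __name__ == "__main__":
    torch.manual_seed(0)
    # Fake attention maps for 4 layers: [heads=8, queries=16, keys=256].
    # Small temperature => near-uniform (dispersed) attention.
    layers = [torch.softmax(torch.randn(8, 16, 256) * t, dim=-1)
              for t in (0.1, 0.1, 3.0, 3.0)]
    print(assign_cache_budgets(layers, full_budget=256))
```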

TinyFusion: Diffusion Transformers Learned Shallow
CVPR 2025 Highlighted Paper (3%)
Gongfan Fang*, Kunjun Li*, Xinyin Ma, Xinchao Wang (* equal contribution)
- End-to-end learnable depth pruning framework for Diffusion Transformers, retaining 50% of the depth and parameters.
- Achieves 2× faster inference with comparable performance.
- Trains Tiny DiTs at less than 7% of the original pre-training cost.
[paper]
[code]
[abstract]
Diffusion Transformers have demonstrated remarkable capabilities in image generation but often come with excessive parameterization, resulting in considerable inference overhead in real-world applications. In this work, we present TinyFusion, a depth pruning method designed to remove redundant layers from diffusion transformers via end-to-end learning. The core principle of our approach is to create a pruned model with high recoverability, allowing it to regain strong performance after fine-tuning. To accomplish this, we introduce a differentiable sampling technique to make pruning learnable, paired with a co-optimized parameter to simulate future fine-tuning. While prior works focus on minimizing loss or error after pruning, our method explicitly models and optimizes the post-fine-tuning performance of pruned models. Experimental results indicate that this learnable paradigm offers substantial benefits for layer pruning of diffusion transformers, surpassing existing importance-based and error-based methods. Additionally, TinyFusion exhibits strong generalization across diverse architectures, such as DiTs, MARs, and SiTs. Experiments with DiT-XL show that TinyFusion can craft a shallow diffusion transformer at less than 7% of the pre-training cost, achieving a 2× speedup with an FID score of 2.86, outperforming competitors with comparable efficiency.
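To illustrate the general mechanism of learnable depth pruning, the sketch below wraps each block in a differentiable keep/skip gate using a Gumbel-Softmax relaxation, so the decision of which layers to drop receives gradients. The gating scheme, sparsity penalty, and hyperparameters are assumptions for illustration, not TinyFusion's exact formulation.

```python
# Illustrative sketch: making "keep or drop this block" a learnable,
# differentiable decision. Not the paper's exact method.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedBlock(nn.Module):
    """Wraps a block with a 2-way (keep / skip) differentiable gate."""
    def __init__(self, block: nn.Module):
        super().__init__()
        self.block = block
        self.gate_logits = nn.Parameter(torch.zeros(2))  # [keep, skip]

    def forward(self, x, tau: float = 1.0):
        # Hard one-hot sample in the forward pass, straight-through gradient.
        gate = F.gumbel_softmax(self.gate_logits, tau=tau, hard=True)
        return gate[0] * self.block(x) + gate[1] * x      # skip = identity

if __name__ == "__main__":
    torch.manual_seed(0)
    # Stand-in "transformer" blocks; real DiT blocks would go here.
    blocks = nn.ModuleList(GatedBlock(nn.Linear(32, 32)) for _ in range(6))
    h = torch.randn(4, 32)
    for b in blocks:
        h = b(h)
    # A penalty on keep-probabilities pushes the model toward a target depth.
    keep_prob = torch.stack([F.softmax(b.gate_logits, dim=0)[0] for b in blocks])
    loss = h.pow(2).mean() + 0.1 * keep_prob.mean()
    loss.backward()
    print(keep_prob.detach())
```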

PixelGen: Rethinking Embedded Camera Systems for Mixed-Reality
ACM/IEEE IPSN 2024 Best Demonstration Runner-Up
Kunjun Li, Manoj Gulati, Dhairya Shah, Steven Waskito, Shantanu Chakrabarty and Ambuj Varshney
- Generates high-resolution RGB images from a monochrome camera and low-bandwidth sensor data.
- Novel representations of the surroundings from otherwise invisible signals.
[paper]
[code]
[abstract]
A confluence of advances in several fields has led to the emergence of mixed-reality headsets. They can enable us to interact with and visualize our environments in novel ways. Nonetheless, mixed-reality headsets are constrained today as their camera systems only capture a narrow part of the visible spectrum. Our environment contains rich information that cameras do not capture. It includes phenomena captured through sensors, electromagnetic fields beyond visible light, acoustic emissions, and magnetic fields. We demonstrate our ongoing work, PixelGen, to redesign cameras for low power consumption and to be able to visualize our environments in a novel manner, making some of the invisible phenomena visible. PixelGen combines low-bandwidth sensors with a monochrome camera to capture a rich representation of the world. This design choice ensures information is communicated energy-efficiently. This information is then combined with diffusion-based image models to generate unique representations of the environment, visualizing the otherwise invisible fields. We demonstrate that together with a mixed reality headset, it enables us to observe the world uniquely.
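As a rough illustration of the overall idea, one could condition an off-the-shelf image-to-image diffusion model on the monochrome capture while folding the low-bandwidth sensor readings into the text prompt. The sketch below uses Hugging Face diffusers; the checkpoint, file names, sensor fields, and prompt construction are hypothetical and not the PixelGen pipeline.

```python
# Hypothetical sketch: colorize/enrich a monochrome frame with a standard
# img2img diffusion model, conditioning the prompt on sensor readings.
# Assumes a CUDA GPU and the listed checkpoint; all names are placeholders.
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

mono = Image.open("capture_mono.png").convert("RGB")   # monochrome frame
sensors = {"temperature_c": 31.2, "humidity_pct": 78, "sound_db": 62}
prompt = ("photo of the same scene in color, "
          + ", ".join(f"{k} {v}" for k, v in sensors.items()))

# strength controls how far the model may drift from the monochrome input.
result = pipe(prompt=prompt, image=mono, strength=0.6).images[0]
result.save("capture_rgb.png")
```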