InternVideo2.5 [Paper]

This repo provides the code and models of "InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling". InternVideo2.5 is a video multimodal large language model (MLLM, built upon InternVL2.5) enhanced with long and rich context (LRC) modeling. It significantly improves on existing MLLMs by strengthening their ability to perceive fine-grained details and to capture long-form temporal structure. This is achieved through dense vision task annotations using task preference optimization (TPO) and compact spatiotemporal representations via adaptive hierarchical token compression (HiCo).

Our experiments demonstrate substantial performance gains on mainstream short- and long-video understanding benchmarks. InternVideo2.5 can memorize video inputs at least 6x longer than the original model and exhibits specialized vision capabilities such as object tracking and segmentation. This work highlights the importance of rich multimodal context (in both length and detail) for an MLLM's focus and memory, offering valuable insights for future video MLLM research.
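To give a feel for what per-frame token compression buys in context length, here is a rough, generic illustration (spatial average pooling of a ViT patch grid, not the paper's HiCo algorithm; the function name and tensor shapes are illustrative assumptions). Reducing each frame to 16 tokens, as in the Model Zoo below, lets far more frames fit in the LLM context:

```python
# Generic illustration of per-frame visual token compression (NOT the HiCo
# implementation): adaptive average pooling of a ViT patch grid down to a
# 4x4 grid gives 16 tokens per frame, so 512 frames cost 8,192 visual tokens
# instead of ~131,072 with the raw 16x16 patch grid.
import torch
import torch.nn.functional as F

def compress_frame_tokens(frame_tokens, out_grid=4):
    """frame_tokens: (num_frames, grid*grid, dim) ViT patch embeddings."""
    n, p, d = frame_tokens.shape
    g = int(p ** 0.5)                                        # original grid side, e.g. 16
    x = frame_tokens.view(n, g, g, d).permute(0, 3, 1, 2)    # (n, d, g, g)
    x = F.adaptive_avg_pool2d(x, out_grid)                   # (n, d, out_grid, out_grid)
    return x.flatten(2).transpose(1, 2)                      # (n, out_grid*out_grid, d)

frames = torch.randn(512, 16 * 16, 1024)       # 512 frames, 256 patch tokens each
print(compress_frame_tokens(frames).shape)     # torch.Size([512, 16, 1024])
```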

Updates

[Demo videos: yoga-iv2.2.mp4, car-iv2.5.mp4, teach-install.mp4]

Model Zoo

| MLLM | Link | MVBench | Perception Test | LongVideoBench | MLVU | VideoMME | LVBench | #Tokens per frame | #Params |
|---|---|---|---|---|---|---|---|---|---|
| InternVideo2.5 | huggingface | 75.7 | 74.9 | 60.6 | 72.8 | 65.1 | 46.4 | 16 | 8B |
| InternVL2.5 + HiCo | huggingface | 74.0 | 71.4 | 59.6 | 71.5 | 64.9 | - | 16 | 8B |
| InternVL2.5 + HiCo | huggingface | 74.4 | 71.9 | 62.7 | 72.6 | 66.4 | - | 64 | 8B |
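As a minimal loading sketch with transformers: the repository id and the chat-style interface below are assumptions based on the usual OpenGVLab release pattern (the model card linked in the table is the authoritative reference for the exact id, video preprocessing, and inference API):

```python
# Hedged sketch: the repo id and the remote-code chat helper are assumptions;
# consult the Hugging Face model card linked above for the official usage.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "OpenGVLab/InternVideo2_5_Chat_8B"  # assumed repo id, verify on the HF page
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).cuda().eval()

# Video frames are sampled and preprocessed as described on the model card,
# then passed together with a question to the model's chat-style interface
# exposed via trust_remote_code.
```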

Citation

If this work is helpful for your research, please consider citing InternVideo2.5:

@article{wang2025internvideo,
  title={InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling},
  author={Wang, Yi and Li, Xinhao and Yan, Ziang and He, Yinan and Yu, Jiashuo and Zeng, Xiangyu and Wang, Chenting and Ma, Changlian and Huang, Haian and Gao, Jianfei and Dou, Min and Chen, Kai and Wang, Wenhai and Qiao, Yu and Wang, Yali and Wang, Limin},
  journal={arXiv preprint arXiv:2501.12386},
  year={2025}
}