Meet CMMMU: A New Chinese Massive Multi-Discipline Multimodal Understanding Benchmark Designed to Evaluate Large Multimodal Models (LMMs)


In artificial intelligence, Large Multimodal Models (LMMs) have shown remarkable problem-solving capabilities across diverse tasks, such as zero-shot image/video classification, zero-shot image/video-text retrieval, and multimodal question answering (QA). However, recent studies highlight a substantial gap between powerful LMMs and expert-level artificial intelligence, particularly in tasks involving complex perception and reasoning with domain-specific knowledge. This paper aims to bridge this gap by introducing CMMMU, a Chinese benchmark designed to evaluate LMMs’ performance on an extensive array of multi-discipline tasks and to guide the development of bilingual LMMs towards expert-level artificial intelligence. CMMMU (Chinese Massive Multi-discipline Multimodal Understanding) stands out as one of the most comprehensive benchmarks (some examples are shown in Figure 2), comprising 12,000 manually collected Chinese […]
