A Comprehensive Review of the Technological Evolution of Large Multimodal Reasoning Models: From Modular Architectures to Native Reasoning Capabilities
This study systematically reviews and analyzes the technological development of Large Multimodal Reasoning Models (LMRMs). It outlines the evolution of the field from early modular, perception-driven architectures to unified, language-centric frameworks, and introduces the cutting-edge concept of Native Large Multimodal Reasoning Models (N-LMRMs). The paper constructs a structured roadmap for the development of multimodal reasoning, …