Video Understanding Articles

New Applications of LoRA: Dynamic Combination Without Training

2026-07-16 by boardor

Title: LoRA on the Go: Instance-level Dynamic LoRA Selection and Merging
Paper Link: https://arxiv.org/pdf/2511.07129
Innovations: For the first time, a combination of a Generative Mask and a Discriminative Mask is used, where the generative mask is applied to video data reconstruction tasks and the discriminative mask is used for video understanding tasks. Both share the same network … Read more

Integrating Visual Perception and Language Reasoning: A New Video Cognition Framework Based on Q-Former Heuristic Module!

2026-01-03 by boardor

Current video understanding models excel at recognizing “what happened,” but they fall short in high-level cognitive tasks such as causal reasoning and future prediction, a limitation that stems from their lack of common-sense world knowledge. To bridge … Read more

Axera Technology | Axera Tongyuan NPU Adaptation for Qwen2.5-VL-3B

2025-10-19 by boardor

Qwen2.5-VL: the new flagship vision-language model of Qwen and a significant leap from the previous Qwen2-VL. Axera Tongyuan: an AI computing processor that uses operators as its atomic instruction set. It efficiently supports mixed-precision algorithm design and Transformers, providing a strong foundation for large models (DeepSeek, Qwen, MiniCPM, etc.) in “cloud-edge-end” AI applications. https://www.axera-tech.com/Skill/166.html TLDR … Read more