This repository is the official implementation of CMFS, a unified framework that leverages CLIP-guided modality interaction to mitigate noise in multi-modal image fusion and segmentation.
Abstract: RGB-T tracking aims to effectively leverage the complementary strengths of the visual (RGB) and thermal infrared (TIR) modalities to achieve robust tracking performance in various scenarios. Existing RGB-T ...
Research has focused on Multi-Modal Semantic Segmentation (MMSS), where pixel-wise predictions are derived from multiple visual modalities captured by diverse sensors. Recently, the large vision model ...
Affiliations: (1) Capital University of Physical Education and Sports, Beijing, China; (2) Department of Physical Education, Hanyang University, Seoul, Republic of Korea. In the realm of obesity and overweight, the risk ...
Abstract: The goal of mixed-modality clustering, which differs from typical multi-modality/view clustering, is to divide samples derived from various modalities into several clusters. This task has to ...