We propose an encoder-decoder for open-vocabulary semantic segmentation comprising a hierarchical encoder-based cost map generation and a gradual fusion decoder. We introduce a category early ...
Abstract: Existing image aesthetics assessment methods mainly rely on the visual features of images but ignore their rich semantics. Nowadays, with the widespread application of social media, the ...
AI has transformed how we interact with technology, moving from traditional graphical interfaces to language-based, collaborative systems. To measure these new human-AI interactions, we developed ...
Abstract: Multi-modal data abounds in biomedicine, such as radiology images and reports. Interpreting this data at scale is essential for improving clinical care and accelerating clinical research.
Abstract: Existing object-level simultaneous localization and mapping (SLAM) methods often overlook the correspondence between semantic information and geometric features, resulting in a significant ...