Abstract: Audio-visual target speaker extraction (AV-TSE) aims to extract the specific person's speech from the audio mixture given auxiliary visual cues. Previous methods usually search for the ...
Abstract: This paper introduces MAVAD, the first audio-visual dataset for traffic anomaly detection, taken from real-world scenes with a diverse range of illumination conditions. In addition, a ...