
Korean Abstract

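Multi-modal object segmentation that uses audio and visual information is currently an actively studied topic in computer vision. Audio-Visual Segmentation (AVS) is an audio-visual multi-modal object segmentation task in which audio information is used as an additional input so that only the sounding objects in the visual input are segmented at the pixel level. This technology matters for applications that must recognize objects accurately, such as robot perception and autonomous driving. When real-world data is collected, unwanted information is often included and noise frequently arises from causes such as mechanical defects, which can severely degrade the performance of an AVS model. This paper confirms that performance drops when noise is added to the audio and visual inputs, and therefore confirms the need for robust AVS research that can cope with such noise. Accordingly, this study adds a noise-removal network to reduce the performance degradation that occurs when noise is added.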


Multilingual Abstract

Multi-modal object segmentation using audio and visual information is an active research topic in computer vision. Audio-Visual Segmentation (AVS) is an audio-visual multi-modal object segmentation task that uses additional audio information to segment, at the pixel level, only the objects in the visual input that are producing sound. This technology is important for applications that require accurate object recognition, such as robot perception and autonomous driving. Data collected from the real world can include unwanted information, and noise can also arise from mechanical defects, either of which can significantly degrade the performance of an AVS model. In this paper, we confirm that adding noise to the audio and visual inputs reduces performance, and that robust AVS methods able to cope with such noise are needed. We therefore add a network that removes noise, which mitigates the performance degradation that occurs when noise is added.
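The abstract does not say what kind of noise was added or at what level. Purely as an illustration of how a noisy input might be simulated, the sketch below assumes zero-mean Gaussian noise scaled to a target signal-to-noise ratio (SNR). The function names `add_gaussian_noise` and `measured_snr_db`, the 10 dB setting, and the sine-wave stand-in for audio are all hypothetical choices, not details taken from the paper.

```python
import math
import random


def add_gaussian_noise(signal, snr_db, seed=0):
    """Return a copy of `signal` with zero-mean Gaussian noise at roughly `snr_db` dB.

    Assumption: Gaussian noise at a fixed SNR. The paper's actual noise model
    is not given in the abstract.
    """
    rng = random.Random(seed)
    # Mean power of the clean signal.
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR(dB) = 10 * log10(signal_power / noise_power), solved for noise_power.
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB between a clean signal and its noisy version."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10 * math.log10(signal_power / noise_power)


# Toy stand-in for an audio waveform: one second of a 440 Hz tone at 16 kHz.
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
noisy = add_gaussian_noise(clean, snr_db=10.0)
```

The same function could be applied to a flattened list of pixel intensities to perturb the visual input. A lower `snr_db` gives a stronger perturbation.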


Improved Performance of Multi-Modal Audio-Visual Segmentation with Noise (original Korean title: 노이즈가 추가된 입력에서 멀티 모달 오디오 비주얼 객체 분할 모델의 성능 개선)
