http://chineseinput.net/에서 pinyin(병음)방식으로 중국어를 변환할 수 있습니다.
변환된 중국어를 복사하여 사용하시면 됩니다.
LVLN: 시각-언어 이동을 위한 랜드마크 기반의 심층 신경망 모델
황지수,김인철 한국정보처리학회 2019 정보처리학회 논문지 Vol.8 No.9
In this paper, we propose a novel deep neural network model for Vision-and-Language Navigation (VLN) named LVLN (Landmark- based VLN). In addition to both visual features extracted from input images and linguistic features extracted from the natural language instructions, this model makes use of information about places and landmark objects detected from images. The model also applies a context-based attention mechanism in order to associate each entity mentioned in the instruction, the corresponding region of interest (ROI) in the image, and the corresponding place and landmark object detected from the image with each other. Moreover, in order to improve the success rate of arriving the target goal, the model adopts a progress monitor module for checking substantial approach to the target goal. Conducting experiments with the Matterport3D simulator and the Room-to-Room (R2R) benchmark dataset, we demonstrate high performance of the proposed model.