RISS Search - Thesis Details

Korean Abstract (Abstract)

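The fully convolutional time-domain speech separation network (Conv-TasNet) is used as a backbone model in many studies because of its structural strengths. In this thesis, we apply neural architecture search (NAS) to maximize the performance and efficiency of Conv-TasNet for speech. NAS is a branch of AutoML that automatically searches for an optimal model structure while minimizing human intervention. We first determine the candidate operations that define the NAS search space so that NAS can be applied to Conv-TasNet. Next, we apply a memory-efficient NAS to overcome the limitation that training Conv-TasNet consumes excessive GPU memory. We then find optimal separation module structures using gradient descent and reinforcement learning algorithms. We also observed that, when NAS is applied naively, the architecture parameters (the parameters of NAS itself) are updated unevenly across layers, so that in some layers the search proceeds effectively at random. We resolve this problem by introducing an auxiliary loss method suited to Conv-TasNet. The proposed auxiliary loss method not only mitigates the imbalance in architecture parameter updates but also improves the separation accuracy of the source separation model.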

Multilingual Abstract

The fully convolutional time-domain speech separation network
(Conv-TasNet) has been used as a backbone model in various studies
because of its structural strengths. In this study, we apply neural
architecture search (NAS) to maximize the performance and efficiency of
Conv-TasNet. NAS is a type of automated machine learning that
automatically searches for an optimal model structure with minimal human
intervention. We first introduce the candidate operations that define
the NAS search space for Conv-TasNet. In addition, we introduce a
low-cost NAS to overcome a limitation of the backbone model, which
requires a large amount of GPU memory for training. Next, we use two
search strategies, based on gradient descent and reinforcement learning,
to find optimized separation module structures. Furthermore, when NAS is
applied naively, the architecture parameters (the parameters of NAS
itself) are updated unevenly across layers. To balance the architecture
parameter updates across the entire model, we introduce an auxiliary
loss method suited to the Conv-TasNet architecture. We found that this
auxiliary loss technique both reduces the imbalance of architecture
parameter updates and improves separation accuracy.
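
The gradient-descent-based search mentioned in the abstract (listed as DARTS in the table of contents) relies on architecture parameters that weight a set of candidate operations. The following is a minimal sketch of that general idea only, not the thesis's implementation: the three candidate operations, the function names, and the plain-Python signal representation are all hypothetical and chosen for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical candidate operations acting on a 1-D signal (list of floats).
# The actual candidate operations used in the thesis are not given here.
def identity(x):
    return list(x)

def scale_half(x):
    return [0.5 * v for v in x]

def moving_average(x):
    # 3-tap average with the edge values repeated as padding
    padded = [x[0]] + list(x) + [x[-1]]
    return [(padded[i] + padded[i + 1] + padded[i + 2]) / 3.0
            for i in range(len(x))]

CANDIDATES = [identity, scale_half, moving_average]

def mixed_op(x, alpha):
    """Continuous relaxation: softmax(alpha)-weighted sum of all candidates."""
    weights = softmax(alpha)
    outputs = [op(x) for op in CANDIDATES]
    return [sum(w * out[i] for w, out in zip(weights, outputs))
            for i in range(len(x))]

def derive_op(alpha):
    """After the search, keep the candidate with the largest architecture parameter."""
    best = max(range(len(alpha)), key=lambda i: alpha[i])
    return CANDIDATES[best]
```

In this sketch, one `alpha` vector would exist per searchable layer. If some layers receive much weaker updates to `alpha` than others, `derive_op` in those layers picks a candidate almost arbitrarily, which is the kind of imbalance the abstract describes.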

Table of Contents

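Chapter 1 Introduction
Section 1 Need for the Research
Section 2 Research Content
Chapter 2 Related Work
Section 1 Speech Separation and Conv-TasNet
Section 2 Neural Architecture Search
Chapter 3 Design and Implementation
Section 1 Search Space for Speech Separation Models
Section 2 Search Strategy
3.2.1 Gradient-Descent-Based Neural Architecture Search Strategy (DARTS)
3.2.2 GPU-Memory-Efficient Neural Architecture Search
3.2.3 Search Objective Function for Model Size Optimization
3.2.4 Reinforcement-Learning-Based Neural Architecture Search Strategy
Section 3 Search Strategy
Chapter 4 Design and Implementation
Section 1 Experimental Environment
Section 2 Architecture Search Results
Section 3 Effect of the Auxiliary Loss
Chapter 5 Conclusion
References
ABSTRACT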

신경망 구조 탐색 알고리즘을 이용한 음성 분리 딥 러닝 모델 설계 자동화 = Speech separation deep learning model design automation via neural architecture search
