A Study on Speaker Adaptation for a Continuous Speech Recognition System Using HMM
심장엽, 김상범, 김주성, 김수훈, 이영재, 이종진, 허강인 Dong-A University, College of Engineering, Korea Resources Development Research Institute 1995 硏究報告 (Research Report) Vol.19 No.2
It is hard to collect sufficient speech data from a single speaker to train a speaker-dependent (SD) model. In contrast, training a speaker-independent (SI) model does not require a large amount of speech data per speaker, but it does require data from many speakers. Speaker adaptation (SA) is an additional training technique that starts from an SI model and uses a small amount of adaptation speech. It has proved to be a powerful tool for achieving good recognition performance without the high cost of SD training. This study introduces a speaker adaptation algorithm based on MAPE (maximum a posteriori probability estimation) that trains the model sequentially, utterance by utterance, without hand-labelling. Labelling is instead performed automatically by concatenation training and Viterbi segmentation, and the sequential training is performed by MAPE. The model can be adapted using any small amount of adaptation speech data. For continuous speech from newspaper editorials, the recognition rate of the adapted HMM was 62.5%, approximately a 32.5% improvement over that of the unadapted HMM.
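As a rough illustration of the sequential MAP training described in the abstract above, the sketch below updates only the mean of a single Gaussian, one utterance at a time. It is not the authors' algorithm: the class name `MapMean`, the prior weight `tau`, and the assumption that every frame has already been assigned to this Gaussian (for example by Viterbi segmentation) are all hypothetical simplifications.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class MapMean:
    """MAP estimate of one Gaussian mean (hypothetical helper, not from the paper)."""
    mean: List[float]  # current estimate, initialised from the SI model
    tau: float         # prior weight (pseudo-count); an assumed tuning value

    def update(self, frames: Sequence[Sequence[float]]) -> None:
        """Fold the frames of one utterance into the estimate."""
        n = len(frames)
        if n == 0:
            return
        dim = len(self.mean)
        # Per-dimension sum of the adaptation frames.
        sums = [sum(frame[d] for frame in frames) for d in range(dim)]
        # Weighted average of the prior mean and the new data.
        self.mean = [
            (self.tau * self.mean[d] + sums[d]) / (self.tau + n)
            for d in range(dim)
        ]
        # The posterior becomes the prior for the next utterance.
        self.tau += n


si_mean = MapMean(mean=[0.0, 0.0], tau=2.0)
si_mean.update([[2.0, 4.0], [2.0, 4.0]])   # first adaptation utterance
si_mean.update([[3.0, 6.0]] * 4)           # second adaptation utterance
```

With little adaptation data the estimate stays close to the SI mean, and it moves toward the new speaker's data as more utterances arrive. Because the prior weight grows by the number of frames seen, updating utterance by utterance gives the same mean as a single batch update over all the frames.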
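심장엽, 김상범, 김주성, 김수훈, 이영재, 허강인 Dong-A University, Institute of Information and Communication 1996 情報通信硏究所論文誌 (Journal of the Institute of Information and Communication) Vol.4 No.1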
This paper studies the construction of a real-time continuous speech recognition system for a man-machine interface and examines whether it can be applied to automated systems. The system is built from continuous-distribution HMMs and the One Pass DP algorithm. It detects the start and end points of speech sampled by a 10 kHz, 8-bit A/D converter in real time, recognizes the speech with the One Pass DP method, displays the recognition result on a PC monitor, and at the same time sends control data to the interface. The HMMs are trained on continuous speech samples consisting of control words, area names, and digit sounds. In experiments, the system produced some insertion, substitution, and deletion errors of a single syllable, but the results indicate that it could be applied to a man-machine interface for automated systems if the recognition output is post-processed.
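The abstract above does not say how the start and end points are detected. One common approach is a short-time energy threshold, sketched below under that assumption. The function name `detect_endpoints`, the 100-sample frame (10 ms at the 10 kHz rate mentioned above), the threshold, and the minimum run length are all hypothetical values chosen for illustration.

```python
from typing import Optional, Sequence, Tuple


def detect_endpoints(
    samples: Sequence[float],
    frame_len: int = 100,      # assumed: 10 ms frames at 10 kHz
    threshold: float = 100.0,  # assumed energy threshold
    min_frames: int = 3,       # assumed minimum run of active frames
) -> Optional[Tuple[int, int]]:
    """Return (start, end) sample indices of speech, or None if none is found."""
    # Mean squared amplitude of each non-overlapping frame.
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(s * s for s in frame) / frame_len)
    active = [e > threshold for e in energies]

    # Start point: first run of min_frames consecutive active frames.
    first = None
    for i in range(len(active) - min_frames + 1):
        if all(active[i:i + min_frames]):
            first = i
            break
    if first is None:
        return None

    # End point: last run of min_frames consecutive active frames.
    last = first + min_frames - 1
    for i in range(len(active) - min_frames, -1, -1):
        if all(active[i:i + min_frames]):
            last = i + min_frames - 1
            break
    return first * frame_len, (last + 1) * frame_len


# Synthetic signal: silence, a constant-amplitude burst, then silence again.
signal = [0] * 500 + [50] * 1000 + [0] * 500
endpoints = detect_endpoints(signal)
```

Requiring a run of several active frames keeps isolated clicks from being taken as the start or end of speech. A real-time system would apply the same test frame by frame as samples arrive rather than over a stored buffer.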
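심장엽, 안종영, 김상범, 허강인 Dong-A University, Institute of Information and Communication 1995 情報通信硏究所論文誌 (Journal of the Institute of Information and Communication) Vol.3 No.1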
This paper studies Korean speech synthesis using a formant vocoder. The synthesis parameters required by the vocoder are the formant frequencies obtained by the spectrum moment method, the pitch frequency obtained by the optimum comb method, the short-time average energy, the short-time average amplitude, the bandwidth, the excitation waveform, and Gaussian white noise. The synthesis experiments used speech data consisting of syllables, words, and sentences spoken by a male speaker. In listening tests, the best results were obtained when Rosenberg's glottal waveform and a triangular wave were used as the excitation source for voiced and unvoiced sounds, respectively.
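To make the excitation source concrete, the sketch below generates one pitch period of a Rosenberg-style glottal pulse. The abstract does not say which of Rosenberg's pulse shapes was used, so the widely cited trigonometric form is assumed here. The function name `rosenberg_pulse` and the opening, closing, and period lengths are hypothetical.

```python
import math
from typing import List


def rosenberg_pulse(n_open: int, n_close: int, period: int) -> List[float]:
    """One pitch period of a trigonometric Rosenberg glottal pulse.

    n_open  : samples in the rising (opening) phase
    n_close : samples in the falling (closing) phase
    period  : total samples in the pitch period
    """
    if n_open <= 0 or n_close <= 0 or n_open + n_close > period:
        raise ValueError("need n_open, n_close > 0 and n_open + n_close <= period")
    pulse = []
    for n in range(period):
        if n <= n_open:
            # Opening phase: raised-cosine rise from 0 to 1.
            value = 0.5 * (1.0 - math.cos(math.pi * n / n_open))
        elif n <= n_open + n_close:
            # Closing phase: quarter-cosine fall from 1 to 0.
            value = math.cos(math.pi * (n - n_open) / (2.0 * n_close))
        else:
            # Closed phase: no glottal flow.
            value = 0.0
        pulse.append(value)
    return pulse


# Example: a 100-sample period is a 100 Hz pitch at an assumed 10 kHz sample rate.
one_period = rosenberg_pulse(n_open=40, n_close=16, period=100)
```

Repeating such a period at the estimated pitch gives a voiced excitation signal, which a formant vocoder then shapes with resonators tuned to the formant frequencies and bandwidths.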