Design of an M-Channel IIR Cosine-Modulated Filter Bank and Its Application to Acoustic Echo Cancellation
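Kim, Sang-Gyun; Yoo, Chang-Dong. Institute of Electronics Engineers of Korea (대한전자공학회) 2002 Journal of the Institute of Electronics Engineers of Korea - SP (Signal Processing) (電子工學會論文誌-SP) Vol.39 No.5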
In this paper, a novel method is proposed for designing an M-channel, causal, stable IIR cosine-modulated filter bank (CMFB) with the nearly perfect reconstruction (near-PR) property. The IIR prototype filter is designed under a simple constraint using a lattice structure with first-order allpass filter components. The resulting prototype filter has higher stopband attenuation and a sharper roll-off than one designed by the previously proposed method with similar complexity. The M-channel IIR CMFB built from this prototype filter is applied to a subband acoustic echo canceller (AEC). The subband AEC achieved up to about 15 dB higher echo return loss enhancement (ERLE) than an M-channel FIR subband AEC of similar complexity.
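As a rough illustration of the ERLE figure quoted above, the sketch below computes echo return loss enhancement in dB as the ratio of echo power at the microphone to the residual power after cancellation. The function name `erle_db` and the sample signals are assumptions made for illustration; they are not taken from the paper.

```python
import math

def erle_db(echo, residual):
    """ERLE in dB: 10*log10(mean echo power / mean residual power).

    `echo` is the echo signal picked up by the microphone and `residual`
    is what remains after the canceller (both hypothetical sample lists).
    """
    if not echo or len(echo) != len(residual):
        raise ValueError("signals must be non-empty and of equal length")
    p_echo = sum(x * x for x in echo) / len(echo)
    p_res = sum(x * x for x in residual) / len(residual)
    if p_res == 0.0:
        return math.inf   # perfect cancellation
    if p_echo == 0.0:
        return -math.inf  # no echo to cancel, but a residual remains
    return 10.0 * math.log10(p_echo / p_res)

# A residual with one tenth of the echo amplitude has 1/100 of its power.
echo = [1.0, -1.0, 1.0, -1.0]
residual = [0.1, -0.1, 0.1, -0.1]
print(round(erle_db(echo, residual), 6))
```

In practice ERLE is usually tracked over short frames rather than over the whole signal, so a 15 dB gain means the residual power in a frame is roughly 32 times lower.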
Lee, Sun-Il; Yoo, Chang-Dong. Institute of Electronics Engineers of Korea (대한전자공학회) 2007 IEEK Conference (대한전자공학회 학술대회) Vol.2007 No.11
In this paper, a robust video fingerprinting method based on boosting is proposed. Given a set of training data, boosting learns a binarization method that converts a real-valued fingerprint into a robust, pairwise independent binary fingerprint suitable for efficient database search. Experimental results show that the proposed method is robust against various common video processing steps and outperforms conventional video fingerprinting methods.
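To show why a binary fingerprint suits database search, here is a minimal sketch that packs thresholded features into an integer and matches by Hamming distance. The fixed thresholds stand in for the binarization that the paper learns with boosting, and all names (`binarize`, `hamming_distance`, `nearest`) are hypothetical.

```python
def binarize(features, thresholds):
    """Pack one bit per feature: 1 if the feature exceeds its threshold.

    Assumption: fixed thresholds replace the boosting-learned binarization.
    """
    bits = 0
    for value, threshold in zip(features, thresholds):
        bits = (bits << 1) | (1 if value > threshold else 0)
    return bits

def hamming_distance(a, b):
    """Number of bit positions in which two packed fingerprints differ."""
    return bin(a ^ b).count("1")

def nearest(query, database):
    """Return the key of the stored fingerprint closest to the query."""
    return min(database, key=lambda k: hamming_distance(query, database[k]))

query = binarize([0.9, 0.1, 0.7, 0.2], [0.5, 0.5, 0.5, 0.5])
database = {"clip_a": 0b0101, "clip_b": 0b1000}
print(nearest(query, database))
```

Comparing packed bits needs only an XOR and a bit count per candidate, which is what makes large-scale lookup cheap.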
Kim, Sang-Gyun; Yoo, Chang-Dong. Institute of Electronics Engineers of Korea (대한전자공학회) 2007 IEEK Conference (대한전자공학회 학술대회) Vol.2007 No.11
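This paper proposes an algorithm for separating source signals with diverse distributions from underdetermined mixtures. Based on a parametric generalized Gaussian distribution that models the probability distribution of the sources, the nullspace component of the sources is estimated so that the source estimate has minimum mean-square error. The parameters of the generalized Gaussian distribution are estimated with the EM algorithm. The mixing matrix is likewise estimated from mixture signals in regions detected by a newly proposed single-source detection algorithm. Experiments confirm that, compared with existing algorithms, the proposed algorithm estimates the mixing matrix more accurately and separates sources with diverse distributions, including speech and audio signals, with a higher signal-to-interference ratio.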
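For reference, the generalized Gaussian density mentioned above can be evaluated as in the sketch below. It uses the standard parameterization p(x) = beta / (2 alpha Gamma(1/beta)) exp(-(|x - mu| / alpha)^beta); whether the paper uses exactly this parameterization is an assumption, and the function name `generalized_gaussian_pdf` is hypothetical.

```python
import math

def generalized_gaussian_pdf(x, mu=0.0, alpha=1.0, beta=2.0):
    """Density of the generalized Gaussian with scale alpha and shape beta.

    beta = 2 gives a Gaussian, beta = 1 a Laplacian; smaller beta gives
    heavier tails, which suits sparse sources such as speech.
    """
    if alpha <= 0.0 or beta <= 0.0:
        raise ValueError("alpha and beta must be positive")
    norm = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return norm * math.exp(-((abs(x - mu) / alpha) ** beta))

# beta = 2 with alpha = sqrt(2) reproduces the standard normal density.
print(generalized_gaussian_pdf(0.0, alpha=math.sqrt(2.0), beta=2.0))
# beta = 1 with alpha = 1 reproduces the unit Laplacian density.
print(generalized_gaussian_pdf(0.0, alpha=1.0, beta=1.0))
```

A single shape parameter lets one family cover both speech-like (peaky) and more Gaussian-like sources, which is why it fits the "diverse distributions" setting.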
Kwon, Suk-Bong; Yun, Sung-Rack; Jang, Gyu-Cheol; Kim, Yong-Rae; Kim, Bong-Wan; Kim, Hoi-Rin; Yoo, Chang-Dong; Lee, Yong-Ju; Kwon, Oh-Wook. The Korean Society of Phonetic Sciences and Speech Technology (대한음성학회) 2006 Malsori (말소리) Vol.59 No.-
We report the evaluation results of the Korean speech recognition platform called ECHOS. The platform has an object-oriented and reusable architecture so that researchers can easily evaluate their own algorithms. The platform has all the modules needed to build a large-vocabulary speech recognizer: noise reduction, end-point detection, feature extraction, hidden Markov model (HMM)-based acoustic modeling, cross-word modeling, n-gram language modeling, n-best search, word graph generation, and Korean-specific language processing. The platform supports both lexical search trees and finite-state networks. It performs word-dependent n-best search with a bigram in the forward search stage, and rescores the lattice with a trigram in the backward stage. In an 8000-word continuous speech recognition task, the platform with a lexical tree increases word errors by 40% but reduces recognition time by 50% compared to the HTK platform with a flat lexicon. ECHOS reduces recognition errors by 40% through the incorporation of cross-word modeling. With the number of Gaussian mixtures increased to 16, it yields word accuracy comparable to the previous lexical tree-based platform, Julius.
Jang, Gyu-Cheol; 우수영; Yoo, Chang-Dong. The Korean Association of Speech Sciences (한국음성과학회) 2002 Speech Sciences (음성과학) Vol.9 No.3
A simple algorithm for discriminating voiced sounds in speech is proposed. In addition to low-frequency energy and zero-crossing rate (ZCR), both of which have been widely used in the past for identifying voiced sounds, the proposed algorithm incorporates pitch variation to improve the discrimination rate. Evaluation on the TIMIT corpus shows a 13% improvement in the discrimination of voiced phonemes over the traditional algorithm that uses only energy and ZCR.
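The two traditional features named above can be sketched as follows. This covers only the energy and ZCR baseline, not the pitch-variation feature the paper adds. The one-pole low-pass filter, the thresholds, and the function names are assumptions chosen for illustration.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        raise ValueError("frame needs at least two samples")
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def low_frequency_energy(frame, alpha=0.1):
    """Mean power after a one-pole low-pass y[n] = y[n-1] + alpha*(x[n] - y[n-1]).

    Assumption: this simple smoother stands in for a proper low-pass filter.
    """
    y, total = 0.0, 0.0
    for x in frame:
        y += alpha * (x - y)
        total += y * y
    return total / len(frame)

def looks_voiced(frame, zcr_max=0.1, energy_min=0.01):
    """Baseline rule: voiced frames have low ZCR and high low-frequency energy."""
    return zero_crossing_rate(frame) <= zcr_max and low_frequency_energy(frame) >= energy_min

# A 100 Hz tone sampled at 16 kHz (25 ms frame) versus a sign-alternating frame.
tone = [math.sin(2 * math.pi * 100 * n / 16000) for n in range(400)]
noisy = [1.0 if n % 2 == 0 else -1.0 for n in range(400)]
print(looks_voiced(tone), looks_voiced(noisy))
```

Voiced speech concentrates its energy at low frequencies and crosses zero rarely, while fricative-like frames do the opposite, which is why these two features separate them reasonably well even before pitch information is added.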