This thesis presents a new speech analysis/synthesis method that improves the excitation signal driving the speech synthesis filter by using Multiband Excitation Signals.
Traditional speech synthesis systems based on the analysis/synthesis method have generally used a simple voiced/unvoiced decision, with an excitation signal made up of white noise for an unvoiced frame or a periodic impulse train for a voiced frame. These excitation signals limit the quality and intelligibility of synthesized speech, because impulse signals alone cannot easily generate a spectrum similar to that of the original speech, which is composed of both harmonics and noise.
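The two-state excitation described above can be illustrated with a minimal sketch. The function name `binary_excitation`, its parameters, and the unit pulse amplitude are assumptions made for illustration only; they are not taken from the thesis.

```python
import random


def binary_excitation(voiced, pitch_period, frame_len, seed=0):
    """Return one frame of classical two-state excitation (illustrative).

    voiced: single voiced/unvoiced decision for the whole frame.
    pitch_period: pitch period in samples (used only for voiced frames).
    """
    if voiced:
        # Voiced frame: one unit pulse every pitch_period samples.
        return [1.0 if n % pitch_period == 0 else 0.0 for n in range(frame_len)]
    # Unvoiced frame: zero-mean white Gaussian noise.
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(frame_len)]
```

A voiced frame produced this way contains harmonics only and an unvoiced frame contains noise only, so neither can represent a frame in which both are present.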
To improve speech quality, this thesis proposes a scheme that raises intelligibility by restoring the noise information mixed into voiced frames. First, a voiced/unvoiced decision method for each frequency band within a frame is suggested. It allows the excitation signal for a particular speech segment to be a mixture of periodic (voiced) energy and noise-like (unvoiced) energy. Second, an analysis/synthesis system using Multiband Excitation Signals, generated from the voiced/unvoiced band information on the spectrum of each frame, is designed in order to synthesize a speech signal similar to the original speech signal.
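As a rough illustration of how per-band voiced/unvoiced information could shape a mixed excitation, consider the following sketch. It assumes equal-width bands, unit-amplitude components, and a sum-of-sinusoids approximation of noise in unvoiced bands; the name `multiband_excitation` and all parameters are hypothetical, and this is not the thesis's actual decision or synthesis procedure.

```python
import math
import random


def multiband_excitation(f0, band_voiced, fs=8000, frame_len=240, seed=0):
    """Sketch of a mixed excitation for one frame (illustrative only).

    f0: fundamental frequency in Hz.
    band_voiced: list of booleans, one per equal-width band from 0 to fs/2
                 (True = voiced band, False = unvoiced band).
    """
    rng = random.Random(seed)
    nyquist = fs / 2.0
    num_bands = len(band_voiced)
    band_width = nyquist / num_bands
    excitation = [0.0] * frame_len
    k = 1
    while k * f0 < nyquist:
        # Band that contains the k-th harmonic.
        band = min(int(k * f0 // band_width), num_bands - 1)
        if band_voiced[band]:
            # Voiced band: keep the harmonic exactly at k * f0, zero phase.
            freq, phase = k * f0, 0.0
        else:
            # Unvoiced band: use a randomly placed, randomly phased
            # component to approximate noise-like energy (an assumption).
            freq = rng.uniform(band * band_width, (band + 1) * band_width)
            phase = rng.uniform(0.0, 2.0 * math.pi)
        for n in range(frame_len):
            excitation[n] += math.cos(2.0 * math.pi * freq * n / fs + phase)
        k += 1
    return excitation
```

For example, `multiband_excitation(100, [True, True, False, False])` keeps the harmonics below 2 kHz periodic and makes the upper half of the spectrum noise-like, whereas setting every band to `True` reduces to a purely periodic excitation.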
The evaluation results show that the proposed approach can achieve more natural-sounding synthesized speech, with less "buzziness", than the traditional method.