This paper describes an experiment in speaker-independent automatic recognition of spoken Korean words using a Multi-Layered Perceptron.
The words in a sentence are recognized from phonemes, and the phonemes are obtained from frames classified by a neural network. The network takes as input a wide span of frames before and after the current frame, so that it can capture the changes in phoneme characteristics caused by continuous utterance. To reflect the characteristics of human hearing, the PLP cepstrum is also extracted as a feature and used to train and evaluate the neural networks.
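The idea of feeding the network a span of frames centered on the current frame can be sketched as follows. This is an illustrative reconstruction, not the paper's exact layout: the context width of 3 frames per side and the 13 PLP-cepstrum coefficients are assumptions for the example.

```python
import numpy as np

def stack_context(frames, context=3):
    """Stack `context` frames on each side of the current frame so the
    network sees how phoneme characteristics change across continuous
    speech. `frames` has shape (num_frames, num_features); the edges
    are padded by repeating the first/last frame."""
    padded = np.vstack([frames[:1]] * context + [frames] + [frames[-1:]] * context)
    stacked = [padded[i:i + 2 * context + 1].ravel()
               for i in range(len(frames))]
    return np.array(stacked)

# Example: 10 frames of 13 PLP-cepstrum coefficients each.
feats = np.random.randn(10, 13)
inputs = stack_context(feats, context=3)  # each row: 7 frames x 13 coeffs = 91 values
```

Each row of `inputs` then serves as one MLP input vector, combining the current frame with its left and right context.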
The speech data consisted of proverbs, each produced five times; five of the utterances were used for training and the others for recognition. At the training stage, the algorithm uses features such as the zero-crossing rate, short-term energy, and either PARCOR coefficients or PLP coefficients. These features are extracted from speech samples selected with a sliding 25.6 msec window moved in 3 msec steps, then interleaved and combined into 7 sets of parameters covering 171 msec of speech for use as neural-network inputs. Performance is compared when either PARCOR or auditory-like PLP is included in the feature set.
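The framing and the two simple time-domain features named above (zero-crossing rate and short-term energy) can be sketched as below. The sample rate of 10 kHz is an assumption not stated in the text; it is chosen here only because it makes the 25.6 msec window exactly 256 samples.

```python
import numpy as np

def frame_signal(signal, sample_rate=10000, win_ms=25.6, hop_ms=3.0):
    """Slice a speech signal into overlapping analysis windows:
    a 25.6 msec window slid in 3 msec steps, as described in the text.
    (sample_rate=10000 is assumed, giving a 256-sample window.)"""
    win = int(round(sample_rate * win_ms / 1000.0))  # 256 samples
    hop = int(round(sample_rate * hop_ms / 1000.0))  # 30 samples
    n = 1 + max(0, (len(signal) - win) // hop)
    return np.stack([signal[i * hop: i * hop + win] for i in range(n)])

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    return float(np.mean(np.signbit(frame[:-1]) != np.signbit(frame[1:])))

def short_term_energy(frame):
    """Sum of squared samples within one analysis window."""
    return float(np.sum(frame ** 2))
```

PARCOR or PLP coefficients would be computed per window in the same loop; they are omitted here because their extraction involves full LPC/PLP analysis beyond the scope of this sketch.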