Year of publication: 1391

Published in: 20th Iranian Conference on Electrical Engineering

Number of pages: 4

Author(s):

Mohammad Mohsen Goodarzi – Research Center for Intelligent Signal Processing (RCISP), Tehran, Iran
Farshad Almasganj – Research Center for Intelligent Signal Processing (RCISP), Tehran, Iran
Jahanshah Kabudian –
Yasser Shekofteh –

Abstract:

The goal of this paper is to configure a complete setup for continuous conversational telephony speech recognition in Persian. For this purpose, two common methods, the Gaussian Mixture Model (GMM) and the Neural Network (NN), and a proposed hybrid GMM-NN method have been considered for estimating full-bandwidth features from band-limited features. The performance of these methods was evaluated with two feature types, one spectral and one cepstral: LFBE and MFCC. The effect of speaker gender on the estimation process was also investigated. Our results showed that the best phoneme recognition accuracy is obtained when MFCC features are reconstructed using two gender-dependent neural networks. In this configuration, phoneme accuracy was about 1.6% higher than the baseline. The tests were carried out on the TFarsDat corpus.
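
The gender-dependent neural-network configuration described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not give the network architecture, feature dimensions, or training procedure, so the sketch uses a small one-hidden-layer network written in plain NumPy, 13-dimensional vectors standing in for MFCC frames, and synthetic data in place of TFarsDat. The function names (`train_mlp`, `predict`, `make_synthetic_pairs`) are hypothetical.

```python
import numpy as np


def train_mlp(x, y, hidden=16, epochs=800, lr=0.03, seed=0):
    """Fit a one-hidden-layer tanh network x -> y by batch gradient descent on MSE."""
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.1, (x.shape[1], hidden))
    b1 = np.zeros(hidden)
    w2 = rng.normal(0.0, 0.1, (hidden, y.shape[1]))
    b2 = np.zeros(y.shape[1])
    n = x.shape[0]
    for _ in range(epochs):
        h = np.tanh(x @ w1 + b1)                # hidden activations
        err = h @ w2 + b2 - y                   # prediction error
        grad_h = (err @ w2.T) * (1.0 - h ** 2)  # backprop through tanh
        w2 -= lr * (h.T @ err) / n
        b2 -= lr * err.mean(axis=0)
        w1 -= lr * (x.T @ grad_h) / n
        b1 -= lr * grad_h.mean(axis=0)
    return w1, b1, w2, b2


def predict(params, x):
    """Estimate full-bandwidth features from band-limited features."""
    w1, b1, w2, b2 = params
    return np.tanh(x @ w1 + b1) @ w2 + b2


def make_synthetic_pairs(n, dim, seed):
    """Synthetic (band-limited, full-bandwidth) pairs; NOT real speech data."""
    rng = np.random.default_rng(seed)
    mapping = rng.normal(0.0, 0.3, (dim, dim))  # assumed hidden relation
    narrow = rng.normal(0.0, 1.0, (n, dim))
    wide = narrow @ mapping + 0.1 * rng.normal(0.0, 1.0, (n, dim))
    return narrow, wide


def mse(a, b):
    return float(np.mean((a - b) ** 2))


# One network per speaker gender, each trained only on that gender's frames.
data = {
    "female": make_synthetic_pairs(400, 13, seed=1),
    "male": make_synthetic_pairs(400, 13, seed=2),
}
models = {g: train_mlp(x, y) for g, (x, y) in data.items()}

for gender, (x, y) in data.items():
    estimate = predict(models[gender], x)
    baseline = np.broadcast_to(y.mean(axis=0), y.shape)  # predict the mean frame
    print(gender, "network MSE:", mse(estimate, y), "mean-frame MSE:", mse(baseline, y))
```

At recognition time, a setup of this kind would route each utterance to the network matching the speaker's gender before passing the reconstructed features to the recognizer; how gender is determined is not stated in the abstract and is left out of the sketch.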