Publication year: 1391 (Solar Hijri)

Published in: 20th Iranian Conference on Electrical Engineering

Number of pages: 4

Author(s):

Roohollah Abdipour – Audio & Speech Processing Lab, School of Computer Engineering, Iran University of Science & Technology, Iran
Ahmad Akbari –
Mohsen Rahmani – Computer Engineering Department, Faculty of Engineering, Arak University
Babak Nasersharif – Electrical and Computer Engineering Department, K.N. Toosi University of Technology

Abstract:

Ideal binary mask speech enhancement has been shown to improve both speech quality and speech intelligibility. However, this benefit depends strongly on accurately separating the speech and masker time-frequency units of the input spectrum, which is difficult in real situations. Conventional binary mask methods are single-microphone methods and can therefore obtain little information from the environment. In this paper, we devise a two-microphone method that uses a classifier to distinguish speech-dominated from masker-dominated time-frequency units. The classifier uses easily computed two-microphone features, which allow it to be used in real-time scenarios. With the proposed features, the classifier reaches classification accuracies near 80%. This accuracy, in turn, enables the ideal binary mask method to obtain higher SNRI and NPLR values than state-of-the-art noise reduction methods. These results indicate that the proposed two-microphone features carry substantial information for speech/masker separation.
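
For readers unfamiliar with the term, the sketch below illustrates the general idea of an ideal binary mask, not the method of this paper. It assumes the oracle setting in which the speech and masker power are known separately for every time-frequency unit, and it labels a unit as speech-dominated when its local SNR exceeds a criterion (0 dB here, an assumed value). The function names `ideal_binary_mask`, `apply_mask`, and `mask_accuracy` are hypothetical. The abstract does not specify the two-microphone features or the classifier, so they are not modelled; `mask_accuracy` only shows one plausible way to score an estimated mask against the ideal one.

```python
import math


def ideal_binary_mask(speech_power, masker_power, lc_db=0.0):
    """Return 1 for speech-dominated units and 0 for masker-dominated ones.

    speech_power, masker_power: 2-D lists (frames x frequency bins) of
    non-negative power values, known separately (oracle assumption).
    lc_db: local SNR criterion in dB (0 dB is an assumed default).
    """
    mask = []
    for s_row, m_row in zip(speech_power, masker_power):
        row = []
        for s, m in zip(s_row, m_row):
            if s <= 0.0:
                row.append(0)  # no speech energy: masker-dominated
            elif m <= 0.0:
                row.append(1)  # no masker energy: speech-dominated
            else:
                snr_db = 10.0 * math.log10(s / m)
                row.append(1 if snr_db > lc_db else 0)
        mask.append(row)
    return mask


def apply_mask(mixture, mask):
    """Keep mixture units where the mask is 1 and zero out the rest."""
    return [[x * b for x, b in zip(x_row, b_row)]
            for x_row, b_row in zip(mixture, mask)]


def mask_accuracy(estimated, ideal):
    """Fraction of time-frequency units where the two masks agree."""
    total = sum(len(row) for row in ideal)
    hits = sum(e == i
               for e_row, i_row in zip(estimated, ideal)
               for e, i in zip(e_row, i_row))
    return hits / total
```

In a practical system the speech and masker powers are not available separately, which is why the mask has to be estimated, for example by a classifier as the abstract describes.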