Year of publication: 1391 (Solar Hijri)
Published in: 20th Iranian Conference on Electrical Engineering
Number of pages: 4
Roohollah Abdipour – Audio & Speech Processing Lab, School of Computer Engineering, Iran University of Science & Technology, Iran
Ahmad Akbari –
Mohsen Rahmani – Computer Engineering Department, Faculty of Engineering, Arak University
Babak Nasersharif – Electrical and Computer Engineering Department, K.N.Toosi University of Technology
Ideal binary mask speech enhancement has been shown to improve both speech quality and speech intelligibility. However, this benefit depends heavily on accurately separating the speech and masker time-frequency units of the input spectrum, which is difficult in real situations. Ordinary binary mask methods are single-microphone methods and can therefore obtain little information from the environment. In this paper, we devise a two-microphone method that uses a classifier to distinguish speech-dominated from masker-dominated time-frequency units. The classifier uses easily computed two-microphone features, which make it suitable for real-time scenarios. With the proposed features, the classifier reaches classification accuracies near 80%. This accuracy in turn enables the ideal binary mask method to obtain higher SNRI and NPLR values than state-of-the-art noise reduction methods. These results indicate that the proposed two-microphone features carry substantial information for speech/masker separation.
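For readers unfamiliar with the term, the following minimal Python sketch illustrates what an ideal binary mask is: each time-frequency unit is kept when speech dominates and zeroed when the masker dominates. It is not the paper's method. The paper estimates the mask with a classifier driven by two-microphone features, which the abstract does not specify. The function names, the 0 dB threshold, and the toy magnitudes below are all assumptions made for illustration.

```python
import numpy as np

def ideal_binary_mask(speech_mag, masker_mag, threshold_db=0.0):
    """Return 1.0 for speech-dominated units and 0.0 for masker-dominated ones.

    Assumes the clean speech and masker magnitudes are known separately,
    which is why the mask is called "ideal". threshold_db is a hypothetical
    local SNR criterion, not a value taken from the paper.
    """
    eps = 1e-12  # guards against division by zero and log of zero
    local_snr_db = 20.0 * np.log10((speech_mag + eps) / (masker_mag + eps))
    return (local_snr_db > threshold_db).astype(float)

def apply_mask(mixture_mag, mask):
    """Zero out the masker-dominated units of the mixture spectrum."""
    return mixture_mag * mask

def mask_accuracy(estimated_mask, ideal_mask):
    """Fraction of time-frequency units on which two masks agree."""
    return float(np.mean(estimated_mask == ideal_mask))

# Toy magnitudes: 2 frequency bins x 3 frames (made-up numbers).
speech = np.array([[1.0, 0.1, 2.0],
                   [0.2, 3.0, 0.5]])
masker = np.array([[0.5, 1.0, 0.5],
                   [1.0, 0.5, 0.5]])

mask = ideal_binary_mask(speech, masker)
enhanced = apply_mask(speech + masker, mask)
```

In a practical system the separate speech and masker magnitudes are unavailable, so a classifier has to predict the mask. `mask_accuracy` shows one way a predicted mask could be scored against the ideal one, in the spirit of the classification accuracy quoted in the abstract.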