S.A. Selouani, D. O’Shaughnessy, and J. Caelen
Speech recognition, Genetic Algorithms, Hidden Markov Models, eigen-decomposition, distinctive cues, noise removal
The reliability of automatic speech recognition (ASR) systems is closely related to the parameterization process, which is expected to accurately characterize the phonetic, dynamic and static components of speech. For this purpose, ASR methods build speech sound models from large speech corpora that attempt to include the common sources of variability that may occur in real-life conditions. Nevertheless, not all variabilities can reasonably be covered. For that reason, the performance of current ASR systems, whose designs are predicated on relatively noise-free conditions, degrades rapidly under severe adverse conditions. To cope with mismatched (adverse) conditions and to achieve noise robustness, we present in this paper an original approach that operates in two steps. The first step integrates into the front-end process, alongside mean-subtracted mel-frequency cepstral coefficients, acoustic distinctive features that provide a more convenient interface to the higher-level components of ASR systems. The second step combines subspace filtering and Genetic Algorithms to obtain less-variant parameters. The advantages of this approach are that no estimation of the noise is required and that the recognition system itself is not modified. The effectiveness of the method is assessed in highly interfering car noise by using a noisy subset of the TIMIT database. The results show that the proposed method drastically reduces the word error rate over a wide range of signal-to-noise ratios.
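As a rough illustration of two ingredients named in the abstract, the sketch below shows cepstral mean subtraction and a plain eigen-decomposition (subspace) filter applied to a matrix of cepstral frames. It is not the authors' implementation: the function names `mean_subtract` and `subspace_filter`, the `n_components` parameter, and the use of simple truncation to the leading eigenvectors are assumptions made here for illustration, and the Genetic Algorithm step and the distinctive-feature extraction are not shown. The input is assumed to be an array of shape (frames, coefficients) that has already been computed elsewhere.

```python
import numpy as np


def mean_subtract(features):
    """Cepstral mean subtraction: remove each coefficient's mean over all frames."""
    return features - features.mean(axis=0, keepdims=True)


def subspace_filter(features, n_components):
    """Keep only the part of each frame lying in the leading eigen-subspace.

    Hypothetical helper: a plain PCA-style truncation, used here as a
    stand-in for the subspace filtering mentioned in the abstract.
    """
    centered = features - features.mean(axis=0, keepdims=True)
    # Covariance across coefficients (columns are variables).
    cov = np.cov(centered, rowvar=False)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    basis = eigvecs[:, order]
    # Project onto the retained subspace and map back to the original space.
    return centered @ basis @ basis.T
```

Under these assumptions, a call such as `subspace_filter(mean_subtract(frames), 4)` returns an array of the same shape as `frames` whose rows lie in a four-dimensional subspace. In the approach described above, the choice of how the components are weighted is what the Genetic Algorithm would be expected to optimize, rather than a fixed truncation.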