Please use this identifier to cite or link to this item:
http://hdl.handle.net/1893/21733
Appears in Collections: | Computing Science and Mathematics eTheses |
Title: | Towards A Robust Arabic Speech Recognition System Based On Reservoir Computing |
Author(s): | Alalshekmubarak, Abdulrahman |
Supervisor(s): | Smith, Leslie; Graham, Bruce |
Keywords: | Reservoir computing; Speech recognition; Speech corpus; Arabic language |
Issue Date: | 21-Nov-2014 |
Publisher: | University of Stirling |
Abstract: | In this thesis we investigate the potential of developing a speech recognition system based on a recently introduced artificial neural network (ANN) technique, namely Reservoir Computing (RC). This technique has, in theory, a higher capability for modelling dynamic behaviour than feed-forward ANNs because of the recurrent connections between the nodes in the reservoir layer, which serves as a memory. We conduct this study on the Arabic language (one of the most spoken languages in the world and the official language in 26 countries) because there is a serious gap in the literature on speech recognition systems for Arabic, making the potential impact high. The investigation covers a variety of tasks, including the implementation of the first reservoir-based Arabic speech recognition system. In addition, a thorough evaluation of the developed system is conducted, including several comparisons to baseline models and to other state-of-the-art models found in the literature. The impact of feature extraction methods is studied in this work, and a new biologically inspired feature extraction technique, namely the Auditory Nerve feature, is applied to the speech recognition domain. Comparing different feature extraction methods requires access to the original recorded sound, which is not possible in the only publicly accessible Arabic corpus. We have therefore developed the largest public Arabic corpus for isolated words, which contains roughly 10,000 samples. Our investigation has led us to develop two novel approaches based on reservoir computing, ESNSVMs (Echo State Networks with Support Vector Machines) and ESNEKMs (Echo State Networks with Extreme Kernel Machines). These aim to improve the performance of the conventional RC approach by proposing different readout architectures. These two approaches have been compared to the conventional RC approach and other state-of-the-art systems. Finally, these developed approaches have been evaluated in the presence of different types and levels of noise to examine their resilience to noise, which is crucial for real-world applications. |
Type: | Thesis or Dissertation |
URI: | http://hdl.handle.net/1893/21733 |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Alalshekmubarak_November_2014.pdf | | 3.87 MB | Adobe PDF | View/Open |
This item is protected by original copyright
Items in the Repository are protected by copyright, with all rights reserved, unless otherwise indicated.
The metadata of the records in the Repository are available under the CC0 public domain dedication: No Rights Reserved https://creativecommons.org/publicdomain/zero/1.0/