Novel Computationally Intelligent Machine Learning Algorithms for Data Mining and Knowledge Discovery

Gheyas, Iffat A.

Please use this identifier to cite or link to this item: http://hdl.handle.net/1893/2152

Appears in Collections:	Computing Science and Mathematics eTheses
Title:	Novel Computationally Intelligent Machine Learning Algorithms for Data Mining and Knowledge Discovery
Author(s):	Gheyas, Iffat A.
Supervisor(s):	Smith, Leslie S.
Keywords:	Feature Subset Selection; Missing Value Imputation; Single Imputation; Multiple Imputation; Dimensionality Reduction; Time Series Forecasting; Curse of Dimensionality; Neural Networks; Evolutionary Algorithm
Issue Date:	24-Nov-2009
Publisher:	University of Stirling
Citation:	N/A
Abstract:	This thesis addresses three major issues in data mining: feature subset selection in high-dimensional domains, plausible reconstruction of incomplete data in cross-sectional applications, and forecasting of univariate time series. For the automated selection of an optimal subset of features in real time, we present an improved hybrid algorithm, SAGA. SAGA combines the ability of Simulated Annealing to avoid being trapped in local minima with the very high convergence rate of the crossover operator of Genetic Algorithms, the strong local search ability of greedy algorithms, and the high computational efficiency of Generalized Regression Neural Networks (GRNNs). For imputing missing values and forecasting univariate time series, we propose a homogeneous neural network ensemble. The proposed ensemble consists of a committee of GRNNs trained on different subsets of features generated by SAGA, and the predictions of the base classifiers are combined by a fusion rule. This approach makes it possible to discover all important interrelations between the values of the target variable and the input features. The proposed ensemble scheme has two innovative features that make it stand out among ensemble learning algorithms: (1) the ensemble makeup is optimized automatically by SAGA; and (2) GRNN is used for both the base classifiers and the top-level combiner classifier. Because it uses GRNN, the proposed ensemble is a dynamic weighting scheme. This is in contrast to existing ensemble approaches, which rely on simple voting or static weighting strategies. The basic idea of the dynamic weighting procedure is to give a higher reliability weight to those scenarios that are similar to the new one. The simulation results demonstrate the validity of the proposed ensemble model.
Type:	Thesis or Dissertation
URI:	http://hdl.handle.net/1893/2152
Affiliation:	School of Natural Sciences Computing Science and Mathematics
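
Illustrative note: the abstract describes a committee of GRNNs trained on different feature subsets, with a further GRNN as the top-level combiner. The sketch below is one possible reading of that arrangement, not the thesis implementation. It assumes the standard GRNN formulation (a Gaussian-kernel-weighted average of the training targets). The feature subsets are hand-picked for illustration rather than generated by SAGA, and the function names (grnn_predict, ensemble_predict), the leave-one-out step, and the smoothing parameter sigma are all hypothetical choices.

```python
import math


def grnn_predict(train_x, train_y, query, sigma=0.5):
    """Standard GRNN output: Gaussian-kernel-weighted mean of training targets."""
    weights = []
    for x in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # Query is too far from every stored pattern: fall back to the plain mean.
        return sum(train_y) / len(train_y)
    return sum(w * y for w, y in zip(weights, train_y)) / total


def select(row, subset):
    """Keep only the features whose indices are listed in `subset`."""
    return [row[i] for i in subset]


def ensemble_predict(train_x, train_y, query, subsets, sigma=0.5):
    """Base GRNNs on feature subsets, combined by a second GRNN (assumed design)."""
    n = len(train_x)
    # Meta-features: leave-one-out base predictions for each training row,
    # so a row never contributes to its own base prediction.
    meta_x = []
    for i in range(n):
        rest_x = train_x[:i] + train_x[i + 1:]
        rest_y = train_y[:i] + train_y[i + 1:]
        meta_x.append([
            grnn_predict([select(r, s) for r in rest_x], rest_y,
                         select(train_x[i], s), sigma)
            for s in subsets
        ])
    # Base predictions for the new case, one per feature subset.
    meta_query = [
        grnn_predict([select(r, s) for r in train_x], train_y,
                     select(query, s), sigma)
        for s in subsets
    ]
    # Combiner GRNN: training cases whose base predictions resemble the new
    # case's base predictions receive larger weights (dynamic weighting).
    return grnn_predict(meta_x, train_y, meta_query, sigma)


# Toy data (made up for illustration): the target is the sum of two features.
toy_x = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_y = [0.0, 1.0, 1.0, 2.0]
toy_subsets = [[0], [1], [0, 1]]  # hand-picked, not produced by SAGA
prediction = ensemble_predict(toy_x, toy_y, [0.9, 0.9], toy_subsets)
```

In this sketch the combiner's weights depend on the query, which is one way to read the "dynamic weighting" described in the abstract; a static scheme would instead fix one weight per base model in advance.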

Files in This Item:

File	Description	Size	Format
Iffat_Thesis.pdf		5.12 MB	Adobe PDF

This item is protected by original copyright
