|Appears in Collections:||Computing Science and Mathematics eTheses|
|Title:||Novel Computationally Intelligent Machine Learning Algorithms for Data Mining and Knowledge Discovery|
|Author(s):||Gheyas, Iffat A.|
|Supervisor(s):||Smith, Leslie S.|
Missing value impuation
Time Series Forecasting
Curse of Dimensionality
|Publisher:||University of Stirling|
|Abstract:||This thesis addresses three major issues in data mining regarding feature subset selection in large dimensionality domains, plausible reconstruction of incomplete data in cross-sectional applications, and forecasting univariate time series. For the automated selection of an optimal subset of features in real time, we present an improved hybrid algorithm: SAGA. SAGA combines the ability to avoid being trapped in local minima of Simulated Annealing with the very high convergence rate of the crossover operator of Genetic Algorithms, the strong local search ability of greedy algorithms and the high computational efficiency of generalized regression neural networks (GRNN). For imputing missing values and forecasting univariate time series, we propose a homogeneous neural network ensemble. The proposed ensemble consists of a committee of Generalized Regression Neural Networks (GRNNs) trained on different subsets of features generated by SAGA and the predictions of base classifiers are combined by a fusion rule. This approach makes it possible to discover all important interrelations between the values of the target variable and the input features. The proposed ensemble scheme has two innovative features which make it stand out amongst ensemble learning algorithms: (1) the ensemble makeup is optimized automatically by SAGA; and (2) GRNN is used for both base classifiers and the top level combiner classifier. Because of GRNN, the proposed ensemble is a dynamic weighting scheme. This is in contrast to the existing ensemble approaches which belong to the simple voting and static weighting strategy. The basic idea of the dynamic weighting procedure is to give a higher reliability weight to those scenarios that are similar to the new ones. The simulation results demonstrate the validity of the proposed ensemble model.|
|Type:||Thesis or Dissertation|
|Affiliation:||School of Natural Sciences|
Computing Science and Mathematics
This item is protected by original copyright
Items in the Repository are protected by copyright, with all rights reserved, unless otherwise indicated.
If you believe that any material held in STORRE infringes copyright, please contact firstname.lastname@example.org providing details and we will remove the Work from public display in STORRE and investigate your claim.