Books on Demand GmbH, 2000 — 252 p. — ISBN 9783898118613,3898118614
Today, there is an increased need to extract information for decision making from a large collection of data. This transformation of data into knowledge is an interactive and iterative process of various subtasks and decisions, and is called Knowledge Discovery from Data. The central part of Knowledge Discovery is Data Mining.
The key to more sophisticated data mining is to limit user involvement in the overall process to the inclusion of well-known a priori knowledge, while making the process more automated and more objective. Soft computing, i.e., Fuzzy Modelling, Neural Networks, Genetic Algorithms and other methods of automatic model generation, mines data by generating mathematical models from empirical data more or less automatically. In recent years there has been much publicity about the ability of Artificial Neural Networks to learn and to generalize, despite significant problems with the design, development and application of Neural Networks:
Neural Networks have no explanatory power by default to describe why results are as they are. This means that the knowledge (models) extracted by Neural Networks is still hidden and distributed over the network.
There is no systematic approach for designing and developing Neural Networks. It is a trial-and-error process.
Training of Neural Networks is a kind of statistical estimation often using algorithms that are slower and less effective than algorithms used in statistical software.
If noise is considerable in a data sample, the generated models systematically tend to be overfitted.
In contrast to Neural Networks that use Genetic Algorithms as an external procedure to optimize the network architecture and several pruning techniques to counteract overtraining, the new approach described in this book introduces principles of evolution - inheritance, mutation and selection - for generating a network structure systematically, enabling automatic model structure synthesis and model validation. Models are generated from the data in the form of networks of active neurons in an evolutionary fashion: populations of competing models of growing complexity are repeatedly generated, validated and selected until an optimally complex model - not too simple and not too complex - has been created. That is, a treelike network is grown out of seed information (input and output variable data) by pairwise combination and survival-of-the-fittest selection, from a simple single individual (neuron) to a desired final, not overspecialized behavior (model). Neither the number of neurons and layers in the network nor the actual behavior of each created neuron is predefined. All this is adjusted during the process of self-organization, which is why the approach is called self-organizing data mining.
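The layer-by-layer growth described above can be illustrated with a minimal sketch. It is not the book's algorithm or the KnowledgeMiner implementation; it only assumes a simple GMDH-style scheme in which each neuron is a linear function of two inputs, fitted by least squares on a training set and ranked by mean squared error on a separate validation set. The names `fit_neuron`, `gmdh`, `keep` and `max_layers`, the neuron form, the selection criterion and the synthetic data are all assumptions made for illustration.

```python
import random
from itertools import combinations

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            return None  # (near-)singular system: skip this neuron
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def fit_neuron(u, v, y):
    """Least-squares fit of y ~ a0 + a1*u + a2*v (assumed neuron form)."""
    rows = [[1.0, ui, vi] for ui, vi in zip(u, v)]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    b = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve(A, b)

def predict(coef, u, v):
    return [coef[0] + coef[1] * ui + coef[2] * vi for ui, vi in zip(u, v)]

def mse(p, y):
    return sum((pi - yi) ** 2 for pi, yi in zip(p, y)) / len(y)

def gmdh(train_cols, y_train, val_cols, y_val, keep=4, max_layers=5):
    """Grow layers of pairwise neurons until validation error stops improving."""
    best_err, n_layers = float("inf"), 0
    for _ in range(max_layers):
        candidates = []
        # Pairwise combination: every pair of current outputs feeds one neuron.
        for i, j in combinations(range(len(train_cols)), 2):
            coef = fit_neuron(train_cols[i], train_cols[j], y_train)
            if coef is None:
                continue
            out_train = predict(coef, train_cols[i], train_cols[j])
            out_val = predict(coef, val_cols[i], val_cols[j])
            candidates.append((mse(out_val, y_val), out_train, out_val))
        if not candidates:
            break
        # Selection: keep only the neurons with the lowest validation error.
        candidates.sort(key=lambda c: c[0])
        survivors = candidates[:keep]
        if survivors[0][0] >= best_err:
            break  # extra complexity no longer helps: stop growing
        best_err, n_layers = survivors[0][0], n_layers + 1
        train_cols = [s[1] for s in survivors]
        val_cols = [s[2] for s in survivors]
    return best_err, n_layers

# Synthetic data (an assumption): three relevant inputs, one irrelevant input.
random.seed(0)

def make_data(n):
    cols = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(4)]
    y = [2 * a - 3 * b + 0.5 * c + random.gauss(0, 0.1) for a, b, c, _ in zip(*cols)]
    return cols, y

train_cols, y_train = make_data(60)
val_cols, y_val = make_data(40)
best_err, n_layers = gmdh(train_cols, y_train, val_cols, y_val)
baseline = mse([sum(y_train) / len(y_train)] * len(y_val), y_val)
print("layers:", n_layers, "validation MSE:", best_err, "baseline MSE:", baseline)
```

In this sketch neither the number of layers nor the surviving neurons are fixed in advance: growth stops as soon as a new layer fails to lower the validation error, which is one simple way to approximate a model that is neither too simple nor too complex.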
Knowledge Discovery from Data
Models and their application in decision making
Relevance and value of forecasts
Theory driven approach
Data driven approach
Data mining
Self-organizing Data Mining
Involvement of users in the data mining process
Automatic model generation
Regression based models
Rule based modelling
Symbolic modelling
Nonparametric models
Self-organizing data mining
Self-organizing Modelling Technologies
Statistical Learning Networks
Inductive approach - The GMDH algorithm
Induction
Principles
Model of optimal complexity
Parametric GMDH Algorithms
Elementary models (neurons)
Generation of alternate model variants
Nets of active neurons
Criteria of model selection
Validation
Nonparametric Algorithms
Objective Cluster Analysis
Analog Complexing
Self-organizing Fuzzy Rule Induction
Logic based rules
Application of Self-organizing Data Mining
Spectrum of self-organizing data mining methods
Choice of appropriate modelling methods
Application fields
Synthesis
Software tools
KnowledgeMiner
General features
GMDH implementation
Elementary models and active neurons
Generation of alternate model variants
Criteria of model selection
Systems of equations
Analog Complexing implementation
Features
Example
Fuzzy Rule Induction implementation
Fuzzification
Rule induction
Defuzzification
Example
Using models
The model base
Finance module
Sample Applications
From Economics
National economy
Stock prediction
Balance sheet
Sales prediction
Solvency checking
Energy consumption
From Ecology
Water pollution
Water quality
From other Fields
Heart disease
U.S. congressional voting behavior