Bruges, Belgium, April 24-25-26
Content of the proceedings
-
Regression
Exploratory Data Analysis in Medicine and Bioinformatics
Sampling and model selection
Neural Networks and Cognitive Science
ANN models and learning I
Representation of high-dimensional data
ANN models and learning II
Learning
Hardware and Parallel Computer Implementations of Neural Networks
Perspectives on Learning with Recurrent Networks
ANN models and learning II I
Information extraction
Neural Network Techniques in Fault Detection and Isolation
Regression
ES2002-15
Efficient formation of a basis in a kernel induced feature space
G.C. Cawley, N.L.C. Talbot
Efficient formation of a basis in a kernel induced feature space
G.C. Cawley, N.L.C. Talbot
ES2002-4
Theoretical properties of functional Multi Layer Perceptrons
F. Rossi, B. Conan-Guez, F. Fleuret
Theoretical properties of functional Multi Layer Perceptrons
F. Rossi, B. Conan-Guez, F. Fleuret
Abstract:
In this paper, we study a natural extension of Multi Layer Perceptrons (MLP) to functional inputs. We show that fundamental results for numerical MLP can be extended to functional MLP. We obtain universal approximation results that show the expressive power of functional MLP is comparable to the one of numerical MLP. We obtain consistency results which imply that optimal parameters estimation for functional MLP is consistent.
In this paper, we study a natural extension of Multi Layer Perceptrons (MLP) to functional inputs. We show that fundamental results for numerical MLP can be extended to functional MLP. We obtain universal approximation results that show the expressive power of functional MLP is comparable to the one of numerical MLP. We obtain consistency results which imply that optimal parameters estimation for functional MLP is consistent.
ES2002-25
Storing many-to-many mappings on a feed-forward neural network using fuzzy sets
R.K. Brouwer
Storing many-to-many mappings on a feed-forward neural network using fuzzy sets
R.K. Brouwer
Abstract:
Feed-forward networks are generally trained to represent functions or many-to-one (m-o) mappings. In this paper however a feed-forward network with modified training algorithm is considered to represent multi-valued or one-to-many (o-m) mappings. The o-m mapping is viewed as an m-o mapping where the values corresponding to a value of the independent variable are sets. Thus the problem of representing a o-m mapping has been converted into a problem of training a network to return sets rather than vectors. The resulting o-m mapping may have variable multiplicity leading to sets of variable cardinality. The crisp sets of variable cardinality in turn are replaced by fuzzy sets of fixed cardinality by adding elements, called “do not cares” which have membership values of zero. Since the target outputs of the feedforward network are now sets of fixed cardinality and the actual output of a feedforward network is a vector the training algorithm is modified to take into account the fact that order should be removed as a constraint when the error vector is calculated. Results of simulations show that the method proposed is quite effective.
Feed-forward networks are generally trained to represent functions or many-to-one (m-o) mappings. In this paper however a feed-forward network with modified training algorithm is considered to represent multi-valued or one-to-many (o-m) mappings. The o-m mapping is viewed as an m-o mapping where the values corresponding to a value of the independent variable are sets. Thus the problem of representing a o-m mapping has been converted into a problem of training a network to return sets rather than vectors. The resulting o-m mapping may have variable multiplicity leading to sets of variable cardinality. The crisp sets of variable cardinality in turn are replaced by fuzzy sets of fixed cardinality by adding elements, called “do not cares” which have membership values of zero. Since the target outputs of the feedforward network are now sets of fixed cardinality and the actual output of a feedforward network is a vector the training algorithm is modified to take into account the fact that order should be removed as a constraint when the error vector is calculated. Results of simulations show that the method proposed is quite effective.
ES2002-16
Heteroscedastic regularised kernel regression for prediction of episodes of poor air quality
R.J. Foxall, G.C. Cawley, N.L.C. Talbot, S.R. Dorling, D.P. Mandic
Heteroscedastic regularised kernel regression for prediction of episodes of poor air quality
R.J. Foxall, G.C. Cawley, N.L.C. Talbot, S.R. Dorling, D.P. Mandic
Abstract:
\N
\N
Exploratory Data Analysis in Medicine and Bioinformatics
ES2002-400
Exploratory Data Analysis in Medicine and Bioinformatics
A. Wismüller, T. Villmann
Exploratory Data Analysis in Medicine and Bioinformatics
A. Wismüller, T. Villmann
ES2002-401
A data vizualisation method for investigating the reliability of a high-dimensional low-back-pain MLP network
M.L. Vaughn, S.J. Taylor, M.A. Foy, A.J.B. Fogg
A data vizualisation method for investigating the reliability of a high-dimensional low-back-pain MLP network
M.L. Vaughn, S.J. Taylor, M.A. Foy, A.J.B. Fogg
Abstract:
This study uses a new data visualization method, developed by the first author, to investigate the reliability of a real world low back-pain Multi-layer Perceptron (MLP) network from a hidden layer decision region perspective. Using decision region identification information from an explanation facility, the MLP training examples are discovered to occupy decision regions in contiguous class threads across the 48-dimensional input space. MLP testing cases show a similar distribution and consistency within the contiguous threads but with a reduced reliability. Three test regions outside the network’s knowledge bounds are situated between training regions with a consistent classification.
This study uses a new data visualization method, developed by the first author, to investigate the reliability of a real world low back-pain Multi-layer Perceptron (MLP) network from a hidden layer decision region perspective. Using decision region identification information from an explanation facility, the MLP training examples are discovered to occupy decision regions in contiguous class threads across the 48-dimensional input space. MLP testing cases show a similar distribution and consistency within the contiguous threads but with a reduced reliability. Three test regions outside the network’s knowledge bounds are situated between training regions with a consistent classification.
ES2002-403
Double self-organizing maps to cluster gene expression data
D. Wang, H. Ressom, M. Musavi, C. Domnisoru
Double self-organizing maps to cluster gene expression data
D. Wang, H. Ressom, M. Musavi, C. Domnisoru
Abstract:
Clustering is a very useful and important technique for analyzing gene expression data. Self-organizing map (SOM) is one of the most useful clustering algorithms. SOM requires the number of clusters to be one of the initialization parameters prior to clustering. However, this information is unavailable in most cases, particularly in gene expression data. Thus, the validation results from SOM are commonly employed to choose the appropriate number of clusters. This approach is very inconvenient and time-consuming. This paper applies a novel model of SOM, which is known as double self-organizing map (DSOM), to cluster gene expression data. DSOM helps to find the appropriate number of clusters by clearly and visually depicting the appropriate number of clusters. We use DSOM to cluster an artificial data set and two kinds of real gene expression data sets. To validate our results, we employed a novel validation technique, which is known as figure of merit (FOM)
Clustering is a very useful and important technique for analyzing gene expression data. Self-organizing map (SOM) is one of the most useful clustering algorithms. SOM requires the number of clusters to be one of the initialization parameters prior to clustering. However, this information is unavailable in most cases, particularly in gene expression data. Thus, the validation results from SOM are commonly employed to choose the appropriate number of clusters. This approach is very inconvenient and time-consuming. This paper applies a novel model of SOM, which is known as double self-organizing map (DSOM), to cluster gene expression data. DSOM helps to find the appropriate number of clusters by clearly and visually depicting the appropriate number of clusters. We use DSOM to cluster an artificial data set and two kinds of real gene expression data sets. To validate our results, we employed a novel validation technique, which is known as figure of merit (FOM)
ES2002-402
Improving robustness of fuzzy gene modeling
R. Reynolds, H. Ressom, M. Musavi, C. Domnisoru
Improving robustness of fuzzy gene modeling
R. Reynolds, H. Ressom, M. Musavi, C. Domnisoru
Abstract:
This paper proposes modifications to current fuzzy models of gene interaction. Current algorithms apply all combinations of genes to a fuzzy model (i.e. activator/repressor/target), evaluating how well each combination fits the model. The models are susceptible to noisy signals in the gene expression data. Since the margin of error in current microarray technology can be high, the results generated may not properly reflect valid relationships. This paper investigates different methods of creating fuzzy models. We explore methods of conjunction and rule aggregation that produce valid results while being resilient to minor changes to model input.
This paper proposes modifications to current fuzzy models of gene interaction. Current algorithms apply all combinations of genes to a fuzzy model (i.e. activator/repressor/target), evaluating how well each combination fits the model. The models are susceptible to noisy signals in the gene expression data. Since the margin of error in current microarray technology can be high, the results generated may not properly reflect valid relationships. This paper investigates different methods of creating fuzzy models. We explore methods of conjunction and rule aggregation that produce valid results while being resilient to minor changes to model input.
Sampling and model selection
ES2002-60
Parametric bootstrap for test of contrast difference in neural networks
R. Kallel, J. Rynkiewicz
Parametric bootstrap for test of contrast difference in neural networks
R. Kallel, J. Rynkiewicz
Abstract:
This work concernes the contrast difference test and its asymptotic properties for non linear auto-regressive models. Our approach is based on an application of the parametric bootstrap method. It is a re-sampling method based on the estimate parameters of the models. The resulting methodology is illustrated by simulations of multilayer perceptron models, and an asymptotic justification is given at the end.
This work concernes the contrast difference test and its asymptotic properties for non linear auto-regressive models. Our approach is based on an application of the parametric bootstrap method. It is a re-sampling method based on the estimate parameters of the models. The resulting methodology is illustrated by simulations of multilayer perceptron models, and an asymptotic justification is given at the end.
ES2002-64
A resampling and multiple testing-based procedure for determining the size of a neural network
A. Yanes Escolano, E. Guerrero Vazquez, P.L. Galindo Riano, J. Pizarro Junquera
A resampling and multiple testing-based procedure for determining the size of a neural network
A. Yanes Escolano, E. Guerrero Vazquez, P.L. Galindo Riano, J. Pizarro Junquera
Abstract:
One of the most important difficulties in using neural networks for a real-world problem is the issue of model complexity, and how affects the generalization performance. We present a new algorithm based on multiple comparison methods for finding low complexity neural networks with high generalization capability.
One of the most important difficulties in using neural networks for a real-world problem is the issue of model complexity, and how affects the generalization performance. We present a new algorithm based on multiple comparison methods for finding low complexity neural networks with high generalization capability.
Neural Networks and Cognitive Science
ES2002-450
Neural networks for modeling memory : case studies
H. Paugam-Moisy, D. Puzenat, E. Reynaud, J.-P. Magué
Neural networks for modeling memory : case studies
H. Paugam-Moisy, D. Puzenat, E. Reynaud, J.-P. Magué
Abstract:
First, neural networks have been inspired by cognitive processes [MCC43,HEB49,RUM86]. Second, they were proved to be very efficient computing tools for engineering, financial and medical applications [FRE91,BIS95,HER94,BLA96]. In this article we point out that there is still a great interest, for both engineering and cognitive science, to explore more deeply the links between natural and artificial neural systems. On the one hand: how to define more complex learning rules adapted to heterogeneous neural networks and how to build modular multi-network systems for modeling cognitive processes. On the other hand: how to derive new interesting learning paradigms back, for artificial neural networks, and how to design more performant systems than classical basic connectionist models. After a short survey of connectionist models for modeling memory, we develop two case studies. The first is a model for a multimodal associative memory and the second is a model for more deeply understanding the mechanisms of spatial cognition.
First, neural networks have been inspired by cognitive processes [MCC43,HEB49,RUM86]. Second, they were proved to be very efficient computing tools for engineering, financial and medical applications [FRE91,BIS95,HER94,BLA96]. In this article we point out that there is still a great interest, for both engineering and cognitive science, to explore more deeply the links between natural and artificial neural systems. On the one hand: how to define more complex learning rules adapted to heterogeneous neural networks and how to build modular multi-network systems for modeling cognitive processes. On the other hand: how to derive new interesting learning paradigms back, for artificial neural networks, and how to design more performant systems than classical basic connectionist models. After a short survey of connectionist models for modeling memory, we develop two case studies. The first is a model for a multimodal associative memory and the second is a model for more deeply understanding the mechanisms of spatial cognition.
ES2002-452
Connectionist models investigating representations formed in the sequential generation of characters
F.M. Richardson, N. Davey, L. Peters, D.J. Done, S.H. Anthony
Connectionist models investigating representations formed in the sequential generation of characters
F.M. Richardson, N. Davey, L. Peters, D.J. Done, S.H. Anthony
Abstract:
This paper considers the results of three different methods of encoding visual and motor representations of single sequential character production using three different architectures for the simulation of perceptual and motor processes. Examination of such processes through neural net modelling of the generation of handwritten characters promises to be a fruitful avenue of exploration as the induced representations of the models can be examined. The results of this analysis showed that both spatial and temporal similarity were important in these representations. Similar results have been shown to be true for actual representations in the motor cortex.
This paper considers the results of three different methods of encoding visual and motor representations of single sequential character production using three different architectures for the simulation of perceptual and motor processes. Examination of such processes through neural net modelling of the generation of handwritten characters promises to be a fruitful avenue of exploration as the induced representations of the models can be examined. The results of this analysis showed that both spatial and temporal similarity were important in these representations. Similar results have been shown to be true for actual representations in the motor cortex.
ES2002-453
The problem of adaptive control in a living system or how to acquire an inverse model without external help
K. Th. Kalveram, T. Schinauer
The problem of adaptive control in a living system or how to acquire an inverse model without external help
K. Th. Kalveram, T. Schinauer
Abstract:
Recent research uncovers that goal directed sensorimotor behaviour is governed by negative feedback of positional error, and by feedforward through inverse modelling of the limb's dynamics. Thereby, forward models seem to provide the kinematic state of the limb. The question addressed in the paper is, how the neural network representing the inverse model can be trained. Because in this case an error based learning algorithm seems to be unavailable, an alternative non error based method called auto-imitation is proposed. It is demonstrated, that, if combining a special type of neural network (the power net) with a modified type of a Hebbian synapse, the inverse dynamics of an onejointed arm can be precisely identified using auto-imitation. This holds for a simulated arm and a real robot arm as well.
Recent research uncovers that goal directed sensorimotor behaviour is governed by negative feedback of positional error, and by feedforward through inverse modelling of the limb's dynamics. Thereby, forward models seem to provide the kinematic state of the limb. The question addressed in the paper is, how the neural network representing the inverse model can be trained. Because in this case an error based learning algorithm seems to be unavailable, an alternative non error based method called auto-imitation is proposed. It is demonstrated, that, if combining a special type of neural network (the power net) with a modified type of a Hebbian synapse, the inverse dynamics of an onejointed arm can be precisely identified using auto-imitation. This holds for a simulated arm and a real robot arm as well.
ES2002-451
Biologically-inspired human motion detection
V. Laxmi, J.N. Carter, R.I. Damper
Biologically-inspired human motion detection
V. Laxmi, J.N. Carter, R.I. Damper
Abstract:
A model of motion detection is described, inspired by the capability of humans to recognise biological motion even from minimal information systems such as moving light displays. The model, a feed-forward backpropagation neural network, uses labelled joint data, analogous to light points in such displays. In preliminary work, the model achieves 100% person classification on a set of 4 artificial subjects and another of 4 real subjects. Subsequently, 100% motion detection is achieved on a set of 21 subjects. In the latter case, the correspondence problem is also solved by the model, since the network is not `told' which joint is which. Like human beings, the neural networks perform both tasks within a small fraction of the gait cycle.
A model of motion detection is described, inspired by the capability of humans to recognise biological motion even from minimal information systems such as moving light displays. The model, a feed-forward backpropagation neural network, uses labelled joint data, analogous to light points in such displays. In preliminary work, the model achieves 100% person classification on a set of 4 artificial subjects and another of 4 real subjects. Subsequently, 100% motion detection is achieved on a set of 21 subjects. In the latter case, the correspondence problem is also solved by the model, since the network is not `told' which joint is which. Like human beings, the neural networks perform both tasks within a small fraction of the gait cycle.
ES2002-32
Why will rat's go where rats will not?
J. Hayes, V. Murphy, N. Davey, P. Smith, L. Peters
Why will rat's go where rats will not?
J. Hayes, V. Murphy, N. Davey, P. Smith, L. Peters
Abstract:
Experimental evidence indicates that regular plurals are nearly always omitted from English compounds (e.g., rats-eater) while irregular plurals may be included within these structures (e.g., mice-chaser). This phenomenon is considered to be good evidence to support the dual mechanism model of morphological processing (Pinker & Prince, 1992). However, evidence from neural net modelling has shown that a single route associative memory based account might provide an equally, if not more, valid explanation of the compounding phenomenon.
Experimental evidence indicates that regular plurals are nearly always omitted from English compounds (e.g., rats-eater) while irregular plurals may be included within these structures (e.g., mice-chaser). This phenomenon is considered to be good evidence to support the dual mechanism model of morphological processing (Pinker & Prince, 1992). However, evidence from neural net modelling has shown that a single route associative memory based account might provide an equally, if not more, valid explanation of the compounding phenomenon.
ANN models and learning I
ES2002-51
Rule extraction from support vector machines
H. Nunez, C. Angulo, A. Catala
Rule extraction from support vector machines
H. Nunez, C. Angulo, A. Catala
Abstract:
Support vector machines (SVMs) are learning systems based on the statistical learning theory, which are exhibiting good generalization ability on real data sets. Nevertheless, a possible limitation of SVM is that they generate black box models. In this work, a procedure for rule extraction from support vector machines is proposed: the SVM+Prototypes method. This method allows to give explanation ability to SVM. Once determined the decision function by means of a SVM, a clustering algorithm is used to determine prototype vectors for each class. These points are combined with the support vectors using geometric methods to define ellipsoids in the input space, which are later transfers to if-then rules. By using the support vectors we can establish the limits of these regions.
Support vector machines (SVMs) are learning systems based on the statistical learning theory, which are exhibiting good generalization ability on real data sets. Nevertheless, a possible limitation of SVM is that they generate black box models. In this work, a procedure for rule extraction from support vector machines is proposed: the SVM+Prototypes method. This method allows to give explanation ability to SVM. Once determined the decision function by means of a SVM, a clustering algorithm is used to determine prototype vectors for each class. These points are combined with the support vectors using geometric methods to define ellipsoids in the input space, which are later transfers to if-then rules. By using the support vectors we can establish the limits of these regions.
ES2002-6
Fuzzy support vector machines for multiclass problems
S. Abe, T. Inoue
Fuzzy support vector machines for multiclass problems
S. Abe, T. Inoue
Abstract:
Since support vector machines for pattern classification are based on two-class classification problems, unclassifiable regions exist when extended to n ( > 2)-class problems. In our previous work, to solve this problem, we developed fuzzy support vector machines for one-to-(n-1) classification. In this paper, we extend our method to pairwise classification. Namely, using the decision functions obtained by training the support vector machines for classes i and j (j ne i, j =1,..., n), for class i we define a truncated polyhedral pyramidal membership function. The membership functions are defined so that, for the data in the classifiable regions, the classification results are the same for the two methods. Thus, the generalization ability of the fuzzy support vector machine is the same with or better than that of the support vector machine for pairwise classification. We evaluate our method for four benchmark data sets and demonstrate the superiority of our method.
Since support vector machines for pattern classification are based on two-class classification problems, unclassifiable regions exist when extended to n ( > 2)-class problems. In our previous work, to solve this problem, we developed fuzzy support vector machines for one-to-(n-1) classification. In this paper, we extend our method to pairwise classification. Namely, using the decision functions obtained by training the support vector machines for classes i and j (j ne i, j =1,..., n), for class i we define a truncated polyhedral pyramidal membership function. The membership functions are defined so that, for the data in the classifiable regions, the classification results are the same for the two methods. Thus, the generalization ability of the fuzzy support vector machine is the same with or better than that of the support vector machine for pairwise classification. We evaluate our method for four benchmark data sets and demonstrate the superiority of our method.
ES2002-7
Different criteria for active learning in neural networks: a comparative study
J. Poland, A. Zell
Different criteria for active learning in neural networks: a comparative study
J. Poland, A. Zell
Abstract:
The field of active learning and optimal query construction in Neural Network training is tightly connected with the design of experiments and its rich theory. Thus there is a large number of active learning strategies and query criteria which have a sound theoretical foundation. This comparative study considers the regression problem of approximating a nonlinear noisy function with relatively few inputs. We evaluate some query criteria, namely space-filling criteria, variance criteria, markov chain monte carlo methods and query by committee.
The field of active learning and optimal query construction in Neural Network training is tightly connected with the design of experiments and its rich theory. Thus there is a large number of active learning strategies and query criteria which have a sound theoretical foundation. This comparative study considers the regression problem of approximating a nonlinear noisy function with relatively few inputs. We evaluate some query criteria, namely space-filling criteria, variance criteria, markov chain monte carlo methods and query by committee.
ES2002-10
Supervised learning in committee machines by PCA
C. Bunzmann, M. Biehl, R. Urbanczik
Supervised learning in committee machines by PCA
C. Bunzmann, M. Biehl, R. Urbanczik
Abstract:
A learning algorithm for multilayer perceptrons is suggested which relates to the technique of principal component analysis. The latter is performed with respect to a correlation matrix computed from the example inputs and their target outputs. For large networks it is demonstrated that the procedure requires by far fewer examples for good generalization than traditional on--line training prescriptions.
A learning algorithm for multilayer perceptrons is suggested which relates to the technique of principal component analysis. The latter is performed with respect to a correlation matrix computed from the example inputs and their target outputs. For large networks it is demonstrated that the procedure requires by far fewer examples for good generalization than traditional on--line training prescriptions.
ES2002-57
The use of LS-SVM in the classification of brain tumors based on Magnetic Resonance Spectroscopy signals
L. Lukas, A. Devos, J.A.K. Suykens, L. Vanhamme, S. Van Huffel, A.R. Tate, C. Majos, C. Arus
The use of LS-SVM in the classification of brain tumors based on Magnetic Resonance Spectroscopy signals
L. Lukas, A. Devos, J.A.K. Suykens, L. Vanhamme, S. Van Huffel, A.R. Tate, C. Majos, C. Arus
Abstract:
Least Squares Support Vector Machines (LS-SVM) have been developed and successfully applied to classification problems in many areas. In comparison with several other classical methods this technique consistently performs very well on a large variety of problems. Here, results on the application of LS-SVM for classification of brain tumors based on Magnetic Resonance Spectroscopy (MRS) signals are presented. Several kernels are used and compared to find the optimal classifier. Despite the high dimensionality and the scarcity of the input data, and the fact that no additional clinical information is used, a good ROC and classification performance can be achieved after applying leave-one-out cross-validation for hyperparameter selection together with an additional bias term correction. The improvement of this classification based on MRS signals will lead to an advanced tool for the discrimination of brain tumors, which is presently under development for the INTERPRET project.
Least Squares Support Vector Machines (LS-SVM) have been developed and successfully applied to classification problems in many areas. In comparison with several other classical methods this technique consistently performs very well on a large variety of problems. Here, results on the application of LS-SVM for classification of brain tumors based on Magnetic Resonance Spectroscopy (MRS) signals are presented. Several kernels are used and compared to find the optimal classifier. Despite the high dimensionality and the scarcity of the input data, and the fact that no additional clinical information is used, a good ROC and classification performance can be achieved after applying leave-one-out cross-validation for hyperparameter selection together with an additional bias term correction. The improvement of this classification based on MRS signals will lead to an advanced tool for the discrimination of brain tumors, which is presently under development for the INTERPRET project.
ES2002-18
Clustering in data space and feature space
D. MacDonald, C. Fyfe
Clustering in data space and feature space
D. MacDonald, C. Fyfe
ES2002-19
Maximum likelihood Hebbian rules
C. Fyfe, E. Corchado
Maximum likelihood Hebbian rules
C. Fyfe, E. Corchado
Abstract:
In this paper, we review an extension of the learning rules in a Principal Component Analysis network which has been derived to be optimal for a specific probability density function. We note that this probability density function is one of a family of pdfs and investigate the learning rules formed in order to be optimal for several members of this family. We show that, whereas previous authors [5] have viewed the single member of the family as an extension of PCA, it is more appropriate to view the whole family of learning rules as methods of performing Exploratory Projection Pursuit. We illustrate this on artificial data sets.
In this paper, we review an extension of the learning rules in a Principal Component Analysis network which has been derived to be optimal for a specific probability density function. We note that this probability density function is one of a family of pdfs and investigate the learning rules formed in order to be optimal for several members of this family. We show that, whereas previous authors [5] have viewed the single member of the family as an extension of PCA, it is more appropriate to view the whole family of learning rules as methods of performing Exploratory Projection Pursuit. We illustrate this on artificial data sets.
ES2002-23
Fast exact leave-one-out cross-validation of least-squares Support Vector Machines
K. Saadi, G.C. Cawley, N.L.C. Talbot
Fast exact leave-one-out cross-validation of least-squares Support Vector Machines
K. Saadi, G.C. Cawley, N.L.C. Talbot
ES2002-65
Noise derived information criterion for model selection
J. Pizarro Junquera, P. Galindo Riano, E. Guerrero Vazquez, A. Yanez Escolano
Noise derived information criterion for model selection
J. Pizarro Junquera, P. Galindo Riano, E. Guerrero Vazquez, A. Yanez Escolano
Abstract:
This paper proposes a new complexity-penalization model selection strategy derived from the minimum risk principle and the behavior of candidate models under noisy conditions. This strategy seems to be robust in small sample size conditions and tends to AIC criterion as sample size grows up. The simulation study at the end of the paper will show that the proposed criterion is extremely competitive when compared to other state-of-the-art criteria.
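As a point of reference for the AIC limit this abstract mentions, a minimal model-selection sketch can rank candidate polynomial models by Akaike's criterion. This is an illustration of AIC only, not the authors' noise-derived criterion; the data, noise level and candidate degrees are arbitrary choices:

```python
import numpy as np

def aic(y, y_hat, k):
    # Akaike's criterion for a Gaussian-noise model with k free parameters:
    # n * log(RSS / n) + 2 * k. Lower is better.
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)
    return n * np.log(rss / n) + 2 * k

# Candidate models: polynomials of increasing degree fitted to noisy data
# whose true generating model is quadratic.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = 1.0 + 2.0 * x - 3.0 * x ** 2 + rng.normal(0.0, 0.1, x.size)

scores = {}
for degree in range(6):
    coeffs = np.polyfit(x, y, degree)
    scores[degree] = aic(y, np.polyval(coeffs, x), degree + 1)

best = min(scores, key=scores.get)   # degree selected by AIC
```

The complexity penalty `2 * k` is what keeps the criterion from always preferring the highest-degree (lowest-RSS) model.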
ES2002-27
An unified framework for 'All data at once' multi-class Support Vector Machines
C. Angulo, X. Parra, A. Catala
Abstract:
Support Vector Machines (SVMs) are a machine learning procedure based on Vapnik's Statistical Learning Theory, initially defined for binary classification problems. Much work is being done in different research areas to obtain new algorithms for multi-class problems, the most common task in real-world applications. A promising extension is to treat `all data at once' in one multi-class SVM by modifying the associated quadratic programming (QP) problem. In this work, a unified architecture is developed to compare the associated QP problem for different approaches. The new framework makes comparisons between algorithms easier and is a powerful tool for analyzing the performance and behaviour of these approaches.
ES2002-70
Prediction of mental development of preterm newborns at birth time using LS-SVM
L. Ameye, C. Lu, L. Lukas, J. De Brabanter, J.A.K. Suykens, S. Van Huffel, H. Daniels, G. Naulaers, H. Devlieger
Representation of high-dimensional data
ES2002-250
Searching for the embedded manifolds in high-dimensional data, problems and unsolved questions
J. Hérault, A. Guérin-Dugué, P. Villemain
Abstract:
Starting from a recall of several classical - and less classical - remarks about high-dimensional data spaces, this paper gives a bird's-eye view of various techniques of data reduction, from linear multidimensional scaling to nonlinear and non-parametric methods. Two kinds of approaches will be presented, the first one operating in the feature space, the second one operating in the dissimilarity space. Special attention will be devoted to the CCA algorithm, in a version which aims at capturing the mean manifold spanned by the data vectors. Some examples from artificial and real data are given.
ES2002-254
Curvilinear Distance Analysis versus Isomap
J.A. Lee, A. Lendasse, M. Verleysen
Abstract:
Dimension reduction techniques are widely used for the analysis and visualization of complex sets of data. This paper compares two nonlinear projection methods: Isomap and Curvilinear Distance Analysis. Contrary to traditional linear PCA, these methods work like multidimensional scaling, by reproducing in the projection space the pairwise distances measured in the data space. They differ from classical linear MDS in the metrics they use and in the way they build the mapping (algebraic or neural). While Isomap relies directly on traditional MDS, CDA is based on a nonlinear variant of MDS, called CCA (Curvilinear Component Analysis). Although Isomap and CDA share the same metrics, the comparison highlights their respective strengths and weaknesses.
ES2002-251
Fast nonlinear dimensionality reduction with topology preserving networks
J.J. Verbeek, N. Vlassis, B. Krose
Abstract:
We present a fast alternative to the Isomap algorithm. A set of quantizers is fitted to the data and a neighborhood structure based on the competitive Hebbian rule is imposed on it. This structure is used to obtain a low-dimensional description of the data by computing geodesic distances and applying multidimensional scaling. The quantization allows for faster processing of the data: the speed-up compared to Isomap is roughly quadratic in the ratio between the number of data points and the number of quantizers. The quantizers and neighborhood structure are used to map the data to the low-dimensional space.
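The pipeline this abstract describes - quantize the data, build a neighborhood graph on the quantizers, compute graph (geodesic) distances, then apply classical MDS - can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the k-means quantizer, the k-nearest-neighbor graph (in place of the competitive Hebbian rule) and all parameter values are assumptions:

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    # Simple k-means to fit the set of quantizers to the data.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def geodesic_mds(centers, n_neighbors=3, dim=2):
    k = len(centers)
    d = np.sqrt(((centers[:, None] - centers[None]) ** 2).sum(-1))
    # Keep only each quantizer's nearest neighbors as graph edges.
    g = np.full((k, k), np.inf)
    order = np.argsort(d, axis=1)
    for i in range(k):
        for j in order[i, 1:n_neighbors + 1]:
            g[i, j] = g[j, i] = d[i, j]
    np.fill_diagonal(g, 0.0)
    # Floyd-Warshall: graph distances approximate geodesic distances.
    for m in range(k):
        g = np.minimum(g, g[:, m:m + 1] + g[m:m + 1, :])
    # Classical MDS on the squared geodesic distances.
    J = np.eye(k) - np.ones((k, k)) / k
    B = -0.5 * J @ (g ** 2) @ J
    w, v = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:dim]
    return v[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))
```

Running MDS on the quantizers only (rather than on all data points) is where the quoted speed-up comes from: both the shortest-path and the eigendecomposition steps scale with the number of quantizers, not the number of samples.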
ES2002-255
When does geodesic distance recover the true hidden parametrization of families of articulated images?
D. Donoho, C. Grimes
ES2002-253
How to generalize geometric ICA to higher dimensions
F.J. Theis, E.W. Lang
Abstract:
Geometric algorithms for linear independent component analysis (ICA) have recently received some attention due to their pictorial description and their relative ease of implementation. The geometric approach to ICA was first proposed by Puntonet and Prieto in order to separate linear mixtures. One major drawback of geometric algorithms, however, is a number of samples and convergence time that rise exponentially with increasing dimensionality, basically restricting geometric ICA to low-dimensional cases. We propose to apply overcomplete ICA to geometric ICA to reduce high-dimensional problems to lower-dimensional ones, thus generalizing geometric ICA to higher dimensions.
ES2002-252
Neural dimensionality reduction for document processing
M. Delichère, D. Memmi
Abstract:
Document processing usually gives rise to high-dimensional representation vectors which are redundant and costly to process. Reducing the dimensionality would be appropriate, but standard factor-analysis methods such as PCA cannot deal with vectors of very high dimension. We have instead used an adaptive neural network technique (the Generalized Hebbian Algorithm) to extract the first principal components of a text corpus in order to represent documents economically. The approach is efficient and gives good results in a real Web page clustering application.
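The Generalized Hebbian Algorithm mentioned here (Sanger's rule) is compact enough to sketch. The update below is the standard textbook form, not the authors' document-processing setup; data, learning rate and component count are illustrative:

```python
import numpy as np

def gha(X, n_components=2, lr=0.005, epochs=100, seed=0):
    # Sanger's Generalized Hebbian Algorithm: the rows of W converge to the
    # leading principal components of the (centered) data, learned online
    # without ever forming the full covariance matrix.
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=0)
    W = rng.normal(0.0, 0.1, (n_components, X.shape[1]))
    for _ in range(epochs):
        for x in X:
            y = W @ x
            # Hebbian term y x^T minus a lower-triangular decorrelation term
            # that makes each row learn a different component.
            W += lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
    return W
```

Avoiding the explicit covariance matrix is exactly what makes this kind of rule attractive for the very high-dimensional document vectors described in the abstract.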
ANN models and learning II
ES2002-26
Geometric overcomplete ICA
F.J. Theis, E.W. Lang
Abstract:
In independent component analysis (ICA), the goal is to find an independent decomposition of given input signals. We present an algorithm based on geometric considerations to decompose a linear mixture of more sources than sensor signals. We present an efficient method for the matrix-recovery step in the framework of a two-step approach to the source separation problem. The second step - source recovery - uses the standard maximum-likelihood approach.
ES2002-71
Advantages and drawbacks of the Batch Kohonen algorithm
J.-C. Fort, P. Letremy, M. Cottrell
Abstract:
The Kohonen algorithm (SOM) was originally defined as a stochastic algorithm which works in an on-line way and which was designed to model some plastic features of the human brain. Nowadays it is extensively used for data mining, data visualization, and exploratory data analysis. Some users are tempted to use the batch version of the Kohonen algorithm (KBATCH), since it is a deterministic algorithm which can run faster in some cases. Following [7], which elucidated the mathematical nature of the batch variant, we give some elements of comparison for both algorithms, using theoretical arguments, simulated data and real data.
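For reference, the batch (deterministic) Kohonen update that this abstract contrasts with the stochastic on-line version can be sketched as follows; the 1-D grid, the Gaussian neighborhood and the radius schedule are illustrative choices, not those of the paper:

```python
import numpy as np

def batch_som(X, n_units=10, iters=30, seed=0):
    # Batch Kohonen update: deterministic given the data. Every iteration
    # assigns all samples to their best-matching unit (BMU), then replaces
    # each codebook vector by a neighborhood-weighted mean of the samples.
    rng = np.random.default_rng(seed)
    W = X[rng.choice(len(X), n_units, replace=False)].astype(float)
    grid = np.arange(n_units)                          # 1-D map topology
    for it in range(iters):
        sigma = 2.0 * 0.25 ** (it / max(iters - 1, 1))  # shrinking radius
        bmu = np.argmin(((X[:, None] - W[None]) ** 2).sum(-1), axis=1)
        h = np.exp(-(grid[:, None] - bmu[None, :]) ** 2 / (2 * sigma ** 2))
        W = (h @ X) / h.sum(axis=1, keepdims=True)
    return W
```

Unlike the on-line rule, each pass is a closed-form weighted mean, so there is no learning-rate parameter and the result is reproducible for a fixed initialization, which is part of the appeal (and, as the paper discusses, part of the drawback).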
ES2002-5
Mobile radio access network monitoring using the self-organizing map
P. Lehtimäki, K. Raivio, O. Simula
Abstract:
In this study, a method for process clustering and visualization using the Self-Organizing Map (SOM) is described. The presented method is applied in clustering and monitoring of mobile cells of a Mobile Radio Access Network (RAN).
ES2002-31
Evaluating the impact of multiplicative input perturbations on radial basis function networks
J.L. Bernier, J. Gonzales, A. Canas, A.F. Diaz, F.J. Fernandez, J. Ortega
Abstract:
Mean Squared Sensitivity (MSS) has previously been introduced as an approximation of the performance degradation of an MLP affected by perturbations in different parameters. In the present paper, we focus on RBF networks in order to study the implications when these are affected by input noise. We have obtained the corresponding analytical expression for MSS and validated it experimentally, using two different perturbation models: an additive and a multiplicative one. MSS is thus proposed as a quantitative measure for evaluating the noise immunity of an RBFN configuration.
ES2002-34
Learning sparse representations of three-dimensional objects
G. Peters, C. von der Malsburg
Abstract:
Each object in our environment can cause considerably different patterns of excitation in our retinae depending on the viewpoint from which the object is observed. Despite this we are able to perceive that the changing signals are produced by the same object. It is a function of our brain to provide this constant recognition from such inconstant input signals by establishing an internal representation of the object. The nature of such a viewpoint-invariant representation, the way it can be acquired, and its application in a perception task are the concern of this work. We describe the generation of view-based, sparse representations of real-world objects and apply them in a pose estimation task.
ES2002-12
An estimation model of pupil size for 'Blink Artifact' and its applications
M. Nakayama, Y. Shimizu
Abstract:
It is well known that the measurement of pupil size is influenced by noise and blinking. This paper describes the development of an estimation model of pupil size for the 'blink artifact', based on a three-layer perceptron trained with back-propagation. The model was trained on pupil responses with artificial blinks. It was found that pupil size during the blink period could be estimated according to the training period. When the model was applied to pupillary changes of subjects viewing TV programs, spurious frequency components were removed in the frequency analysis of temporal pupillary change. This result provides evidence that the model can remove artifacts from pupil response measurements.
ES2002-38
Novelty detection for strain-gauge degradation using maximally correlated components
G. Hollier, J. Austin
Abstract:
A new method for the detection of the degradation of strain-gauges attached to airframes is developed, using novelty-detection techniques and maximally correlated components. This considerably improves upon the previous method for the detection of changes in the response-line gradient.
ES2002-40
Modeling efficient conjunction detection with spiking neural networks
S.M. Bohte, J.N. Kok, H. La Poutré
Abstract:
The design of neural networks that are able to efficiently encode and detect conjunctions of features is an important open challenge that is also referred to as “the binding-problem”. We define a formal framework for neural nodes that process activity in the form of tuples of spike-trains which can efficiently encode and detect feature-conjunctions on a retinal input field in a position-invariant manner, also in the presence of multiple feature-conjunctions.
ES2002-44
Segmental duration control by time delay neural networks with asymmetric causal and retro-causal information flows
C. Erden, H.G. Zimmermann
Abstract:
The generation of pleasant prosody parameters is very important for speech synthesis. A prosody generation unit can be seen as a dynamical system. In this paper, sophisticated time-delay recurrent neural network (NN) topologies are presented which can be used for modeling dynamical systems. Within the prosody prediction task, left and right context information is known to influence the prediction of prosody control parameters. This can be modeled by causal-retro-causal information flows. Since information that is available during training is partially unavailable during application, there is a structural switch from training to application. This structural change of the information flow is handled by two asymmetric architectures. The proposed new architectures allow the integration of further a priori knowledge. By this we are able to improve the performance of the duration control unit within our text-to-speech (TTS) system Papageno.
ES2002-45
Neural predictive coding for speech discriminant feature extraction: The DFE-NPC
M. Chetouani, B. Gas, J.L. Zarader, C. Chavy
Abstract:
In this paper, we present a predictive neural network called Neural Predictive Coding (NPC). This model is used for nonlinear discriminant feature extraction (DFE) applied to phoneme recognition. We also present a new extension of the NPC model: DFE-NPC. In order to evaluate the performance of the DFE-NPC model, we carried out a study of Darpa-Timit phoneme recognition (in particular the /b/, /d/, /g/ and /p/, /t/, /q/ phonemes). Comparisons with coding methods (LPC, MFCC, PLP, RASTA-PLP) are presented: they clearly show an improvement in classification.
ES2002-47
Multiresolution codes for scene categorization
N. Denquive, P. Tarroux
Abstract:
The development of fast and reliable image classification algorithms is mandatory for modern image applications involving large databases. Biological systems seem to have the ability to categorize complex scenes in an accurate and very fast way. Our aim is to develop an architecture that leads to similar performances in computer vision. In this work, we present a coding method based on some principles inspired from biology that achieves a fast classification of complex visual scenes. A signature vector is extracted from the visual scene by a multi-scale filtering obtained through a bank of Gabor filters. These vectors constitute the inputs of a radial basis function network. The first connection layer implements a recoding of the filter outputs. The second one achieves a linear separation of the classes in the space of coding. We showed that an incremental approach in which each class is learned separately outperforms a more global one in which we tried to learn all classes together. According to the considered image category, the subset of features leading to the best result could be different, suggesting the use of feature vectors adapted to each image category. However, one of the major results of our study is that the signature vector we used, albeit very simple to compute, contains enough information to allow a correct image classification.
ES2002-48
Evaluation of gradient descent learning algorithms with adaptive and local learning rate for recognising hand-written numerals
M. Giudici, F. Queirolo, M. Valle
Abstract:
Gradient descent learning algorithms such as Back Propagation (BP) can significantly increase the classification performance of Multi Layer Perceptrons by adopting a local and adaptive learning rate management approach. In this paper, we compare the performance on hand-written character classification of two BP algorithms, implementing fixed and adaptive learning rates respectively. The results show that both the validation error and the average number of learning iterations are lower for the adaptive learning rate BP algorithm.
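A local, adaptive learning-rate scheme of the general kind compared in this abstract can be sketched with an Rprop-style sign rule: one step size per weight, grown while the gradient sign persists and halved when it flips. This is a generic illustration on the XOR toy problem, not the authors' algorithm, network or data; all sizes and constants are assumptions:

```python
import numpy as np

def train_xor_adaptive(epochs=500, seed=0):
    rng = np.random.default_rng(seed)
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    y = np.array([[0.], [1.], [1.], [0.]])
    # 2-8-1 MLP with tanh hidden units and a sigmoid output.
    params = [rng.normal(0, 0.5, s) for s in [(2, 8), (1, 8), (8, 1), (1, 1)]]
    steps = [np.full_like(p, 0.1) for p in params]   # per-weight step sizes
    prev = [np.zeros_like(p) for p in params]        # previous gradients

    losses = []
    for _ in range(epochs):
        W1, b1, W2, b2 = params
        h = np.tanh(X @ W1 + b1)
        o = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
        losses.append(float(np.mean((o - y) ** 2)))
        # Backprop of the MSE loss (full batch).
        do = 2 * (o - y) * o * (1 - o) / len(X)
        dh = do @ W2.T * (1 - h ** 2)
        grads = [X.T @ dh, dh.sum(axis=0, keepdims=True),
                 h.T @ do, do.sum(axis=0, keepdims=True)]
        for p, g, s, pg in zip(params, grads, steps, prev):
            # Local rate adaptation: grow the step while the gradient sign
            # persists, shrink it when the sign flips.
            same = np.sign(g) * np.sign(pg)
            s *= np.where(same > 0, 1.2, np.where(same < 0, 0.5, 1.0))
            np.clip(s, 1e-6, 1.0, out=s)
            p -= np.sign(g) * s
            pg[...] = g
    return params, losses
```

Because each weight carries its own step size, no single global learning rate has to suit every direction of the error surface, which is the property such comparisons exploit.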
Learning
ES2002-1
Batch-RLVQ
B. Hammer, T. Villmann
Abstract:
Recently a variation of learning vector quantization has been proposed in [Bojer et al.] which allows an automatic determination of relevance factors for the input dimensions: relevance learning vector quantization (RLVQ). RLVQ is heuristically motivated and may show instabilities for inappropriate data since it does not obey a gradient dynamics. Here we propose an energy function which describes the dynamics of RLVQ in the stable phase. It can be used to substitute the original dynamics in unstable situations. Moreover, it yields a batch version of RLVQ in which hard competition can be replaced by soft clustering, so that annealing schemes can be applied naturally in order to avoid local minima.
ES2002-8
Combining gestural and contact information for visual guidance of multi-finger grasps
G. Heidemann, H. Ritter
Abstract:
A computer vision system for a three-fingered robot hand is presented which can solve two entirely different tasks at a time: First, to guide the robot hand, hand gestures of a human instructor are classified using the hand camera. Second, when an object has been grasped the success or failure of the grasping action can be judged qualitatively by the same system. Both tasks are solved using a view based approach which classifies a set of prototypical situations instead of exact geometric reconstruction.
ES2002-14
Separation of a mixture of signals using linear filtering and second order statistics
A.M. Tomé
Abstract:
Some recent works address the problem of blind source separation with a matrix pencil. In this paper we show that the covariance matrices of the pencil can be computed at the output of a simple linear filter instead of using time-delayed covariance matrices. It is also shown, using block matrix manipulation, that the method can be applied when the number of source signals is not equal to the number of mixed signals. An experimental study, comparing different strategies for computing the matrix pencil, is also presented.
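The idea of building the pencil from the covariance of filtered mixtures (rather than from time-delayed covariances) can be sketched as follows, for the equal-channel case. The FIR filter and the whitening-based joint diagonalization are illustrative choices, not necessarily the authors' exact procedure:

```python
import numpy as np

def pencil_bss(X, filt=(1.0, 1.0)):
    # Second-order blind separation via a matrix pencil: pair the zero-lag
    # covariance of the mixtures with the covariance of a filtered copy,
    # and jointly diagonalize them (whitening + eigendecomposition).
    X = X - X.mean(axis=1, keepdims=True)
    C0 = X @ X.T / X.shape[1]
    Xf = np.apply_along_axis(lambda s: np.convolve(s, filt, mode="same"), 1, X)
    Cf = Xf @ Xf.T / X.shape[1]
    d, E = np.linalg.eigh(C0)
    Wh = E / np.sqrt(d)                  # whitening transform (columns)
    M = Wh.T @ Cf @ Wh
    _, V = np.linalg.eigh((M + M.T) / 2)
    return (Wh @ V).T                    # rows = unmixing directions
```

Separation succeeds when the sources respond differently to the filter (distinct spectra), since that is what makes the pencil's generalized eigenvalues distinct.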
ES2002-41
Sparse image coding using an asynchronous spiking neural network
L. Perrinet, M. Samuelides
Abstract:
In order to explore coding strategies in the retina, we use a wavelet-like transform whose output is sparse, as is observed in biological retinas [Olshausen98]. This transform is defined in the context of a one-pass feed-forward spiking neural network, and its output is the list of the neurons' spikes: it is constructed recursively using a greedy matching pursuit scheme which selects the highest contrast energy values first. As in [Vanrullen01], we find invariants in the output for some classes of images, allowing the absolute contrast value to be coded solely by its rank in the spike list. An application to image compression is shown which is comparable to other techniques such as JPEG at low bit rates.
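A minimal sketch of the greedy matching pursuit loop this abstract relies on (the dictionary, signal size, and number of iterations below are hypothetical, and a random dictionary stands in for the paper's wavelet-like filters):

```python
import numpy as np

rng = np.random.default_rng(1)
# hypothetical dictionary of unit-norm "receptive fields" (stand-in for wavelets)
D = rng.normal(size=(16, 64))
D /= np.linalg.norm(D, axis=1, keepdims=True)
x = rng.normal(size=64)              # toy "image" signal

# greedy matching pursuit: repeatedly pick the filter with the highest response,
# emit its index (the "spike"), and subtract its contribution from the residual
residual = x.copy()
spikes = []                          # spike list: (neuron index, coefficient)
for _ in range(8):
    responses = D @ residual
    k = int(np.argmax(np.abs(responses)))
    spikes.append((k, responses[k]))
    residual -= responses[k] * D[k]

# rank-order coding keeps only the order of the indices, not the coefficients
rank_code = [k for k, _ in spikes]
```

Each subtraction removes the projection onto a unit-norm atom, so the residual energy strictly decreases; the rank code exploits this monotone ordering of contrast values.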
Hardware and Parallel Computer Implementations of Neural Networks
ES2002-350
Artificial Neural Networks on Massively Parallel Computer Hardware
U. Seiffert
Abstract:
It seems to be an everlasting discussion: spending a lot of additional time and extra money to implement a particular algorithm on parallel hardware is considered the ultimate solution to all existing time problems by some, and the silliest waste of time by others. In fact, there are many pros and cons, which should always be weighed individually. Notwithstanding many specific constraints, artificial neural networks are in general worth considering for parallel implementation. This tutorial paper gives a survey and guides those who are willing to go the way of a parallel implementation utilizing the most recent and accessible parallel computer hardware and software. The paper is rounded off with an extensive reference section.
ES2002-354
PCNN neurocomputers - Event driven and parallel architectures
C. Grassmann, T. Schoenauer, C. Wolff
Abstract:
The simulation of large spiking neural networks (PCNN), especially for vision purposes, is limited by the computing power of general purpose computer systems [5,9,10]. Therefore, the simulation of real world scenarios requires dedicated simulator systems. This article presents architectures of software and hardware implementations for PCNN simulator systems. The implementations are based on a common event driven approach using spike events for communication and processing flow. Furthermore, parallel approaches utilizing spike event computing are introduced for simulation acceleration. Implementations of software simulators on workstation clusters and parallel computers, and hardware accelerators based on FPGAs, ASICs and DSPs, are described. The presented results demonstrate the capability to simulate large vision networks close to real world/real time requirements.
ES2002-359
A reconfigurable SOM hardware accelerator
M. Porrmann, M. Franzmeier, H. Kalte, U. Witkowski, U. Rückert
Abstract:
A dynamically reconfigurable hardware accelerator for self-organizing feature maps is presented. The system is based on the universal rapid prototyping system RAPTOR2000, which has been developed by the authors. The modular prototyping system is based on XILINX FPGAs and is capable of emulating hardware implementations with a complexity of more than 24 million system gates. RAPTOR2000 is linked to its host - a standard personal computer or workstation - via the PCI bus. For the simulation of self-organizing maps, a module has been designed for the RAPTOR2000 system that embodies an FPGA of the Xilinx Virtex series and optionally up to 128 MBytes of SDRAM. For typical applications of self-organizing maps, a speed-up of about 50 is achieved with five FPGA modules on the RAPTOR2000 system compared to a software implementation on a state-of-the-art personal computer.
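For reference, the computation such accelerators speed up is the standard SOM update step, sketched below in software form; the map size, learning rate, and neighbourhood width are hypothetical and unrelated to the RAPTOR2000 configuration.

```python
import numpy as np

rng = np.random.default_rng(2)
grid = 8                                  # 8x8 map (hypothetical size)
W = rng.random((grid, grid, 3))           # weight vectors, e.g. for 3-D data
coords = np.dstack(np.meshgrid(np.arange(grid), np.arange(grid), indexing="ij"))

def som_step(W, x, lr=0.1, sigma=2.0):
    """One SOM update: find the best-matching unit, pull its neighbourhood towards x."""
    d = np.linalg.norm(W - x, axis=2)
    bmu = np.unravel_index(np.argmin(d), d.shape)
    # Gaussian neighbourhood around the BMU on the map grid
    g = np.exp(-np.sum((coords - np.array(bmu)) ** 2, axis=2) / (2 * sigma ** 2))
    return W + lr * g[..., None] * (x - W)

for x in rng.random((500, 3)):
    W = som_step(W, x)
```

The distance search and the neighbourhood-weighted update are embarrassingly parallel across units, which is why FPGA implementations obtain large speed-ups.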
ES2002-352
Stochastic resonance and finite resolution in a leaky integrate-and-fire neuron
N. Mtetwa, L.S. Smith, A. Hussain
Abstract:
The paper discusses the effect of stochastic resonance (SR) in a leaky integrate-and-fire (LIF) neuron and investigates its realisation on low-resolution digitally implemented systems. We report in this new study that stochastic resonance, which is mainly associated with floating point implementations, is possible on lower-resolution integer-based representations, which results in real-time performance on digital hardware.
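The SR effect in an LIF neuron can be sketched as follows: a sub-threshold periodic input alone never fires, but added noise produces spikes that carry the signal. This toy simulation uses a plain Euler discretization with hypothetical parameters (and, for brevity, noise that is not sqrt(dt)-scaled); it is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(3)

def lif_spikes(signal, noise_std, dt=1e-3, tau=0.02, threshold=1.0):
    """Leaky integrate-and-fire: noise can push a sub-threshold input over threshold."""
    v, spikes = 0.0, []
    for i, s in enumerate(signal):
        v += dt / tau * (-v + s + noise_std * rng.normal())
        if v >= threshold:
            spikes.append(i)
            v = 0.0                       # reset after a spike
    return spikes

t = np.arange(0, 2, 1e-3)
weak = 0.3 * np.sin(2 * np.pi * 5 * t) + 0.5    # sub-threshold periodic input
silent = lif_spikes(weak, noise_std=0.0)        # no noise: no spikes at all
noisy = lif_spikes(weak, noise_std=5.0)         # noise produces signal-driven spikes
```

The finite-resolution question the paper studies amounts to asking how coarsely `v` and the noise can be quantized (e.g. to low-bit integers) before this effect disappears.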
ES2002-358
Hardware solutions for implementation of neural networks in High Energy Physics triggers
J.-C. Prévotet, B. Denby, P. Garda, B. Granado, C. Kiesling
Abstract:
Neural networks have been used as triggers in HEP for more than ten years, and continue to deliver promising results. In this article, we will give an overview of the triggering problem and present general neural online solutions retained by physicists to process data in High Energy Physics triggers. We will finally describe an FPGA implemented architecture dedicated to fast neural computations, taking advantage of massive parallelism in order to meet the tight timing constraints imposed by Level 1 neural triggers.
Perspectives on Learning with Recurrent Networks
ES2002-200
Perspectives on learning with recurrent neural networks
B. Hammer, J.J. Steil
Abstract:
We present an overview of current lines of research on learning with recurrent neural networks (RNNs). Topics covered are: understanding and unification of algorithms, theoretical foundations, new efforts to circumvent gradient vanishing, new architectures, and fusion with other learning methods and dynamical systems theory. The structuring guideline is to understand many new approaches as different efforts to regularize and thereby improve recurrent learning. Often this is done on two levels: by restricting the learning objective by constraints, for instance derived from stability conditions or weight normalization, and by imposing architectural constraints as for instance local recurrence.
ES2002-211
DEKF-LSTM
F.A. Gers, J.A. Perez-Ortiz, D. Eck, J. Schmidhuber
Abstract:
Unlike traditional recurrent neural networks, the long short-term memory (LSTM) model generalizes well when presented with training sequences derived from regular and also simple nonregular languages. Our novel combination of LSTM and the decoupled extended Kalman filter, however, learns even faster and generalizes even better, requiring only the 10 shortest exemplars (n = 10) of the context-sensitive language a^n b^n c^n to deal correctly with values of n up to 1000 and more. Even when we consider the relatively high update complexity per timestep, in many cases the hybrid offers faster learning than LSTM by itself.
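The training data for this task is easy to state concretely; a minimal generator for the a^n b^n c^n exemplars and a membership test (our own helpers, not code from the paper) look like this:

```python
def anbncn(n):
    """One exemplar of the context-sensitive language a^n b^n c^n."""
    return "a" * n + "b" * n + "c" * n

# training set: the 10 shortest strings, as described in the abstract
train = [anbncn(n) for n in range(1, 11)]

def in_language(s):
    """Membership test used to score generalization to longer strings."""
    n = s.count("a")
    return n > 0 and s == anbncn(n)
```

Generalization to n around 1000 means the trained network must accept `anbncn(1000)` (a string of length 3000) while rejecting near misses, despite having only seen strings of length at most 30.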
ES2002-207
Generalization by structural properties from sparse nested symbolic data
M. Boden
Abstract:
A set of simulations demonstrates that recurrent networks can exhibit generalization by abstraction from extremely sparse but structurally homogeneous symbolic data. By cascading two recurrent networks -- feeding the second network with discretized hidden states of the first -- it is also possible to generalize according to complex structure. Through automatic discretization, the cascaded architecture assists in scaling up sequential learning tasks and offers explanations for the apparent systematicity and generativity of language use.
ES2002-210
Estimating probabilities for unbounded categorization problems
J. Henderson
Abstract:
We propose two output activation functions for estimating probability distributions over an unbounded number of categories with a recurrent neural network, and derive the statistical assumptions which they embody. Both these methods perform better than the standard approach to such problems, when applied to probabilistic parsing of natural language with Simple Synchrony Networks.
ES2002-206
A general framework for unsupervised processing of structured data
B. Hammer, A. Micheli, A. Sperduti
Abstract:
We propose a general framework for unsupervised recurrent and recursive networks. This proposal covers various popular approaches like standard self organizing maps (SOM), temporal Kohonen maps, recursive SOM, and SOM for structured data. We define Hebbian learning within this general framework. We show how approaches based on an energy function, like neural gas, can be transferred to this abstract framework so that proposals for new learning algorithms emerge.
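One of the approaches the framework covers, the temporal Kohonen map, is easy to sketch: the winner is chosen on a leaky integral of distances rather than on the current distance alone, so the map responds to sequences. Map size, decay, and learning rate below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
W = rng.random((25, 2))                  # 25 units, 2-D inputs
act = np.zeros(25)                       # leaky-integrated activations
alpha, lr = 0.5, 0.05                    # decay and learning rate (hypothetical)

for x in rng.random((300, 2)):           # an input *sequence*, not i.i.d. points
    # leaky integration: the winner depends on past inputs as well as the current one
    act = alpha * act - 0.5 * np.linalg.norm(W - x, axis=1) ** 2
    bmu = int(np.argmax(act))
    W[bmu] += lr * (x - W[bmu])          # Hebbian-style winner update
```

Setting alpha to 0 recovers the standard SOM winner rule, which illustrates how the general framework contains the non-temporal models as special cases.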
ES2002-204
Undershooting: modeling dynamical systems by time grid refinements
H.G. Zimmermann, R. Neuneier, R. Grothmann
Abstract:
When building models of dynamical systems on the basis of observed data, the time grid of the data is typically the same as the time grid of the model. We show that a refinement of the model time grid relative to a wider-meshed time grid of the data provides deeper insights into the dynamics. This ''undershooting'' can be derived from the principle of uniform causality. Combining undershooting with recurrent error correction neural networks leads to a novel approach which improves the performance of our models by time grid refinements.
ES2002-203
Learning in a chaotic neural network
N. Crook, T. olde Scheper
Abstract:
Previous research has shown how the Unstable Periodic Orbits (UPOs) embedded in a chaotic attractor can be made to correspond to self-organised dynamic memory states in a chaotic neural network. This paper demonstrates how this chaotic neural network model can be extended to enable it to adapt to dynamic input patterns using two unsupervised learning rules. The proposed learning rules are designed to modify model parameters in order to support the network's dynamics from which the memories emerge. This means that input weights and feedback delays are adapted so that the network will stabilise an appropriate UPO in response to each input signal.
ES2002-205
Yield curve forecasting by error correction neural networks and partial learning
H.G. Zimmermann, Ch. Tietz, R. Grothmann
Abstract:
Error correction neural networks (ECNN) are an appropriate framework for the modeling of dynamical systems in the presence of noise or missing external influences. Combining ECNNs with the concept of variants-invariants separation in the form of a bottleneck coordinate transformation, we are able to handle high-dimensional problems. Furthermore, we propose a new learning rule for the training of neural networks which evaluates only specific gradients for the adaptation of the network weights. In this way, we are able to generate time-invariant localized structures and thus support the optimization of the network. In forecasting the German yield curve, an ECNN including the separation of variants-invariants is superior to traditional neural networks.
ANN models and learning III
ES2002-24
State reconstruction of piecewise linear maps using a clustering machine
G. Millerioux, G. Bloch
Abstract:
State reconstruction of piecewise linear systems is addressed. The description of such a family of systems involves, for each region of the partitioned state space, an affine description and a switching rule which orchestrates the way the dynamics changes from one linear form to another. This results in two distinct states: the continuous state and the discrete state. An observer of piecewise linear systems must recover both of them. It is shown that the discrete state can be recovered by a clustering technique. The continuous state reconstruction is formulated as a set of Linear Matrix Inequalities to be solved. They are derived from the notion of poly-quadratic stability and ensure global convergence of the observer.
ES2002-52
Unsupervised classifier for monitoring and diagnostic of time series
S. Lecoeuche
Abstract:
It is assumed that complex systems are represented by parameters that evolve with time. Hence, it is possible to monitor systems and to make diagnoses by analyzing time series. The paper presents the development of a neural architecture. A membership degree of an input vector to a prototype is introduced, along with the membership degree of the input to a class. The proposed unsupervised learning process makes possible the creation of new prototypes and new classes when necessary. The application to standard time series shows good results: 96% of the inputs are well classified using few prototypes.
ES2002-56
Width optimization of the Gaussian kernels in Radial Basis Function Networks
N. Benoudjit, C. Archambeau, A. Lendasse, J. Lee, M. Verleysen
Abstract:
Radial basis function networks are usually trained according to a three-stage procedure. In the literature, many papers are devoted to the estimation of the position of Gaussian kernels, as well as the computation of the weights. Meanwhile, very few focus on the estimation of the kernel widths. In this paper, first, we develop a heuristic to optimize the widths in order to improve the generalization process. Subsequently, we validate our approach on several theoretical and real-life approximation problems.
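The three-stage procedure the abstract refers to can be sketched as follows; the grid placement of centers, the nearest-centre width rule, and the scaling factor q are illustrative stand-ins, not the paper's heuristic.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(4)
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X[:, 0])                       # toy regression target

# stage 1: fix kernel positions (here a uniform grid instead of clustering)
centers = np.linspace(-1, 1, 10)[:, None]

# stage 2: widths from a scaled nearest-centre distance (hypothetical factor q)
q = 1.5
d = cdist(centers, centers)
np.fill_diagonal(d, np.inf)
widths = q * d.min(axis=1)

# stage 3: output weights by linear least squares on the kernel activations
Phi = np.exp(-cdist(X, centers) ** 2 / (2 * widths ** 2))
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
y_hat = Phi @ w
```

Only stage 2 is the paper's concern: widths that are too small leave gaps between kernels, widths that are too large make Phi ill-conditioned, and both hurt generalization.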
ES2002-36
High frequency forecasting with associative memories
A. Pasley, J. Austin
ES2002-63
Nonlinear PCA: a new hierarchical approach
M. Scholz, R. Vigario
Abstract:
Traditionally, nonlinear principal component analysis (NLPCA) is seen as a nonlinear generalization of standard (linear) principal component analysis (PCA). So far, most of these generalizations rely on a symmetric type of learning. Here we propose an algorithm that extends PCA into NLPCA through a hierarchical type of learning. The hierarchical algorithm (h-NLPCA), like many versions of the symmetric one (s-NLPCA), is based on a multi-layer perceptron with an auto-associative topology, whose learning rule has been upgraded to accommodate the desired discrimination between components. With h-NLPCA we seek not only the nonlinear subspace spanned by the optimal set of components, ideal for data compression, but also give particular attention to the order in which these components appear. Due to its hierarchical nature, our algorithm is shown to be very efficient in detecting meaningful nonlinear features from real world data, as well as in providing a nonlinear whitening. Furthermore, in a quantitative analysis, h-NLPCA achieves better classification accuracies, with a smaller number of components, than most traditional approaches.
ES2002-28
Probabilistic derivation and Multiple Canonical Correlation Analysis
P.L. Lai
Abstract:
We review a new method of performing Canonical Correlation Analysis (CCA) with Artificial Neural Networks. We have previously [4,5] compared its capabilities with standard statistical methods on simple data sets where the maximum correlations are given by linear filters. In this paper, we extend the method by implementing a very precise set of constraints which allow multiple correlations to be found at once. We demonstrate the network's capabilities on the standard Random Dot Stereogram data set. We also re-derive the learning rules from a probabilistic perspective, then, by use of a specific prior on the weights, simplify a model [2] which is an abstraction of the Random Dot Stereogram matching problem, and show how a second-layer network using Factor Analysis can be used to combine the results of the CCA network to obtain higher order information.
ES2002-66
Orthogonal transformations for optimal time series prediction
M. Salmeron, A. Prieto, J. Ortega, C.G. Puntonet, M. Rodriguez Alvarez
ES2002-69
Neuro-fuzzy methodologies for the clustering and the reliability estimation of olive fruit fly infestation
E. Bellei, R. Petacchi, L. Reyneri
Abstract:
The present article describes the latest results obtained from the application of neuro-fuzzy techniques to the study of Bactrocera Oleae infestation in the olive groves of the Liguria region. The project “Applications of Neuro-Fuzzy Techniques in Agriculture” started in March 2000 with the monitoring and collection of data from a large number of oil farms. The main aim of the project was to realize an area-wide Bactrocera Oleae monitoring network in order to administer IPM and to provide technical assistance in treatments to each farm. This aim was pursued through the creation of neuro-fuzzy systems that extract infestation features, classify them with labels suitable for suggesting treatments for each monitored farm, and also estimate the reliability of the olive fly measurements. During the project, it has been shown that standard approaches to forecasting the growth of the olive fly give less accurate and less flexible results than new analysis techniques such as neuro-fuzzy methodologies, which are better suited to non-linear and complex problems like agronomic ones.
ES2002-30
Use of artificial neural networks process analyzers: a case study
H. Al-Duwaish, L. Ghouti, T. Halawani, M. Mohandes
Abstract:
In this paper, artificial neural networks (ANN), which are known for their ability to model nonlinear systems and their inherent noise-filtering abilities, are used as an O2 analyzer to predict the O2 content in a boiler at the SHARQ petrochemical company in Saudi Arabia. The training data were collected over a duration of one month and used to train a neural network to develop a neural-network-based oxygen analyzer. The results are very promising.
Information extraction
ES2002-17
Forecasting using twinned principal curves
Y. Han, C. Fyfe
ES2002-50
Kernel Temporal Component Analysis (KTCA)
D. Martinez, A. Bray
Abstract:
We describe an efficient algorithm for simultaneously extracting multiple smoothly-varying non-linear invariances from time-series data. The method exploits the concept of maximizing temporal predictability introduced by Stone in the linear domain, which we term temporal component analysis (TCA). Our current work extends this linear method into the non-linear domain using kernel-based methods; it performs a non-linear projection of the input into an unknown high-dimensional feature space and computes a linear solution in this space. In this paper we describe an improved on-line version of this algorithm (KTCA) for working on very large data sets, and demonstrate its applicability to computer vision by extracting non-linear disparity directly from grey-level stereo pairs, without pre-processing.
ES2002-62
Exploratory Correlation Analysis
J. Koetsier, D. MacDonald, D. Charles, C. Fyfe
Abstract:
We present a novel unsupervised artificial neural network for the extraction of common features in multiple data sources. This algorithm, which we name Exploratory Correlation Analysis (ECA), is a multi-stream extension of a neural implementation of Exploratory Projection Pursuit (EPP) and has a close relationship with Canonical Correlation Analysis (CCA). Whereas EPP identifies "interesting" statistical directions in a single stream of data, ECA develops a joint coding of the common underlying statistical features across a number of data streams.
Neural Network Techniques in Fault Detection and Isolation
ES2002-302
Neural networks for fault diagnosis and identification of industrial processes
S. Simani, C. Fantuzzi
Abstract:
In this work a model-based procedure exploiting analytical redundancy via state estimation techniques is presented for the diagnosis of faults in the sensors of a dynamic system. Fault detection is based on Kalman filters designed in a stochastic environment. Fault identification is then performed by means of different neural network architectures. In particular, neural networks are used as function approximators for estimating sensor fault sizes. The proposed fault diagnosis and identification tool is tested on an industrial gas turbine.
ES2002-301
Neural networks for fault diagnosis of industrial plants at different working points
S. Simani, R. J. Patton
Abstract:
Industrial plants often work at different operating points. However, in the literature, applications of neural networks for fault diagnosis usually consider only a single working condition or small changes of operating point. A standard scheme for the design of neural networks for fault diagnosis at all operating points may be impractical due to the unavailability of suitable training data for all working conditions. This paper addresses the design of a single neural network for the diagnosis of faults in the sensors of an industrial gas turbine working at different conditions. The presented results illustrate the performance of the trained neural network for sensor fault diagnosis.
ES2002-303
Fault diagnosis of an electro-pneumatic valve actuator using neural networks with fuzzy capabilities
F.J. Uppal, R.J. Patton
Abstract:
The early detection of incipient faults (those just beginning and still developing) can help avoid system shutdown, breakdown and even catastrophes involving human fatalities and material damage. Computational intelligence techniques are being investigated as an extension of the traditional fault diagnosis methods. This paper discusses the neuro-fuzzy approach to modelling and fault diagnosis, based on the TSK/Mamdani approaches. An application study of an electro-pneumatic valve actuator in a sugar factory is described. The key issues of finding a suitable structure for detecting and isolating ten realistic actuator faults are outlined.
ES2002-304
Non-linear Canonical Correlation Analysis using a RBF network
S. Kumar, E.B. Martin, J. Morris
Abstract:
A non-linear version of the multivariate statistical technique of canonical correlation analysis (CCA) is proposed through the integration of a radial basis function (RBF) network. The advantage of the RBF network is that the solution of linear CCA can be used to train the network and hence the training effort is minimal. Also the canonical variables can be extracted simultaneously. It is shown that the proposed technique can be used to extract non-linear structures inherent within a data set.
ES2002-305
Free-swinging and locked joint fault detection and isolation in cooperative manipulators
R. Tinos, M. H. Terra
Abstract:
The problem of fault detection and isolation (FDI) in cooperative manipulators is addressed. Free-swinging and locked joint faults are detected and isolated by an FDI system based on neural networks. For each arm, a Multilayer Perceptron (MLP) is used to reproduce the dynamics of the fault-free robot. The outputs of each MLP are compared to the real joint velocities in order to generate a residual vector that is then classified by an RBF network. Simulations and a real application are presented indicating the effectiveness of the FDI system.