Bruges, Belgium, April 24-25-26
Content of the proceedings
-
Regression
Exploratory Data Analysis in Medicine and Bioinformatics
Sampling and model selection
Neural Networks and Cognitive Science
ANN models and learning I
Representation of high-dimensional data
ANN models and learning II
Learning
Hardware and Parallel Computer Implementations of Neural Networks
Perspectives on Learning with Recurrent Networks
ANN models and learning II I
Information extraction
Neural Network Techniques in Fault Detection and Isolation
Regression
ES2002-15
Efficient formation of a basis in a kernel induced feature space
G.C. Cawley, N.L.C. Talbot
Efficient formation of a basis in a kernel induced feature space
G.C. Cawley, N.L.C. Talbot
ES2002-4
Theoretical properties of functional Multi Layer Perceptrons
F. Rossi, B. Conan-Guez, F. Fleuret
Theoretical properties of functional Multi Layer Perceptrons
F. Rossi, B. Conan-Guez, F. Fleuret
Abstract:
In this paper, we study a natural extension of Multi Layer Perceptrons (MLP) to functional inputs. We show that fundamental results for numerical MLP can be extended to functional MLP. We obtain universal approximation results that show the expressive power of functional MLP is comparable to the one of numerical MLP. We obtain consistency results which imply that optimal parameters estimation for functional MLP is consistent.
In this paper, we study a natural extension of Multi Layer Perceptrons (MLP) to functional inputs. We show that fundamental results for numerical MLP can be extended to functional MLP. We obtain universal approximation results that show the expressive power of functional MLP is comparable to the one of numerical MLP. We obtain consistency results which imply that optimal parameters estimation for functional MLP is consistent.
ES2002-25
Storing many-to-many mappings on a feed-forward neural network using fuzzy sets
R.K. Brouwer
Storing many-to-many mappings on a feed-forward neural network using fuzzy sets
R.K. Brouwer
Abstract:
Feed-forward networks are generally trained to represent functions or many-to-one (m-o) mappings. In this paper however a feed-forward network with modified training algorithm is considered to represent multi-valued or one-to-many (o-m) mappings. The o-m mapping is viewed as an m-o mapping where the values corresponding to a value of the independent variable are sets. Thus the problem of representing a o-m mapping has been converted into a problem of training a network to return sets rather than vectors. The resulting o-m mapping may have variable multiplicity leading to sets of variable cardinality. The crisp sets of variable cardinality in turn are replaced by fuzzy sets of fixed cardinality by adding elements, called “do not cares” which have membership values of zero. Since the target outputs of the feedforward network are now sets of fixed cardinality and the actual output of a feedforward network is a vector the training algorithm is modified to take into account the fact that order should be removed as a constraint when the error vector is calculated. Results of simulations show that the method proposed is quite effective.
Feed-forward networks are generally trained to represent functions or many-to-one (m-o) mappings. In this paper however a feed-forward network with modified training algorithm is considered to represent multi-valued or one-to-many (o-m) mappings. The o-m mapping is viewed as an m-o mapping where the values corresponding to a value of the independent variable are sets. Thus the problem of representing a o-m mapping has been converted into a problem of training a network to return sets rather than vectors. The resulting o-m mapping may have variable multiplicity leading to sets of variable cardinality. The crisp sets of variable cardinality in turn are replaced by fuzzy sets of fixed cardinality by adding elements, called “do not cares” which have membership values of zero. Since the target outputs of the feedforward network are now sets of fixed cardinality and the actual output of a feedforward network is a vector the training algorithm is modified to take into account the fact that order should be removed as a constraint when the error vector is calculated. Results of simulations show that the method proposed is quite effective.
ES2002-16
Heteroscedastic regularised kernel regression for prediction of episodes of poor air quality
R.J. Foxall, G.C. Cawley, N.L.C. Talbot, S.R. Dorling, D.P. Mandic
Heteroscedastic regularised kernel regression for prediction of episodes of poor air quality
R.J. Foxall, G.C. Cawley, N.L.C. Talbot, S.R. Dorling, D.P. Mandic
Abstract:
\N
\N
Exploratory Data Analysis in Medicine and Bioinformatics
ES2002-400
Exploratory Data Analysis in Medicine and Bioinformatics
A. Wismüller, T. Villmann
Exploratory Data Analysis in Medicine and Bioinformatics
A. Wismüller, T. Villmann
ES2002-401
A data vizualisation method for investigating the reliability of a high-dimensional low-back-pain MLP network
M.L. Vaughn, S.J. Taylor, M.A. Foy, A.J.B. Fogg
A data vizualisation method for investigating the reliability of a high-dimensional low-back-pain MLP network
M.L. Vaughn, S.J. Taylor, M.A. Foy, A.J.B. Fogg
Abstract:
This study uses a new data visualization method, developed by the first author, to investigate the reliability of a real world low back-pain Multi-layer Perceptron (MLP) network from a hidden layer decision region perspective. Using decision region identification information from an explanation facility, the MLP training examples are discovered to occupy decision regions in contiguous class threads across the 48-dimensional input space. MLP testing cases show a similar distribution and consistency within the contiguous threads but with a reduced reliability. Three test regions outside the network’s knowledge bounds are situated between training regions with a consistent classification.
This study uses a new data visualization method, developed by the first author, to investigate the reliability of a real world low back-pain Multi-layer Perceptron (MLP) network from a hidden layer decision region perspective. Using decision region identification information from an explanation facility, the MLP training examples are discovered to occupy decision regions in contiguous class threads across the 48-dimensional input space. MLP testing cases show a similar distribution and consistency within the contiguous threads but with a reduced reliability. Three test regions outside the network’s knowledge bounds are situated between training regions with a consistent classification.
ES2002-403
Double self-organizing maps to cluster gene expression data
D. Wang, H. Ressom, M. Musavi, C. Domnisoru
Double self-organizing maps to cluster gene expression data
D. Wang, H. Ressom, M. Musavi, C. Domnisoru
Abstract:
Clustering is a very useful and important technique for analyzing gene expression data. Self-organizing map (SOM) is one of the most useful clustering algorithms. SOM requires the number of clusters to be one of the initialization parameters prior to clustering. However, this information is unavailable in most cases, particularly in gene expression data. Thus, the validation results from SOM are commonly employed to choose the appropriate number of clusters. This approach is very inconvenient and time-consuming. This paper applies a novel model of SOM, which is known as double self-organizing map (DSOM), to cluster gene expression data. DSOM helps to find the appropriate number of clusters by clearly and visually depicting the appropriate number of clusters. We use DSOM to cluster an artificial data set and two kinds of real gene expression data sets. To validate our results, we employed a novel validation technique, which is known as figure of merit (FOM)
Clustering is a very useful and important technique for analyzing gene expression data. Self-organizing map (SOM) is one of the most useful clustering algorithms. SOM requires the number of clusters to be one of the initialization parameters prior to clustering. However, this information is unavailable in most cases, particularly in gene expression data. Thus, the validation results from SOM are commonly employed to choose the appropriate number of clusters. This approach is very inconvenient and time-consuming. This paper applies a novel model of SOM, which is known as double self-organizing map (DSOM), to cluster gene expression data. DSOM helps to find the appropriate number of clusters by clearly and visually depicting the appropriate number of clusters. We use DSOM to cluster an artificial data set and two kinds of real gene expression data sets. To validate our results, we employed a novel validation technique, which is known as figure of merit (FOM)
ES2002-402
Improving robustness of fuzzy gene modeling
R. Reynolds, H. Ressom, M. Musavi, C. Domnisoru
Improving robustness of fuzzy gene modeling
R. Reynolds, H. Ressom, M. Musavi, C. Domnisoru
Abstract:
This paper proposes modifications to current fuzzy models of gene interaction. Current algorithms apply all combinations of genes to a fuzzy model (i.e. activator/repressor/target), evaluating how well each combination fits the model. The models are susceptible to noisy signals in the gene expression data. Since the margin of error in current microarray technology can be high, the results generated may not properly reflect valid relationships. This paper investigates different methods of creating fuzzy models. We explore methods of conjunction and rule aggregation that produce valid results while being resilient to minor changes to model input.
This paper proposes modifications to current fuzzy models of gene interaction. Current algorithms apply all combinations of genes to a fuzzy model (i.e. activator/repressor/target), evaluating how well each combination fits the model. The models are susceptible to noisy signals in the gene expression data. Since the margin of error in current microarray technology can be high, the results generated may not properly reflect valid relationships. This paper investigates different methods of creating fuzzy models. We explore methods of conjunction and rule aggregation that produce valid results while being resilient to minor changes to model input.
Sampling and model selection
ES2002-60
Parametric bootstrap for test of contrast difference in neural networks
R. Kallel, J. Rynkiewicz
Parametric bootstrap for test of contrast difference in neural networks
R. Kallel, J. Rynkiewicz
Abstract:
This work concernes the contrast difference test and its asymptotic properties for non linear auto-regressive models. Our approach is based on an application of the parametric bootstrap method. It is a re-sampling method based on the estimate parameters of the models. The resulting methodology is illustrated by simulations of multilayer perceptron models, and an asymptotic justification is given at the end.
This work concernes the contrast difference test and its asymptotic properties for non linear auto-regressive models. Our approach is based on an application of the parametric bootstrap method. It is a re-sampling method based on the estimate parameters of the models. The resulting methodology is illustrated by simulations of multilayer perceptron models, and an asymptotic justification is given at the end.
ES2002-64
A resampling and multiple testing-based procedure for determining the size of a neural network
A. Yanes Escolano, E. Guerrero Vazquez, P.L. Galindo Riano, J. Pizarro Junquera
A resampling and multiple testing-based procedure for determining the size of a neural network
A. Yanes Escolano, E. Guerrero Vazquez, P.L. Galindo Riano, J. Pizarro Junquera
Abstract:
One of the most important difficulties in using neural networks for a real-world problem is the issue of model complexity, and how affects the generalization performance. We present a new algorithm based on multiple comparison methods for finding low complexity neural networks with high generalization capability.
One of the most important difficulties in using neural networks for a real-world problem is the issue of model complexity, and how affects the generalization performance. We present a new algorithm based on multiple comparison methods for finding low complexity neural networks with high generalization capability.
Neural Networks and Cognitive Science
ES2002-450
Neural networks for modeling memory : case studies
H. Paugam-Moisy, D. Puzenat, E. Reynaud, J.-P. Magué
Neural networks for modeling memory : case studies
H. Paugam-Moisy, D. Puzenat, E. Reynaud, J.-P. Magué
Abstract:
First, neural networks have been inspired by cognitive processes [MCC43,HEB49,RUM86]. Second, they were proved to be very efficient computing tools for engineering, financial and medical applications [FRE91,BIS95,HER94,BLA96]. In this article we point out that there is still a great interest, for both engineering and cognitive science, to explore more deeply the links between natural and artificial neural systems. On the one hand: how to define more complex learning rules adapted to heterogeneous neural networks and how to build modular multi-network systems for modeling cognitive processes. On the other hand: how to derive new interesting learning paradigms back, for artificial neural networks, and how to design more performant systems than classical basic connectionist models. After a short survey of connectionist models for modeling memory, we develop two case studies. The first is a model for a multimodal associative memory and the second is a model for more deeply understanding the mechanisms of spatial cognition.
First, neural networks have been inspired by cognitive processes [MCC43,HEB49,RUM86]. Second, they were proved to be very efficient computing tools for engineering, financial and medical applications [FRE91,BIS95,HER94,BLA96]. In this article we point out that there is still a great interest, for both engineering and cognitive science, to explore more deeply the links between natural and artificial neural systems. On the one hand: how to define more complex learning rules adapted to heterogeneous neural networks and how to build modular multi-network systems for modeling cognitive processes. On the other hand: how to derive new interesting learning paradigms back, for artificial neural networks, and how to design more performant systems than classical basic connectionist models. After a short survey of connectionist models for modeling memory, we develop two case studies. The first is a model for a multimodal associative memory and the second is a model for more deeply understanding the mechanisms of spatial cognition.
ES2002-452
Connectionist models investigating representations formed in the sequential generation of characters
F.M. Richardson, N. Davey, L. Peters, D.J. Done, S.H. Anthony
Connectionist models investigating representations formed in the sequential generation of characters
F.M. Richardson, N. Davey, L. Peters, D.J. Done, S.H. Anthony
Abstract:
This paper considers the results of three different methods of encoding visual and motor representations of single sequential character production using three different architectures for the simulation of perceptual and motor processes. Examination of such processes through neural net modelling of the generation of handwritten characters promises to be a fruitful avenue of exploration as the induced representations of the models can be examined. The results of this analysis showed that both spatial and temporal similarity were important in these representations. Similar results have been shown to be true for actual representations in the motor cortex.
This paper considers the results of three different methods of encoding visual and motor representations of single sequential character production using three different architectures for the simulation of perceptual and motor processes. Examination of such processes through neural net modelling of the generation of handwritten characters promises to be a fruitful avenue of exploration as the induced representations of the models can be examined. The results of this analysis showed that both spatial and temporal similarity were important in these representations. Similar results have been shown to be true for actual representations in the motor cortex.
ES2002-453
The problem of adaptive control in a living system or how to acquire an inverse model without external help
K. Th. Kalveram, T. Schinauer
The problem of adaptive control in a living system or how to acquire an inverse model without external help
K. Th. Kalveram, T. Schinauer
Abstract:
Recent research uncovers that goal directed sensorimotor behaviour is governed by negative feedback of positional error, and by feedforward through inverse modelling of the limb's dynamics. Thereby, forward models seem to provide the kinematic state of the limb. The question addressed in the paper is, how the neural network representing the inverse model can be trained. Because in this case an error based learning algorithm seems to be unavailable, an alternative non error based method called auto-imitation is proposed. It is demonstrated, that, if combining a special type of neural network (the power net) with a modified type of a Hebbian synapse, the inverse dynamics of an onejointed arm can be precisely identified using auto-imitation. This holds for a simulated arm and a real robot arm as well.
Recent research uncovers that goal directed sensorimotor behaviour is governed by negative feedback of positional error, and by feedforward through inverse modelling of the limb's dynamics. Thereby, forward models seem to provide the kinematic state of the limb. The question addressed in the paper is, how the neural network representing the inverse model can be trained. Because in this case an error based learning algorithm seems to be unavailable, an alternative non error based method called auto-imitation is proposed. It is demonstrated, that, if combining a special type of neural network (the power net) with a modified type of a Hebbian synapse, the inverse dynamics of an onejointed arm can be precisely identified using auto-imitation. This holds for a simulated arm and a real robot arm as well.
ES2002-451
Biologically-inspired human motion detection
V. Laxmi, J.N. Carter, R.I. Damper
Biologically-inspired human motion detection
V. Laxmi, J.N. Carter, R.I. Damper
Abstract:
A model of motion detection is described, inspired by the capability of humans to recognise biological motion even from minimal information systems such as moving light displays. The model, a feed-forward backpropagation neural network, uses labelled joint data, analogous to light points in such displays. In preliminary work, the model achieves 100% person classification on a set of 4 artificial subjects and another of 4 real subjects. Subsequently, 100% motion detection is achieved on a set of 21 subjects. In the latter case, the correspondence problem is also solved by the model, since the network is not `told' which joint is which. Like human beings, the neural networks perform both tasks within a small fraction of the gait cycle.
A model of motion detection is described, inspired by the capability of humans to recognise biological motion even from minimal information systems such as moving light displays. The model, a feed-forward backpropagation neural network, uses labelled joint data, analogous to light points in such displays. In preliminary work, the model achieves 100% person classification on a set of 4 artificial subjects and another of 4 real subjects. Subsequently, 100% motion detection is achieved on a set of 21 subjects. In the latter case, the correspondence problem is also solved by the model, since the network is not `told' which joint is which. Like human beings, the neural networks perform both tasks within a small fraction of the gait cycle.
ES2002-32
Why will rat's go where rats will not?
J. Hayes, V. Murphy, N. Davey, P. Smith, L. Peters
Why will rat's go where rats will not?
J. Hayes, V. Murphy, N. Davey, P. Smith, L. Peters
Abstract:
Experimental evidence indicates that regular plurals are nearly always omitted from English compounds (e.g., rats-eater) while irregular plurals may be included within these structures (e.g., mice-chaser). This phenomenon is considered to be good evidence to support the dual mechanism model of morphological processing (Pinker & Prince, 1992). However, evidence from neural net modelling has shown that a single route associative memory based account might provide an equally, if not more, valid explanation of the compounding phenomenon.
Experimental evidence indicates that regular plurals are nearly always omitted from English compounds (e.g., rats-eater) while irregular plurals may be included within these structures (e.g., mice-chaser). This phenomenon is considered to be good evidence to support the dual mechanism model of morphological processing (Pinker & Prince, 1992). However, evidence from neural net modelling has shown that a single route associative memory based account might provide an equally, if not more, valid explanation of the compounding phenomenon.
ANN models and learning I
ES2002-51
Rule extraction from support vector machines
H. Nunez, C. Angulo, A. Catala
Rule extraction from support vector machines
H. Nunez, C. Angulo, A. Catala
Abstract:
Support vector machines (SVMs) are learning systems based on the statistical learning theory, which are exhibiting good generalization ability on real data sets. Nevertheless, a possible limitation of SVM is that they generate black box models. In this work, a procedure for rule extraction from support vector machines is proposed: the SVM+Prototypes method. This method allows to give explanation ability to SVM. Once determined the decision function by means of a SVM, a clustering algorithm is used to determine prototype vectors for each class. These points are combined with the support vectors using geometric methods to define ellipsoids in the input space, which are later transfers to if-then rules. By using the support vectors we can establish the limits of these regions.
Support vector machines (SVMs) are learning systems based on the statistical learning theory, which are exhibiting good generalization ability on real data sets. Nevertheless, a possible limitation of SVM is that they generate black box models. In this work, a procedure for rule extraction from support vector machines is proposed: the SVM+Prototypes method. This method allows to give explanation ability to SVM. Once determined the decision function by means of a SVM, a clustering algorithm is used to determine prototype vectors for each class. These points are combined with the support vectors using geometric methods to define ellipsoids in the input space, which are later transfers to if-then rules. By using the support vectors we can establish the limits of these regions.
ES2002-6
Fuzzy support vector machines for multiclass problems
S. Abe, T. Inoue
Fuzzy support vector machines for multiclass problems
S. Abe, T. Inoue
Abstract:
Since support vector machines for pattern classification are based on two-class classification problems, unclassifiable regions exist when extended to n ( > 2)-class problems. In our previous work, to solve this problem, we developed fuzzy support vector machines for one-to-(n-1) classification. In this paper, we extend our method to pairwise classification. Namely, using the decision functions obtained by training the support vector machines for classes i and j (j ne i, j =1,..., n), for class i we define a truncated polyhedral pyramidal membership function. The membership functions are defined so that, for the data in the classifiable regions, the classification results are the same for the two methods. Thus, the generalization ability of the fuzzy support vector machine is the same with or better than that of the support vector machine for pairwise classification. We evaluate our method for four benchmark data sets and demonstrate the superiority of our method.
Since support vector machines for pattern classification are based on two-class classification problems, unclassifiable regions exist when extended to n ( > 2)-class problems. In our previous work, to solve this problem, we developed fuzzy support vector machines for one-to-(n-1) classification. In this paper, we extend our method to pairwise classification. Namely, using the decision functions obtained by training the support vector machines for classes i and j (j ne i, j =1,..., n), for class i we define a truncated polyhedral pyramidal membership function. The membership functions are defined so that, for the data in the classifiable regions, the classification results are the same for the two methods. Thus, the generalization ability of the fuzzy support vector machine is the same with or better than that of the support vector machine for pairwise classification. We evaluate our method for four benchmark data sets and demonstrate the superiority of our method.
ES2002-7
Different criteria for active learning in neural networks: a comparative study
J. Poland, A. Zell
Different criteria for active learning in neural networks: a comparative study
J. Poland, A. Zell
Abstract:
The field of active learning and optimal query construction in Neural Network training is tightly connected with the design of experiments and its rich theory. Thus there is a large number of active learning strategies and query criteria which have a sound theoretical foundation. This comparative study considers the regression problem of approximating a nonlinear noisy function with relatively few inputs. We evaluate some query criteria, namely space-filling criteria, variance criteria, markov chain monte carlo methods and query by committee.
The field of active learning and optimal query construction in Neural Network training is tightly connected with the design of experiments and its rich theory. Thus there is a large number of active learning strategies and query criteria which have a sound theoretical foundation. This comparative study considers the regression problem of approximating a nonlinear noisy function with relatively few inputs. We evaluate some query criteria, namely space-filling criteria, variance criteria, markov chain monte carlo methods and query by committee.
ES2002-10
Supervised learning in committee machines by PCA
C. Bunzmann, M. Biehl, R. Urbanczik
Supervised learning in committee machines by PCA
C. Bunzmann, M. Biehl, R. Urbanczik
Abstract:
A learning algorithm for multilayer perceptrons is suggested which relates to the technique of principal component analysis. The latter is performed with respect to a correlation matrix computed from the example inputs and their target outputs. For large networks it is demonstrated that the procedure requires by far fewer examples for good generalization than traditional on--line training prescriptions.
A learning algorithm for multilayer perceptrons is suggested which relates to the technique of principal component analysis. The latter is performed with respect to a correlation matrix computed from the example inputs and their target outputs. For large networks it is demonstrated that the procedure requires by far fewer examples for good generalization than traditional on--line training prescriptions.
ES2002-57
The use of LS-SVM in the classification of brain tumors based on Magnetic Resonance Spectroscopy signals
L. Lukas, A. Devos, J.A.K. Suykens, L. Vanhamme, S. Van Huffel, A.R. Tate, C. Majos, C. Arus
The use of LS-SVM in the classification of brain tumors based on Magnetic Resonance Spectroscopy signals
L. Lukas, A. Devos, J.A.K. Suykens, L. Vanhamme, S. Van Huffel, A.R. Tate, C. Majos, C. Arus
Abstract:
Least Squares Support Vector Machines (LS-SVM) have been developed and successfully applied to classification problems in many areas. In comparison with several other classical methods this technique consistently performs very well on a large variety of problems. Here, results on the application of LS-SVM for classification of brain tumors based on Magnetic Resonance Spectroscopy (MRS) signals are presented. Several kernels are used and compared to find the optimal classifier. Despite the high dimensionality and the scarcity of the input data, and the fact that no additional clinical information is used, a good ROC and classification performance can be achieved after applying leave-one-out cross-validation for hyperparameter selection together with an additional bias term correction. The improvement of this classification based on MRS signals will lead to an advanced tool for the discrimination of brain tumors, which is presently under development for the INTERPRET project.
Least Squares Support Vector Machines (LS-SVM) have been developed and successfully applied to classification problems in many areas. In comparison with several other classical methods this technique consistently performs very well on a large variety of problems. Here, results on the application of LS-SVM for classification of brain tumors based on Magnetic Resonance Spectroscopy (MRS) signals are presented. Several kernels are used and compared to find the optimal classifier. Despite the high dimensionality and the scarcity of the input data, and the fact that no additional clinical information is used, a good ROC and classification performance can be achieved after applying leave-one-out cross-validation for hyperparameter selection together with an additional bias term correction. The improvement of this classification based on MRS signals will lead to an advanced tool for the discrimination of brain tumors, which is presently under development for the INTERPRET project.
ES2002-18
Clustering in data space and feature space
D. MacDonald, C. Fyfe
Clustering in data space and feature space
D. MacDonald, C. Fyfe
ES2002-19
Maximum likelihood Hebbian rules
C. Fyfe, E. Corchado
Maximum likelihood Hebbian rules
C. Fyfe, E. Corchado
Abstract:
In this paper, we review an extension of the learning rules in a Principal Component Analysis network which has been derived to be optimal for a specific probability density function. We note that this probability density function is one of a family of pdfs and investigate the learning rules formed in order to be optimal for several members of this family. We show that, whereas previous authors [5] have viewed the single member of the family as an extension of PCA, it is more appropriate to view the whole family of learning rules as methods of performing Exploratory Projection Pursuit. We illustrate this on artificial data sets.
In this paper, we review an extension of the learning rules in a Principal Component Analysis network which has been derived to be optimal for a specific probability density function. We note that this probability density function is one of a family of pdfs and investigate the learning rules formed in order to be optimal for several members of this family. We show that, whereas previous authors [5] have viewed the single member of the family as an extension of PCA, it is more appropriate to view the whole family of learning rules as methods of performing Exploratory Projection Pursuit. We illustrate this on artificial data sets.
ES2002-23
Fast exact leave-one-out cross-validation of least-squares Support Vector Machines
K. Saadi, G.C. Cawley, N.L.C. Talbot
Fast exact leave-one-out cross-validation of least-squares Support Vector Machines
K. Saadi, G.C. Cawley, N.L.C. Talbot
ES2002-65
Noise derived information criterion for model selection
J. Pizarro Junquera, P. Galindo Riano, E. Guerrero Vazquez, A. Yanez Escolano
Noise derived information criterion for model selection
J. Pizarro Junquera, P. Galindo Riano, E. Guerrero Vazquez, A. Yanez Escolano
Abstract:
This paper proposes a new complexity-penalization model selection strategy derived from the minimum risk principle and the behavior of candidate models under noisy conditions. This strategy seems to be robust in small sample size conditions and tends to AIC criterion as sample size grows up. The simulation study at the end of the paper will show that the proposed criterion is extremely competitive when compared to other state-of-the-art criteria.
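As a point of reference for the AIC limit this abstract mentions, a minimal model-selection sketch can rank candidate polynomial models by Akaike's criterion. This is an illustration of AIC only, not the authors' noise-derived criterion; the data, noise level and candidate degrees are arbitrary choices:

```python
import numpy as np

def aic(y, y_hat, k):
    # Akaike's criterion for a Gaussian-noise model with k free parameters:
    # n * log(RSS / n) + 2 * k. Lower is better.
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)
    return n * np.log(rss / n) + 2 * k

# Candidate models: polynomials of increasing degree fitted to noisy data
# whose true generating model is quadratic.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = 1.0 + 2.0 * x - 3.0 * x ** 2 + rng.normal(0.0, 0.1, x.size)

scores = {}
for degree in range(6):
    coeffs = np.polyfit(x, y, degree)
    scores[degree] = aic(y, np.polyval(coeffs, x), degree + 1)

best = min(scores, key=scores.get)   # degree selected by AIC
```

The complexity penalty `2 * k` is what keeps the criterion from always preferring the highest-degree (lowest-RSS) model.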
ES2002-27
An unified framework for 'All data at once' multi-class Support Vector Machines
C. Angulo, X. Parra, A. Catala
Abstract:
Support Vector Machines (SVMs) are a machine learning procedure based on Vapnik's Statistical Learning Theory, initially defined for binary classification problems. Much work is being done in different research areas to obtain new algorithms for multi-class problems, the most common task in real-world applications. A promising extension is to treat `all data at once' in one multi-class SVM by modifying the associated quadratic programming (QP) problem. In this work, a unified architecture is developed to compare the associated QP problem for different approaches. The new framework makes comparisons between algorithms easier and is a powerful tool for analyzing the performance and behaviour of these approaches.
ES2002-70
Prediction of mental development of preterm newborns at birth time using LS-SVM
L. Ameye, C. Lu, L. Lukas, J. De Brabanter, J.A.K. Suykens, S. Van Huffel, H. Daniels, G. Naulaers, H. Devlieger
Representation of high-dimensional data
ES2002-250
Searching for the embedded manifolds in high-dimensional data, problems and unsolved questions
J. Hérault, A. Guérin-Dugué, P. Villemain
Abstract:
Starting from a recall of several classical - and less classical - remarks about high-dimensional data spaces, this paper gives a bird's-eye view of various techniques of data reduction, from linear multidimensional scaling to nonlinear and non-parametric methods. Two kinds of approaches will be presented, the first one operating in the feature space, the second one operating in the dissimilarity space. Special attention will be devoted to the CCA algorithm, in a version which aims at capturing the mean manifold spanned by the data vectors. Some examples from artificial and real data are given.
ES2002-254
Curvilinear Distance Analysis versus Isomap
J.A. Lee, A. Lendasse, M. Verleysen
Abstract:
Dimension reduction techniques are widely used for the analysis and visualization of complex sets of data. This paper compares two nonlinear projection methods: Isomap and Curvilinear Distance Analysis. Contrary to traditional linear PCA, these methods work like multidimensional scaling, by reproducing in the projection space the pairwise distances measured in the data space. They differ from classical linear MDS in the metrics they use and in the way they build the mapping (algebraic or neural). While Isomap relies directly on traditional MDS, CDA is based on a nonlinear variant of MDS, called CCA (Curvilinear Component Analysis). Although Isomap and CDA share the same metrics, the comparison highlights their respective strengths and weaknesses.
ES2002-251
Fast nonlinear dimensionality reduction with topology preserving networks
J.J. Verbeek, N. Vlassis, B. Krose
Abstract:
We present a fast alternative to the Isomap algorithm. A set of quantizers is fitted to the data and a neighborhood structure based on the competitive Hebbian rule is imposed on it. This structure is used to obtain a low-dimensional description of the data by computing geodesic distances and applying multidimensional scaling. The quantization allows for faster processing of the data: the speed-up compared to Isomap is roughly quadratic in the ratio between the number of data points and the number of quantizers. The quantizers and neighborhood structure are used to map the data to the low-dimensional space.
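The pipeline this abstract describes - quantize the data, build a neighborhood graph on the quantizers, compute graph (geodesic) distances, then apply classical MDS - can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the k-means quantizer, the k-nearest-neighbor graph (in place of the competitive Hebbian rule) and all parameter values are assumptions:

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    # Simple k-means to fit the set of quantizers to the data.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def geodesic_mds(centers, n_neighbors=3, dim=2):
    k = len(centers)
    d = np.sqrt(((centers[:, None] - centers[None]) ** 2).sum(-1))
    # Keep only each quantizer's nearest neighbors as graph edges.
    g = np.full((k, k), np.inf)
    order = np.argsort(d, axis=1)
    for i in range(k):
        for j in order[i, 1:n_neighbors + 1]:
            g[i, j] = g[j, i] = d[i, j]
    np.fill_diagonal(g, 0.0)
    # Floyd-Warshall: graph distances approximate geodesic distances.
    for m in range(k):
        g = np.minimum(g, g[:, m:m + 1] + g[m:m + 1, :])
    # Classical MDS on the squared geodesic distances.
    J = np.eye(k) - np.ones((k, k)) / k
    B = -0.5 * J @ (g ** 2) @ J
    w, v = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:dim]
    return v[:, idx] * np.sqrt(np.maximum(w[idx], 0.0))
```

Running MDS on the quantizers only (rather than on all data points) is where the quoted speed-up comes from: both the shortest-path and the eigendecomposition steps scale with the number of quantizers, not the number of samples.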
ES2002-255
When does geodesic distance recover the true hidden parametrization of families of articulated images?
D. Donoho, C. Grimes
ES2002-253
How to generalize geometric ICA to higher dimensions
F.J. Theis, E.W. Lang
Abstract:
Geometric algorithms for linear independent component analysis (ICA) have recently received some attention due to their pictorial description and their relative ease of implementation. The geometric approach to ICA was first proposed by Puntonet and Prieto in order to separate linear mixtures. One major drawback of geometric algorithms, however, is a number of samples and convergence time that rise exponentially with increasing dimensionality, basically restricting geometric ICA to low-dimensional cases. We propose to apply overcomplete ICA to geometric ICA to reduce high-dimensional problems to lower-dimensional ones, thus generalizing geometric ICA to higher dimensions.
ES2002-252
Neural dimensionality reduction for document processing
M. Delichère, D. Memmi
Abstract:
Document processing usually gives rise to high-dimensional representation vectors which are redundant and costly to process. Reducing the dimensionality would be appropriate, but standard factor-analysis methods such as PCA cannot deal with vectors of very high dimension. We have instead used an adaptive neural network technique (the Generalized Hebbian Algorithm) to extract the first principal components of a text corpus in order to represent documents economically. The approach is efficient and gives good results in a real Web page clustering application.
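The Generalized Hebbian Algorithm mentioned here (Sanger's rule) is compact enough to sketch. The update below is the standard textbook form, not the authors' document-processing setup; data, learning rate and component count are illustrative:

```python
import numpy as np

def gha(X, n_components=2, lr=0.005, epochs=100, seed=0):
    # Sanger's Generalized Hebbian Algorithm: the rows of W converge to the
    # leading principal components of the (centered) data, learned online
    # without ever forming the full covariance matrix.
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=0)
    W = rng.normal(0.0, 0.1, (n_components, X.shape[1]))
    for _ in range(epochs):
        for x in X:
            y = W @ x
            # Hebbian term y x^T minus a lower-triangular decorrelation term
            # that makes each row learn a different component.
            W += lr * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
    return W
```

Avoiding the explicit covariance matrix is exactly what makes this kind of rule attractive for the very high-dimensional document vectors described in the abstract.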
ANN models and learning II
ES2002-26
Geometric overcomplete ICA
F.J. Theis, E.W. Lang
Abstract:
In independent component analysis (ICA), the goal is to find an independent decomposition of given input signals. We present an algorithm based on geometric considerations to decompose a linear mixture of more sources than sensor signals. We present an efficient method for the matrix-recovery step in the framework of a two-step approach to the source separation problem. The second step - source recovery - uses the standard maximum-likelihood approach.
ES2002-71
Advantages and drawbacks of the Batch Kohonen algorithm
J.-C. Fort, P. Letremy, M. Cottrell
Abstract:
The Kohonen algorithm (SOM) was originally defined as a stochastic algorithm which works in an on-line way and which was designed to model some plastic features of the human brain. Nowadays it is extensively used for data mining, data visualization, and exploratory data analysis. Some users are tempted to use the batch version of the Kohonen algorithm (KBATCH), since it is a deterministic algorithm which can run faster in some cases. Following [7], which elucidated the mathematical nature of the batch variant, we give some elements of comparison for both algorithms, using theoretical arguments, simulated data and real data.
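For reference, the batch (deterministic) Kohonen update that this abstract contrasts with the stochastic on-line version can be sketched as follows; the 1-D grid, the Gaussian neighborhood and the radius schedule are illustrative choices, not those of the paper:

```python
import numpy as np

def batch_som(X, n_units=10, iters=30, seed=0):
    # Batch Kohonen update: deterministic given the data. Every iteration
    # assigns all samples to their best-matching unit (BMU), then replaces
    # each codebook vector by a neighborhood-weighted mean of the samples.
    rng = np.random.default_rng(seed)
    W = X[rng.choice(len(X), n_units, replace=False)].astype(float)
    grid = np.arange(n_units)                          # 1-D map topology
    for it in range(iters):
        sigma = 2.0 * 0.25 ** (it / max(iters - 1, 1))  # shrinking radius
        bmu = np.argmin(((X[:, None] - W[None]) ** 2).sum(-1), axis=1)
        h = np.exp(-(grid[:, None] - bmu[None, :]) ** 2 / (2 * sigma ** 2))
        W = (h @ X) / h.sum(axis=1, keepdims=True)
    return W
```

Unlike the on-line rule, each pass is a closed-form weighted mean, so there is no learning-rate parameter and the result is reproducible for a fixed initialization, which is part of the appeal (and, as the paper discusses, part of the drawback).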
ES2002-5
Mobile radio access network monitoring using the self-organizing map
P. Lehtimäki, K. Raivio, O. Simula
Abstract:
In this study, a method for process clustering and visualization using the Self-Organizing Map (SOM) is described. The presented method is applied in clustering and monitoring of mobile cells of a Mobile Radio Access Network (RAN).
ES2002-31
Evaluating the impact of multiplicative input perturbations on radial basis function networks
J.L. Bernier, J. Gonzales, A. Canas, A.F. Diaz, F.J. Fernandez, J. Ortega
Abstract:
Mean Squared Sensitivity (MSS) has previously been introduced as an approximation of the performance degradation of an MLP affected by perturbations in different parameters. In the present paper, we focus on RBF networks in order to study the implications when these are affected by input noise. We have obtained the corresponding analytical expression for MSS and validated it experimentally, using two different perturbation models: an additive and a multiplicative one. MSS is thus proposed as a quantitative measure for evaluating the noise immunity of an RBFN configuration.
ES2002-34
Learning sparse representations of three-dimensional objects
G. Peters, C. von der Malsburg
Abstract:
Each object in our environment can cause considerably different patterns of excitation in our retinae depending on the viewpoint from which the object is observed. Despite this we are able to perceive that the changing signals are produced by the same object. It is a function of our brain to provide this constant recognition from such inconstant input signals by establishing an internal representation of the object. The nature of such a viewpoint-invariant representation, the way it can be acquired, and its application in a perception task are the concern of this work. We describe the generation of view-based, sparse representations of real-world objects and apply them in a pose estimation task.
ES2002-12
An estimation model of pupil size for 'Blink Artifact' and its applications
M. Nakayama, Y. Shimizu
Abstract:
It is well known that the measurement of pupil size is influenced by noise and blinking. This paper describes the development of an estimation model of pupil size for the 'blink artifact', based on a three-layer perceptron trained with back-propagation. The model was trained on pupil responses with artificial blinks. It was found that pupil size during the blink period could be estimated according to the training period. When the model was applied to pupillary changes of subjects viewing TV programs, spurious frequency components were removed in the frequency analysis of temporal pupillary change. This result provides evidence that the model can remove artifacts from pupil response measurements.
ES2002-38
Novelty detection for strain-gauge degradation using maximally correlated components
G. Hollier, J. Austin
Abstract:
A new method for the detection of the degradation of strain-gauges attached to airframes is developed, using novelty-detection techniques and maximally correlated components. This considerably improves upon the previous method for the detection of changes in the response-line gradient.
ES2002-40
Modeling efficient conjunction detection with spiking neural networks
S.M. Bohte, J.N. Kok, H. La Poutré
Abstract:
The design of neural networks that are able to efficiently encode and detect conjunctions of features is an important open challenge that is also referred to as “the binding-problem”. We define a formal framework for neural nodes that process activity in the form of tuples of spike-trains which can efficiently encode and detect feature-conjunctions on a retinal input field in a position-invariant manner, also in the presence of multiple feature-conjunctions.
ES2002-44
Segmental duration control by time delay neural networks with asymmetric causal and retro-causal information flows
C. Erden, H.G. Zimmermann
Abstract:
The generation of pleasant prosody parameters is very important for speech synthesis. A prosody generation unit can be seen as a dynamical system. In this paper, sophisticated time-delay recurrent neural network (NN) topologies are presented which can be used for modeling dynamical systems. Within the prosody prediction task, left and right context information is known to influence the prediction of prosody control parameters. This can be modeled by causal-retro-causal information flows. Since information that is available during training is partially unavailable during application, there is a structural switch from training to application. This structural change of the information flow is handled by two asymmetric architectures. The proposed new architectures allow the integration of further a priori knowledge. By this we are able to improve the performance of the duration control unit within our text-to-speech (TTS) system Papageno.
ES2002-45
Neural predictive coding for speech discriminant feature extraction: The DFE-NPC
M. Chetouani, B. Gas, J.L. Zarader, C. Chavy
Abstract:
In this paper, we present a predictive neural network called Neural Predictive Coding (NPC). This model is used for nonlinear discriminant feature extraction (DFE) applied to phoneme recognition. We also present a new extension of the NPC model: DFE-NPC. In order to evaluate the performance of the DFE-NPC model, we carried out a study of Darpa-Timit phoneme recognition (in particular the /b/, /d/, /g/ and /p/, /t/, /q/ phonemes). Comparisons with coding methods (LPC, MFCC, PLP, RASTA-PLP) are presented: they clearly show an improvement in classification.
ES2002-47
Multiresolution codes for scene categorization
N. Denquive, P. Tarroux
Abstract:
The development of fast and reliable image classification algorithms is mandatory for modern image applications involving large databases. Biological systems seem to have the ability to categorize complex scenes in an accurate and very fast way. Our aim is to develop an architecture that leads to similar performances in computer vision. In this work, we present a coding method based on some principles inspired from biology that achieves a fast classification of complex visual scenes. A signature vector is extracted from the visual scene by a multi-scale filtering obtained through a bank of Gabor filters. These vectors constitute the inputs of a radial basis function network. The first connection layer implements a recoding of the filter outputs. The second one achieves a linear separation of the classes in the space of coding. We showed that an incremental approach in which each class is learned separately outperforms a more global one in which we tried to learn all classes together. According to the considered image category, the subset of features leading to the best result could be different, suggesting the use of feature vectors adapted to each image category. However, one of the major results of our study is that the signature vector we used, albeit very simple to compute, contains enough information to allow a correct image classification.
ES2002-48
Evaluation of gradient descent learning algorithms with adaptive and local learning rate for recognising hand-written numerals
M. Giudici, F. Queirolo, M. Valle
Abstract:
Gradient descent learning algorithms such as Back Propagation (BP) can significantly increase the classification performance of Multi Layer Perceptrons by adopting a local and adaptive learning rate management approach. In this paper, we compare the performance on hand-written character classification of two BP algorithms, implementing fixed and adaptive learning rates respectively. The results show that both the validation error and the average number of learning iterations are lower for the adaptive learning rate BP algorithm.
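A local, adaptive learning-rate scheme of the general kind compared in this abstract can be sketched with an Rprop-style sign rule: one step size per weight, grown while the gradient sign persists and halved when it flips. This is a generic illustration on the XOR toy problem, not the authors' algorithm, network or data; all sizes and constants are assumptions:

```python
import numpy as np

def train_xor_adaptive(epochs=500, seed=0):
    rng = np.random.default_rng(seed)
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
    y = np.array([[0.], [1.], [1.], [0.]])
    # 2-8-1 MLP with tanh hidden units and a sigmoid output.
    params = [rng.normal(0, 0.5, s) for s in [(2, 8), (1, 8), (8, 1), (1, 1)]]
    steps = [np.full_like(p, 0.1) for p in params]   # per-weight step sizes
    prev = [np.zeros_like(p) for p in params]        # previous gradients

    losses = []
    for _ in range(epochs):
        W1, b1, W2, b2 = params
        h = np.tanh(X @ W1 + b1)
        o = 1.0 / (1.0 + np.exp(-(h @ W2 + b2)))
        losses.append(float(np.mean((o - y) ** 2)))
        # Backprop of the MSE loss (full batch).
        do = 2 * (o - y) * o * (1 - o) / len(X)
        dh = do @ W2.T * (1 - h ** 2)
        grads = [X.T @ dh, dh.sum(axis=0, keepdims=True),
                 h.T @ do, do.sum(axis=0, keepdims=True)]
        for p, g, s, pg in zip(params, grads, steps, prev):
            # Local rate adaptation: grow the step while the gradient sign
            # persists, shrink it when the sign flips.
            same = np.sign(g) * np.sign(pg)
            s *= np.where(same > 0, 1.2, np.where(same < 0, 0.5, 1.0))
            np.clip(s, 1e-6, 1.0, out=s)
            p -= np.sign(g) * s
            pg[...] = g
    return params, losses
```

Because each weight carries its own step size, no single global learning rate has to suit every direction of the error surface, which is the property such comparisons exploit.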
Learning
ES2002-1
Batch-RLVQ
B. Hammer, T. Villmann
Abstract:
Recently a variation of learning vector quantization has been proposed in [Bojer et al.] which allows an automatic determination of relevance factors for the input dimensions: relevance learning vector quantization (RLVQ). RLVQ is heuristically motivated and may show instabilities for inappropriate data since it does not obey a gradient dynamics. Here we propose an energy function which describes the dynamics of RLVQ in the stable phase. It can be used to substitute the original dynamics in unstable situations. Moreover, it yields a batch version of RLVQ in which hard competition can be replaced by soft clustering, so that annealing schemes can be applied naturally in order to avoid local minima.
ES2002-8
Combining gestural and contact information for visual guidance of multi-finger grasps
G. Heidemann, H. Ritter
Abstract:
A computer vision system for a three-fingered robot hand is presented which can solve two entirely different tasks at a time: First, to guide the robot hand, hand gestures of a human instructor are classified using the hand camera. Second, when an object has been grasped the success or failure of the grasping action can be judged qualitatively by the same system. Both tasks are solved using a view based approach which classifies a set of prototypical situations instead of exact geometric reconstruction.
ES2002-14
Separation of a mixture of signals using linear filtering and second order statistics
A.M. Tomé
Abstract:
Some recent works address the problem of blind source separation with a matrix pencil. In this paper we show that the covariance matrices of the pencil can be computed at the output of a simple linear filter instead of using time-delayed covariance matrices. It is also shown, using block matrix manipulation, that the method can be applied when the number of source signals is not equal to the number of mixed signals. An experimental study, comparing different strategies for computing the matrix pencil, is also presented.
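The idea of building the pencil from the covariance of filtered mixtures (rather than from time-delayed covariances) can be sketched as follows, for the equal-channel case. The FIR filter and the whitening-based joint diagonalization are illustrative choices, not necessarily the authors' exact procedure:

```python
import numpy as np

def pencil_bss(X, filt=(1.0, 1.0)):
    # Second-order blind separation via a matrix pencil: pair the zero-lag
    # covariance of the mixtures with the covariance of a filtered copy,
    # and jointly diagonalize them (whitening + eigendecomposition).
    X = X - X.mean(axis=1, keepdims=True)
    C0 = X @ X.T / X.shape[1]
    Xf = np.apply_along_axis(lambda s: np.convolve(s, filt, mode="same"), 1, X)
    Cf = Xf @ Xf.T / X.shape[1]
    d, E = np.linalg.eigh(C0)
    Wh = E / np.sqrt(d)                  # whitening transform (columns)
    M = Wh.T @ Cf @ Wh
    _, V = np.linalg.eigh((M + M.T) / 2)
    return (Wh @ V).T                    # rows = unmixing directions
```

Separation succeeds when the sources respond differently to the filter (distinct spectra), since that is what makes the pencil's generalized eigenvalues distinct.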
ES2002-41
Sparse image coding using an asynchronous spiking neural network
L. Perrinet, M. Samuelides
Abstract:
In order to explore coding strategies in the retina, we use a wavelet-like transform whose output is sparse, as is observed in biological retinas [Olshausen98]. This transform is defined in the context of a one-pass feed-forward spiking neural network, and its output is the list of the neurons' spikes: it is constructed recursively using a greedy matching pursuit scheme which selects the highest contrast energy values first. As in [Vanrullen01], we find invariants in the output for some classes of images, allowing the absolute contrast value to be coded solely by its rank in the spike list. An application to image compression is shown which is comparable to other techniques such as JPEG at low bit rates.
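A minimal sketch of the greedy matching pursuit loop this abstract relies on (the dictionary, signal size, and number of iterations below are hypothetical, and a random dictionary stands in for the paper's wavelet-like filters):

```python
import numpy as np

rng = np.random.default_rng(1)
# hypothetical dictionary of unit-norm "receptive fields" (stand-in for wavelets)
D = rng.normal(size=(16, 64))
D /= np.linalg.norm(D, axis=1, keepdims=True)
x = rng.normal(size=64)              # toy "image" signal

# greedy matching pursuit: repeatedly pick the filter with the highest response,
# emit its index (the "spike"), and subtract its contribution from the residual
residual = x.copy()
spikes = []                          # spike list: (neuron index, coefficient)
for _ in range(8):
    responses = D @ residual
    k = int(np.argmax(np.abs(responses)))
    spikes.append((k, responses[k]))
    residual -= responses[k] * D[k]

# rank-order coding keeps only the order of the indices, not the coefficients
rank_code = [k for k, _ in spikes]
```

Each subtraction removes the projection onto a unit-norm atom, so the residual energy strictly decreases; the rank code exploits this monotone ordering of contrast values.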
Hardware and Parallel Computer Implementations of Neural Networks
ES2002-350
Artificial Neural Networks on Massively Parallel Computer Hardware
U. Seiffert
Abstract:
It seems to be an everlasting discussion: spending a lot of additional time and extra money to implement a particular algorithm on parallel hardware is considered the ultimate solution to all existing time problems by some, and the silliest waste of time by others. In fact, there are many pros and cons, which should always be weighed individually. Notwithstanding many specific constraints, artificial neural networks are in general worth considering for parallel implementation. This tutorial paper gives a survey and guides those who are willing to go the way of a parallel implementation utilizing the most recent and accessible parallel computer hardware and software. The paper is rounded off with an extensive reference section.
ES2002-354
PCNN neurocomputers - Event driven and parallel architectures
C. Grassmann, T. Schoenauer, C. Wolff
Abstract:
The simulation of large spiking neural networks (PCNN), especially for vision purposes, is limited by the computing power of general purpose computer systems [5,9,10]. Therefore, the simulation of real world scenarios requires dedicated simulator systems. This article presents architectures of software and hardware implementations for PCNN simulator systems. The implementations are based on a common event driven approach using spike events for communication and processing flow. Furthermore, parallel approaches utilizing spike event computing are introduced for simulation acceleration. Implementations of software simulators on workstation clusters and parallel computers, and hardware accelerators based on FPGAs, ASICs and DSPs, are described. The presented results demonstrate the capability to simulate large vision networks close to real world/real time requirements.
ES2002-359
A reconfigurable SOM hardware accelerator
M. Porrmann, M. Franzmeier, H. Kalte, U. Witkowski, U. Rückert
Abstract:
A dynamically reconfigurable hardware accelerator for self-organizing feature maps is presented. The system is based on the universal rapid prototyping system RAPTOR2000, which has been developed by the authors. The modular prototyping system is based on XILINX FPGAs and is capable of emulating hardware implementations with a complexity of more than 24 million system gates. RAPTOR2000 is linked to its host - a standard personal computer or workstation - via the PCI bus. For the simulation of self-organizing maps, a module has been designed for the RAPTOR2000 system that embodies an FPGA of the Xilinx Virtex series and optionally up to 128 MBytes of SDRAM. For typical applications of self-organizing maps, a speed-up of about 50 is achieved with five FPGA modules on the RAPTOR2000 system compared to a software implementation on a state-of-the-art personal computer.
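For reference, the computation such accelerators speed up is the standard SOM update step, sketched below in software form; the map size, learning rate, and neighbourhood width are hypothetical and unrelated to the RAPTOR2000 configuration.

```python
import numpy as np

rng = np.random.default_rng(2)
grid = 8                                  # 8x8 map (hypothetical size)
W = rng.random((grid, grid, 3))           # weight vectors, e.g. for 3-D data
coords = np.dstack(np.meshgrid(np.arange(grid), np.arange(grid), indexing="ij"))

def som_step(W, x, lr=0.1, sigma=2.0):
    """One SOM update: find the best-matching unit, pull its neighbourhood towards x."""
    d = np.linalg.norm(W - x, axis=2)
    bmu = np.unravel_index(np.argmin(d), d.shape)
    # Gaussian neighbourhood around the BMU on the map grid
    g = np.exp(-np.sum((coords - np.array(bmu)) ** 2, axis=2) / (2 * sigma ** 2))
    return W + lr * g[..., None] * (x - W)

for x in rng.random((500, 3)):
    W = som_step(W, x)
```

The distance search and the neighbourhood-weighted update are embarrassingly parallel across units, which is why FPGA implementations obtain large speed-ups.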
ES2002-352
Stochastic resonance and finite resolution in a leaky integrate-and-fire neuron
N. Mtetwa, L.S. Smith, A. Hussain
Abstract:
The paper discusses the effect of stochastic resonance (SR) in a leaky integrate-and-fire (LIF) neuron and investigates its realisation on low-resolution digitally implemented systems. We report in this new study that stochastic resonance, which is mainly associated with floating point implementations, is possible on lower-resolution integer-based representations, which results in real-time performance on digital hardware.
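The SR effect in an LIF neuron can be sketched as follows: a sub-threshold periodic input alone never fires, but added noise produces spikes that carry the signal. This toy simulation uses a plain Euler discretization with hypothetical parameters (and, for brevity, noise that is not sqrt(dt)-scaled); it is not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(3)

def lif_spikes(signal, noise_std, dt=1e-3, tau=0.02, threshold=1.0):
    """Leaky integrate-and-fire: noise can push a sub-threshold input over threshold."""
    v, spikes = 0.0, []
    for i, s in enumerate(signal):
        v += dt / tau * (-v + s + noise_std * rng.normal())
        if v >= threshold:
            spikes.append(i)
            v = 0.0                       # reset after a spike
    return spikes

t = np.arange(0, 2, 1e-3)
weak = 0.3 * np.sin(2 * np.pi * 5 * t) + 0.5    # sub-threshold periodic input
silent = lif_spikes(weak, noise_std=0.0)        # no noise: no spikes at all
noisy = lif_spikes(weak, noise_std=5.0)         # noise produces signal-driven spikes
```

The finite-resolution question the paper studies amounts to asking how coarsely `v` and the noise can be quantized (e.g. to low-bit integers) before this effect disappears.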
ES2002-358
Hardware solutions for implementation of neural networks in High Energy Physics triggers
J.-C. Prévotet, B. Denby, P. Garda, B. Granado, C. Kiesling
Abstract:
Neural networks have been used as triggers in HEP for more than ten years, and continue to deliver promising results. In this article, we will give an overview of the triggering problem and present general neural online solutions retained by physicists to process data in High Energy Physics triggers. We will finally describe an FPGA implemented architecture dedicated to fast neural computations, taking advantage of massive parallelism in order to meet the tight timing constraints imposed by Level 1 neural triggers.
Perspectives on Learning with Recurrent Networks
ES2002-200
Perspectives on learning with recurrent neural networks
B. Hammer, J.J. Steil
Abstract:
We present an overview of current lines of research on learning with recurrent neural networks (RNNs). Topics covered are: understanding and unification of algorithms, theoretical foundations, new efforts to circumvent gradient vanishing, new architectures, and fusion with other learning methods and dynamical systems theory. The structuring guideline is to understand many new approaches as different efforts to regularize and thereby improve recurrent learning. Often this is done on two levels: by restricting the learning objective by constraints, for instance derived from stability conditions or weight normalization, and by imposing architectural constraints as for instance local recurrence.
ES2002-211
DEKF-LSTM
F.A. Gers, J.A. Perez-Ortiz, D. Eck, J. Schmidhuber
Abstract:
Unlike traditional recurrent neural networks, the long short-term memory (LSTM) model generalizes well when presented with training sequences derived from regular and also simple nonregular languages. Our novel combination of LSTM and the decoupled extended Kalman filter, however, learns even faster and generalizes even better, requiring only the 10 shortest exemplars (n = 10) of the context-sensitive language a^n b^n c^n to deal correctly with values of n up to 1000 and more. Even when we consider the relatively high update complexity per timestep, in many cases the hybrid offers faster learning than LSTM by itself.
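The training data for this task is easy to state concretely; a minimal generator for the a^n b^n c^n exemplars and a membership test (our own helpers, not code from the paper) look like this:

```python
def anbncn(n):
    """One exemplar of the context-sensitive language a^n b^n c^n."""
    return "a" * n + "b" * n + "c" * n

# training set: the 10 shortest strings, as described in the abstract
train = [anbncn(n) for n in range(1, 11)]

def in_language(s):
    """Membership test used to score generalization to longer strings."""
    n = s.count("a")
    return n > 0 and s == anbncn(n)
```

Generalization to n around 1000 means the trained network must accept `anbncn(1000)` (a string of length 3000) while rejecting near misses, despite having only seen strings of length at most 30.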
ES2002-207
Generalization by structural properties from sparse nested symbolic data
M. Boden
Abstract:
A set of simulations demonstrates that recurrent networks can exhibit generalization by abstraction from extremely sparse but structurally homogeneous symbolic data. By cascading two recurrent networks -- feeding the second network with discretized hidden states of the first -- it is also possible to generalize according to complex structure. Through automatic discretization, the cascaded architecture assists in scaling up sequential learning tasks and offers explanations for the apparent systematicity and generativity of language use.
ES2002-210
Estimating probabilities for unbounded categorization problems
J. Henderson
Abstract:
We propose two output activation functions for estimating probability distributions over an unbounded number of categories with a recurrent neural network, and derive the statistical assumptions which they embody. Both these methods perform better than the standard approach to such problems, when applied to probabilistic parsing of natural language with Simple Synchrony Networks.
ES2002-206
A general framework for unsupervised processing of structured data
B. Hammer, A. Micheli, A. Sperduti
Abstract:
We propose a general framework for unsupervised recurrent and recursive networks. This proposal covers various popular approaches like standard self organizing maps (SOM), temporal Kohonen maps, recursive SOM, and SOM for structured data. We define Hebbian learning within this general framework. We show how approaches based on an energy function, like neural gas, can be transferred to this abstract framework so that proposals for new learning algorithms emerge.
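One of the approaches the framework covers, the temporal Kohonen map, is easy to sketch: the winner is chosen on a leaky integral of distances rather than on the current distance alone, so the map responds to sequences. Map size, decay, and learning rate below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(5)
W = rng.random((25, 2))                  # 25 units, 2-D inputs
act = np.zeros(25)                       # leaky-integrated activations
alpha, lr = 0.5, 0.05                    # decay and learning rate (hypothetical)

for x in rng.random((300, 2)):           # an input *sequence*, not i.i.d. points
    # leaky integration: the winner depends on past inputs as well as the current one
    act = alpha * act - 0.5 * np.linalg.norm(W - x, axis=1) ** 2
    bmu = int(np.argmax(act))
    W[bmu] += lr * (x - W[bmu])          # Hebbian-style winner update
```

Setting alpha to 0 recovers the standard SOM winner rule, which illustrates how the general framework contains the non-temporal models as special cases.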
ES2002-204
Undershooting: modeling dynamical systems by time grid refinements
H.G. Zimmermann, R. Neuneier, R. Grothmann
Abstract:
When building models of dynamical systems on the basis of observed data, the time grid of the data is typically the same as the time grid of the model. We show that a refinement of the model time grid relative to a wider-meshed time grid of the data provides deeper insights into the dynamics. This ''undershooting'' can be derived from the principle of uniform causality. Combining undershooting with recurrent error correction neural networks leads to a novel approach which improves the performance of our models by time grid refinements.
ES2002-203
Learning in a chaotic neural network
N. Crook, T. olde Scheper
Abstract:
Previous research has shown how the Unstable Periodic Orbits (UPOs) embedded in a chaotic attractor can be made to correspond to self-organised dynamic memory states in a chaotic neural network. This paper demonstrates how this chaotic neural network model can be extended to enable it to adapt to dynamic input patterns using two unsupervised learning rules. The proposed learning rules are designed to modify model parameters in order to support the network's dynamics from which the memories emerge. This means that input weights and feedback delays are adapted so that the network will stabilise an appropriate UPO in response to each input signal.
ES2002-205
Yield curve forecasting by error correction neural networks and partial learning
H.G. Zimmermann, Ch. Tietz, R. Grothmann
Abstract:
Error correction neural networks (ECNN) are an appropriate framework for the modeling of dynamical systems in the presence of noise or missing external influences. Combining ECNNs with the concept of variants-invariants separation in the form of a bottleneck coordinate transformation, we are able to handle high-dimensional problems. Furthermore, we propose a new learning rule for the training of neural networks which evaluates only specific gradients for the adaptation of the network weights. In this way, we are able to generate time-invariant localized structures and thus support the optimization of the network. In forecasting the German yield curve, an ECNN including the separation of variants-invariants is superior to traditional neural networks.
ANN models and learning III
ES2002-24
State reconstruction of piecewise linear maps using a clustering machine
G. Millerioux, G. Bloch
Abstract:
State reconstruction of piecewise linear systems is addressed. The description of such a family of systems involves, for each region of the partitioned state space, an affine description and a switching rule which orchestrates the way the dynamics changes from one linear form to another. This results in two distinct states: the continuous state and the discrete state. An observer of piecewise linear systems must recover both of them. It is shown that the discrete state can be recovered by a clustering technique. The continuous state reconstruction is formulated as a set of Linear Matrix Inequalities to be solved. They are derived from the notion of poly-quadratic stability and ensure global convergence of the observer.
ES2002-52
Unsupervised classifier for monitoring and diagnostic of time series
S. Lecoeuche
Abstract:
It is assumed that complex systems are represented by parameters that evolve with time. Hence, it is possible to monitor systems and to make diagnoses by analyzing time series. The paper presents the development of a neural architecture. A membership degree of an input vector to a prototype is introduced, along with the membership degree of the input to a class. The proposed unsupervised learning process makes possible the creation of new prototypes and new classes when necessary. The application to standard time series shows good results: 96% of the inputs are well classified using few prototypes.
ES2002-56
Width optimization of the Gaussian kernels in Radial Basis Function Networks
N. Benoudjit, C. Archambeau, A. Lendasse, J. Lee, M. Verleysen
Abstract:
Radial basis function networks are usually trained according to a three-stage procedure. In the literature, many papers are devoted to the estimation of the position of Gaussian kernels, as well as the computation of the weights. Meanwhile, very few focus on the estimation of the kernel widths. In this paper, first, we develop a heuristic to optimize the widths in order to improve the generalization process. Subsequently, we validate our approach on several theoretical and real-life approximation problems.
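The three-stage procedure the abstract refers to can be sketched as follows; the grid placement of centers, the nearest-centre width rule, and the scaling factor q are illustrative stand-ins, not the paper's heuristic.

```python
import numpy as np
from scipy.spatial.distance import cdist

rng = np.random.default_rng(4)
X = rng.uniform(-1, 1, size=(200, 1))
y = np.sin(3 * X[:, 0])                       # toy regression target

# stage 1: fix kernel positions (here a uniform grid instead of clustering)
centers = np.linspace(-1, 1, 10)[:, None]

# stage 2: widths from a scaled nearest-centre distance (hypothetical factor q)
q = 1.5
d = cdist(centers, centers)
np.fill_diagonal(d, np.inf)
widths = q * d.min(axis=1)

# stage 3: output weights by linear least squares on the kernel activations
Phi = np.exp(-cdist(X, centers) ** 2 / (2 * widths ** 2))
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
y_hat = Phi @ w
```

Only stage 2 is the paper's concern: widths that are too small leave gaps between kernels, widths that are too large make Phi ill-conditioned, and both hurt generalization.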
ES2002-36
High frequency forecasting with associative memories
A. Pasley, J. Austin
ES2002-63
Nonlinear PCA: a new hierarchical approach
M. Scholz, R. Vigario
Abstract:
Traditionally, nonlinear principal component analysis (NLPCA) is seen as a nonlinear generalization of standard (linear) principal component analysis (PCA). So far, most of these generalizations rely on a symmetric type of learning. Here we propose an algorithm that extends PCA into NLPCA through a hierarchical type of learning. The hierarchical algorithm (h-NLPCA), like many versions of the symmetric one (s-NLPCA), is based on a multi-layer perceptron with an auto-associative topology, whose learning rule has been upgraded to accommodate the desired discrimination between components. With h-NLPCA we seek not only the nonlinear subspace spanned by the optimal set of components, ideal for data compression, but also give particular attention to the order in which these components appear. Due to its hierarchical nature, our algorithm is shown to be very efficient in detecting meaningful nonlinear features from real world data, as well as in providing a nonlinear whitening. Furthermore, in a quantitative analysis, h-NLPCA achieves better classification accuracies, with a smaller number of components, than most traditional approaches.
ES2002-28
Probabilistic derivation and Multiple Canonical Correlation Analysis
P.L. Lai
Abstract:
We review a new method of performing Canonical Correlation Analysis (CCA) with Artificial Neural Networks. We have previously [4,5] compared its capabilities with standard statistical methods on simple data sets where the maximum correlations are given by linear filters. In this paper, we extend the method by implementing a very precise set of constraints which allow multiple correlations to be found at once. We demonstrate the network's capabilities on the standard Random Dot Stereogram data set. We also re-derive the learning rules from a probabilistic perspective, then, by use of a specific prior on the weights, simplify a model [2] which is an abstraction of the Random Dot Stereogram matching problem, and show how a second-layer network using Factor Analysis can be used to combine the results of the CCA network to obtain higher order information.
ES2002-66
Orthogonal transformations for optimal time series prediction
M. Salmeron, A. Prieto, J. Ortega, C.G. Puntonet, M. Rodriguez Alvarez
ES2002-69
Neuro-fuzzy methodologies for the clustering and the reliability estimation of olive fruit fly infestation
E. Bellei, R. Petacchi, L. Reyneri
Abstract:
The present article describes the latest results obtained from the application of neuro-fuzzy techniques to the study of Bactrocera Oleae infestation in the olive groves of the Liguria region. The project “Applications of Neuro-Fuzzy Techniques in Agriculture” started in March 2000 with the monitoring and collection of data from a large number of oil farms. The main aim of the project was to realize an area-wide Bactrocera Oleae monitoring network in order to administer IPM and to provide technical assistance in treatments to each farm. This aim was pursued through the creation of neuro-fuzzy systems that extract infestation features, classify them with labels suitable for suggesting treatments for each monitored farm, and also estimate the reliability of the olive fly measurements. During the project, it has been shown that standard approaches to forecasting the growth of the olive fly give less accurate and less flexible results than new analysis techniques such as neuro-fuzzy methodologies, which are better suited to non-linear and complex problems like agronomic ones.
ES2002-30
Use of artificial neural networks process analyzers: a case study
H. Al-Duwaish, L. Ghouti, T. Halawani, M. Mohandes
Abstract:
In this paper, artificial neural networks (ANN), which are known for their ability to model nonlinear systems and their inherent noise-filtering abilities, are used as an O2 analyzer to predict the O2 content in a boiler at the SHARQ petrochemical company in Saudi Arabia. The training data were collected over a duration of one month and used to train a neural network to develop a neural-network-based oxygen analyzer. The results are very promising.
Information extraction
ES2002-17
Forecasting using twinned principal curves
Y. Han, C. Fyfe
ES2002-50
Kernel Temporal Component Analysis (KTCA)
D. Martinez, A. Bray
Abstract:
We describe an efficient algorithm for simultaneously extracting multiple smoothly-varying non-linear invariances from time-series data. The method exploits the concept of maximizing temporal predictability introduced by Stone in the linear domain, which we term temporal component analysis (TCA). Our current work extends this linear method into the non-linear domain using kernel-based methods; it performs a non-linear projection of the input into an unknown high-dimensional feature space and computes a linear solution in this space. In this paper we describe an improved on-line version of this algorithm (KTCA) for working on very large data sets, and demonstrate its applicability to computer vision by extracting non-linear disparity directly from grey-level stereo pairs, without pre-processing.
ES2002-62
Exploratory Correlation Analysis
J. Koetsier, D. MacDonald, D. Charles, C. Fyfe
Abstract:
We present a novel unsupervised artificial neural network for the extraction of common features in multiple data sources. This algorithm, which we name Exploratory Correlation Analysis (ECA), is a multi-stream extension of a neural implementation of Exploratory Projection Pursuit (EPP) and has a close relationship with Canonical Correlation Analysis (CCA). Whereas EPP identifies "interesting" statistical directions in a single stream of data, ECA develops a joint coding of the common underlying statistical features across a number of data streams.
Neural Network Techniques in Fault Detection and Isolation
ES2002-302
Neural networks for fault diagnosis and identification of industrial processes
S. Simani, C. Fantuzzi
Abstract:
In this work a model-based procedure exploiting analytical redundancy via state estimation techniques is presented for the diagnosis of faults in the sensors of a dynamic system. Fault detection is based on Kalman filters designed in a stochastic environment. Fault identification is then performed by means of different neural network architectures. In particular, neural networks are used as function approximators for estimating sensor fault sizes. The proposed fault diagnosis and identification tool is tested on an industrial gas turbine.
ES2002-301
Neural networks for fault diagnosis of industrial plants at different working points
S. Simani, R. J. Patton
Abstract:
Industrial plants often work at different operating points. However, in the literature, applications of neural networks for fault diagnosis usually consider only a single working condition or small changes of operating point. A standard scheme for the design of neural networks for fault diagnosis at all operating points may be impractical due to the unavailability of suitable training data for all working conditions. This paper addresses the design of a single neural network for the diagnosis of faults in the sensors of an industrial gas turbine working at different conditions. The presented results illustrate the performance of the trained neural network for sensor fault diagnosis.
ES2002-303
Fault diagnosis of an electro-pneumatic valve actuator using neural networks with fuzzy capabilities
F.J. Uppal, R.J. Patton
Abstract:
The early detection of incipient faults (those just beginning and still developing) can help avoid system shutdown, breakdown and even catastrophes involving human fatalities and material damage. Computational intelligence techniques are being investigated as an extension of the traditional fault diagnosis methods. This paper discusses the neuro-fuzzy approach to modelling and fault diagnosis, based on the TSK/Mamdani approaches. An application study of an electro-pneumatic valve actuator in a sugar factory is described. The key issues of finding a suitable structure for detecting and isolating ten realistic actuator faults are outlined.
ES2002-304
Non-linear Canonical Correlation Analysis using a RBF network
S. Kumar, E.B. Martin, J. Morris
Abstract:
A non-linear version of the multivariate statistical technique of canonical correlation analysis (CCA) is proposed through the integration of a radial basis function (RBF) network. The advantage of the RBF network is that the solution of linear CCA can be used to train the network and hence the training effort is minimal. Also the canonical variables can be extracted simultaneously. It is shown that the proposed technique can be used to extract non-linear structures inherent within a data set.
ES2002-305
Free-swinging and locked joint fault detection and isolation in cooperative manipulators
R. Tinos, M. H. Terra
Abstract:
The problem of fault detection and isolation (FDI) in cooperative manipulators is addressed. Free-swinging and locked joint faults are detected and isolated by an FDI system based on neural networks. For each arm, a Multilayer Perceptron (MLP) is used to reproduce the dynamics of the fault-free robot. The outputs of each MLP are compared to the real joint velocities in order to generate a residual vector that is then classified by an RBF network. Simulations and a real application are presented indicating the effectiveness of the FDI system.