Bruges, Belgium, April 21-22-23
Content of the proceedings
Dynamical systems
Self-organization
Special session: Adaptive computation of data structures
Methodology
Special session: Remote sensing spectral image analysis
ANN models and learning I
Biological models and inspiration
Special session: Support Vector Machines
ANN models and learning II
Classification
Special session: Information extraction using unsupervised neural networks
ANN models and learning III
Special session: Spiking neurons
Temporal series
Dynamical systems
ES1999-1
Synchronizing chaotic neuromodules
F. Pasemann
Abstract:
We discuss the time-discrete parametrized synchronous dynamics of two coupled chaotic neuromodules. The symmetrical coupling of identical 2-neuron modules results in periodic, quasiperiodic as well as chaotic dynamics constrained to a synchronization manifold M. Stability of the synchronized dynamics is calculated by transversal Lyapunov exponents. In addition to synchronized attractors there often co-exist asynchronous periodic, quasiperiodic or even chaotic attractors. Simulation results for selected sets of parameters are presented.
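The synchronization manifold in this abstract can be illustrated with a minimal sketch. It assumes a standard discrete-time sigmoid network as the update rule. The weights, biases, coupling matrix, and the names `step` and `max_gap` are hypothetical choices for illustration, not the parameters or code used in the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical parameters for one 2-neuron module (not taken from the paper).
W = np.array([[-5.0, 6.0],
              [-6.0, 0.0]])      # intra-module weights
C = np.array([[0.5, 0.0],
              [0.0, 0.5]])       # symmetric inter-module coupling
theta = np.array([-2.0, 3.0])    # bias terms

def step(a, b):
    """One synchronous update of the two coupled modules with states a and b."""
    a_next = theta + W @ sigmoid(a) + C @ sigmoid(b)
    b_next = theta + W @ sigmoid(b) + C @ sigmoid(a)
    return a_next, b_next

def max_gap(a, b, n_steps=200):
    """Largest componentwise distance between the module states on a trajectory."""
    gap = 0.0
    for _ in range(n_steps):
        a, b = step(a, b)
        gap = max(gap, float(np.max(np.abs(a - b))))
    return gap

start = np.array([0.1, -0.3])
# Identical initial states stay identical: the set a == b is invariant.
gap_on_manifold = max_gap(start, start.copy())
```

Because both modules apply the same map, a trajectory that starts with equal states never leaves the manifold. Whether nearby unsynchronized trajectories are attracted to it depends on the transversal stability discussed in the abstract, which this sketch does not compute.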
ES1999-14
Mean-field equations reveal synchronization in a 2-populations neural network model
E. Daucé, O. Moynot, O. Pinaud, M. Samuelides, B. Doyon
Abstract:
We study a 2-populations model of an analog recurrent neural network. This model takes into account the influence of inhibitory and excitatory neurons. It is dedicated to the study of collective dynamical properties of large, fully connected recurrent networks. The evolution of neuron activation states is given in the thermodynamic limit by a set of mean-field equations and the network satisfies a "propagation of chaos" property. All these results are supported by rigorous proofs using large deviations techniques. Moreover, we observe that the bifurcation diagram of these mean-field equations, as well as finite-size simulations, reveal a parametric domain where the expectation and variance of the limit law of the activation potentials describe periodic oscillations. Fluctuations of individual neurons around this average may occur, showing the existence of a stochastic non-stationary regime over long times. This can be directly related to recent biological discoveries about the role of inhibition in the synchronization of excitatory neurons.
Self-organization
ES1999-13
A hierarchical self-organizing feature map for analysis of not well separable clusters of different feature density
S. Schünemann, B. Michaelis
Abstract:
This paper introduces a hierarchical Self-Organizing Feature Map (SOFM). The partial maps consist of individual numbers of neurons, which makes a cluster analysis with different degrees of resolution possible. A definition of a special Mahalanobis space of the data set during the learning improves the properties concerning clusters with low density.
ES1999-47
Using the Kohonen algorithm for quick initialization of Simple Competitive Learning algorithm
E. de Bodt, M. Cottrell, M. Verleysen
Abstract:
In a previous paper ([1], ESANN’97), we compared the Kohonen algorithm (SOM) to the Simple Competitive Learning algorithm (SCL) when the goal is to reconstruct an unknown density. We showed that for that purpose, the SOM algorithm quickly provides an excellent approximation of the initial density, when the frequencies of each class are taken into account to weight the quantifiers of the classes. Another important property of the SOM is the well-known topology conservation, which implies that neighboring data are classified into the same class (as usual) or into neighboring classes. In this paper, we study another interesting property of the SOM algorithm, which holds for any fixed number of quantifiers. We show that even if we use those approaches only for quantization, the SOM algorithm can be successfully used to greatly accelerate the convergence of the classical Simple Competitive Learning algorithm (SCL).
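The difference between the two update rules, and the idea of using a SOM phase to initialize SCL, can be sketched as follows. This is a minimal illustration on 1-D data with a fixed neighborhood radius and learning rate. The function names, the schedule, and all parameter values are assumptions, not the procedure evaluated in the paper.

```python
import numpy as np

def train(data, codebook, n_epochs, radius, lr=0.1, seed=0):
    """Online vector quantization on 1-D data.

    radius = 0 gives Simple Competitive Learning (only the winner moves);
    radius > 0 gives a Kohonen-style update on a 1-D string of units.
    """
    rng = np.random.default_rng(seed)
    codebook = codebook.copy()
    index = np.arange(len(codebook))
    for _ in range(n_epochs):
        for x in rng.permutation(data):
            winner = np.argmin(np.abs(codebook - x))
            neighbours = np.abs(index - winner) <= radius
            codebook[neighbours] += lr * (x - codebook[neighbours])
    return codebook

def distortion(data, codebook):
    """Mean squared distance from each sample to its nearest code vector."""
    d = np.abs(data[:, None] - codebook[None, :])
    return float(np.mean(d.min(axis=1) ** 2))

rng = np.random.default_rng(1)
data = rng.normal(size=500)
init = rng.uniform(data.min(), data.max(), size=10)

# Hypothetical two-phase schedule: a short SOM phase, then SCL refinement.
som_phase = train(data, init, n_epochs=2, radius=1)
refined = train(data, som_phase, n_epochs=2, radius=0)
final_distortion = distortion(data, refined)
```

Comparing `distortion` after the same number of epochs with and without the SOM phase is one way to probe the speed-up described in the abstract. The outcome depends on the data and the schedule, so no result is claimed here.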
Special session: Adaptive computation of data structures
ES1999-307
Learning in structured domains
M. Gori
Abstract:
By and large, learning from examples in the machine learning literature refers to static data types. That main stream of interest, however, has had significant bifurcations (see e.g. the learning issues connected with syntactic and structured pattern recognition) arising from the need to exploit the structure inherently attached to the data of some learning tasks. In this paper, I briefly review the research carried out in the last few years in the area of connectionist models in the attempt to extend the corresponding learning approaches to the case of structured domains. I give a unified picture of the adaptive computation which can be carried out on graphical objects and show that, under certain restrictions on the kind of graph to be processed, the classic learning algorithm for feedforward networks can be straightforwardly extended.
ES1999-301
Approximation capabilities of folding networks
B. Hammer
Abstract:
In this paper we show several approximation results for folding networks - a generalization of partial recurrent neural networks such that not only time sequences but arbitrary trees can serve as input: Any measurable function can be approximated in probability. Any continuous function can be approximated in the maximum norm on inputs with restricted height, but the resources necessarily increase at least exponentially in the input height. In general, approximation on arbitrary inputs is not possible in the maximum norm.
ES1999-304
Tree-recursive computation of gradient information for structures
A. Kuechler
Abstract:
Recently, the so-called Backpropagation Through Structure (BPTS) gradient calculation algorithm has been developed to capture learning scenarios where data is adequately represented by hybrid continuous-discrete structures (e.g. labeled ordered trees, nodes augmented by continuous information). BPTS can be viewed as an extension of the well-known Backpropagation Through Time (BPTT) algorithm for discrete-time dynamical systems and sequence processing. The well-known (functionally equivalent) Real-time Recurrent Learning (RTRL) algorithm is to be preferred over BPTT if long sequences are processed. This paper investigates whether and how RTRL can be generalized - while conserving its appealing algorithmic properties - to calculate the gradient information for models operating on the domain of rooted labeled ordered trees. The answer is partly negative. It turns out that a postorder traversal of the tree has to be obeyed in order to keep the space consumption independent of the size of the input structures. By processing vertices in an inverse topological ordering the algorithm can also be applied to labeled directed ordered acyclic graphs. However, we show that on this graph domain the memory consumption grows (in the worst case) linearly with the size of the input structure.
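The postorder traversal mentioned in this abstract can be sketched on a small labeled ordered tree. The example below shows only the visiting order of a recursive forward pass, not the gradient recursion studied in the paper. The scalar state, the weights, and the function name `encode` are hypothetical.

```python
import math

# A rooted labeled ordered tree: (label, [children]); labels are real numbers.
tree = (0.5, [(1.0, []), (-1.0, [(0.25, [])])])

def encode(node, w_label=1.0, w_child=(0.5, -0.5), order=None):
    """Postorder evaluation: children are encoded before their parent.

    The scalar state and the weights are hypothetical; a real model would
    use vector states and learned weight matrices. The tuple w_child holds
    one weight per child position, so the maximum outdegree here is 2.
    """
    label, children = node
    total = w_label * label
    for position, child in enumerate(children):
        total += w_child[position] * encode(child, w_label, w_child, order)
    state = math.tanh(total)
    if order is not None:
        order.append(label)   # record the visiting order
    return state

visited = []
root_state = encode(tree, order=visited)
# visited is [1.0, 0.25, -1.0, 0.5]: every node appears after its children.
```

Once a parent's state has been computed, the states of its children are no longer needed, which is the property that makes a postorder schedule attractive for bounded memory use.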
ES1999-305
Learning search-control heuristics for automated deduction systems with folding architecture networks
C. Goller
Abstract:
In recent years, folding architecture networks and the closely related concept of recursive neural networks have been developed for solving supervised learning tasks on data structures. In this paper, these networks are applied to the problem of learning search-control heuristics for automated deduction systems. Experimental results with the automated deduction system Setheo in an algebraic domain show a considerable performance improvement. Controlled by heuristics which had been learned from simple problems in this domain, the system is able to solve several problems from the same domain which had been out of reach for the original system.
ES1999-306
A topological transformation for hidden recursive models
F. Costa, P. Frasconi, G. Soda
Abstract:
Discriminant hidden Markov models can be generalized from strings to labeled acyclic structures and, in particular, ordered trees [6, 7]. Inference and parameter estimation algorithms for this class of models can be derived in a straightforward way as special instances of inference and learning algorithms for Bayesian networks. However, if we are interested in building a discriminant model, in which arrows are directed towards the root of the tree, the model turns out to be intractable since the number of parameters grows exponentially with the number of neighbors of each node. In this paper we describe a topological transformation that maps ordered trees into binary trees, thus making the total number of parameters independent of the number of neighbors, as for the case of generative models. Besides reducing complexity, it also makes it possible to deal with general ordered trees without imposing a priori a limit on the maximum outdegree. We show that the topological transformation maps regular sets of trees into regular sets of binary trees and, as a result, it does not affect the possibility of classifying trees with a finite-state device. Finally, experimental results from a logo classification task are shown.
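One well-known way to map ordered trees of arbitrary outdegree into binary trees is the left-child, right-sibling encoding, sketched below. The abstract does not specify the authors' transformation, so this is an illustrative assumption rather than their construction. The function name `to_binary` and the tuple representation are hypothetical.

```python
def to_binary(node, siblings=()):
    """Left-child, right-sibling encoding of an ordered tree.

    Input nodes are (label, [children]); output nodes are
    (label, left, right), where left is the first child and right is the
    next sibling (None when absent).
    """
    label, children = node
    left = to_binary(children[0], tuple(children[1:])) if children else None
    right = to_binary(siblings[0], siblings[1:]) if siblings else None
    return (label, left, right)

# Root "a" has three children; "c" has one child "e".
tree = ("a", [("b", []), ("c", [("e", [])]), ("d", [])])
binary = to_binary(tree)
```

Every node of the encoded tree has at most two successors regardless of the outdegree in the original tree, so a model defined on it needs only a fixed number of parameters per node.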
ES1999-302
Neural learning of approximate simple regular languages
M. Forcada, A. Corbi, M. Gori, M. Maggini
Abstract:
Discrete-time recurrent neural networks (DTRNN) have been used to infer DFA from sets of examples and counterexamples; however, discrete algorithmic methods are much better at this task and clearly outperform DTRNN in space and time complexity. We show, however, how DTRNN may be used to learn not the exact language that explains the whole learning set but an approximate and much simpler language that explains a great majority of the examples by using simple rules. This is accomplished by gradually varying the error function in such a way that the net is eventually allowed to classify clearly but incorrectly those strings that are difficult to learn, which are treated as exceptions. The results show that in this way, the DTRNN usually learns a simplified approximate language.
ES1999-355
A benchmark for testing adaptive systems on structured data
M. Hagenbuchner, A.C. Tsoi
Abstract:
A number of adaptive methods capable of coping with structured data have emerged recently. Until recently, it was difficult to compare the performance of these methods as there were no universally accepted benchmark problems. As a result, we have developed a methodology to generate a benchmark problem sufficiently flexible to permit the simulation of a wide range of structured data learning problems, sufficiently fast to generate a set of patterns in a reasonable time, and sufficiently small to allow easy access to needed data. The benchmark described in this paper is an artificial learning task consisting of images that feature objects built through rules expressed by an attributed plex grammar. There are a number of advantages in utilizing this methodology. First, it can be well defined by using an attributed plex grammar. There is no need for the provision of a huge dataset of images, as sets for training and testing can quickly be produced through a given grammar. But most importantly, this benchmark encapsulates some of the typical problems encountered in data processing of structured information. This paper illustrates this methodology by means of a traffic policeman problem. The patterns are used to generate data-trees as inputs for a typical adaptive learning algorithm. Preliminary tests show that some of these newly emerged adaptive learning algorithms perform very well compared to conventional methods.
Methodology
ES1999-5
The application of neural networks to the paper-making industry
P.J. Edwards, A. F. Murray, G. Papadopoulos, A.R. Wallace, J. Barnard
Abstract:
This paper describes the application of neural network techniques to the paper-making industry, particularly for the prediction of paper “curl”. Paper curl is a common problem and can only be measured reliably off-line, after manufacture. Model development is carried out using imperfect data, typical of that collected in many manufacturing environments, and addresses issues pertinent to real-world use. Predictions then are presented in terms that are relevant to the machine operator, as a measure of paper acceptability, a direct prediction of the quality measure, and always with a measure of prediction confidence. Therefore, the techniques described in this paper are widely applicable to industry.
ES1999-7
Marble slabs quality classification system using texture recognition and neural networks methodology
J. Martinez-Cabeza de Vaca Alajarin, L.-M. Tomas Balibrea
Abstract:
This article describes the use of an LVQ neural network for the clustering and classification of marble slabs according to their texture. The method used for the recognition of textures is based on the Sum and Difference Histograms, a faster version of the Co-occurrence Matrices. The input of the network is a vector of statistical parameters which characterize the pattern shown to the net, and the desired output is the class to which the pattern belongs (supervised learning). The samples chosen for testing the algorithms have been marble slabs of type "Crema Marfil Sierra de la Puerta". The neural network has been implemented using MATLAB.
ES1999-8
Visual-based posture recognition using hybrid neural networks
A. Corradini, H.-J. Boehme, H.-M. Gross
Abstract:
This paper describes the preliminary results of the research work currently ongoing at our department and carried out as part of a project funded by the Commission of the European Union. In this paper a novel approach to human posture analysis and recognition using standard image processing techniques as well as hybrid neural information processing is presented. We first develop a reliable and robust person localization module via a combination of oriented filters and three-dimensional dynamic neural fields. Then we focus on the view-based recognition of the user's static gestural instructions from a predefined vocabulary based on both a skin color model and statistical normalized moment invariants. The segmentation of the postures occurs by means of the skin color model based on the Mahalanobis metric. From the resulting binary image containing only regions which have been classified as skin candidates we extract translation and scale invariant moments. They are used as input for two different neural classifiers whose results are then compared. To train and test the neural classifiers we gathered the data from five people performing 18 repetitions of each of five postures (our vocabulary): stop, go left, go right, hello left and hello right. The system is currently under development and is updated continuously. It uses input from a color video camera and is user-independent. The aim is to build a real-time system able to deal with dynamic gestures.
ES1999-36
Model clustering by deterministic annealing
B. Bakker, T. Heskes
Abstract:
Although training an ensemble of neural network solutions increases the amount of information obtained from a system, large ensembles may be hard to analyze. Since data clustering is a good method to summarize large bodies of data, we will show in this paper how to use clustering on instances of neural networks. We will describe an algorithm based on deterministic annealing, which is able to cluster various types of data. As an example, we will apply the algorithm to instances of three different types of MLPs, trained to predict the time of death of ovarian cancer patients.
Special session: Remote sensing spectral image analysis
ES1999-356
The challenges in spectral image analysis: an introduction, and review of ANN approaches
E. Merényi
Abstract:
Utilization of remote sensing multi- and hyperspectral imagery has been rapidly increasing in numerous areas of economic and scientific significance. Hyperspectral sensors, in particular, provide the detailed information that is known from laboratory measurements to characterize and identify minerals, soils, rocks, plants, water bodies, and other surface materials. This opens up tremendous possibilities for resource exploration and management, environmental monitoring, natural hazard prediction, and more. However, exploitation of the wealth of information in spectral images has yet to match up to the sensors' capabilities, as conventional methods often prove inadequate. ANNs hold the promise to revolutionize this area by overcoming many of the mathematical obstacles that traditional techniques fail at. By providing high speed when implemented in parallel hardware, (near-)real time processing of extremely high data volumes, typical in remote sensing spectral imaging, will also be possible.
ES1999-354
A simple associative neural network for producing spatially homogenous spectral abundance interpretations of hyperspectral imagery
N. Pendock
Abstract:
A hyperspectral remotely sensed image may be modeled as a linear mixture of the spectral responses of unknown spectral endmembers. Using the a-priori information that the unknown spectral abundance images should be spatially homogenous, a simple associative neural network may be trained using Hebbian learning to extract spectral endmembers and corresponding abundance images from a hyperspectral image. The technique is applied to an AVIRIS image of Cuprite, Nevada and is compared to an interactive technique for approximating the spectral convex hull of a hyperspectral image that requires a-priori geological knowledge to identify spectral endmembers.
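The linear mixture model named in this abstract can be sketched in a few lines. This is not the Hebbian associative network of the paper: it assumes the endmember spectra are already known and recovers abundances by plain unconstrained least squares. The endmember values and the helper names `mix` and `unmix` are made up for illustration.

```python
import numpy as np

# Hypothetical endmember spectra: 3 endmembers x 5 spectral bands (made-up values).
E = np.array([
    [0.9, 0.8, 0.6, 0.3, 0.1],
    [0.1, 0.3, 0.7, 0.8, 0.9],
    [0.5, 0.1, 0.5, 0.1, 0.5],
])


def mix(abundances, endmembers):
    """Linear mixing model: pixel spectrum = sum_i a_i * e_i."""
    return abundances @ endmembers


def unmix(pixel, endmembers):
    """Unconstrained least-squares estimate of the abundances (an assumption,
    not the method of the paper)."""
    a, *_ = np.linalg.lstsq(endmembers.T, pixel, rcond=None)
    return a


true_abundances = np.array([0.6, 0.3, 0.1])
pixel = mix(true_abundances, E)   # noise-free mixed pixel
estimated = unmix(pixel, E)       # recovers the abundances in this noise-free case
```

In the setting of the abstract the endmembers are unknown and must be extracted together with the abundance images, which is what makes the problem harder than this sketch.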
ES1999-352
Estimating the intrinsic dimensionality of hyperspectral images
J. Bruske, E. Merényi
Abstract:
Estimating the intrinsic dimensionality (ID) of an intrinsically low (d-) dimensional data set embedded in a high (n-) dimensional input space by conventional Principal Component Analysis (PCA) is computationally hard because PCA scales cubically (O(n^3)) with the input dimension [11]. Besides this computational drawback, global PCA will overestimate the ID if the data manifold is curved. In this paper we apply ID_OTPM [1], a new algorithm for ID estimation based on Optimally Topology Preserving Maps [7], to image sequences. In particular, we utilize ID_OTPM for ID estimation of an AVIRIS data set, a hyperspectral satellite image sequence with input dimension n = 257880. Most interestingly, our experiments suggest that the inter-band dimension, d_b, of the AVIRIS data set is between one and two, whereas the spectral dimension, d_s, is about four. These results provide important clues for compression, visualization and classification of the AVIRIS data set.
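The statement that global PCA overestimates the ID of a curved manifold can be illustrated with a toy case. The sketch below is not ID_OTPM. It assumes a unit circle (a one-dimensional manifold) embedded in three dimensions, and a hypothetical helper `pca_dimension` that counts the principal components needed to reach a 95% explained-variance threshold; the threshold is an arbitrary choice.

```python
import numpy as np

# Points on a unit circle in the x-y plane, embedded in 3-D.
# The manifold is one-dimensional (parameterised by the angle t) but curved.
t = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
X = np.column_stack([np.cos(t), np.sin(t), np.zeros_like(t)])


def pca_dimension(data, explained=0.95):
    """Number of principal components needed to reach the given
    fraction of total variance (global PCA estimate of the ID)."""
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals = np.linalg.eigvalsh(cov)[::-1]   # sort eigenvalues in descending order
    eigvals = np.clip(eigvals, 0.0, None)     # guard against tiny negative round-off
    ratio = np.cumsum(eigvals) / eigvals.sum()
    return int(np.searchsorted(ratio, explained) + 1)


global_estimate = pca_dimension(X)
```

Here global PCA reports two dimensions, because the variance is split evenly between the x and y axes, although the circle is intrinsically one-dimensional.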
ES1999-351
Benefits and limits of the self-organizing map and its variants in the area of satellite remote sensoring processing
T. Villmann
ES1999-353
Comparison of Kohonen, scale-invariant and GTM self-organising maps for interpretation of spectral data
D. MacDonald, S. McGlinchey, J. Kawala, C. Fyfe
Abstract:
We investigate the use of artificial neural networks in classifying hyperspectral data. Such data, when collected from remote sensors, provides extremely detailed coverage of e.g. the mineralogical composition of planetary surfaces; however, the volume of data supplied often overwhelms traditional classifiers. When we wish to investigate such data sets in an open-ended manner, the use of unsupervised learning is a pre-requisite. A set of remotely sensed spectral images is used to train several different topology-preserving neural networks. In each method, the data is projected onto a two-dimensional grid designed to visualise the data set in a low-dimensional space. Such mappings allow graceful degradation of the classifications given by the mappings since nearby data points are mapped to the same or similar classifications.
ANN models and learning I
ES1999-12
AdaBoost and neural networks
T. Windeatt, R. Ghaderi
Abstract:
AdaBoost, a recent version of Boosting, is known to improve the performance of decision trees in many classification problems, but in some cases it does not do as well as expected. There are also a few reports of its application to more complex classifiers such as neural networks. In this paper we decompose and modify this algorithm for use with RBF NNs, our methodology being based on the technique of combining multiple classifiers.
ES1999-22
Modeling face recognition learning in early infant development
F. Acerra, Y. Burnod, S. de Schonen
Abstract:
Face recognition development has been studied in experimental psychology, in the first month of life. These studies show that already at the age of 4 months the right hemisphere processes configural information, while the left hemisphere processes what is classically called local information. We have developed a neural model to understand how face recognition learning develops in early infancy. We propose a Bayesian network based on local cellular properties of visual areas and on lateral-feedforward interactions in the cortex. The model reproduces the experimental data of the right hemisphere infant behavior, when tested with faces. We suggest that Bayesian neural networks and the biological properties of cortical areas may be a more general and useful instrument to understand human development.
ES1999-3
The NeuralBAG algorithm: optimizing generalization performance in bagged neural networks
J. Carney, P. Cunningham
Abstract:
In this paper we propose an algorithm we call "NeuralBAG" that estimates the set of weights and number of hidden units each network in a bagged ensemble should have so that the generalization performance of the ensemble is optimized. Experiments performed on noisy synthetic data demonstrate the potential of the algorithm. On average, ensembles trained using NeuralBAG out-perform bagged networks trained using cross-validation by 53% and individual networks trained using "cheating" by 32%.
ES1999-38
Neuro-wavelet parametric characterization of hardness profiles
V. Colla, L. Reyneri, M. Sgarbi
Abstract:
This work compares a few attempts based on Neural and Wavelet networks, for extracting the Jominy hardness profile of steels directly from the chemical composition. In particular, the paper proposes a multi-networks architecture, where a first network is used as a parametric modeler of the Jominy profile itself, while a second one is used as a parameter estimator from the steel chemical composition.
ES1999-11
Heterogeneity enhanced order in a chaotic neural network
S. Mizutani, K. Shimohara
Abstract:
Order of the mean field by heterogeneity is studied in the turbulent phase of a chaotic neural network. Heterogeneity means the distributed randomness of the input in each neuron or the weight in the network. The average power spectrum of the mean field is used to observe the order and to focus on its peak sharpness. The sharpness of the power peak grows remarkably in the turbulent phase, except around the phase, due to the input disorder. One can find the maximum of the power sharpness as the weight disorder increases in the turbulent phase. We suppose that this ordering effect is important for processing information for actual neural networks because of the general existence of such heterogeneity.
ES1999-20
Tackling the stability/plasticity dilemma with double loop dynamic systems
C. Lecerf
Abstract:
Open and organized systems such as living organisms regulate their exchanges in order to maintain adaptation to their environment. When one reduces a biological organism to its central nervous system (CNS), adaptation appears as an information flow exchange between the CNS and its environment. However, the main mechanism used so far to explain learning is derived from Hebb's hypothesis and relies on structural modifications of the network through changing weights on connections. The double loop concept proposed here is the core of a structural and dynamic model tackling incremental learning in large neural networks. A computer simulation of this concept is briefly described, followed by an equivalent mathematical dynamic system that is related to Thomas' biological feedback theory. Due to the double loop architecture, the observed dynamics show that the model gives a built-in functional answer to the stability/plasticity dilemma.
Biological models and inspiration
ES1999-6
Regularization in oculomotor adaptation
J. Bullinaria, P. Riddell, S. Rushton
Abstract:
The oculomotor system remains plastic so that it can maintain clear single binocular vision during development and also in novel visual conditions (such as wearing new spectacles). It is important to understand this adaptation process so that we can predict in advance potential problems that might arise with new optical devices such as virtual reality head mounted displays. In this paper we present neural network models of adaptation to vertical disparities at different points in the visual field and argue that regularization (weight decay) provides a more realistic account of the empirical data than other approaches.
ES1999-26
Recurrent V1-V2 interaction for early visual information processing
H. Neumann, W. Sepp
Abstract:
A majority of cortical areas are connected via feedforward and feedback fiber projections. The computational role of the descending feedback pathways at different processing stages remains largely unknown. We suggest a new computational model in which normalized activities of orientation-selective contrast cells are fed forward to the next higher processing stage. The arrangement of input activation is matched against local patterns of curvature shape to generate activities which are subsequently fed back to the previous stage. Initial measurements that are consistent with the top-down generated context-dependent responses are locally enhanced. In all, we present a computational theory for recurrent processing in visual cortex in which the significance of measurements is evaluated on the basis of priors that are represented as contour code patterns. The model handles a variety of perceptual phenomena, such as bar texture stimuli, illusory contours, and grouping of fragmented shape outlines.
ES1999-33
Neural field description of state-dependent receptive field changes in the visual cortex
K. Suder, F. Wörgötter, T. Wennekers
Abstract:
Receptive fields in V1 have been shown to be wider during synchronized than during non-synchronized EEG states, where, in addition, they can shrink over time in response to flashed stimuli. In the present paper we employ a neural field approach to describe the activity patterns in V1 analytically. Expressions for spatio-temporal receptive fields are derived and fitted to experimental data. The model supports the idea that the observed RF-restructuring is mainly driven by state-dependent LGN firing patterns (burst vs. tonic mode).
Special session: Support Vector Machines
ES1999-451
Integrating the evidence framework and the support vector machine
J. Kwok
Abstract:
In this paper, we show that training of the support vector machine (SVM) can be interpreted as performing the level 1 inference of MacKay's evidence framework. We further show that levels 2 and 3 can also be applied to SVM. This allows automatic adjustment of the regularization parameter and the kernel parameter. More importantly, it opens up a wealth of Bayesian tools for use with SVM. Performance is evaluated on both synthetic and real-world data sets.
ES1999-452
Support vector classifier with asymmetric kernel function
K. Tsuda
ES1999-453
A multiplicative updating algorithm for training support vector machine
N. Cristianini, C. Campbell, J. Shawe-Taylor
Abstract:
Support Vector Machines find maximal margin hyperplanes in a high dimensional feature space, represented as a sparse linear combination of training points. Theoretical results exist which guarantee a high generalization performance when the margin is large or when the representation is very sparse. Multiplicative-Updating algorithms are a new tool for perceptron learning which are guaranteed to converge rapidly when the target concept is sparse. In this paper we present a Multiplicative-Updating algorithm for training Support Vector Machines which combines the generalization power provided by VC theory with the convergence properties of multiplicative algorithms.
ES1999-455
Face identification using support vector machines
R. Fernandez, E. Viennet
Abstract:
The Support Vector Machine (SVM) is a statistical learning technique proposed by Vapnik and his research group [8]. In this paper, we benchmark SVMs on a face identification problem and propose two approaches incorporating SV classifiers. The first approach maps the images into a low-dimensional feature vector via a local Principal Component Analysis (PCA); the feature vectors are then used as the inputs of an SVM. The second algorithm is a direct SV classifier with invariances. Both approaches are tested on the freely available ORL database. The SV classifier with invariances achieves an error of 1.5%, which is the best result known on the ORL database.
ES1999-456
Statistical mechanics of support vector machine
A. Buhot, M. Gordon
Abstract:
We present a theoretical study of the properties of a class of Support Vector Machines within the framework of Statistical Mechanics. We determine their capacity, the margin, the number of support vectors and the distribution of distances of the patterns to the separating hyperplane in feature-space.
ES1999-459
An efficient formulation of sparsity controlled support vector regression
P. Drezet, R. Harrison
Abstract:
Support Vector Regression (SVR) is a kernel-based regression method capable of implementing a variety of regularization techniques. Implementation of SVR usually follows a dual optimization technique which includes Vapnik's ε-insensitive zone. The number of terms in the resulting SVR approximation function is dependent on the size of this zone, but improving sparsity by increasing the size of this zone adversely affects precision. We describe an efficient method of formulating SVR without an ε-insensitive zone that selects a minimum support set for the terms of the approximator. Sparsity can then be traded for increased training error and/or decreased SV regularisation.
ES1999-460
Generalized support vector machines
D. Mattera, F. Palmieri, S. Haykin
Abstract:
Most Support Vector (SV) methods proposed in the recent literature can be viewed in a unified framework with great flexibility in terms of the choice of the basis functions. We show that all these problems can be solved within a unique approach if we are equipped with a robust method for finding a sparse solution of a linear system. Moreover, for such a purpose, we propose an iterative algorithm that can be simply implemented. This allows us to generalize the classical SV method to a generic choice of the basis functions.
ES1999-461
Support vector machines for multi-class pattern recognition
J. Weston, C. Watkins
Abstract:
The solution of binary classification problems using support vector machines (SVMs) is well developed, but multi-class problems with more than two classes have typically been solved by combining independently produced binary classifiers. We propose a formulation of the SVM that enables a multi-class pattern recognition problem to be solved in a single optimisation. We also propose a similar generalization of linear programming machines. We report experiments using benchmark datasets in which these two methods achieve a reduction in the number of support vectors and kernel calculations needed.
ES1999-462
From regression to classification in support vector machines
M. Pontil, R. Rifkin, T. Evgeniou
Abstract:
We study the relation between support vector machines (SVMs) for regression (SVMR) and SVMs for classification (SVMC). We show that for a given SVMC solution there exists an SVMR solution which is equivalent for a certain choice of the parameters. In particular, our result is that for epsilon sufficiently close to one, the optimal hyperplane and threshold for the SVMC problem with regularization parameter C_c are equal to 1/(1-epsilon) times the optimal hyperplane and threshold for SVMR with regularization parameter C_r = (1-epsilon) C_c. A direct consequence of this result is that SVMC can be seen as a special case of SVMR.
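The scaling relation stated in this abstract can be written compactly. The symbols below are assumptions introduced only for illustration, since the abstract gives the relation in words: w_c, b_c denote the optimal SVMC hyperplane and threshold, and w_r, b_r the optimal SVMR hyperplane and threshold.

```latex
\[
  w_c = \frac{1}{1-\epsilon}\, w_r, \qquad
  b_c = \frac{1}{1-\epsilon}\, b_r, \qquad
  \text{with } C_r = (1-\epsilon)\, C_c ,
\]
% valid, according to the abstract, for \epsilon sufficiently close to one
```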
ES1999-464
From first order logic to N^d: a data driven reformulation
M. Sebag
Abstract:
First order logic (FOL) offers a natural way of modeling domains such as chemistry: a molecule is most adequately described as a graph of atoms linked by simple or double bonds. To overcome the specific difficulties of dealing with FOL, this paper presents an automatic mapping from the initial problem domain onto the set of integer vectors N^d, where d is a user-supplied integer. This mapping onto a metric space induces a (semi)-distance on the problem domain. Within supervised learning, the quality of the reformulation can thus be estimated from the predictive accuracy of a k-nearest neighbor classifier based on this distance. The approach is validated on a real-world problem pertaining to organic chemistry: toxicology prediction.
ES1999-454
Dimensionality reduction by local processing
C. Wöhler, U. Kressel, J. Schürmann, J. Anlauf
Abstract:
In this paper we describe a novel approach towards dimensionality reduction of patterns to be classified. It consists of local processing of the patterns as an alternative to the well-known global principal component analysis (PCA) algorithm. We use a feed-forward neural network architecture with spatial or spatio-temporal receptive field connections between the first two layers that yields a transformed feature vector of significantly reduced dimension. We suggest two techniques to adapt the weights of the receptive fields: a local PCA algorithm and training by online gradient descent. Our dimensionality reduction algorithm requires computational costs that are several times smaller compared to the classical PCA approach without losing performance in the subsequent classification process. We apply the algorithm to the problem of handwritten digit recognition as well as to the recognition of pedestrians in image sequences.
ES1999-457
A kernel based adaline
T. Friess, R. Harrison
Abstract:
This new algorithm combines the conceptual simplicity of a least-mean-square algorithm for linear regression with the power of a universal non-linear function approximator. The method is based on a generalisation of the Widrow-Hoff LMS rule using Mercer kernels. Simple examples in curve fitting and non-linear systems identification are solved by the method.
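The idea of an LMS rule expressed through a Mercer kernel can be illustrated with a small curve-fitting sketch. This is not the authors' exact algorithm: it assumes a dual-form predictor f(x) = sum_i alpha_i K(x_i, x), a Gaussian (RBF) kernel, and a per-sample Widrow-Hoff style correction of alpha_i. The function names, kernel width, learning rate and epoch count are all illustrative choices.

```python
import numpy as np

def rbf_kernel(a, b, gamma=1.0):
    # Gaussian (RBF) kernel matrix between two 1-D sample arrays (assumed kernel choice).
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-gamma * d2)

def train_kernel_adaline(x, y, gamma=1.0, eta=0.5, epochs=200):
    # Dual-form LMS: the predictor is f(x) = sum_i alpha_i * K(x_i, x).
    K = rbf_kernel(x, x, gamma)
    alpha = np.zeros(len(x))
    for _ in range(epochs):
        for i in range(len(x)):
            err = y[i] - K[i] @ alpha   # prediction error on sample i
            alpha[i] += eta * err       # Widrow-Hoff style correction
    return alpha

def predict(x_train, alpha, x_new, gamma=1.0):
    return rbf_kernel(x_new, x_train, gamma) @ alpha

# Toy curve-fitting problem: recover sin(x) from 30 noise-free samples.
x = np.linspace(-3.0, 3.0, 30)
y = np.sin(x)
alpha = train_kernel_adaline(x, y)
mse = float(np.mean((predict(x, alpha, x) - y) ** 2))
print("training MSE:", mse)
```

With an RBF kernel the diagonal of the kernel matrix is 1, so each per-sample step is a relaxed coordinate update on the linear system K alpha = y; a step size between 0 and 2 keeps that iteration stable.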
ES1999-458
Data domain description using support vectors
D. Tax, R. Duin
Abstract:
This paper introduces a new method for data domain description, inspired by the Support Vector Machine by V. Vapnik, called the Support Vector Domain Description (SVDD). This method computes a sphere-shaped decision boundary with minimal volume around a set of objects. This data description can be used for novelty or outlier detection. It contains support vectors describing the sphere boundary and it has the possibility of obtaining higher order boundary descriptions without much extra computational cost. By using different kernels this SVDD can obtain more flexible and more accurate data descriptions. The error of the first kind, the fraction of the training objects which will be rejected, can be estimated immediately from the description.
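The abstract does not spell out the optimisation problem. As a hedged sketch, a minimum-volume sphere with slack is usually posed as below; the centre a, radius R, slack variables xi_i, trade-off constant C, multipliers alpha_i and kernel K are notation assumed here, not taken from the abstract.

```latex
\[
\min_{R,\,a,\,\xi}\; R^2 + C \sum_i \xi_i
\quad \text{s.t.} \quad
\lVert x_i - a \rVert^2 \le R^2 + \xi_i, \qquad \xi_i \ge 0 .
\]
% Dual form: inner products replaced by a kernel K
\[
\max_{\alpha}\; \sum_i \alpha_i K(x_i, x_i) - \sum_{i,j} \alpha_i \alpha_j K(x_i, x_j)
\quad \text{s.t.} \quad
0 \le \alpha_i \le C, \qquad \sum_i \alpha_i = 1 .
\]
```

Objects with alpha_i > 0 are the support vectors that describe the boundary, and swapping the kernel K changes the shape of the description in input space.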
ES1999-463
Support vector machines vs multi-layer perceptrons in particle identification
N. Barabino, M. Pallavicini, A. Petrolini, M. Pontil, A. Verri
Abstract:
In this paper we evaluate the performance of Support Vector Machines (SVMs) and Multi-Layer Perceptrons (MLPs) on two different problems of Particle Identification in High Energy Physics experiments. The obtained results indicate that SVMs and MLPs tend to perform very similarly.
ANN models and learning II
ES1999-34
Specialization with cortical models: An application to causality learning
H. Frezza-Buet, F. Alexandre
Abstract:
In this paper we present the principle of learning by specialization within a cortically-inspired framework. Specialization of neurons in the cortex has been observed, and many models are using such "cortical-like" learning mechanisms, adapted for computational efficiency. Adaptations will be discussed, in light of experiments with our cortical model addressing causality learning from perceptive sequences.
ES1999-23
Generalisation capabilities of a distributed neural classifier
A. Ribert, A. Ennaji, Y. Lecourtier
Abstract:
This article describes a new approach to the automated construction of a distributed neural classifier. The methodology is based upon supervised hierarchical clustering which enables one to determine reliable regions in the representation space. The proposed methodology proceeds by associating each of these regions with a Multi-Layer Perceptron (MLP). Each MLP has to recognise elements inside its region, while rejecting all others. Experimental results for a real problem (handwritten digit recognition) reveal an interesting generalisation behaviour of the distributed classifier in comparison to the k-nearest neighbour algorithm as well as a single MLP.
ES1999-42
A comparison of three PCA neural techniques
S. Fiori, F. Piazza
Abstract:
We present a comparison of three neural PCA techniques: the GHA by Sanger, the APEX by Kung and Diamantaras, and the psi-APEX first proposed by the present authors. Through numerical simulations and computational complexity evaluations we show that the psi-APEX algorithms exhibit superior capability and interesting features.
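For readers unfamiliar with neural PCA, the first of the three techniques, Sanger's Generalized Hebbian Algorithm (GHA), can be sketched in a few lines. This is a minimal illustration on synthetic data, not the experimental setup of the paper; the function name, learning rate, epoch count and toy data set are assumptions. APEX and psi-APEX are not shown.

```python
import numpy as np

def gha(X, n_components, eta=0.001, epochs=10, seed=0):
    # Sanger's rule: W += eta * (y x^T - LT[y y^T] W), with y = W x,
    # where LT[.] keeps the lower triangle including the diagonal.
    rng = np.random.default_rng(seed)
    W = 0.1 * rng.standard_normal((n_components, X.shape[1]))
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:
            y = W @ x
            W += eta * (np.outer(y, x) - np.tril(np.outer(y, y)) @ W)
    return W

# Toy data: zero-mean Gaussian with standard deviations 3, 1 and 0.3 along the axes,
# so the leading principal directions are the first and second coordinate axes.
rng = np.random.default_rng(1)
X = rng.standard_normal((2000, 3)) * np.array([3.0, 1.0, 0.3])
W = gha(X, n_components=2)
print(np.round(W, 2))
```

Each row of W is expected to approach one principal direction, ordered by decreasing variance; the sign of each row is arbitrary.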
ES1999-40
Neural networks which identify composite factors
D. MacDonald, D. Charles, C. Fyfe
Abstract:
We investigate the use of an artificial neural network to form a sparse distributed representation of the underlying factors in data sets. We extend the previously proposed [1] network so that it may identify composite causes in data sets by creating a hierarchical network. We use the network as a means of identifying individual faces when the network is trained on a mixture of faces and show both analytically and through experiments how noise allows us to find precisely the factors without prior assumptions of the number of factors.
ES1999-2
Supervised Art-II: a new neural network architecture, with quicker learning algorithm, for learning and classifying multivalued input patterns
K. R. Al-Rawi, C. Gonzalo, A. Arquero
Abstract:
A new artificial neural network (ANN) architecture for learning and classifying multivalued input patterns has been introduced, called Supervised ART-II. It represents a new supervision approach for ART modules. It is quicker in learning than Supervised ART-I when the number of category nodes is large, and it requires less memory. The architecture, learning, and testing of the newly developed ANN have been discussed.
Classification
ES1999-4
Feature binding and relaxation labeling with the competitive layer model
H. Wersing, H. Ritter
Abstract:
We discuss the relation of the Competitive Layer Model (CLM) to Relaxation Labeling (RL) with regard to feature binding and labeling problems. The CLM uses cooperative and competitive interactions to partition a set of input features into groups by energy minimization. As we show, the stable attractors of the CLM provide consistent and unambiguous labelings in the sense of RL, and we give an efficient stochastic simulation procedure for their identification. In addition to binding, the CLM exhibits contextual activity modulation to represent stimulus salience. We incorporate deterministic annealing for avoidance of local minima and show how figure-ground segmentation and grouping can be combined for the CLM application of contour grouping on a real image.
ES1999-16
Segmentation-free detection of overtaking vehicles with a two-stage time-delay neural network classifier
C. Wöhler, J. Schürmann, J. Anlauf
Abstract:
We propose an algorithm based on a time delay neural network (TDNN) with spatio-temporal receptive fields for segmentation-free detection of overtaking vehicles on motorways. Our algorithm transforms the detection problem into a classification problem of strongly downscaled image sequences which serve as an input to the TDNN without a preliminary segmentation step. The TDNN classifier is followed by an additional classification stage to evaluate the TDNN output over time, which achieves a significant enhancement of the detection performance especially under difficult visibility conditions.
ES1999-18
An integer recurrent artificial neural network for classifying feature vectors
R. K. Brouwer
Abstract:
The main contribution of this report is the development of an integer recurrent artificial neural network (IRANN) for classification of feature vectors. The network consists both of threshold units or perceptrons and of counters, which are non-threshold units with binary input and integer output. Input and output of the network consist of vectors of natural numbers. For classification, representatives of sets are stored by calculating a connection matrix such that all the elements in a training set are attracted to members of the same training set. The class of its attractor then classifies an arbitrary element if the attractor is a member of one of the original training sets. The network is successfully applied to the classification of sugar diabetes data and credit application data.
ES1999-24
Feature selection for ANNs using genetic algorithms in condition monitoring
L. Jack, A. Nandi
Abstract:
Artificial Neural Networks (ANNs) can be used successfully to detect faults in rotating machinery, using statistical estimates of the vibration signal as input features. One of the main problems facing the use of ANNs is the selection of the best inputs to the ANN, allowing the creation of compact, highly accurate networks that require comparatively little preprocessing. This paper examines the use of a Genetic Algorithm (GA) to select the most significant input features from a large set of possible features in machine condition monitoring contexts. Using a large set of 156 different features, the GA is able to select a set of 6 features that give 100% recognition accuracy.
Special session: Information extraction using unsupervised neural networks
ES1999-208
Trends in Unsupervised Learning
C. Fyfe
Abstract:
We review the trends in unsupervised learning towards the search for (in)dependence rather than (de)correlation, towards the use of global objective functions, towards a balancing of cooperation and competition and towards probabilistic, particularly Bayesian methods.
ES1999-202
Detection of two Gaussian clusters
A. Buhot, M. Gordon
Abstract:
We discuss the detection of two Gaussian clusters given a cloud of points. The optimal learning curve for this unsupervised learning scenario is determined with a replica calculation. A comparison with principal component analysis and supervised learning allows us to understand the three different learning phases observed.
ES1999-205
Independent component analysis for mixture densities
F. Palmieri, A. Budillon, D. Mattera
Abstract:
Independent component analysis (ICA), formulated as a density estimation problem, is extended to a mixture density model. A number of ICA blocks, associated with implicit equivalent classes, are updated in turn on the basis of the estimated density they represent. The approach is equivalent to the EM algorithm and allows an easy nonlinear extension of all the current ICA algorithms. We also show a preliminary test on bi-dimensional synthetic data drawn from a mixture model.
ES1999-206
Extraction of intrinsic dimension using CCA - Application to blind sources separation
N. Donckers, A. Lendasse, V. Wertz, M. Verleysen
Abstract:
A useful general-purpose parameter in data analysis is the intrinsic dimension of a data set, corresponding to the minimum number of variables necessary to describe the data without significant loss of information. The knowledge of this dimension also facilitates most non-linear projection methods. We will show that the intrinsic dimension of a data set can be efficiently estimated using Curvilinear Component Analysis; we will also show that the method can be applied to the Blind Source Separation problem to estimate the number of sources in a mixture.
ES1999-207
Noise to extract independent causes
D. Charles, C. Fyfe
Abstract:
Noisy threshold activation functions are used to force sparse responses on the output neurons of an unsupervised neural network enabling the network to identify the underlying independent factors of visual data. The addition of noise into the network enables us to control the response of the network to the data so we can force only as many outputs to respond to the data as there are significant factors in the data. Noise is also used to modularise the response of the network so that factors with temporal correlation may be coded in the same module of the output space.
ES1999-201
Information retrieval systems using an associative conceptual space
J. van den Berg, M. Schuemie
Abstract:
After a review of 'intelligent' information retrieval systems, we propose an AI-based retrieval system inspired by the WEBSOM algorithm. Contrary to that approach, however, we introduce a system using only the index of every document. The knowledge extraction process results in a so-called Associative Conceptual Space, where the 'concepts', as found in the documents, are clustered using a Hebbian type of (un)learning. Then, each document is characterised by comparing the concepts found in it to those present in the concept space. Applying the characterisations, all documents can be clustered such that semantically similar documents lie close together on a Self-Organising Map.
ES1999-203
Taking inspiration from the Hippocampus can help solving robotics problems
A. Revel, P. Gaussier, J.P. Banquet
ANN models and learning III
ES1999-37
Orthogonal least square algorithm applied to the initialization of multi-layer perceptrons
V. Colla, L. Reyneri, M. Sgarbi
Abstract:
An efficient procedure is proposed for initializing two-layer perceptrons and for determining the optimal number of hidden neurons. This is based on the Orthogonal Least Squares method, which is typical of RBF as well as Wavelet networks. Some experiments are discussed, in which the proposed method is coupled with standard backpropagation training and compared with random initialization.
ES1999-17
Maximisation of stability ranges for recurrent neural networks subject to on-line adaptation
J. Steil, H. Ritter
Abstract:
We present conditions for absolute stability of recurrent neural networks with time-varying weights based on the Popov theorem from non-linear feedback system theory. We show how to maximise the stability bounds by deriving a convex optimisation problem subject to linear matrix inequality constraints, which can efficiently be solved by interior point methods with standard software.
ES1999-303
Encoding of sequential translators in discrete-time recurrent neural nets
R.P. Neco, M.L. Forcada, R.C. Carrasco, M.A. Valdez-Munoz
Abstract:
In recent years, there has been a lot of interest in the use of discrete-time recurrent neural nets (DTRNN) to learn finite-state tasks, and in the computational power of DTRNN, particularly in connection with finite-state computation. This paper describes a simple strategy to devise stable encodings of sequential finite-state translators (SFST) in a second-order DTRNN with units having bounded, strictly growing, continuous sigmoid activation functions. The strategy relies on bounding criteria based on a study of the conditions under which the DTRNN is actually behaving as an SFST.
ES1999-29
On the invertibility of the RBF model in a predictive control strategy
A. Fache, O. Dubois, A. Billat
Abstract:
This paper describes the importance of the RBF model quality in a model-based predictive control scheme. We show that a good neuronal approximator does not necessarily correctly model the intrinsic behaviour of the identified system. We have used a simulated example to show the harmful effects of a particular type of incorrect behaviour, the non-invertibility of the model relative to the control input. Lastly, we propose a derived RBF model that is slightly more complex, but which is systematically invertible.
ES1999-46
Nonlinear factorization in sparsely encoded Hopfield-like neural networks
A.M. Sirota, A.A. Frolov, D. Husek
Abstract:
The problem of binary factorization of complex patterns in recurrent Hopfield-like neural networks was studied both theoretically and by means of computer simulation. The number and sparseness of factors mixed in patterns crucially determine the ability of an autoassociator to perform a factorization. Based on experimental data on memory and learning, one may suggest that there exists a neural system of intermediate storage of information, which fulfills the function of binary factorization of the incoming polysensory information for its further effective storage in the form of elementary associatively bound factors. We suppose that field CA3 of the hippocampus, possessing all properties of the autoassociative memory, performs such a function. This functional idea could be fruitfully applied to various memory related tasks (e.g. spatial navigation) and lead to some critical experiments.
ES1999-21
Storage capacity and dynamics of nonmonotonic networks
B. Crespi, I. Lazzizzera
Abstract:
This work investigates the retrieval capacities of different types of nonmonotonic neurons. Storage capacity is maximized when the neuron response is a function with well-defined geometrical characteristics. Numerical experiments demonstrate that storage capacity is directly related to the dynamical properties of the iterative map that describes the network evolution. Maximum capacity is reached when the neuron dynamics are subdivided into two non-overlapping "erratic bands" around the points xi = +/- 1.
ES1999-30
A general approach to construct RBF net-based classifier
F. Belloir, A. Fache, A. Billat
Abstract:
This paper describes a global approach to the construction of a Radial Basis Function (RBF) neural net classifier. We used a new, simple algorithm to completely define the structure of the RBF classifier. This algorithm has the major advantage of requiring only the training set (no learning step, threshold or other parameters as in other methods). Tests on several benchmark datasets showed that, despite its simplicity, this algorithm provides a robust and efficient classifier. The results of the RBF classifier built this way are compared to those obtained with three other classifiers: a classical one and two neural ones. The robustness and efficiency of this kind of RBF classifier make the proposed algorithm very attractive.
ES1999-32
Hidden Markov gating for prediction of change points in switching dynamical systems
S. Liehr, K. Pawelzik, J. Kohlmorgen, S. Lemm, K.-R. Müller
Abstract:
The prediction of switching dynamical systems requires an identification of each individual dynamics and an early detection of mode changes. Here we present a unified framework combining a mixture-of-experts architecture and a generalized hidden Markov model (HMM) with a state-space-dependent transition matrix. The specialization of the experts in the dynamical regimes and the adaptation of the switching probabilities are performed simultaneously during the training procedure. We show that our method allows for a fast online detection of mode changes in cases where the most recent input data together with the last dynamical mode contain sufficient information to indicate a dynamical change.
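As a rough illustration of how an HMM can gate between experts online, the following sketch filters the mode of a toy switching system. It is not the authors' method: the two experts are fixed hypothetical maps rather than trained networks, the transition matrix is constant rather than state-space dependent, and all names and parameter values are assumptions chosen for illustration.

```python
import math

def expert_predictions(x):
    # Two hypothetical experts, one per dynamical mode (illustrative maps only).
    return [0.5 * x + 1.0, 0.5 * x - 1.0]

def filter_modes(series, transition, sigma=0.5):
    """Return the most probable mode for each step x[t] -> x[t+1]."""
    n_modes = len(transition)
    belief = [1.0 / n_modes] * n_modes
    modes = []
    for x_now, x_next in zip(series, series[1:]):
        # HMM prediction step: propagate the belief through the transition matrix.
        prior = [sum(belief[i] * transition[i][j] for i in range(n_modes))
                 for j in range(n_modes)]
        # Gaussian likelihood of the observed next value under each expert.
        preds = expert_predictions(x_now)
        like = [math.exp(-(x_next - p) ** 2 / (2.0 * sigma ** 2)) for p in preds]
        # Bayes update: combine prior and likelihood, then normalize.
        joint = [pr * lk for pr, lk in zip(prior, like)]
        total = sum(joint)
        belief = [v / total for v in joint]
        modes.append(belief.index(max(belief)))
    return modes

# Noise-free toy series: mode 0 for 10 steps, then mode 1 for 10 steps.
true_modes = [0] * 10 + [1] * 10
series = [0.0]
for m in true_modes:
    series.append(expert_predictions(series[-1])[m])

transition = [[0.95, 0.05], [0.05, 0.95]]  # assumed constant switching probabilities
estimated = filter_modes(series, transition)
```

In this toy setting the two experts always disagree strongly, so the filtered mode follows the change as soon as the first post-switch sample arrives. With noisy data or similar regimes the detection would be slower.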
ES1999-31
Critical and non-critical avalanche behavior in networks of integrate-and-fire neurons
C. Eurich, T. Conradi, H. Schwegler
Abstract:
We study avalanches of spike activity in fully connected networks of integrate-and-fire neurons which receive purely random input. In contrast to the self-organized critical avalanche behavior in sandpile models, critical and non-critical behavior is found depending on the interaction strength. Avalanche behavior can be readily understood by using combinatorial arguments in phase space.
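A toy simulation can make the notion of an avalanche concrete. The sketch below is an assumption-laden illustration, not the model from the paper: the update rule (random input of size `delta_u` to one neuron, threshold 1, reset by subtraction, coupling `alpha / n_neurons` to every other neuron) and all names and default values are hypothetical choices.

```python
import random

def simulate_avalanches(n_neurons=50, alpha=0.9, delta_u=0.05,
                        n_avalanches=200, seed=0):
    """Return a list of avalanche sizes (number of spikes per avalanche)."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1) so that avalanches terminate")
    rng = random.Random(seed)
    u = [rng.random() for _ in range(n_neurons)]  # potentials in [0, 1)
    kick = alpha / n_neurons                      # input per presynaptic spike
    sizes = []
    while len(sizes) < n_avalanches:
        # Purely random external input to one randomly chosen neuron.
        i = rng.randrange(n_neurons)
        u[i] += delta_u
        if u[i] < 1.0:
            continue
        # The neuron crossed threshold: follow the avalanche it triggers.
        size = 0
        firing = [i]
        while firing:
            for j in firing:
                u[j] -= 1.0                       # reset by subtracting the threshold
                size += 1
                for k in range(n_neurons):
                    if k != j:
                        u[k] += kick
            firing = [k for k in range(n_neurons) if u[k] >= 1.0]
        sizes.append(size)
    return sizes

sizes_uncoupled = simulate_avalanches(alpha=0.0)
sizes_coupled = simulate_avalanches(alpha=0.9)
```

Without coupling every avalanche consists of a single spike. Increasing `alpha` lets one spike push other neurons over threshold, so the size distribution broadens with the interaction strength.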
Special session: Spiking neurons
ES1999-252
Fast analog computation in networks of spiking neurons using unreliable synapses
T. Natschläger, W. Maass
Abstract:
We investigate through theoretical analysis and computer simulations the consequences of unreliable synapses for fast analog computations in networks of spiking neurons, with analog variables encoded by the firing activities of pools of spiking neurons. Our results suggest that the known unreliability of synaptic transmission may be viewed as a useful tool for analog computing, rather than as a "bug" in neuronal hardware. We also investigate computations on analog time series encoded by the firing activities of pools of spiking neurons.
ES1999-253
Learning a temporal code
P. Häfliger
Abstract:
The paper proposes a concrete information encoding for networks of spiking neurons. A temporal code is presented in which neurons respond to simultaneous spike releases of a particular group of neurons. The paper puts a spike-based learning rule in the context of that coding and shows how a network adapts to events experienced while observing an environment. Furthermore, correlations between events distant in time can be learnt. To demonstrate this, a net is simulated, the neurons of which become selective to moving bar stimuli after repeated presentations of samples.
ES1999-251
VC dimension bounds for networks of spiking neurons
M. Schmitt
Abstract:
We calculate bounds on the VC dimension and pseudo dimension for networks of spiking neurons. The connections between network nodes are parameterized by transmission delays and synaptic weights. We provide bounds in terms of network depth and number of connections that are almost linear. For networks with few layers this yields better bounds than previously established results for networks of unrestricted depth.
ES1999-254
What does a neuron talk about?
S. Wilke, C. Eurich
Abstract:
We study the coding accuracy of a population of stochastically spiking neurons that respond to different features of a stimulus. By using Fisher information as a measure of the encoding error, it can be shown that narrow tuning functions in one of the encoded dimensions increase the coding accuracy for this dimension as long as the active sub-population is large enough. This can be achieved by neurons that are broadly tuned in the other dimensions. If one or more stimulus features encoded by the neural population are unknown, the relative widths of the tuning curves in the remaining dimensions are a measure of the corresponding relative accuracies. This feature allows a quantitative description of the kind of information conveyed by the neural population.
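The dependence of coding accuracy on tuning width can be illustrated with a one-dimensional toy population. The sketch below assumes independent Poisson neurons with Gaussian tuning curves, for which each neuron contributes window * f'(x)^2 / f(x) to the Fisher information. The function names, grid layout and parameter values are illustrative assumptions and are not taken from the paper, which treats several stimulus dimensions.

```python
import math

def fisher_information(x, centers, sigma, peak_rate=10.0, window=1.0):
    """Fisher information about stimulus x for independent Poisson neurons
    with Gaussian tuning curves of width sigma centered at `centers`."""
    total = 0.0
    for c in centers:
        u = x - c
        rate = peak_rate * math.exp(-u * u / (2.0 * sigma ** 2))
        # For a Gaussian tuning curve, f'(x)^2 / f(x) = f(x) * u^2 / sigma^4.
        total += window * rate * u * u / sigma ** 4
    return total

def grid(spacing, half_width=10.0):
    # Evenly spaced preferred stimuli covering [-half_width, half_width].
    n = int(round(half_width / spacing))
    return [i * spacing for i in range(-n, n + 1)]

dense = grid(0.1)    # many neurons respond to any given stimulus
sparse = grid(5.0)   # very few neurons respond to any given stimulus
x = 2.5

j_dense_narrow = fisher_information(x, dense, sigma=0.5)
j_dense_broad = fisher_information(x, dense, sigma=1.0)
j_sparse_narrow = fisher_information(x, sparse, sigma=0.5)
j_sparse_broad = fisher_information(x, sparse, sigma=1.0)
```

In the dense population the narrower tuning curves give the larger Fisher information. In the sparse population the ordering reverses, because too few neurons are active for the stimulus, which mirrors the condition that the active sub-population be large enough.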
Temporal series
ES1999-401
Development of a French speech recognizer using a hybrid HMM/MLP system
J.-M. Boite, C. Ris
Abstract:
In this paper we describe the development of a French speech recognizer, and the experiments we carried out on our hybrid HMM/ANN system, which combines Artificial Neural Networks (ANNs) and Hidden Markov Models (HMMs). A phone recognition experiment with our baseline system achieved a phone accuracy of about 75%, which is very similar to the best results reported in the literature [1]. Preliminary experiments on continuous speech recognition have set a baseline performance for our hybrid HMM/ANN system on BREF using lexicons of different sizes. All the experiments were carried out with the STRUT (Speech Training and Recognition Unified Toolkit) software [2] and the NOWAY large vocabulary decoder [3].
ES1999-402
A hybrid system for fraud detection in mobile communications
Y. Moreau, E. Lerouge, H. Verrelst, J. Vandewalle, C. Störmann, P. Burge
Abstract:
During the course of the European project "Advanced Security for Personal Communication Technologies" (ASPeCT), we have developed a number of rule-based and neural network architectures as fraud detection tools for GSM networks. We have now integrated these different techniques into a hybrid detection tool. We optimized the performance of the hybrid system in terms of the number of subscribers raising alarms. More precisely, we optimized performance curves showing the trade-off between the percentage of correctly identified fraudsters and the percentage of new subscribers raising alarms. We report here on a common suite of experiments we performed on these different systems.
ES1999-35
Hybrid HMM/MLP models for time series prediction
J. Rynkiewicz
Abstract:
We present a hybrid model consisting of a hidden Markov chain and MLPs to model piecewise stationary series. We compare our results with the model of gating networks (A.S. Weigend et al. [6]) and we show that, at least on the classical laser time series, our model is more parsimonious and gives a better segmentation of the series.