Bruges, Belgium, April 27-28-29
Content of the proceedings
Clustering, quantization, and self-organization
Dynamical and Numerical Aspects of Neural Computing
Learning I
Artificial Neural Networks and Prognosis in Medicine
Learning II
Perceptrons and Multi-Layer Perceptrons
Evolutionary and neural computation
Independent Component Analysis
Classification using non-standard metrics
Learning III
Biologically inspired models
Kernel methods and the exponential family
Applications
Learning IV
Clustering, quantization, and self-organization
ES2005-78
The architecture of emergent self-organizing maps to reduce projection errors
Alfred Ultsch, Lutz Herrmann
Abstract:
There are mainly two types of Emergent Self-Organizing Map (ESOM) grid structures in use: hexgrid (honeycomb-like) and quadgrid (trellis-like) maps. In addition, the shape of the maps may be square or rectangular. This work investigates the effects of these different map layouts. Hexgrids were found to have no convincing advantage over quadgrids. Rectangular maps, however, are distinctly superior to square maps. Most surprisingly, rectangular maps outperform square maps even for isotropic data, i.e. data sets with no particular primary direction.
ES2005-115
A new learning algorithm for incremental self-organizing maps
Yann Prudent, Abdel Ennaji
Abstract:
An incremental, growing network model is introduced which is able to learn the topological relations in a given set of input vectors by means of a simple Hebb-like learning rule. First, an overview of the best-known Self-Organizing Map (SOM) models is given. We then propose a new algorithm for a SOM which can learn new input data (plasticity) without degrading the previously trained network or forgetting the old input data (stability). We report the validation of this model with extensive experiments on a synthetic problem and on handwritten digit recognition over a portion of the NIST database.
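For illustration, a minimal sketch of the classic competitive Hebbian ("Hebb-like") topology-learning step that incremental map models of this kind build on; node insertion, edge ageing and the authors' specific stability mechanism are not reproduced, and the learning rate is an illustrative assumption.

    import numpy as np

    def hebb_topology_step(units, edges, x, lr=0.05):
        # find the best and second-best matching units for the input x
        d = np.linalg.norm(units - x, axis=1)
        b1, b2 = np.argsort(d)[:2]
        # Hebb-like rule: connect the two winning units in the topology graph
        edges.add((min(b1, b2), max(b1, b2)))
        # move the winner (and, in a full model, its graph neighbours) towards x
        units[b1] += lr * (x - units[b1])
        return units, edges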
ES2005-82
The dynamics of Learning Vector Quantization
Michael Biehl, Anarta Ghosh, Barbara Hammer
Abstract:
Winner-Takes-All (WTA) algorithms offer intuitive and powerful learning schemes such as Learning Vector Quantization (LVQ) and variations thereof, most of which are heuristically motivated. In this article we investigate, in an exact mathematical way, the dynamics of different vector quantization (VQ) schemes, including standard LVQ. We consider training from high-dimensional data generated according to a mixture of overlapping Gaussians, in the case of two prototypes. Simplifying assumptions allow for an exact description of the on-line learning dynamics in terms of coupled differential equations. We compare the typical dynamics of the learning processes and the achievable generalization error.
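As a concrete reference point, a sketch of the basic on-line LVQ1 update with two prototypes on data drawn from two overlapping Gaussian clusters; the dimension, class separation and learning rate below are illustrative choices, not the paper's model parameters.

    import numpy as np

    rng = np.random.default_rng(0)
    dim, eta, steps = 100, 0.01, 5000
    centers = {+1: np.full(dim, 0.1), -1: np.full(dim, -0.1)}     # overlapping classes
    w = {+1: 0.01 * rng.normal(size=dim), -1: 0.01 * rng.normal(size=dim)}

    for _ in range(steps):
        label = rng.choice([1, -1])
        x = centers[label] + rng.normal(size=dim)
        # winner-takes-all: only the closest prototype is updated
        winner = min(w, key=lambda c: np.linalg.norm(x - w[c]))
        sign = 1.0 if winner == label else -1.0    # attract if correct, repel otherwise
        w[winner] += eta * sign * (x - w[winner])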
ES2005-23
TreeGNG - hierarchical topological clustering
Kevin Doherty, Rod Adams, Neil Davey
Abstract:
This paper presents TreeGNG, a top-down unsupervised learning method that produces hierarchical classification schemes. TreeGNG extends the Growing Neural Gas algorithm by maintaining a time history of the learned topological mapping. TreeGNG is able to recover from poor decisions made during the construction of the tree, and provides the novel ability to influence the general shape of the hierarchy.
Dynamical and Numerical Aspects of Neural Computing
ES2005-152
Two or three things that we (intend to) know about Hopfield and Tank networks
Miguel Atencia, Gonzalo Joya, Francisco Sandoval
Abstract:
This work aims at reviewing some of the main issues under research in the field of Hopfield networks. In particular, the feasibility of the Hopfield network as a practical optimization method is addressed. Alongside current results, the main directions that deserve ongoing analysis are outlined. In addition, some suggestions are provided to identify lines of work that have reached an impasse, where there is no evidence that further research will be fruitful, and topics that can nowadays be regarded as mainly of historical interest.
ES2005-142
The Nonlinear Dynamic State neuron
Nigel Crook, Wee Jin Goh, Mohammad Hawarat
Abstract:
This research is concerned with using nonlinear dynamics to greatly enhance the range of possible behaviours of artificial neurons. A novel neuron model is presented which has a dynamic internal state defined by a set of nonlinear equations, together with a threshold-driven spike output mechanism. With the aid of spike feedback control, the model is able to stabilise one of a large number of Unstable Periodic Orbits in its internal dynamics. These orbits correspond to dynamic states of the neuron, each of which generates a unique periodic spike train as output. The properties of this model are explored through experiments with single neurons and networks of neurons.
ES2005-114
Stability of backpropagation-decorrelation efficient O(N) recurrent learning
Jochen J. Steil
Abstract:
We provide a stability analysis based on nonlinear feedback theory for the recently introduced backpropagation-decorrelation (BPDC) recurrent learning algorithm. For one output neuron, BPDC adapts only the output weights of a possibly large network and therefore can learn in O(N). We derive a simple sufficient stability inequality which can easily be evaluated and monitored online to assure that the recurrent network remains stable while adapting. As a byproduct we show that BPDC is highly competitive on the recently introduced CATS benchmark data.
ES2005-122
Exponential stability of implicit Euler, discrete-time Hopfield neural networks
Francisco R. Villatoro
Abstract:
The exponential stability of continuous-time Hopfield neural networks is not preserved when implemented on digital computers by means of explicit numerical methods, whereas the implicit (or backward) Euler method preserves this exponential stability under exactly the same sufficient conditions as those previously obtained for the continuous model. The proof is based on the nonlinear measure approach, here extended to discrete-time systems. This approach also allows the estimation of the exponential convergence rate of the discrete solutions.
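For illustration, explicit and implicit (backward) Euler steps for a generic continuous-time Hopfield system du/dt = -u + W g(u) + b; the activation function, the simple fixed-point solver and the step size are illustrative assumptions, not the paper's construction.

    import numpy as np

    g = np.tanh

    def explicit_euler_step(u, W, b, h):
        return u + h * (-u + W @ g(u) + b)

    def implicit_euler_step(u, W, b, h, iters=50):
        # solve u_next = u + h * (-u_next + W @ g(u_next) + b) by fixed-point iteration
        u_next = u.copy()
        for _ in range(iters):
            u_next = (u + h * (W @ g(u_next) + b)) / (1.0 + h)
        return u_next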
ES2005-165
Stochastic analysis of the Abe formulation of Hopfield networks
Marie Kratz, Miguel Atencia, Gonzalo Joya
Abstract:
This work studies the influence of random noise on the application of Hopfield networks to combinatorial optimization. It has been suggested that the Abe formulation, rather than the original Hopfield formulation, is better suited to optimization, but the possible presence of noise in the connection weights of this model has not been considered until now. This consideration leads to a model that is formulated as a stochastic differential equation. In the stochastic setting, the analysis reveals that the model is stable and the states converge towards an attractive set, assuming the noise intensity is bounded. The relation of this attractor to that of the deterministic model requires further study.
ES2005-33
Using generic neural networks in the control and prediction of grasp postures
Franck Carenzi, Philippe Gorce, Yves Burnod, Marc Maier
Abstract:
We have developed a neural network model that learns the kinematics of object-dependent reach and grasp tasks with a simulated anthropomorphic arm/hand system. The network learns to combine multi-modal arm-related information as well as object-related information such as object size, location and orientation. We will first describe the learning of the finger inverse kinematics and then the learning of the grasp configuration. Finally, to illustrate the performance of the network, we will simulate object-dependent grasp configurations by the use of a 5-digit hand (4 DoF per digit) linked to a 7 DoF arm.
ES2005-40
Exponential stability of stochastic, retarded neural networks
Mark Joy
Abstract:
The stability analysis of neural networks is important in applications and has been studied by many authors. However, only recently has the stability of stochastic models of neural networks been investigated. In this paper we analyse the global asymptotic stability of Cellular Neural Networks (CNNs) described by a stochastic delay differential equation. It can be argued that such a model is as comprehensive as one would like when studying perturbations of CNNs, since delayed signalling and noise are both accounted for. We present a convergence theorem and discuss some examples of its use.
Learning I
ES2005-117
Organization properties of open networks of cooperative neuro-agents
Jean-Pierre Mano, Pierre Glize
Abstract:
Our research on adaptive systems is inspired by their ability to build, by themselves, a representation of their surrounding world. Using cooperation as a local criterion of self-organization, we study, in a network of neuro-agents, the evolution of the system and, at the same time, the emergence of a functioning coherent with the environmental feedback. In this paper we describe the abilities of the neuro-agents that give them the autonomy required to build a topology from scratch. Finally, we emphasize the dynamics of the network organization, which is essentially its key property for adaptation to a changing environment.
ES2005-86
A ridgelet kernel regression model using genetic algorithm
Shuyuan Yang, Min Wang, Jiao Licheng
Abstract:
In this paper, a ridgelet kernel regression model is proposed for the approximation of high-dimensional functions. It is based on ridgelet theory and on kernel and regularization techniques, from which we deduce a regularized kernel regression form. Taking the objective function solved by quadratic programming to define the fitness function, we use a genetic algorithm to search for the optimal directional vector of the ridgelet. The results indicate that this method can effectively deal with high-dimensional data, especially those with certain kinds of spatial inhomogeneities. Some illustrative examples are included to demonstrate its superiority.
ES2005-22
Boosting by weighting boundary and erroneous samples
Vanessa Gómez-Verdejo, Manuel Ortega-Moral, Jerónimo Arenas-García, Anibal R. Figueiras-Vidal
Abstract:
This paper shows that new and flexible criteria to resample populations in boosting algorithms can lead to performance improvements. The Real AdaBoost emphasis function can be divided into two different terms: the first pays attention only to the quadratic error of each pattern, while the second takes into account only the "proximity" of each pattern to the boundary. Here, we incorporate an additional degree of freedom into this fixed emphasis function, showing that a good trade-off between these two components improves the performance of the Real AdaBoost algorithm. Results over several benchmark problems show that an error rate reduction, faster convergence and robustness against overfitting can be achieved.
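To make the decomposition concrete: for labels y in {-1, +1}, the Real AdaBoost emphasis exp(-y f(x)) is proportional to exp((f(x) - y)^2 / 2) * exp(-f(x)^2 / 2), i.e. a quadratic-error term times a boundary-proximity term. The sketch below mixes the two terms with a single parameter lam; lam = 0.5 recovers the Real AdaBoost emphasis up to normalisation, but this particular parameterisation is an illustrative assumption rather than the paper's exact definition.

    import numpy as np

    def mixed_emphasis(f, y, lam=0.5):
        error_term = np.exp(lam * (f - y) ** 2)          # emphasises badly fitted patterns
        boundary_term = np.exp(-(1.0 - lam) * f ** 2)    # emphasises patterns near the boundary f(x) = 0
        weights = error_term * boundary_term
        return weights / weights.sum()                   # resampling distribution over the training set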
Artificial Neural Networks and Prognosis in Medicine
ES2005-153
Artificial neural networks and prognosis in medicine. Survival analysis in breast cancer patients
Franco Leonardo, Jose Jerez, Emilio Alba
Abstract:
In this paper we first give an introduction to the problem of prognosis in medicine. The importance of prognosis is highlighted and a brief summary of some successful applications of neural networks is included, together with an analysis of their advantages over standard statistical tools. In the second part, we compare the performance of the Cox proportional hazards model and an approach based on artificial neural networks, both constructed for the prognosis of outcome in patients with primary breast cancer. The data were collected from 32 hospitals in Spain, via the Spanish group of research in breast cancer, within the framework of the "El Alamo" project. The population was divided into training and test sets, and the predictive accuracy of the prognosis models (Cox and neural networks) was compared by determining sensitivities, specificities and the area under the receiver operating characteristic curve (ROC area). The results show that the neural network predictions are much more accurate, in particular in the early months after surgical intervention.
ES2005-162
An artificial neural network for analysing the survival of patients with colorectal cancer
Rachel Bittern, Alfred Cuschieri, Sergey Dolgobrodov, Robin Marshall, Peter Moore, Robert Steele
Abstract:
An internet/web-based artificial neural network has been developed for use by practising clinical oncologists and medical researchers as part of a programme to aid decision making and, eventually, the management and treatment of individual patients with colorectal cancer. We have configured and implemented a Partial Likelihood Artificial Neural Network (PLANN) and trained it to predict cancer-related survival in patients with confirmed colorectal cancer, using a database provided by the Clinical Resource and Audit Group (CRAG) in Scotland. The reliability of the trained PLANN was evaluated against Kaplan-Meier (KM) actual survival plots and shows close agreement with them.
ES2005-42
Artificial intelligence techniques for the prediction of bladder cancer progression
Maysam Abbod, Jim W.F. Catto, Minyou Chen, Derek A. Linkens, Freddie C. Hamdy
Abstract:
New techniques for the prediction of tumour behaviour are needed, since statistical analysis has poor accuracy and is not applicable to the individual. Artificial Intelligence (AI) may provide suitable methods. We have compared the predictive accuracies of neuro-fuzzy modelling (NFM), artificial neural networks (ANN) and traditional statistical methods for the behaviour of bladder cancer. Experimental molecular biomarkers, including p53 expression and gene methylation, and conventional clinicopathological data were studied in a cohort of 122 patients with bladder cancer. For all three methods, models were produced to predict the presence and timing of tumour progression. Both AI methods predicted progression with an accuracy ranging from 88% to 100%. This was superior to logistic regression. NFM appeared better than ANN at predicting the timing of progression.
ES2005-18
Computational models of intracytoplasmic sperm injection prognosis
Hui Liu, Ash Kshirsagar, Jessica Ku, Dolores Lamb, Craig Niederberger
Abstract:
Intracytoplasmic sperm injection (ICSI), a procedure in which a single sperm is microinjected into an ovum, presents a difficult modeling problem highly sensitive to both under- and overfitting. We modeled an ICSI data set of 528 outcomes with a variety of clinical and laboratory variates using linear and non-linear (neural computational) logistic regression methods and with linear and radial basis function support vector machines. Interestingly, depending on the threshold chosen to determine a positive outcome, investigated neural computational methods yielded similar or lower ROC AUCs than logistic regression in the test set, whereas support vector machines yielded similar or higher ROC AUCs than those of logistic regression and the investigated neural computational models.
ES2005-31
Handling outliers and missing data in brain tumour clinical assessment using t-GTM
Alfredo Vellido, Paulo J.G. Lisboa, Dolores Vicente
Abstract:
Uncertainty is inherent to medical decision making, and automated decision support systems should aim to reduce it. In this paper, MR spectral data are considered in a problem of discrimination of brain tumour types and grades. Models to fit these data can be affected by two sources of uncertainty that might occur in the data: the presence of outliers and data incompleteness. A model for multivariate data clustering and visualization, the GTM, is here redefined as a mixture of Student t-distributions that is robust towards outliers while providing missing values imputation. The effectiveness of this model on the MRS data is demonstrated empirically.
ES2005-39
Predicting bed demand in a hospital using neural networks and ARIMA models: a hybrid approach
Mark Joy, Simon Jones
Abstract:
In this paper we describe an investigation into the prediction of emergency bed demand - bed demand due to non-scheduled admissions - within an NHS (the UK's state health provider, the National Health Service) hospital in South London, UK. A hybrid methodology, incorporating a neural network and an ARIMA model, was used to predict a time series of bed demand. A thorough statistical analysis of the data set was performed as a preliminary phase of the research, from which a classical linear predicting model was developed. The prediction errors, or residuals, from this model were then used as input to a neural network. These methods represent a novel approach to the problem of efficient bed resource management for hospitals.
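A minimal sketch of the hybrid scheme described above: a linear ARIMA model is fitted first, and a small neural network is then trained on lagged residuals to supply a nonlinear correction. The file name, ARIMA order, lag window and network size are illustrative assumptions.

    import numpy as np
    from statsmodels.tsa.arima.model import ARIMA
    from sklearn.neural_network import MLPRegressor

    demand = np.loadtxt("daily_bed_demand.txt")        # hypothetical univariate demand series

    linear = ARIMA(demand, order=(2, 0, 1)).fit()      # classical linear predicting model
    residuals = linear.resid                           # what the linear model fails to capture

    lags = 7                                           # one week of lagged residuals as network input
    X = np.column_stack([residuals[i:len(residuals) - lags + i] for i in range(lags)])
    y = residuals[lags:]
    nonlinear = MLPRegressor(hidden_layer_sizes=(10,), max_iter=2000).fit(X, y)

    # final forecast = linear forecast + neural correction of its residual
    correction = nonlinear.predict(residuals[-lags:].reshape(1, -1))[0]
    forecast = linear.forecast(1)[0] + correction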
ES2005-46
Functional topographic mapping for robust handling of outliers in brain tumour data
Alfredo Vellido, Paulo J.G. Lisboa
Abstract:
Magnetic Resonance spectra comprise finite frequency measurements sampled from a continuous frequency distribution and are therefore amenable to Functional Data Analysis (FDA) techniques. In this paper, MR spectral data are considered with the purpose of discriminating between brain tumour types. Models fitted to these data can be affected by the uncertainty associated with the presence of outliers. A functional variation on a model for data clustering and visualization, the t-GTM, is introduced. It is defined as a mixture of Student t-distributions that is robust towards outliers. The effectiveness of this model for outlier detection and tumour type visualization is compared for raw and functional data.
ES2005-81
Relevance learning for mental disease classification
Barbara Hammer, Andreas Rechtien, Marc Strickert, Thomas Villmann
Abstract:
In medical classification tasks, it is important to gain information about how decisions are made, so that therapies can be grounded in and reflect this knowledge. Neural black-box mechanisms are not suitable for such tasks, whereas symbolic methods that extract explicit rules are, although their tolerance to noise is often lower since they do not rely on distributed representations. In this article, we test three hybrid prototype-based neural models which combine neural representations with explicit information representation, in comparison with classical decision trees, for mental disease classification. Depending on the model, information about relevant input attributes and explicit rules can be derived.
ES2005-134
Automatic classification of prostate cancer using pseudo-gaussian radial basis function neural network
Olga Valenzuela, Ignacio Rojas, Fernando Rojas, Luisa Marquez
Abstract:
Recent advances in multimedia and image processing techniques can be utilized to assist pathologists in the analysis of prostate tissue. In fact, many investigators believe that automation of prostate cancer analysis increases the rate of early detection. In this paper, we propose an automatic procedure, based on soft-computing techniques, for the interpretation of prostate cancer light micrographs with increased accuracy. We propose a feature subset selection algorithm that selects the most important features, which are then used by a pseudo-Gaussian radial basis function neural network to classify the prostate cancer light micrographs. A high classification rate has been achieved, which will reduce subjective human intervention and increase the diagnostic speed.
Learning II
ES2005-4
Artificial neural network fusion: Application to Arabic words recognition
Nadir Farah, Mohamed tarek Khadir, Mokhtar Sellami
Abstract:
The study of multiple classifier systems has recently become an area of intensive research in pattern recognition, in order to improve the results of single classifiers. In this work, two types of feature combination for handwritten Arabic literal amount word recognition using neural network classifiers are discussed. Different parallel combination schemes are presented and their results compared with a single-classifier benchmark using the complete feature set.
ES2005-12
Adaptive Simultaneous Perturbation Based Pruning Algorithm for Neural Control Systems
Jie Ni, Qing Song
Abstract:
It is normally difficult to determine the optimal size of neural networks, particularly in sequential training applications such as online control. In this paper, a novel training and pruning algorithm, the Adaptive Simultaneous Perturbation Based Pruning algorithm (ASPBP), is proposed for the online tuning and pruning of a neural tracking control system. The conic sector theory is introduced in the design of this robust neural control system, which aims at providing guaranteed boundedness for both the input-output signals and the weights of the neural network.
ES2005-15
Modified backward feature selection by cross validation
Shigeo Abe
Abstract:
Since training a classifier takes time, usually some criterion other than the recognition rate is used for feature selection. This may, however, lead to a deterioration of the generalization ability caused by feature selection. To overcome this problem, in this paper we propose modified backward feature selection by cross validation. Initially, we determine the candidate set, which consists of the features that do not deteriorate the generalization ability if each is deleted individually from the initial set of features. If the generalization ability is not deteriorated even when all the candidate features are deleted, we terminate the algorithm. Otherwise, we delete by backward deletion the candidate feature whose removal improves the generalization ability the most, and determine a new candidate set that is a subset of the current candidate set. We iterate the above procedure until the candidate set is empty. We evaluate our method using support vector machines on several benchmark data sets and show that many features can be deleted without deteriorating the generalization ability.
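A compact sketch of one reading of the selection loop described above, using k-fold cross-validation accuracy of an SVM as the estimate of generalization ability; the SVM settings and the five-fold split are illustrative assumptions, and the candidate set is recomputed from all remaining features rather than restricted to the previous candidates.

    import numpy as np
    from sklearn.svm import SVC
    from sklearn.model_selection import cross_val_score

    def cv_score(X, y, features):
        return cross_val_score(SVC(kernel="rbf"), X[:, sorted(features)], y, cv=5).mean()

    def modified_backward_selection(X, y):
        selected = set(range(X.shape[1]))
        while len(selected) > 1:
            base = cv_score(X, y, selected)
            # candidate set: features whose individual deletion does not deteriorate the score
            candidates = {f for f in selected if cv_score(X, y, selected - {f}) >= base}
            if not candidates:
                break
            # terminate if deleting all candidates at once does not deteriorate the score
            if len(candidates) < len(selected) and cv_score(X, y, selected - candidates) >= base:
                return selected - candidates
            # otherwise delete the single candidate whose removal helps the most
            best = max(candidates, key=lambda f: cv_score(X, y, selected - {f}))
            selected = selected - {best}
        return selected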
ES2005-21
Initialisation improvement in engineering feedforward ANN models
Agathoklis Krimpenis, G.-C. Vosniakos
Abstract:
Any feedforward artificial neural network (ANN) training procedure begins with the initialisation of the connection weights' values. These initial values are generally selected in a random or quasi-random way in order to increase training speed. Nevertheless, it is common practice to initialise the same ANN architecture repeatedly in order to achieve satisfactory training results. This is due to the fact that the error function may have many local extrema, and the training algorithm can get trapped in any one of them depending on its starting point, i.e. on the particular initialisation of the weights. This paper proposes a systematic way of weight initialisation that is based on performing multiple linear regression on the training data. Experimental data from a metal cutting process were used for ANN model building to demonstrate an improvement in both training speed and achieved training error, regardless of the selected architecture.
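One plausible realisation of the idea, shown only as a hedged sketch (the paper's exact initialisation scheme is not reproduced): fit a multiple linear regression on the training data and start every hidden unit of a single-hidden-layer network as a slightly perturbed copy of the regression coefficients instead of purely random values.

    import numpy as np

    def regression_based_init(X, y, n_hidden, scale=0.01, seed=0):
        rng = np.random.default_rng(seed)
        Xb = np.column_stack([X, np.ones(len(X))])             # add a bias column
        beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)          # multiple linear regression fit
        # first-layer weights: perturbed copies of the regression coefficients
        W1 = np.tile(beta, (n_hidden, 1)) + scale * rng.standard_normal((n_hidden, beta.size))
        # output weights: average the (initially near-linear) hidden activations
        W2 = np.full(n_hidden, 1.0 / n_hidden)
        return W1, W2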
ES2005-41
An On-line Fisher Discriminant
Manuel Ortega-Moral, Vanessa Gómez-Verdejo, Jerónimo Arenas-García, Anibal R. Figueiras-Vidal
Abstract:
Many applications in signal processing need an adaptive algorithm. Adaptive schemes are useful when the statistics of the problem are unknown or when facing time-varying environments. Nonetheless, many of these applications deal with classification tasks, and most adaptive algorithms are not specifically designed to tackle this kind of problem. Whereas Fisher's criterion aims to find the most adequate direction to discriminate classes in a stationary setting, the newly proposed On-line Fisher Linear Discriminant (OFLD) is able to adaptively update its parameters while maintaining this discrimination goal. The algorithm has been tested on an equalization problem under several conditions.
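For illustration, one simple way to make Fisher's linear discriminant adaptive: keep exponentially weighted running estimates of the class means and of the within-class scatter and recompute the discriminant direction as samples arrive. This is a generic sketch with an illustrative forgetting factor, not necessarily the exact OFLD update proposed in the paper.

    import numpy as np

    class AdaptiveFisher:
        def __init__(self, dim, rho=0.99):
            self.rho = rho                                   # forgetting factor
            self.m = {+1: np.zeros(dim), -1: np.zeros(dim)}  # running class means
            self.Sw = np.eye(dim)                            # running within-class scatter

        def update(self, x, y):
            self.m[y] = self.rho * self.m[y] + (1.0 - self.rho) * x
            d = x - self.m[y]
            self.Sw = self.rho * self.Sw + (1.0 - self.rho) * np.outer(d, d)
            # Fisher direction w = Sw^{-1} (m_{+1} - m_{-1})
            return np.linalg.solve(self.Sw, self.m[+1] - self.m[-1])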
ES2005-48
Averaging on Riemannian manifolds and unsupervised learning using neural associative memory
Nowicki Dimitri, Oleksiy Dekhtyarenko
Abstract:
This paper is dedicated to a new algorithm for unsupervised learning and clustering. The algorithm is based on a Hopfield-type pseudoinverse associative memory. We propose to represent synaptic matrices of this type of neural network as points on the Grassmann manifold. We then establish a procedure of generalized averaging on this manifold. This procedure enables us to endow the associative memory with the ability to generalize data. In the paper we provide experimental testing of the algorithm using simulated random data. After the synthesis of an associative memory containing the generalized data, cluster centers are retrieved using a procedure of associative recall with random starts.
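For context, a sketch of the Hopfield-type pseudoinverse (projection) associative memory the method builds on, together with associative recall from a random start; the Grassmann-manifold averaging of several synaptic matrices, which is the paper's contribution, is not reproduced here.

    import numpy as np

    def pseudoinverse_memory(patterns):
        # patterns: array of shape (dim, n_patterns) with entries in {-1, +1};
        # W = X X^+ projects onto the subspace spanned by the stored patterns
        return patterns @ np.linalg.pinv(patterns)

    def recall(W, rng, iters=50):
        s = rng.choice([-1.0, 1.0], size=W.shape[0])   # random start
        for _ in range(iters):
            s = np.sign(W @ s)
            s[s == 0] = 1.0                            # break ties away from zero
        return s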
ES2005-57
A Stability Condition for Neural Network Control of Uncertain Systems
Pornchai Khlaeo-om, Suwat Kuntanapreeda
Abstract:
This paper derives a stability condition for neural network control systems in which the parameters of the controlled systems are uncertain. The stability condition can be imposed during training to guarantee the stability of the control systems. The controller is a single-hidden-layer, feedforward neural network. The controlled system is assumed to be full-state accessible and to be modelled as a linear uncertain system. Stability is confirmed by the existence of a Lyapunov function of the closed-loop systems. A simulation result on Van der Pol's equation with parametric uncertainty is presented to demonstrate an application of the condition. A modified backpropagation algorithm with a model reference technique is used to train the controller.
ES2005-63
A new approach based on wavelet-ICA algorithms for fetal electrocardiogram extraction
Bruno Azzerboni, Fabio La Foresta, Nadia Mammone, Francesco Carlo Morabito
Abstract:
The fetal electrocardiogram (fECG) is a non-invasive technique for monitoring the condition of the fetus during pregnancy; it consists of collecting electrical signals with sensors placed on the mother's body. Independent Component Analysis (ICA) has been exploited with success to isolate the fECG; the extracted fECG has then been denoised by wavelet post-processing. Here we propose to use the Wavelet-ICA method, based on the joint use of Wavelet Analysis and ICA, in order to improve the extraction performance. We also show a comparison between the techniques and discuss the advantages of the proposed method.
ES2005-65
Graph-based normalization
Catherine Aaron
Abstract:
In this paper we construct a graph-based normalisation algorithm for non-linear data. The principle of this algorithm is to make the graph equivalent in all directions. In a first section we show why this algorithm can be useful as a preliminary step for neural algorithms such as those that need to compute geodesic distances. We then present the algorithm, its stochastic version and some graphical results. Finally, we observe the effects of the algorithm on the reconstruction of geodesic distances by running Dijkstra's algorithm.
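For reference, the geodesic-distance computation that such a normalisation is meant to improve: build a k-nearest-neighbour graph on the (normalised) data and run Dijkstra's algorithm on it. The choice k = 10 is illustrative.

    import numpy as np
    from sklearn.neighbors import kneighbors_graph
    from scipy.sparse.csgraph import dijkstra

    def geodesic_distances(X, k=10):
        graph = kneighbors_graph(X, n_neighbors=k, mode="distance")  # weighted kNN graph
        return dijkstra(graph, directed=False)                       # all-pairs geodesic estimates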
ES2005-66
Domain expert approximation through oracle learning
Joshua Menke, Tony Martinez
Abstract:
In theory, improved generalization accuracy can be obtained by training separate learning models as "experts" over subparts of a given application domain. For example, given an application with both clean and noisy data, one solution is to train a single classifier on a set of both clean and noisy data. More accurate results can be obtained by training separate expert classifiers, one for clean data and one for noisy data, and then using the appropriate classifier depending on the environment. Unfortunately, it is usually difficult to distinguish between clean and noisy data outside of training. We present a novel approach using oracle learning to approximate the clean and noisy domain experts with one learning model. On a set of both noisy and clean optical character recognition data, using oracle learning to approximate domain experts resulted in a statistically significant improvement (p < 0.0001) over using a single classifier trained on mixed data.
ES2005-68
Generalised Cross Validation for Noise-Free Data
Tony Dodd, Tun Ladoni
Abstract:
Whilst machine learning is principally concerned with function approximation from noisy data, there are situations where the data may be noise-free. This arises, for example, in metamodelling, where we seek models of computationally expensive, high-fidelity simulation models. In this paper we derive a noise-free version of generalised cross validation (GCV) which can be used for model selection and hyperparameter estimation in metamodelling. This noise-free GCV measure is applied to the determination of the optimal kernel width in a reproducing kernel Hilbert space interpolation problem.
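For reference, the classical (noisy-data) GCV score that the paper adapts, applied here to choosing an RBF kernel width for regularised kernel regression; with exact noise-free interpolation the hat matrix equals the identity and this classical score degenerates to 0/0, which is precisely what motivates a noise-free variant. The candidate widths and the ridge term are illustrative assumptions.

    import numpy as np

    def gcv_score(X, y, width, ridge=1e-3):
        sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
        K = np.exp(-sq / (2.0 * width ** 2))               # RBF Gram matrix
        n = len(y)
        H = K @ np.linalg.inv(K + ridge * np.eye(n))       # hat matrix of the regularised fit
        r = y - H @ y
        return n * (r @ r) / np.trace(np.eye(n) - H) ** 2  # GCV(width)

    # the kernel width is chosen by minimising the score over a grid of candidates:
    # best_width = min([0.1, 0.3, 1.0, 3.0], key=lambda w: gcv_score(X, y, w))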
Perceptrons and Multi-Layer Perceptrons
ES2005-88
Neural network classification using Shannon's entropy
Luis Silva, Joaquim Marques de Sá, Luis Alexandre
Abstract:
Recent years have witnessed increasing attention to entropy-based criteria in adaptive systems. Several principles have been proposed based on the maximization or minimization of entropic cost functions. We propose a new type of neural network classifier with multilayer perceptron (MLP) architecture, but where the usual mean square error minimization principle is substituted by the minimization of Shannon's entropy of the differences between the MLP's output and the desired target. The backpropagation algorithm is optimized with a variable learning rate and tested on five well-known datasets. The results show a very good performance of MLPs trained with Shannon's entropy when compared with the mean square error and cross-entropy criteria.
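As a rough illustration of the kind of cost involved (not the authors' code), Shannon's entropy of the output errors can be estimated with a simple histogram plug-in estimator; the bin count is an arbitrary assumption made for the example.

import numpy as np

def error_entropy(outputs, targets, bins=30):
    """Histogram plug-in estimate of Shannon's entropy of the errors e = y - t."""
    errors = (np.asarray(outputs) - np.asarray(targets)).ravel()
    counts, edges = np.histogram(errors, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]                      # drop empty bins
    width = edges[1] - edges[0]
    # differential entropy estimate; minimising it concentrates the errors
    return -np.sum(p * np.log(p / width))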
ES2005-129
Efficient estimation of multidimensional regression model with multilayer perceptron
Joseph Rynkiewicz
Abstract:
This work concerns estimation of multidimensional nonlinear regression models using multilayer perceptron (MLP). The main problem with such models is that we have to know the covariance matrix of the noise to get an optimal estimator. However, we show that, if we choose as cost function the determinant of the empirical error covariance matrix, or more precisely the logarithm of this determinant, we get an asymptotically optimal estimator.
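In the notation of a generic regression model $y_t = F_\theta(x_t) + \varepsilon_t$ (my notation, not the paper's), the cost function described in the abstract reads
$$
C(\theta) \;=\; \log\det\!\left(\frac{1}{n}\sum_{t=1}^{n}\big(y_t - F_\theta(x_t)\big)\big(y_t - F_\theta(x_t)\big)^{\top}\right),
$$
i.e. the log-determinant of the empirical covariance matrix of the residuals.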
ES2005-83
Performance of EMI based mine detection using back-propagation neural networks
Matthew Draper, Taskin Kocak
Abstract:
We propose and evaluate a neural network approach to mine detection using Electromagnetic Induction (EMI) sensors which provides a robust non-parametric approach. In our approach, a neural network with the well-known back-propagation learning algorithm combines the S-Statistic with the Delta-Technique to discriminate between non-mine patterns and mines. Experimental results show that this approach reduces false alarms substantially over using just the Delta-Technique or the energy detector.
ES2005-85
Perceptron Learning with Discrete Weights
Joaquim Marques de Sá, Carlos Felgueiras
Abstract:
Perceptron learning bounds with real weights have been presented by several authors. In the present paper we study the perceptron learning task when using integer weights in $[-k, k]^{d + 1}$. We present a sample complexity formula based on an exact counting result of the finite class of functions implemented by the perceptron, and show that this bound is less pessimistic than existing bounds for the discrete and, in certain conditions, also for the continuous weight cases.
Evolutionary and neural computation
ES2005-151
Synergies between Evolutionary and Neural Computation
Christian Igel, Bernhard Sendhoff
ES2005-132
Evolutionary framework for the construction of diverse hybrid ensembles
Arjun Chandra, Xin Yao
Abstract:
Enforcing diversity explicitly in ensembles, while at the same time making the individual predictors accurate, has been shown to be promising. This idea was recently taken into account in the algorithm DIVACE. There have been a multitude of theories on how one can enforce diversity within a combined predictor setup. This paper aims to bring these theories together in an attempt to synthesise a framework that can be used to engender new evolutionary ensemble learning algorithms. The framework treats diversity and accuracy as evolutionary pressures that can be exerted at multiple levels of abstraction and is shown to be effective.
ES2005-72
Efficient reinforcement learning through Evolutionary Acquisition of Neural Topologies
Yohannes Kassahun, Gerald Sommer
Abstract:
In this paper we present a novel method, called Evolutionary Acquisition of Neural Topologies (EANT), of evolving the structure and weights of neural networks. The method introduces an efficient and compact genetic encoding of a neural network onto a linear genome that enables one to evaluate the network without decoding it. The method explores new structures whenever it is not possible to further exploit the structures found so far. This enables it to find minimal neural structures for solving a given learning task. We tested the algorithm on a benchmark control task and found it to perform very well.
ES2005-71
Evolving neural networks: Is it really worth the effort?
John Bullinaria
Abstract:
The idea of using simulated evolution to create neural networks that learn faster and generalize better is becoming increasingly widespread. However, such evolutionary processes are usually extremely computationally intensive. In this paper, I present an empirical study to investigate whether the improved performance obtained really does justify the extra effort, and whether it might be possible to extract some general principles from existing evolved networks that can be applied directly to our hand-crafted networks.
ES2005-64
Efficient evolutionary optimization using individual-based evolution control and neural networks: A comparative study
Lars Gräning, Yaochu Jin, Bernhard Sendhoff
Abstract:
To reduce the number of expensive fitness function evaluations in evolutionary optimization, several individual-based and generation-based evolution control methods have been suggested. This paper compares four individual-based evolution control frameworks on three widely used test functions. Feedforward neural networks are employed for fitness estimation. Two conclusions can be drawn from our simulation results. First, the pre-selection strategy seems to be the most stable individual-based evolution control method. Second, structure optimization of neural networks mostly improves the performance of all compared algorithms.
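As a sketch of what individual-based evolution control by pre-selection typically looks like (an illustrative reading of the strategy, not the paper's implementation; the population sizes and mutation scale are arbitrary assumptions):

import numpy as np

def pre_selection_step(parents, surrogate, true_fitness,
                       n_pre=40, n_offspring=10, sigma=0.1, rng=None):
    """One generation of pre-selection: many candidate offspring are screened with a
    cheap surrogate model and only the most promising ones are evaluated with the
    expensive true fitness function."""
    rng = rng or np.random.default_rng()
    # generate a large pre-population by Gaussian mutation of randomly chosen parents
    idx = rng.integers(len(parents), size=n_pre)
    candidates = parents[idx] + sigma * rng.standard_normal((n_pre, parents.shape[1]))
    # rank candidates with the surrogate (e.g. a trained feedforward network)
    predicted = np.array([surrogate(c) for c in candidates])
    chosen = candidates[np.argsort(predicted)[:n_offspring]]   # minimisation
    # only the pre-selected offspring receive expensive evaluations
    evaluated = np.array([true_fitness(c) for c in chosen])
    return chosen, evaluated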
ES2005-123
Applications of multi-objective structure optimization
Alexander Gepperth, Stefan Roth
Abstract:
We present an application of multi-objective evolutionary optimization of feed-forward neural networks (NN) to two real world problems, car and face classification. The possibly conflicting requirements on the NN are speed and classification accuracy, both of which can enhance the embedding systems as a whole. We compare the results to the outcome of a greedy optimization heuristic (magnitude-based pruning) coupled with a multi-objective performance evaluation. For the car classification problem, magnitude-based pruning yields competitive results, whereas for the more difficult face classification, we find that the evolutionary approach to NN design is clearly preferable.
Independent Component Analysis
ES2005-49
Empirical evidence of the linear nature of magnetoencephalograms
Antti Honkela, Tomas Östman, Ricardo Vigário
Abstract:
Over recent years many algorithms have been used for the analysis of electro- and magnetoencephalograms, assuming a linear model for the mixing of cortical activity at the sensor plane. Such linearity can be theoretically justified, through the Maxwell equations. In the present paper we exploit the adaptive and modular nature of the variational Bayesian hierarchical nonlinear factor analysis to give empirical evidence of linearity, as well as to estimate the intrinsic dimension of the generative source space.
ES2005-95
To apply score function difference based ICA algorithms to high-dimensional data
Kun Zhang, Lai-Wan Chan
Abstract:
Recently, the score function difference (SFD) has been applied to develop ICA algorithms. But such algorithms are not suitable for high-dimensional data because the SFD estimation in a high-dimensional space is problematic due to the ``curse of dimensionality''. In this paper, by investigating the relationship between mutual independence and pairwise independence, we develop an approach for ICA with linear instantaneous mixtures or convolutive mixtures based on pairwise independence. In this approach only the computation of the 2-dimensional SFD is involved and it can be directly applied to high-dimensional data. The experimental result illustrates the usefulness of this approach.
ES2005-143
Generative independent component analysis for EEG classification
Silvia Chiappa, David Barber
Abstract:
We present an application of Independent Component Analysis (ICA) to the discrimination of mental tasks for EEG-based Brain Computer Interface systems. ICA is most commonly used with EEG for artifact identification with little work on the use of ICA for direct discrimination of different types of EEG signals. By viewing ICA as a generative model, we can use Bayes' rule to form a classifier. This enables us also to investigate whether simple spatial information is sufficiently informative to produce state-of-the-art results when compared to more traditional methods based on using temporal features as inputs to off-the-shelf classifiers. Experiments conducted on two subjects suggest that knowing `where' activity is happening alone gives encouraging results.
Classification using non-standard metrics
ES2005-150
Classification using non-standard metrics
Barbara Hammer, Thomas Villmann
ES2005-105
Clustering using a random walk based distance measure
Luh Yen, Denis Vanvyve, Fabien Wouters, François Fouss, Michel Verleysen, Marco Saerens
Abstract:
This work proposes a simple way to improve a clustering algorithm. The idea is to exploit a new distance metric called the “Euclidean Commute Time” (ECT) distance, based on a random walk model on a graph derived from the data. Using this distance measure instead of the usual Euclidean distance in a k-means algorithm makes it possible to retrieve well-separated clusters of arbitrary shape, without any working hypothesis about the data distribution. Experimental results show that the use of this new distance measure significantly improves the quality of the clustering on the tested data sets.
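As an illustration of the quantity involved (a generic sketch based on the standard commute-time formula, not the authors' code), the ECT distance can be computed from the pseudo-inverse of the graph Laplacian:

import numpy as np

def euclidean_commute_time_distances(W):
    """Pairwise ECT distances for a weighted undirected graph with adjacency matrix W."""
    d = W.sum(axis=1)                     # node degrees
    vol = d.sum()                         # graph volume
    L = np.diag(d) - W                    # combinatorial Laplacian
    Lp = np.linalg.pinv(L)                # Moore-Penrose pseudo-inverse
    diag = np.diag(Lp)
    # average commute time n(i,j) = vol * (L+_ii + L+_jj - 2 L+_ij)
    n = vol * (diag[:, None] + diag[None, :] - 2.0 * Lp)
    return np.sqrt(np.maximum(n, 0.0))    # the ECT distance is its square root

The resulting distance matrix can then replace the Euclidean distance in a k-means-style assignment step.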
ES2005-131
A probabilistic framework for mismatch and profile string kernels
Alexei Vinokourov, Andrei Soklakov, Craig Saunders
Abstract:
There have recently been numerous applications of kernel methods in the field of bioinformatics. In particular, the problem of protein homology has served as a benchmark for the performance of many new kernels which operate directly on strings (such as amino-acid sequences). Several new kernels have been developed and successfully applied to this type of data, including spectrum, string, mismatch, and profile kernels. In this paper we introduce a general probabilistic framework for string kernels which uses the Fisher-kernel approach and includes spectrum, mismatch and profile kernels, among others, as special cases. The use of a probabilistic model however provides additional flexibility both in definition and for the re-weighting of features through feature selection methods, prior knowledge or semi-supervised approaches which use data repositories such as BLAST. We give details of the framework and also give preliminary experimental results which show the applicability of the technique.
ES2005-109
Generalized Relevance LVQ with Correlation Measures for Biological Data
Marc Strickert, Nese Sreenivasulu, Winfriede Weschke, Udo Seiffert, Thomas Villmann
Abstract:
Generalized Relevance Learning Vector Quantization (GRLVQ) is combined with correlation-based similarity measures. These are derived from the Pearson correlation coefficient in order to replace the adaptive squared Euclidean distance which is typically used for GRLVQ. Patterns can thus be used without further preprocessing and compared in a manner invariant to data shifting and scaling transforms. High accuracies are demonstrated for a reference experiment of handwritten character recognition and good discrimination ability is shown for the detection of systematic differences between gene expression experiments.
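As a minimal sketch of the kind of measure the abstract refers to (generic, not the GRLVQ implementation), a Pearson-correlation-based dissimilarity is invariant to shifting and scaling of the patterns:

import numpy as np

def pearson_dissimilarity(x, w):
    """d(x, w) = 1 - r(x, w), where r is the Pearson correlation coefficient."""
    xc = x - x.mean()
    wc = w - w.mean()
    r = np.dot(xc, wc) / (np.linalg.norm(xc) * np.linalg.norm(wc) + 1e-12)
    return 1.0 - r

In a relevance-learning setting, the terms entering the correlation can additionally be weighted by adaptive relevance factors.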
ES2005-116
Non-Euclidean metrics for similarity search in noisy datasets
Damien Francois, Vincent Wertz, Michel Verleysen
Abstract:
In the context of classification, the dissimilarity between data elements is often measured by a metric defined on the data space. The choice of the metric, however, is often disregarded and the Euclidean distance is used without further inquiries. This paper illustrates the fact that when other noise schemes than the white Gaussian noise are encountered, it can be interesting to use alternative metrics for similarity search.
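One simple family of alternative metrics (used here purely as an illustration; the paper's own choices may differ) is the Minkowski distance of order p, where p = 2 recovers the usual Euclidean case:

import numpy as np

def minkowski_distance(a, b, p=2.0):
    """Minkowski distance of order p between two vectors a and b."""
    return np.sum(np.abs(a - b) ** p) ** (1.0 / p)

# city-block (p = 1), Euclidean (p = 2) and fractional (e.g. p = 0.5) variants
# can then be compared on the same nearest-neighbour search task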
ES2005-145
Unsupervised fuzzy ensembles and their use in intrusion detection
Paul Evangelista, Piero Bonissone, Mark Embrechts, Boleslaw Szymanski
Abstract:
This paper proposes a novel method for unsupervised ensembles that specifically addresses unbalanced, unsupervised, binary classification problems. Unsupervised learning often experiences the curse of dimensionality; however, subspace modeling can overcome this problem. For each subspace created, the classifier produces a decision value. The aggregation of the decision values occurs through the use of fuzzy logic, creating the fuzzy ROC curve. The one-class SVM is utilized for unsupervised classification. The primary source of data for this research is a host-based computer intrusion detection dataset.
ES2005-138
Usage Guided Clustering of Web Pages with the Median Self Organizing Map
Fabrice Rossi, Aïcha El Golli, Yves Lechevallier
Abstract:
Web Usage Mining aims at improving Web sites through the analysis of the behavior of their users. This paper proposes to cluster the web pages of a web site based on usage data. In large web sites, clustering individual pages is not feasible; therefore, the proposed method is based on a prior clustering of pages that uses semantic information about the site, such as its organization on the server. Usage-based dissimilarities between prior clusters are then defined, and an adaptation of the Self-Organizing Map to such data is used to provide visualization and clustering of groups of pages of the site. The method is illustrated on the web site of INRIA.
ES2005-17
Mixed Topological Map
Mustapha Lebbah, Aymeric Chazottes, Fouad Badran, Sylvie Thiria
Abstract:
We propose a new algorithm which is based on a topological map model and dedicated to mixed data, with numerical and binary components. The algorithm computes the referent vectors directly, as mixed data vectors sharing the same interpretation as the observations. The method is validated on real data from the ocean colour domain.
ES2005-47
Linear algebra for time series of spikes
Andrew Carnell, Daniel Richardson
Abstract:
The set of time series of spikes is expanded into a vector space, V, by taking all linear combinations. A definition is then given for an inner product on this vector space. This gives a definition of norm, of distance between time series, and of orthogonality. This also allows us to compute the best approximation to a given time series which can be formed by a linear combination of some given time series. It is shown how this can be applied to the problem of learning for liquid state machines.
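A common way to realise such an inner product (one possible construction, offered only as an illustration; the paper's exact definition may differ) is to replace each spike by an exponentially decaying trace, in which case the inner product reduces, up to a constant factor, to a sum of pairwise kernel terms:

import numpy as np

def spike_inner_product(s, t, tau=0.01):
    """Inner product between two spike trains s and t (arrays of spike times)."""
    s = np.asarray(s, dtype=float)[:, None]
    t = np.asarray(t, dtype=float)[None, :]
    return np.sum(np.exp(-np.abs(s - t) / tau))

def spike_distance(s, t, tau=0.01):
    """Induced distance ||s - t|| = sqrt(<s,s> - 2<s,t> + <t,t>)."""
    return np.sqrt(spike_inner_product(s, s, tau)
                   - 2.0 * spike_inner_product(s, t, tau)
                   + spike_inner_product(t, t, tau))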
Learning III
ES2005-74
Relevance determination in reinforcement learning
Katharina Tluk v. Toschanowitz, Barbara Hammer, Helge Ritter
Abstract:
We propose relevance determination and minimisation schemes in reinforcement learning which are solely based on the Q-matrix and which can thus be applied during training without prior knowledge about the system dynamics. On the one hand, we judge the relevance of separate state space dimensions based on the variance in the Q-matrix. On the other hand, we perform Q-matrix reduction by means of a combination of Q-learning with neighbourhood cooperation of the state values, where the neighbourhood is defined based on the Q-values themselves. The effectiveness of the methods is shown in a (simple though relevant) gridworld example.
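One plausible reading of the variance-based relevance criterion (an illustrative sketch only; the authors' exact definition may differ) is to measure how strongly the Q-values vary along each state-space dimension:

import numpy as np

def dimension_relevance(Q, state_shape):
    """Relevance score per state dimension from a tabular Q-matrix of shape
    (n_states, n_actions), where states are indexed on a grid of shape state_shape."""
    Qs = Q.reshape(*state_shape, -1)          # (dim_1, ..., dim_d, n_actions)
    relevances = []
    for axis in range(len(state_shape)):
        # variance of Q along this dimension, averaged over all other indices;
        # a dimension along which Q barely changes is a candidate for removal
        relevances.append(Qs.var(axis=axis).mean())
    return np.array(relevances)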
ES2005-75
Feature selection for high-dimensional industrial data
Michael Bensch, Michael Schröder, Martin Bogdan, Wolfgang Rosenstiel
Abstract:
In the semiconductor industry the number of circuits per chip is still drastically increasing. This fact and strong competition lead to the particular importance of quality control and quality assurance. As a result a vast amount of data is recorded during the fabrication process, which is very complex in structure and heavily affected by noise. The evaluation of this data is a vital task to support engineers in the analysis of process problems. The current work tackles this problem by identifying the features responsible for success or failure in the manufacturing process (feature selection).
ES2005-76
Adaptive robot learning in a non-stationary environment
Kary Främling
Abstract:
Adaptive control is challenging in real-world applications such as robotics. Learning has to be rapid enough to be performed in real time and to avoid damage to the robot. Models using linear function approximation are interesting in such tasks because they offer rapid learning and have small memory and processing requirements. This makes them suitable as adaptive controllers in non-stationary environments, especially when the controller needs to be an embedded system. Experiments with a light-seeking robot illustrate how the robot adapts to the environment by Reinforcement Learning where the robot collects training samples by exploring the environment.
ES2005-79
Phase transition in sparse associative neural networks
Oleksiy Dekhtyarenko, Valery Tereshko, Colin Fyfe
Abstract:
We study the phenomenon of phase transition occurring in sparse associative neural networks, which is characterized by the abrupt emergence of associative properties with the growth of network connectivity. It is shown that this discontinuous behaviour is caused by the specific way of architecture selection. Based on empirical results, a relationship among the critical parameters is suggested.
ES2005-89
Neuromimetic model of interval timing
Claude Touzet, Pierrick Demoulin, Boris Burle, Franck Vidal, Françoise Macar
Abstract:
Neuromimetic models of time processing mechanisms in the sub-second to minute range are mainly focussed on the mean and variance properties of time estimation (scalarity) but offer no appropriate account of attention manipulations: a systematic underestimation of time with decreasing levels of attention. Our model is able to reproduce the scalarity and attentional effects, and fits both behavioral and brain imaging data.
ES2005-92
Contextual Processing of Graphs using Self-Organizing Maps
Markus Hagenbuchner, Alessandro Sperduti, Ah Chung Tsoi
Abstract:
This paper introduces a novel approach to Self-Organizing Maps which are capable of processing graphs such that the context of vertices and sub-graphs is considered in the mapping process. The result is that any vertex in a graph is mapped onto an n-dimensional map depending on its label and the graph structure as a whole. Experimental results demonstrate that the proposed approach achieves the desired outcomes.
ES2005-97
Structural feature selection for wrapper methods
Gianluca Bontempi
Abstract:
The wrapper approach to feature selection requires the assessment of several subset alternatives and the selection of the one which is expected to have the lowest generalization error. To tackle this problem, practitioners often have recourse to a search procedure in a very large space of subsets of features, aiming to minimize a leave-one-out or, more generally, a cross-validation criterion. It has previously been discussed in the literature how this practice can lead to a strong selection bias in the case of high dimensionality problems. We propose here an alternative method, inspired by structural identification in model selection, which replaces a single global search by a number of searches into a sequence of nested spaces of features with an increasing number of variables. The paper presents some promising, although preliminary, results on several real nonlinear regression problems.
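As a rough sketch of the general idea of searching within nested feature spaces (an editor's illustration using a k-NN regressor and exhaustive search within each subset size, not the author's procedure):

import numpy as np
from itertools import combinations
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor

def nested_wrapper_selection(X, y, max_size=3, cv=5):
    """For each subset size s = 1..max_size, search only subsets of that size,
    keep the best one, and finally compare the winners across sizes."""
    winners = []
    for s in range(1, max_size + 1):
        best = None
        for subset in combinations(range(X.shape[1]), s):
            score = cross_val_score(KNeighborsRegressor(), X[:, list(subset)], y, cv=cv).mean()
            if best is None or score > best[0]:
                best = (score, subset)
        winners.append(best)
    return max(winners)          # (score, feature indices) of the overall winner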
ES2005-98
Coverage-performance estimation for classification with ambiguous data
Thomas Trappenberg
Abstract:
Classifier tradeoffs between accuracy and specificity are often analyzed with receiver operating characteristic (ROC) curves. Here we study a related analysis of the data in terms of coverage-performance curves (CPC) which more clearly indicate the presence of ambiguous data in classification problems with overlapping class distributions. We show that feedforward mapping networks are well suited to derive such curves with minimal effort. Based on such classifiers we can identify data that need further analysis before attempting classification with sufficient confidence.
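A coverage-performance curve can be obtained by sweeping a confidence threshold over the classifier outputs; the following is a generic sketch of that idea (not taken from the paper):

import numpy as np

def coverage_performance_curve(confidences, correct):
    """Sort test examples by decreasing confidence; for each threshold, report the
    fraction of examples still classified (coverage) and the accuracy on them."""
    order = np.argsort(-np.asarray(confidences))
    correct = np.asarray(correct, dtype=float)[order]
    n = len(correct)
    coverage = np.arange(1, n + 1) / n
    performance = np.cumsum(correct) / np.arange(1, n + 1)
    return coverage, performance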
ES2005-102
UWB radar target identification based on linear RBFNN
Min Wang, Shuyuan Yang, Shunjun Wu
Abstract:
In this paper, a radial-basis-function neural network (RBFNN) with an efficient linear learning algorithm is presented for the identification of target profiles of Ultra Wideband (UWB) radar. This linear RBFNN has both good localized approximation ability and computational complexity that is linear in the input dimension and the number of inputs. Its performance is comparable with that of the support vector machine (SVM) for pattern recognition and regression tasks, and its processing speed is faster than that of the SVM thanks to the linear learning. We applied the linear RBFNN to the identification of target profiles in UWB radar, which requires very fast processing because of the very short impulses of UWB radar. Good results are achieved, including a high recognition rate and short processing time, superior to its counterparts.
Biologically inspired models
ES2005-99
A multi-modular associator network for simple temporal sequence learning and generation
Michael Lawrence, Thomas Trappenberg, Alan Fine
Abstract:
Temporal sequence generation readily occurs in nature, for example when performing a series of motor movements or recalling a sequence of episodic memories. Proposed networks which perform temporal sequence generation are often in the form of a modification to an auto-associative memory by using hetero-associative or time-varying synaptic strengths, requiring some pre-chosen temporal functions. In the proposed multi-modular associator network, intra-modular synapses are trained auto-associatively with a Hebb rule, while a set of inter-module synapses are hetero-associative. Our model is compared to one by Lisman, which uses hetero-associative recurrent synapses in one of the modules, and auto-associative synapses between modules.
ES2005-73
Attractor neural networks with patchy connectivity
Christopher Johansson, Martin Rehn, Anders Lansner
Abstract:
We investigate the effects of patchy (clustered) connectivity in sparsely connected attractor neural networks (NNs). This study is motivated by the fact that the connectivity of pyramidal neurons in layer II/III of the mammalian visual cortex is patchy and sparse. The storage capacity of attractor NNs of Hopfield and Willshaw type with this kind of connectivity is investigated analytically as well as by simulation experiments. We find that patchy connectivity gives a higher storage capacity, given an overall sparse connectivity and regardless of network model.
ES2005-158
Isolated word recognition using a Liquid State Machine
David Verstraeten, Benjamin Schrauwen, Dirk Stroobandt
Abstract:
An implementation of the recently proposed concept of the Liquid State Machine using a Spiking Neural Network (SNN) is trained to perform isolated word recognition. We investigate two different speech front ends and different ways of coding the inputs into spike trains. The robustness against noise added to the speech is also briefly investigated. It turns out that a biologically realistic configuration of the LSM gives the best result, and that its performance rivals that of a state-of-the-art speech recognition system.
ES2005-118
New evidences for sparse coding strategy employed in visual neurons: from the image processing and nonlinear approximation viewpoint
Shan Tan, Licheng Jiao
Abstract:
Sparse coding is a ubiquitous strategy employed in the sensory information processing system of mammals. Some work has focused on the validation of this strategy through finding the sparse components of sensory input and then illustrating the fact that the resulting basis functions or corresponding filter responses have receptive fields visually similar to those found in primary visual cortex (V1). In this review, we show that several newly proposed systems in the area of image processing and nonlinear approximation provide new evidence for the sparse coding strategy along a contrary line. Inspired by the properties of the receptive fields of neurons in V1, the basis functions of these systems are constructed with special structures, namely being band-pass, localized and multi-oriented. Interestingly, these systems can sparsely represent the special classes of images dominated by edges.
Kernel methods and the exponential family
ES2005-149
Kernel methods and the exponential family
Stéphane Canu, Alex Smola
Abstract:
The success of the Support Vector Machine (SVM) gave rise to the development of a new class of theoretically elegant learning machines which use the central concept of kernels and the associated reproducing kernel Hilbert space (rkhs). In this paper we discuss how exponential families, a standard tool in statistics, can be used with great success in machine learning to unify many existing kernel-based algorithms (such as the SVM) and to invent novel ones quite effortlessly. A new derivation of the novelty detection algorithm based on the one-class SVM is proposed to illustrate the power of the exponential family model in a rkhs.
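The core construction can be summarised as follows (standard notation, not quoted from the paper): an exponential family in a rkhs $\mathcal{H}$ with kernel $k$ takes the sufficient statistic to be the feature map itself,
$$
p(x \mid \theta) \;=\; \exp\big(\langle \theta, k(x,\cdot)\rangle_{\mathcal{H}} - g(\theta)\big),
\qquad
g(\theta) \;=\; \log \int \exp\big(\langle \theta, k(x,\cdot)\rangle_{\mathcal{H}}\big)\, dx,
$$
so that estimating $\theta \in \mathcal{H}$ with a suitable regulariser recovers kernel algorithms such as the one-class SVM as special cases.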
ES2005-90
Joint Regularization
Karsten M. Borgwardt, Omri Guttman, S.V.N. Vishwanathan, Alex Smola
Abstract:
We present a principled method to combine kernels under joint regularization constraints. Central to our method is an extension of the representer theorem for handling multiple joint regularization constraints. Experimental evidence shows the feasibility of our approach.
ES2005-160
A Class of Kernels For Sets of Vectors
Frédéric Desobry, Manuel Davy, William Fitzgerald
Abstract:
In some important applications such as speaker recognition or image texture classification, the data to be processed are sets of vectors. As opposed to standard settings where the data are individual vectors, it is difficult to design a reliable kernel between sets of vectors of possibly different cardinality. In this paper, we build kernels between sets of vectors from the level sets of the probability density function estimated for each set of vectors, where a pdf level set is (roughly) the part of the space where most of the data lie.
ES2005-121
Support Vector Machine For Functional Data Classification
Nathalie Villa, Fabrice Rossi
Abstract:
Functional data analysis is a growing research field and numerous works present a generalization of the classical statistical methods to function classification or regression. In this paper, we focus on the problem of using Support Vector Machines (SVMs) for curve discrimination. We recall that important theoretical results for SVMs apply in functional space and propose simple functional kernels that take advantage of the nature of the data. Those kernels are illustrated on a spectrometric real world benchmark.
ES2005-111
Translation invariant classification of non-stationary signals
Vincent Guigue, Alain Rakotomamonjy, Stéphane Canu
Abstract:
Non-stationary signal classification is a difficult and complex problem. On top of that, we add the following hypothesis: each signal includes a discriminant waveform, the position of which is random and unknown. This is a problem that may arise in Brain Computer Interface (BCI). The aim of this article is to provide a new description to classify this kind of data. This representation must characterize the waveform without reference to the absolute time position of the pattern in the signal. We will show that it is possible to create a signal description using graphs on a time-scale representation. The definition of an inner product between graphs is then required to implement a classification algorithm. Our experimental results showed that this approach is very promising.
Applications
ES2005-91
Chemical similarity searching using a neural graph matcher
Stefan Klinger, Jim Austin
Abstract:
A neural graph matcher based on Correlation Matrix Memories is evaluated in terms of efficiency and effectiveness against two maximum common subgraph (MCS) algorithms. The algorithm removes implausible solutions below a user-defined threshold and runs faster than conventional MCS methods on our database of chemical graphs while being slightly less effective.
ES2005-126
SVM and pattern-enriched common fate graphs for the game of go
Liva Ralaivola, Lin Wu, Pierre Baldi
Abstract:
We propose a pattern-based approach combined with the concept of Enriched Common Fate Graph for the problem of classifying Go positions. A kernel function for weighted graphs to compute the similarity between two board positions is proposed and used to learn a support vector machine and address the problem of position evaluation. Numerical simulations are carried out using a set of human played games and show the relevance of our approach.
ES2005-146
Using CMU PIE Human Face Database to a Convolutional Neural Network - Neocognitron
José Hiroki Saito, Thiago Vieira de Carvalho, Marcelo Hirakuri, André Saunite, Alessandro Noriaki Ide, Sandra Abib
Abstract:
This work presents the application of the neocognitron to human face recognition. Using a large-scale human face database (CMU PIE), the optimal thresholds of the neocognitron for human face recognition are determined. In a first experiment, the activation thresholds of the neocognitron are increased until stable values are obtained; these values are then used in a second experiment in which the number of training images per subject is increased. The results show that a small number of training images per subject is enough to obtain a very high recognition rate (98%) on the frontal pose images from the database.
This work presents the application of neocognitron to the human face recognition. Using a large scale human face database (CMU PIE), it is verified the optimal thresholds of the neocognitron to human face recognition. During the first experiment, increasing the activation thresholds of the neocognitron it is obtained their stable values to be used in the second experiment increasing the number of training images per subject. As a result it is verified that a few number of training images per subjects is enough to obtain very high recognition rate (98%) to the frontal pose images from the database.
ES2005-159
Morphological memories for feature extraction in hyperspectral images
Manuel Graña, Xabier Albizuri
Abstract:
In previous papers we proposed Associative Morphological Memories (AMM) as tools for endmember extraction in hyperspectral images. Linear Spectral Unmixing (LSU) based on these endmembers is a kind of unsupervised image segmentation. In this paper we propose that the fractional abundance coefficients may be used as features for the construction of supervised pixel spectra classifiers. We compare them with two well-known linear feature extraction algorithms: Principal Component Analysis (PCA) and Independent Component Analysis (ICA).
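As a rough illustration of how abundance-based features arise, the following minimal Python sketch performs unconstrained least-squares unmixing against an already-extracted endmember matrix; the function name and interface are hypothetical and not taken from the paper.

    import numpy as np

    def unmix(pixel, E):
        # E: (bands x endmembers) matrix of extracted endmember spectra,
        # pixel: (bands,) spectrum; returns fractional abundance coefficients a
        # such that pixel ~ E @ a (no positivity or sum-to-one constraints here).
        a, *_ = np.linalg.lstsq(E, pixel, rcond=None)
        return a  # these coefficients would serve as features for a supervised classifier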
Learning IV
ES2005-103
Mutual information and gamma test for input selection
Nima Reyhani, Jin Hao, Yongnan Ji, Amaury Lendasse
Abstract:
In this paper, input selection is performed using two different approaches. The first approach is based on the Gamma test. This test estimates the mean square error (MSE) that can be achieved without overfitting. The best set of inputs is the one that minimises the result of the Gamma test. The second method estimates the Mutual Information between a set of inputs and the output. The best set of inputs is the one that maximises the Mutual Information. Both methods are applied for the selection of the inputs for function approximation and time series prediction problems.
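For readers unfamiliar with the Gamma test, the sketch below shows one common formulation: the statistic is the intercept of a regression of half the squared output differences on the squared input distances of nearest neighbours, and the input subset giving the smallest value is retained. This is a generic sketch, not the authors' code; the helper name and parameters are illustrative.

    import numpy as np
    from sklearn.neighbors import NearestNeighbors

    def gamma_test(X, y, n_neighbors=10):
        # Returns the Gamma statistic, an estimate of the noise variance
        # (the MSE achievable without overfitting) for inputs X and output y.
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        nn = NearestNeighbors(n_neighbors=n_neighbors + 1).fit(X)
        dist, idx = nn.kneighbors(X)          # column 0 is each point itself
        deltas, gammas = [], []
        for k in range(1, n_neighbors + 1):
            deltas.append(np.mean(dist[:, k] ** 2))
            gammas.append(0.5 * np.mean((y[idx[:, k]] - y) ** 2))
        slope, intercept = np.polyfit(deltas, gammas, 1)
        return intercept

    # Evaluate every candidate input subset and keep the one with the smallest Gamma:
    # best = min(candidate_subsets, key=lambda cols: gamma_test(X[:, cols], y))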
ES2005-104
Pruned lazy learning models for time series prediction
Antti Sorjamaa, Amaury Lendasse, Michel Verleysen
Abstract:
This paper presents two improvements of Lazy Learning. Both methods include input selection and are applied to the long-term prediction of time series. The first method is based on an iterative pruning of the inputs; the second one performs a brute-force search over the possible sets of inputs using a k-NN approximator. Two benchmarks are used to illustrate the efficiency of these two methods: the Santa Fe A and the CATS Benchmark time series.
ES2005-106
A new wrapper method for feature subset selection
Noelia Sánchez-Maroño, Amparo Alonso-Betanzos, Enrique Castillo
Abstract:
ANOVA decomposition is used as the basis for the development of a new wrapper feature subset selection method in which functional networks are used as the induction algorithm. The performance of the proposed method was tested on several artificial and real data sets. The results obtained are comparable to, and in some cases better than, those achieved by other well-known methods, while the proposed algorithm is faster.
ES2005-110
Analysis of contrast functions in a genetic algorithm for post-nonlinear blind source separation
Fernando Rojas, García Puntonet Carlos, Ignacio Rojas
Abstract:
There are many real-world situations in which the directly observable data only bear a certain relation to the data that is really of interest. The task of recovering the unknown sources from the set of available mixtures, with little more information about the way they were mixed, is called the blind source separation problem. If the assumption used to recover the original sources is their statistical independence, then ICA (Independent Component Analysis) may be the technique to recover the signals. In this contribution, we propose and analyze three evaluation functions (contrast functions in Independent Component Analysis terminology) for use in a genetic algorithm (PNL-GABSS, Post-NonLinear Genetic Algorithm for Blind Source Separation) which solves source separation in nonlinear mixtures, assuming the post-nonlinear mixture model. A thorough analysis of the performance of the chosen contrast functions is made by means of ANOVA (Analysis of Variance).
ES2005-112
Sparse Bayesian promoter based gene classification
Kee Khoon Lee, Gavin Cawley, Michael Bevan
Abstract:
A method to distinguish between co-regulated genes that are up- or down-regulated under a given treatment, based on the composition of the upstream promoter region, would be a valuable tool in deciphering gene regulatory networks. Ideally, the classification should be based on a small number of regulatory motifs whose presence in the promoter region of a gene induces a significant effect on its transcriptional regulation. In this paper, we investigate the use of Relevance Vector Machines for this task, and present initial results of an analysis of glucose response in the model plant Arabidopsis thaliana that has revealed novel biological information.
ES2005-120
Graph projection techniques for Self-Organizing Maps
Georg Pölzlbauer, Andreas Rauber, Michael Dittenbach
Abstract:
The Self-Organizing Map is a popular neural network model for data analysis, for which a wide variety of visualization techniques exists. We present two novel techniques that take the density of the data into account. Our methods define graphs resulting from nearest neighbor- and radius-based distance calculations in data space and show projections of these graph structures on the map. It can then be observed how relations between the data are preserved by the projection, yielding interesting insights into the topology of the mapping, and helping to identify outliers as well as dense regions.
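A minimal sketch of the idea, assuming a trained codebook is available: build a k-nearest-neighbour graph in data space and connect the best-matching units of neighbouring data points on the map. Names and interface are illustrative, not the authors'.

    import numpy as np
    from sklearn.neighbors import kneighbors_graph

    def knn_graph_on_map(X, codebook, k=5):
        # best-matching unit of every data point
        bmus = np.argmin(((X[:, None, :] - codebook[None, :, :]) ** 2).sum(-1), axis=1)
        A = kneighbors_graph(X, n_neighbors=k, mode='connectivity')
        # edges of the data-space graph, expressed as pairs of map units to be drawn on the lattice
        return [(bmus[i], bmus[j]) for i, j in zip(*A.nonzero())]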
ES2005-124
A Neural Network that helps building a Nonlinear Dynamical model of a Power Amplifier
Georgina Stegmayer, Omar Chiotti, Giancarlo Orengo
Abstract:
This paper presents a new neural network-based model that can be applied to characterize the nonlinear dynamical behavior of power amplifiers. We use a time-delayed feed-forward neural network to make an input-output time-domain characterization that can also provide an analytical expression (as a Volterra series model) to predict the amplifier response to multiple power levels. Simulation results that validate our proposal are presented.
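To clarify what "time-delayed feed-forward" means in practice, the fragment below builds the tapped-delay-line regressor such a network would receive; it is a generic sketch under assumed variable names, not the model of the paper.

    import numpy as np

    def delayed_inputs(x, n_delays):
        # each row is [x(t), x(t-1), ..., x(t-n_delays)] for every valid t
        return np.column_stack([x[n_delays - d : len(x) - d]
                                for d in range(n_delays + 1)])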
ES2005-127
Contextual priming for artificial visual perception
Hervé Guillaume, Nathalie Denquive, Philippe Tarroux
Abstract:
The construction of autonomous robotic systems able to locate themselves in unknown environments requires the development of efficient visual recognition algorithms. Our knowledge of the mechanisms of natural perception suggests that, when the recognition process fails due to degraded observation conditions and the blurring of the intrinsic attributes of objects, contextual information is used by humans to prime object recognition. In this case the cues used for object identification can be greatly simplified. We present in this paper an attempt to specify how such a principle can be applied to autonomous robotics. We show that using a compact frequency coding of the scene together with unsupervised SOM learning, we obtain syntactic categories that exhibit specific relationships with object categories. Thus, the construction of these syntactic categories should be useful for estimating the occurrence probability of object categories during the exploration of the perceptual space of a robotic system.
ES2005-130
Learning to classify a collection of images and texts
Panagiotis Saragiotis, Bogdan Vrusias, Khurshid Ahmad
Abstract:
A single-net system based on Kohonen’s feature map was trained using a combined vector that contains the visual features of an image and its collateral keywords. The performance of the single net was compared with a multi-net system comprising two SOMs, one trained on visual features and the other on keywords, in the presence of a Hebbian network that learns to associate visual features with keywords. The multi-net system performs better than the single net. Similar results were obtained when Grossberg’s ART networks were used instead of SOMs.
ES2005-135
Self-Organizing Maps computing on Graphic Process Unit
Zhongwen Luo, Hongzhi Liu, Zhengping Yang, Xincai Wu
Abstract:
Self-Organizing Maps (SOM) are a widely used artificial neural network (ANN) model. Because of their heavy computational load for large maps and their inherently parallel structure, there is a need to apply parallel algorithms to them. As a SIMD parallel processor, the graphics processing unit (GPU) has shown faster performance growth than the CPU, and has recently become programmable. In this paper, we explore how commodity GPUs can be applied to SOM as parallel processors. We develop a GPU-based program for SOM computation and show that the GPU can perform SOM computations much faster than a standard CPU. We also give some design tricks to improve efficiency. Based on our results and current trends in GPU development, we believe that graphics hardware will be widely used for general-purpose high-performance computing.
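The data-parallel structure exploited here is already visible in a single SOM update step; the numpy sketch below (illustrative only, not the GPU code of the paper) marks the two operations, distance computation and neighbourhood-weighted update, that run independently for every map unit.

    import numpy as np

    def som_step(weights, grid, x, lr, sigma):
        # weights: (n_units, dim) codebook, grid: (n_units, 2) unit coordinates
        d = np.sum((weights - x) ** 2, axis=1)              # parallel over units
        bmu = np.argmin(d)                                   # best matching unit
        h = np.exp(-np.sum((grid - grid[bmu]) ** 2, axis=1) / (2 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)           # parallel over units
        return weights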
ES2005-137
Novel Algorithm for Eliminating Folding Effect in Standard SOM
Kirmene Marzouki, Takeshi Yamakawa
Abstract:
Self-organizing maps, SOMs, are a data visualization technique developed to reduce the dimensions of data through the use of self-organizing neural networks. However, as the original input manifold can be complicated with an inherent dimension larger than that of the feature map, the dimension reduction in SOM can be too drastic, generating a folded feature map. In order to eliminate this phenomenon, we extend the neighborhood concept to a new set of sub-neighbors, other than those introduced by Kohonen. The modified algorithm was applied to color classification and performed very well in comparison with the traditional SOM.
ES2005-139
Adaline-based estimation of power harmonics
Djaffar Ould Abdeslam, Mercklé Jean, Patrice Wira
Abstract:
A new strategy to estimate harmonic distortion on an AC line is presented. An ADALINE neural network is used to determine precisely the currents necessary to cancel harmful harmonics. The proposed strategy is based on an original decomposition of the measured currents to specify the neural network inputs. The decomposition is based on the Fourier series analysis of the current signals, and a modified LMS training algorithm computes the weights. This new estimation strategy appreciably improves the performance of traditional compensating methods and is valid for both single-phase and three-phase systems. The proposed strategy also allows the harmonics to be extracted individually.
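As an illustration of the principle (not the authors' exact decomposition), an ADALINE fed with cosine and sine terms of the harmonic orders of interest converges, under LMS training, to the corresponding Fourier coefficients of the measured current.

    import numpy as np

    def adaline_harmonics(i_t, t, f0=50.0, orders=(1, 3, 5, 7), lr=0.01):
        # regressors: cos/sin of each harmonic order at the sampling instants t
        X = np.column_stack([f(2 * np.pi * k * f0 * t)
                             for k in orders for f in (np.cos, np.sin)])
        w = np.zeros(X.shape[1])
        for x_n, i_n in zip(X, i_t):          # one LMS update per sample
            e = i_n - w @ x_n
            w += lr * e * x_n
        return w  # pairs (a_k, b_k); the amplitude of harmonic k is sqrt(a_k**2 + b_k**2)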
ES2005-141
Radar target recognition using SVMs with a wrapper feature selection driven by immune clonal algorithm
Xiangrong Zhang, Shuang Wang, Shan Tan, Jiao Licheng
Abstract:
A wrapper feature selection method based on the Immune Clonal Algorithm for SVMs is presented and applied to the recognition of 1-D radar target images. In the proposed method, cross-validation is used for feature evaluation in the wrapper feature selection step for SVMs, and the Immune Clonal Algorithm, which is characterized by rapid convergence to the global optimum, is applied to find the optimal feature subset. Experimental results on 1-D images of 3 airplanes obtained in a microwave anechoic chamber show the effectiveness of the proposed method.
ES2005-147
Experimental validation of a synapse model by adding synaptic conductances to excitable endocrine cells in culture
Boussa Sofiane, Marin Mattieu, LeFoll Frank, Faure Alain, Leboulanger Francois
Abstract:
The purpose of the present work is to investigate the physiological behaviour of artificial synapses in excitable non-neural cells. We have supplied a numerical synapse model to a frog melanotrope cell by using either the current-clamp technique or a home-made, easy-to-use application to achieve dynamic-clamp recordings in real-time conditions. From results obtained with the conventional biophysical AMPA/NMDA model of glutamatergic neurotransmission, we have introduced changes in order to adapt the model to the electrophysiological responses of the recorded cell. Further developments are in progress to extend such hybrid connections to multi-component neural networks.
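For orientation, a dynamic-clamp setup injects at every time step a current computed from the model conductance and the measured membrane potential; in its standard form (not necessarily the exact model used here),

    I_{\mathrm{syn}}(t) = g_{\mathrm{syn}}(t)\,\bigl(V_m(t) - E_{\mathrm{syn}}\bigr),

where g_{\mathrm{syn}}(t) follows the AMPA/NMDA gating kinetics and E_{\mathrm{syn}} is the reversal potential of the synapse.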
ES2005-148
A neural network approach of ultra-wideband nearfield adaptive beamforming
Min Wang, Shuyuan Yang, Shunjun Wu
Abstract:
An adaptive beamforming method for ultra-wideband (UWB) arrays in the nearfield case is proposed, based on a radial basis function neural network (RBFNN). The desired outputs corresponding to measured inputs for the nearfield impulse source form a set of training samples. A genetic algorithm and a recursive least squares algorithm are employed to determine the structure and the parameters of the RBFNN. The method avoids the computation of a matrix inverse and alleviates the impact of mutual coupling. The experimental results also demonstrate its efficiency and feasibility.
ES2005-156
Support vector algorithms as regularization networks
Andrea Caponnetto, Lorenzo Rosasco, Francesca Odone, Alessandro Verri
Abstract:
In this paper we show that several Support Vector methods, including one-class SVM and a number of non-standard SVM classification techniques, can be viewed as special implementations of a general regularization network. Formally, the connection is obtained by choosing the appropriate loss function and parametrized by the exponent of the offset in the penalty term. The mathematical properties of the underlying algorithms can then be more conveniently studied within the theoretical framework of regularization networks.
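The general regularization network referred to above is the standard Tikhonov functional over a reproducing kernel Hilbert space (standard textbook notation, not reproduced from the paper):

    \min_{f \in \mathcal{H}_K} \; \frac{1}{n}\sum_{i=1}^{n} V\bigl(y_i, f(x_i)\bigr) + \lambda \, \|f\|_K^2,

where each Support Vector variant corresponds to a particular choice of the loss V.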
ES2005-163
Spike-timing-dependent plasticity in 'small world' networks
Karsten Kube, Andreas Herzog, Bernd Michaelis, Ayoub Al-Hamadi, Ana de Lima, Thomas Voigt
Abstract:
Biologically plausible excitatory networks develop a stable synchronized pattern of activity due to synaptic refractoriness (short-term depression). The introduction of spike-timing-dependent plasticity (STDP) modifies the weights of synaptic connections in such a way that synchronization of neuronal activity is considerably weakened. By changing the network connections to include long-distance connections according to a power-law distribution ('small world' topology), we found that synchronization could be sustained much better, despite the influence of STDP.
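For reference, the standard pairwise STDP window (the exact variant used here may differ) changes a weight as a function of the spike-timing difference \Delta t = t_{\mathrm{post}} - t_{\mathrm{pre}}:

    \Delta w = A_+ \, e^{-\Delta t/\tau_+} \quad (\Delta t > 0), \qquad \Delta w = -A_- \, e^{\Delta t/\tau_-} \quad (\Delta t < 0),

so that pre-before-post pairings potentiate the synapse and post-before-pre pairings depress it.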