Bruges, Belgium, April 28-29-30
Content of the proceedings
Supervised and recurrent models
Computational Intelligence Business Applications
Motion estimation and segmentation
Information Visualization, Nonlinear Dimensionality Reduction, Manifold and Topological Learning
Learning I
Mixture and generative models
Sparse representation of data
Physiology and learning
Machine learning techniques based on random projections
Learning II
Unsupervised learning
Image and video analysis
Computational Intelligence in Biomedicine
Learning III
Supervised and recurrent models
ES2010-73
Efficient online learning of a non-negative sparse autoencoder
Andre Lemme, Felix Reinhart, Jochen Steil
Abstract:
We introduce an efficient online learning mechanism for non-negative sparse coding in autoencoder neural networks. In this paper we compare the novel method to the batch algorithm non-negative matrix factorization, with and without a sparseness constraint. We show that the efficient autoencoder yields better sparseness and lower reconstruction errors than the batch algorithms on the MNIST benchmark dataset.
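The abstract does not give the update rule itself; the following sketch only illustrates the general shape of an online step for a non-negative sparse autoencoder. The tied weights, the clipping to the non-negative orthant, and the L1-style sparsity push are all assumptions for illustration, not the authors' method:

```python
import numpy as np

def online_nn_autoencoder_step(W, x, lr=0.01, sparsity=0.05):
    """One illustrative online update for a tied-weight autoencoder with
    non-negative weights. The projection by clipping and the L1-style
    sparsity pressure are assumptions, not the paper's exact rule."""
    h = np.maximum(W @ x, 0.0)           # non-negative encoding
    x_hat = W.T @ h                      # reconstruction with tied weights
    err = x - x_hat
    grad = np.outer(h, err)              # descent direction of 0.5*||x - W.T h||^2 w.r.t. W
    W = W + lr * (grad - sparsity * np.sign(W))  # gradient step plus sparsity push
    return np.maximum(W, 0.0)            # project back onto the non-negative orthant
```

Processing one sample at a time in this fashion is what makes such a scheme "online", in contrast to batch NMF, which refactors the whole data matrix on every pass.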
ES2010-116
The Markov Decision Process Extraction Network
Siegmund Duell, Alexander Hans, Steffen Udluft
Abstract:
This paper presents the Markov decision process extraction network, which is a data-efficient, automatic state estimation approach for discrete-time reinforcement learning (RL) based on recurrent neural networks. The architecture is designed to model minimal relevant dynamics of an environment, capable of condensing large sets of continuous observables to a compact state representation and excluding irrelevant information. To the best of our knowledge, it is the first approach published to automatically extract minimal relevant aspects of a dynamics from observations to model a Markov decision process, suitable for RL, without requiring special knowledge of the regarded environment. The capabilities of the neural state estimation approach are evaluated using the cart-pole problem and standard table-based dynamic programming.
ES2010-74
Maximal Discrepancy for Support Vector Machines
Davide Anguita, Alessandro Ghio, Sandro Ridella
Abstract:
Several theoretical methods have been developed in the past years to evaluate the generalization ability of a classifier: they provide extremely useful insights on the learning phenomena, but are not as effective in giving good generalization estimates in practice. We focus in this work on the application of the Maximal Discrepancy method to the Support Vector Machine for computing an upper bound of its generalization bias.
ES2010-93
Ensemble Modeling with a Constrained Linear System of Leave-One-Out Outputs
Yoan Miché, Emil Eirola, Patrick Bas, Olli Simula, Christian Jutten, Amaury Lendasse, Michel Verleysen
Abstract:
This paper proposes a method for ensembling models using their Leave-One-Out outputs and solving a constrained linear system. When the proposed method is used to create an ensemble of Locally Linear models, results on six different regression data sets are comparable to state-of-the-art methods such as Least-Squares Support Vector Machines and Gaussian Processes, while being orders of magnitude faster.
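As a rough illustration (not the paper's exact formulation), combining models through their leave-one-out outputs under non-negativity and sum-to-one constraints can be sketched with non-negative least squares, enforcing the sum constraint through a heavily weighted extra row:

```python
import numpy as np
from scipy.optimize import nnls

def ensemble_weights(loo_outputs, y, penalty=1e6):
    """Illustrative sketch: find convex-combination weights for the models'
    leave-one-out outputs (columns of loo_outputs) by non-negative least
    squares. The sum-to-one constraint is enforced approximately via a
    heavily weighted extra row; the paper's constrained system may differ."""
    P = np.vstack([loo_outputs, penalty * np.ones(loo_outputs.shape[1])])
    t = np.concatenate([y, [penalty]])
    w, _ = nnls(P, t)                    # non-negative least-squares solve
    return w
```

The ensemble prediction is then simply the weighted sum of the individual models' outputs, which is why the whole combination step reduces to one small linear solve.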
ES2010-50
Financial time series forecasting with machine learning techniques: a survey
Bjoern Krollner, Bruce Vanstone, Gavin Finnie
Abstract:
Stock index forecasting is vital for making informed investment decisions. This paper surveys recent literature in the domain of machine learning techniques and artificial intelligence used to forecast stock market movements. The publications are categorised according to the machine learning technique used, the forecasting timeframe, the input variables used, and the evaluation techniques employed. It is found that there is a consensus among researchers stressing the importance of stock index forecasting. Artificial Neural Networks (ANNs) are identified to be the dominant machine learning technique in this area. We conclude with possible future research directions.
Computational Intelligence Business Applications
ES2010-5
Introduction to Computational Intelligence Business Applications
Thiago Turchetti Maia, Antonio Padua Braga
Abstract:
Computational intelligence business applications have been developed since the early days of computing and are now commonly found in many aspects of modern society. This paper briefly surveys historic applications of computational intelligence in different business contexts. In addition, it describes this type of application from a business point of view, where computational intelligence is used as a source of competitive advantage. It concludes with an analysis of how organizations may create the proper environment and effectively use computational intelligence to improve their business.
ES2010-105
Heuristics Miner for Time Intervals
Andrea Burattin, Alessandro Sperduti
Abstract:
Process Mining attempts to reconstruct the workflow of a business process from logs of activities. This task is quite important in business scenarios where there is no well-understood and structured definition of the business process performed by workers. Activity logs are thus mined in an attempt to reconstruct the actual business process. In this paper, we propose a generalization of a popular process mining algorithm, named Heuristics Miner, to time intervals. We show that the possibility to use, when available, time interval information for the performed activities allows the algorithm to produce better workflow models.
ES2010-17
Machine learning analysis and modeling of interest rate curves
Mikhaïl Kanevski, Vadim Timonin
Abstract:
This paper reviews the analysis and modeling of Swiss franc interest rate curves (IRC) using unsupervised (SOM, Gaussian Mixtures) and supervised (MLP) machine learning algorithms. IRC are considered as objects embedded into different feature spaces: maturities, maturity-date, and the parameters of the Nelson-Siegel model (NSM). Analysis of the NSM parameters and their temporal and clustering structures helps to understand the relevance of the model and its potential use for forecasting. A mapping of IRC in a maturity-date feature space is presented and analyzed for visualization and forecasting purposes.
ES2010-77
Modeling contextualized textual knowledge as a Long-Term Working Memory
Mauro Mazzieri, Sara Topi, Aldo Franco Dragoni, Germano Vallesi
Abstract:
A knowledge management system is more than an archive of textual documents; it provides context information, making it possible to know which documents were used by people with a common goal. Under the hypothesis that a set of textual documents with a common context can be assimilated to the long-term memory of a human expert, we can apply to them mining techniques inspired by the mechanics of human comprehension in expert domains. Text mining techniques for the KM task can use a model of the long-term memory to extract meaningful keywords from the documents. The model acts as a dynamic and non-stationary dimensionality reduction strategy, allowing the clustering of context documents according to keyword presence, the classification of external documents according to local criteria, and a better understanding of document content and relatedness.
Motion estimation and segmentation
ES2010-39
Neural competition for motion segmentation
Jan Steffen, Michael Pardowitz, Jochen Steil, Helge Ritter
Abstract:
We present a system for sensory classification and segmentation in motion trajectories. It consists of a combination of manifolds from Unsupervised Kernel Regression (UKR) and the recurrent neural Competitive Layer Model (CLM). The UKR manifolds hold learned representations of a set of candidate motions and the CLM dynamics, working on features defined in the UKR domain, realises the segmentation of observed trajectory data according to the competing candidates. The evaluation on trajectories describing four different letters yields improved classification results compared to our previous, pure manifold approach.
ES2010-76
Adaptive velocity tuning for visual motion estimation
Volker Willert, Julian Eggert
Abstract:
In the brain, both neural processing dynamics as well as the perceptual interpretation of a stimulus can depend on sensory history. The underlying principle is a sensory adaptation to the statistics of the input collected over a certain amount of time, allowing the system to tune its detectors, e.g. by improving the sampling of the input space. Here we show how a generative formulation for the problem of visual motion estimation leads to an online adaptation of velocity tuning that is compatible with physiological sensory adaptation and observed perceptual effects.
Information Visualization, Nonlinear Dimensionality Reduction, Manifold and Topological Learning
ES2010-4
Recent Advances in Nonlinear Dimensionality Reduction, Manifold and Topological Learning
Axel Wismueller, Michel Verleysen, Michaël Aupetit, John Aldo Lee
Abstract:
The ever-growing amount of data stored in digital databases raises the question of how to organize and extract useful knowledge. This paper outlines some current developments in the domains of dimensionality reduction, manifold learning, and topological learning. Several aspects are dealt with, ranging from novel algorithmic approaches to their real-world applications. The issue of quality assessment is also considered, and progress in quantitative as well as visual criteria is reported.
ES2010-107
Curvilinear component analysis and Bregman divergences
Jigang Sun, Colin Fyfe, Malcolm Crowe
Abstract:
Curvilinear Component Analysis (CCA) is an interesting flavour of multidimensional scaling. In this paper one version of CCA is proved to be equivalent to a kind of Bregman divergence, and its parameter (the neighbourhood radius) is explained.
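For readers unfamiliar with CCA, its classical cost, with the neighbourhood radius the paper reinterprets, can be sketched as follows (a simplified step-function weighting on output distances is assumed here):

```python
import numpy as np

def cca_stress(DX, DY, lam):
    """Sketch of the classical CCA cost: quadratic mismatch between
    input-space distances DX and output-space distances DY, weighted by a
    neighbourhood function that only counts pairs whose *output* distance
    falls below the radius lam. The step-function weight is one common
    choice; other decreasing functions of DY are used in practice."""
    F = (DY < lam).astype(float)         # step-function neighbourhood weight
    iu = np.triu_indices_from(DX, k=1)   # count each pair of points once
    return 0.5 * np.sum(((DX - DY) ** 2 * F)[iu])
```

Because the weight depends on the *output* distances, CCA can tear a manifold open: pairs pushed beyond the radius stop contributing, which is what distinguishes it from plain metric multidimensional scaling.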
ES2010-71
Exploratory Observation Machine (XOM) with Kullback-Leibler Divergence for Dimensionality Reduction and Visualization
Kerstin Bunte, Barbara Hammer, Thomas Villmann, Michael Biehl, Axel Wismueller
Abstract:
We present an extension of the Exploratory Observation Machine (XOM) for structure-preserving dimensionality reduction. Based on minimizing the Kullback-Leibler divergence of neighborhood functions in data and image spaces, this Neighbor Embedding XOM (NE-XOM) creates a link between fast sequential online learning known from topology-preserving mappings and principled direct divergence optimization approaches. We quantitatively evaluate our method on real-world data using multiple embedding quality measures. In this comparison, NE-XOM performs as a competitive trade-off between high embedding quality and low computational expense, which motivates its further use in real-world settings throughout science and engineering.
ES2010-78
Adaptive matrix distances aiming at optimum regression subspaces
Marc Strickert, Axel J. Soto, Gustavo E. Vazquez
Abstract:
A new supervised adaptive metric approach is introduced for mapping an input vector space to a plottable low-dimensional subspace in which the pairwise distances are in maximum correlation with distances of the associated target space. The formalism of multivariate subspace regression (MSR) is based on cost function optimization, and it allows assessing the relevance of input vector attributes. An application to molecular descriptors in a chemical compound database is presented for targeting octanol-water partitioning properties.
ES2010-29
Self Organizing Star (SOS) for health monitoring
Etienne Côme, Marie Cottrell, Michel Verleysen, Jérôme Lacaille
Abstract:
Self Organizing Maps (SOM) have been successfully applied to many hard real-world problems since their introduction. In this paper we present new topologies for SOM based on a planar graph. The design of a specific graph to encode prior information on the dataset topology is the central question addressed in this paper. In this context, star-shaped graphs are advocated for health monitoring applications, leading to a new kind of SOM that we denote by Self Organizing Star (SOS). Experiments using aircraft engine measurements show that SOS leads to meaningful and natural dataset representations.
ES2010-26
Reliability of dimension reduction visualizations of hierarchical structures
Elina Parviainen
Abstract:
Dimension reduction can produce visualizations of hierarchical structures, like those produced by cluster analysis. So far, reliability of such visualizations has only been assessed with rudimentary means. Here, a method for assessing reliability of such visualizations is developed. It measures how accurately the location of a data point in the high-dimensional hierarchy tree can be inferred from a tree based on the low-dimensional visualization. The criterion can be used in a point-wise fashion, allowing visual assessment of results, or as average values, for comparing visualizations. Use of the criterion is demonstrated on handwritten digits data, comparing visualizations by three dimension reduction methods.
ES2010-54
Mapping without visualizing local default is nonsense
Sylvain Lespinats, Michaël Aupetit
Abstract:
High-dimensional datasets are often embedded in two-dimensional spaces so as to visualize neighborhood relationships. When the map is effective (i.e. when short distances are preserved), it is a powerful means to understand the dataset. But mappings most often show defaults, and the user is then led astray. Following this line, a mapping should not be considered when its overall quality is not good enough. Many imperfect mappings can however be exploited by informing the user of the nature and level of the defaults. In this work, we propose to visualize the local indices trustworthiness and continuity for that purpose.
Learning I
ES2010-13
Active set training of support vector regressors
Shigeo Abe
Abstract:
In our previous work we have discussed the training method of a support vector classifier by active set training, allowing the solution to be infeasible during training. In this paper, we extend this method to training a support vector regressor (SVR). We use the dual form of the SVR, where variables take real values and the objective function includes the weighted linear sum of absolute values of the variables. We allow the variables to change signs from one step to the next, which means changes of the active inequality constraints. Namely, we solve the quadratic programming problem for the initial working set of training data by Newton's method, delete from the working set the data within the epsilon tube, add to the working set training data outside of the epsilon tube, and repeat training the SVR until the working set does not change. We demonstrate the effectiveness of the proposed method using some benchmark data sets.
ES2010-28
Time series input selection using multiple kernel learning
Loris Foresti, Devis Tuia, Vadim Timonin, Mikhaïl Kanevski
Abstract:
In this paper we study the relevance of multiple kernel learning (MKL) for the automatic selection of time series inputs. Recently, MKL has gained great attention in the machine learning community due to its flexibility in modelling complex patterns and performing feature selection. In general, MKL constructs the kernel as a weighted linear combination of basis kernels, exploiting different sources of information. An efficient algorithm wrapping a Support Vector Regression model for optimizing the MKL weights, named SimpleMKL, is used for the analysis. In this sense, MKL performs feature selection by discarding inputs/kernels with low or null weights. The approach proposed is tested with simulated linear and nonlinear time series (AutoRegressive, Henon and Lorenz series).
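The weighted combination at the heart of MKL is simple to state; a minimal sketch with fixed weights (SimpleMKL would optimize them jointly with the SVR) might look like:

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian RBF basis kernel, e.g. on lagged time-series inputs X."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def combined_kernel(kernels, weights):
    """MKL builds the working kernel as a weighted linear combination of
    basis kernels; non-negative weights keep the result a valid kernel.
    Here the weights are fixed by hand for illustration only."""
    weights = np.asarray(weights, dtype=float)
    assert np.all(weights >= 0)          # non-negativity constraint
    return sum(w * K for w, K in zip(weights, kernels))
```

Inputs (kernels) whose learned weight is driven to zero are effectively discarded, which is how MKL doubles as a feature selection mechanism for the candidate time-series lags.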
ES2010-64
Fast and good initialization of RBF networks
Dietmar Bauer, Jonas Sjoberg
Abstract:
In this paper a new method for fast initialization of radial basis function (RBF) networks is proposed. A grid of possible positions and widths for the basis functions is defined, and new nodes are introduced to the RBF network one at a time. The grid points are defined in a specific way that leads to computationally inexpensive algorithms, because intermediate results can be reused and do not need to be re-computed. If the grid is dense, one obtains estimators close to those resulting from an exhaustive search for the initial parameters, which lowers the risk of being caught in local minima in the minimization that follows. The usefulness of the approach is demonstrated in a simulation example.
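A greedy grid-based initialization of this flavour can be sketched as below; the specific selection score and refitting scheme are assumptions for illustration, and the paper's reuse of intermediate results is only hinted at by caching all candidate activations once:

```python
import numpy as np

def grid_rbf_init(x, y, centers, widths, n_nodes):
    """Illustrative sketch: greedily add, one at a time, the grid basis
    function most correlated with the current residual, refitting the
    linear output weights after each addition. The candidate activations
    are precomputed once, which is what keeps each step cheap."""
    # all (center, width) candidates on the grid, evaluated on the data
    Phi = np.stack([np.exp(-((x - c) ** 2) / (2 * s ** 2))
                    for c in centers for s in widths], axis=1)
    chosen, residual = [], y.astype(float).copy()
    for _ in range(n_nodes):
        # normalized correlation of each candidate with the residual
        scores = np.abs(Phi.T @ residual) / np.linalg.norm(Phi, axis=0)
        scores[chosen] = -np.inf                     # never pick a node twice
        chosen.append(int(np.argmax(scores)))
        # refit the linear weights on the selected basis functions
        w, *_ = np.linalg.lstsq(Phi[:, chosen], y, rcond=None)
        residual = y - Phi[:, chosen] @ w
    return chosen, w
```

The selected centers, widths, and linear weights would then serve as the starting point for the gradient-based minimization mentioned in the abstract.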
ES2010-115
Least 1-Norm SVMs: a new SVM variant between standard and LS-SVMs
Jorge López, José R. Dorronsoro
Abstract:
Least Squares Support Vector Machines (LS-SVMs) were proposed by replacing the inequality constraints inherent to L1-SVMs with equality constraints. So far this idea has only been suggested for a least squares (L2) loss. We describe how this can also be done for the sum-of-slacks (L1) loss, yielding a new classifier (Least 1-Norm SVMs) that gives models of similar complexity and accuracy and may also be more robust than LS-SVMs with respect to outliers.
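The construction can be illustrated as follows (a sketch of the stated idea, not necessarily the authors' exact formulation): the standard L1-SVM uses inequality constraints with a sum-of-slacks loss, and the new variant keeps that loss while adopting the equality constraints of LS-SVMs.

```latex
% Standard L1-SVM: inequality constraints, sum-of-slacks loss
\min_{w,b,\xi}\ \tfrac{1}{2}\|w\|^2 + C\sum_{i=1}^n \xi_i
\quad \text{s.t.}\quad y_i(w^\top x_i + b) \ge 1 - \xi_i,\ \ \xi_i \ge 0

% Least 1-Norm SVM (sketch): LS-SVM-style equality constraints,
% but with the L1 (sum of absolute slacks) loss instead of L2
\min_{w,b,e}\ \tfrac{1}{2}\|w\|^2 + C\sum_{i=1}^n |e_i|
\quad \text{s.t.}\quad y_i(w^\top x_i + b) = 1 - e_i
```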
ES2010-122
An augmented efficient backpropagation training strategy for deep autoassociative neural networks
Mark Embrechts, Blake Hargis, Jonathan Linton
Abstract:
The purpose of this paper is to introduce an effective training strategy of the backpropagation algorithm for deep autoencoders without relying on a weight initialization with restricted Boltzmann machines. This strategy is an extension of Efficient BackProp first proposed by LeCun et al. and will be benchmarked on three different types of application data sets.
ES2010-65
Model Learning from Weights by Adaptive Enhanced Probabilistic Convergent Network
Pierre Lorrentz, Gareth Howells, Klaus McDonald-Maier
Abstract:
Current weightless classifiers require historical data to model a system and make predictions about it successfully. Historical data is not always available, and it may not reflect recent modifications to the system. For this reason an adaptive filter is designed which, when employed with a weightless classifier, enables successful modelling of a system, including systems that are difficult to characterise, and prediction of the system output. Results of the experiments performed show that the fusion of an adaptive filter and a weightless classifier is more beneficial than either the filter or the classifier used alone, and that no speed advantage is observed.
ES2010-82
Directional predictions for 4-class BCI data
Dieter Devlaminck, Willem Waegeman, Bruno Bauwens, Bart Wyns, Georges Otte, Luc Boullart, Patrick Santens
Abstract:
Brain-computer interfaces (BCIs) can allow disabled people to drive a wheelchair just by imagining movement. To this end, present-day BCIs use training data, recorded during four different conditions, to calibrate a classifier. This, however, causes jerky behavior when abruptly switching between discrete states. We propose a cost-sensitive support vector approach for estimating two-dimensional directions based on four-class BCI data recorded from three subjects. We found that our method reduces the number of severe errors compared to classical support vector machines and results in a smoother trajectory estimate for application in a wheelchair.
ES2010-52
Autoregressive independent process analysis with missing observations
Szabó Zoltán
Abstract:
The goal of this paper is to search for independent multidimensional processes subject to missing and mixed observations. The corresponding cocktail-party problem has a number of successful applications; however, the case of missing observations has been worked out only for the simplest Independent Component Analysis (ICA) task, where the hidden processes (i) are one-dimensional, and (ii) signal generation in time is independent and identically distributed (i.i.d.). Here, the missing-observation situation is extended to processes with (i) autoregressive (AR) dynamics and (ii) multidimensional driving sources. The performance of the solution method is illustrated by numerical examples.
ES2010-55
Combining back-propagation and genetic algorithms to train neural networks for start-up time modeling in combined cycle power plants
Ilaria Bertini, Matteo De Felice, Stefano Pizzuti
Abstract:
This paper presents a neural network based approach to estimating the start-up time of turbine-based power plants. The networks are trained with a hybrid approach that combines the back-propagation (BP) algorithm and the Simple Genetic Algorithm (GA): the BP algorithm initializes a few individuals of the GA's population. Experiments have been performed over a large amount of data, and the results show a remarkable improvement in accuracy compared to the individual traditional methods.
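The seeding idea can be sketched as follows (a minimal toy illustration, not the paper's implementation: the fitness function is a stand-in for the start-up time prediction error, and `backprop_individual` stands in for an actual BP-trained weight vector):

```python
import numpy as np

rng = np.random.default_rng(0)

def fitness(weights):
    # placeholder objective; in the paper this would be the network's
    # prediction error on start-up time data (higher is better here)
    return -np.sum((weights - 1.0) ** 2)

def backprop_individual(dim):
    # stand-in for a BP-trained weight vector: already near a good solution
    return np.full(dim, 0.9) + 0.05 * rng.standard_normal(dim)

def hybrid_ga(dim=8, pop_size=20, n_seeded=3, generations=50):
    # seed a few individuals with BP solutions, fill the rest randomly
    pop = [backprop_individual(dim) for _ in range(n_seeded)]
    pop += [rng.uniform(-2, 2, dim) for _ in range(pop_size - n_seeded)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]            # elitist selection
        children = []
        for _ in range(pop_size - len(parents)):
            a, b = rng.choice(len(parents), 2, replace=False)
            cut = int(rng.integers(1, dim))       # one-point crossover
            child = np.concatenate([parents[a][:cut], parents[b][cut:]])
            child += 0.01 * rng.standard_normal(dim)   # mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

best = hybrid_ga()
```

Because the top half of the population survives each generation, the BP seeds give the GA a good starting point it can only improve on.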
ES2010-60
A pseudoregression formulation of emphasized soft target procedures for classification problems
Soufiane El Jelali, Abdelouahid Lyhyaoui, Anibal R. Figueiras-Vidal
Abstract:
Replacing a hard decision by a soft target version including an attentional mechanism provides a performance advantage and flexibility in solving classification tasks. In this paper, we modify the standard emphasized soft target method by proposing two new ideas, to avoid unnecessary updating and inappropriate definition of soft targets, in order to increase design performance. Experimental results using MLPs show the effectiveness of this approach compared with the standard ST and other methods.
ES2010-79
Exploiting hierarchical prediction structures for mixed 2d-3d tracking
Chen Zhang, Julian Eggert
Abstract:
In this paper, we present a generic way to use a hierarchical representation of prediction models for adaptive tracking. Starting with a basic appearance-based tracker working in 2D retinal space, we show how to combine individual trackers for the left and right eye to a true 3D tracker that is built on top of the 2D trackers. We show how the trackers benefit from the hierarchical structure by dynamical model switching depending on the reliability of the tracking results.
ES2010-96
Hybrid Soft Computing for PVT Properties Prediction
Ghouti Lahouari, Saeed Al-Bukhitan
Abstract:
Pressure-Volume-Temperature (PVT) properties are very important in reservoir engineering computations. There are many approaches for predicting various PVT properties, based on empirical correlations and statistical regression models. Over the last decade, researchers have utilized neural networks to develop more accurate PVT correlations. These achievements of neural networks open the door for data mining techniques to play a major role in the oil and gas industry. Unfortunately, the developed neural network correlations are often limited, and global correlations are usually less accurate than local correlations. Recently, adaptive neuro-fuzzy inference systems have been proposed as a new intelligence framework for both prediction and classification, based on a fuzzy clustering optimization criterion and ranking. In this paper, a genetic-neuro-fuzzy inference system is proposed for estimating the PVT properties of crude oil systems.
ES2010-58
Approximation of chemical reaction rates in turbulent combustion simulation
Lars Große, Franz Joos
Abstract:
It is essential to increase the efficiency of commercially available combustion engines because of the limitations of fossil energy resources and environmental pollution; emission standards are also a challenging aspect. If one succeeds in modelling the combustion process, in particular the chemical reactions, it becomes feasible to partly replace complex experiments with computer simulations. This paper suggests the use of artificial neural networks (ANNs) for the approximation of complex chemistry in turbulent combustion applications. The use of complex chemistry is computationally expensive and limited to simple geometries; it is therefore replaced by trained ANNs.
Mixture and generative models
ES2010-126
Exploiting local structure in stacked Boltzmann machines
Hannes Schulz, Andreas Müller, Sven Behnke
Abstract:
Restricted Boltzmann Machines (RBM) are well-studied generative models. For image data, however, RBMs are suboptimal, since they do not exploit the local nature of image statistics. We modify RBMs to focus on local structure by restricting visible-hidden interactions. Long-range interactions can then be modelled using direct or indirect lateral interaction between hidden variables. While learning in our model is much faster, it retains generative and discriminative properties of RBMs.
ES2010-120
Asymptotic properties of mixture-of-experts models
Madalina Olteanu, Joseph Rynkiewicz
Abstract:
The statistical properties of the likelihood ratio test statistic (LRTS) for mixture-of-experts models are addressed in this paper. This question is essential when estimating the number of experts in the model. Our purpose is to extend the existing results for mixtures (Liu and Shao, 2003) and mixtures of multilayer perceptrons (Olteanu and Rynkiewicz, 2008). In this paper we study a simple example which embodies all the difficulties arising in such models. We find that in some cases the LRTS diverges but, under additional assumptions, the behavior of such models can be fully explained.
ES2010-34
Adaptive learning rate control for "neural gas principal component analysis"
Wolfram Schenck, Ralph Welsch, Alexander Kaiser, Ralf Moeller
Abstract:
We propose a novel algorithm for adaptive learning rate control for Gaussian mixture models of the NGPCA type. The core idea is to introduce a unit-specific learning rate which is adjusted automatically depending on the match between the local principal component analysis of each unit (interpreted as Gaussian distribution) and the empirical distribution within the unit's data partition. In contrast to fixed annealing schemes for the learning rate, the novel algorithm is applicable to real online learning. Two experimental studies are presented which demonstrate this important property and the general performance of this algorithm.
ES2010-104
Towards sub-quadratic learning of probability density models in the form of mixtures of trees
François Schnitzler, Philippe Leray, Louis Wehenkel
Abstract:
We consider randomization schemes for the Chow-Liu algorithm, from weak ones (bagging, of quadratic complexity) to strong ones (full random sampling, of linear complexity), for learning probability density models in the form of mixtures of Markov trees. Our empirical study on high-dimensional synthetic problems shows that, while bagging is the most accurate scheme on average, some of the stronger randomizations remain very competitive in terms of accuracy, especially for small sample sizes.
Sparse representation of data
ES2010-7
Sparse representation of data
Thomas Villmann, Frank Michael Schleif, Barbara Hammer
Abstract:
The amount of electronic data available today, as well as its dimensionality and complexity, is increasing rapidly in many scientific areas, including biology, (bio-)chemistry, medicine and physics, and in application fields such as robotics, bioinformatics and multimedia technologies. Many of these data sets are very complex but also have a simple inherent structure, which allows an appropriate sparse representation and modeling of such data with little or no information loss. Advanced methods are needed to extract this inherent but hidden information. Sparsity can be observed at different levels: sparse representation of data points using e.g. dimensionality reduction for efficient data storage, sparse representation of full data sets using e.g. prototypes to achieve compact models for lifelong learning, and sparse models of the underlying data structure using sparse encoding techniques. One main goal is to achieve a human-interpretable representation of the essential information. Sparse representations account for the ubiquitous problem that humans have to deal with ever-increasing and inherently unlimited information by means of limited resources such as limited time, memory, or perception abilities. Starting with the seminal paper of Olshausen & Field, researchers recognized that sparsity can be used as a fundamental principle to arrive at very efficient information processing models for huge and complex data, such as those observed e.g. in the visual cortex. Nowadays, sparse models include diverse methods such as relevance learning in prototype-based representations, sparse coding neural gas, factor analysis methods, latent semantic indexing, sparse Bayesian networks, relevance vector machines and others. This tutorial paper reviews recent developments in the field.
ES2010-18
Highly sparse kernel spectral clustering with predictive out-of-sample extensions
Carlos Alzate, Johan Suykens
Abstract:
Kernel spectral clustering has been formulated in a primal-dual optimization setting, allowing natural extensions to out-of-sample data together with model selection in a learning framework, which is important for obtaining good generalization performance. In this paper, we propose a new sparse method for kernel spectral clustering. The approach exploits the structure of the eigenvectors and the corresponding projections of the data when the clusters are well formed. Experimental results with toy data and images show highly sparse clustering models with predictive capabilities.
ES2010-57
Learning sparse codes for image reconstruction
Kai Labusch, Thomas Martinetz
Abstract:
We propose a new algorithm for the design of overcomplete dictionaries for sparse coding that generalizes the Sparse Coding Neural Gas (SCNG) algorithm such that it is not bound to a particular approximation method for the coefficients of the dictionary elements. In an application to image reconstruction, a dictionary that has been learned using this algorithm outperforms a dictionary that has been obtained from the widely-used K-SVD algorithm, an overcomplete Haar-wavelet dictionary and an overcomplete discrete cosine transformation (DCT).
ES2010-97
Divergence based Learning Vector Quantization
Ernest Mwebaze, Petra Schneider, Frank Michael Schleif, Sven Haase, Thomas Villmann, Michael Biehl
ES2010-24
Finding correlations in multimodal data using decomposition approaches
Daniel Dornbusch, Robert Haschke, Stefan Menzel, Heiko Wersing
Abstract:
In this paper, we propose the application of standard decomposition approaches to find local correlations in multimodal data. In a test scenario, we apply these methods to correlate the local shape of turbine blades with their associated aerodynamic flow fields. We compare several decomposition algorithms with regard to their efficiency at finding local correlations and their ability to predict one modality from another.
ES2010-43
Geometric models with co-occurrence groups
Joan Bruna Estrach, Stéphane Mallat
Abstract:
A geometric model of sparse signal representations is introduced for classes of signals. It is computed by optimizing co-occurrence groups with a maximum likelihood estimate calculated with a Bernoulli mixture model. Applications to face image compression and MNIST digit classification illustrate the applicability of this model.
ES2010-87
Deep learning of visual control policies
Sascha Lange, Martin Riedmiller
Abstract:
This paper discusses the effectiveness of deep auto-encoding neural nets in visual reinforcement learning (RL) tasks. We describe a new algorithm and give results on successfully learning policies directly on synthesized and real images without predefined image processing. Furthermore, we present a thorough evaluation of the learned feature spaces.
ES2010-106
Learning vector quantization for heterogeneous structured data
Dietlind Zühlke, Frank Michael Schleif, Tina Geweniger, Sven Haase, Thomas Villmann
Abstract:
In this paper we introduce an approach to integrating heterogeneous structured data into learning vector quantization. The total distance between two heterogeneous structured samples is defined as a weighted sum of the distances in the single structural components. The weights are adapted in every iteration of learning using gradient descent on a cost function inspired by Generalized Learning Vector Quantization. The new method was tested on a real-world data set for pollen recognition using image analysis.
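The weighted total distance can be sketched as follows (an illustrative toy example with fixed weights; the component names and per-component distances are invented for the example, and in the paper the weights would be adapted by gradient descent rather than held constant):

```python
import numpy as np

def total_distance(sample, prototype, weights, component_dists):
    """Weighted sum of the distances in the single structural components.
    `component_dists` maps each component name to a distance function."""
    return sum(weights[k] * component_dists[k](sample[k], prototype[k])
               for k in component_dists)

euclidean = lambda a, b: float(np.linalg.norm(np.asarray(a) - np.asarray(b)))
hamming = lambda a, b: float(sum(x != y for x, y in zip(a, b)))

dists = {"shape": euclidean, "texture": hamming}
weights = {"shape": 0.8, "texture": 0.2}

sample = {"shape": [1.0, 2.0], "texture": "ABAB"}
proto = {"shape": [1.0, 0.0], "texture": "ABBB"}
d = total_distance(sample, proto, weights, dists)  # 0.8*2.0 + 0.2*1.0 = 1.8
```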
ES2010-94
Relational Generative Topographic Map
Andrej Gisbrecht, Bassam Mokbel, Barbara Hammer
Abstract:
The generative topographic map (GTM) has been proposed as a statistical model to represent high-dimensional data by means of a sparse lattice of points in latent space, such that visualization and data inspection become possible. The original GTM is restricted to Euclidean data points in a vector space. Often, data are not explicitly embedded in a real vector space; rather, pairwise dissimilarities of the data can be computed, i.e. the relations between data points are given rather than the data vectors themselves. We propose a method which extends the GTM to relational data and which allows a sparse representation, in latent space, of data characterized by pairwise dissimilarities. The method, relational GTM, is demonstrated on several benchmarks.
Physiology and learning
ES2010-63
Neural oscillations allow for selective inhibition - New perspective on the role of cortical gamma oscillations
Thomas Burwick
Abstract:
A pattern recognition mechanism is proposed that uses inhibitory oscillations as a fundamental ingredient. The mechanism realizes selective inhibition that could not be achieved without oscillations. It uses couplings that are motivated by physiology. Since inhibitory oscillations are key to the generation of cortical gamma oscillations, the proposed mechanism may also shed new light on the functionality of gamma oscillations.
ES2010-118
Learning how to grasp objects
Annalisa Barla, Luca Baldassarre, Nicoletta Noceti, Francesca Odone
Abstract:
This paper deals with the problem of estimating an appropriate hand posture to grasp an object from 2D visual cues of the object, in a many-to-many (object, grasp) configuration. A statistical learning protocol implementing vector-valued regression is adopted both for classifying the most likely grasp type and for estimating the hand posture. An extensive experimental evaluation on a publicly available dataset of visuo-motor data reports very promising results and encourages further investigation.
Machine learning techniques based on random projections
ES2010-3
Machine Learning Techniques based on Random Projections
Yoan Miché, Benjamin Schrauwen, Amaury Lendasse
Abstract:
This paper presents a short introduction to the main ideas and developments of Reservoir Computing and the Extreme Learning Machine. While both methods make use of neural networks and random projections, Reservoir Computing allows the network to have a recurrent structure, whereas the Extreme Learning Machine is a feedforward neural network only. Some state-of-the-art techniques are briefly presented, and the papers of this special session are then briefly described in the terms of this introductory paper.
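The Extreme Learning Machine idea summarized above, a fixed random projection followed by a trained linear readout, can be sketched in a few lines. The following is an illustrative NumPy sketch, not code from any of the session papers; the tanh activation and hidden-layer size are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_train(X, y, n_hidden=50):
    """Basic ELM: random (untrained) hidden layer, least-squares readout."""
    W = rng.normal(size=(X.shape[1], n_hidden))   # random input weights, never trained
    b = rng.normal(size=n_hidden)                 # random biases
    H = np.tanh(X @ W + b)                        # hidden-layer activations
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)  # only the readout is fitted
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# toy regression: fit y = sin(x)
X = np.linspace(-3, 3, 200).reshape(-1, 1)
y = np.sin(X).ravel()
W, b, beta = elm_train(X, y)
mse = np.mean((elm_predict(X, W, b, beta) - y) ** 2)
```

Reservoir Computing follows the same train-only-the-readout principle, but the random projection is recurrent, so the hidden state at time t depends on the state at time t-1.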
ES2010-99
Extending reservoir computing with random static projections: a hybrid between extreme learning and RC
John Butcher, David Verstraeten, Benjamin Schrauwen, Charles Day, Peter Haycock
Abstract:
Reservoir Computing is a relatively new paradigm in the field of neural networks that has shown promise in applications where traditional recurrent neural networks have performed poorly. The main advantage of using reservoirs is that only the output weights are trained, reducing computational requirements significantly. There is a trade-off, however, between the amount of memory a reservoir can possess and its capability of mapping data into a highly non-linear transformation space. A new hybrid architecture, combining a reservoir with an extreme learning machine, is presented that overcomes this trade-off; its performance is demonstrated on a 4th-order polynomial modelling task and an isolated spoken digit recognition task.
ES2010-133
Solving Large Regression Problems using an Ensemble of GPU-accelerated ELMs
Mark van Heeswijk, Yoan Miché, Erkki Oja, Amaury Lendasse
Abstract:
This paper presents an approach that allows regression to be performed on large data sets in reasonable time. The main component of the approach consists in speeding up the slowest operation of the algorithm by running it on the Graphics Processing Unit (GPU) of the video card instead of on the processor (CPU). The experiments show a speedup of an order of magnitude from using the GPU, and competitive performance on the regression task. Furthermore, the presented approach lends itself to further parallelization, which has yet to be investigated.
ES2010-56
Using SVMs with randomised feature spaces: an extreme learning approach
Benoît Frénay, Michel Verleysen
Abstract:
Extreme learning machines almost match standard SVMs in terms of accuracy, but are much faster. However, they optimise a sum of squared errors, whereas SVMs are maximum-margin classifiers. This paper proposes to merge both approaches by defining a new kernel. This kernel is computed by the first layer of an extreme learning machine and used to train an SVM. Experiments show that this new kernel matches the standard RBF kernel in terms of accuracy and is faster. Indeed, experiments show that the number of neurons of the ELM behind the randomised kernel does not need to be tuned and can be set to a sufficiently large value without altering the accuracy significantly.
ES2010-36
A Markovian characterization of redundancy in echo state networks by PCA
Claudio Gallicchio, Alessio Micheli
Abstract:
Richness of dynamics is a desirable feature of Echo State Networks (ESNs), limited by a known high redundancy of state unit activations. We show how this feature is mainly influenced by the Markovian state space organization of ESNs through a Principal Component Analysis (PCA) of the reservoir space. The relevance of the principal components is coherent with Markovianity, whose role is further highlighted by investigating the strong relation between the suffix elements of the input sequence and the most relevant directions of variability in the state space.
ES2010-88
Random search enhancement of error minimized extreme learning machine
Yuan Lan, Soh Yeng Chai, Huang Guang-Bin
Abstract:
Error minimized extreme learning machine (EM-ELM), proposed by Feng et al. [1], can automatically determine the number of hidden nodes in generalized single-hidden-layer feedforward networks (SLFNs). We recently found that some of the hidden nodes added into the network may play a very minor role in the network output, which increases the network complexity. Hence, this paper proposes an enhancement of EM-ELM (referred to as EEM-ELM), which introduces a selection phase based on the random search method. The empirical study shows that EEM-ELM leads to a more compact network structure.
ES2010-101
TreeESN: a Preliminary Experimental Analysis
Claudio Gallicchio, Alessio Micheli
Abstract:
In this paper we introduce an efficient approach to Recursive Neural Network (RecNN) modeling, the Tree Echo State Network (TreeESN), extending the Echo State Network (ESN) model from the processing of sequential to tree-structured domains. For structure-to-element transductions, the state mapping (i.e., the way in which the state values for the whole structure are selected/collected) turns out to play a relevant role, and the importance of its choice is pointed out by experimental results.
Learning II
ES2010-114
A novel interactive biometric passport photograph alignment system
George McConnon, Farzin Deravi, Sanaul Hoque, Gareth Howells, Konstantinos Sirlantzis
Abstract:
A novel framework for interactively acquiring images was developed in MathWorks™ MATLAB® which used real-time visual and audio cues to guide the user into correct alignment for compliance with European biometric passport regulations. The user would pose in front of a camera using visual feedback from a monitor to roughly position themselves before iris detection was used to calculate their roll (z-axis) alignment. Audio instructions would then be given to refine their posture if required. Blink detection was then used to detect the user’s readiness to have their passport image taken.
ES2010-117
Oriented Bounding Box Computation Using Particle Swarm Optimization
Pierre Borckmans, Pierre-Antoine Absil
Abstract:
The problem of finding the optimal oriented bounding box (OBB) for a given set of points in $\mathbb{R}^3$, though simple to state, is computationally challenging. Existing state-of-the-art methods dealing with this problem are either exact but slow, or fast but highly approximate and unreliable. We propose a method based on Particle Swarm Optimization (PSO) to approximate solutions both effectively and accurately. The original PSO algorithm is modified so as to search for optimal solutions over the rotation group $SO(3)$. Particles are defined as 3D rotation matrices and operations are expressed over $SO(3)$ using matrix products, exponentials and logarithms. The symmetry of the problem is also exploited. Numerical experiments show that the proposed algorithm outperforms existing methods, often by far.
ES2010-129
Identifying informative features for ERP speller systems based on RSVP paradigm
Tian Lan, Deniz Erdogmus, Lois Black, Jan Van Santen
Abstract:
This preliminary study focused on identifying informative features in the frequency and spatial domains for single-trial Event Related Potential (ERP) detection for ERP spelling systems. A predefined sequence of letters was presented to subjects in a Rapid Serial Visual Presentation (RSVP) paradigm. EEG data were collected and analyzed offline. A Linear Discriminant Analysis (LDA) classifier was selected as ERP detector for its simplicity and robustness. A range of features in different frequency bands and EEG channel subsets was extracted and detection accuracies were compared for different classes of features.
ES2010-14
Predicting spike-timing of a thalamic neuron using a stochastic synaptic model
Karim El-Laithy, Martin Bogdan
Abstract:
A twofold spike-timing-dependent stochastic synaptic model is used along with a leaky integrate-and-fire neuronal model to predict the spike timing of a single post-synaptic neuron in the lateral geniculate nucleus, knowing the spike train on the pre-synaptic side (i.e. in a retinal ganglion cell). In this synaptic model, spike-timing dependency is introduced for both the magnitude and the relaxation of the dynamics representing the synaptic action. The results show that the model is able to reliably predict the exact timing of spikes. These results and the model won a recent international competition.
ES2010-61
Modelling the McGurk effect
Ioana Sporea, Andre Gruning
Abstract:
The current study investigates the McGurk effect by modelling it with neural networks. The simulations are designed to test the two main theories about the moment at which auditory-visual integration happens. To further analyze the causes behind the McGurk illusion, the neural network that best models the effect is used to simulate the influence of language and of phoneme frequency on auditory-visual speech perception, using two phonetic distributions, from English and Japanese, which yield different empirical results for the McGurk effect.
ES2010-95
A critique of BCM behavior verification for STDP-type plasticity models
Christian Mayr, Johannes Partzsch, Rene Schueffny
Abstract:
Rate-based (Bienenstock-Cooper-Munro, BCM) and spike-timing-dependent plasticity (STDP) are the two principal learning behaviors found at cortical synapses. Some BCM induction protocols have been shown to be compatible with STDP rules, thus combining both forms of plasticity. However, we demonstrate that the majority of actual experimental BCM protocols cannot be reproduced by STDP. This sensitivity to the spike protocol is inconsistent with the robust BCM behavior generally found in experiments. Moreover, we show that major recent spike-timing rules, despite incorporating rate-based effects, cannot replicate actual experimental BCM evidence. Thus, the purported convergence between these two important plasticity phenomena is called into question.
Unsupervised learning
ES2010-31
An automated SOM clustering based on data topology
Kadim Tasdemir, Pavel Milenov
Abstract:
Self-organizing maps (SOMs) are powerful for cluster extraction due to their ability to obtain a topologically ordered and adaptive vector quantization of data. Thanks to the lower-dimensional representation of high-dimensional data on the SOM lattice, clustering is often done interactively from informative SOM visualizations. Yet the large volumes of today’s data sets necessitate automated methods that are as successful as interactive ones for fast and accurate knowledge discovery. An automated SOM clustering, based on hierarchical clustering of a topology-representing graph, is proposed here. Applications on several data sets indicate that the proposed method can be successfully used for automated partitioning.
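As a reminder of the machinery the paper builds on, a minimal SOM reduces to a best-matching-unit search plus a neighbourhood-weighted update. The sketch below, with a 1-D lattice and illustrative parameters unrelated to the authors' implementation, shows the topologically ordered vector quantization whose prototypes an automated method can then cluster:

```python
import numpy as np

rng = np.random.default_rng(1)

def train_som(X, n_units=10, n_epochs=30):
    """Minimal 1-D SOM: prototypes on a chain, shrinking Gaussian neighbourhood."""
    W = rng.uniform(X.min(), X.max(), size=(n_units, X.shape[1]))
    idx = np.arange(n_units)
    for epoch in range(n_epochs):
        lr = 0.5 * (1.0 - epoch / n_epochs)                # decaying learning rate
        sigma = max(n_units / 2.0 * (1.0 - epoch / n_epochs), 0.5)
        for x in rng.permutation(X):
            bmu = np.argmin(((W - x) ** 2).sum(axis=1))    # best-matching unit
            h = np.exp(-((idx - bmu) ** 2) / (2 * sigma ** 2))
            W += lr * h[:, None] * (x - W)                 # neighbourhood update
    return W

# two well-separated 2-D blobs
X = np.vstack([rng.normal(0.0, 0.3, (30, 2)), rng.normal(4.0, 0.3, (30, 2))])
W = train_som(X)
# mean quantization error: distance from each sample to its BMU
qe = np.mean([np.min(np.linalg.norm(W - x, axis=1)) for x in X])
```

The automated approach above then partitions such prototypes (via a topology-representing graph) rather than the raw data, which is what makes it feasible on large data sets.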
ES2010-32
A randomized algorithm for spectral clustering
Nicola Rebagliati, Alessandro Verri
Abstract:
Spectral Clustering has become widely used among unsupervised learning applications. Despite its practical success, we believe that for correct usage one has to face a difficult problem: given a target number of classes K, the optimal K-dimensional subspace is not necessarily spanned by the first K eigenvectors of the graph normalized Laplacian. This problem is usually disregarded, but it can make the correct solution not computable by current Spectral Clustering algorithms. The contribution of this paper is twofold. First, we show a bound for choosing a correct number of eigenvectors. Second, we propose a novel randomized spectral algorithm for finding the solution. We show the efficacy of the algorithm with experiments on real-world graphs. Our proposal is a scheme that naturally extends the current usage of Spectral Clustering.
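For reference, the textbook spectral embedding the paper starts from takes the eigenvectors of the normalized Laplacian belonging to the K smallest eigenvalues. The following is a generic sketch with an illustrative Gaussian similarity graph, not the randomized algorithm proposed in the paper:

```python
import numpy as np

rng = np.random.default_rng(2)

def spectral_embed(S, k):
    """Eigenvectors of the normalized Laplacian L = I - D^{-1/2} S D^{-1/2}
    belonging to the k smallest eigenvalues."""
    d = S.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    L = np.eye(len(S)) - D_inv_sqrt @ S @ D_inv_sqrt
    _, vecs = np.linalg.eigh(L)   # eigh returns eigenvalues in ascending order
    return vecs[:, :k]

# two well-separated blobs and a Gaussian similarity graph
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)), rng.normal(4.0, 0.3, (20, 2))])
sq = ((X[:, None] - X[None]) ** 2).sum(-1)
S = np.exp(-sq / 2.0)
U = spectral_embed(S, 2)
# for K = 2, the sign of the second eigenvector separates the clusters
labels = (U[:, 1] > 0).astype(int)
```

The paper's point is precisely that on harder graphs the first K eigenvectors need not span the optimal K-dimensional subspace, so this standard recipe can fail.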
ES2010-72
Relevance learning in generative topographic maps
Andrej Gisbrecht, Barbara Hammer
Abstract:
The generative topographic map (GTM) provides a flexible statistical model for unsupervised data inspection and topographic mapping. However, it shares the property of most unsupervised tools that noise in the data cannot be recognized as such and, in consequence, is visualized in the map. The framework of relevance learning or learning metrics as introduced in \cite{grlvq,kaski} offers an elegant way to shape the metric according to auxiliary information at hand such that only those aspects are displayed in distance-based approaches which are relevant for a given classification task. Here we introduce the concept of relevance learning into GTM such that the metric is shaped according to auxiliary class labels. Relying on the prototype-based nature of GTM, several efficient realizations of this paradigm are developed and compared on a couple of benchmarks.
ES2010-113
Multiple Local Models for System Identification Using Vector Quantization Algorithms
Luis Souza, Guilherme Barreto
Abstract:
We introduce a novel method to build multiple local regression models based on the prototype vectors of the SOM network and other well-known vector quantization (VQ) algorithms. The resulting models are evaluated in the task of identifying the inverse dynamics of a heat exchanger data set. Additionally, we evaluate through statistical hypothesis testing the influence of the VQ algorithm on the performance of the local model. Simulation results demonstrate that the proposed method consistently outperforms previous MLP- and SOM-based approaches for system identification.
ES2010-123
Extending FSNPC to handle data points with fuzzy class assignments
Tina Geweniger, Thomas Villmann
Abstract:
In this paper we present an advanced Nearest Prototype Classification to handle data points with unsharp class assignments. To this end, we extend the Soft Nearest Prototype Classification as proposed by Seo et al. and its further enhancement working with fuzzy labeled prototypes as introduced by Villmann et al. We adapt the cost function and derive appropriate update rules for the prototypes. We assess the performance on a toy data set and a real-world problem and compare the classification result with the results obtained by Fuzzy Robust Soft LVQ by means of Fuzzy Cohen's Kappa.
Image and video analysis
ES2010-127
Principal curve tracing
Erhan Bas, Deniz Erdogmus
Abstract:
We propose a principal curve tracing algorithm that uses the gradient and the Hessian of a given density estimate. Curve definition requires the local smoothness of data density and is based on the concept of subspace local maxima. Tracing of the curve is handled through the leading eigenvector where fixed-step updates are used. We also propose an image segmentation algorithm based on the original idea and show the effectiveness of the proposed algorithm on a Brainbow dataset.
ES2010-91
Mode estimation in high-dimensional spaces with flat-top kernels: application to image denoising
Arnaud de Decker, John Aldo Lee, Damien Francois, Michel Verleysen
Abstract:
Data denoising can be achieved by approximating the data distribution and replacing each data item with an estimate of its closest mode. This idea has already been successfully applied to image denoising. The data then consists of pixel intensities or image patches, that is, vectorized groups of pixel intensities. The latter case raises the issue of mode estimation in a high-dimensional space, since patches can contain about 10 to more than 100 pixels. This paper shows that the widely used Gaussian kernel is outperformed by flat-top kernels that are specifically tailored in order to fight the curse of dimensionality.
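The kernel swap described above can be sketched with a generic mean-shift iteration, where each data item is moved toward its closest mode. The `flat_top_weight` profile below (constant within a radius, Gaussian decay outside) is an assumed stand-in for illustration, not the paper's tailored kernel.

```python
import numpy as np

def flat_top_weight(d2, r=1.0, h=0.5):
    """Assumed flat-top profile: weight 1 inside radius r,
    Gaussian decay outside (not the paper's exact kernel)."""
    d = np.sqrt(d2)
    w = np.ones_like(d)
    out = d > r
    w[out] = np.exp(-(d[out] - r)**2 / (2 * h**2))
    return w

def mean_shift(x, data, weight_fn, n_iter=50, tol=1e-6):
    """Generic mean-shift: move x toward the weighted mean of the data."""
    x = np.asarray(x, dtype=float)
    for _ in range(n_iter):
        d2 = np.sum((data - x)**2, axis=1)
        w = weight_fn(d2)
        new_x = (w[:, None] * data).sum(axis=0) / w.sum()
        if np.linalg.norm(new_x - x) < tol:
            break
        x = new_x
    return x
```

A flat region near the origin means nearby points contribute equal weight, which is one way to limit the concentration effects that hurt the Gaussian kernel in high dimension.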
ES2010-83
Figure-ground Segmentation using Metrics Adaptation in Level Set Methods
Alexander Denecke, Irene Ayllon Clemente, Heiko Wersing, Julian Eggert, Jochen Steil
Abstract:
We present an approach for hypothesis-based image segmentation founded on the integration of level set methods and discriminative feature clustering techniques. Building on previous work, we investigate Localized Generalized Matrix Learning Vector Quantization (LGMLVQ) to train a classifier for the fore- and background of an image. Here we extend this concept towards level set segmentation algorithms, where region descriptors are used to adapt the object contour according to the image features. Finally, we demonstrate that the fusion of both methods is capable of outperforming their individual applications and improves the performance compared to other state-of-the-art segmentation methods.
ES2010-85
An ART-type network approach for video object detection
Rafael Luque, Enrique Domínguez, Esteban Palomo, José Muñoz
Abstract:
This paper presents an ART2 (adaptive resonance theory) network to detect objects in a video sequence by classifying pixels as foreground or background. The proposed ART network not only possesses the structure and learning ability of an ART-based network, but also uses a neural merging process to adapt to the variability of the input data (pixels) in the scene over time. Experimental results demonstrate the effectiveness and feasibility of the proposed ART network for object detection. Standard datasets are used to compare the efficiency of the proposed approach against other traditional methods based on Gaussian models.
Computational Intelligence in Biomedicine
ES2010-2
Computational Intelligence in biomedicine: Some contributions
Paulo J.G. Lisboa, Alfredo Vellido, José D. Martín
ES2010-45
Segmentation of EMG time series using a variational Bayesian approach for the robust estimation of cortical silent periods
Iván Olier, Amengual Julià, Alfredo Vellido
Abstract:
A variational Bayesian formulation for a manifold-constrained Hidden Markov Model is used in this paper to segment a set of multivariate time series of electromyographic recordings corresponding to stroke patients and control subjects. An index of variability associated to this model is defined and applied to the robust detection of the silent period interval of the signal. The accuracy in the estimation of the duration of this interval is paramount to assess the rehabilitation of stroke patients.
ES2010-15
Spectral Prototype Extraction for dimensionality reduction in brain tumour diagnosis
Sandra Ortega-Martorell, Iván Olier, Alfredo Vellido, Margarida Julià-Sapé, Carles Arús
Abstract:
Diagnosis in neuro-oncology can be assisted by non-invasive data acquisition techniques such as Magnetic Resonance Spectroscopy (MRS). From the viewpoint of computer-based brain tumour classification, the high dimensionality of MRS poses a difficulty, and the use of dimensionality reduction (DR) techniques is advisable. Despite some important limitations, Principal Component Analysis (PCA) is commonly used for DR in MRS data analysis. Here, we define a novel DR technique, namely Spectral Prototype Extraction, based on a manifold-constrained Hidden Markov Model (HMM). Its formulation within a variational Bayesian framework imbues it with regularization properties that minimize the negative effect of the presence of noise in the data. Its use for MRS pre-processing is illustrated in a difficult brain tumour classification problem.
ES2010-16
On the use of a clinical kernel in survival analysis
Vanya Van Belle, Kristiaan Pelckmans, Johan Suykens, Sabine Van Huffel
Abstract:
Clinical datasets typically contain continuous, ordinal, categorical and binary variables. To model this type of dataset, linear kernel methods are generally used. However, the linear kernel has some disadvantages, which were tackled by the introduction of a clinical one. This work shows that the use of a clinical kernel can improve the performance of support vector machine survival models. In addition, the polynomial kernel is adapted in the same way to obtain a clinical polynomial kernel. A comparison is made with other non-linear additive kernels on six different survival datasets. Our results indicate that the use of a clinical kernel is a simple way to obtain non-linear models for survival analysis, without the need to tune an extra parameter.
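The clinical kernel referred to above combines one term per variable, with the term depending on the variable type. The sketch below follows the commonly cited additive form (continuous/ordinal terms normalized by the variable's range, overlap terms for categorical and binary variables) and should be read as an assumed reconstruction, not the paper's exact definition.

```python
def clinical_kernel(x, y, ranges, kinds):
    """Additive clinical kernel: one term per variable, averaged.
    continuous/ordinal : (range - |x_i - y_i|) / range
    categorical/binary : 1 if equal, else 0
    (assumed form, for illustration)"""
    total = 0.0
    for xi, yi, r, kind in zip(x, y, ranges, kinds):
        if kind in ("continuous", "ordinal"):
            total += (r - abs(xi - yi)) / r
        else:  # categorical / binary
            total += 1.0 if xi == yi else 0.0
    return total / len(x)
```

Since every term lies in [0, 1], the averaged kernel does too, and no extra kernel parameter needs tuning, which is the practical advantage the abstract highlights.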
ES2010-35
The Application of Gaussian Processes in the Prediction of Percutaneous Absorption for Mammalian and Synthetic Membranes
Yi Sun, Gary Moss, Maria Prapopoulou, Rod Adams, Marc Brown, Neil Davey
Abstract:
Improving predictions of the skin permeability coefficient is a difficult problem, and an important issue with the increasing use of skin patches as a means of drug delivery. In this work, we apply Gaussian processes (GPs) with five different covariance functions to predict the permeability coefficients of Human, Pig, Rodent and Silastic membranes. We obtain a considerable improvement over the quantitative structure-activity relationship (QSARs) predictors. The GPs with Matern and neural network covariance functions give the best performance in this work. We find that five compound features applied to Human, Pig and Rodent membranes cannot represent the main characteristics of the Silastic dataset.
ES2010-81
Neural models for the analysis of kidney disease patients
Emilio Soria, José D. Martín, Mónica Climente, Amparo Soldevila, Antonio J. Serrano
Abstract:
This work uses Machine Learning techniques and other classical approaches to analyze both physiological variables and treatment characteristics in patients undergoing chronic renal failure. Firstly, the use of Self-Organizing Maps is proposed in order to extract qualitative knowledge. Secondly, the Hemoglobin concentration is predicted one-month ahead by models based on the Multilayer Perceptron; the prediction uses information from two months (the current month and the previous one). Achieved results support the usefulness of these tools in daily clinical practice.
Learning III
ES2010-19
Distance functions for local PCA methods
Alexander Kaiser, Wolfram Schenck, Ralf Moeller
Abstract:
The NGPCA method, a combination of the robust neural gas vector quantization method and a fast neural principal component analyzer, has proved to be a valuable tool for the generalized learning of high-dimensional data. At its core, the method uses a competitive ranking to adapt its units. The competition is guided by a specialized distance function -- known as the normalized Mahalanobis distance -- that assumes elliptic cluster shapes. Recently, an alternative distance function, the normalized Rayleigh quotient, has been suggested. This paper compares the performance of NGPCA on different distance functions. For the comparison a data set from a realistic robot arm experiment is used.
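A minimal sketch of a normalized Mahalanobis-style distance, assuming the common formulation in which a log-determinant term normalizes for the ellipsoid's volume so that units of different sizes compete fairly in the ranking; the exact NGPCA distance may differ.

```python
import numpy as np

def normalized_mahalanobis(x, center, cov):
    """Assumed form: squared Mahalanobis distance plus a log-volume
    term (ln det Sigma) that penalizes large ellipsoidal units."""
    d = x - center
    return float(d @ np.linalg.solve(cov, d) + np.log(np.linalg.det(cov)))
```

With the identity covariance the log term vanishes and the distance reduces to the squared Euclidean one; inflating the covariance raises the distance even at the unit's center, which is what keeps oversized units from winning every competition.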
ES2010-33
KNN behavior with set-valued attributes
Mabel González Castellanos, Yanet Rodríguez Sarabia, Carlos Morell Pérez
Abstract:
This paper addresses the treatment of set-valued attributes in the lazy learning context. This type of attribute is present in various domains, yet instance-based learning tools do not provide a representation for it. To solve this problem, we present a proposal for the treatment of sets within the k-NN algorithm through an extension of the HEOM distance. Experiments using various datasets show the feasibility of this option.
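The abstract does not spell out the HEOM extension itself; the sketch below shows one plausible way to add a set-valued attribute type to HEOM, using the Jaccard distance for sets (an assumed choice for illustration, not necessarily the authors').

```python
import math

def heom_set(x, y, kinds, ranges):
    """HEOM distance extended with a set-valued attribute type.
    numeric : |x - y| / range
    nominal : 0 if equal, else 1
    set     : Jaccard distance (assumed extension)"""
    total = 0.0
    for i, kind in enumerate(kinds):
        if x[i] is None or y[i] is None:
            d = 1.0                       # HEOM convention for missing values
        elif kind == "numeric":
            d = abs(x[i] - y[i]) / ranges[i]
        elif kind == "nominal":
            d = 0.0 if x[i] == y[i] else 1.0
        else:                             # "set"
            a, b = set(x[i]), set(y[i])
            union = a | b
            d = (1.0 - len(a & b) / len(union)) if union else 0.0
        total += d * d
    return math.sqrt(total)
```

Since the Jaccard distance also lies in [0, 1], the set term composes with the numeric and nominal terms exactly as in standard HEOM.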
ES2010-44
Kernel generative topographic mapping
Iván Olier, Alfredo Vellido, Jesús Giraldo
Abstract:
A kernel version of Generative Topographic Mapping, a model of the manifold learning family, is defined in this paper. Its ability to adequately model non-i.i.d. data is illustrated in a problem concerning the identification of protein subfamilies from protein sequences.
ES2010-86
On Finding Complementary Clusterings
Timo Pröscholdt, Michel Crucianu
Abstract:
In many cases, a dataset can be clustered following several criteria that complement each other: group membership following one criterion provides little or no information regarding group membership following the other criterion. When these criteria are not known a priori, they have to be determined from the data. We put forward one method for simultaneously finding the complementary criteria and the clustering corresponding to each criterion.
ES2010-92
Consensus clustering by graph based approach
Haytham Elghazel, Khalid Benabdeslem, Fatma Hamdi
Abstract:
In this paper, we propose G-Cons, an extension of a minimal graph coloring paradigm for consensus clustering. Based on the co-association values between data, our approach is a graph partitioning one, yielding a combined partition by maximizing an objective function given by the average mutual information between the consensus partition and all initial clusterings. It exhibits important consensus clustering features (quality and computational complexity) and builds a combined partition that improves the stability and accuracy of the clustering solutions. The proposed approach is evaluated on benchmark databases, and promising results are obtained compared to other consensus clustering techniques.
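The co-association values mentioned above have a standard construction: the fraction of input clusterings that place two items in the same cluster. This can be sketched as follows (the graph-coloring consensus step itself is not shown):

```python
import numpy as np

def co_association(labelings):
    """Co-association matrix from an ensemble of clusterings.
    labelings: shape (n_clusterings, n_items); entry (i, j) of the
    result is the fraction of clusterings putting i and j together."""
    labelings = np.asarray(labelings)
    m, n = labelings.shape
    C = np.zeros((n, n))
    for lab in labelings:
        C += (lab[:, None] == lab[None, :]).astype(float)
    return C / m
```

The consensus method then partitions the graph whose edge weights are these co-association values.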
ES2010-89
Web Document Clustering based on a Hierarchical Self-Organizing Model
Esteban Palomo, Enrique Domínguez, Rafael Luque, José Muñoz
Abstract:
In this work, a hierarchical self-organizing model based on the GHSOM is presented in order to cluster web content. The GHSOM is an artificial neural network that has been widely used for data clustering. The hierarchical architecture of the GHSOM is more flexible than a single SOM, since it adapts to the input data, mirroring the inherent hierarchical relations among them. The adaptation process of the GHSOM architecture is controlled by two parameters. However, these parameters have to be established in advance, and this task is not always easy. In this paper, a one-parameter hierarchical self-organizing model is proposed. This model has been evaluated using the 'BankSearch' benchmark dataset. Experimental results show the good performance of this approach.
ES2010-49
Online speaker diarization with a size-monitored growing neural gas algorithm
Jean-Louis Gutzwiller, Hervé Frezza-Buet, Olivier Pietquin
Abstract:
This paper proposes a method for segmenting and clustering an audio stream on the basis of speaker turns. This process, also known as speaker diarization, is of major importance in multimedia indexing. Here, we propose to perform this process online and without any prior knowledge of the number of speakers. This is done thanks to a statistical modelling of speakers based on a size-monitored growing neural gas algorithm.
ES2010-59
Validation of unsupervised clustering methods for leaf phenotype screening
Andreas Backhaus, Asuka Kuwabara, Andrew Fleming, Udo Seiffert
Abstract:
The assessment of visible differences in leaf shape between plant species or mutants (phenotyping) plays a significant role in plant research. This paper investigates the application of unsupervised data clustering techniques for phenotype screening to find hidden common shape categories. A set of two wildtypes and seven mutations of Arabidopsis acted as a test case. K-Means, NG, GNG, SOM and ART2a were evaluated by classical validity indices and one index derived from the task at hand. K-Means showed the best results and a low agreement between classical validity measures and task constraints was found.
ES2010-132
A Novel Two-Phase SOM Clustering Approach to Discover Visitor Interests in a Website
Ahmad Ammari, Valentina Zharkova
Abstract:
Mining content, structure and usage data in websites can uncover browsing patterns that different groups of Web visitors follow to access the subjects that are truly valuable to them. Many works in the literature have focused on proposing new similarity measures to cluster Web logs and detect segments of browsing behaviors. However, this does not reveal which contents the visitors are interested in, since a Web page may contain many different topics. In this paper, a novel two-phase clustering approach based on Self-Organizing Maps (SOM) is proposed to address this problem. A systematic process to prepare Web content data for clustering is also described.
ES2010-128
Image registration by the extended evolutionary self-organizing map
Everardo Maia, Guilherme Barreto, André Coelho
Abstract:
The Evolutionary Self-Organizing Map (EvSOM) is a recently proposed robust algorithm for topographic map formation through evolutionary strategies [10, 11]. This work extends the EvSOM algorithm to deal with the image registration problem and evaluates its performance when the relationship between the reference image (Ir) and the free image (If) can be approximated by an affine transformation. The aim of this work is to use the neighborhood preservation property of topographic maps to improve the performance of traditional image registration algorithms. The results are compared with two other strategies: iterative closest point (ICP) based image registration and template matching (TM) based image registration. Experimental results using black-and-white retinal images indicate the feasibility of the proposed approach.
ES2010-48
Programmable triangular neighborhood functions of Kohonen Self-Organizing Maps realized in CMOS technology
Rafal Dlugosz, Marta Kolasa, Witold Pedrycz
Abstract:
The paper presents a programmable triangular neighborhood function for application in low-power, transistor-level implementations of Kohonen self-organizing maps (SOMs). Detailed simulations carried out on the software model of such a network show that the triangular function forms a good approximation of the Gaussian function, while being much easier to implement in hardware. The proposed circuit is very flexible and allows for easy adjustment of the slope of the function. It enables the asynchronous and fully parallel operation of all neurons in the network, thus making it very fast. The proposed mechanism can be used in custom-designed networks in either their analog or digital implementation. Due to the simple structure, the energy consumption per single input pattern is low (120 pJ for a map of 16 × 16 neurons).
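The triangular neighborhood function is easy to state in software terms: a linear decay to zero at a programmable radius, in place of the Gaussian it approximates. The SOM update below is a generic software sketch of this idea, not the paper's CMOS realization; names and parameters are illustrative.

```python
import numpy as np

def gaussian_nbh(d, sigma):
    """Classic Gaussian neighborhood of grid distance d."""
    return np.exp(-d**2 / (2 * sigma**2))

def triangular_nbh(d, radius):
    """Triangular neighborhood: linear decay to zero at `radius`.
    Only a subtraction and a comparison, hence cheap in hardware."""
    return np.maximum(0.0, 1.0 - d / radius)

def som_update(weights, grid, x, lr, radius):
    """One SOM update step using the triangular neighborhood."""
    bmu = np.argmin(np.sum((weights - x)**2, axis=1))
    d = np.linalg.norm(grid - grid[bmu], axis=1)   # grid distance to BMU
    h = triangular_nbh(d, radius)
    return weights + lr * h[:, None] * (x - weights)
```

Because every neuron's factor depends only on its own grid distance to the winner, all updates can run in parallel, which is the property the hardware design exploits.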
ES2010-84
Evolution of adaptive center-crossing continuous time recurrent neural networks for biped robot control
Ángel Campo, José Santos
Abstract:
We used simulated evolution to obtain continuous time recurrent neural networks that control the locomotion of simulated bipeds. We also used the definition of center-crossing networks, so that the recurrent network's nodes can reach the areas of maximum sensitivity of their activation functions. Moreover, we incorporated a run-time adaptation of the nodes' biases to obtain this condition. We tested the improvements and possibilities this adaptation adds, focusing on its use for biped robot control.
ES2010-102
Free-energy-based reinforcement learning in a partially observable environment
Makoto Otsuka, Junichiro Yoshimoto, Kenji Doya
Abstract:
Free-energy-based reinforcement learning (FERL) can handle Markov decision processes (MDPs) with high-dimensional state spaces by approximating the state-action value function with the negative equilibrium free energy of a restricted Boltzmann machine (RBM). In this study, we extend the FERL framework to handle partially observable MDPs (POMDPs) by incorporating a recurrent neural network that learns a memory representation sufficient for predicting future observations and rewards. We demonstrate that the proposed method successfully solves POMDPs with high-dimensional observations without any prior knowledge of the environmental hidden states and dynamics. After learning, task structures are implicitly represented in the distributed activation patterns of hidden nodes of the RBM.