Bruges, Belgium, April 26-27-28
Content of the proceedings
Deep and kernel methods: best of two worlds
Randomized Machine Learning approaches: analysis and developments
Classification
Biomedical data analysis in translational research: integration of expert knowledge and interpretable models
Environmental signal processing: new trends and applications
Kernels, graphs and clustering
Regression, robots and biological systems
Processing, Mining and Visualizing Massive Urban Data
Signal and image processing, collaborative filtering
Algorithmic Challenges in Big Data Analytics
Deep learning
Deep and kernel methods: best of two worlds
ES2017-5
Bridging deep and kernel methods
Lluís Belanche, Marta Costa-jussa
ES2017-108
Structure optimization for deep multimodal fusion networks using graph-induced kernels
Dhanesh Ramachandram, Michal Lisicki, Timothy J. Shields, Mohamed R. Amer, Graham W. Taylor
Abstract:
A popular testbed for deep learning has been multimodal recognition of human activity or gesture involving diverse inputs such as video, audio, skeletal pose and depth images. Deep learning architectures have excelled on such problems due to their ability to combine modality representations at different levels of nonlinear feature extraction. However, designing an optimal architecture in which to fuse such learned representations has largely been a non-trivial human engineering effort. We treat fusion structure optimization as a hyper-parameter search and cast it as a discrete optimization problem under the Bayesian optimization framework. We propose a novel graph-induced kernel to compute structural similarities in the search space of tree-structured multimodal architectures and demonstrate its effectiveness using two challenging multimodal human activity recognition datasets.
ES2017-97
Scalable Hybrid Deep Neural Kernel Networks
Siamak Mehrkanoon, Andreas Zell, Johan A. K. Suykens
Abstract:
This paper introduces a novel hybrid deep neural kernel framework. The proposed deep learning model combines a neural-network-based architecture with a kernel-based model. In particular, an explicit feature map based on random Fourier features is used, which makes the transition between the two architectures more straightforward and makes the model scalable to large datasets by solving the optimization problem in the primal. The introduced framework can be considered a first building block for the development of even deeper models and more advanced architectures. Experimental results show a significant improvement over shallow models on several medium- to large-scale real-life datasets.
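The explicit feature map mentioned in this abstract can be illustrated with a minimal sketch of random Fourier features for a Gaussian kernel. This is not the authors' implementation: the function names (`make_rff_map`, `rbf_kernel`, `dot`), the kernel parametrization exp(-gamma * ||x - y||^2) and all numeric values are assumptions chosen for illustration, and only the Python standard library is used.

```python
import math
import random


def make_rff_map(dim, n_features, gamma, seed=0):
    """Return a map z(.) such that z(x).z(y) approximates exp(-gamma * ||x - y||^2)."""
    rng = random.Random(seed)
    # Frequencies are drawn from the spectral density of the Gaussian kernel.
    std = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, std) for _ in range(dim)] for _ in range(n_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def feature_map(x):
        return [
            scale * math.cos(sum(w_j * x_j for w_j, x_j in zip(w, x)) + b)
            for w, b in zip(weights, offsets)
        ]

    return feature_map


def rbf_kernel(x, y, gamma):
    """Exact Gaussian kernel, used here only as a reference value."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Hypothetical inputs: compare the approximate and the exact kernel value.
phi = make_rff_map(dim=3, n_features=4000, gamma=0.5)
x, y = [0.1, 0.2, 0.3], [0.4, -0.1, 0.0]
print(dot(phi(x), phi(y)), rbf_kernel(x, y, 0.5))
```

Because the map is explicit, a linear model trained on `phi(x)` works in the primal, so its cost grows with the number of random features rather than with the number of training points.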
ES2017-120
Learning dot-product polynomials for multiclass problems
Ivano Lauriola, Michele Donini, Fabio Aiolli
Abstract:
Several mechanisms exist in the literature to solve a multiclass classification problem by exploiting a binary kernel machine. Most of them are based on problem decomposition, which consists of splitting the problem into many binary tasks. These tasks have different complexities and require different kernels. Our goal is to use the Multiple Kernel Learning (MKL) paradigm to learn the best dot-product kernel for each decomposed binary task. In this context, we propose an efficient learning procedure to reduce the search space of hyperparameters, and we show its empirical effectiveness.
ES2017-63
Support vector components analysis
Michiel van der Ree, Jos Roerdink, Christophe Phillips, Gaëtan Garraux, Eric Salmon, Marco Wiering
Abstract:
In this paper we propose a novel method for learning a distance metric in the process of training Support Vector Machines (SVMs) with various kernels. A transformation matrix is adapted in such a way that the SVM dual objective of a classification problem is optimized. By using a wide transformation matrix the method can effectively be used as a means of supervised dimensionality reduction. We compare our method with other algorithms on a toy dataset and on PET-scans of patients with various Parkinsonisms, finding that our method either outperforms or performs on par with the other algorithms.
ES2017-37
Algebraic multigrid support vector machines
Ehsan Sadrfaridpour, Sandeep Jeereddy, Ken Kennedy, Andre Luckow, Talayeh Razzaghi, Ilya Safro
Abstract:
The support vector machine is a flexible optimization-based technique widely used for classification problems. In practice, training becomes computationally expensive on large-scale data sets for reasons such as the complexity and number of iterations of the parameter fitting methods, the underlying optimization solvers, and the nonlinearity of kernels. We introduce a fast multilevel framework for solving support vector machine models that is inspired by algebraic multigrid. A significant improvement in running time has been achieved without any loss in quality. The proposed technique is highly beneficial on imbalanced sets. We demonstrate computational results on publicly available and industrial data sets.
ES2017-135
Attention-based Information Fusion using Multi-Encoder-Decoder Recurrent Neural Networks
Stephan Baier, Sigurd Spieckermann, Volker Tresp
Abstract:
With the rising number of interconnected devices and sensors, modeling distributed sensor networks is of increasing interest. Recurrent neural networks (RNNs) are considered particularly well suited for modeling sensory and streaming data. When predicting future behavior, incorporating information from neighboring sensor stations is often beneficial. We propose a new RNN-based architecture for context-specific information fusion across multiple spatially distributed sensor stations. Here, latent representations of multiple local models, each modeling one sensor station, are joined and weighted according to their importance for the prediction. This importance is assessed depending on the current context using a separate attention function. We demonstrate the effectiveness of our model on three different real-world sensor network datasets.
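The weighting step described in this abstract can be sketched as a softmax-weighted sum of per-station latent vectors. The sketch below is an illustration only, not the authors' architecture: the dot-product scoring against a `context` vector, the function names and the toy inputs are all assumptions, since the abstract does not specify the form of the attention function.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(latents, context):
    """Fuse per-station latent vectors into a single vector.

    latents: list of equal-length vectors, one per sensor station.
    context: vector of the same length (ASSUMED dot-product scoring).
    Returns the fused vector and the attention weights.
    """
    scores = [sum(h_k * c_k for h_k, c_k in zip(h, context)) for h in latents]
    weights = softmax(scores)
    fused = [
        sum(w * h[k] for w, h in zip(weights, latents))
        for k in range(len(context))
    ]
    return fused, weights


# Hypothetical latent states of three stations and a context vector.
fused, weights = attention_fuse([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [2.0, 0.0])
print(weights, fused)
```

Stations whose latent state aligns with the context receive larger weights, so their representations dominate the fused vector passed on to the prediction step.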
ES2017-96
Fusion of Stereo Vision for Pedestrian Recognition using Convolutional Neural Networks
Danut Ovidiu Pop, Alexandrina Rogozan, Fawzi Nashashibi, Abdelaziz Bensrhair
Abstract:
Pedestrian detection is a highly debated issue in the scientific world due to its outstanding importance for a large number of applications, especially in the fields of automotive safety, robotics and surveillance. In spite of the widely varying methods developed in recent years, pedestrian detection is still an open challenge whose accuracy and robustness have to be improved. Therefore, in this paper, we focus on improving the classification component of the pedestrian detection task on the Daimler stereo vision data set by adopting two approaches: 1) combining three image modalities (intensity, depth and flow) to feed a single convolutional neural network (CNN) and 2) fusing the results of three independent CNNs.
ES2017-50
Training convolutional networks with weight-wise adaptive learning rates
Alan Mosca, George Magoulas
Abstract:
Current state-of-the-art Deep Learning classification with Convolutional Neural Networks achieves very impressive results, which are, in some cases, close to human-level performance. However, training these networks to their optimal performance requires very long training periods, usually by applying the Stochastic Gradient Descent method. We show that by applying more modern methods, which adapt a different learning rate for each weight rather than using a single global learning rate for the entire network, we are able to reach close to state-of-the-art performance on the same architectures, and improve the training time and accuracy.
ES2017-130
Invariant representations of images for better learning
Muthuvel Murugan Issakkimuthu, Subrahmanyam K. V.
Abstract:
We study the problem of obtaining representations of images that are invariant to rotations of the image, with the aim of improving supervised learning. We show that simple ideas from group representation theory yield invariant representations of images. Off-the-shelf learning algorithms perform much better on such representations.
ES2017-112
Feature Extraction for On-Road Vehicle Detection Based on Support Vector Machine
Samuel Giatti Silva Filho, Roberto Freire, Leandro dos Santos Coelho
Abstract:
Inspired by alarming statistics of deaths and injuries in car accidents, this work presents the development of a vehicle detection method, which is part of an Advanced Driving Assistance System. Computer vision software capable of interpreting real-time events on roads, which identifies vehicles using a Support Vector Machine, is presented and evaluated with two distinct feature extraction techniques (Invariant Features Transform and Histogram of Oriented Gradients). The two techniques are compared, and promising results in terms of vehicle identification accuracy were obtained when a frame scan technique was integrated into the system.
ES2017-38
Predicting Time Series with Space-Time Convolutional and Recurrent Neural Networks
Wolfgang Groß, Sascha Lange, Joschka Bödecker, Manuel Blum
Abstract:
We present a novel approach to predict time series with a deep recurrent and convolutional neural network. In order to apply modern deep learning techniques to financial time series, deep neural networks have to learn problem-specific, spatio-temporal features. In computer vision, convolutional neural networks with their ability to learn useful spatial features have given rise to groundbreaking results, but spatio-temporal patterns, as they arise in multivariate financial time series, pose additional challenges. We demonstrate that the features the model learns are better than hand-crafted features of a professional trader. We also show that our model beats other models at predicting the price development on the European Power Exchange (EPEX).
Randomized Machine Learning approaches: analysis and developments
ES2017-4
Randomized Machine Learning Approaches: Recent Developments and Challenges
Claudio Gallicchio, José D. Martín-Guerrero, Alessio Micheli, Emilio Soria-Olivas
ES2017-69
Fisher memory of linear Wigner echo state networks
Peter Tino
Abstract:
We study asymptotic properties of Fisher memory of linear Echo State Networks with randomized reservoir coupling prescribed by the class of Wigner matrices. Three properties of Fisher memory normalized per state space dimension are derived: (1) as the system size grows, the contribution of self-coupling of self-loops on reservoir units to the Fisher memory is negligible; (2) the maximal Fisher memory is achieved when the input-to-state coupling is collinear with the dominant eigenvector of the state space coupling matrix; and (3) when the input-to-state coupling is collinear with the sum of eigenvectors of the state space coupling, the expected normalized memory is four times smaller than the maximal memory value.
ES2017-53
Generalization Performances of Randomized Classifiers and Algorithms built on Data Dependent Distributions
Luca Oneto, Sandro Ridella, Davide Anguita
Abstract:
In this paper we prove that a randomized algorithm based on the data generating dependent prior and data dependent posterior Boltzmann distributions of Catoni (2007) is Differentially Private (DP) and shows better generalization properties than the Gibbs (randomized) classifier associated with the same distributions. For this purpose, we develop sharper DP-based generalization bounds, which improve over the current state-of-the-art Hoeffding-type bound.
ES2017-27
ELM Preference Learning for Physiological Data
Davide Bacciu, Michele Colombo, Davide Morelli, David Plans
Abstract:
The work compares two approaches to preference learning with Extreme Learning Machine networks, relying on limited and subject-dependent information concerning pairwise relations between data samples. We describe an application within the context of assessing the effect of breathing exercises on heart-rate variability, using a dataset of over 19K exercising sessions. Results highlight the importance of using weight-sharing architectures to learn smooth and generalizable complete orders induced by the preference relation.
ES2017-92
Advanced query strategies for Active Learning with Extreme Learning Machines
Anton Akusok, Emil Eirola, Yoan Miché, Andrey Gritsenko, Amaury Lendasse
Abstract:
This work addresses an important part of solving applied problems: data acquisition. Often raw data is cheap while labeling is an expensive manual job. Active Learning reduces the labeling effort by suggesting particular samples with a query strategy. The paper proposes three new query strategies built on recent developments in extreme learning machines (ELM): one based on a committee of class-weighted ELMs, one based on prediction intervals found with ELM, and one based on mislabeled sample detection with ELM. The proposed strategies perform at the state-of-the-art level on three real-world datasets.
ES2017-43
Random projection initialization for deep neural networks
Piotr Iwo Wójcik, Marcin Kurdziel
Abstract:
In this work we propose to initialize rectifier neural networks with random projection matrices. We focus on Convolutional Neural Networks and fully-connected networks with pretraining. Our results show that in convolutional networks a well-designed random projection initialization can perform better than the current state-of-the-art He initialization. Specifically, in our evaluation, initialization based on the Subsampled Randomized Hadamard Transform consistently outperformed He initialization on several image classification datasets.
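As an illustration of the random projection named in this abstract, the following minimal sketch builds a Subsampled Randomized Hadamard Transform matrix P = (1 / sqrt(k)) R H D, where D is a random sign diagonal, H is a Hadamard matrix and R selects k rows. The function names and the scaling convention are assumptions made for this sketch; how the authors map such a matrix onto layer weights is not stated in the abstract and is not shown here.

```python
import math
import random


def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def srht_matrix(k, n, seed=0):
    """Return a k x n SRHT matrix P = (1 / sqrt(k)) * R * H * D."""
    rng = random.Random(seed)
    signs = [rng.choice((-1, 1)) for _ in range(n)]  # diagonal of D
    rows = rng.sample(range(n), k)                   # R: k distinct rows of H
    h = hadamard(n)
    scale = 1.0 / math.sqrt(k)
    return [[scale * h[i][j] * signs[j] for j in range(n)] for i in rows]


# Hypothetical sizes: project 8 inputs down to 4 outputs.
projection = srht_matrix(k=4, n=8)
for row in projection:
    print(row)
```

With this scaling every entry has magnitude 1 / sqrt(k) and distinct rows are orthogonal, so the projection preserves squared norms in expectation.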
Classification
ES2017-34
Fine-grained event learning of human-object interaction with LSTM-CRF
Tuan Do, James Pustejovsky
Abstract:
Event learning is one of the most important problems in AI. However, notwithstanding significant research efforts, it is still a very complex task, especially when the events involve the interaction of humans or agents with other objects, as it requires modeling human kinematics and object movements. This study proposes a methodology for learning complex human-object interaction (HOI) events, involving the recording, annotation and classification of event interactions. For annotation, we allow multiple interpretations of a motion capture by slicing over its temporal span; for classification, we use Long Short-Term Memory (LSTM) sequence models with a Conditional Random Field (CRF) to constrain the outputs. Using a setup involving captures of human-object interaction as three-dimensional inputs, we argue that this approach could be used for event types involving complex spatio-temporal dynamics.
ES2017-8
Distance metric learning: a two-phase approach
Bac Nguyen, Carlos Morell, Bernard De Baets
Abstract:
Distance metric learning has been successfully incorporated in many machine learning applications. The main challenge arises from the positive semidefiniteness constraint on the Mahalanobis matrix, which results in a high computational cost. In this paper, we develop a novel approach to reduce this computational burden. We first map each training example into a new space by an orthonormal transformation. Then, in the transformed space, we simply learn a diagonal matrix. This two-phase approach is thus much easier and less costly than learning a full Mahalanobis matrix in one phase as is commonly done.
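The parametrization described in this abstract can be sketched as follows: an orthonormal map B followed by non-negative diagonal weights w gives the same distance as the full matrix M = B^T diag(w) B, which is positive semidefinite by construction. The sketch only evaluates such a metric. The function names, the rotation used as basis and the weights are hypothetical, and the way the authors obtain the transformation and learn the diagonal is not given in the abstract.

```python
import math


def transform(basis, x):
    """Phase 1: map x into the new space (rows of `basis` are orthonormal)."""
    return [sum(b_j * x_j for b_j, x_j in zip(row, x)) for row in basis]


def diag_distance(weights, u, v):
    """Phase 2: weighted Euclidean distance with non-negative diagonal weights."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative to keep the metric PSD")
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, u, v)))


def full_mahalanobis(basis, weights, x, y):
    """Reference: the same distance through the full matrix M = B^T diag(w) B."""
    n = len(x)
    m = [
        [sum(basis[k][i] * weights[k] * basis[k][j] for k in range(len(basis)))
         for j in range(n)]
        for i in range(n)
    ]
    d = [a - b for a, b in zip(x, y)]
    return math.sqrt(sum(d[i] * m[i][j] * d[j] for i in range(n) for j in range(n)))


# Hypothetical example: a 2-D rotation as the orthonormal map, two weights.
theta = math.pi / 6
basis = [[math.cos(theta), math.sin(theta)], [-math.sin(theta), math.cos(theta)]]
weights = [2.0, 0.5]
x, y = [1.0, 2.0], [-0.5, 0.3]
print(diag_distance(weights, transform(basis, x), transform(basis, y)))
print(full_mahalanobis(basis, weights, x, y))
```

Only the entries of `weights` need a sign constraint, which is much cheaper to enforce than positive semidefiniteness of a full matrix.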
ES2017-57
An EM transfer learning algorithm with applications in bionic hand prostheses
Benjamin Paassen, Alexander Schulz, Janne Hahne, Barbara Hammer
Abstract:
Modern bionic hand prostheses feature unprecedented functionality, permitting simultaneous motion in multiple degrees of freedom. An intuitive user interface based on muscle signals requires machine learning models. However, current models are not yet sufficiently robust to everyday disturbances, such as electrode shifts. We propose a novel expectation maximization approach for transfer learning to rapidly recalibrate a machine learning model if disturbances occur. In our experimental evaluation we show that even under conditions of incomplete class coverage and few data points our approach finds a viable transfer mapping which improves classification accuracy significantly.
ES2017-54
Dropout Prediction at University of Genoa: a Privacy Preserving Data Driven Approach
Luca Oneto, Anna Siri, Gianvittorio Luria, Davide Anguita
Abstract:
Nowadays many educational institutions crucially need to understand the dynamics underlying the university dropout (UD) phenomenon. However, the most informative educational data are personal and subject to strict privacy constraints. The challenge is therefore to develop a data-driven system which accurately predicts student dropouts while preserving the privacy of individual data instances. In the present paper we investigate this problem, making use of data collected at the University of Genoa as a case study.
ES2017-70
Physical activity recognition from sub-bandage sensors using both feature selection and extraction
Thiago Turchetti Maia, Fabio Di Francesco, Valentina Dini, Beatrice Lazzerini, Marco Romanelli, Pietro Salvo
Abstract:
In this paper, we present a neural network-based approach to classify the activities performed by 40 subjects by analyzing sub-bandage pressure signals. The approach includes an input dimensionality reduction obtained employing both feature extraction and feature selection techniques. The results show that our model is able to classify the activities performed with 98.12% accuracy.
ES2017-11
A multi-criteria meta-learning method to select under-sampling algorithms for imbalanced datasets
Romero Morais, Péricles Miranda, Ricardo Silva
Abstract:
Standard classifiers assume a balanced class distribution in the data, so imbalanced datasets may hinder the learning process. Sampling techniques balance the data by adjusting the class distribution of the examples. However, selecting an appropriate sampling technique and its parameters for a given imbalanced dataset is still an open problem. This work proposes a method that uses Meta-Learning to recommend a technique for an imbalanced dataset considering multiple performance criteria. The experiments showed that the proposal reached results comparable to those of the brute-force approach, outperformed the techniques with their default parameters most of the time, and always surpassed the random search approach.
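As a rough illustration of what a sampling technique does to the class distribution, the following Python sketch implements plain random under-sampling. It is not the meta-learning recommender proposed in the paper; the function name `random_under_sample` and the toy data are assumptions made for this example.

```python
import random
from collections import defaultdict

def random_under_sample(X, y, seed=0):
    """Randomly drop examples so every class keeps as many as the rarest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(y):
        by_class[label].append(idx)
    # Size of the smallest class determines how many examples each class keeps.
    n_min = min(len(indices) for indices in by_class.values())
    keep = []
    for indices in by_class.values():
        keep.extend(rng.sample(indices, n_min))
    keep.sort()  # preserve the original ordering of the retained examples
    return [X[i] for i in keep], [y[i] for i in keep]

# Toy imbalanced dataset (hypothetical): 8 examples of class 0, 2 of class 1.
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2
X_bal, y_bal = random_under_sample(X, y)
counts = {label: y_bal.count(label) for label in set(y_bal)}
```

After the call, both classes are represented by the same number of examples, at the cost of discarding majority-class data.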
ES2017-71
Large-scale nonlinear dimensionality reduction for network intrusion detection
Yasir Hamid, Ludovic Journaux, John Aldo Lee, Lucile Sautot, Nabi Bushra, M. Sugumaran
Abstract:
Network intrusion detection (NID) is a complex classification problem. In this paper, we combine classification with recent and scalable nonlinear dimensionality reduction (NLDR) methods. Classification and DR are not necessarily adversarial, provided that adequate cluster magnification occurs, as in NLDR methods like t-SNE: DR mitigates the curse of dimensionality, while cluster magnification can maintain class separability. We demonstrate experimentally the effectiveness of the approach by analyzing and comparing results on the big KDD99 dataset, using both NLDR quality assessment and classification rate for SVMs and random forests. Since the data involve features of mixed types (numerical and categorical), the use of Gower’s similarity coefficient as metric further improves the results over the classical similarity metric.
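To make the role of Gower's similarity coefficient for mixed-type records more concrete, here is a minimal Python sketch of its basic form: numerical features contribute one minus the range-normalised absolute difference, categorical features contribute 1 on a match and 0 otherwise, and the scores are averaged. The function name, the toy records and the feature ranges are assumptions for illustration; feature weights and missing-value handling are omitted, and this is not the authors' implementation.

```python
def gower_similarity(a, b, is_categorical, ranges):
    """Basic Gower similarity between two records with mixed feature types.

    is_categorical[i] marks feature i as categorical; ranges[i] is the value
    range (max - min over the dataset) of numerical feature i and is ignored
    for categorical features.
    """
    scores = []
    for x, y, categorical, value_range in zip(a, b, is_categorical, ranges):
        if categorical:
            scores.append(1.0 if x == y else 0.0)
        elif value_range > 0:
            scores.append(1.0 - abs(x - y) / value_range)
        else:
            scores.append(1.0)  # constant feature: all records agree
    return sum(scores) / len(scores)

# Hypothetical connection records: (duration, protocol, error rate).
a = (10.0, "tcp", 0.5)
b = (20.0, "tcp", 0.5)
is_categorical = (False, True, False)
ranges = (100.0, None, 1.0)
sim = gower_similarity(a, b, is_categorical, ranges)
```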
ES2017-117
Acceleration of Prototype Based Models with Cascade Computation
Cem Karaoguz, Alexander Gepperth
Abstract:
Prototype-based generative description of data space has been shown to be effective in incremental learning. However, computing the similarities of input vectors to prototypes may be demanding, especially with high input dimensionality and a large number of prototypes. The main contribution of the paper is the acceleration of the prototype-based model by a cascade computation approach. The evaluation of the presented architecture on a human detection and pose estimation problem shows that the cascade computation results in a significant reduction of computational resource requirements at the expense of minor degradations in the classification performance.
ES2017-42
Automatic crime report classification through a weightless neural network
Rafael Adnet Pinho, Walkir Brito, Claudia Motta, Priscila Lima
Abstract:
Anonymous crime reporting is a tool that helps to reduce and prevent crime occurrences. The classification of the crime reports received by the call center is necessary for the data organization and also to stipulate the importance of a particular report and its relation to others. The objective of this work is to develop a system that assists the call center's operator by recommending classifications for new reports. The system uses a weightless neural network that automatically attributes a class to a report. At the end of this work it was possible to observe that automatic classifications of crime reports with high accuracy are possible using a weightless neural network.
ES2017-123
Efficient Neural-based patent document segmentation with Term Order Probabilities
Danilo Silva de Carvalho, Minh-Le Nguyen
Abstract:
The internationally growing trend of patent applications puts great pressure on the agents involved in managing this kind of information and creates a demand for efficient and effective patent analysis methods. This work presents a computationally efficient approach for patent document segmentation based on structured ANNs and a simple distributional semantics composition method. The conducted experiments indicate the effectiveness of the approach, which benefits a wide array of patent processing techniques that work upon structured inputs.
Biomedical data analysis in translational research: integration of expert knowledge and interpretable models
ES2017-2
Biomedical data analysis in translational research: integration of expert knowledge and interpretable models
Gyan Bhanot, Michael Biehl, Thomas Villmann, Dietlind Zühlke
ES2017-67
Feature Relevance Bounds for Linear Classification
Christina Göpfert, Lukas Pfannschmidt, Barbara Hammer
Abstract:
Biomedical applications often aim to identify relevant features for a given classification task, since these carry the promise of semantic insight into the underlying process. For correlated input dimensions, feature relevances are not unique, and the identification of meaningful subtle biomarkers remains a challenge. One approach is to identify intervals for the possible relevance of given features, a problem related to all-relevant feature determination. In this contribution, we address the important case of linear classifiers and reformulate the problem of inferring feature relevance bounds as a convex optimization problem. We demonstrate the superiority of the resulting technique in comparison to popular feature-relevance determination methods in several benchmarks.
ES2017-86
Prediction of preterm infant mortality with Gaussian process classification
Olli-Pekka Rinta-Koski, Simo Särkkä, Jaakko Hollmén, Markus Leskinen, Sture Andersson
Abstract:
We present a method for predicting preterm infant in-hospital mortality using Bayesian Gaussian process classification. We combined features extracted from sensor measurements, made during the first 24 hours of care for 581 Very Low Birth Weight infants, with standard clinical features calculated on arrival at the Neonatal Intensive Care Unit. We achieved a classification result with an area under the curve of 0.94 (standard error 0.02), which exceeds the results achieved by using the clinical standard SNAP-II and SNAPPE-II scores.
ES2017-94
Comparison of strategies to learn from imbalanced classes for computer aided diagnosis of inborn steroidogenic disorders
Sreejita Ghosh, Elizabeth Sarah Baranowski, Rick van Veen, Gert-Jan de Vries, Michael Biehl, Wiebke Arlt, Peter Tino, Kerstin Bunte
Environmental signal processing: new trends and applications
ES2017-1
Environmental signal processing: new trends and applications
Matthieu Puigt, Gilles Delmaire, Gilles Roussel
ES2017-103
Solving Inverse Source Problems for Sources with Arbitrary Shapes using Sensor Networks
John Murray-Bruce, Pier Luigi Dragotti
Abstract:
Recently, the use of wireless sensor networks for environmental monitoring has been a topic of intensive research. The sensor nodes obtain spatiotemporal samples of physical fields over the region of interest. For most cases these fields are driven by well-known partial differential equations (the diffusion and wave equations, for example), and this prior knowledge can be used to solve such physics-driven inverse source problems (ISPs). In this work, we demonstrate how to estimate the unknown source shape inducing the field by assuming that it can be described by a model having a finite number of unknown parameters.
ES2017-83
Non-negative decomposition of geophysical dynamics
Manuel Lopez-Radcenco, Abdeldjalil Aïssa-El-Bey, Pierre Ailliot, Ronan Fablet
Abstract:
The decomposition of geophysical processes into relevant modes is a key issue for characterization, forecasting and reconstruction problems. The blind separation of contributions from different sources is a well-studied problem in signal and image processing. Recently, significant advances have been reported with the introduction of non-negative and sparse formulations. In this work, we address an extension to the blind decomposition of linear operators or transfer functions between variables of interest with an emphasis on a non-negative setting. As illustrated here, such decompositions are of key interest for the analysis of geophysical dynamics and the relationships between different geophysical variables.
ES2017-74
Impact of the initialisation of a blind unmixing method dealing with intra-class variability
Charlotte REVEL, Yannick Deville, Véronique ACHARD, Xavier BRIOTTET
Abstract:
In hyperspectral imagery, unmixing methods are often used to analyse the composition of the pixels. Such methods usually suppose that a single spectral signature, called an endmember, can be associated with each pure material present in the scene. Such an assumption is no longer valid for materials that exhibit spectral variability due to illumination conditions, weathering, slight variations of the composition, etc. In this paper, we investigate a new method, based on the assumption of a linear mixing model, that deals with intra-class spectral variability. A new formulation of the linear mixing is provided. In our model a pure material cannot be described by a single spectrum in the image, but it can be in a pixel. A method is presented to handle this new model. It is based on a pixel-by-pixel Nonnegative Matrix Factorization (NMF) method. The method is tested on a semi-synthetic data set built with spectra extracted from a real hyperspectral image and mixtures of these spectra. We particularly focused our tests on studying the impact of the initialisation of our method.
ES2017-136
Application of Tensor and Matrix Completion on Environmental Sensing Data
Michalis Giannopoulos, Sofia Savvaki, Grigorios Tsagkatakis, Panagiotis Tsakalides
Abstract:
As environmental resources utilization becomes more and more crucial, Wireless Sensor Networks (WSNs) are introduced in order to capture the variation of diverse parameters. However, limitations such as network connectivity, power consumption, and storage capacity lead to missing measurements from such networked sensors. To address this problem, we investigate the potential of recovering high-dimensional environmental signals from small sets of observations. To account for the dimensionality of the data, we invoke tensor modelling and we propose a low-rank tensor recovery formulation. Experimental results using real WSN data from an indoor industrial environment as well as from an outdoor natural environment demonstrate that the estimation of missing measurements is much better addressed when structural information is considered.
ES2017-131
Indoor air pollutant sources using Blind Source Separation Methods
Rachid OUARET, Anda IONESCU, Olivier RAMALHO, Yves CANDAU
Abstract:
The objective of this study is to separate the different sources of variability in time series of particulate matter (PM) concentrations monitored in real indoor environments. Different blind source separation (BSS) methods (ICA, PMF, NMF) were applied in order to identify the PM sources and their contributions. The source profiles were characterized by their autocorrelation functions (ACF), which were compared to the ACFs of other variables. Their interpretation was completed by the analysis of polar plots including exogenous factors. Source contributions were also quantified.
ES2017-149
High dimensionality voltammetric biosensor data processed with artificial neural networks
Andreu González-Calabuig, Georgina Faura, Manel del Valle
Abstract:
This work reports the coupling of an array of voltammetric sensors with artificial neural networks (ANN), usually named Electronic Tongue, for the simultaneous quantification of the amino acids tryptophan, tyrosine and cysteine. The obtained signals were compressed using the fast Fourier transform (FFT), and the ANN model was then constructed from a set of low-frequency components. An ANN predictive model was obtained by back-propagation; it had 160 input neurons and one hidden layer with 7 neurons, used purelin and satlins functions in the hidden and output layers respectively, and was trained with a factorial design scheme. The model attained a total normalized root mean square error of 0.032 for an independent test set of data (n=15).
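The FFT compression step described in the abstract can be sketched in Python with NumPy as follows: a real-valued signal is transformed, only the lowest-frequency coefficients are kept, and their real and imaginary parts form the feature vector fed to a model. The function names, the synthetic signal and the number of retained coefficients are assumptions for illustration and do not reproduce the authors' pipeline or their 160 inputs.

```python
import numpy as np

def compress_fft(signal, n_keep):
    """Return the n_keep lowest-frequency FFT coefficients of a real signal."""
    return np.fft.rfft(signal)[:n_keep]

def reconstruct(coeffs, n_samples):
    """Rebuild a signal of length n_samples from truncated coefficients."""
    full = np.zeros(n_samples // 2 + 1, dtype=complex)
    full[:len(coeffs)] = coeffs  # discarded high frequencies stay at zero
    return np.fft.irfft(full, n=n_samples)

# Synthetic stand-in for a voltammogram: two low-frequency components.
t = np.linspace(0.0, 1.0, 256, endpoint=False)
signal = np.sin(2 * np.pi * 3 * t) + 0.5 * np.cos(2 * np.pi * 5 * t)

low = compress_fft(signal, n_keep=8)
features = np.concatenate([low.real, low.imag])  # candidate model inputs
approx = reconstruct(low, len(signal))
```

Here 256 samples are reduced to 16 real-valued features; because the synthetic signal contains only low frequencies, the reconstruction from the retained coefficients matches it closely.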
Kernels, graphs and clustering
ES2017-116
Learning sparse models of diffusive graph signals
Shuyu Dong, Dorina Thanou, Pierre-Antoine Absil, Pascal Frossard
Abstract:
Graph signals that describe data living on irregularly structured domains provide a generic representation for structured information in very diverse applications. The effective analysis and processing of such signals, however, necessitate good models that identify the most relevant signal components. In this paper, we propose to learn sparse representation models for graph signals that describe heat diffusion processes. This consists in learning a dictionary that incorporates spectral properties of an implicit graph diffusion kernel. The underlying formulation enables the identification of both sparse features and an adaptive graph structure from mere signal observations. Experiments on synthetic and real datasets show that the proposed dictionaries not only reflect the underlying diffusion process but also significantly reduce over-fitting of data in comparison to state-of-the-art methods.
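For readers unfamiliar with graph diffusion kernels, the following Python sketch computes a standard heat kernel exp(-tau L) for a known graph from the eigendecomposition of its combinatorial Laplacian L = D - W. This only illustrates the kind of kernel whose spectral properties such a dictionary could encode; the paper learns the graph and the sparse codes from data, which is not shown here. The function name, the small path graph and the value of `tau` are assumptions for the example.

```python
import numpy as np

def heat_kernel(W, tau):
    """Heat diffusion kernel exp(-tau * L) for a symmetric weight matrix W."""
    W = np.asarray(W, dtype=float)
    L = np.diag(W.sum(axis=1)) - W          # combinatorial graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)    # L is symmetric, so eigh applies
    # V diag(exp(-tau * lambda)) V^T, scaling each eigenvector column.
    return (eigvecs * np.exp(-tau * eigvals)) @ eigvecs.T

# Path graph on four nodes: 0 - 1 - 2 - 3.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
H = heat_kernel(W, tau=0.5)

# Diffusing a unit of heat placed on node 0 spreads it over the graph.
delta = np.array([1.0, 0.0, 0.0, 0.0])
diffused = H @ delta
```

With this Laplacian the total amount of heat is conserved, so the entries of `diffused` sum to one.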
ES2017-127
The Conjunctive Disjunctive Node Kernel
Dinh Tran Van, Alessandro Sperduti, Fabrizio Costa
Abstract:
Gene-disease associations are inferred on the basis of similarities between genes. Biological relationships that are exploited to define similarities range from interacting proteins and proteins that participate in pathways to gene expression profiles. Though graph kernel methods have become a prominent approach for association prediction, most solutions are based on a notion of information diffusion that does not capture the specificity of different network parts. Here we propose a graph kernel method that explicitly models the configuration of each gene’s context. An empirical evaluation on several biological databases shows that our proposal is competitive w.r.t. state-of-the-art kernel approaches.
ES2017-66
POKer: a Partial Order Kernel for Comparing Strings with Alternative Substrings
Maryam Abdollahyan, Fabrizio Smeraldi
Abstract:
We introduce a Partial Order Kernel (POKer) on the weighted sum of local alignment scores that can be used for comparison and classification of strings containing alternative substrings of variable length. POKer is defined over the product of two directed acyclic graphs, each representing a string with alternative substrings, and is computed efficiently using dynamic programming. We evaluate the performance of POKer with Support Vector Machines on a dataset of strings generated by detecting overlapping motifs in a set of simulated DNA sequences. Compared to a generalization of a state-of-the-art string kernel, POKer achieves a higher classification accuracy.
ES2017-41
Accelerating stochastic kernel SOM
Jérôme Mariette, Fabrice Rossi, Madalina Olteanu, Nathalie Villa-Vialaneix
Abstract:
Analyzing non-vectorial data has become a common trend in a number of real-life applications. Various prototype-based methods have been extended to answer this need by means of kernelization, which embeds the data into an (implicit) Euclidean space. One drawback of those approaches is their complexity, which is commonly on the order of the square or the cube of the number of observations. In this paper, we propose an efficient method to reduce the complexity of the stochastic kernel SOM. The results are illustrated on large datasets and compared to the standard kernel SOM. The approach has been implemented in the latest version (1.2) of the R package SOMbrero.
ES2017-49
Viral initialization for spectral clustering
Vahan Petrosyan, Alexandre Proutiere
Abstract:
Spectral Clustering is one of the most widely used clustering algorithms. To find k clusters, it runs the K-means algorithm on the top k eigenvectors of a Laplacian matrix constructed from the data. As a consequence, it inherits the initialization issues of K-means. In this paper, we propose Viral Initialization (VI), a novel initialization procedure implemented in the Spectral Clustering algorithm before K-means is applied. VI is designed so that the resulting clusterings exhibit low normalized cut (Ncuts) values. This design principle is aligned with the recent observation that "good" clusterings have low Ncuts values. We show, through extensive numerical experiments, that the Spectral Clustering algorithm with VI consistently outperforms other state-of-the-art clustering techniques.
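The normalized cut mentioned in the abstract can be computed directly for any given partition of a weighted graph. The following is a minimal sketch, not the authors' code: the function name `ncut`, the Shi-Malik convention (no factor 1/2), the dense nested-list weight matrix and the toy graph are all assumptions made for illustration. It assumes every cluster has non-zero volume.

```python
def ncut(weights, labels):
    """Normalized cut of a partition: sum over clusters of cut(A) / vol(A)."""
    n = len(weights)
    total = 0.0
    for cluster in set(labels):
        cut = 0.0  # total weight of edges leaving the cluster
        vol = 0.0  # sum of the degrees of the nodes in the cluster
        for i in range(n):
            if labels[i] != cluster:
                continue
            for j in range(n):
                vol += weights[i][j]
                if labels[j] != cluster:
                    cut += weights[i][j]
        total += cut / vol
    return total


# Hypothetical toy graph: two triangles joined by one weak edge (nodes 2 and 3).
W = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0.1, 0, 0],
    [0, 0, 0.1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
good = ncut(W, [0, 0, 0, 1, 1, 1])  # splits along the weak edge
bad = ncut(W, [0, 1, 0, 1, 0, 1])   # cuts through both triangles
```

On this toy graph the partition that separates the two triangles has a much lower Ncut value than the interleaved one, which is the sense in which low Ncut values indicate "good" clusterings.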
ES2017-134
Approximated Neighbours MinHash Graph Node Kernel
Nicolò Navarin, Alessandro Sperduti
Abstract:
In this paper, we propose a scalable kernel for nodes in a (huge) graph. In contrast with other state-of-the-art kernels that scale more than quadratically in the number of nodes, our approach scales linearly in the average out-degree and quadratically in the number of nodes (for the Gram matrix computation). The kernel presented in this paper considers neighbours as sets, and thus ignores edge weights. Nevertheless, experiments on real-world datasets show promising results.
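To illustrate the general idea of comparing neighbour sets with MinHash, the sketch below estimates the Jaccard similarity of two sets of integer node identifiers. It is a generic MinHash example under stated assumptions, not the kernel proposed in the paper: the helper names, the linear hash family modulo a Mersenne prime and the number of hash functions are all illustrative choices.

```python
import random

PRIME = (1 << 61) - 1  # Mersenne prime used as the hash modulus (assumption)


def make_hashes(num_hashes, seed=0):
    """Draw (a, b) pairs defining hash functions h(x) = (a * x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(PRIME)) for _ in range(num_hashes)]


def signature(neighbours, hashes):
    """MinHash signature of a non-empty set of non-negative integer node ids."""
    return [min((a * x + b) % PRIME for x in neighbours) for a, b in hashes]


def estimated_jaccard(sig_u, sig_v):
    """Fraction of signature positions on which the two signatures agree."""
    matches = sum(1 for s, t in zip(sig_u, sig_v) if s == t)
    return matches / len(sig_u)


hashes = make_hashes(128)
sig_u = signature({1, 2, 3, 4}, hashes)
sig_v = signature({3, 4, 5, 6}, hashes)
similarity = estimated_jaccard(sig_u, sig_v)  # exact Jaccard similarity is 2/6
```

Each signature costs time linear in the size of the neighbour set, which is consistent with the linear dependence on the out-degree stated in the abstract; comparing all pairs of signatures remains quadratic in the number of nodes.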
ES2017-140
Fast hyperparameter selection for graph kernels via subsampling and multiple kernel learning
Michele Donini, Nicolò Navarin, Ivano Lauriola, Fabio Aiolli, Fabrizio Costa
Abstract:
Model selection is one of the most computationally expensive tasks in a machine learning application. When dealing with kernel methods for structures, the choice with the largest impact on the overall performance is the selection of the feature bias, i.e. the choice of the concrete kernel for structures. Each kernel in turn exposes several hyper-parameters which also need to be fine-tuned. Multiple Kernel Learning offers a way to approach this computational bottleneck by generating a combination of different kernels under different parametric settings. However, this solution still requires the computation of many large kernel matrices. In this paper we propose a method to efficiently select a small number of kernels on a subset of the original data, gaining a dramatic reduction in the runtime without a significant loss of predictive performance.
ES2017-24
A Simple Cluster Validation Index with Maximal Coverage
Susanne Jauhiainen, Tommi Karkkainen
Abstract:
Clustering is an unsupervised technique to detect general, distinct profiles from a given dataset. Just as there are many different clustering methods and algorithms, there exist many cluster validation methods and indices to suggest the number of clusters. The purpose of this paper is, firstly, to propose a new, simple internal cluster validation index. The index has maximal coverage: even a single cluster, i.e., the lack of division of a dataset into disjoint subsets, can be detected. Secondly, the proposed index is compared to the available indices from five different packages implemented in R or Matlab to assess its utilizability. The comparison also suggests many interesting findings in the available implementations of the existing indices. The experiments and the comparison support the viability of the proposed cluster validation index.
ES2017-17
The Top 10 Topics in Machine Learning Revisited: A Quantitative Meta-Study
Patrick Glauner, Manxing Du, Victor Paraschiv, Andrey Boytsov, Isabel Lopez Andrade, Jorge Augusto Meira, Petko Valtchev, Radu State
Abstract:
Which topics of machine learning are most commonly addressed in research? This question was initially answered in 2007 by doing a qualitative survey among distinguished researchers. In our study, we revisit this question from a quantitative perspective. Concretely, we collect 54K abstracts of papers published between 2007 and 2016 in leading machine learning journals and conferences. We then use machine learning in order to determine the top 10 topics in machine learning. We not only include models, but provide a holistic view across optimization, data, features, etc. This quantitative approach allows reducing the bias of surveys. It reveals new and up-to-date insights into what the 10 most prolific topics in machine learning research are. This allows researchers to identify popular topics as well as new and rising topics for their research.
Regression, robots and biological systems
ES2017-77
Piecewise-Bézier C1 smoothing on manifolds with application to wind field estimation
Pierre-Yves Gousenbourger, Estelle Massart, Antoni Musolas, Pierre-Antoine Absil, Julien M. Hendrickx, Laurent Jacques, Youssef Marzouk
Abstract:
We propose an algorithm for fitting C1 piecewise-Bézier curves to (possibly corrupted) data points on manifolds. The curve is chosen as a compromise between proximity to data points and regularity. We apply our algorithm as an example to fit a curve to a set of low-rank covariance matrices, a task arising in wind field modeling. We show that our algorithm has denoising abilities for this application.
ES2017-95
Reducing variance due to importance weighting in covariate shift bias correction
Van-Tinh Tran, Alex Aussem
Abstract:
Covariate shift is a problem in machine learning that arises when the input distributions of training and test data are different (p(x) ≠ p′(x)) while their conditional distribution p(y|x) is the same. A common technique to deal with this problem, called importance weighting, amounts to reweighting the training instances in order to make them resemble the test distribution. However, this usually comes at the expense of a reduction of the effective sample size, which is harmful when the initial training sample size is already small. In this paper, we show that there exists a weighting scheme on the unlabeled data such that the combination of the weighted unlabeled data and the labeled training data mimics the test distribution. We further prove that the labels are missing at random in this combined data set and thus can be imputed safely. Imputing the missing labels mitigates the undesirable sample-size-reduction effect of importance weighting. A series of experiments on synthetic and real-world data are conducted to demonstrate the efficiency of our approach.
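The loss of effective sample size caused by importance weighting can be made concrete with a small numerical sketch. It assumes, purely for illustration, that both input densities are known one-dimensional Gaussians and that the effective sample size is measured with Kish's formula (sum of weights squared divided by the sum of squared weights); neither assumption is taken from the paper, and the function names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weights(xs, train, test):
    """w(x) = p'(x) / p(x); train and test are (mean, std) of p and p'."""
    return [gaussian_pdf(x, *test) / gaussian_pdf(x, *train) for x in xs]


def effective_sample_size(weights):
    """Kish's effective sample size: (sum w)^2 / sum w^2."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


xs = [-2.0, -1.0, 0.0, 1.0, 2.0]  # hypothetical training inputs
no_shift = importance_weights(xs, train=(0.0, 1.0), test=(0.0, 1.0))
shifted = importance_weights(xs, train=(0.0, 1.0), test=(1.0, 1.0))
ess_no_shift = effective_sample_size(no_shift)
ess_shifted = effective_sample_size(shifted)
```

Without shift all weights equal one and the effective sample size equals the number of training points; once the test distribution moves, a few points carry most of the weight and the effective sample size drops.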
ES2017-47
Complex activity patterns generated by short-term synaptic plasticity
Bulcsu Sandor, Claudius Gros
Abstract:
Short-term synaptic plasticity (STSP) affects the efficiency of synaptic transmission for persistent presynaptic activities. We consider attractor neural networks, for which the attractors are given, in the absence of STSP, by cell assemblies of excitatory cliques. We show that STSP may transform these attracting states into attractor relics, inducing ongoing transient-state dynamics in terms of sequences of transiently activated cell assemblies, the former attractors. Subsequent cell assemblies may be either disjoint or partially overlapping. It may hence be possible to use the resulting dynamics for the generation of motor control sequences.
ES2017-89
Criticality in Biocomputation
Tjeerd olde Scheper
Abstract:
Complexity in biological computation is one of the recognised means by which biological systems manage to function in a complex chaotic world. The ability to function and solve problems irrespective of scale and relative complexity, including higher-order interactions, is essential to the efficacy of biological systems. However, it has been unclear how the required complexity can be introduced to allow these functions to be realised. Nonlinear local interactions are required to combine into a globally stable system. The property of criticality, which is exhibited by many nonlinear physical systems, can be exploited to allow local nonlinear oscillators to interact, resulting in a globally stable system. This concept introduces robustness as well as a means to control global stability.
ES2017-65
Scholar Performance Prediction using Boosted Regression Trees Techniques
Bernardo Stearns, Fabio Rangel, Flavio Rangel, Fabrício Faria, Jonice Oliveira
Abstract:
The possibility of predicting a student's performance based only on their socioeconomic status may help to infer what cultural features are important in education. This work was based on scores and socioeconomic data from the most popular exam to enter universities in Brazil: the National High School Exam. Statistical and computational methods used in data mining were applied to a data set of 8 million data points from Brazil's National High School Exam to examine the predictability of the performance in Mathematics based on socioeconomic status. The results showed that it is possible to predict a student's scores using two ensemble techniques: AdaBoost and Gradient Boosting. The latter presented better results.
ES2017-80
Imitation learning for a continuum trunk robot
Milad Malekzadeh Shafaroudi, Jeffrey F. Queißer, Jochen J. Steil
Abstract:
The paper applies learning from demonstration (LfD) for high-level trajectory planning and movement control of the Bionic Handling Assistant (BHA) robot. For such a soft continuum robot with mechanical elasticity and complex dynamics, it is difficult to use kinesthetic teaching to collect demonstration data. We propose to use an active compliant controller for this purpose and record both position and orientation of the BHA's end-effector. This data is then encoded with a state-of-the-art task-parameterized probabilistic Gaussian mixture model, and its performance and generalization are experimentally evaluated.
ES2017-141
ELM vs. WiSARD: a performance comparison
Luiz Oliveira, Felipe França
Abstract:
The Extreme Learning Machine (ELM) is known for being a fast learning neural model. This work presents a performance comparison between ELM and the WiSARD weightless neural network model, regarding training and testing times as well as classification accuracy. The two models were implemented in the same programming language and experiments were carried out on the same hardware environment. Using a group of datasets from the public repositories UCI and Statlog, experimental results show that WiSARD presented training times approximately one order of magnitude smaller than ELM, while classification accuracy varied according to the number of classes involved. However, while WiSARD's architecture setups were not exhaustively searched, architecture setups for ELM were kept the same as the ones found in the literature as the best for each given dataset.
ES2017-12
A novel principle for causal inference in data with small error variance
Patrick Blöbaum, Shohei Shimizu, Takashi Washio
Abstract:
Causal inference addresses the problem of identifying cause and effect variables in observed data. While most of the current techniques rely heavily on exploiting asymmetries in the error noise, these techniques struggle with data that contain only small noise. We present a novel principle for causal inference in data with small error variance. For this, we exploit an asymmetry in the prediction error under the assumption of additive noise and an independence between the data generating mechanism and its input. The advantages of our approach are corroborated with empirical evaluations on artificial and real-world data sets.
ES2017-10
Learning null space projections fast
Jeevan Manavalan, Matthew Howard
Abstract:
Typically, robot interactions with the environment involve some type of constraint that impedes the motion of the system. This paper proposes an approach to learn kinematic constraints from observed movements. Our method derives the null space projection of a kinematically constrained system using gradient descent. Moreover, we compare this method to the existing brute-force approach for learning constraints on datasets of different dimensionality, to demonstrate how it can learn constraints from datasets of a much higher dimensionality.
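For readers unfamiliar with null space projections, the sketch below builds the projection matrix for a single known linear constraint. It only shows the object being learned, N = I - a^T a / (a a^T) for a constraint row vector a; it does not reproduce the gradient-descent learning procedure of the paper, and the function names and the example constraint are hypothetical.

```python
def null_space_projection(a):
    """Projection onto the null space of a single constraint row vector a."""
    norm_sq = sum(v * v for v in a)
    d = len(a)
    return [
        [(1.0 if i == j else 0.0) - a[i] * a[j] / norm_sq for j in range(d)]
        for i in range(d)
    ]


def matvec(matrix, vector):
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


a = [3.0, 4.0]                       # hypothetical constraint: 3*u1 + 4*u2 = 0
N = null_space_projection(a)
projected = matvec(N, [1.0, 0.0])    # part of the command allowed by the constraint
violation = sum(x * y for x, y in zip(a, projected))  # should be numerically zero
```

Any command passed through N satisfies the constraint, and applying N a second time changes nothing, which are the two properties a learned projection is expected to reproduce.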
ES2017-98
Comparison of adaptive MCMC methods
Edna Milgo, Nixon Ronoh, Peter Waiganjo Wagacha, Bernard Manderick
Abstract:
We compare three adaptive MCMC samplers to the Metropolis-Hastings algorithm with an optimal proposal distribution as our benchmark. We transform a simple Evolution Strategy algorithm into a sampler and show that it already outperforms the other samplers on the test suite used in the initial research on adaptive MCMC.
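As a point of reference for the benchmark, a non-adaptive random-walk Metropolis sampler can be sketched in a few lines. This is a generic textbook sampler under stated assumptions, not the code used in the paper: the standard normal target, the fixed proposal standard deviation of 2.4 and the function names are illustrative choices.

```python
import math
import random


def metropolis(log_target, start, proposal_std, num_samples, seed=0):
    """Random-walk Metropolis with a fixed, symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x = start
    log_p = log_target(x)
    samples = []
    for _ in range(num_samples):
        candidate = x + rng.gauss(0.0, proposal_std)
        log_p_candidate = log_target(candidate)
        # Symmetric proposal: accept with probability min(1, p(candidate) / p(x)).
        if math.log(1.0 - rng.random()) < log_p_candidate - log_p:
            x, log_p = candidate, log_p_candidate
        samples.append(x)
    return samples


def log_standard_normal(x):
    return -0.5 * x * x  # log density up to an additive constant


samples = metropolis(log_standard_normal, start=0.0, proposal_std=2.4, num_samples=20000)
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / len(samples)
```

Adaptive samplers differ from this sketch in that they tune the proposal (here the fixed `proposal_std`) during the run instead of requiring it to be chosen in advance.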
ES2017-113
Pseudo-analytical solutions for stochastic options pricing using Monte Carlo simulation and Breeding PSO-trained neural networks
Sam Palmer, Denise Gorse
Abstract:
A neural network is trained using a novel form of particle swarm optimisation to learn the pricing formula for European call options using training samples generated via a Monte Carlo process. The trained neural network has effectively learnt an approximate analytical solution, with errors shown to be statistically comparable to Monte Carlo pricing, alleviating the need to re-run computationally costly simulations for different model parameter settings.
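The kind of Monte Carlo training sample referred to above can be illustrated for the simplest case. The sketch below prices a European call under geometric Brownian motion with constant volatility and compares the estimate with the Black-Scholes closed form; this model choice is an assumption made for illustration and may differ from the stochastic model used in the paper, and the parameter values and function names are hypothetical.

```python
import math
import random


def black_scholes_call(s0, strike, rate, vol, maturity):
    """Closed-form Black-Scholes price of a European call."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(s0 / strike) + (rate + 0.5 * vol * vol) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    cdf = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return s0 * cdf(d1) - strike * math.exp(-rate * maturity) * cdf(d2)


def monte_carlo_call(s0, strike, rate, vol, maturity, num_paths, seed=0):
    """Discounted average payoff over simulated terminal prices."""
    rng = random.Random(seed)
    drift = (rate - 0.5 * vol * vol) * maturity
    diffusion = vol * math.sqrt(maturity)
    payoff_sum = 0.0
    for _ in range(num_paths):
        terminal = s0 * math.exp(drift + diffusion * rng.gauss(0.0, 1.0))
        payoff_sum += max(terminal - strike, 0.0)
    return math.exp(-rate * maturity) * payoff_sum / num_paths


params = dict(s0=100.0, strike=100.0, rate=0.05, vol=0.2, maturity=1.0)
exact = black_scholes_call(**params)
estimate = monte_carlo_call(num_paths=100000, **params)
```

Every new parameter setting requires a fresh simulation of this kind, which is the cost a trained network that maps parameters directly to prices is meant to avoid.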
ES2017-32
Spikes as regularizers
Anders Søgaard
Abstract:
We present a confidence-based single-layer feed-forward learning algorithm, Spiral (Spike Regularized Adaptive Learning), relying on an encoding of activation spikes. We adaptively update a weight vector relying on confidence estimates and activation offsets relative to previous activity. We regularize updates proportionally to item-level confidence and weight-specific support, loosely inspired by the observation from neurophysiology that high spike rates are sometimes accompanied by low temporal precision. Our experiments suggest that the new learning algorithm Spiral is more robust and less prone to overfitting than both the averaged perceptron and Arow.
ES2017-55
Moving Least Squares Support Vector Machines for weather temperature prediction
Zahra Karevan, Yunlong Feng, Johan A. K. Suykens
Abstract:
Local learning methods have been investigated by many researchers. While global learning methods assign the same weight to all training points in model fitting, local learning methods assume that the training samples in the region of the test point are more influential. In this paper, we propose Moving Least Squares Support Vector Machines (M-LSSVM), in which each training sample is involved in the model fitting depending on the similarity between its feature vector and that of the test point. The experimental results on an application of weather forecasting indicate that the proposed method can improve the prediction performance.
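The local weighting idea can be illustrated with a plain locally weighted linear regression in one dimension. This is a deliberately simpler stand-in, not M-LSSVM itself: the Gaussian similarity, the `bandwidth` parameter, the function name and the toy data are assumptions made for illustration.

```python
import math


def locally_weighted_predict(xs, ys, query, bandwidth):
    """Fit a weighted least-squares line around `query` and evaluate it there."""
    # Training points closer to the query receive larger weights.
    weights = [math.exp(-((x - query) ** 2) / (2.0 * bandwidth ** 2)) for x in xs]
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, xs)) / total
    mean_y = sum(w * y for w, y in zip(weights, ys)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return mean_y + slope * (query - mean_x)


xs = [i * 0.25 for i in range(-12, 13)]   # hypothetical inputs on [-3, 3]
ys = [abs(x) for x in xs]                 # a target that no single line fits well
near_two = locally_weighted_predict(xs, ys, query=2.0, bandwidth=0.3)
```

A single global line fitted to this symmetric data is flat, whereas the locally weighted fit follows the right-hand branch of the target near the query point.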
ES2017-44
A Robust Minimal Learning Machine based on the M-Estimator
Joao Gomes, Diego Mesquita, Ananda Freire, Amauri Souza Junior, Tommi Karkkainen
Abstract:
In this paper we propose a robust Minimal Learning Machine (R-MLM) for regression problems. The proposed method uses a robust M-estimator to generate a linear mapping between the input and output distance matrices of the MLM. The R-MLM was tested on one synthetic and three real-world datasets that were contaminated with an increasing number of outliers. The method achieved a performance comparable to the robust Extreme Learning Machine (R-ELM) and thus can be seen as a valid alternative for regression tasks on datasets with outliers.
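The effect of an M-estimator on outliers can be seen on a scalar toy problem. The sketch below computes a Huber location estimate by iteratively reweighted averaging; it illustrates M-estimation in general and is not the distance-matrix regression of the R-MLM. The threshold of 1.5, the unit scale, the function name and the data are assumptions made for illustration.

```python
def huber_location(values, threshold=1.5, iterations=50):
    """Robust location estimate using Huber weights and reweighted averaging."""
    ordered = sorted(values)
    estimate = ordered[len(ordered) // 2]  # start from the (upper) median
    for _ in range(iterations):
        weights = []
        for v in values:
            residual = abs(v - estimate)
            # Full weight for small residuals, down-weighting for large ones.
            weights.append(1.0 if residual <= threshold else threshold / residual)
        estimate = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return estimate


data = [1.0, 2.0, 3.0, 4.0, 100.0]    # hypothetical sample with one gross outlier
robust = huber_location(data)
plain_mean = sum(data) / len(data)     # ordinary least-squares location estimate
```

The ordinary mean is dragged far towards the outlier, while the Huber estimate stays with the bulk of the data because the outlier's influence is bounded.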
Processing, Mining and Visualizing Massive Urban Data
ES2017-3
Processing, mining and visualizing massive urban data
Pierre Borgnat, Etienne Côme, Latifa Oukhellou
Abstract:
The development of smart technologies and the advent of new observation capabilities have increased the availability of massive urban datasets that can greatly benefit urban studies. For example, a large amount of urban data is collected by various sensors, such as smart meters, or provided by GSM, Wi-Fi or Bluetooth records, ticketing data, geo-tagged posts on social networks, etc. Analysis of such digital records can help to build decision-making tools (for analytical, forecasting and display purposes) with a view to better understanding the operation of urban systems, enabling urban stakeholders to plan better when extending infrastructures, and providing better services to citizens in order to assist the development of the city and improve quality of life. This paper will focus on three main domains of application: transportation and mobility, water and energy.
ES2017-93
Anomaly detection and characterization in smart card logs using NMF and Tweets
Emeric Tonnelier, Nicolas Baskiotis, Vincent Guigue, Patrick Gallinari
Abstract:
This article describes a novel approach to detect anomalies in smart card logs. In this study, we chose to work on a 24-hour basis for every station in the Parisian metro network. We also consider the 7 days of the week separately. We first build a robust averaged reference for (day, station) pairs and then focus on the difference between particular situations and references. All experiments are conducted both on the raw data and using an NMF denoised approximation of the log flow. We demonstrate the value and the robustness of the latter strategy. Then we mine the RATP Twitter account to obtain ground truth information about operating incidents. This synchronized flow is used to evaluate our models.
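As a rough illustration of the reference-and-deviation idea described in this abstract, the sketch below builds an hourly reference for each (weekday, station) pair and scores a day by its distance to that reference. The use of the per-hour median as the robust average, the absolute-deviation score, the toy data and the names `build_reference` and `anomaly_score` are all assumptions made for illustration; the abstract does not specify them, and the NMF denoising step is not shown.

```python
from collections import defaultdict
from statistics import median


def build_reference(logs):
    """Build an hourly reference profile for each (weekday, station) pair.

    `logs` is an iterable of (weekday, station, counts) tuples, where
    `counts` holds 24 hourly validation counts for one day.
    The per-hour median is used as the robust average (an assumption).
    """
    grouped = defaultdict(list)
    for weekday, station, counts in logs:
        grouped[(weekday, station)].append(counts)
    return {
        key: [median(hour_values) for hour_values in zip(*days)]
        for key, days in grouped.items()
    }


def anomaly_score(reference, weekday, station, counts):
    """Sum of absolute hourly deviations from the reference profile."""
    profile = reference[(weekday, station)]
    return sum(abs(c - r) for c, r in zip(counts, profile))


# Hypothetical toy data: three Mondays (weekday 0) at one station.
logs = [
    (0, "station_a", [10] * 24),
    (0, "station_a", [12] * 24),
    (0, "station_a", [100] * 24),  # a disrupted day
]
reference = build_reference(logs)
normal = anomaly_score(reference, 0, "station_a", [12] * 24)
disrupted = anomaly_score(reference, 0, "station_a", [100] * 24)
```

With the median as reference, the single disrupted day barely shifts the profile, so its own score stays large while ordinary days score close to zero.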
ES2017-25
Using degree constrained gravity null-models to understand the structure of journeys' networks in bicycle sharing systems
Remy Cazabet, Pierre Borgnat, Pablo Jensen
Abstract:
Bicycle Sharing Systems (BSS) are now ubiquitous in large cities around the world. In most of these systems, journey data can be extracted, providing rich information to better understand them. Recent works have used network analysis, and in particular space-corrected community detection, to analyse such datasets. In this paper, we show that spatial null models used in previous methods have a systematic bias, and we propose a degree-constrained null model to improve the results. We finally apply the proposed method to the BSS of a city.
ES2017-138
A neuro-symbolic approach to GPS trajectory classification
Diego Carvalho, Felipe França, Raul Barbosa, Douglas Cardoso
Abstract:
This paper proposes approaches to the GPS trajectory classification problem in the context of Rio de Janeiro's public transit system. The approaches are inspired by the neuro-symbolic idea of adding domain knowledge, as opposed to using a raw machine learning approach. Experimental results show performance boosts when using these strategies.
ES2017-22
Non-negative matrix factorization as a pre-processing tool for travelers temporal profiles clustering
Léna Carel, Pierre Alquier
Abstract:
We propose to use non-negative matrix factorization (NMF) to build a dictionary of travelers' temporal profiles. Clustering based on the decomposition in this dictionary rather than on the full profiles (as in previous works) leads to more interpretable clusters.
ES2017-31
Extracting urban water usage habits from smart meter data: a functional clustering approach
Nicolas CHEIFETZ, Allou Samé, Zineb Sabir, Anne-Claire Sandraz, Cédric Féliers
Abstract:
The recent development of smart grids offers, through automated meter reading systems, the opportunity for an efficient and responsible management of water resources. In this framework, the present paper describes a novel methodology for identifying relevant usage profiles from hourly water consumption series collected by smart meters located on a water distribution network. The proposed approach operates in two stages. First, an additive time series decomposition model is used to extract seasonal patterns from the time series, which are intended to represent the customers' habits in terms of water consumption. Then, two functional clustering approaches are used to group the extracted seasonal patterns into homogeneous clusters: a functional version of the well-known K-means algorithm, and a Fourier regression mixture-model-based algorithm. The two clustering strategies are applied to real-world data from a smart grid deployed on a large water distribution network in France, and a realistic interpretation of the consumption habits is given for each cluster.
ES2017-72
Multiscale Spatio-Temporal Data Aggregation and Mapping for Urban Data Exploration
Anaïs Remy, Etienne Côme
Abstract:
Maps seem the most intuitive way to visualize massive urban data, but they also raise some well-known graphical problems (such as visual clutter). This paper focuses on processing massive spatio-temporal data in order to ease multi-scale exploration. To this end, we describe a preprocessing tool that enables the automatic creation of a multi-resolution grid from a high-resolution grid of spatio-temporal data in a format compatible with webmapping applications (vector tiles). The use of this tool is exemplified through a prototype that makes it possible to navigate a massive itinerary request dataset collected in the Ile-de-France region.
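The step from a high-resolution grid to a multi-resolution grid can be pictured with the minimal sketch below, which repeatedly sums 2x2 blocks of cell counts. The 2x2 aggregation factor, the zero padding at the borders and the names `coarsen` and `build_pyramid` are assumptions for illustration only; the temporal dimension and the export to vector tiles mentioned in the abstract are not shown.

```python
def coarsen(grid):
    """Sum each non-overlapping 2x2 block of a 2D grid of counts.

    Rows or columns that do not fill a complete block are treated as if
    they were padded with zeros.
    """
    rows, cols = len(grid), len(grid[0])
    out = []
    for r in range(0, rows, 2):
        out_row = []
        for c in range(0, cols, 2):
            total = 0
            for dr in (0, 1):
                for dc in (0, 1):
                    if r + dr < rows and c + dc < cols:
                        total += grid[r + dr][c + dc]
            out_row.append(total)
        out.append(out_row)
    return out


def build_pyramid(grid):
    """Return grids from finest to coarsest, ending with a single cell."""
    levels = [grid]
    while len(levels[-1]) > 1 or len(levels[-1][0]) > 1:
        levels.append(coarsen(levels[-1]))
    return levels


# Hypothetical 4x4 grid with one request per cell.
fine = [[1, 1, 1, 1] for _ in range(4)]
pyramid = build_pyramid(fine)
```

Each level preserves the total count, so a map client can switch levels with the zoom without changing the overall volume displayed.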
ES2017-73
Detection of non-recurrent road traffic events based on clustering indicators
Pierre-Antoine Laharotte, Romain Billot, Nour-Eddin El Faouzi
Abstract:
We propose a new indicator for detecting non-recurrent road traffic conditions. The idea is based on the perplexity of a generative probabilistic model (LDA) used for predicting traffic patterns. The resulting filter method reduces the inaccuracies of comparable detection methods and enables a better separation between usual traffic patterns and non-recurrent situations.
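To make the perplexity-based indicator concrete, the sketch below computes the perplexity of a sequence of observed traffic states under a fixed categorical model. This simple categorical model is a stand-in chosen for brevity, not the LDA model used in the paper, and the state names, probabilities and the `perplexity` helper are hypothetical.

```python
import math


def perplexity(probabilities, observations):
    """Perplexity of a categorical model on a sequence of observations.

    `probabilities` maps each symbol to its probability under the model.
    Perplexity is exp of the negative mean log-likelihood.
    """
    log_likelihood = sum(math.log(probabilities[x]) for x in observations)
    return math.exp(-log_likelihood / len(observations))


# Hypothetical model of usual traffic states on a road section.
model = {"free_flow": 0.7, "dense": 0.2, "jam": 0.1}
usual = perplexity(model, ["free_flow"] * 8 + ["dense"] * 2)
unusual = perplexity(model, ["jam"] * 8 + ["dense"] * 2)
```

A period that the model predicts poorly yields a higher perplexity, which is what makes the quantity usable as a flag for non-recurrent conditions.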
Signal and image processing, collaborative filtering
ES2017-23
Collaborative filtering with neural networks
Josef Feigl, Martin Bogdan
Abstract:
Collaborative filtering methods try to determine a user's preferences given their historical usage data. In this paper, a flexible neural network architecture to solve collaborative filtering problems is reviewed and further developed. It will be shown how modern adaptive learning rate methods can be modified to allow the network to be trained in about half the time without sacrificing any predictive performance. Additionally, the effects of Dropout on the performance of the model are evaluated. The results of this approach are demonstrated on the Netflix Prize dataset.
ES2017-45
Investigating optical transmission error correction using wavelet transforms
Weam Binjumah, Alexey Redyuk, Rod Adams, Neil Davey, Yi Sun
Abstract:
Reducing the bit error rate and improving the performance of modern coherent optical communication systems is a significant issue. As the distance travelled by the information signal increases, the bit error rate degrades. Support Vector Machines are the most up-to-date machine learning method for error correction in optical transmission systems. The wavelet transform has been a popular method for signal processing. In this study, results show that the bit error rate can be improved by using classification based on wavelet transforms (WT) and support vector machines (SVM).
ES2017-133
WiSARDrp for Change Detection in Video Sequences
Massimo De Gregorio, Giordano Maurizio
Abstract:
Weightless neural networks have been successfully used as learners and detectors of background regions in video processing, as they feature a fast learning algorithm, noise tolerance and an incremental update of learnt knowledge, also referred to as online training. These features make weightless neural networks suitable and effective for change (motion) detection in scenarios in which environmental changes (light, camera view, cluttered background) and moving objects force the modeling of background regions to change continuously and in drastic ways. In this paper, we present a change detection method in video processing that uses a weightless neural system, called WiSARDrp, as the underlying learning mechanism, equipped with a reinforcing/weakening scheme, that builds and continuously updates a model of the background at the pixel level. The performance of the proposed background modeling and change detection techniques is evaluated on the ChangeDetection.net video archive.
ES2017-152
Learning human behaviors and lifestyle by capturing temporal relations in mobility patterns
Eyal Ben Zion, Boaz Lerner
Abstract:
Many applications benefit from learning human behaviors and lifestyle. Different trajectories can represent a behavior, and previous behaviors and trajectories can influence decisions on further behaviors and on visiting future places and taking familiar or new trajectories. To more accurately explain and predict personal behavior, we extend a topic model to capture temporal relations among previous trajectories/weeks and current ones. In addition, we show how different trajectories may have the same latent cause, which we relate to lifestyle. The code for our algorithm is available online.
ES2017-104
Hierarchical Combination of Video Features for Personalised Pain Level Recognition
Patrick Thiam, Viktor Kessler, Friedhelm Schwenker
Abstract:
In this work, we present a personalized, participant-independent pain recognition system based on the video channel. Instead of using an entire annotated dataset to train a classification model that would later be applied to an unseen participant, a similarity metric is used to select the most interesting annotated samples based on the data of the unseen participant. These samples are subsequently used to train a model adapted to the unseen participant. The selection process helps to avoid redundant and irrelevant data samples, and thus improves the performance as well as the efficiency of the trained model. From the video channel, several features are extracted and subsequently fed into a hierarchical fusion architecture to further improve the performance of the system.
ES2017-29
A performance acceleration algorithm of spectral unmixing via subset selection
Jing Ke, Yi Guo, Arcot Sowmya, Tomasz Bednarz
Abstract:
An acceleration algorithm for spectral unmixing is proposed based on subset selection. The method classifies the pixels in a spectral image into accurate and approximated unmixing groups based on the similarity and dissimilarity of geomorphological features in neighboring areas. Real spectral images are used in unmixing benchmark tests to verify accuracy and performance. The results reveal a good speedup with only a small loss of accuracy.
ES2017-16
Myoelectrical signal classification based on S transform and two-directional 2DPCA
Hong-Bo Xie, Hui Liu
Abstract:
In order to extract discriminative information, the time-frequency matrix is often transformed into a 1D vector followed by principal component analysis. This study contributes a two-directional two-dimensional principal component analysis (2D2PCA) based technique for time-frequency feature extraction. 2D2PCA is conducted directly on the time-frequency matrix obtained from the S transform rather than on 1D vectors for feature extraction. The proposed method can significantly reduce the computational cost while capturing the directions of maximal time-frequency matrix variance. The efficiency and effectiveness of the proposed method are demonstrated by classifying eight hand motions using four-channel myoelectric signals recorded in healthy subjects and amputees.
ES2017-40
Hyper-spectral frequency selection for the classification of vegetation diseases
Klaas Dijkstra, Jaap van de Loosdrecht, Lambert Schomaker, Marco Wiering
Abstract:
Reducing the use of pesticides by early visual detection of diseases in precision agriculture is important. Because of the color similarity between potato-plant diseases, narrow-band hyper-spectral imaging is required. Payload constraints on unmanned aerial vehicles require a reduction of spectral bands. Therefore, we present a methodology for per-pixel classification combined with hyper-spectral band selection. In controlled experiments performed on a set of individual leaves, we measure the performance of five classifiers and three dimensionality-reduction methods with three patch sizes. With the best-performing classifier, an error rate of 1.5% is achieved for distinguishing two important potato-plant diseases.
ES2017-36
Outlining a simple and robust method for the automatic detection of EEG arousals
Isaac Fernández-Varela, Diego Álvarez-Estévez, Elena Hernández-Pereira, Vicente Moret-Bonillo
Abstract:
This work proposes a new technique for the automatic detection of electroencephalographic (EEG) arousals in sleep polysomnographic recordings. We have developed an algorithm of low computational complexity with the idea of providing an easy integration into different software platforms. The approach combines different well-known signal analyses to identify relevant arousal patterns. Special emphasis is placed on producing a robust, artifact-tolerant algorithm. The resulting approach was tested using a database of 6 polysomnographic recordings from real patients, achieving an average kappa index of 0.77 with respect to the visual scorings made by clinical experts.
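For readers unfamiliar with the agreement measure reported above, the sketch below computes Cohen's kappa between two label sequences. It assumes that the "kappa index" in the abstract refers to Cohen's kappa, which the abstract does not state explicitly; the function name `cohen_kappa` and the toy labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa: agreement between two raters corrected for chance.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e the agreement expected from the raters' label frequencies.
    Undefined (division by zero) when both raters use a single identical label.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical per-epoch labels: 1 = arousal, 0 = no arousal.
algorithm = [1, 1, 0, 0]
expert = [1, 0, 0, 0]
kappa = cohen_kappa(algorithm, expert)
```

In this toy case the raw agreement is 0.75, but half of that is expected by chance, so kappa is 0.5.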
ES2017-39
A decision support system based on cellular automata to help the control of late blight in tomato cultures
Gizelle Vianna, Gustavo Oliveira, Gabriel Cunha
Abstract:
We designed and implemented a decision support system for small tomato producers that investigates ways to recognize the late blight disease from the analysis of digital images of tomatoes, using a pair of multilayer perceptron neural networks. The networks' outputs are used to calculate the damage level of each plant and to construct a situation map of a farm, where a cellular automaton simulates the evolution of the outbreak over the fields. The simulator can test different pesticide actions, helping in the decision on when to start spraying and in the analysis of the losses and gains of each choice of action.
ES2017-139
Comparison of manual and semi-manual delineations for classifying glioblastoma multiforme patients based on histogram and texture MRI features
Adrian Ion-Margineanu, Sofie Van Cauter, Diana M Sima, Frederik Maes, Stefaan Sunaert, Uwe Himmelreich, Sabine Van Huffel
Abstract:
In this paper we study the task of classifying the follow-up course of brain tumour patients who had surgery. Multiple magnetic resonance imaging brain scans were taken for each patient. We propose a simple method of delineating the contrast-enhancing tumour lesion based on the total tumour region. We compare balanced accuracy values after tuning SVM-lin and SVM-rbf on histogram and 3-D texture features extracted from semi-manual and manual delineations. Results show that our proposed delineation method outperforms the classical method.
ES2017-60
Latent variable analysis in hospital electric power demand using non-negative matrix factorization
Diego García, Ignacio Díaz, Daniel Pérez, Abel Cuadrado, Manuel Domínguez
Abstract:
Energy disaggregation techniques have recently attracted much interest, since they make it possible to obtain latent patterns from power demand data in buildings, revealing useful information to the user. Unsupervised methods are especially attractive, since they do not require labeled datasets. In particular, non-negative matrix factorization (NMF) methods make it possible to decompose a single power demand measurement over a certain time period into a set of components or "parts" that are sparse, non-negative and sum to the original measured quantity. Such components reveal hidden temporal patterns and events along this period, related to scheduling events and/or demand patterns from subsystems in the network, that are very useful within an energy efficiency context. In this paper we use this approach on demand data from a hospital during a one-year period, using a calendar visualization of the components, revealing relevant facts about the energy expenditure.
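A minimal sketch of such a decomposition is given below, using the classical multiplicative update rules for the squared-error NMF objective. The choice of these update rules, the absence of any sparsity penalty, the toy demand matrix (rows as days, columns as time slots) and all function names are assumptions for illustration; the abstract does not say which NMF variant the authors use.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def squared_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (n x m) as W (n x rank) times H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random initialization keeps every update well defined.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical demand matrix: 4 days (rows) by 3 time slots (columns).
demand = [[5, 1, 0], [5, 1, 0], [0, 2, 6], [5, 3, 6]]
weights, parts = nmf(demand, rank=2)
```

Each row of `parts` is a non-negative temporal pattern and each row of `weights` says how strongly a given day uses each pattern, so the product approximates the measured demand as a sum of parts.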
ES2017-91
Supporting generative models of spatial behavior by user interaction
Ronny Hug, Wolfgang Hübner, Michael Arens
Abstract:
The analysis of spatial behavior in terms of motion profiles recorded along trajectories is a widely used technique in video analysis. Inherent to this approach is the problem of assigning a meaningful score to observations. This score forms the basis for classification, ranking, or generating user feedback. Score assignment can be done in terms of deviations from normal behavior, where normality is determined by learning a generative model. A general drawback is that the unsupervised learning process often assigns non-intuitive scores. In order to address this problem, this paper proposes the use of interactive concepts that support the learning process. Interaction thereby relies strongly on the generative model's capability to synthesize samples, giving insight into the underlying representation. Initial results are shown on a trajectory rating task, illustrating the feasibility of the proposed approach.
Algorithmic Challenges in Big Data Analytics
ES2017-6
Algorithmic challenges in big data analytics
Veronica Bolon-Canedo, Beatriz Remeseiro, Konstantinos Sechidis, David Martínez-Rego, Amparo Alonso-Betanzos
Abstract:
This session studies specific challenges that Machine Learning (ML) algorithms have to tackle when faced with Big Data problems. These challenges can arise when any of the dimensions in an ML problem grows significantly: a) size of training set, b) size of test set or c) dimensionality. The studies included in this edition explore the extension of previous ML algorithms and practices to Big Data scenarios. Namely, specific algorithms for recurrent neural network training, ensemble learning, anomaly detection and clustering are proposed. The results obtained show that this new trend of ML problems presents both a challenge and an opportunity to obtain results which could allow ML to be integrated in many new applications in years to come.
ES2017-18
Partition-wise Recurrent Neural Networks for Point-based AIS Trajectory Classification
Xiang Jiang, Erico N de Souza, Xuan Liu, Behrouz Haji Soleimani, Xiaoguang Wang, Daniel L. Silver, Stan Matwin
Abstract:
We present Partition-wise Recurrent Neural Networks (pRNNs) for point-based trajectory classification to detect fishing activities in the ocean. This method partitions each feature and uses region-specific parameters for distinct partitions, which can greatly improve the expressive power of deep recurrent neural networks on low-dimensional yet heterogeneous trajectory data. We show that our approach outperforms the state-of-the-art systems.
ES2017-35
Scalable approximate k-NN Graph construction based on Locality Sensitive Hashing
Carlos Eiras-Franco, Leslie Kanthan, Amparo Alonso-Betanzos, David Martínez-Rego
Abstract:
Nearest neighbour graphs are a pervasive basic construct in areas such as Data Mining, Machine Learning and Information Retrieval. Among them, the k Nearest Neighbours Graph (kNNG) is probably the most studied of all. Unfortunately, its naïve construction is in O(n^2) for n data points, which becomes a quagmire when scaling to Big Data. However, sub-quadratic construction of the kNNG remains an open question. This paper explores an adaptive algorithm based on Locality Sensitive Hashing which presents good performance on distributed architectures.
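To make the quadratic cost and the hashing idea concrete, here is a minimal sketch. It is not the adaptive algorithm of the paper: it assumes a single table of random-hyperplane hashes (which works best on roughly centred data), and the function names and the `n_planes` parameter are hypothetical.

```python
import math
import random
from collections import defaultdict


def brute_force_knn_graph(points, k):
    """Exact kNN graph: compares every pair, hence O(n^2) distance evaluations."""
    graph = {}
    for i, p in enumerate(points):
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        graph[i] = [j for _, j in sorted(dists)[:k]]
    return graph


def lsh_knn_graph(points, k, n_planes=4, seed=0):
    """Approximate kNN graph: hash points with random hyperplanes, then
    run the exact search only inside each bucket."""
    rng = random.Random(seed)
    dim = len(points[0])
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_planes)]

    buckets = defaultdict(list)
    for i, p in enumerate(points):
        # The sign pattern of the projections is the hash key.
        key = tuple(sum(a * b for a, b in zip(plane, p)) >= 0.0 for plane in planes)
        buckets[key].append(i)

    graph = {}
    for members in buckets.values():
        local = brute_force_knn_graph([points[i] for i in members], k)
        for local_i, neighbours in local.items():
            # Translate bucket-local indices back to global indices.
            graph[members[local_i]] = [members[j] for j in neighbours]
    return graph
```

Because neighbours are only searched within a bucket, a point may end up with fewer than k neighbours, or miss a true neighbour that was hashed elsewhere; this is the accuracy that is traded for speed.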
ES2017-110
Degrees of Freedom in Regression Ensembles
Reeve Henry, Gavin Brown
Abstract:
Negative correlation learning is an effective approach to ensemble learning in which model diversity is encouraged through a correlation penalty term. The level of emphasis placed upon the correlation penalty term is controlled by the diversity parameter. We shall provide a degrees of freedom analysis of negative correlation learning. Our contributions are as follows: we give an exact formula for the effective degrees of freedom in a negative correlation ensemble with fixed basis functions; we show that the effective degrees of freedom is a continuous, convex and monotonically increasing function of the diversity parameter; finally, we show that the degrees of freedom formula gives rise to an efficient way to tune the diversity parameter on large data sets.
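As an illustration of the correlation penalty and the diversity parameter, the sketch below uses one common formulation from the negative correlation learning literature: the average squared error of the members minus the diversity-weighted spread of the members around the ensemble mean. The abstract does not state the exact loss used in the paper, so this form, and the name `ncl_loss`, are assumptions.

```python
def ncl_loss(predictions, target, diversity):
    """Average negative-correlation loss of an ensemble on one example.

    predictions: outputs f_i of the M ensemble members
    diversity:   weight (lambda) placed on the correlation penalty
    """
    m = len(predictions)
    mean = sum(predictions) / m
    individual = sum((f - target) ** 2 for f in predictions) / m
    spread = sum((f - mean) ** 2 for f in predictions) / m
    # Rewarding spread around the ensemble mean encourages diversity.
    return individual - diversity * spread
```

With `diversity=0` the members are trained independently; with `diversity=1` the loss reduces to the squared error of the ensemble mean, by the ambiguity decomposition.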
ES2017-82
Mutual information for improving the efficiency of the SCH algorithm
Diego Fernandez-Francos, Oscar Fontenla-Romero, Amparo Alonso-Betanzos, Gavin Brown
Abstract:
This work presents a new approach to improve the efficiency of a one-class classification algorithm, making it more suitable for big datasets. The original algorithm, called SCH (Scaled Convex Hull), approximates a D-dimensional convex hull decision by means of random projections and an ensemble of 2-dimensional decisions. With this new approach we try to get rid of the redundant projections that lead to similar classification models in the low-dimensional space. After the training phase, a new stage based on mutual information is added to the original algorithm in order to select the essential projections and remove the unnecessary ones, providing a lightweight classification model. This significantly reduces the computational complexity of the testing phase and preserves the performance of the original method. Finally, some experimental results are given to demonstrate the effectiveness and efficiency of this approach.
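The projection-and-ensemble idea can be sketched as follows. This is only an illustration under simplifying assumptions: the hulls are not scaled, the mutual-information selection stage is omitted, each projected hull is assumed to be non-degenerate (at least three vertices), and all function names are hypothetical.

```python
import random


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside_hull(hull, p, eps=1e-9):
    """True if p lies inside or on a counter-clockwise convex polygon."""
    n = len(hull)
    for i in range(n):
        a, b = hull[i], hull[(i + 1) % n]
        if (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) < -eps:
            return False
    return True


def project(x, matrix):
    """Map a D-dimensional point to 2-D with a 2 x D projection matrix."""
    return tuple(sum(w * v for w, v in zip(row, x)) for row in matrix)


def train(data, n_projections, seed=0):
    """Build one 2-D convex hull per random projection of the training data."""
    rng = random.Random(seed)
    dim = len(data[0])
    model = []
    for _ in range(n_projections):
        matrix = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(2)]
        model.append((matrix, convex_hull([project(x, matrix) for x in data])))
    return model


def is_normal(model, x):
    """Accept x only if it falls inside the hull of every projection."""
    return all(inside_hull(hull, project(x, matrix)) for matrix, hull in model)
```

Testing cost grows linearly with the number of stored projections, which is why discarding redundant ones, as the abstract proposes, makes the model lighter.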
ES2017-87
A distributed approach for classification using distance metrics
Laura Morán-Fernández, Veronica Bolon-Canedo, Amparo Alonso-Betanzos
Abstract:
To cope with the huge quantity of data brought about by the fast development of sensing, networking and inexpensive data storage, many distributed approaches have been developed in recent years. The main reason is that, when dealing with large datasets, most existing data mining algorithms do not scale well, and their efficiency may significantly deteriorate. Thus, we present a sample-wise distributed approach in which the original dataset is divided among several nodes or processors. To classify a new test sample, we first compute the distance to the data on each node, and the sample is then classified by the model learned from the "closest" data. The proposed method has proved to be useful, demonstrating important savings in runtime and satisfactory performance.
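The routing step can be sketched as below. The abstract does not say which distance or which local model is used, so this sketch assumes Euclidean distance to each node's data centroid and a nearest-class-centroid classifier on each node; the `Node` class and `classify` function are hypothetical.

```python
import math


def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return tuple(sum(col) / len(vectors) for col in zip(*vectors))


class Node:
    """One partition of the training set with its own local model."""

    def __init__(self, samples, labels):
        self.centroid = mean(samples)
        # Local model (assumed): one centroid per class seen on this node.
        by_class = {}
        for x, y in zip(samples, labels):
            by_class.setdefault(y, []).append(x)
        self.class_centroids = {y: mean(xs) for y, xs in by_class.items()}

    def predict(self, x):
        return min(self.class_centroids,
                   key=lambda y: math.dist(x, self.class_centroids[y]))


def classify(nodes, x):
    """Route x to the node whose data is closest, then use that node's model."""
    closest = min(nodes, key=lambda node: math.dist(x, node.centroid))
    return closest.predict(x)
```

Only one local model is evaluated per test sample, so prediction cost does not grow with the number of nodes beyond the distance comparison itself.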
Deep learning
ES2017-48
Local Lyapunov Exponents of Deep RNN
Claudio Gallicchio, Alessio Micheli, Luca Silvestri
Abstract:
The study of deep Recurrent Neural Network (RNN) models represents a research topic of increasing interest. In this paper we investigate layered recurrent architectures from a dynamical systems point of view, focusing on characterizing the fundamental aspect of stability. To this end we provide a framework that allows the analysis of deep RNN dynamical regimes through the study of the maximum among the local Lyapunov exponents. Applied to the case of Reservoir Computing networks, our investigation also provides insights on the true merits of layering in RNN architectures, effectively showing how increasing the number of layers eventually results in progressively less stable global dynamics.
ES2017-61
Learning Semantic Prediction using Pretrained Deep Feedforward Networks
Jörg Wagner, Volker Fischer, Michael Herman, Sven Behnke
Abstract:
The ability to predict future environment states is crucial for anticipative behavior of autonomous agents. Deep learning based methods have been shown to solve key perception challenges but currently operate mainly in a non-predictive fashion. We bridge this gap by proposing an approach to transform trained feed-forward networks into predictive ones via a combination of a recurrent predictive module with a teacher-student training strategy. This transformation can be conducted without the need for labeled data, in a fully self-supervised fashion. Using simulated data, we demonstrate the ability of the resulting model to temporally predict a task-specific representation and additionally show the benefits of using our approach even when no corresponding feed-forward model is available.
ES2017-102
Deep convolutional neural networks for detecting noisy neighbours in cloud infrastructure
Bruno Ordozgoiti, Alberto Mozo, Sandra Gómez Canaval, Udi Margolin, Elisha Rosensweig, Itai Segall
Abstract:
Cloud infrastructure in data centers is expected to be one of the main technologies supporting Internet communications in the next few years. Virtualization is employed to achieve the flexibility and dynamicity required by the wide variety of applications used today. Therefore, optimal allocation of virtual machines is key to ensuring performance and efficiency. Noisy neighbor is a term used to describe virtual machines competing for physical resources and thus disturbing each other, a phenomenon that can dramatically degrade their performance. Detecting noisy neighbors using simple thresholding approaches is ineffective. To exploit the time-series nature of cloud infrastructure monitoring data, we propose an approach based on deep convolutional networks. We test it on real infrastructure data and show that it outperforms well-known classifiers in the detection of noisy neighbors.
ES2017-109
Real-time convolutional networks for sonar image classification in low-power embedded systems
Matias Valdenegro-Toro
Abstract:
Deep Neural Networks have impressive classification performance, but this comes at the expense of significant computational resources at inference time. Autonomous Underwater Vehicles use low-power embedded systems for sonar image perception, and cannot execute large neural networks in real time. We propose the aggressive use of max-pooling, and we demonstrate it with a Fire-based module and a new Tiny module that includes max-pooling in each module. By stacking them we build networks that achieve the same accuracy as bigger ones, while reducing the number of parameters and considerably increasing computational performance. Our networks can classify a 96 × 96 sonar image with 98.8–99.7% accuracy in only 41 to 61 milliseconds on a Raspberry Pi 2, which corresponds to speedups of 28.6 to 19.7.
ES2017-30
Approximate operations in Convolutional Neural Networks with RNS data representation
Valentina Arrigoni, Beatrice Rossi, Pasqualina Fragneto, Giuseppe Desoli
Abstract:
In this work we modify the inference stage of a generic CNN by approximating computations using a data representation based on a Residue Number System at low precision and introducing rescaling stages for weights and activations. In particular, we exploit an innovative procedure to tune the system parameters that handles the reduced resolution while minimizing rounding and overflow errors. Our method decreases the hardware complexity of dot product operators and enables a parallelized implementation operating on values represented with few bits, with minimal loss in the overall accuracy of the network.
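To show why a Residue Number System suits dot products, the sketch below computes an integer dot product channel by channel and reconstructs the result with the Chinese Remainder Theorem. The moduli (7, 11, 13) are an illustrative choice, not the paper's, and the rescaling stages and parameter tuning described in the abstract are not modelled.

```python
from math import prod

MODULI = (7, 11, 13)  # pairwise coprime; illustrative choice, dynamic range 1001


def to_rns(x, moduli=MODULI):
    """Represent an integer by its residues modulo each modulus."""
    return tuple(x % m for m in moduli)


def rns_dot(a, b, moduli=MODULI):
    """Dot product where each residue channel is accumulated independently."""
    acc = [0] * len(moduli)
    for x, y in zip(a, b):
        rx, ry = to_rns(x, moduli), to_rns(y, moduli)
        for c, m in enumerate(moduli):
            # Each channel only ever handles values smaller than its modulus.
            acc[c] = (acc[c] + rx[c] * ry[c]) % m
    return tuple(acc)


def from_rns(residues, moduli=MODULI):
    """Chinese Remainder Theorem reconstruction, mapped to a signed range."""
    total = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        partial = total // m
        x += r * partial * pow(partial, -1, m)
    x %= total
    return x - total if x > total // 2 else x
```

For example, `from_rns(rns_dot([3, -2, 5], [4, 6, -1]))` recovers -5. A result outside the signed range of the moduli wraps around silently, which is the kind of overflow error that parameter tuning has to keep rare.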
ES2017-33
Learning convolutional neural network to maximize Pos@Top performance measure
Yanyan Geng, Liang Ru-Ze, Weizhi Li, Jingbin Wang, Liang Gaoyuan, Xu Chenhao, Wang Jing-Yan
Abstract:
In machine learning problems, a performance measure is used to evaluate the learned models. Recently, the number of positive data points ranked at the top positions (Pos@Top) has become a popular performance measure in the machine learning community. In this paper, we propose to learn a convolutional neural network (CNN) model to maximize the Pos@Top performance measure. The CNN model is used to represent the multi-instance data point, and a classifier function is used to predict the label from its CNN representation. We propose to minimize the loss function of Pos@Top over a training set to learn the filters of the CNN and the classifier parameter. The classifier parameter vector is solved by the Lagrange multiplier method, and the filters are updated by the gradient descent method, alternately in an iterative algorithm. Experiments over benchmark data sets show that the proposed method outperforms the state-of-the-art Pos@Top maximization methods.
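The measure itself can be sketched in a few lines. The abstract only describes Pos@Top informally, so this sketch assumes the usual reading from the top-rank optimization literature: the fraction of positives scored strictly above the highest-scored negative. The function name and the label convention (1 for positive) are hypothetical.

```python
def pos_at_top(scores, labels):
    """Fraction of positives scored strictly above the best-scored negative."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y != 1]
    if not positives:
        return 0.0
    if not negatives:
        return 1.0
    top_negative = max(negatives)
    return sum(s > top_negative for s in positives) / len(positives)
```

With scores `[0.9, 0.8, 0.7, 0.6, 0.2]` and labels `[1, 1, 0, 1, 0]`, two of the three positives outrank the best negative, giving 2/3. The measure depends on the scores only through a hard comparison with one negative, which is why a surrogate loss is minimized in practice.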
ES2017-122
Active learning strategy for CNN combining batchwise Dropout and Query-By-Committee
Melanie Ducoffe, Frédéric Precioso
Abstract:
While the current trend is to increase the depth of neural networks to improve their performance, the size of the training database has to grow accordingly. We thus notice an emergence of tremendous databases, although providing labels to build a training set still remains a very expensive task. In this paper, we tackle the problem of selecting the samples to be labeled in an online fashion. We present an active learning strategy based on query by committee and the dropout technique to train a Convolutional Neural Network (CNN). We evaluate our active learning strategy for CNN on the MNIST and USPS benchmarks, showing in particular that selecting less than 22% of the annotated database is enough to obtain a similar error rate to using the full training set.
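A query-by-committee selection step might look like the sketch below. It assumes that each committee member is the same network evaluated under a different dropout mask and that disagreement is measured by vote entropy; neither detail is stated in the abstract, and both function names are hypothetical.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Entropy of the committee's label votes for one sample (0 = full agreement)."""
    counts = Counter(votes)
    n = len(votes)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def select_queries(committee_votes, budget):
    """Return indices of the `budget` samples the committee disagrees on most."""
    ranked = sorted(range(len(committee_votes)),
                    key=lambda i: vote_entropy(committee_votes[i]),
                    reverse=True)
    return ranked[:budget]
```

Here `committee_votes[i]` holds the labels predicted for unlabeled sample `i` by each committee member; the selected indices are the samples sent for annotation.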
ES2017-115
A Deep Q-Learning Agent for L-Game with Variable Batch Training
Petros Giannakopoulos, Yannis Cotronis
Abstract:
We employ the Deep Q-Learning algorithm with Experience Replay to train an agent capable of achieving a high level of play in the L-Game while self-learning from low-dimensional states. We also employ variable batch size for training in order to mitigate the loss of the rare reward signal and significantly accelerate training. Despite the large action space due to the number of possible moves, the low-dimensional state space and the rarity of rewards, which only come at the end of a game, DQL is successful in training an agent capable of strong play without the use of any search methods or domain knowledge.
ES2017-100
TimeNet: Pre-trained deep recurrent neural network for time series classification
Pankaj Malhotra, Vishnu TV, Lovekesh Vig, Puneet Agarwal, Gautam Shroff
Abstract:
Inspired by the tremendous success of deep Convolutional Neural Networks as generic feature extractors for images, we propose TimeNet: a deep recurrent neural network (RNN) trained on diverse time series in an unsupervised manner using sequence to sequence (seq2seq) models to extract features from time series. Rather than relying on data from the problem domain, TimeNet attempts to generalize time series representation across domains by ingesting time series from several domains simultaneously. Once trained, TimeNet can be used as a generic off-the-shelf feature extractor for time series. The representations or embeddings given by a pre-trained TimeNet are found to be useful for time series classification (TSC). For several publicly available datasets from the UCR TSC Archive and an industrial telematics sensor dataset from vehicles, we observe that a classifier learned over the TimeNet embeddings yields significantly better performance compared to (i) a classifier learned over the embeddings given by a domain-specific RNN, as well as (ii) a nearest neighbor classifier based on Dynamic Time Warping.
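For reference, the Dynamic Time Warping baseline mentioned above can be sketched as follows. This is the textbook recurrence with an absolute-difference cost and no warping window; the abstract does not specify the variant used in the comparison, and the function names are hypothetical.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping with absolute-difference cost."""
    inf = float("inf")
    # prev[j] holds the best alignment cost of the processed prefix of a with b[:j].
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf]
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            curr.append(cost + min(prev[j], prev[j - 1], curr[j - 1]))
        prev = curr
    return prev[-1]


def nearest_neighbour_label(query, train_series, train_labels):
    """1-NN classification: return the label of the DTW-closest training series."""
    best = min(range(len(train_series)),
               key=lambda i: dtw_distance(query, train_series[i]))
    return train_labels[best]
```

Unlike a learned embedding, this baseline compares the query against every training series at prediction time, so its cost grows with both the training set size and the series lengths.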
ES2017-56
Uncertain photometric redshifts via combining deep convolutional and mixture density networks
Antonio D'Isanto, Kai Lars Polsterer
Abstract:
The need for accurate photometric redshift estimation is a major subject in Astronomy. This is due to the necessity of efficiently obtaining redshift information without the need for spectroscopic analysis. We propose a method for determining accurate multi-modal predictive densities for redshift, using Mixture Density Networks and Deep Convolutional Networks. A comparison with the Random Forest is carried out and superior performance of the proposed architecture is demonstrated.
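A multi-modal predictive density of the kind a Mixture Density Network outputs can be evaluated as in this sketch. It assumes Gaussian components, which is the standard choice for such networks but is not confirmed by the abstract; in a real model the weights, means and standard deviations would be produced by the network for each input, whereas here they are plain arguments.

```python
import math


def mixture_density(z, weights, means, sigmas):
    """Evaluate p(z) = sum_k w_k * Normal(z; mu_k, sigma_k)."""
    return sum(
        w * math.exp(-0.5 * ((z - mu) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))
        for w, mu, s in zip(weights, means, sigmas)
    )


def negative_log_likelihood(z, weights, means, sigmas):
    """Loss commonly minimised when training a mixture density network."""
    return -math.log(mixture_density(z, weights, means, sigmas))
```

A point estimate such as a Random Forest prediction returns a single redshift value, whereas this density can place mass on several candidate redshifts at once.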
ES2017-90
Feature Extraction and Learning for RSSI based Indoor Device Localization
Stavros Timotheatos, Grigorios Tsagkatakis, Panagiotis Tsakalides, Panos Trahanias
Abstract:
In this paper, we study and experimentally compare two state-of-the-art methods for low-dimensional feature extraction, within the context of RSSI fingerprinting for localization. On the one hand, we consider Stacked Autoencoders, a prominent example of a deep learning architecture, while on the other hand, we explore Random Projections, a universal feature extraction approach. Experimental results suggest that feature learning has a dramatic impact on subsequent analysis such as location-based classification.
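Of the two feature extractors compared, Random Projections is the simpler to illustrate. The sketch below assumes a dense Gaussian projection matrix, which is one common construction; the abstract does not say which variant was used, and the function names are hypothetical.

```python
import math
import random


def random_projection_matrix(in_dim, out_dim, seed=0):
    """Gaussian matrix scaled so squared norms are preserved in expectation."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(out_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
            for _ in range(out_dim)]


def project(x, matrix):
    """Map a fingerprint vector to its low-dimensional features."""
    return [sum(w * v for w, v in zip(row, x)) for row in matrix]
```

The matrix is drawn once and reused for every fingerprint, so no training is needed; this is the sense in which the approach is data-independent, in contrast to a Stacked Autoencoder that learns its mapping from the fingerprints.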