Bruges, Belgium, October 9-11, 2024
Content of the proceedings
Informed Machine Learning for Complex Data
Machine learning in distributed, federated and non-stationary environments
Continual Improvement of Deep Neural Networks in The Age of Big Data
Optimization
Classification and regression
Trust in Artificial Intelligence: Beyond Interpretability
Nonlinear dimensionality reduction and unsupervised learning
Graph learning
Domain Knowledge Integration in Machine Learning Systems
Online learning and concept drift
Time series, recurrent and reinforcement learning
Aeronautic data analysis
Modern Machine Learning Methods for robust and real-time Brain-Computer Interfaces (BCI)
Language models
Image processing and computer vision
Informed Machine Learning for Complex Data
Informed Machine Learning for Complex Data
Luca Oneto, Nicolò Navarin, Alessio Micheli, Luca Pasa, Claudio Gallicchio, Davide Bacciu, Davide Anguita
https://doi.org/10.14428/esann/2024.ES2024-1
Abstract:
In the contemporary era of data-driven decision-making, the application of Machine Learning (ML) on complex data (e.g., images, text, sequences, trees, and graphs) has become increasingly pivotal (e.g., Large Language Models and Graph Neural Networks). In this context, there is a gap between purely data-driven models and domain-specific knowledge, requirements, and expertise. In particular, this domain specificity needs to be integrated into the ML models to improve learning generalization, sustainability, trustworthiness, reliability, security, and safety. This additional knowledge can assume different forms, e.g.: software developers require ML to comply with many technical requirements, companies require ML to comply with economic and environmental sustainability, domain experts require ML to be aligned with physical and logical laws, and society requires ML to be aligned with ethical principles. This special session gathers valuable contributions and early findings in the field of Informed ML for Complex Data. Our main objective is to showcase the potential and limitations of new ideas, improvements, or the blending of ML and other research areas in solving real-world problems.
Informed Machine Learning: Excess Risk and Generalization
Luca Oneto, Davide Anguita, Sandro Ridella
https://doi.org/10.14428/esann/2024.ES2024-20
Abstract:
Machine Learning (ML) based predictive models are currently impacting research, industry, and society at large thanks to their ability to model or surrogate real systems. Two of the main current limitations of ML are the need for large amounts of high-quality data and low performance far away from the observed data. For this reason, in certain applications where prior knowledge is available, researchers have developed Informed ML (IML) to reduce ML's appetite for high-quality data and increase its extrapolation abilities. In this work we study the differences between ML and IML in terms of excess risk and generalization, also using some examples to elucidate the theoretical discussion. Our findings shed some light on the mechanisms and the conditions under which IML outperforms ML.
Enhancing Echo State Networks with Gradient-based Explainability Methods
Francesco Spinnato, Andrea Cossu, Riccardo Guidotti, Andrea Ceni, Claudio Gallicchio, Davide Bacciu
https://doi.org/10.14428/esann/2024.ES2024-78
Abstract:
Recurrent Neural Networks are effective for analyzing temporal data, such as time series, but they often require costly and time-intensive training. Echo State Networks simplify the training process by using a fixed recurrent layer, the reservoir, and a trainable output layer, the readout. In sequence classification problems, the readout typically receives only the final state of the reservoir. However, averaging all states can sometimes be beneficial. In this work, we assess whether a weighted average of hidden states can enhance the Echo State Network performance. To this end, we propose a gradient-based, explainable technique to guide the contribution of each hidden state towards the final prediction. We show that our approach outperforms the naive average, as well as other baselines, in time series classification, particularly on noisy data.
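A minimal sketch of the idea of reading out a weighted average of reservoir states instead of the final state only. The exponential weights here are a hypothetical stand-in for the gradient-derived weights the paper proposes; the task, sizes, and hyperparameters are illustrative assumptions, not the authors' setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_res, T = 1, 100, 50

# Fixed (untrained) reservoir: the defining trait of an Echo State Network.
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.uniform(-0.5, 0.5, (n_res, n_res))
W *= 0.9 / np.abs(np.linalg.eigvals(W)).max()   # spectral radius < 1

def reservoir_states(u):
    """Run the fixed reservoir over an input sequence u of shape (T, n_in)."""
    states, x = np.zeros((len(u), n_res)), np.zeros(n_res)
    for t, u_t in enumerate(u):
        x = np.tanh(W_in @ u_t + W @ x)
        states[t] = x
    return states

# Toy binary task: sequences of two different amplitudes.
X = [rng.standard_normal((T, n_in)) * (1 + c) for c in (0, 1) for _ in range(50)]
y = np.array([c for c in (0, 1) for _ in range(50)])

# Weighted average of hidden states instead of the final state only
# (here fixed exponential weights favouring late states).
w = np.exp(np.linspace(-2, 0, T)); w /= w.sum()
feats = np.stack([w @ reservoir_states(u) for u in X])

# Ridge-regression readout: the only trained part of an ESN.
A = feats.T @ feats + 1e-2 * np.eye(n_res)
readout = np.linalg.solve(A, feats.T @ (2 * y - 1))
print("train accuracy:", ((feats @ readout > 0) == y).mean())
```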
Generalizing Convolution to Point Clouds
Davide Bacciu, Francesco Landolfi
https://doi.org/10.14428/esann/2024.ES2024-145
Abstract:
Convolution, a fundamental operation in deep learning for structured grid data like images, cannot be directly applied to point clouds due to their irregular and unordered nature. Many approaches in the literature that perform convolution on point clouds achieve this by designing a convolutional operator from scratch, often with little resemblance to the one used on images. We present two point cloud convolutions that naturally follow from the convolution in its standard definition popular with images. We do so by relaxing the indexing of the kernel weights with a "soft" dictionary that resembles the attention mechanism of transformers. Finally, experimental results demonstrate the effectiveness of the proposed relaxations on two benchmark point cloud classification tasks.
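A rough sketch of a "soft dictionary" point convolution in the attention-style spirit described above: kernel weights are indexed softly by a score computed from relative neighbour positions. The dictionary size, MLP, and k-NN neighbourhood are illustrative assumptions, not the authors' exact operator.

```python
import torch

M, C_in, C_out, k = 8, 3, 16, 16               # dictionary size, channels, neighbours

kernels = torch.nn.Parameter(torch.randn(M, C_in, C_out) * 0.1)  # weight dictionary
pos_mlp = torch.nn.Linear(3, M)                # scores dictionary slots from geometry

def soft_point_conv(xyz, feats):
    """xyz: (N, 3) positions, feats: (N, C_in) features -> (N, C_out)."""
    d = torch.cdist(xyz, xyz)                          # (N, N) pairwise distances
    knn = d.topk(k, largest=False).indices             # (N, k) neighbour indices
    rel = xyz[knn] - xyz[:, None, :]                   # (N, k, 3) relative offsets
    attn = torch.softmax(pos_mlp(rel), dim=-1)         # (N, k, M) soft kernel index
    w = torch.einsum("nkm,mio->nkio", attn, kernels)   # per-neighbour mixed kernel
    return torch.einsum("nki,nkio->no", feats[knn], w) # aggregate over neighbours

xyz, feats = torch.randn(128, 3), torch.randn(128, C_in)
print(soft_point_conv(xyz, feats).shape)               # torch.Size([128, 16])
```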
Towards the application of Backpropagation-Free Graph Convolutional Networks on Huge Datasets
Nicolò Navarin, Luca Pasa, Alessandro Sperduti
https://doi.org/10.14428/esann/2024.ES2024-179
Abstract:
Backpropagation-Free Graph Convolutional Networks (BF-GCN) are backpropagation-free neural models dealing with graph data based on Gated Linear Networks. Each neuron in a BF-GCN is defined as a set of graph convolution filters (weight vectors) and a gating mechanism that, given a node's context, selects the weight vector to use for processing the node's attributes based on its distance from a set of prototypes. Given the higher expressivity of BF-GCN neurons compared to those of standard graph convolutional neural networks, they have a bigger memory footprint. This makes it challenging to apply BF-GCN to huge datasets. In this paper, we explore how reducing the size of node contexts through randomization can reduce the memory occupancy of the method, enabling its application to huge datasets. We empirically show that working with very low-dimensional contexts does not impact the resulting predictive performance.
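A simplified rendition of the gated-neuron idea with randomized low-dimensional contexts: a node's context is a neighbourhood aggregate compressed by a fixed random projection, and its nearest prototype selects the weight vector. Graph sizes and dimensions are illustrative assumptions; this is not the full BF-GCN model.

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, d_feat, d_ctx, n_proto = 200, 64, 4, 8

X = rng.standard_normal((n_nodes, d_feat))            # node attributes
A = (rng.random((n_nodes, n_nodes)) < 0.05).astype(float)
A = np.maximum(A, A.T)                                # symmetric adjacency

# Context = neighbourhood aggregate, compressed by a fixed random projection.
ctx_full = A @ X / np.maximum(A.sum(1, keepdims=True), 1)
R = rng.standard_normal((d_feat, d_ctx)) / np.sqrt(d_ctx)
ctx = ctx_full @ R                                    # (n_nodes, d_ctx), tiny memory

# Gating: the context's nearest prototype picks the filter applied to the node.
prototypes = rng.standard_normal((n_proto, d_ctx))
weights = rng.standard_normal((n_proto, d_feat))      # one filter per prototype
gate = np.argmin(((ctx[:, None, :] - prototypes) ** 2).sum(-1), axis=1)
out = np.einsum("nd,nd->n", X, weights[gate])         # gated neuron activations
print(out.shape, "context bytes:", ctx.nbytes, "vs full:", ctx_full.nbytes)
```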
Continual Learning with Graph Reservoirs: Preliminary experiments in graph classification
Domenico Tortorella, Alessio Micheli
https://doi.org/10.14428/esann/2024.ES2024-21
Abstract:
Continual learning aims to address the challenge of catastrophic forgetting in training models where data patterns are non-stationary. Previous research has shown that fully-trained graph learning models are particularly affected by this issue. One approach to lifting part of the burden is to leverage the representations provided by a training-free reservoir computing model. In this work, we evaluate for the first time different continual learning strategies in conjunction with Graph Echo State Networks, which have already demonstrated their efficacy and efficiency in graph classification tasks.
XAI and Bias of Deep Graph Networks
Michele Fontanesi, Alessio Micheli, Marco Podda
https://doi.org/10.14428/esann/2024.ES2024-85
Abstract:
Generalization in machine learning involves introducing inductive biases that restrict the solution space of the learning problem, allowing for the inductive leap. In this paper, we show the existence of different inductive biases between convolutional and recursive Deep Graph Networks (DGN) by applying Explainable AI (XAI) methods as model inspection techniques. We show that different architectures can perfectly solve the given tasks by learning different labelling policies. Our results promote the usage of different architectures to address a task and raise warnings on the assessment of XAI techniques as their benchmarks may contain more ground truths than those provided.
Machine learning in distributed, federated and non-stationary environments
Machine learning in distributed, federated and non-stationary environments - recent trends
Mirko Polato, Barbara Hammer, Frank-Michael Schleif
https://doi.org/10.14428/esann/2024.ES2024-3
Abstract:
This tutorial provides an overview of machine learning methodologies applied in distributed, federated, and non-stationary environments. We focus on recent advancements and novel research contributions in the field. Key topics include data analysis and pattern recognition for non-stationary environments, model compression, federated learning algorithms, and privacy preservation. This tutorial aims to equip researchers and practitioners with insights into current challenges and innovative solutions in this dynamic field.
Sparse Uncertainty-Informed Sampling from Federated Streaming Data
Manuel Röder, Frank-Michael Schleif
https://doi.org/10.14428/esann/2024.ES2024-9
Abstract:
We present a numerically robust, computationally efficient approach for non-I.I.D. data stream sampling in federated client systems, where resources are limited and labeled data for local model adaptation is sparse and expensive. The proposed method identifies relevant stream observations to optimize the underlying client model, given a local labeling budget, and performs instantaneous labeling decisions without relying on any memory buffering strategies. Our experiments show enhanced training batch diversity and an improved numerical robustness of the proposal compared to existing strategies over large-scale data streams, making our approach an effective and convenient solution in FL environments.
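A toy sketch of budget-constrained, instantaneous labeling decisions on a stream: a label is requested only when the local model's predictive entropy exceeds a cutoff, with no memory buffer. The placeholder model, drift point, and threshold are assumptions for illustration, not the paper's sampler.

```python
import numpy as np

rng = np.random.default_rng(0)

def predict_proba(x):
    """Placeholder client model; replace with the local model's output."""
    logits = np.array([x.sum(), -x.sum()])
    p = np.exp(logits - logits.max())
    return p / p.sum()

budget, spent, threshold = 50, 0, 0.6     # labeling budget and entropy cutoff
selected = []
for t in range(10_000):                   # simulated non-i.i.d. stream
    x = rng.standard_normal(8) + (t > 5000) * 0.5   # drift after t = 5000
    p = predict_proba(x)
    entropy = -(p * np.log(p + 1e-12)).sum()
    # Instantaneous decision: label only uncertain observations, no buffering.
    if entropy > threshold and spent < budget:
        selected.append(t); spent += 1
print(f"labeled {spent}/{budget} observations")
```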
On the Fine Structure of Drifting Features
Fabian Hinder, Valerie Vaquet, Barbara Hammer
https://doi.org/10.14428/esann/2024.ES2024-89
Abstract:
Feature selection is one of the most relevant preprocessing and analysis techniques in machine learning, allowing for increases in model performance and knowledge discovery. In online setups, both can be affected by concept drift, i.e., changes of the underlying distribution. Recently, an adaptation of classical feature relevance approaches to drift detection was introduced. While the method increases detection performance significantly, there has been little discussion of its explanatory aspects. In this work, we focus on understanding the structure of the ongoing drift by transferring the concept of strongly and weakly relevant features to it. We empirically evaluate our methodology using graphical models.
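A sketch of one common way to connect feature relevance and drift, in the spirit of (but not identical to) the approach above: train a classifier to predict the time window a sample came from, so features whose permutation hurts that task carry drift. Data, model, and relevance measure are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
n = 2000
X = rng.standard_normal((n, 5))
X[n // 2:, 0] += 2.0                                 # feature 0 drifts directly
X[:, 1] = X[:, 0] + 0.1 * rng.standard_normal(n)     # feature 1 drifts only via feature 0

t = (np.arange(n) >= n // 2).astype(int)             # before/after window indicator
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, t)

# Comparing importances hints at strongly vs. weakly relevant drifting features.
imp = permutation_importance(clf, X, t, n_repeats=10, random_state=0)
for i, m in enumerate(imp.importances_mean):
    print(f"feature {i}: drift relevance {m:.3f}")
```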
FedHP: Federated Learning with Hyperspherical Prototypical Regularization
Samuele Fonio, Mirko Polato, Roberto Esposito
https://doi.org/10.14428/esann/2024.ES2024-183
Abstract:
This paper presents FedHP, an algorithm that amalgamates federated learning, hyperspherical geometries, and prototype learning. Federated Learning (FL) has garnered attention as a privacy-preserving method for constructing robust models across distributed datasets. Traditionally, FL involves exchanging model parameters to uphold data privacy; however, in scenarios with costly data communication, exchanging large neural network models becomes impractical. In such instances, prototype learning provides a feasible solution by necessitating the exchange of a few class prototypes instead of entire deep learning models. Motivated by these considerations, our approach leverages recent advancements in prototype learning, particularly the benefits offered by non-Euclidean geometries. Alongside introducing FedHP, we provide empirical evidence demonstrating its comparable performance to other state-of-the-art approaches while significantly reducing communication costs.
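A minimal sketch of the communication pattern prototype learning enables: clients send one unit-norm class prototype each instead of a full model, and the server averages them back onto the hypersphere. The client data and sizes are assumptions; FedHP's regularization and geometry handling go beyond this.

```python
import numpy as np

rng = np.random.default_rng(0)
n_clients, n_classes, d = 5, 10, 64

def client_prototypes(seed):
    """Each client sends one unit-norm embedding prototype per class."""
    r = np.random.default_rng(seed)
    protos = r.standard_normal((n_classes, d))
    return protos / np.linalg.norm(protos, axis=1, keepdims=True)

# Server: average the class prototypes and re-project onto the hypersphere.
stack = np.stack([client_prototypes(s) for s in range(n_clients)])
global_protos = stack.mean(axis=0)
global_protos /= np.linalg.norm(global_protos, axis=1, keepdims=True)

# Communication cost: a few vectors per client instead of a full network.
print("floats exchanged per client:", n_classes * d)
```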
Few-shot similarity learning for motion classification via electromyography
Rui Liu, Benjamin Paassen
https://doi.org/10.14428/esann/2024.ES2024-43
Abstract:
Accurate motion classification from surface electromyography signals is crucial for controlling bionic prostheses. Unfortunately, most state-of-the-art classifiers need to be re-trained with lots of data to recognize any new motion. Therefore, we propose a few-shot similarity learning approach that can be applied to new classes without any re-training, just using one to five reference points per new class. In experiments on two real-world data sets, we find that our proposed approach outperforms two state-of-the-art approaches for few-shot learning on sEMG signals, namely a transfer learning and a contrastive learning approach. Our experiments also reveal that the choice of loss function is crucial for performance whereas the choice of similarity function has less effect.
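A sketch of the deployment-time mechanics: a frozen embedding plus one to five reference points per new class, classified by cosine similarity to class prototypes with no re-training. The `embed` placeholder and toy data are assumptions standing in for the learned sEMG encoder.

```python
import numpy as np

def embed(x):
    """Placeholder for the learned sEMG encoder (frozen at deployment)."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-12)

def classify(query, references):
    """references: dict class -> (n_shots, d) array of 1-5 reference examples."""
    q = embed(query)
    protos = {c: embed(r).mean(0) for c, r in references.items()}
    sims = {c: float(q @ (p / np.linalg.norm(p))) for c, p in protos.items()}
    return max(sims, key=sims.get)        # new motions need no re-training

rng = np.random.default_rng(0)
refs = {"fist": rng.standard_normal((3, 16)) + 1,
        "open": rng.standard_normal((3, 16)) - 1}
print(classify(rng.standard_normal(16) + 1, refs))
```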
About Vector Quantization and its Privacy in Federated Learning
Ronny Schubert, Thomas Villmann
https://doi.org/10.14428/esann/2024.ES2024-57
Abstract:
In this work, we will consider how privacy for vector quantization models can be broken in a federated learning environment. We show how a potential attacker can expose data from the prototype updates without needing to know about the specific model used by exploiting the transparency of vector quantization. Finally, a 1-user environment example based on GLVQ will be shown.
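A sketch of why prototype updates leak data: with a plain LVQ1-style attraction update and a known learning rate, observing two consecutive prototypes lets an attacker invert the update rule exactly. GLVQ scales the step differently, but the inversion idea is analogous; the numbers are illustrative.

```python
import numpy as np

lr = 0.05
x_private = np.array([3.2, -1.7])         # client's training point
w_old = np.array([1.0, 0.0])              # prototype before the update

w_new = w_old + lr * (x_private - w_old)  # update transmitted during federation

# Attacker observes w_old, w_new and the learning rate, and inverts the rule:
x_recovered = w_old + (w_new - w_old) / lr
print(np.allclose(x_recovered, x_private))  # True
```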
Federated Time Series Classification with ROCKET features
Bruno Casella, Matthias Jakobs, Marco Aldinucci, Sebastian Buschjäger
https://doi.org/10.14428/esann/2024.ES2024-61
Abstract:
This paper proposes FROCKS, a federated time series classification method using ROCKET features. Our approach dynamically adapts the models’ features by selecting and exchanging the best-performing ROCKET kernels from a federation of clients. Specifically, the server gathers the best-performing kernels of the clients together with the associated model parameters, and it performs a weighted average if a kernel is best-performing for more than one client. We compare the proposed method with state-of-the-art approaches on the UCR archive binary classification datasets and show superior performance on most datasets.
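A tiny sketch of ROCKET-style random kernels with PPV (proportion of positive values) features and a per-kernel score a client could use to pick which kernels to share. This is a PPV-only toy with an assumed scoring rule; FROCKS's federated exchange and weighted averaging are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)
n_kernels, T = 100, 150

def random_kernel():
    length = rng.choice([7, 9, 11])
    w = rng.standard_normal(length); w -= w.mean()
    return w, rng.standard_normal()               # weights and bias

kernels = [random_kernel() for _ in range(n_kernels)]

def ppv_features(x):
    """Proportion of positive values of each kernel's convolution output."""
    return np.array([(np.convolve(x, w, "valid") + b > 0).mean()
                     for w, b in kernels])

# Toy binary dataset; score each kernel by how well its feature separates classes.
X = np.stack([rng.standard_normal(T) + (i % 2) * 0.8 for i in range(60)])
y = np.array([i % 2 for i in range(60)])
F = np.stack([ppv_features(x) for x in X])        # (60, n_kernels)

scores = np.abs(F[y == 1].mean(0) - F[y == 0].mean(0))
best = np.argsort(scores)[-10:]                   # kernels a client would share
print("indices of best-performing kernels:", best)
```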
Federated Learning in a Semi-Supervised Environment for Earth Observation Data
Bruno Casella, Alessio Barbaro Chisari, Marco Aldinucci, Sebastiano Battiato, Mario Valerio Giuffrida
https://doi.org/10.14428/esann/2024.ES2024-214
Abstract:
We propose FedRec, a federated learning workflow taking advantage of unlabelled data in a semi-supervised environment to assist in the training of a supervised aggregated model. In our proposed method, an encoder architecture extracting features from unlabelled data is aggregated with the feature extractor of a classification model via weight averaging. The fully connected layers of the supervised models are also averaged in a federated fashion. We show the effectiveness of our approach by comparing it with a state-of-the-art federated algorithm, an isolated baseline, and a centralised baseline, on novel cloud detection datasets. Our code is available here.
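A minimal sketch of the basic operation FedRec builds on: averaging the parameters of an unsupervised encoder with the feature extractor of a supervised model. The tiny architecture is an assumption for illustration; the paper's workflow, aggregation schedule, and heads are not reproduced.

```python
import torch

def make_encoder():
    return torch.nn.Sequential(torch.nn.Conv2d(3, 8, 3, padding=1),
                               torch.nn.ReLU(),
                               torch.nn.AdaptiveAvgPool2d(1),
                               torch.nn.Flatten())

ssl_encoder = make_encoder()    # trained on unlabelled imagery
sup_encoder = make_encoder()    # feature extractor of the supervised model

# Parameter-wise average of the two state dicts, loaded back into the model.
avg_state = {k: (ssl_encoder.state_dict()[k] + sup_encoder.state_dict()[k]) / 2
             for k in sup_encoder.state_dict()}
sup_encoder.load_state_dict(avg_state)
print(sup_encoder(torch.randn(1, 3, 32, 32)).shape)   # torch.Size([1, 8])
```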
Continual Improvement of Deep Neural Networks in The Age of Big Data
Continual Learning of Deep Neural Networks in The Age of Big Data
Alexander Gepperth, Timothée Lesort
https://doi.org/10.14428/esann/2024.ES2024-2
Abstract:
Many applications of deep learning are set in an environment with perpetual change or at least with an ever-growing amount of data. In practice, deep neural networks (DNNs) and large language models (LLMs) are continually trained and evaluated. They need to incorporate new data or new annotations, where one typical issue is the extensive availability of unannotated or low-quality data, coupled with a bottleneck concerning annotations and/or curated samples. In such setups, the scaling behavior of continual learning (CL) algorithms w.r.t. training time becomes critical, in contrast to the standard CL setting operating on small databases like MNIST, CIFAR or ImageNet. Annotations or curated samples become available progressively, e.g., because they are created by humans, or due to an ongoing exploration of the environment, and need to be progressively incorporated into models. This article explores how advancements in continual learning can improve the scalability and performance of DNNs and LLMs in such setups. One interesting aspect is leveraging dedicated (small-scale) CL techniques to achieve advantageous trade-offs between computational cost and accuracy, and examining how such CL methods can maintain advantageous scaling behavior compared to continuous re-training on all data.
Sequential Continual Pre-Training for Neural Machine Translation
Niko Dalla Noce, Michele Resta, Davide Bacciu
https://doi.org/10.14428/esann/2024.ES2024-165
Abstract:
We explore continual pre-training for Neural Machine Translation within a continual learning framework. We introduce a setting where new languages are gradually added to pre-trained models across multiple training experiences. These pre-trained models are subsequently fine-tuned on downstream translation tasks. We compare mBART and mT5 pre-training objectives using four European languages. Our findings demonstrate that sequentially adding languages during pre-training effectively mitigates catastrophic forgetting and minimally impacts downstream task performance.
Towards Deep Continual Workspace Monitoring: Performance Evaluation of CL Strategies for Object Detection in Working Sites
Asli Celik, Oguzhan Urhan, Andrea Cossu, Vincenzo Lomonaco
https://doi.org/10.14428/esann/2024.ES2024-128
Abstract:
Object detection plays a crucial role in computer-based monitoring tasks, where the adaptability of object detection algorithms to complex and dynamic backgrounds is essential for achieving accurate and stable detection performance. Despite the effectiveness of state-of-the-art object detectors, continual object detection remains a significant challenge in real-world applications. In this study, we utilized a dataset tailored for continual object detection in diverse working environments. Using this dataset, a task-incremental continual learning scenario was established in which each experience, corresponding to object detection sub-datasets collected from different work sites, served as a separate task. Common baseline continual learning (CL) strategies were employed throughout the continual training process to evaluate their efficacy. Our findings, consistent with the CL literature, underscore replay-based strategies as the top performers, assessed across both task-aware and task-agnostic settings. Additionally, zero-shot object detection demonstrates notably lower performance compared to the best-performing CL strategies, emphasizing the critical importance of CL strategies in maintaining consistent detection performance and adapting to new environments and work sites.
Optimization
Joint Entropy Search for Multi-objective Bayesian Optimization with Constraints and Multiple Fidelities
Daniel Fernández-Sánchez, Daniel Hernández-Lobato
https://doi.org/10.14428/esann/2024.ES2024-24
Abstract:
Bayesian optimization (BO) methods solve problems with several black-box objectives and constraints. Each black box is expensive to evaluate and lacks a closed form. They use a model of each black box to guide the search for the problem's solution. Sometimes, however, the black boxes may be evaluated at different fidelity levels. A lower fidelity is simply a cheap proxy for the corresponding black box. Thus, lower fidelities that correlate with the actual black box can be used to reduce the optimization cost. We propose Joint Entropy Search for Multi-Fidelity and Multi-Objective Bayesian Optimization with Constraints (MF-JESMOC), a BO method for solving the aforementioned problems. It chooses the next point and fidelity level at which to evaluate the black boxes as the one expected to yield the largest reduction in the joint entropy of the Pareto set and the Pareto front, normalized by the fidelity's cost. Deep Gaussian processes are used to model each black box and the dependencies between fidelities. In our experiments, MF-JESMOC outperforms other state-of-the-art methods for multi-objective BO with constraints and different fidelity levels.
Convergence analysis of an inexact gradient method on smooth convex functions
Pierre Vernimmen, François Glineur
https://doi.org/10.14428/esann/2024.ES2024-171
Abstract:
We consider the classical gradient method with constant stepsizes where some error is introduced in the computation of each gradient. More specifically, we assume relative inexactness, in the sense that the norm of the difference between the true gradient and its approximate value is bounded by a certain fraction of the gradient norm. We establish a sublinear convergence rate for this inexact method when applied to smooth convex functions, and illustrate on a logistic regression example.
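A sketch of the relative-inexactness model on a logistic regression instance: at each step the gradient is perturbed by a random error with norm equal to a fraction eps of the gradient norm, and a constant stepsize 1/L is used. The random direction of the error is an assumption; the paper's analysis covers the worst case over all such errors.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 5
X = rng.standard_normal((n, d))
y = (rng.random(n) < 1 / (1 + np.exp(-X @ rng.standard_normal(d)))).astype(float)

def grad(w):
    p = 1 / (1 + np.exp(-X @ w))
    return X.T @ (p - y) / n

L = np.linalg.norm(X, 2) ** 2 / (4 * n)   # smoothness constant of the logistic loss
eps, w = 0.3, np.zeros(d)                 # relative error level, iterate
for _ in range(500):
    g = grad(w)
    e = rng.standard_normal(d)
    e *= eps * np.linalg.norm(g) / np.linalg.norm(e)   # ||e|| = eps * ||g||
    w -= (1 / L) * (g + e)                # constant stepsize, inexact gradient
print("final gradient norm:", np.linalg.norm(grad(w)))
```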
ADLER - An efficient Hessian-based strategy for adaptive learning rate
Dario Balboni, Davide Bacciu
https://doi.org/10.14428/esann/2024.ES2024-132
Abstract:
We derive a sound positive semi-definite approximation of the Hessian of deep models for which Hessian-vector products are easily computable. This enables us to provide an adaptive SGD learning rate strategy based on the minimization of the local quadratic approximation. The strategy requires just twice the computation of a single SGD run, yet performs comparably with grid search over SGD learning rates on different model architectures (CNNs with and without residual connections) on classification tasks. This makes the algorithm a promising first step toward hyperparameter-free optimization of deep learning models, and also reduces the energy impact of training. We also compare the novel approximation with the Gauss-Newton approximation.
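A sketch of deriving the step size from the local quadratic model: with gradient g and Hessian-vector product Hg, minimizing f + g'd + 0.5 d'Hd along d = -g gives lr = (g'g)/(g'Hg). Here the HVP comes from exact double backprop on a convex toy problem (so the curvature is PSD); the paper instead uses its cheap PSD Hessian approximation.

```python
import torch

torch.manual_seed(0)
X = torch.randn(256, 10)
y = (X @ torch.randn(10) > 0).float()
w = torch.zeros(10, requires_grad=True)

for step in range(20):
    loss = torch.nn.functional.binary_cross_entropy_with_logits(X @ w, y)
    (g,) = torch.autograd.grad(loss, w, create_graph=True)
    (Hg,) = torch.autograd.grad(g @ g.detach(), w)   # Hessian-vector product H @ g
    g = g.detach()
    lr = (g @ g / (g @ Hg)).clamp(max=100.0).item()  # argmin of the quadratic along -g
    with torch.no_grad():
        w -= lr * g
print("final loss:", loss.item())
```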
Classification and regression
Automatic Miscalibration Diagnosis: Interpreting Probability Integral Transform (PIT) Histograms
Ondřej Podsztavek, Alexander I. Jordan, Pavel Tvrdík, Kai L. Polsterer
https://doi.org/10.14428/esann/2024.ES2024-15
Abstract:
Quantifying the predictive uncertainty of a model is essential for risk assessment. We address the proper calibration of the predictive uncertainty in regression tasks by employing the probability integral transform (PIT) histogram to diagnose miscalibration. PIT histograms are often difficult to interpret, and therefore we present an approach to an automatic interpretation of PIT histograms based on an interpreter trained with a synthetic data set. Given a PIT histogram of a model and a data set, the interpreter can estimate the data-generating distribution of the data set with the main purpose of identifying the cause of miscalibration.
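A sketch of the diagnostic input such an interpreter consumes: PIT values F(y) under Gaussian predictive distributions, binned into a histogram. The overconfident-model example is an assumption for illustration; the learned interpreter itself is not reproduced.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
y = rng.standard_normal(5000)                 # observations

# An overconfident model: correct mean, underestimated spread.
mu, sigma = np.zeros_like(y), 0.6
pit = norm.cdf(y, loc=mu, scale=sigma)        # PIT values F(y)

counts, _ = np.histogram(pit, bins=10, range=(0, 1))
print(counts)   # a U-shape (mass near 0 and 1) signals overconfidence;
                # a central hump would signal underconfidence instead
```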
Feature Learning using Multi-view Kernel Partial Least Squares
Xinjie Zeng, Qinghua Tao, Johan Suykens
https://doi.org/10.14428/esann/2024.ES2024-168
Abstract:
Multi-view learning deals with data of multiple views, aiming to explore the underlying relations between different views and use them for various tasks. In this paper, we derive a multi-view extension of kernel partial least squares for unsupervised feature learning. We establish the optimization objective in the primal as the pair-wise covariance between the projection scores and show that this model can be trained in the dual form by solving an eigenvalue problem. Experiments are conducted to verify the effectiveness of the method on real-life multi-view datasets, where the proposed method is adopted as a feature extractor and a clustering task is then conducted for performance comparisons.
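A rough two-view sketch of the dual covariance-maximization idea: with centered kernels K1, K2 and scores K1 a, K2 b, maximizing a' K1 K2 b under unit-norm dual coefficients reduces to an SVD (equivalently, an eigenvalue problem). This is one plausible formulation under stated assumptions, not the paper's multi-view model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
Z = rng.standard_normal((n, 2))               # shared latent signal
X1 = np.hstack([Z, rng.standard_normal((n, 3))])
X2 = np.hstack([Z @ rng.standard_normal((2, 2)), rng.standard_normal((n, 4))])

def centered_rbf(X, gamma=0.1):
    D = ((X[:, None] - X[None]) ** 2).sum(-1)
    K = np.exp(-gamma * D)
    H = np.eye(n) - np.ones((n, n)) / n       # centering matrix
    return H @ K @ H

K1, K2 = centered_rbf(X1), centered_rbf(X2)

# Maximize a' K1 K2 b s.t. ||a|| = ||b|| = 1  ->  leading SVD pair of K1 @ K2.
U, S, Vt = np.linalg.svd(K1 @ K2)
alpha, beta = U[:, 0], Vt[0]
scores = np.stack([K1 @ alpha, K2 @ beta])    # extracted two-view features
print("score covariance:", float(scores[0] @ scores[1] / n))
```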
Stable Diffusion Dataset Generation for Downstream Classification Tasks
Eugenio Lomurno, Matteo D'Oria, Matteo Matteucci
https://doi.org/10.14428/esann/2024.ES2024-100
Abstract:
Recent advances in generative artificial intelligence have enabled the creation of high-quality synthetic data that closely mimics real-world data. This paper explores the adaptation of the Stable Diffusion 2.0 model for generating synthetic datasets, using Transfer Learning, Fine-Tuning and generation parameter optimisation techniques to improve the utility of the dataset for downstream classification tasks. We present a class-conditional version of the model that exploits a Class-Encoder and optimisation of key generation parameters. Our methodology led to synthetic datasets that, in a third of cases, produced models that outperformed those trained on real datasets.
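A heavily simplified sketch of generating a labelled synthetic dataset by plain prompt conditioning with the public stabilityai/stable-diffusion-2-base checkpoint (this requires a GPU and a model download). The class names, prompt template, and generation parameters are assumptions; the paper instead trains a Class-Encoder and optimises these parameters.

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-base", torch_dtype=torch.float16
).to("cuda")

classes = ["cat", "dog", "horse"]
dataset = []
for label, name in enumerate(classes):
    for i in range(16):
        image = pipe(f"a photo of a {name}",
                     num_inference_steps=30,        # generation parameters one
                     guidance_scale=7.5).images[0]  # could tune, per the paper
        dataset.append((image, label))              # train a classifier on these
print(len(dataset), "synthetic labelled images")
```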
Extrapolating Venusian Atmospheric Profiles using MAGMA Gaussian Processes
Simon Lejoly, Arianna Piccialli, Arnaud Mahieux, Ann Carine Vandaele, Benoit Frénay
https://doi.org/10.14428/esann/2024.ES2024-142
Abstract:
In the field of spatial aeronomy, atmospheric profile datasets often contain partial data. Probabilistic models, particularly Gaussian processes (GPs), offer promising solutions for filling these data gaps. However, traditional GP algorithms encounter challenges when handling multiple sequences simultaneously, both in terms of performance and computational complexity. Recently, an algorithm named MAGMA was introduced to address these issues. This paper evaluates MAGMA’s performance using the SOIR Venus atmosphere dataset, marking the first application of MAGMA to atmospheric profiles. Results indicate that MAGMA represents a significant advancement towards the efficient application of GPs for extrapolating atmospheric profiles.
Antagonism between Classification and Reconstruction Processes in Deep Predictive Coding Networks
Jan Rathjens, Laurenz Wiskott
https://doi.org/10.14428/esann/2024.ES2024-59
Abstract:
Predictive coding-inspired deep networks for visual computing integrate classification and reconstruction processes in shared intermediate layers. Although synergy between these processes is commonly assumed, it has yet to be convincingly demonstrated. In this study, we utilize a purposefully designed family of autoencoder-like architectures with an added classification head to examine the consequences of combining classification- and reconstruction-driven information within the models' latent layers. Our findings underscore a significant challenge: Classification-driven information diminishes reconstruction-driven information in shared representations and vice versa. Our results challenge prevailing assumptions in predictive coding and offer guidance for future iterations of predictive coding concepts in deep networks.
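A generic sketch of the shared-latent setup under study: an autoencoder with a classification head attached to the same latent code, with a weight lam trading off the two objectives that compete for that code. The architecture and data are assumptions; the paper uses a purpose-built family of such models.

```python
import torch
from torch import nn

encoder = nn.Sequential(nn.Flatten(), nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, 16))
decoder = nn.Sequential(nn.Linear(16, 64), nn.ReLU(), nn.Linear(64, 784))
head = nn.Linear(16, 10)                      # classification from the shared latent

x = torch.randn(32, 1, 28, 28)
y = torch.randint(0, 10, (32,))

z = encoder(x)                                # shared intermediate representation
# lam balances reconstruction- vs classification-driven information
# competing for the same latent code.
lam = 0.5
loss = lam * nn.functional.mse_loss(decoder(z), x.flatten(1)) \
     + (1 - lam) * nn.functional.cross_entropy(head(z), y)
loss.backward()
print(float(loss))
```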
Constraints as Alternative Learning Objective in Deep Learning
Quinten Van Baelen, Peter Karsmakers
https://doi.org/10.14428/esann/2024.ES2024-76
Abstract:
The success of deep learning has been based on smooth loss functions that can easily be optimized using gradient descent and an off-the-shelf optimizer. However, training a neural network for a new application is not trivial, as it requires many hyperparameters to be tuned, and issues such as overfitting and underfitting arise. Many applications allow for some errors to be made, yet traditional learning objectives influence the training in all cases except when a perfect prediction is made. In this work, constraints are proposed to replace the cross-entropy or the mean squared error, allowing the neural network to make some errors. These errors can be set in advance to reflect how accurate the predictions of the neural network need to be. For each loss function, it is shown on two different data sets that the proposed constraint-based learning performs similarly to or even outperforms the standard loss functions. Moreover, in the case of classification problems, the constraints can result in predictions with significantly higher probability on a test set.
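A minimal sketch of a tolerance-style objective, using an epsilon-insensitive penalty as one concrete way to "allow some errors": predictions inside the tolerance band receive zero loss and zero gradient. This is an assumed stand-in; the paper formulates the idea through constraints rather than this exact loss.

```python
import torch

def tolerance_loss(pred, target, eps=0.1):
    """Zero loss whenever |pred - target| <= eps, quadratic outside the band."""
    return torch.clamp(torch.abs(pred - target) - eps, min=0).pow(2).mean()

pred = torch.tensor([0.05, 0.5, -0.3], requires_grad=True)
target = torch.zeros(3)
loss = tolerance_loss(pred, target, eps=0.1)
loss.backward()
print(loss.item(), pred.grad)   # no gradient for the prediction inside the band
```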
CNNGen: A Generator and a Dataset for Energy-Aware Neural Architecture Search
Antoine Gratia, Hong Liu, Shin'ichi Satoh, Paul Temple, Pierre-Yves Schobbens, Gilles Perrouin
https://doi.org/10.14428/esann/2024.ES2024-77
Abstract:
Neural Architecture Search (NAS) methods seek optimal networks within a set architecture space. Cells define this space and bound the search based on a reference neural architecture. Yet, optimality is mostly related to prediction performance, overlooking the environmental impacts of training thousands of models. Thus, reference architectures, designed for performance only, may hamper the search for tradeoffs between performance and energy consumption. We contribute to energy-aware NAS with i) a grammar-based Convolutional Neural Network (CNN) generator not requiring a predefined architecture; ii) A dataset of 1,300 architectures generated by CNNGen with their full description and implementation, performance and resource consumption measures; iii) Three novel performance and energy prediction models not requiring trained models and outperforming the state of the art.
Adversarial Training without Hard Labels
Ammar Al-Najjar, István Megyeri, Mark Jelasity
https://doi.org/10.14428/esann/2024.ES2024-81
Abstract:
Adversarial training is widely used to enhance classifier robustness. Several improvements have been proposed, including different forms of distillation and self-alignment. Here, we propose a novel loss function combining these two approaches while not using the hard ground-truth labels directly. Our new loss function is demonstrated to simultaneously improve both the robustness and the accuracy of some well-known competing solutions. This is a step towards combating the robustness-accuracy tradeoff, a crucial issue in adversarial training. Our method also reduces the variance of the accuracy over the classes in the experimental scenarios we examined, leading to a more balanced model.
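A sketch of a soft-label adversarial objective in the same spirit: FGSM perturbations, a distillation term against a teacher's soft predictions, and a self-alignment term between clean and adversarial outputs, with no hard labels. The exact combination and weighting in the paper differ; this is an assumed illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
model = torch.nn.Sequential(torch.nn.Linear(20, 64), torch.nn.ReLU(), torch.nn.Linear(64, 5))
teacher = torch.nn.Sequential(torch.nn.Linear(20, 64), torch.nn.ReLU(), torch.nn.Linear(64, 5))
x = torch.randn(32, 20)

# FGSM adversarial example against the student's own soft predictions.
x_adv = x.clone().requires_grad_(True)
adv_loss = F.kl_div(F.log_softmax(model(x_adv), -1),
                    F.softmax(model(x).detach(), -1), reduction="batchmean")
adv_loss.backward()
x_adv = (x + 0.1 * x_adv.grad.sign()).detach()

# No hard labels: distill from the teacher and align clean/adversarial outputs.
p_teacher = F.softmax(teacher(x).detach(), -1)
p_clean = F.softmax(model(x).detach(), -1)
log_p_adv = F.log_softmax(model(x_adv), -1)
loss = F.kl_div(log_p_adv, p_teacher, reduction="batchmean") \
     + F.kl_div(log_p_adv, p_clean, reduction="batchmean")
loss.backward()
print(loss.item())
```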
Learning Kernel Parameters for Support Vector Classification Using Similarity Embeddings
Antonio Padua Braga, Murilo Menezes, Luiz Torres
https://doi.org/10.14428/esann/2024.ES2024-90
Abstract:
In order to solve non-linear problems, kernel-based classifiers rely on implicit mappings to very high-dimensional spaces. These target spaces, although mathematically robust, often lack the property of visual interpretation, limiting the intuition of the problem at hand. In this work, the notion of a similarity space is presented, to which one can map input samples and visualize how they interact under a given kernel function. By exploring statistics in such space, a class separability measure is derived, which can be used to find optimal kernel parameters for binary classification. Experiments using support vector machines were conducted, showing the method's effectiveness when compared to grid-search approaches.
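A sketch of selecting a kernel parameter from a separability score on kernel similarities: a simple within-class minus between-class contrast stands in for the measure the paper derives from its similarity space, so the scoring rule is an assumption.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.metrics.pairwise import rbf_kernel

X, y = make_moons(n_samples=300, noise=0.2, random_state=0)

def separability(gamma):
    K = rbf_kernel(X, gamma=gamma)
    same = (y[:, None] == y[None]) & ~np.eye(len(y), dtype=bool)
    diff = y[:, None] != y[None]
    return K[same].mean() - K[diff].mean()   # large when classes are separated

gammas = np.logspace(-2, 2, 20)
best = gammas[np.argmax([separability(g) for g in gammas])]
print(f"selected gamma = {best:.3g}")        # use it in SVC(kernel='rbf', gamma=best)
```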
Causes of Rejects in Prototype-based Classification: Aleatoric vs. Epistemic Uncertainty
Johannes Brinkrolf, Valerie Vaquet, Fabian Hinder, Barbara Hammer
https://doi.org/10.14428/esann/2024.ES2024-156
Abstract:
Prototype-based methods constitute a robust and transparent family of machine-learning models. To increase robustness in real-world applications, they are frequently coupled with reject options. While the state-of-the-art method, relative similarity, couples the rejection of samples with high aleatoric and epistemic uncertainty, the technique lacks transparency, i.e., an explanation of why a sample has been rejected. In this work, we analyze the relative similarity analytically and derive an explanation scheme for reject options in prototype-based classification.
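For reference, the relative similarity measure commonly used in this line of work is r = (d2 - d1) / (d2 + d1), where d1 is the distance to the closest prototype and d2 the distance to the closest prototype of a different class; low r signals ambiguity. A toy sketch with hand-picked (not learned) prototypes:

```python
# Sketch of the relative similarity reject option for a nearest-prototype
# classifier; the prototypes and threshold are illustrative values.
import numpy as np

prototypes = np.array([[0.0, 0.0], [1.0, 1.0]])  # one prototype per class
labels = np.array([0, 1])

def classify_with_reject(x, theta=0.2):
    d = np.linalg.norm(prototypes - x, axis=1)
    i = d.argmin()                                  # closest prototype
    d2 = d[labels != labels[i]].min()               # closest other class
    r = (d2 - d[i]) / (d2 + d[i])                   # relative similarity
    return labels[i] if r >= theta else None        # None = reject

print(classify_with_reject(np.array([0.52, 0.5])))  # ambiguous point: rejected
```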
"Mental Images"-driven classification
Gianluca Coda, Massimo De Gregorio, Antonio Sorgente, Paolo Vanacore
https://doi.org/10.14428/esann/2024.ES2024-169
Abstract:
Common sense rules are a form of implicit knowledge acquired through experience and observation of the world around us, and used by both humans and machines to reason and to make decisions about the surrounding environment. Artificial Intelligence systems can extract these rules by mining data and apply them to many predictive tasks. Herein, we first present a new method for extracting rules from DRASiW "Mental Images" (MI) and then show how to exploit them to improve the classification performance of the system. The latter is confirmed by the obtained results.
Transfer learning to minimize the predictive risk in clinical research
Samuel Branders, Jérôme Paul, Arthur Ooghe, Alvaro Pereira
https://doi.org/10.14428/esann/2024.ES2024-25
Abstract:
The volume of data collected from patients enrolled in clinical trials is constantly on the rise. Classical linear and generalized linear models used in this context are unable to keep pace with this trend. Conversely, machine learning models have the potential to deal with such data, but cannot provide guarantees in terms of bias and interpretability. This paper explores a transfer learning approach that seeks to harmonize the strengths of both paradigms: providing unbiased and interpretable estimators while minimizing the expected predictive risk in finite samples.
Leveraging performance-based metadata for designing multi-objective NAS strategies for efficient models in Earth Observation
Emre Demir, Rene Traore, Andrés Camero
https://doi.org/10.14428/esann/2024.ES2024-94
Abstract:
Earth Observation (EO) datasets present challenges that differ from traditional Computer Vision benchmarks often examined by the AutoML community. To assist EO researchers in leveraging AutoML techniques, we offer a NAS benchmark with performance metadata specifically for an EO context. This dataset not only focuses on resource-efficient models crucial to EO but also includes hardware-based metrics. Moreover, we investigate performance prediction to build a data-centric approach for initializing multi-objective NAS search algorithms.
AI-based algorithm for intrusion detection on a real dataset
David Esteban Martínez, Bertha Guijarro-Berdiñas, Amparo Alonso Betanzos, Elena Hernández-Pereira, Alejandro Esteban Martínez
https://doi.org/10.14428/esann/2024.ES2024-204
Abstract:
In the realm of cybersecurity, the detection of network intrusions stands as a paramount challenge, with ever-evolving threats demanding innovative solutions. This study delves into the application of diverse machine learning algorithms on a contemporary dataset (UGR'16) comprising real-world instances of intrusion in software systems. Specifically, several Machine Learning models (Outlier Detectors, Ensemble Methods, Deep Learning, and Conventional Classifiers) were tested and compared with previously reported results using a standard methodology. The obtained results reveal that the Ensemble Methods are capable of improving on the results from prior research. In particular, the Extreme Gradient Boosting (XGBoost) algorithm offers better results than the original Random Forest solution, with an AUC of 0.9218 as opposed to 0.8977, while solving the problem more than four times faster.
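A sketch of the kind of comparison reported, with synthetic imbalanced data standing in for UGR'16 and default hyperparameters (the paper's exact configuration is not reproduced here):

```python
# Sketch: XGBoost vs. Random Forest by ROC AUC on imbalanced data;
# the dataset and settings are illustrative placeholders.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

X, y = make_classification(n_samples=5000, n_features=20,
                           weights=[0.95], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

for name, model in [("RandomForest", RandomForestClassifier(random_state=0)),
                    ("XGBoost", XGBClassifier(eval_metric="logloss", random_state=0))]:
    model.fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: AUC = {auc:.4f}")
```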
Similarity-Based Zero-Shot Domain Adaptation for Wearables
Markus Vieth, Nils Grimmelsmann, Axel Schneider, Barbara Hammer
https://doi.org/10.14428/esann/2024.ES2024-123
Abstract:
Biosensors measure signals from the human body, and usually process them with a small ML model on simple hardware. When a new person starts using such a device, a domain adaptation problem arises. We consider the case where no labels are known for the new person, but data (including labels) from several other people are available (unsupervised, multi-source). As an application scenario, we look at a shoe insole with 3-8 pressure sensors that estimates how much weight/force is put on the foot (a regression problem). We propose a distance measure between a source and a target domain, and a combination of all source models. Experiments on real-world data from 13 persons show that our method outperforms all other tested methods by a good margin.
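One way such a scheme can be realized (a sketch under simplifying assumptions; the mean-feature discrepancy below is a stand-in for the paper's proposed distance measure) is to weight each source model by the inverse of its domain distance to the unlabeled target data:

```python
# Sketch of similarity-weighted multi-source regression; distance measure,
# weighting rule, and data are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LinearRegression

def domain_distance(X_src, X_tgt):
    return np.linalg.norm(X_src.mean(axis=0) - X_tgt.mean(axis=0))

def combined_prediction(source_models, source_data, X_tgt):
    d = np.array([domain_distance(Xs, X_tgt) for Xs in source_data])
    w = 1.0 / (d + 1e-9)
    w /= w.sum()                                   # similarity-based weights
    preds = np.stack([m.predict(X_tgt) for m in source_models])
    return w @ preds                               # weighted regression output

rng = np.random.default_rng(0)
sources = [rng.normal(m, 1.0, (100, 3)) for m in (0.0, 0.5, 2.0)]  # 3 "persons"
models = [LinearRegression().fit(Xs, Xs.sum(axis=1)) for Xs in sources]
X_tgt = rng.normal(0.4, 1.0, (10, 3))              # new, unlabeled person
print(combined_prediction(models, sources, X_tgt))
```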
Robustness and Regularization in Hierarchical Re-Basin
Benedikt Franke, Florian Heinrich, Markus Lange, Arne Raulf
https://doi.org/10.14428/esann/2024.ES2024-22
Abstract:
This paper takes a closer look at Git Re-Basin, an interesting new approach to merge trained models. We propose a hierarchical model merging scheme that significantly outperforms the standard MergeMany algorithm. With our new algorithm, we find that Re-Basin induces adversarial and perturbation robustness into the merged models, with the effect becoming stronger the more models participate in the hierarchical merging scheme. However, in our experiments Re-Basin induces a much bigger performance drop than reported by the original authors.
Lightweight Cross-Modal Representation Learning
Bilal Faye, Hanane Azzag, Mustapha Lebbah, Djamel Bouchaffra
https://doi.org/10.14428/esann/2024.ES2024-96
Abstract:
Low-cost cross-modal representation learning is crucial for deriving semantic representations across diverse modalities such as text, audio, images, and video. Traditional approaches typically depend on large specialized models trained from scratch, requiring extensive datasets and resulting in high resource and time costs. To overcome these challenges, we introduce a novel approach named Lightweight Cross-Modal Representation Learning (LightCRL). This method uses a single neural network, the Deep Fusion Encoder (DFE), which projects data from multiple modalities into a shared latent representation space. This reduces the overall parameter count while still delivering robust performance comparable to more complex systems.
Human Activity Recognition from Thigh and Wrist Accelerometry
Alejandro Castellanos Alonso, Antonio López, Diego Garcia-Perez, Diego Álvarez, Juan Carlos Alvarez
https://doi.org/10.14428/esann/2024.ES2024-75
Abstract:
The IMPaCT Cohort (ISCIII, Spain) is expected to collect biomechanical parameters from a wide population (~200,000) over seven consecutive days, using a triaxial accelerometer and a gyroscope positioned on both the wrist and thigh of participants. This will be one of the distinctive features of the Cohort, based on the hypothesis that simultaneous placement of two devices on the wrist and thigh will enable accurate classification of subjects' activity. In this study, we aim to explore this crucial aspect using Deep CNNs and data from publicly available datasets. Our experimental findings demonstrate an 85% accuracy achieved when utilizing data from both the thigh and wrist. The results support the hypothesis that incorporating accelerometry data from both limbs enhances classification, yielding over a 15% increase in accuracy compared to using data from a single limb alone.
On Fb-score and Cost-Consistency in Evaluation of Imbalanced Classification
Aleksi Avela
https://doi.org/10.14428/esann/2024.ES2024-186
Abstract:
Among the many difficulties of imbalanced classification, the evaluation of classifiers is rarely trivial. The Fb-score is often recommended as one of the go-to evaluation measures in imbalanced classification, but researchers have voiced concerns about whether the Fb-score is in fact an appropriate measure. In this paper, we introduce a framework of cost-consistency, i.e., whether an evaluation measure is consistent with total classification cost at least for some cost and class imbalance ratio, and show that, with a simple cost structure, the Fb-score is not cost-consistent.
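For reference, with TP, FP, and FN the usual confusion-matrix counts, the quantities involved can be written as follows; the cost coefficients c_FP and c_FN are illustrative placeholders for "a simple cost structure":

```latex
F_b = \frac{(1+b^2)\,\mathrm{TP}}{(1+b^2)\,\mathrm{TP} + b^2\,\mathrm{FN} + \mathrm{FP}},
\qquad
\mathrm{Cost} = c_{\mathrm{FP}}\,\mathrm{FP} + c_{\mathrm{FN}}\,\mathrm{FN}.
```

Cost-consistency then asks whether ranking classifiers by F_b can agree with ranking them by such a total cost for at least one choice of (c_FP, c_FN) and class imbalance ratio.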
Decision fusion based multimodal hierarchical method for speech emotion recognition from audio and text
Nawal Alqurashi, Yuhua Li, Kirill Sidorov, David Marshall
https://doi.org/10.14428/esann/2024.ES2024-219
Abstract:
Expressing emotions is essential in human interaction. Often, individuals convey emotions through neutral speech, while the underlying meaning carries emotional weight. Conversely, tone can also convey emotion despite neutral words. Most Speech Emotion Recognition research overlooks this. We address this gap with a multimodal emotion recognition system using hierarchical classifiers and a novel decision fusion method. Our approach analyses emotional cues from speech and text, measuring their impact on predicted classes, considering emotional or neutral contributions for each instance. Results on the IEMOCAP dataset show our method's effectiveness: 69.45% and 65.26% weighted accuracy in speaker-dependent and speaker-independent settings, respectively.
Trust in Artificial Intelligence: Beyond Interpretability
Trust in Artificial Intelligence: Beyond Interpretability
Tassadit Bouadi, Benoit Frénay, Luis Galárraga, Pierre Geurts, Barbara Hammer, Gilles Perrouin
https://doi.org/10.14428/esann/2024.ES2024-6
Abstract:
As artificial intelligence (AI) systems become increasingly integrated into everyday life, the need for trustworthiness in these systems has emerged as a critical challenge. This tutorial paper addresses the complexity of building trust in AI systems by exploring recent advances in explainable AI (XAI) and related areas that go beyond mere interpretability. After reviewing recent trends in XAI, we discuss how to control AI systems, align them with societal concerns, and address the robustness, reproducibility, and evaluation concerns inherent in these systems. This review highlights the multifaceted nature of the mechanisms for building trust in AI, and we hope it will pave the way for further research in this area.
Interpreting Hybrid AI through Autodecoded Latent Space Entities
Roland Veen, Christodoulos Hadjichristodoulou, Michael Biehl
https://doi.org/10.14428/esann/2024.ES2024-170
Abstract:
Explainable AI models and methods have seen a rise in interest in recent years as a reaction to the widespread use of neural networks and similar black-box models in machine learning. In this project, we combine explainable, prototype-based systems and neural networks in an effort to benefit from both approaches. Specifically, we employ Generalized Matrix Relevance Learning Vector Quantization in combination with autoencoder networks. This allows us to perform automated non-linear feature extraction from high-dimensional inputs before feeding them into LVQ for classification. Moreover, the approach enables the mapping of the low-dimensional representatives and relevances back to the original feature space for visual inspection and interpretation.
ProtoNCD: Prototypical Parts for Interpretable Novel Class Discovery
Tomasz Michalski, Dawid Rymarczyk, Daniel Barczyk, Bartosz Zieliński
https://doi.org/10.14428/esann/2024.ES2024-70
Abstract:
In this work, we introduce ProtoNCD, a novel approach to novel class discovery (NCD) that leverages prototypical parts for enhanced interpretability. ProtoNCD extends the ProtoPool methodology to the NCD setting, employing techniques such as knowledge distillation and specialized prototypical parts initialization. Through comprehensive experiments on the CUB-200-2011 dataset, we demonstrate the efficacy of ProtoNCD and its pivotal role in explaining how the reasoning of known classes influences predictions for those newly discovered.
Evaluating the Quality of Saliency Maps for Distilled Convolutional Neural Networks
Jasper Wilfling, Matias Valdenegro-Toro, Marco Zullich
https://doi.org/10.14428/esann/2024.ES2024-131
Abstract:
Knowledge Distillation (KD) is a popular technique to compress Deep Neural Networks. Studies on KD often evaluate it on the basis of accuracy and time-complexity; however, there exist other facets of model performance, like explainability and fairness. In the present work, we evaluate the quality of saliency maps in terms of faithfulness and coherence in the context of KD and compare the results obtained with the uncompressed model. Our findings indicate that KD potentially decreases the accuracy of the saliency maps, thus acting as a warning on the usage of KD when high-quality explanations are required.
Safety-Oriented Pruning and Interpretation of Reinforcement Learning Policies
Dennis Gross, Helge Spieker
https://doi.org/10.14428/esann/2024.ES2024-71
Abstract:
Pruning neural networks (NNs) can streamline them but risks removing vital parameters from safe reinforcement learning (RL) policies. We introduce an interpretable RL method called VERINTER, which combines NN pruning with model checking to ensure interpretable RL safety. VERINTER exactly quantifies the effects of pruning and the impact of neural connections on complex safety properties by analyzing changes in safety measurements. This method maintains safety in pruned RL policies and enhances understanding of their safety dynamics, which has proven effective in multiple RL settings.
Evaluation methodology for disentangled uncertainty quantification on regression models
Kevin Pasini, Clément Arlotti, Milad Leyli-Abadi, Marc Nabhan, Johanna Baro
https://doi.org/10.14428/esann/2024.ES2024-87
Abstract:
A practical way to enhance the confidence of the predictions made by Machine Learning (ML) models is to enrich them with trustworthiness add-ons such as Uncertainty Quantification (UQ). Existing UQ paradigms capture two intertwined components (epistemic and aleatoric), but few of them evaluate their disentanglement, even less so on real data. We thus propose and implement a methodology to assess the effectiveness of uncertainty disentanglement despite the absence of ground truth in real datasets. To do so, we use a data-withdrawal strategy to simulate Out-of-Distribution (OOD) data and evaluate four state-of-the-art UQ approaches.
Influence of Data Characteristics on Machine Learning Classification Performance and Stability of SHapley Additive exPlanations
Anusha Ihalapathirana, Gunjan Chandra, Piia Lavikainen, Pekka Siirtola, Satu Tamminen, Nirzor Talukder, Janne Martikainen, Juha Röning
https://doi.org/10.14428/esann/2024.ES2024-107
Abstract:
This study explores the effects of different data sizes and data imbalance on model performance and the stability of SHapley Additive exPlanations (SHAP). The study utilizes a Type 2 diabetes (T2D) dataset to train three machine learning (ML) models: linear discriminant analysis, XGBoost, and a neural network. It shows that adjusting the background dataset size leads to variations in the SHAP values, with decreased variance observed in larger and balanced datasets. Furthermore, the study highlights that the data characteristics leading to high model performance may not always produce reliable and stable SHAP explanations.
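A sketch of how background-size effects on SHAP stability can be probed, assuming the shap package's KernelExplainer; the model and data below are illustrative stand-ins for the paper's T2D setting:

```python
# Sketch: measure the spread of SHAP values across resampled background
# sets of different sizes; dataset, model, and sizes are illustrative.
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

for bg_size in (10, 50, 200):
    values = []
    for seed in range(3):                  # repeat with resampled backgrounds
        idx = np.random.default_rng(seed).choice(len(X), bg_size, replace=False)
        explainer = shap.KernelExplainer(
            lambda d: model.predict_proba(d)[:, 1], X[idx])
        values.append(explainer.shap_values(X[:5], nsamples=100))
    spread = np.std(np.stack(values), axis=0).mean()
    print(f"background={bg_size}: mean SHAP std across runs = {spread:.4f}")
```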
Insight-SNE: Understanding t-SNE Embeddings through Interactive Explanation
Sacha Corbugy, Thibaut Septon, Bruno Dumas, Benoit Frénay
https://doi.org/10.14428/esann/2024.ES2024-190
Abstract:
Non-linear dimensionality reduction techniques offer insights into complex datasets, yet interpreting them poses challenges. While some papers provide methods for explaining DR, and others focus on interactively exploring embeddings, there are currently no works that seamlessly combine both aspects. Our contribution, Insight-SNE, is an interactive tool for exploring t-SNE embeddings through their related gradient-based explanations; we also report its evaluation with expert users.
Does a Reduced Fine-Tuning Surface Impact the Stability of the Explanations of LLMs?
Jérémie Bogaert, François-Xavier Standaert
https://doi.org/10.14428/esann/2024.ES2024-194
Abstract:
Explainability is an increasingly demanded feature for the deployment of LLMs. In this context, it has been shown that the explanations of models that are equivalent from the accuracy viewpoint can differ due to their training randomness, leading to a need to characterize the explanations' distribution and to understand the origin of this sensitivity. In this paper, we investigate whether the fine-tuning surface, defined as the number of bits that are fine-tuned in an LLM, can serve as a good proxy for the stability of its explanations. We answer negatively and show that two different approaches for reducing the fine-tuning surface, namely quantizing and freezing (a part of) the models, lead to very different outcomes.
Nonlinear dimensionality reduction and unsupervised learning
Positive and Scale Invariant Gaussian Process Latent Variable Model for Astronomical Spectra
Nikos Gianniotis, Kai L. Polsterer, Iliana Isabel Cortés Pérez
https://doi.org/10.14428/esann/2024.ES2024-72
Abstract:
We propose a probabilistic model that reduces the dimensionality of positive-valued data in a scale-invariant way, treating data items that differ only in scaling as identical. Extending the Gaussian Process Latent Variable Model, we ensure positive function values by applying a non-linear transformation to latent function values. To address the intractable marginal log-likelihood, we utilize a variational lower bound and amortized inference to reduce the number of variational parameters. We apply our model to reconstructing partially observed spectra and show how its scale-invariant property leads to better reconstructions.
Forget early exaggeration in t-SNE: early hierarchization preserves global structure
Lee John, Edouard Couplet, Pierre Lambert, Ludovic Journaux, Dounia Mulders, Cyril de Bodt, Michel Verleysen
https://doi.org/10.14428/esann/2024.ES2024-146
Abstract:
As a local method of dimensionality reduction, t-SNE requires careful initialization in order to preserve the data's global structure to the best extent. In regular t-SNE, the low-dimensional embedding is initialized either randomly or with PCA coordinates; next, gradient descent refines the embedding coordinates in two phases. In the first one, called \emph{early exaggeration}, attractive forces between points are artificially strengthened to delay any detrimental effect of repulsive forces while points are still poorly organized. In this paper, a novel initialization of t-SNE is proposed. It works by hierarchizing the data points into a space-partitioning binary tree and running t-SNE successively with 4, 8, 16, ..., N points. Between two runs, the prototypical point in each tree branch is split into its two children prototypes, with a little random noise, and the embedding is rescaled to account for the increased population. Experimental results show the effectiveness of the method. The proposed method is compatible with any neighbor embedding method (t-SNE, UMAP, etc.), provided early exaggeration can be disabled and initial coordinates can be fed in.
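A condensed sketch of the idea, using KMeans as a stand-in for the paper's space-partitioning binary tree and scikit-learn's TSNE with early exaggeration disabled; the rescaling factor, jitter scale, and stopping budget are illustrative:

```python
# Sketch of hierarchical t-SNE initialization: embed a few prototypes, then
# repeatedly double their number, seeding each level with the parents'
# (rescaled, jittered) coordinates. Illustrative, not the paper's exact scheme.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

X = load_digits().data
rng = np.random.default_rng(0)

def embed(points, init, k):
    # early exaggeration disabled, as the hierarchical scheme replaces it
    return TSNE(init=init, perplexity=max(2, min(30, (k - 1) // 3)),
                early_exaggeration=1.0, random_state=0).fit_transform(points)

k = 4
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
Y = embed(km.cluster_centers_, "pca", k)

while k < 64:                      # the paper continues doubling up to N points
    k *= 2
    km_new = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    # seed each child with its nearest parent's coordinates plus jitter
    d = ((km_new.cluster_centers_[:, None] - km.cluster_centers_[None]) ** 2).sum(-1)
    init = Y[d.argmin(axis=1)] * np.sqrt(2) + rng.normal(0, 1e-4, (k, 2))
    Y = embed(km_new.cluster_centers_, init, k)
    km = km_new
```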
Estimated neighbour sets and smoothed sampled global interactions are sufficient for a fast approximate t-SNE
Pierre Lambert, Edouard Couplet, Cyril de Bodt, Lee John
https://doi.org/10.14428/esann/2024.ES2024-203
Abstract:
To minimise its loss function, the popular method of nonlinear dimensionality reduction t-SNE requires O(N^2) computations. As its applications often involve large datasets, fast approximations have been developed, such as Barnes-Hut t-SNE and FIt-SNE. Most fast approximations to t-SNE require the embedding dimensionality to be small, typically 2 or 3, limiting the use of t-SNE to data visualisation. Additionally, the effective computation time of the current accelerated t-SNE algorithms stays too high for a comfortable interactive visual exploration of data. This paper proposes an accelerated approximation of t-SNE with iterations of complexity O(NK), which does not rely on the use of a model to capture information about the low-dimensional space, relieving the computational burden of high dimensionality of the embedding space. For this purpose, the proposed method approximates neighbour sets and keeps track of smoothed estimations of long-range interactions in O(NK) time. The method is qualitatively tested on a handful of datasets and shows comparable results to existing fast neighbour embedding methods in the context of data visualisation. Code is available at https://github.com/PierreLambert3/c_fast_hSNE.git.
Hyperbolic Metabolite-Disease Association Prediction
Domonkos Pogány, Péter Antal
https://doi.org/10.14428/esann/2024.ES2024-29
Abstract:
In biomarker research, there is a growing demand for computational methods to efficiently identify novel metabolite-disease associations (MDAs). Current approaches, however, do not take into account the underlying geometry of the MDA space. Here, we show that classifiers leveraging hyperbolic embeddings achieve comparable results to their Euclidean counterparts with significantly lower dimensionality, aligning better with the association network's scale-free nature. Finally, through a case study, we provide an interpretation of the model embeddings and investigate newly predicted associations. Our results demonstrate the intrinsic non-Euclidean geometry of the MDA space, providing direction for further research. A PyTorch-based implementation is available at https://github.com/PDomonkos/hyperbolic-MDA-prediction.
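For reference, the Poincaré-ball distance commonly underlying such hyperbolic embeddings, with a toy metabolite/disease pair; scoring associations by this distance is one simple decision rule, not necessarily the paper's exact classifier:

```python
# Sketch of the Poincare-ball distance between two embedded points
# (both must lie strictly inside the unit ball); vectors are toy values.
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    uu, vv = (u * u).sum(), (v * v).sum()
    duv = ((u - v) ** 2).sum()
    arg = 1 + 2 * duv / ((1 - uu) * (1 - vv) + eps)
    return np.arccosh(arg)

metabolite = np.array([0.1, 0.2])
disease = np.array([0.6, -0.3])
print(poincare_distance(metabolite, disease))
```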
Interactive Machine Learning-Powered Dashboard for Energy Analytics in Residential Buildings
Diego Garcia-Perez, Ignacio Diaz-Blanco, Jose M. Enguita-Gonzalez, Jorge Menéndez, Abel A. Cuadrado-Vega
https://doi.org/10.14428/esann/2024.ES2024-130
Abstract:
Efforts to reduce energy consumption in buildings are crucial for climate change concerns. In this sense, energy monitoring increases energy awareness and mitigates energy wastes. This study integrates machine learning models, advanced visualisations, and interactive tools to create an insightful energy monitoring dashboard. Novel contributions include a 2D map of daily energy demand profiles combining spatial encodings based on t-SNE, fluid aggregation, and filter operations via a data-cube framework, as well as visual encoding powered by morphing projections. This approach facilitates the decisions of end users regarding the optimisation of energy in residential facilities.
Exploring Self-Organizing Maps for Addressing Semantic Impairments
Jorge Graneri, Sebastian Basterrech, Gerardo Rubino, Eduardo Mizraji
https://doi.org/10.14428/esann/2024.ES2024-215
Abstract:
Since the 1990s, Self-Organizing Maps (SOMs) have been instrumental in reducing dimensionality and visualizing high-dimensional data. This study adapts SOMs to explore the neural representation of human concepts, their neural ‘word net’ mapping, and the deterioration of these mappings in certain neurological disorders. Our model draws inspiration from semantic dementia, a severe condition that degrades semantic knowledge in the brain. Although our exploration utilizes a low-dimensional model - a rough simplification with respect to our brains - it successfully replicates observed clinical patterns. These promising results inspire further research to enhance our understanding of language pathophysiology in neurological disorders.
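A minimal numpy sketch of the SOM update rule such a study builds on; the map size, decay schedule, and data below are toy choices, and the paper's semantic-network setup is richer:

```python
# Sketch of the classic SOM learning step: find the best-matching unit,
# then pull it and its grid neighbours toward the input.
import numpy as np

rng = np.random.default_rng(0)
grid = np.stack(np.meshgrid(np.arange(10), np.arange(10)), axis=-1).reshape(-1, 2)
W = rng.normal(size=(100, 3))                            # 10x10 map, 3-d inputs

def som_step(x, t, T, sigma0=3.0, lr0=0.5):
    bmu = np.argmin(((W - x) ** 2).sum(axis=1))          # best-matching unit
    sigma = sigma0 * 0.01 ** (t / T)                     # shrinking neighborhood
    lr = lr0 * 0.01 ** (t / T)                           # decaying learning rate
    h = np.exp(-((grid - grid[bmu]) ** 2).sum(axis=1) / (2 * sigma ** 2))
    W += lr * h[:, None] * (x - W)                       # pull neighbours toward x

data = rng.normal(size=(500, 3))
for t in range(2000):
    som_step(data[t % len(data)], t, 2000)
```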
HDBSCAN for 3rd-order tensors
Dina Faneva Andriantsiory, Joseph Ben Geloun, Mustapha Lebbah
https://doi.org/10.14428/esann/2024.ES2024-198
Abstract:
Several methods for tensor clustering require hyperparameters such as the cluster size or the number of clusters per mode. These methods present a challenge because, for real datasets, such inputs cannot be determined without incurring significant costs. Recently, Multi-Slice Clustering (MSC) has addressed this issue by utilizing a threshold parameter to perform data clustering. MSC identifies signal slices that reside in a lower-dimensional subspace within a 3rd-order rank-1 tensor dataset. However, determining the tensor rank remains a complex task. The current work introduces a new approach to tensor clustering that can extract clusters of similar slices and is also capable of finding co-clustering and triclustering in 3rd-order tensors of any rank. Our algorithm is based on the density of the data.
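A simplified sketch of density-based clustering of tensor slices: flatten each mode-1 slice to a vector and cluster with HDBSCAN. The planted cluster and hyperparameters are illustrative, and the paper's method works on slice similarities rather than raw flattened slices:

```python
# Sketch: cluster the mode-1 slices of a 3rd-order tensor with HDBSCAN.
import numpy as np
from sklearn.cluster import HDBSCAN  # requires scikit-learn >= 1.3

rng = np.random.default_rng(0)
T = rng.normal(size=(30, 20, 20))           # 3rd-order tensor: 30 slices
T[:10] += 3.0                               # plant a group of similar slices

slices = T.reshape(T.shape[0], -1)          # one vector per slice
labels = HDBSCAN(min_cluster_size=5).fit_predict(slices)
print(labels)                               # -1 marks noise slices
```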
Graph learning
Large-Scale Continuous Structure Learning from Time-Series Data
Filippo Michelis, Riccardo Massidda, Davide Bacciu
https://doi.org/10.14428/esann/2024.ES2024-120
Abstract:
Structure learning is the problem of recovering from data a Directed Acyclic Graph (DAG) of the interactions among variables. By enforcing a differentiable acyclicity constraint on the adjacency matrix of the graph, existing methods solve this problem as an optimization problem and have been recently extended to time-series data. Due to the cubic computational complexity of existing acyclicity constraints, their application is limited to a few variables. In this paper, we introduce SVARCOSMO, an optimization-based structure learning method for time-series data that builds upon recent developments on unconstrained but provably acyclic models. We empirically show on both simulated and real data that SVARCOSMO correctly recovers the underlying DAG in significantly less time, enabling optimization-based structure learning on high-dimensional data.
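For context, the classic differentiable acyclicity constraint whose cubic cost motivates this line of work is h(A) = tr(exp(A ∘ A)) - d, which is zero exactly when the weighted adjacency matrix A describes a DAG; the matrix exponential costs O(d^3). A small check:

```python
# The NOTEARS-style acyclicity function; SVARCOSMO-like unconstrained
# formulations avoid evaluating it.
import numpy as np
from scipy.linalg import expm

def notears_acyclicity(A):
    d = A.shape[0]
    return np.trace(expm(A * A)) - d   # 0 iff the weighted graph is acyclic

A_dag = np.array([[0.0, 1.0], [0.0, 0.0]])   # edge 0 -> 1, no cycle
A_cyc = np.array([[0.0, 1.0], [1.0, 0.0]])   # 2-cycle
print(notears_acyclicity(A_dag), notears_acyclicity(A_cyc))  # 0.0, > 0
```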
Noise Robust One-Class Intrusion Detection on Dynamic Graphs
Aleksei Liuliakov, Alexander Schulz, Luca Hermes, Barbara Hammer
https://doi.org/10.14428/esann/2024.ES2024-124
Abstract:
In the domain of network intrusion detection, robustness against contaminated and noisy data inputs remains a critical challenge. This study introduces a probabilistic version of the Temporal Graph Network Support Vector Data Description (TGN-SVDD) model, designed to enhance detection accuracy in the presence of input noise. By predicting the parameters of a Gaussian distribution for each network event, our model is able to naturally handle noisy adversarial inputs and improve robustness compared to a baseline model. Our experiments on a modified CIC-IDS2017 data set with synthetic noise demonstrate significant improvements in detection performance compared to the baseline TGN-SVDD model, especially as noise levels increase.
SAT Instances Generation Using Graph Variational Autoencoders
Daniel Crowley, Marco Dalla, Barry O'Sullivan, Andrea Visentin
https://doi.org/10.14428/esann/2024.ES2024-223
Abstract:
This paper presents a SAT instance generator using a Graph Variational Autoencoder (GVAE) architecture that outperforms existing generative deep learning models in speed and requires minimal post-processing. Our computational analyses benchmark this model against current deep learning techniques, introducing advanced metrics for more accurate evaluation. This new model is unique in its ability to maintain partial satisfiability of SAT instances while significantly reducing computational time. Although no method perfectly addresses all challenges in generating SAT instances, our approach marks a significant step forward in the efficiency and effectiveness of SAT instance generation.
Dual Stream Graph Transformer Fusion Networks for Enhanced Brain Decoding
Lucas Goené, Siamak Mehrkanoon
https://doi.org/10.14428/esann/2024.ES2024-23
Abstract:
This paper presents the novel Dual Stream Graph-Transformer Fusion (DS-GTF) architecture designed specifically for classifying task-based Magnetoencephalography (MEG) data. In the spatial stream, inputs are initially represented as graphs, which are then passed through graph attention networks (GAT) to extract spatial patterns. Two methods, TopK and Thresholded Adjacency, are introduced for initializing the adjacency matrix used in the GAT. In the temporal stream, the Transformer Encoder receives concatenated windowed input MEG data and learns new temporal representations. The learned temporal and spatial representations from both streams are fused before reaching the output layer. Experimental results demonstrate an enhancement in classification performance and a reduction in standard deviation across multiple test subjects compared to other examined models.
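A numpy sketch of the two adjacency initializations named above, applied to a toy sensor-correlation matrix; the correlation-based construction and parameter values are illustrative assumptions, not the paper's exact recipe:

```python
# Sketch: TopK vs. thresholded adjacency initialization from channel
# correlations; data and parameters are toy values.
import numpy as np

def topk_adjacency(C, k):
    A = np.zeros_like(C)
    idx = np.argsort(-np.abs(C), axis=1)[:, :k]   # K strongest per row
    np.put_along_axis(A, idx, 1.0, axis=1)
    return A

def thresholded_adjacency(C, tau):
    return (np.abs(C) >= tau).astype(float)

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))                    # 8 channels (toy MEG data)
C = np.corrcoef(X, rowvar=False)
print(topk_adjacency(C, 3).sum(axis=1))           # 3 connections per node (incl. self)
print(thresholded_adjacency(C, 0.05).sum())
```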
Link prediction heuristics for temporal graph benchmark
Manuel Dileo, Matteo Zignani
https://doi.org/10.14428/esann/2024.ES2024-141
Abstract:
Link prediction is one of the most well-known and studied problems in graph machine learning, successfully applied in different settings, such as predicting network evolution in online social networks, protein-to-protein interactions, or completing links in knowledge graphs. In recent years, we have witnessed several solutions based on deep learning methods for solving this task in the context of temporal networks. However, despite their effectiveness on static graphs, traditional heuristic-based approaches from network science research have never been considered as potential baselines for these benchmarks. For this reason, in this work, we tested four of the most well-known and simple heuristics for link prediction on the most widely adopted temporal graph benchmark (TGB). Our results show that simple link prediction heuristics can reach comparable results with state-of-the-art deep learning techniques and, thanks to their interpretability, give insights into the network being studied. We believe considering heuristic-based baselines will push the temporal graph learning community toward better models for link prediction.
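A sketch of the kind of heuristics meant, computed with networkx on a toy static graph; the paper applies such scores to TGB's temporal data:

```python
# Classic link prediction heuristics scored on candidate node pairs.
import networkx as nx

G = nx.karate_club_graph()
pairs = [(0, 9), (5, 16), (1, 33)]

print("common neighbours:",
      [(u, v, len(list(nx.common_neighbors(G, u, v)))) for u, v in pairs])
print("jaccard:", list(nx.jaccard_coefficient(G, pairs)))
print("adamic-adar:", list(nx.adamic_adar_index(G, pairs)))
print("pref. attachment:", list(nx.preferential_attachment(G, pairs)))
```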
Inductive lateral movement detection in enterprise computer networks
Corentin Larroche
https://doi.org/10.14428/esann/2024.ES2024-19
Abstract:
Lateral movement is a crucial phase of advanced cyberattacks, during which attackers propagate from host to host within the targeted network. State-of-the-art methods for detecting this behavior rely on graph-based learning algorithms, which typically use node embeddings to detect anomalous edges between hosts. Once trained, such models cannot easily generalize to new hosts joining the network or to a different network, which is impractical in real-world applications. We investigate the detection performance of an inductive link prediction model, which can generalize to graphs not seen during training, and find that it performs as well as state-of-the-art transductive methods in a zero-shot setting. This opens promising perspectives for practical lateral movement detection.
T-WinG: Windowing for Temporal Knowledge Graph Completion
Ngoc-Trung Nguyen, Thanh Vu, Thanh Le
https://doi.org/10.14428/esann/2024.ES2024-166
Abstract:
In the domain of Temporal Knowledge Graph Completion, existing models often struggle to efficiently capture the intricate temporal dynamics and interactions within knowledge graphs. To address these challenges, this paper introduces T-WinG, a novel approach that incorporates the Swin Transformer architecture, renowned for its efficacy in hierarchical representation learning. By integrating SPLIME's preprocessing techniques and refining the Swin Transformer's token mixer, T-WinG substantially improves performance. Specifically, our model demonstrates an improvement of up to 20% on ranking metrics such as Mean Reciprocal Rank (MRR) and Hits@K across four benchmark datasets, compared to the best-performing baseline models. These results not only underscore T-WinG's ability to handle dynamic temporal data but also highlight its potential to address the pressing needs of real-world applications requiring accurate and timely insights from knowledge graphs.
Exploring Temporal Knowledge Graphs with Compositional Interactions and Diachronic Mechanisms
Loc Tran, Bac Le, Thanh Le
https://doi.org/10.14428/esann/2024.ES2024-173
Abstract:
Temporal Knowledge Graphs (TKGs) organize dynamic real-world facts, adding a time dimension to the multi-relational graph structure of Knowledge Graphs (KGs). We leverage the expressive power of graph convolutional networks (GCNs) for modeling TKGs, recognizing similarities with handling graph-structured data and utilizing complex geometry. Our approach emphasizes compositional interactions between relations and entities, integrating a diachronic mechanism to enhance representation with both graph structure and temporal dynamics. Experimental results on benchmark datasets, employing various composition operators, showcase the effectiveness of our model in link prediction tasks.
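The diachronic mechanism is not spelled out in the abstract; a common diachronic entity embedding in the literature (which this paper may or may not follow exactly) makes a fraction γ of each entity vector explicitly time-dependent:

```latex
z_e(t)[i] =
\begin{cases}
  a_e[i]\,\sin\!\big(w_e[i]\,t + b_e[i]\big), & i \le \gamma d,\\
  a_e[i], & \text{otherwise,}
\end{cases}
```

where a_e, w_e, b_e are learned per entity e and the remaining dimensions carry static structural information; such time-aware vectors can then be composed with relation embeddings inside a GCN layer.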
Domain Knowledge Integration in Machine Learning Systems
Domain Knowledge Integration in Machine Learning Systems - An Introduction
Marika Kaden, Sascha Saralajew, Thomas Villmann
https://doi.org/10.14428/esann/2024.ES2024-5
Abstract:
Knowledge integration into machine learning systems is a promising and successful strategy to achieve more plausible and consistent results. The plausibility is accompanied by better model interpretability due to the adjustment of the machine learning system to domain-specific requirements and restrictions. Further, informed machine learning can be seen as a particular task-specific regularization of the model, leading to better learning convergence and frequently also requiring a smaller amount of training data. This short introduction paper addresses some recent aspects of how domain knowledge can be integrated into learning systems at different levels, ranging from informed feature extraction to domain-adjusted structure and model architecture.
Tumor Grading via Decorrelated Sparse Survival Regression
Benjamin Paassen, Nadine Gaisa, Michael Rose, Mark-Sebastian Bösherz
https://doi.org/10.14428/esann/2024.ES2024-44
Abstract:
In medical pathology, tumor grading is concerned with estimating the risk posed by a tumor, based on its pathological features. One way to infer risk scores is survival regression, i.e., using machine learning to infer a score that predicts the remaining survival time of a patient. Unfortunately, if applied naively, such a score is a mix of the intrinsic risk posed by the tumor and other risk factors, like the progression of the tumor or patient gender and age. We provide the first survival regression model that disentangles tumor grading from undesired correlations, while retaining a high degree of model interpretability, thanks to convex optimization, non-negativity constraints, sparsity, and linearity. We evaluate the proposed approach both on simulated and real-world data from N=114 patients at the University Clinic Aachen.
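As a minimal sketch of the kind of objective described (our own simplification with hypothetical penalty weights, not the authors' exact model), one can combine a Cox-style partial likelihood with a sparsity term and a decorrelation penalty against known confounders, minimized under non-negativity constraints:

```python
import numpy as np
from scipy.optimize import minimize

def neg_log_partial_likelihood(w, X, time, event):
    """Cox-style negative log partial likelihood (no tie handling)."""
    scores = X @ w
    order = np.argsort(-time)              # sort by descending survival time
    s = scores[order]
    log_risk = np.logaddexp.accumulate(s)  # log-sum-exp over each risk set
    return -np.sum((s - log_risk)[event[order] == 1])

def objective(w, X, time, event, Z, lam=0.1, mu=1.0):
    # lam * w.sum() acts as an L1 penalty because w is constrained non-negative;
    # the last term penalizes covariance between the score and confounders Z.
    decorr = np.sum((Z.T @ (X @ w)) ** 2)
    return neg_log_partial_likelihood(w, X, time, event) + lam * w.sum() + mu * decorr

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
Z = rng.normal(size=(50, 2)); Z -= Z.mean(axis=0)     # centered confounders
time = rng.exponential(size=50); event = rng.integers(0, 2, size=50)
res = minimize(objective, x0=np.full(5, 0.5), args=(X, time, event, Z),
               bounds=[(0.0, None)] * 5)              # non-negativity constraint
print(res.x)
```

The bounded, convex-style formulation is what keeps the fitted weights directly readable as per-feature risk contributions.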
Physics-Aware Normalizing Flows: Leveraging Electric Circuit Models in Adversarial Learning
Benjamin Schindler, Thomas Schmid
https://doi.org/10.14428/esann/2024.ES2024-177
Abstract:
We introduce Physics-Aware Normalizing Flows, a novel framework combining data-driven generative modeling with a physical layer based on an Electric Circuit Model, ensuring adherence to electricity laws, sample fidelity, and explainability. Four existing Normalizing Flow architectures, including Real-NVP and NSF, were adapted to our adversarial regime and evaluated with promising results for the ad hoc determination of value ranges of physical quantities and for the generation of labeled measurements from an unlabeled dataset. Through extensive data generation with our self-explainable approach, Random Forest regressions of the underlying physical quantities could be improved significantly compared to training on the original dataset, whose ground-truth labels are omitted.
Leveraging Physics-Informed Neural Networks as Solar Wind Forecasting Models
Nuno Costa, Filipa S. Barros, João J. G. Lima, Rui F. Pinto, André Restivo
https://doi.org/10.14428/esann/2024.ES2024-110
Abstract:
Space weather refers to the dynamic conditions in the solar system, particularly the interactions between the solar wind - a stream of charged particles emitted by the Sun - and the Earth's magnetic field and atmosphere. Accurate space weather forecasting is crucial for mitigating potential impacts on satellite operations, communication systems, power grids, and astronaut safety. However, existing solar wind coronal models like MULTI-VP require substantial computational resources. This paper proposes a Physics-Informed Neural Network (PiNN) as a faster yet accurate alternative that respects physical laws. PiNNs blend physics and data-driven techniques for rapid and reliable forecasts. Our studies show that PiNNs can reduce computation times and deliver forecasts comparable to MULTI-VP, offering an expedited and dependable solar wind forecasting approach.
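The PiNN recipe itself is generic: a network is fitted to sparse data while the automatic-differentiation residual of the governing equations is penalized at collocation points. A toy PyTorch sketch for the 1-D ODE du/dx = -u (illustrative physics only, not the MULTI-VP equations):

```python
import torch

net = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(),
                          torch.nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

x_data = torch.tensor([[0.0]]); u_data = torch.tensor([[1.0]])    # u(0) = 1
x_col = torch.linspace(0.0, 2.0, 50).reshape(-1, 1).requires_grad_(True)

for step in range(2000):
    opt.zero_grad()
    u = net(x_col)
    du = torch.autograd.grad(u, x_col, torch.ones_like(u), create_graph=True)[0]
    loss = ((du + u) ** 2).mean() + ((net(x_data) - u_data) ** 2).mean()
    loss.backward()                   # physics residual + data misfit
    opt.step()
```

Once trained, inference is a single forward pass, which is where the speed-up over a full numerical solver comes from.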
Online learning and concept drift
Self-Supervised Learning from Incrementally Drifting Data Streams
Valerie Vaquet, Jonas Vaquet, Fabian Hinder, Kleanthis Malialis, Christos Panayiotou, Marios Polycarpou, Barbara Hammer
https://doi.org/10.14428/esann/2024.ES2024-49
Abstract:
Supervised online learning relies on the assumption that ground truth information is available for model updates at each time step. As this is not realistic in every setting, alternatives such as active online learning, or online learning with verification latency, have been proposed. In this work, we argue that, provided we can characterize the expected concept drift as incremental drift, we can rely on a self-labeling strategy to keep models updated without label information being available. We derive a knn-based self-labeling online learner implementing the presented self-supervised scheme and show experimentally that it is a viable option for incrementally drifting data streams in the absence of label information.
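A minimal sketch of such a k-nn self-labeling learner (variable names and the sliding-window memory are our assumptions): predict each incoming unlabeled batch, then recycle those predictions as pseudo-labels for the next update, which tracks the concept as long as the drift per step stays small:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def self_labeling_stream(batches, X0, y0, k=5, window=500):
    """Yield predictions per unlabeled batch, then reuse them as labels.

    batches: iterable of (n_t, d) arrays arriving over time;
    (X0, y0): the only truly labeled data, from before deployment.
    """
    X_mem, y_mem = np.asarray(X0), np.asarray(y0)
    for X_t in batches:
        clf = KNeighborsClassifier(n_neighbors=k).fit(X_mem, y_mem)
        y_t = clf.predict(X_t)                      # pseudo-labels
        yield y_t
        X_mem = np.vstack([X_mem, X_t])[-window:]   # sliding memory of samples
        y_mem = np.concatenate([y_mem, y_t])[-window:]
```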
On-line Learning Dynamics in Layered Neural Networks with Arbitrary Activation Functions
Frederieke Richert, Otavio Citton, Michael Biehl
https://doi.org/10.14428/esann/2024.ES2024-116
Abstract:
We revisit and extend the statistical physics based analysis of layered neural networks trained by online gradient descent. We focus on the influence of the hidden unit activation functions on the typical learning behavior in model scenarios. Expanding activation functions in terms of Hermite polynomials enables us to extend the formalism to the analysis of soft committee machines with arbitrary activation in student-teacher scenarios. The approach requires much lower computational effort than naive numerical integration, which is practically infeasible. Moreover, it now becomes possible to treat mismatched scenarios in which the student activation function differs from the one used in the target rule definition. This makes it possible to study realistic models of machine learning.
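The key computational step is the expansion g(x) = Σ_k c_k He_k(x) in probabilists' Hermite polynomials, with c_k = E[g(z) He_k(z)]/k! for z ~ N(0,1); these coefficients are cheap to obtain by Gauss-Hermite quadrature, as in this sketch (degree and term counts are our choices):

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He

def hermite_coefficients(g, n_terms=10, quad_deg=80):
    """c_k = E[g(z) He_k(z)] / k! for z ~ N(0,1), via Gauss-Hermite quadrature."""
    z, w = He.hermegauss(quad_deg)        # nodes/weights for weight exp(-z^2/2)
    w = w / np.sqrt(2.0 * np.pi)          # renormalize to the standard Gaussian
    gz = g(z)
    coeffs = []
    for k in range(n_terms):
        basis = np.zeros(k + 1); basis[k] = 1.0
        coeffs.append(np.sum(w * gz * He.hermeval(z, basis)) / math.factorial(k))
    return np.array(coeffs)

print(hermite_coefficients(np.tanh)[:5])  # odd activation: even coefficients vanish
```

Gaussian averages of the expanded activation then reduce to closed-form sums over the order parameters, which is what removes the need for naive numerical integration.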
Online Adaptation of Compressed Models by Pre-Training and Task-Relevant Pruning
Thomas Avé, Matthias Hutsebaut-Buysse, Wei Wei, Kevin Mets
https://doi.org/10.14428/esann/2024.ES2024-50
Abstract:
Neural networks are increasingly deployed on edge devices, where they must adapt to new data in dynamic environments. Here, model compression techniques like pruning are essential. This involves removing redundant neurons, increasing efficiency at the cost of accuracy, and creating a conflict between efficiency and adaptability. We propose a novel method for training and compressing models that maintains and extends their ability to generalize to new data, improving online adaptation without reducing compression rates. By pre-training the model on additional knowledge and identifying the parts of the deep neural network that actually encode task-relevant knowledge, we can effectively prune the model by 80% and achieve 16% higher accuracies when adapting to new domains.
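One standard way to locate task-relevant parameters (a first-order saliency criterion in the spirit of the paper, not necessarily its exact method) scores each weight by |w · ∂L/∂w| on the target task and zeroes the lowest-scoring 80%:

```python
import torch

def prune_task_irrelevant(model, loss, sparsity=0.8):
    """Zero the weights with the smallest first-order saliency |w * dL/dw|."""
    loss.backward()
    scores = torch.cat([(p * p.grad).abs().flatten()
                        for p in model.parameters() if p.grad is not None])
    threshold = torch.quantile(scores, sparsity)
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is not None:
                p.mul_(((p * p.grad).abs() > threshold).float())
```

To first order, the surviving 20% of weights are exactly those whose removal would change the task loss the most.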
Deep Temporal Consensus Clustering for Patient Stratification in Amyotrophic Lateral Sclerosis
Miguel Pego Roque, Andreia S. Martins, Marta Gromicho, Mamede de Carvalho, Sara C. Madeira, Pedro Tomás, Helena Aidos
https://doi.org/10.14428/esann/2024.ES2024-195
Abstract:
Amyotrophic Lateral Sclerosis (ALS) is a fast-acting neurodegenerative disease, characterized by loss of muscle movement and heterogeneity in disease evolution. This poses a challenge in predicting the best time for therapy administration. Here, we propose Deep Temporal Consensus Clustering (DTCC), a stratification method to uncover patient groups with similar disease progression. Using only the initial 6-month follow-up period, DTCC uncovered five clusters that were evaluated in terms of disease evolution and time-to-event. For three critical events (non-invasive ventilation, gastrostomy and death) the attained groups show distinct 10-year progressions, validating the approach.
Trustworthiness Score for Echo State Networks by Analysis of the Reservoir Dynamics
Jose M. Enguita-Gonzalez, Diego Garcia-Perez, Abel Alberto Cuadrado-Vega, Daniel García-Peña, José Ramón Rodríguez-Ossorio, Ignacio Diaz-Blanco
https://doi.org/10.14428/esann/2024.ES2024-38
Abstract:
Epistemic uncertainty arises from input data areas where models lack exposure during training and may result in significant performance degradation in deployment. Echo State Networks are often used as virtual sensors or digital twins processing temporal input data, so their robustness against this degradation is crucial. This paper addresses this challenge by proposing a score comparing the similarity between the dynamic evolution of the reservoir in training and in inference. This research aims to enhance model confidence and adaptability in evolving circumstances.
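A sketch of the underlying idea (our own formulation of the score, not the paper's exact definition): run the frozen reservoir on new inputs and measure how far the visited states lie from the cloud of states recorded during training, for instance via nearest-neighbour distance:

```python
import numpy as np

def reservoir_states(u, W_in, W, leak=0.3):
    """Collect echo-state trajectories for a scalar input sequence u."""
    x = np.zeros(W.shape[0]); states = []
    for u_t in u:
        x = (1 - leak) * x + leak * np.tanh(W_in * u_t + W @ x)
        states.append(x.copy())
    return np.array(states)

def trust_score(train_states, test_states):
    """Mean nearest-neighbour distance from test states to the training cloud."""
    d = np.linalg.norm(test_states[:, None, :] - train_states[None, :, :], axis=-1)
    return d.min(axis=1).mean()   # higher means more novel, less trustworthy dynamics

rng = np.random.default_rng(0)
N = 50
W = rng.normal(scale=1 / np.sqrt(N), size=(N, N)); W_in = rng.normal(size=N)
s_train = reservoir_states(np.sin(np.linspace(0, 20, 300)), W_in, W)
s_test = reservoir_states(rng.normal(size=300), W_in, W)   # out-of-distribution input
print(trust_score(s_train, s_train[:100]), trust_score(s_train, s_test))
```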
Invariant Representation Learning for Generalizable Imitation
Mohamed Jabri, Panagiotis Papadakis, Ehsan Abbasnejad, Gilles Coppin, Javen Shi
https://doi.org/10.14428/esann/2024.ES2024-18
Abstract:
We address the problem of learning imitation policies that generalize across environments sharing the same underlying causal structure between the system dynamics and the task. We introduce a novel loss for learning invariant state representations that draws inspiration from adversarial robustness. Our approach is algorithm-agnostic and does not require knowledge of domain labels. Yet, evaluation in visual and non-visual environments reveals improved zero-shot generalization in the presence of spurious features compared to previous works.
Unsupervised Drift Detection Using Quadtree Spatial Mapping
Bernardo A. Ramos, Cristiano Leite de Castro, Tiago A. Coelho, Plamen Angelov
https://doi.org/10.14428/esann/2024.ES2024-187
Abstract:
This paper presents an unsupervised and model-independent concept drift detector based on quadtree spatial analysis (QTS). We used a d-dimensional quadtree to map the feature space and tracked a univariate curve that mimics the spatial behavior of the data stream. This curve serves as a helpful visual tool for analyzing concept drifts. Drifts are identified when there is a significant change in the current spatial mapping. Experimental results show that the proposed detector outperformed well-known drift detectors in terms of average precision and F1-score.
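One concrete way to obtain such a univariate curve (our illustration; the paper's construction may differ) is a Morton/Z-order code, which interleaves the bits of the per-dimension cell indices and thereby enumerates the cells of an implicit quadtree; a shift in the stream of codes then signals drift:

```python
import numpy as np

def morton_curve(X, lo, hi, depth=4):
    """Map d-dimensional points to quadtree cell indices at a fixed depth."""
    cells = np.clip(((X - lo) / (hi - lo) * 2**depth).astype(int), 0, 2**depth - 1)
    codes = np.zeros(len(X), dtype=int)
    for bit in range(depth):                    # interleave one bit per dimension
        for dim in range(X.shape[1]):
            codes |= ((cells[:, dim] >> bit) & 1) << (bit * X.shape[1] + dim)
    return codes

rng = np.random.default_rng(0)
before = morton_curve(rng.normal(0, 1, (500, 2)), lo=-4, hi=4)
after = morton_curve(rng.normal(2, 1, (500, 2)), lo=-4, hi=4)   # drifted stream
print(before.mean(), after.mean())   # a jump in the curve's statistics flags drift
```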
Time series, recurrent and reinforcement learning
LSTM encoder-decoder model for contextualized time series forecasting applied to the simulation of a digital patient's physiological variables
Julien Paris, Christine Sinoquet, Fadoua Taia-Alaoui, Corinne Lejus-Bourdeau
https://doi.org/10.14428/esann/2024.ES2024-105
Abstract:
This paper explores utilizing an encoder-decoder neural architecture for unsupervised representation learning of mixed asynchronous data, presenting the JMETTS (Joint Modelling of Event Traces and Time Series) model. Our goal is to forecast short-term multivariate time series within event contexts. As a proof of concept, we examine a real-world case in digitally assisted training for anaesthesiology. JMETTS demonstrates high predictive performance, with a maximum prediction error percentage of approximately 5.5%, comparable to that of its only competitor published to date. The source code can be found at https://github.com/jp3142/jmetts_models_and_pipeline.
Reservoir Memory Networks
Claudio Gallicchio, Andrea Ceni
https://doi.org/10.14428/esann/2024.ES2024-117
Abstract:
We introduce Reservoir Memory Networks (RMNs), a novel class of Reservoir Computing (RC) models that integrate a linear memory cell with a non-linear reservoir to enhance long-term information retention. We explore various configurations of the memory cell using orthogonal circular shift matrices and Legendre polynomials, alongside non-linear reservoirs configured as in Echo State Networks and Euler State Networks. Experimental results demonstrate the substantial benefits of RMNs in time-series classification tasks, highlighting their potential for advancing RC applications in areas requiring robust temporal processing.
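A numpy sketch of the described state update (sizes and scalings are our assumptions): an orthogonal circular-shift matrix drives a linear memory cell whose state feeds, together with the input, a fixed tanh reservoir:

```python
import numpy as np

def rmn_states(u, n_mem=20, n_res=100, leak=0.5, seed=0):
    rng = np.random.default_rng(seed)
    S = np.roll(np.eye(n_mem), 1, axis=0)        # orthogonal circular-shift matrix
    v_in = rng.normal(size=n_mem)                # input weights of the memory cell
    W_in = rng.normal(size=n_res)                # reservoir input weights
    W_m = rng.normal(scale=0.1, size=(n_res, n_mem))   # memory-to-reservoir weights
    W = rng.normal(scale=1 / np.sqrt(n_res), size=(n_res, n_res))
    m = np.zeros(n_mem); x = np.zeros(n_res); states = []
    for u_t in u:
        m = S @ m + v_in * u_t                   # linear long-term memory
        x = (1 - leak) * x + leak * np.tanh(W_in * u_t + W_m @ m + W @ x)
        states.append(np.concatenate([m, x]))    # readout sees both streams
    return np.array(states)
```

Only a linear readout on these states is trained, in keeping with the Reservoir Computing paradigm.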
Why long model-based rollouts are no reason for bad Q-value estimates
Philipp Wissmann, Daniel Hein, Steffen Udluft, Volker Tresp
https://doi.org/10.14428/esann/2024.ES2024-80
Abstract:
This paper explores the use of model-based offline reinforcement learning with long model rollouts. While some literature criticizes this approach due to compounding errors, many practitioners have found success in real-world applications. The paper aims to demonstrate that long rollouts do not necessarily result in exponentially growing errors and can actually produce better Q-value estimates compared to model-free methods. These findings can potentially enhance reinforcement learning techniques.
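The object under study is the H-step model-based value estimate, in which learned reward and dynamics models enter through a discounted sum rather than a multiplicative error chain (a generic formulation, not quoted from the paper):

```latex
\hat{Q}(s,a) = \sum_{t=0}^{H-1} \gamma^{t}\,\hat{r}(\hat{s}_t,\hat{a}_t)
             + \gamma^{H}\,\hat{Q}(\hat{s}_H,\hat{a}_H),
\qquad \hat{s}_0 = s,\quad \hat{a}_0 = a,\quad \hat{s}_{t+1} = \hat{f}(\hat{s}_t,\hat{a}_t).
```

Since each per-step model error is weighted by γ^t, long rollouts need not blow up the estimate as long as the model's one-step error stays bounded.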
Recurrent Neural Network based Counter Automata
Sergio Leal, Luis Lago
https://doi.org/10.14428/esann/2024.ES2024-211
Abstract:
This paper presents a neural network architecture that aims to merge RNNs and push-down automata in order to address the recognition of formal languages while improving interpretability. The model manages to reproduce a behaviour equivalent to that of an automaton, making it more generalizable and interpretable. Validation has been carried out through several experiments, testing not only convergence but also adaptability and training speed, and comparing the results with similar existing models, as well as with an LSTM. The proposed model serves as a starting point with excellent results and as a basis for future extensions to more sophisticated architectures.
Multidimensional CDTW-based features for Parkinson's Disease classification
Ferhat Attal, Nicolas Khoury, Yacine Amirat
https://doi.org/10.14428/esann/2024.ES2024-106
Abstract:
This paper presents an improvement of the Unidimensional Continuous Dynamic Time Warping (UCDTW) method for diagnosing Parkinson's Disease (PD) based on multidimensional time series data. These data include recordings of vertical Ground Reaction Forces (vGRFs) collected from eight force sensors per shoe sole during walking. Leveraging gait cycle patterns, the proposed approach distinguishes between healthy and PD subjects by assessing gait cycle repetition through Multidimensional CDTW (MCDTW). Several classification methods, including supervised (K-NN, DT, RF, SVM) and unsupervised (GMM, K-means) ones, are used to classify the healthy and PD subjects, using MCDTW distances extracted from the gait cycles. The obtained results show a significant improvement in terms of classification performance.
Vision Language Models as Policy Learners in Reinforcement Learning Environments
Giovanni Bonetta, Davide Zago, Rossella Cancelliere, Mirko Polato, Bernardo Magnini
https://doi.org/10.14428/esann/2024.ES2024-181
Abstract:
In various domains requiring general knowledge and agent reasoning, traditional reinforcement learning (RL) algorithms often start from scratch, lacking prior knowledge of the environment. This approach can lead to significant inefficiencies as agents sometimes undergo extensive exploration before optimizing their actions. Conversely, in this paper we assume that recent Vision Language Models (VLMs), integrating both visual and textual information, possess inherent knowledge and basic reasoning capabilities, offering potential solutions to the sample inefficiency problem in RL. The paper explores the integration of VLMs into RL by employing a robust VLM model, Idefics-9B, as a policy updated via Proximal Policy Optimization (PPO). Experimental results on simulated environments demonstrate that utilizing VLMs in RL significantly accelerates PPO convergence and improves rewards compared to traditional solutions. Additionally, we propose a streamlined modification to the model architecture for memory efficiency and lighter training, and we release a number of upgraded environments featuring both visual observations and textual descriptions, which, we hope, will facilitate research in VLM and RL applications. Code is available at: https://github.com/giobin/VlmPolicyEsann24
Predicting the Closing Cross Auction Results at the NASDAQ Stock Exchange
Sarel Cohen, Manuel Hettich, Philipp Bielefeld, Crispin Schomers, Tobias Friedrich
https://doi.org/10.14428/esann/2024.ES2024-159
Abstract:
This paper presents the results and learnings from our work on the Optiver - Trading at the Close Kaggle 2023 challenge. It not only touches on the two most widely used approaches in the competition, deep learning models and support vector regression models, but also describes the provided dataset, drawn from the NASDAQ stock exchange with many detailed attributes, such as the imbalance size and the far and near prices, recorded at one-second intervals for the last ten minutes of each trading day. It also describes the constraints of the competition. The presented machine learning model, based on the LightGBM engine, stood out from the competition by feeding back the revealed target data given for the previous day and was among the top 5% of all models in the competition.
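A sketch of the described feedback trick (the column names echo the competition's conventions but are assumptions here): join each stock's revealed target of the previous day back onto the features before fitting the LightGBM regressor:

```python
import lightgbm as lgb
import pandas as pd

def add_revealed_target(df):
    """Attach each stock's revealed target from the previous day as a feature."""
    prev = df[["date_id", "stock_id", "seconds_in_bucket", "target"]].copy()
    prev["date_id"] += 1               # align yesterday's target with today's rows
    prev = prev.rename(columns={"target": "revealed_target"})
    return df.merge(prev, on=["date_id", "stock_id", "seconds_in_bucket"], how="left")

df = pd.DataFrame({                    # toy stand-in for the competition data
    "date_id": [0, 0, 1, 1], "stock_id": [7, 8, 7, 8],
    "seconds_in_bucket": [0, 0, 0, 0],
    "imbalance_size": [1.0, 2.0, 1.5, 2.5],
    "target": [0.10, -0.20, 0.05, -0.10],
})
df = add_revealed_target(df)
model = lgb.LGBMRegressor(n_estimators=10, min_child_samples=1)
model.fit(df[["imbalance_size", "revealed_target"]], df["target"])
```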
A Deep Double Q-Learning as a SDLS support in solving LABS problem
Dominik Żurek
https://doi.org/10.14428/esann/2024.ES2024-36
Abstract:
Low Autocorrelation Binary Sequence (LABS) remains an open complex optimization problem with multiple applications. Existing studies rely primarily on advanced solvers based on local search heuristics, such as the steepest-descent local search algorithm (SDLS), Tabu search, or xLostovka algorithms. These approaches require searching through a large solution space, which is a computationally heavy and time-consuming process, leading to slower convergence. To improve convergence speed and allow for finding better solutions within a limited time, we propose the Deep Double Q-learning reinforcement learning algorithm for the LABS problem to support heuristic methods. The model aims to narrow down the search space without causing a drop in the final efficiency. Our experimental study showcases that the proposed approach is a promising direction for developing a highly efficient method for the LABS problem.
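The objective being minimized is the autocorrelation energy of a ±1 sequence, which a few lines of numpy make concrete (and which also shows why local-search moves, as in SDLS, are cheap to evaluate):

```python
import numpy as np

def labs_energy(s):
    """Autocorrelation energy E(s) = sum_k C_k^2 for a +/-1 sequence s."""
    L = len(s)
    return sum(np.dot(s[:L - k], s[k:]) ** 2 for k in range(1, L))

def merit_factor(s):
    return len(s) ** 2 / (2 * labs_energy(s))

s = np.array([1, 1, 1, -1, -1, 1, -1])   # the Barker sequence of length 7
print(labs_energy(s), merit_factor(s))   # sidelobes are 0 or -1: E = 3, F = 49/6
```

Flipping a single entry of s changes only the L-1 correlations it participates in, so candidate moves can be scored incrementally rather than from scratch.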
Enhanced Deep Reinforcement Learning based Group Recommendation System with Multi-head Attention for Varied Group Sizes
Saba Izadkhah, Banafsheh Rekabdar
https://doi.org/10.14428/esann/2024.ES2024-138
Abstract:
This paper introduces EnGRMA, an Enhanced deep reinforcement learning-based Group Recommendation system with Multi-head Attention for varied group sizes. EnGRMA adapts its recommendation strategy according to group size, using individual member preferences in smaller groups through a weighted average method, and leveraging multi-head attention to aggregate diverse opinions effectively in larger groups. This method helps model dynamic member-item interactions, enhancing the system's ability to deliver personalized recommendations. Our evaluation on the MovieLens-Rand dataset shows that EnGRMA not only outperforms GRMA and DRGR in Recall, NDCG, Precision, and F1 scores but also demonstrates superior performance in NDCG against AGREE.
Aeronautic data analysis
Aeronautic data analysis
Jérôme Lacaille, Patrick Fabiani, Patricia Besson
https://doi.org/10.14428/esann/2024.ES2024-7
Abstract:
The latest IPCC report shows that the aviation industry is responsible for around 2% of greenhouse gas emissions; this is lower than the emissions of many other sectors, but still equivalent to the total emissions of a European country like Germany. Following the recommendations of the International Civil Aviation Organization (ICAO) and its long-term global aspirational goal (LTAG), the aeronautics industry has come together under the Air Transport Action Group (ATAG) to converge towards zero greenhouse gas emissions by 2050, covering both CO2 emissions and other radiative effects such as those generated by condensation trails. To achieve this goal, we have a number of levers at our disposal: technical improvements to our engines and aircraft, the use of new sustainable fuels, and the exploitation of data now accessible thanks to new engineering 4.0 technologies. This document first presents the data we now have at our disposal. The second section briefly recalls the opportunities offered by new renewable fuels. Finally, we present some digital approaches and conclude with details of three central themes illustrated by the contributions to this special session.
From Data to Simulation: Capturing Aircraft Engine Degradation Dynamics
Abdellah Madane, Florent Forest, Hanane Azzag, Mustapha Lebbah, Jérôme Lacaille
https://doi.org/10.14428/esann/2024.ES2024-51
Abstract:
The analysis and simulation of aircraft engine behavior have garnered significant attention in the aeronautical industry, primarily due to their implications for performance, maintenance, safety, and sustainability. Our work showcases the efficacy of utilizing time series data collected from our aircraft engines to construct a digital twin capable of dynamically emulating their real-time behavior. We then introduce a new methodology to model the physical engine's degradation and to meticulously monitor its evolution over time. By continuously analyzing the simulated data against real-world performance measurements, our approach offers valuable insights into the engine's long-term behavior and health trajectory.
A Kalman Filter and Neural Network Hybrid Approach for Health Monitoring of Aircraft Engines
Solène Thépaut, Sebastien Razakarivony, Dong Quan Vu, Alfred Bauny
https://doi.org/10.14428/esann/2024.ES2024-69
Abstract:
In aircraft engine monitoring, estimating performance indicators from observed measurement data has been an important and long-standing subject, as these indicators provide highly beneficial information to assist maintenance activities. Besides the traditional gas path analysis method, the two main approaches to this problem are Bayesian inference, which strikes a balance between assumed statistical models and observations, and machine-learning methods, where (simulated) data are generated to train regression models. However, these methods have their own limitations: Bayesian inference is not robust against the model-reality gap and non-linearity, while current implementations of machine learning algorithms in this context do not take temporal information into account. In this work, we focus on a use case of estimating engine performance indicators from operational measurement (snapshot) data. We explore several hybrid approaches, aiming to simultaneously leverage the advantages of Bayesian inference and machine learning. We demonstrate that the estimation precision provided by several of our hybrid methods significantly improves upon that of state-of-the-art methods in the tested use case.
Towards Contrail Mitigation through Robust and Frugal AI-Driven Data Exploitation
Davide Di Giusto, Grégoire Boussu, Simon Alix, Céline Reverdy, Mathieu Riou, Teodora Petrisor
https://doi.org/10.14428/esann/2024.ES2024-64
Abstract:
Condensation trails significantly contribute to aviation's impact on climate change. Their effective mitigation involves formulating accurate predictions of occurrence, introducing the relevant constraints into trajectory optimization, and employing reliable verification strategies based on observations. Atmospheric data, expert knowledge, and contrail observations can be leveraged for these purposes. However, several factors lead to limited prediction accuracy and high uncertainty bounds, including the difficulty of predicting contrail persistence, the complexity of trajectory optimization problems, and the lack of labelled data for contrail verification. This paper gives an overview of our robust Artificial Intelligence methods aiming to tackle these challenges throughout the entire contrail mitigation chain.
Modern Machine Learning Methods for robust and real-time Brain-Computer Interfaces (BCI)
Machine Learning Methods for BCI: challenges, pitfalls and promises
Jaime A Riascos, Marta Molinas, Fabien Lotte
https://doi.org/10.14428/esann/2024.ES2024-4
Abstract:
The development of Brain-Computer Interfaces (BCIs) has been constrained by a predominant focus on signal classification. This paper rather emphasizes the integration of neurophysiological principles, BCI paradigm selection, and rigorous experimental design. By addressing common pitfalls in Machine Learning implementation, we provide researchers with a tutorial and robust framework for BCI development, promoting reproducibility and rigor. Furthermore, by tackling challenges at the intersection of BCI and Machine Learning, this work contributes to the advancement of practical, real-time BCI applications.
Exploring High- and Low-Density Electroencephalography for a Dream Decoding Brain-Computer Interface
Mithila Packiyanathan, André Torvestad, Marta Molinas, Luis Alfredo Moctezuma Pascual
https://doi.org/10.14428/esann/2024.ES2024-115
Abstract:
A high-performance real-time brain-computer interface system capable of identifying dreams has potential for healthcare applications. To address this, we use electroencephalogram (EEG) data from non-rapid eye movement sleep to classify dream experience versus no experience. Using 58 EEG channels, we achieve an accuracy of 0.94, an AUROC of 0.91, and a kappa score of 0.84, accomplished by first filtering the data through multivariate empirical mode decomposition, followed by a combination of principal component analysis and common spatial patterns for feature extraction, and a K-nearest neighbors classifier. Interestingly, comparable results are obtained using 29 or 10 EEG channels selected by permutation-based channel selection.
Deep Riemannian Neural Architectures for Domain Adaptation in Burst cVEP-based Brain Computer Interface
Sébastien Velut, Sylvain Chevallier, Marie-Constance Corsi, Frédéric Dehais
https://doi.org/10.14428/esann/2024.ES2024-112
Abstract:
Code modulated Visually Evoked Potentials (cVEP) is an emerging paradigm for Brain-Computer Interfaces (BCIs) that offers reduced calibration times. However, cVEP-based BCIs still encounter challenges related to cross-session/subject variabilities. As Riemannian approaches have demonstrated good robustness to these variabilities, we propose the first study of deep Riemannian neural architectures, namely SPDNets, on cVEP-based BCIs. To evaluate their performance with respect to subject variabilities, we conduct classification tasks in a domain adaptation framework using a burst cVEP open dataset. This study demonstrates that SPDNet yields the best accuracy with single-subject calibration and promising results in domain adaptation.
EEG Source Imaging Enhances Motor Imagery Classification
Andres Soler, Viktor Naas, Amita Giri, Marta Molinas
https://doi.org/10.14428/esann/2024.ES2024-158
Abstract:
Brain-computer Interfaces (BCIs) have been developed towards enhancing communication and control in individuals with motor disabilities and assisting in motor rehabilitation; in this context, motor imagery (MI), the mental visualization of limb movement, has been broadly explored. Traditionally, MI-based BCIs utilize electroencephalographic (EEG) recordings to discriminate between the motor imagination of different limbs. This involves applying feature extraction and classification, primarily analyzing signals recorded at the scalp. Despite the success of traditional sensor-space analysis, recent studies have demonstrated that incorporating EEG source imaging (ESI) leads to an improvement in classification performance. This work studies pipelines in both sensor and source space for classifying upper-limb MI. Here, we introduce the use of source average power for the integration of ESI into MI-based BCIs. Our results suggest a significant accuracy improvement of 10% when applying source-space analysis with average power against traditional sensor-space analysis. This demonstrates that a shift from sensor-space to source-space analysis can be beneficial for MI classification.
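The source-power idea can be sketched with MNE-Python as below; `epochs`, `fwd`, and `noise_cov` are assumed to exist (e.g., built with mne.make_forward_solution and mne.compute_covariance), and the inverse method, regularization, and classifier are illustrative assumptions rather than the authors' exact pipeline.

import numpy as np
from mne.minimum_norm import make_inverse_operator, apply_inverse_epochs
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# `epochs` (mne.Epochs), `fwd` (forward model) and `noise_cov` are assumed to
# exist, e.g. from mne.make_forward_solution and mne.compute_covariance.
inv = make_inverse_operator(epochs.info, fwd, noise_cov)
stcs = apply_inverse_epochs(epochs, inv, lambda2=1.0 / 9.0, method="eLORETA")

# One feature vector per epoch: time-averaged power at every source location.
X = np.stack([(stc.data ** 2).mean(axis=1) for stc in stcs])
y = epochs.events[:, 2]                           # motor imagery class labels
print(cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean())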
Unveiling Dreams: Moving Towards Automatic Dream Decoding via PSD-Based EEG Analysis and Machine Learning
André Torvestad, Mithila Packiyanathan, Luis Alfredo Moctezuma Pascual, Marta Molinas
https://doi.org/10.14428/esann/2024.ES2024-164
Abstract:
Equipping brain-computer interfaces with dream decoding capabilities could be vital in healthcare applications. We used high-density electroencephalogram data from non-rapid eye movement sleep, employing multivariate empirical mode decomposition and power spectral density (PSD) for preprocessing, and machine learning algorithms to distinguish between a dream experience and no experience. Qualitative analysis shows differences between the two classes, especially in the theta and beta bands. We achieve a classification performance of 0.915 in accuracy, 0.851 in AUROC, and 0.715 in kappa with PSD features and an extreme gradient boosting classifier.
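A minimal sketch of such a PSD pipeline: Welch power spectral density per channel, summarized into standard EEG bands, feeding an extreme gradient boosting classifier; band edges, shapes, and hyperparameters are illustrative, not the authors' settings.

import numpy as np
from scipy.signal import welch
from xgboost import XGBClassifier
from sklearn.model_selection import cross_val_score

FS = 250                                          # sampling rate (Hz), assumed
BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 12),
         "beta": (12, 30), "gamma": (30, 45)}

def band_power_features(X):
    """X: (trials, channels, samples) -> (trials, channels * n_bands)."""
    freqs, psd = welch(X, fs=FS, nperseg=FS)      # PSD along the last axis
    feats = [psd[..., (freqs >= lo) & (freqs < hi)].mean(axis=-1)
             for lo, hi in BANDS.values()]
    return np.concatenate(feats, axis=-1).reshape(len(X), -1)

rng = np.random.default_rng(0)                    # stand-in data; replace with real epochs
X = band_power_features(rng.standard_normal((100, 64, 2 * FS)))
y = rng.integers(0, 2, size=100)                  # dream experience vs no experience
print(cross_val_score(XGBClassifier(n_estimators=200), X, y, cv=5).mean())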
Towards calibration-free online EEG motor imagery decoding using Deep Learning
Martin Wimpff, Jan Zerfowski, Bin Yang
https://doi.org/10.14428/esann/2024.ES2024-26
Abstract:
The prevalence of stroke-induced disability drives research in motor imagery Brain-Computer Interfaces (BCIs) for rehabilitation. Closed-loop systems using traditional decoding models prevail, but advances in deep learning for single-trial offline decoding hold promise. However, transferring methods from offline to online decoding poses challenges. To address this, we propose a new approach that tunes existing offline deep learning models for online decoding, outperforming traditional pipelines without the need for subject-specific calibration data. Our proposed method is a step towards calibration-free BCIs that enable immediate feedback and user learning.
Geometric Deep Learning to Enhance Imbalanced Domain Adaptation in EEG
Shanglin Li, Motoaki Kawanabe, Reinmar Kobler
https://doi.org/10.14428/esann/2024.ES2024-91
Abstract:
Electroencephalography (EEG) based brain-computer interfaces face significant challenges in generalizing across domains (i.e., sessions and subjects) without costly supervised calibration. Assuming identical label distributions across domains, we recently proposed a geometric deep learning framework to align marginal statistics in the latent representation space. Yet, label distribution shifts are frequently encountered in practice. To address this, we propose a novel approach integrating data augmentation and clustering techniques to align feature distributions more effectively under label shifts.
Language models
LLaMA Tunes CMA-ES
Oliver Kramer
https://doi.org/10.14428/esann/2024.ES2024-136
Abstract:
This paper introduces LLaMA-ES, an approach for tuning the hyperparameters of Evolution Strategies (ES), specifically the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), by leveraging a Large Language Model (LLM). The proposed method uses the LLM to iteratively suggest parameter adjustments based on the optimization history, enabling dynamic fine-tuning of the algorithm. We validate our approach through experiments on numerical benchmark optimization problems, employing the LLaMA3 model with 70 billion parameters. The results show that LLaMA-ES significantly enhances the performance of CMA-ES, achieving competitive results in parameter tuning and highlighting the potential of LLMs in optimization tasks.
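The tuning loop can be pictured as below: a CMA-ES run whose hyperparameters are periodically revised by an LLM fed the optimization history. `ask_llm` is a hypothetical helper standing in for whatever LLaMA3 serving interface is used, and the prompt format and parameter set are assumptions, not the authors' protocol.

import json
import cma                                        # pip install cma

def ask_llm(prompt: str) -> dict:
    """Hypothetical LLM call; expected to return e.g. {"sigma": 0.3, "popsize": 16}."""
    raise NotImplementedError("plug in a LLaMA3 client here")

def sphere(x):
    return sum(xi * xi for xi in x)               # toy benchmark objective

params = {"sigma": 0.5, "popsize": 10}
history = []
for round_ in range(5):
    es = cma.CMAEvolutionStrategy(8 * [1.0], params["sigma"],
                                  {"popsize": params["popsize"], "maxiter": 50})
    es.optimize(sphere)
    history.append({"round": round_, "params": params, "best_f": es.result.fbest})
    prompt = ("You tune CMA-ES. Given this history, reply with JSON "
              f"{{\"sigma\": float, \"popsize\": int}}:\n{json.dumps(history)}")
    params = ask_llm(prompt)                      # LLM suggests next-round settings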
A Two-Stage Approach for Implicit Bias Detection in Generative Language Models
Jeremy Edwards, Renjie Hu, Amaury Lendasse, Alexander Schlager, Peggy Lindner
https://doi.org/10.14428/esann/2024.ES2024-206
Abstract:
Machine learning and AI are increasingly popular for their impressive task performance. Yet, Natural Language Processing (NLP) models often inadvertently learn harmful biases related to gender and race, leading to skewed predictions. The literature distinguishes between direct and indirect bias. Current research aims to detect and mitigate these biases in machine learning models. This study introduces a two-stage approach to identify both types of gender bias in generative large language models (LLMs), confirming that they can manifest both direct and indirect biases.
Fine-Tuning Llama 2 Large Language Models for Detecting Online Sexual Predatory Chats and Abusive Texts
Thanh Thi Nguyen, Campbell Wilson, Janis Dalins
https://doi.org/10.14428/esann/2024.ES2024-222
Abstract:
This paper proposes an approach to detecting online harmful content using the open-source pretrained Llama 2 model, recently released by Meta GenAI. We fine-tune the LLM using datasets with different sizes, imbalance degrees, and languages. Leveraging the power of LLMs, our approach is generic and automated, without the manual search for a synergy between feature extraction and classifier design that conventional methods require. Experimental results show a strong performance of the proposed approach, which is proficient and consistent across three distinct datasets with five sets of experiments. The study's outcomes indicate that the proposed method can be implemented in real-world applications (even with non-English languages) for flagging sexual predators, offensive or toxic content, and hate speech in online discussions and comments, helping to maintain respectful digital communities.
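A fine-tuning setup in this spirit might look like the following LoRA sketch with the Hugging Face transformers/peft stack; the model name, LoRA settings, and toy batch are illustrative assumptions, not the authors' exact configuration.

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "meta-llama/Llama-2-7b-hf"           # gated checkpoint; requires access approval
tok = AutoTokenizer.from_pretrained(model_name)
tok.pad_token = tok.eos_token                     # Llama 2 ships without a pad token

model = AutoModelForSequenceClassification.from_pretrained(
    model_name, num_labels=2, torch_dtype=torch.bfloat16)
model.config.pad_token_id = tok.pad_token_id

lora = LoraConfig(task_type="SEQ_CLS", r=8, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"])
model = get_peft_model(model, lora)               # only the adapter weights train
model.print_trainable_parameters()

batch = tok(["example chat message"], return_tensors="pt", padding=True)
loss = model(**batch, labels=torch.tensor([1])).loss   # standard classification loss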
Towards Explainable Evolution Strategies with Large Language Models
Jill Baumann, Oliver Kramer
https://doi.org/10.14428/esann/2024.ES2024-129
Abstract:
This paper introduces an approach that integrates self-adaptive Evolution Strategies (ES) with Large Language Models (LLMs) to enhance the explainability of complex optimization processes. By employing a self-adaptive ES equipped with a restart mechanism, we effectively navigate the challenging landscapes of benchmark functions, capturing detailed logs of the optimization journey. The logs include fitness evolution, step-size adjustments and restart events due to stagnation. An LLM is then utilized to process these logs, generating concise, user-friendly summaries that highlight key aspects such as convergence behavior, optimal fitness achievements, and encounters with local optima. Our case study on the Rastrigin function demonstrates how our approach makes the complexities of ES optimization transparent. Our findings highlight the potential of using LLMs to bridge the gap between advanced optimization algorithms and their interpretability.
Embodying Language Models in Robot Action
Connor Gäde, Ozan Özdemir, Cornelius Weber, Stefan Wermter
https://doi.org/10.14428/esann/2024.ES2024-143
Abstract:
Large language models (LLMs) have achieved significant recent success in deep learning. Many challenges in robotics and human-robot interaction (HRI) remain to be tackled, but off-the-shelf pre-trained LLMs with advanced language and reasoning capabilities can provide solutions to problems in the field. In this work, we realise an open-ended HRI scenario involving a humanoid robot communicating with a human while performing object manipulation tasks at a table. To this end, we combine pre-trained general models for speech recognition, vision-language processing, text-to-speech and open-world object detection with robot-specific models for visuospatial coordinate transfer and inverse kinematics, as well as a task-specific motion model. Our experiments reveal robust performance by the language model in accurately selecting the task mode and by the whole system in correctly executing actions during open-ended dialogue. Our architecture enables a seamless integration of open-ended dialogue, scene description, open-world object detection and action execution, and is promising as a modular solution for diverse robotic platforms and HRI scenarios.
Large Language Models as Tuning Agents of Metaheuristics
Alicja Martinek, Szymon Łukasik, Amir H. Gandomi
https://doi.org/10.14428/esann/2024.ES2024-209
Abstract:
This study examines whether LLMs can be utilized for metaheuristic tuning through the selection of appropriate parameters. Instances of two optimization problems, the Travelling Salesman Problem and Graph Coloring, were solved with genetic algorithms (GA), ant colony optimization (ACO), particle swarm optimization (PSO), and simulated annealing (SA). The experiments involved running these heuristic optimizers with parameter values advised by LLMs. A feedback round was performed by feeding the LLMs prompts that included the initial parameters, average performance, and, where applicable, population variance. The results show that LLMs can comprehend the non-trivial task of tuning metaheuristic parameters. Additionally, feedback runs often outperform the results achieved by the initial setups, yielding a new application of LLMs.
ChatDT: Simplifying Constraint Integration in Decision Trees
Abiola Paterne Chokki, Benoit Frénay
https://doi.org/10.14428/esann/2024.ES2024-8
Abstract:
Decision trees help domain experts, such as doctors and bankers, rationalize system decisions. However, existing methods lack user-friendly ways to integrate multiple constraints and identify branches for pruning. This paper introduces ChatDT, a prototype developed with a new domain-specific language and an enhanced version of the CART algorithm to address these challenges. An evaluation involving 22 participants highlights ChatDT's effectiveness, confirming its role in facilitating decision tree creation tailored to domain-specific constraints and identifying branches for pruning.
Image processing and computer vision
From Three to Two Dimensions: 2D Quaternion Convolutions for 3D Images
Valentin Delchevalerie, Benoit Frénay, Alexandre Mayer
https://doi.org/10.14428/esann/2024.ES2024-73
Abstract:
In fields like biomedical imaging, it is common to manage 3D images instead of 2D ones (CT scans, MRI, 3D ultrasound, etc.). Although 3D Convolutional Neural Networks (CNNs) are generally more powerful than their 2D counterparts for such applications, this comes at the cost of increased computational resources (both time and memory). In this work, we present a new way to build 2D representations of 3D images that minimizes information loss by leveraging quaternions. These quaternion CNNs offer competitive performance while significantly reducing computational complexity.
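The underlying mechanism can be sketched as a generic quaternion (Hamilton-product) convolution in PyTorch, with groups of slices from a 3D volume packed into the four quaternion components of a 2D image; this is a standard quaternion convolution layer, not the authors' exact architecture, and all sizes are illustrative.

import torch
import torch.nn as nn
import torch.nn.functional as F

class QuaternionConv2d(nn.Module):
    """2D convolution on quaternion feature maps stored as 4 stacked real blocks."""
    def __init__(self, in_q, out_q, kernel_size, padding=0):
        super().__init__()
        shape = (out_q, in_q, kernel_size, kernel_size)
        self.r, self.i, self.j, self.k = (
            nn.Parameter(0.1 * torch.randn(shape)) for _ in range(4))
        self.padding = padding

    def forward(self, x):
        # Hamilton product written as one real convolution whose weight
        # matrix has the characteristic quaternion block structure.
        rows = [(self.r, -self.i, -self.j, -self.k),
                (self.i,  self.r, -self.k,  self.j),
                (self.j,  self.k,  self.r, -self.i),
                (self.k, -self.j,  self.i,  self.r)]
        w = torch.cat([torch.cat(row, dim=1) for row in rows], dim=0)
        return F.conv2d(x, w, padding=self.padding)

volume = torch.randn(2, 4, 64, 64)                # 4 CT slices -> one quaternion image
print(QuaternionConv2d(1, 8, 3, padding=1)(volume).shape)   # torch.Size([2, 32, 64, 64])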
Visualizing and Improving 3D Mesh Segmentation with DeepView
Andreas Mazur, Isaac Roberts, David Leins, Alexander Schulz, Barbara Hammer
https://doi.org/10.14428/esann/2024.ES2024-135
Abstract:
While 3D data is rich in information, it often comes with the drawback of being tedious to handle. Recent work in the Geometric Deep Learning community has focused on developing high-quality 3D datasets for tasks like mesh segmentation. However, label quality can never be assured to be perfect. To improve label quality in 3D datasets, we propose an interactive algorithm that combines DeepView, a method to visualize the classification function of neural networks, with Intrinsic Mesh CNNs, which generalize convolution to Riemannian manifolds, in order to intelligently select adequate sets of vertices from triangle mesh data for label correction.
Clarity: a Deep Ensemble for Visual Counterfactual Explanations
Claire Theobald, Frédéric Pennerath, Brieuc Conan-Guez, Miguel Couceiro, Amedeo Napoli
https://doi.org/10.14428/esann/2024.ES2024-188
Abstract:
Counterfactual visual explanations aim to identify changes in an image that will modify the prediction of a classifier. Unlike adversarial images, counterfactuals are required to be realistic. For this reason, generative models such as variational autoencoders (VAEs) have been used to constrain the search for counterfactuals to the data manifold. However, such gradient-based approaches remain limited even on simple datasets such as MNIST. Conjecturing that these limitations result from a plateau effect that makes the gradient noisy and less informative, we improve the gradient estimation by training an ensemble of classifiers directly in the latent space of VAEs. Several experiments show that the resulting method, called Clarity, delivers high-quality counterfactual images, competitive with the state of the art.
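The gradient search can be sketched as follows: starting from the VAE encoding of an input, follow the gradient of the ensemble's averaged prediction toward the target class, then decode the shifted latent code. Here `vae` (with encode/decode methods returning tensors) and `classifiers` (trained in latent space) are assumed pre-trained objects; step count and learning rate are illustrative.

import torch
import torch.nn.functional as F

def latent_counterfactual(vae, classifiers, x, target, steps=200, lr=0.05):
    z = vae.encode(x).detach().requires_grad_(True)   # assumed encoder interface
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        # Averaging logits over the ensemble smooths the noisy single-model
        # gradient that the paper attributes to a plateau effect.
        logits = torch.stack([clf(z) for clf in classifiers]).mean(dim=0)
        loss = F.cross_entropy(logits, target)
        opt.zero_grad(); loss.backward(); opt.step()
    return vae.decode(z).detach()                     # decoded counterfactual image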
An Efficient Neural Architecture Search Model for Medical Image Classification
Lunchen Xie, Eugenio Lomurno, Matteo Gambella, Danilo Ardagna, Manuel Roveri, Matteo Matteucci, Qingjiang Shi
https://doi.org/10.14428/esann/2024.ES2024-119
Abstract:
Accurate classification of medical images is essential for modern diagnostics. Advances in deep learning have led clinicians to increasingly use sophisticated models to make faster and more accurate decisions, sometimes replacing human judgment. However, model development is costly and repetitive. Neural Architecture Search (NAS) provides solutions by automating the design of deep learning architectures. This paper presents ZO-DARTS+, a differentiable NAS algorithm that improves search efficiency through a novel method for generating sparse probabilities via bi-level optimization. Experiments on five public medical datasets show that ZO-DARTS+ matches the accuracy of state-of-the-art solutions while reducing search time by up to a factor of three.
Leveraging endoscopic data with Contrastive Learning for Crohn’s disease detection
Robin Ghyselinck, Jérôme Fink, Bruno Dumas, Benoit Frénay
https://doi.org/10.14428/esann/2024.ES2024-56
Abstract:
This study contributes to the automatic detection of Crohn's Disease (CD), a gastrointestinal inflammatory condition. In particular, our approach addresses the challenge of data scarcity for CD by pre-training Vision Transformers (ViT) on Hyper-Kvasir and LDPolyp, two large colonoscopic datasets comprising over one million images from a similar domain, using a Contrastive Loss (CL) mechanism. This approach significantly outperforms models pre-trained on ImageNet as well as models pre-trained with a Cross-Entropy Loss on the Crohn-IPI dataset.
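Contrastive pre-training of this kind typically rests on an NT-Xent (SimCLR-style) loss over two augmented views of each frame; the sketch below is a generic implementation of that loss, not the authors' exact code.

import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    """z1, z2: (batch, dim) projections of two augmented views of the same images."""
    z = F.normalize(torch.cat([z1, z2]), dim=1)   # 2N unit vectors
    sim = z @ z.T / temperature                   # pairwise cosine similarities
    sim.fill_diagonal_(float("-inf"))             # exclude self-similarity
    n = z1.size(0)
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)])
    return F.cross_entropy(sim, targets)          # pull positive pairs together

print(nt_xent(torch.randn(32, 128), torch.randn(32, 128)))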
Unpaired Image-to-Image Translation to Improve Log End Identification
Dag Björnberg, Morgan Ericsson, Welf Löwe, Jonas Nordqvist
https://doi.org/10.14428/esann/2024.ES2024-63
Abstract:
Visual re-identification tasks are often subject to large domain variations due to camera types, brightness conditions, or environmental differences. For identification models to generalize across such varying domains, a large amount of training data is necessary to capture these variations. We explore the potential of unpaired image-to-image translation to enhance the generalization capacity of a log end identification model, either in the absence of labeled training data or in combination with a smaller amount of it.
Investigating the Gestalt Principle of Closure in Deep Convolutional Neural Networks
Yuyan Zhang, Derya Soydaner, Fatemeh Behrad, Lisa Koßmann, Johan Wagemans
https://doi.org/10.14428/esann/2024.ES2024-111
Abstract:
Deep neural networks perform well in object recognition, but do they perceive objects like humans? This study investigates the Gestalt principle of closure in convolutional neural networks. We propose a protocol to identify closure and conduct experiments using simple visual stimuli with progressively removed edge sections. We evaluate well-known networks on their ability to classify incomplete polygons. Our findings reveal a performance degradation as the edge removal percentage increases, indicating that current models heavily rely on complete edge information for accurate classification. The data used in our study is available on GitHub.
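The stimulus manipulation can be sketched as follows: draw a regular polygon outline and erase a growing central fraction of every edge so that only corner fragments remain; geometry, image size, and line width here are illustrative assumptions, not the exact stimuli from the paper.

import numpy as np
from PIL import Image, ImageDraw

def open_polygon(n_sides=6, removal=0.5, size=224):
    """Return an image of a regular polygon with `removal` of each edge erased."""
    img = Image.new("L", (size, size), 255)
    draw = ImageDraw.Draw(img)
    ang = 2 * np.pi * np.arange(n_sides) / n_sides
    pts = np.stack([size / 2 + 80 * np.cos(ang), size / 2 + 80 * np.sin(ang)], axis=1)
    keep = (1 - removal) / 2                      # fraction kept at each end of an edge
    for a, b in zip(pts, np.roll(pts, -1, axis=0)):
        draw.line([tuple(a), tuple(a + keep * (b - a))], fill=0, width=3)
        draw.line([tuple(b), tuple(b - keep * (b - a))], fill=0, width=3)
    return img

open_polygon(removal=0.7).save("hexagon_70pct_removed.png")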
Influence of image encoders and image features transformations in emergent communication
Bastien Vanderplaetse, Stéphane Dupont, Xavier Siebert
https://doi.org/10.14428/esann/2024.ES2024-127
Abstract:
Emergent communication in multi-agent systems is a research field exploring how autonomous agents can develop unique communication protocols without human programming, showing adaptability in various contexts. This study investigates the influence of image encoders and spatial information within image features on agent performance and the compositionality of emergent languages in multi-agent systems. By exploring various image encoding strategies, including the application of different processing methods to image features, we assess their impact on agents' abilities in a structured communication task. Our findings indicate that while certain encoding processes enhance overall task performance, they do not necessarily improve language compositionality.
SDE U-Net: Disentangling Aleatoric and Epistemic Uncertainties in Medical Image Segmentation
Chuxin Zhang, Ana Maria Barragan Montero, Lee John
https://doi.org/10.14428/esann/2024.ES2024-133
Abstract:
Quantifying uncertainty is crucial in artificial intelligence (AI) applications, particularly in high-stakes healthcare settings. This paper introduces SDE U-Net, a novel architecture that integrates stochastic differential equations (SDEs) with the U-Net framework, effectively distinguishing between aleatoric and epistemic uncertainties. By incorporating a randomness component, SDE U-Net directly captures and quantifies aleatoric uncertainty, while epistemic uncertainty is assessed through multiple forward passes. Comparative results demonstrate that SDE U-Net not only matches benchmark performance but also shows superior robustness. This approach enhances the reliability of AI in medical decision-making by providing a clear, comprehensive representation of uncertainty, marking a significant advancement in medical image segmentation.
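A generic version of this uncertainty split, for any segmentation network whose forward pass is stochastic, uses the standard entropy decomposition over repeated passes: the expected per-pass entropy approximates the aleatoric part and the remainder (the mutual information) the epistemic part. This is a common formulation, assumed here for illustration rather than taken from the paper.

import torch

@torch.no_grad()
def uncertainty_maps(model, x, passes=20):
    """Returns mean prediction, aleatoric map, and epistemic map, each (B, H, W)."""
    probs = torch.stack([model(x).softmax(dim=1) for _ in range(passes)])  # (T,B,C,H,W)
    mean = probs.mean(dim=0)
    # Expected entropy of individual passes: aleatoric (data) uncertainty.
    aleatoric = (-(probs * probs.clamp_min(1e-8).log()).sum(dim=2)).mean(dim=0)
    # Entropy of the mean prediction: total predictive uncertainty.
    total = -(mean * mean.clamp_min(1e-8).log()).sum(dim=1)
    return mean, aleatoric, total - aleatoric     # epistemic = mutual information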
Generation of Simulated Dataset of Computed Tomography Images of Eggs and Extraction of Measurements Using Deep Learning
Jean Pierre Brik López Vargas, Davi Duarte de Paula, Denis Henrique Pinheiro Salvadeo, Emílio Bergamim Júnior
https://doi.org/10.14428/esann/2024.ES2024-134
Abstract:
This paper extracts morphometric measurements of the volumes of chicken egg components (shell, yolk, albumen, and air chamber) by evaluating two segmentation algorithms, U-Net and the Fully Convolutional Network (FCN). It also presents a new dataset of 3D CT images of chicken eggs, simulating the different densities of a real egg, in the Digital Imaging and Communications in Medicine (DICOM) format, together with labeled masks. The 3D models trained end-to-end showed high generalization even in the presence of variations in egg size and internal structures, achieving state-of-the-art segmentation performance with 99.4% accuracy.
AI-based Collimation Optimization for X-Ray Imaging using Time-of-Flight Cameras
Dominik Mairhöfer, Manuel Laufer, Lennart Berkel, Arpad Bischof, Erhardt Barth, Jörg Barkhausen, Thomas Martinetz
https://doi.org/10.14428/esann/2024.ES2024-147
Abstract:
Collimation during radiography, the process of defining the area to be irradiated, is a crucial factor for the protection of the patient and for the diagnostic quality of a radiograph. Moreover, incorrect collimation is one of the main causes of retakes and the associated costs. In this paper, we propose a novel collimation optimization approach using Time-of-Flight cameras and deep neural networks trained end-to-end to increase the diagnostic quality of a radiograph. For this purpose, we acquired a new dataset in a clinical environment consisting of depth images of the lower leg and the abdomen. Using this dataset, we are able to segment depth images for the optimal collimation with an average IoU of 83%.
On the Stability of Neural Segmentation in Radiology
Moritz Wolter, Lokesh Veeramacheneni, Bettina Baeßler, Ulrike I. Attenberger, Barbara D. Wichtmann
https://doi.org/10.14428/esann/2024.ES2024-172
Abstract:
Neural networks promise automated prostate segmentation for the development of precise and quantifiable image-based biomarkers in modern personalized oncology. Before clinical translation, however, their stability must be ensured. In this study, we train three-dimensional U-shaped convolutional neural networks to segment prostate magnetic resonance imaging (MRI) scans and evaluate different loss formulations to improve their performance. To evaluate the generalizability and reproducibility of our networks, we compare their performance on a clinically acquired test/re-test MRI dataset of 26 prostate cancer patients not previously seen by the networks. We find our networks to generalize well, with good reproducibility and a mean Intersection over Union of 0.88. While initial results are promising, anatomical accuracy remains to be evaluated in larger, multi-center datasets. To facilitate clinical applicability, we provide an easy-to-use toolbox online.
Analysis of DNA methylation patterns in cancer samples using SOM
Ignacio Diaz-Blanco, Jose M. Enguita-Gonzalez, Diego Garcia-Perez, Abel A. Cuadrado-Vega, Nuria Valdes-Gallego, Maria Dolores Chiara-Romero
https://doi.org/10.14428/esann/2024.ES2024-42
Abstract:
By leveraging the SOM algorithm and the extensive epigenomic data from The Cancer Genome Atlas (TCGA), this work suggests a valid approach to explore the relationships between epigenetic alterations and the pathogenesis of pheochromocytoma and paraganglioma (PCPG). Additionally, the methodological approach presented here lays the foundation for a potentially valuable analysis tool that can be applied to other cancer types and to epigenetic research in general.
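The SOM step of such an analysis can be sketched with the MiniSom package: fit a map on methylation beta-value profiles, then inspect where samples land; data shapes, map size, and training length are illustrative assumptions, not the settings used in the paper.

import numpy as np
from minisom import MiniSom                       # pip install minisom

rng = np.random.default_rng(0)                    # stand-in data; replace with TCGA betas
beta = rng.random((300, 5000))                    # 300 samples x 5000 CpG sites

som = MiniSom(20, 20, beta.shape[1], sigma=2.0, learning_rate=0.5, random_seed=0)
som.pca_weights_init(beta)
som.train_random(beta, num_iteration=10000)

# Best-matching unit per sample; clusters of samples sharing units suggest
# shared methylation patterns worth biological follow-up.
bmus = np.array([som.winner(x) for x in beta])
print(np.unique(bmus, axis=0).shape[0], "occupied map units")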
Graph-cut-assisted CNN training for pulmonary embolism segmentation
Nana Yang, Robin Verschuren, Christophe De Vleeschouwer
https://doi.org/10.14428/esann/2024.ES2024-45
Abstract:
We present a novel algorithm for pulmonary embolism segmentation, designed to alleviate the need for expert annotation. Our approach integrates deep learning with conventional image segmentation techniques, operating in two distinct stages. Specifically, graph cut is used for an initial segmentation, followed by manual refinement, to define the labels required to train a CNN. This CNN is then employed to generate pseudo-labels on a large dataset, enabling the training of an improved CNN*. Our findings demonstrate the enhanced performance of CNN* over CNN. Overall, CNN* builds on a very limited amount of manual intervention; moreover, injecting expert knowledge through the graph cut means this manual intervention itself requires no expertise.
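Compressed to its essentials, the two-stage scheme looks like the sketch below: a first CNN trained on graph-cut-derived labels, whose predictions on unlabeled scans become pseudo-labels for the improved CNN*. The models (`cnn`, `cnn_star`) and data batches are assumed to exist; only the training skeleton is shown.

import torch
import torch.nn.functional as F

def train(model, batches, epochs=10, lr=1e-3):
    """batches: iterable of (image batch, integer label-map batch) pairs."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in batches:
            loss = F.cross_entropy(model(x), y)   # per-pixel segmentation loss
            opt.zero_grad(); loss.backward(); opt.step()

# `cnn`, `cnn_star` (segmentation networks) and the two datasets are assumed.
train(cnn, graphcut_labeled_batches)              # stage 1: graph-cut (+ refinement) labels
with torch.no_grad():                             # stage 2: pseudo-label a large dataset
    pseudo = [(x, cnn(x).argmax(dim=1)) for x in unlabeled_batches]
train(cnn_star, pseudo)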
Reconstruction of Mammography Projections using Image-to-Image Translation Techniques
Joana Cristo Santos, Miriam Seoane Santos, Pedro Henriques Abreu
https://doi.org/10.14428/esann/2024.ES2024-62
Abstract:
Mammography imaging is the gold standard for breast cancer detection and involves capturing two projections: the mediolateral oblique and craniocaudal projections. An approach that allows the acquisition of only one projection and reconstructs the other could mitigate patient burden, minimize radiation exposure, and reduce costs. Image-to-image translation has shown the ability to generate realistic synthetic images in different medical imaging modalities, which makes these techniques strong candidates for this novel application in mammography. This study compares five image-to-image translation approaches to assess the feasibility of reconstructing a mammography projection from its counterpart. The results indicate that ResViT shows the best overall performance in translating between the two projections.