Bruges, Belgium, April 23-25, 2025
Contents of the proceedings
Network Science Meets AI
Streaming Continual Learning: fast adaptation and knowledge consolidation in dynamic environments
Unsupervised learning and dimensionality reduction
Image processing and deep learning
Foundation and Generative Models for Graphs
Machine learning and applied Artificial Intelligence in cognitive sciences and psychology
Reinforcement learning
Classification and statistical learning
Time series
Quantum, Quantum Inspired and Hybrid Machine Learning
Domain adaptation and federated learning
Explainable AI and representation learning
Natural language processing
Dynamical systems and recurrent learning
Network Science Meets AI
Network Science Meets AI: A Converging Frontier
Matteo Zignani, Fragkiskos D. Malliaros, Ingo Scholtes, Roberto Interdonato, Manuel Dileo
https://doi.org/10.14428/esann/2025.ES2025-28
Abstract:
The convergence of network science and artificial intelligence (AI) represents a rich area of research, where both fields can mutually enhance one another. Network science offers a comprehensive framework to analyze and model complex relationships, while machine learning (ML) and AI provide powerful tools for recognizing patterns and making predictions from large datasets. Combining these two disciplines can advance the study of complex systems and lead to new innovations in data-driven research. This tutorial paper reviews fundamental concepts of network science, describes current and promising research directions for bridging network science and AI, and summarizes the contributions that have been accepted for publication in the ESANN 2025 special session on the topic.
Learning of Probability Estimates for System and Network Reliability Analysis by Means of Matrix Learning Vector Quantization
Mandy Lange-Geisler, Klaus Dohmen, Thomas Villmann
https://doi.org/10.14428/esann/2025.ES2025-67
Abstract:
We present a new approach for the assessment of the reliability of coherent systems by using a prototype-based classification method. More specifically, reliability levels for consecutive $k$-out-of-$n$ systems, which serve as a model for a particular type of network, are classified using Generalized Matrix Learning Vector Quantization, which provides useful information about the impact of the input probabilities on the classified reliability levels. Our approach is not limited to reliability analysis, but is generally applicable for estimating the probability of the union of any finite family of events, based on their individual and pairwise probabilities.
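The closing claim, estimating the probability of a union of events from individual and pairwise probabilities, connects to the classical second-order Bonferroni inequalities. The sketch below (function name and variable layout are ours, not the paper's, which instead learns reliability levels with GMLVQ) computes those bounds:

```python
import itertools

def bonferroni_bounds(p, p_pair):
    """Second-order Bonferroni bounds on P(A_1 u ... u A_n), given the
    individual probabilities p[i] = P(A_i) and the pairwise probabilities
    p_pair[(i, j)] = P(A_i n A_j)."""
    n = len(p)
    upper = min(1.0, sum(p))  # Boole's inequality
    lower = sum(p) - sum(p_pair[(i, j)]
                         for i, j in itertools.combinations(range(n), 2))
    return max(0.0, lower), upper

# two independent events: P(union) = 0.1 + 0.2 - 0.02 = 0.28
lo, hi = bonferroni_bounds([0.1, 0.2], {(0, 1): 0.1 * 0.2})
print(lo, hi)  # approximately 0.28 and 0.30
```

For two events the lower bound is exact (inclusion–exclusion); with more events it is only a bound, which is where a learned estimator can improve matters.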
Enhancing neural link predictors for temporal knowledge graphs with temporal regularisers
Manuel Dileo, Pasquale Minervini, Matteo Zignani, Sabrina Gaito
https://doi.org/10.14428/esann/2025.ES2025-87
Abstract:
The problem of link prediction in temporal knowledge graphs (TKGs) consists of finding missing links in the knowledge base under temporal constraints. Recently, Lacroix et al. and Xu et al. proposed a solution to the problem inspired by the canonical decomposition of order-4 tensors, where they regularise the representations of time steps by learning similar transformations for adjacent timestamps. However, the impact of the choice of temporal regularisation terms is still poorly understood. In this work, we systematically analyse several choices of temporal regularisers using linear functions and recurrent architectures. In our experiments, we show that by carefully selecting the temporal regulariser and regularisation weight, a simple method like TNTComplEx can produce comparable results with state-of-the-art methods and enhance its original performance. Specifically, we observe that linear regularisers for temporal smoothing based on specific nuclear norms can significantly improve the predictive accuracy of the base temporal link prediction methods.
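As a rough illustration of the temporal smoothing discussed above, the sketch below penalises the p-th power of differences between embeddings of adjacent timestamps (a generic formulation; the specific regularisers and nuclear norms studied in the paper differ):

```python
import numpy as np

def temporal_smoothing_penalty(timestamp_emb, p=3):
    """Penalise change between embeddings of adjacent timestamps.
    timestamp_emb: array of shape (num_timestamps, rank)."""
    diffs = timestamp_emb[1:] - timestamp_emb[:-1]
    return float(np.sum(np.abs(diffs) ** p) / (len(timestamp_emb) - 1))

# embeddings that are constant over time incur zero penalty
print(temporal_smoothing_penalty(np.ones((5, 4))))  # 0.0
```

Added to the link-prediction loss with a regularisation weight, this term pushes adjacent timestamp representations to stay close, which is the behaviour the paper tunes.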
Hyperbolic representation learning in multi-layer tissue networks
Domonkos Pogány, Péter Antal
https://doi.org/10.14428/esann/2025.ES2025-21
Abstract:
Predicting tissue-specific protein functions and protein-protein interactions (PPI) is essential for understanding human biology, diseases, and potential therapeutics.
Recently, increasingly complex unsupervised feature learning approaches have emerged in the field as a promising direction, but none of them considers the scale-free nature and underlying geometry of multi-layer PPI networks.
Therefore, this study proposes contextualized, tissue-specific representation learning in non-Euclidean geometries and demonstrates that hyperbolic embeddings capture the structure of multi-layer PPI networks with less distortion and achieve better performance in tissue-specific protein function prediction.
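For intuition, hyperbolic embeddings are commonly learned in the Poincaré ball, whose geodesic distance grows rapidly toward the boundary and therefore accommodates scale-free, tree-like structure with little distortion. A minimal sketch of the standard distance formula (not the paper's specific embedding model):

```python
import numpy as np

def poincare_distance(u, v):
    """Geodesic distance between points u, v inside the unit Poincare ball."""
    sq_diff = float(np.dot(u - v, u - v))
    denom = (1.0 - float(np.dot(u, u))) * (1.0 - float(np.dot(v, v)))
    return float(np.arccosh(1.0 + 2.0 * sq_diff / denom))

# from the origin the formula reduces to 2 * artanh(||x||)
x = np.array([0.5, 0.0])
print(poincare_distance(np.zeros(2), x))  # ~1.0986 = 2 * artanh(0.5)
```

Points near the boundary are exponentially far apart, which is what lets a low-dimensional hyperbolic space host hierarchies that would need many Euclidean dimensions.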
Topology-Aware Activation Functions in Neural Networks
Pavel Snopov, Oleg Musin
https://doi.org/10.14428/esann/2025.ES2025-211
Abstract:
This study explores novel activation functions that enhance the ability of neural networks to manipulate data topology during training. Building on the limitations of traditional activation functions like \( \relu \), we propose \( \smoothsplit \) and \( \parametricsplit \), which introduce topology ``cutting'' capabilities. These functions enable networks to transform complex data manifolds effectively, improving performance in scenarios with low-dimensional layers. Through experiments on synthetic and real-world datasets, we demonstrate that \( \parametricsplit \) outperforms traditional activations in low-dimensional settings while maintaining competitive performance in higher-dimensional ones. Our findings highlight the potential of topology-aware activation functions in advancing neural network architectures. The code is available via \url{https://github.com/Snopoff/Topology-Aware-Activations}.
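The exact definitions of \( \smoothsplit \) and \( \parametricsplit \) are in the linked repository. Purely as an illustration of what a topology-cutting activation can look like, the hypothetical function below tears the real line at the origin by pushing points away from zero, so the two half-lines become separated components:

```python
import numpy as np

def hard_split(x, margin=1.0):
    """Hypothetical splitting activation (illustration only, not the paper's
    definition): maps (-inf, 0) -> (-inf, -margin) and [0, inf) -> [margin, inf),
    cutting the data manifold at the origin."""
    return np.where(x >= 0, x + margin, x - margin)

print(hard_split(np.array([-0.2, 0.0, 0.2])))  # approximately [-1.2, 1.0, 1.2]
```

A connected set straddling zero comes out as two pieces separated by a gap of width 2 * margin, a transformation no homeomorphism (and hence no conventional smooth activation) can perform.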
Streaming Continual Learning: fast adaptation and knowledge consolidation in dynamic environments
Don't drift away: Advances and Applications of Streaming and Continual Learning
Andrea Cossu, Davide Bacciu, Alessio Bernardo, Emanuele Della Valle, Alexander Gepperth, Federico Giannini, Barbara Hammer, Giacomo Ziffer
https://doi.org/10.14428/esann/2025.ES2025-23
Abstract:
Non-stationary environments subject to concept drift require the design of adaptive models that can continuously learn and update. Two primary research communities have emerged to address this challenge: Continual Learning (CL) and Streaming Machine Learning (SML). CL manages virtual drifts by learning new concepts without forgetting past knowledge, while SML focuses on real drifts, rapidly adapting to evolving data distributions. However, a unified approach is needed to balance adaptation and knowledge retention. Streaming Continual Learning (SCL) bridges the gap between CL and SML, ensuring models retain useful past information while efficiently adapting to new data. We explore key challenges in SCL, including handling temporal dependencies in data streams and adapting latent representations for personalization and knowledge editing. Additionally, we identify promising SCL benchmarks which can foster and promote a unified research effort between CL and SML.
Compression-based $k$NN for Class Incremental Continual Learning
Valerie Vaquet, Jonas Vaquet, Fabian Hinder, Barbara Hammer
https://doi.org/10.14428/esann/2025.ES2025-75
Abstract:
Catastrophic forgetting is a key challenge in continual learning. In the adjoining field of stream machine learning, few methods target the related problem of re-occurring drift by avoiding forgetting old data. In this work, we investigate whether such strategies can be transferred from stream machine learning to the continual learning setup. Based on these considerations, we propose a simple yet efficient compression-based $k$NN scheme and evaluate it experimentally.
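One natural instantiation of such a scheme (our sketch under simple assumptions, not necessarily the authors' method) compresses each class to a few k-means prototypes and classifies by nearest prototype, keeping memory bounded as classes accumulate:

```python
import numpy as np

def compress(X, m, iters=25, seed=0):
    """Compress a class's samples X of shape (n, d) to m k-means prototypes."""
    rng = np.random.default_rng(seed)
    C = X[rng.choice(len(X), size=m, replace=False)].copy()
    for _ in range(iters):
        labels = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1).argmin(1)
        for j in range(m):
            if np.any(labels == j):
                C[j] = X[labels == j].mean(0)
    return C

def predict(x, memory):
    """1-NN prediction over per-class compressed prototype memories."""
    dists = {c: ((P - x) ** 2).sum(-1).min() for c, P in memory.items()}
    return min(dists, key=dists.get)
```

Because each incoming class is compressed once and then frozen, old classes are never overwritten, which is the forgetting-avoidance property the abstract refers to.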
Stability of State and Costate Dynamics in Continuous Time Recurrent Neural Networks
Alessandro Betti, Marco Gori, Stefano Melacci
https://doi.org/10.14428/esann/2025.ES2025-164
Abstract:
The notion of stability plays a crucial role in ensuring the safe development of a model in a lifelong learning context. This paper investigates the fundamental aspects of stability in a class of continuous-time recurrent neural networks which include both state and costate variables. The latter are directly inherited from optimal control theory, and they act as adjoint variables closely related to gradient terms. Stability is investigated both in terms of state and of costate dynamics, showing the key conditions that must be satisfied to produce bounded dynamics in the forward and learning stages.
Towards Streaming Land Use Classification of Images with Temporal Distribution Shifts
Lorenzo Iovine, Giacomo Ziffer, Andrea Proia, Emanuele Della Valle
https://doi.org/10.14428/esann/2025.ES2025-166
Abstract:
In this study, we introduce a new pipeline that integrates Streaming Machine Learning (SML) models and the Momentum Contrastive Learning (MoCo) technique for the streaming classification of satellite images subject to temporal variations in distribution. We present preliminary results of an experimental campaign conducted on the Functional Map of the World-Time dataset, one of the first benchmarks specifically designed to address temporal distribution shifts in satellite imagery. The results demonstrate that the proposed pipeline enhances robustness and generalization over time, surpassing traditional strategies.
Linear Domain Adaptation for Robustness to Electrode Shifts
Rui Liu, Benjamin Paassen
https://doi.org/10.14428/esann/2025.ES2025-99
Abstract:
Machine learning approaches have shown impressive achievements in bionic prosthesis control. However, translating machine learning models from the lab to patients' everyday lives remains a challenge due to various disturbances, such as electrode shifts. To mitigate the influence of electrode shifts, we investigate two linear domain adaptation methods and a robust training approach. In experiments, we compare all methods on both simulated electrode shifts on the Ninapro DB2 data set and real electrode shifts on Ninapro DB8. We find that linear domain adaptation estimates the shift and reduces the impact of electrode shifts best, but robust training achieves similar performance without the need for new data.
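As a sketch of the general idea (not necessarily either of the two methods compared in the paper), a linear adaptation can be fit by least squares from a few post-shift calibration samples, mapping shifted features back to the space the classifier was trained on:

```python
import numpy as np

def fit_linear_adaptation(X_shifted, X_orig):
    """Least-squares linear map W such that X_shifted @ W approximates X_orig."""
    W, *_ = np.linalg.lstsq(X_shifted, X_orig, rcond=None)
    return W

# a synthetic 'electrode shift': features are mixed by an unknown matrix A
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
A = rng.normal(size=(5, 5)) + 5 * np.eye(5)  # well-conditioned shift
W = fit_linear_adaptation(X @ A, X)
print(np.allclose((X @ A) @ W, X))  # True
```

The classifier itself stays untouched; only the cheap linear front-end is re-estimated, which is what makes this attractive for daily recalibration of a prosthesis.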
Continual Unlearning through Memory Suppression
Alexander Krawczyk, Alex Gepperth
https://doi.org/10.14428/esann/2025.ES2025-102
Abstract:
This study uncovers surprisingly effective synergies between the fields of continual learning (CL) and machine unlearning (MUL). We extend the common class-incremental setting from CL to incorporate suppression requests in what we term class-incremental unlearning (CIUL). We present a lightweight approach to CIUL using replay/rehearsal-based CL approaches together with a selective replay strategy termed \enquote{Replay-To-Suppress} (RTS), where we actually \textit{make use} of the catastrophic forgetting effect to achieve unlearning. In particular, we adapt a CL strategy termed adiabatic replay (AR) to achieve suppression at near-constant time complexity. We demonstrate excellent overall performance for all CL strategies extended by RTS on MNIST, F-MNIST and latent-encoded versions of the challenging CIFAR and SVHN benchmarks.
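The core mechanism, dropping suppressed classes from what gets replayed so that ordinary catastrophic forgetting erases them, can be sketched in a few lines (our simplification of the idea, not the authors' RTS/AR implementation):

```python
def select_replay(memory, suppressed):
    """Selective replay: keep only (sample, label) pairs whose label is not
    suppressed; subsequent training on this set lets forgetting erase the rest.
    memory: list of (sample, label) pairs; suppressed: set of labels."""
    return [(x, y) for x, y in memory if y not in suppressed]

mem = [("a", 0), ("b", 1), ("c", 0), ("d", 2)]
print(select_replay(mem, {0}))  # [('b', 1), ('d', 2)]
```

No explicit unlearning objective is needed: whatever is absent from the replay stream decays on its own during the next training phase.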
Reducing the stability gap for continual learning at the edge with class balancing
Wei Wei, Matthias Hutsebaut-Buysse, Tom De Schepper, Kevin Mets
https://doi.org/10.14428/esann/2025.ES2025-139
Abstract:
Continual learning (CL) at the edge requires the model to learn from sequentially arriving small batches of data. A naive online learning strategy fails due to the catastrophic forgetting phenomenon. Previous literature introduced the `latent replay' for CL at the edge, where the input is transformed into latent representations using a pre-trained feature extractor. These latent representations are used, in combination with the real inputs, to train the adaptive classification layers. This approach is prone to the stability gap problem, where the accuracies of learned classes drop when learning a new class, and they only recover during subsequent training iterations. We hypothesize that this is caused by the class imbalance between new class data from the new task, and the old class data in the replay memory. We validate this by applying two class balancing strategies in a latent replay-based CL method. Our empirical results demonstrate that class balancing strategies provide a notable accuracy improvement, and a reduction of the stability gap when using a latent replay-based CL method with a small replay memory size.
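A minimal version of such a class-balancing strategy (a sketch only; the paper evaluates two concrete strategies not reproduced here) draws equally many samples per class from the union of the new-task batch and the replay memory:

```python
import numpy as np

def balanced_batch(X, y, batch_size, seed=0):
    """Draw a mini-batch with equal counts per class (with replacement),
    so a small replay memory is not drowned out by abundant new-class data.
    X: pooled samples (new data + replay memory), y: their labels."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    per_class = batch_size // len(classes)
    idx = np.concatenate([rng.choice(np.where(y == c)[0], per_class, replace=True)
                          for c in classes])
    return X[idx], y[idx]

# 90 samples of class 0 vs 10 of class 1 -> batch is still 50/50
X = np.arange(100.0)
y = np.array([0] * 90 + [1] * 10)
bx, by = balanced_batch(X, y, 32)
print((by == 0).sum(), (by == 1).sum())  # 16 16
```

Equalizing per-class gradient contributions during the first updates on a new task is precisely what counteracts the initial accuracy drop that constitutes the stability gap.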
Replay-free Online Continual Learning with Self-Supervised MultiPatches
Giacomo Cignoni, Andrea Cossu, Alexandra Gómez Villa, Joost van de Weijer, Antonio Carta
https://doi.org/10.14428/esann/2025.ES2025-180
Abstract:
Online Continual Learning (OCL) methods train a model on a non-stationary data stream where only a few examples are available at a time, often leveraging replay strategies. However, usage of replay is sometimes forbidden, especially in applications with strict privacy regulations. Therefore, we propose Continual MultiPatches (CMP), an effective plug-in for existing OCL self-supervised learning strategies that avoids the use of replay samples. CMP generates multiple patches from a single example and projects them into a shared feature space, where patches coming from the same example are pushed together without collapsing into a single point.
CMP surpasses replay and other SSL-based strategies on OCL streams, challenging the role of replay as a go-to solution for self-supervised OCL.
Reward Incremental Learning
Yannick Denker, Alex Gepperth
https://doi.org/10.14428/esann/2025.ES2025-33
Abstract:
We address a new scenario in continual reinforcement learning, which we term reward-incremental learning (RIL). Generally, RIL is not restricted to RL, but addresses situations where identical samples (observations in RL) must be mapped to different classes (Q-values) in different tasks. This is in contrast to class-incremental CL, where new tasks add classes but without the contradictions inherent in RIL. To tackle this issue, we propose the use of a replay-based approach called adiabatic replay (AR), which is inherently suited for RL since it removes the need for large replay buffers. Based on a simple benchmark scenario for continual RL, we empirically demonstrate that RIL scenarios can be handled by our approach, in contrast to conventional DQN methods.
Continual Contrastive Learning on Tabular Data with Out of Distribution
Achmad Ginanjar, Xue Li, Priyanka Singh, Wen Hua
https://doi.org/10.14428/esann/2025.ES2025-141
Abstract:
Out-of-Distribution (OOD) prediction remains a significant challenge in machine learning, particularly for tabular data where traditional methods often fail to generalize beyond their training distribution. This paper introduces Tabular Continual Contrastive Learning (TCCL), a novel framework designed to address OOD challenges in tabular data processing. TCCL integrates contrastive learning principles with continual learning mechanisms, featuring a three-component architecture: an Encoder for data transformation, a Decoder for representation learning, and a Learner Head. We evaluate TCCL against 14 baseline models, including state-of-the-art deep learning approaches and gradient boosted decision trees (GBDT), across eight diverse tabular datasets. Our experimental results demonstrate that TCCL consistently outperforms existing methods in both classification and regression tasks on OOD data, with particular strength in handling distribution shifts. These findings suggest that TCCL represents a significant advancement in handling OOD scenarios for tabular data.
Unsupervised learning and dimensionality reduction
Generative Kernel Spectral Clustering
Sonny Achten, David Winant, Johan Suykens
https://doi.org/10.14428/esann/2025.ES2025-37
Abstract:
Modern clustering approaches often trade interpretability for performance, particularly in deep learning-based methods. We present Generative Kernel Spectral Clustering (GenKSC), a novel model combining kernel spectral clustering with generative modeling to produce both well-defined clusters and interpretable representations. By augmenting weighted variance maximization with reconstruction and clustering losses, our model creates an explorable latent space where cluster characteristics can be visualized through traversals along cluster directions. Results on MNIST and FashionMNIST datasets demonstrate the model's ability to learn meaningful cluster representations.
Can MDS rival with t-SNE by using the symmetric Kullback-Leibler divergence across neighborhoods as a pseudo-distance?
Lee John, Pierre Lambert, Edouard Couplet, Pierre Merveille, Ludovic Journaux, Dounia Mulders, Cyril de Bodt, Michel Verleysen
https://doi.org/10.14428/esann/2025.ES2025-174
Abstract:
Local methods of dimensionality reduction like neighborhood embedding (NE) and t-SNE in particular outperform older global approaches such as stress-based multi-dimensional scaling (MDS).
Distances suffer from statistical variations between spaces with strongly different dimensionalities, which makes matching them across spaces very difficult; stochastic neighborhoods are less sensitive to these variations.
Here, we take inspiration from those stochastic neighborhoods in order to devise a pseudo-distance that is less prone to concentration than the Euclidean distance.
For two points in the high-dimensional data space, it is defined as the symmetrized Kullback-Leibler divergence across the (stochastic) neighborhoods of the two points (SKLAN in short).
Plugging the SKLAN in a method of stress-based MDS, we compare the performance between t-SNE, MDS with all Euclidean distances, and MDS with SKLAN & Euclidean distances on several data sets.
The results show that SKLAN allows MDS to perform competitively with t-SNE.
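The SKLAN construction can be sketched in a few lines: build stochastic neighborhoods from pairwise distances, then symmetrize the Kullback-Leibler divergence between two points' neighborhood distributions. The Gaussian kernel with a single global bandwidth is a simplifying assumption here; the paper may calibrate neighborhoods per point, as t-SNE does with perplexity.

```python
import numpy as np

def neighborhoods(X, sigma=1.0):
    """Stochastic neighborhoods: row-wise softmax over negative squared distances."""
    D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    P = np.exp(-D2 / (2 * sigma ** 2))
    np.fill_diagonal(P, 0.0)            # a point is not its own neighbor
    return P / P.sum(axis=1, keepdims=True)

def sklan(P, i, j, eps=1e-12):
    """Symmetrized KL divergence between the neighborhoods of points i and j."""
    p, q = P[i] + eps, P[j] + eps
    kl = lambda a, b: np.sum(a * np.log(a / b))
    return 0.5 * (kl(p, q) + kl(q, p))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 20))
P = neighborhoods(X)
d = sklan(P, 0, 1)
assert d >= 0.0                          # the pseudo-distance is non-negative
assert abs(sklan(P, 0, 0)) < 1e-9        # and zero between identical neighborhoods
```

The resulting pseudo-distance can then be fed to any stress-based MDS solver in place of the Euclidean distance matrix.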
Adaptive Locally Aligned Ant Technique for Manifold Detection and Denoising
Felipe Contreras, Kerstin Bunte, Reynier Peletier
https://doi.org/10.14428/esann/2025.ES2025-185
Abstract:
The detection and extraction of noisy manifolds from data have various applications. In astronomy, the detection of faint streams and filaments is particularly difficult due to background contamination, which immerses and hides them in noise. The biologically inspired Locally Aligned Ant Technique (LAAT) has been demonstrated to be an efficient and flexible algorithm for detecting and denoising versatile structures within noisy backgrounds. Our contribution extends LAAT in two ways: (1) the introduction of a dynamic local radius, and (2) locally variable pheromone deposition. The former avoids highlighting spurious patterns in noisy regions and allows smaller jumps in areas with strong alignment. The latter increases pheromone deposition in fainter zones. We demonstrate this on two datasets.
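The core LAAT move, pulling an agent along the locally dominant eigenvector of its neighborhood, can be sketched as below. The fixed radius and step size are deliberate simplifications: the paper's contribution is precisely to make the radius and the pheromone deposition locally adaptive.

```python
import numpy as np

def align_step(X, p, radius=1.0, step=0.1):
    """One agent step: move p along the principal direction of its local neighborhood."""
    nbrs = X[np.linalg.norm(X - p, axis=1) < radius]
    if len(nbrs) < 3:
        return p                              # too isolated: no reliable direction
    centered = nbrs - nbrs.mean(axis=0)
    _, V = np.linalg.eigh(centered.T @ centered)
    v = V[:, -1]                              # dominant eigenvector (sign is arbitrary)
    return p + step * v

# A noisy line: steps should move points along the filament, not across it.
rng = np.random.default_rng(1)
line = np.column_stack([np.linspace(0, 5, 100), np.zeros(100)])
X = line + rng.normal(scale=0.05, size=line.shape)
p_new = align_step(X, X[50])
assert abs(p_new[0] - X[50][0]) > abs(p_new[1] - X[50][1])
```

In the full algorithm many agents take such steps while depositing pheromone, and the accumulated pheromone map is what highlights the manifold against the noise.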
Explaining Outliers using Isolation Forest and Shapley Interactions
Roel Visser, Fabian Fumagalli, Maximilian Muschalik, Eyke Hüllermeier, Barbara Hammer
https://doi.org/10.14428/esann/2025.ES2025-163
Abstract:
In unsupervised machine learning, Isolation Forest (IsoForest) is a widely used algorithm for the efficient detection of outliers. Identifying the features responsible for observed anomalies is crucial for practitioners, yet the ensemble nature of IsoForest complicates interpretation and comparison. As a remedy, SHAP is a prevalent method to interpret outlier scoring models by assigning contributions to individual features based on the Shapley value (SV). However, complex anomalies typically involve interactions of features, and it is paramount for practitioners to distinguish such complex anomalies from simple cases. In this work, we propose Shapley interactions (SIs) to enrich explanations of outliers with feature interactions. SIs, as an extension of the SV, decompose the outlier score into contributions of individual features and interactions of features up to a specified explanation order. We modify IsoForest to compute SIs using TreeSHAP-IQ, an extension of TreeSHAP for tree-based models, via the shapiq package. Using a qualitative and quantitative analysis on synthetic and real-world datasets, we demonstrate the benefit of SIs and feature interactions for outlier explanations over feature contributions alone.
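To see what a Shapley interaction adds over plain Shapley values, consider a toy two-feature outlier score that fires only when both features are extreme. This toy coalition game is ours for illustration, not the paper's TreeSHAP-IQ computation.

```python
# Toy score: an anomaly only when BOTH features are "switched on" (pure interaction).
def v(S):
    """Value of a feature coalition S ⊆ {1, 2}."""
    return 1.0 if S == {1, 2} else 0.0

# Exact two-player Shapley values: average marginal contribution over both orderings.
phi1 = 0.5 * (v({1}) - v(set())) + 0.5 * (v({1, 2}) - v({2}))
phi2 = 0.5 * (v({2}) - v(set())) + 0.5 * (v({1, 2}) - v({1}))
# Pairwise Shapley interaction: the joint effect left over after individual effects.
I12 = v({1, 2}) - v({1}) - v({2}) + v(set())

assert phi1 == phi2 == 0.5   # individual attributions split the score evenly...
assert I12 == 1.0            # ...but the interaction reveals it is a joint effect
```

Per-feature attributions alone make this anomaly look like two equally weak individual effects; the interaction index exposes that neither feature is anomalous on its own.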
Do not get lost in projection: finding the right distance for meaningful UMAP embeddings
Eva Blanco-Mallo, Verónica Bolón-Canedo, Beatriz Remeseiro
https://doi.org/10.14428/esann/2025.ES2025-177
Abstract:
Dimensionality reduction techniques are essential for visualizing and analyzing high-dimensional data. This study explores the impact of distance measures on the performance of Uniform Manifold Approximation and Projection (UMAP), a widely used dimensionality reduction method. We evaluate their influence on cluster separation, structure preservation, and their effectiveness when used as a preprocessing step for classification tasks on real and synthetic datasets. The results highlight the importance of tailoring distance measures to specific data contexts and provide guidance for optimizing UMAP applications.
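Why the metric matters can be illustrated without UMAP itself: a cosine distance discards per-sample scale while the Euclidean distance keeps it, so the two can disagree about who a point's neighbors are, and UMAP's embedding inherits that neighbor graph. A minimal sketch:

```python
import numpy as np

def nearest(X, i, metric):
    """Index of the nearest neighbor of X[i] under the given metric."""
    if metric == "euclidean":
        d = np.linalg.norm(X - X[i], axis=1)
    else:  # cosine distance: 1 - cosine similarity
        Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
        d = 1.0 - Xn @ Xn[i]
    d[i] = np.inf                            # exclude the point itself
    return int(np.argmin(d))

# Same direction, different magnitude: cosine ignores scale, Euclidean does not.
X = np.array([[1.0, 0.0], [10.0, 0.5], [0.0, 1.0]])
assert nearest(X, 0, "cosine") == 1      # points 0 and 1 point the same way
assert nearest(X, 0, "euclidean") == 2   # but point 2 is closer in raw distance
```

In the umap-learn implementation this choice is exposed as the `metric` argument of `umap.UMAP`; which metric is "right" depends on whether scale differences in the data are signal or nuisance.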
Image processing and deep learning
Machine Learning on Smartphone-Captured Diffraction Data
Udo Seiffert, Ashish Shivajirao Jadhav, Andreas Backhaus
https://doi.org/10.14428/esann/2025.ES2025-34
Abstract:
This study presents a novel approach for classifying oily or cream-like substances using diffraction data captured on a smartphone camera, applied specifically to assessing engine oil quality. Utilising the COMPOLYTICS® TapCorder approach, optical diffraction patterns were analysed with a tailored feature extraction method. The performance of three machine learning paradigms - Multilayer Perceptrons (MLP), Learning Vector Quantization (LVQ), and Radial Basis Function Networks (RBFN) - was analysed in classifying new and used oil samples. MLP achieved the highest accuracy, while LVQ required the least computation time, highlighting trade-offs relevant for consumer-focused applications. This work clearly demonstrates the feasibility of accessible, low-cost chemical substance analysis via smartphone-based systems.
Exoplanet detection in angular and spectral differential imaging with an accelerated proximal gradient algorithm
Nicolas Mil-Homens Cavaco, Laurent Jacques, Pierre-Antoine Absil
https://doi.org/10.14428/esann/2025.ES2025-103
Abstract:
Differential imaging is a technique to post-process images captured by ground-based telescopes during an observation campaign, in order to make exoplanets in a distant planetary system directly visible and to remove the so-called quasi-static speckles that dramatically affect detection capabilities. To introduce geometric diversity between the exoplanets and the quasi-static speckles, the light is split into spectral channels during data acquisition, producing a 4-D data cube with images recorded at many wavelengths and many times. In this work, we follow an inverse-problem approach, modeling the astronomical data as the contribution of a low-rank component containing the background of quasi-static speckles and a sparse component containing the exoplanets. We then formulate the resulting model as a convex non-smooth optimization problem so that an accelerated proximal gradient algorithm can be used to solve the detection problem.
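The low-rank-plus-sparse idea can be sketched with a plain (non-accelerated) proximal gradient loop: the proximal step for the nuclear norm is singular value thresholding (the speckle background), and the proximal step for the l1 norm is soft thresholding (the planets). The regularization weights, step size, and 2-D matrix setting below are illustrative assumptions; the paper works on a 4-D cube with an accelerated scheme.

```python
import numpy as np

def soft(X, t):
    """Soft thresholding: proximal operator of t * ||X||_1."""
    return np.sign(X) * np.maximum(np.abs(X) - t, 0.0)

def svt(X, t):
    """Singular value thresholding: proximal operator of t * nuclear norm."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ np.diag(soft(s, t)) @ Vt

def decompose(M, lam=0.1, mu=1.0, lr=0.5, iters=200):
    """Proximal gradient for min_{L,S} 0.5*||M - L - S||_F^2 + mu*||L||_* + lam*||S||_1."""
    L, S = np.zeros_like(M), np.zeros_like(M)
    for _ in range(iters):
        R = L + S - M                   # gradient of the smooth data-fit term
        L = svt(L - lr * R, lr * mu)
        S = soft(S - lr * R, lr * lam)
    return L, S

rng = np.random.default_rng(2)
low = np.outer(rng.normal(size=30), rng.normal(size=30))   # rank-1 "speckle" background
sparse = np.zeros((30, 30)); sparse[5, 7] = 10.0           # one bright "planet"
L, S = decompose(low + sparse)
assert np.unravel_index(np.abs(S).argmax(), S.shape) == (5, 7)
```

The l1 penalty makes it cheaper to explain the isolated bright pixel in the sparse component than in the low-rank one, which is exactly the separation the detection problem needs.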
Deciphering Barlow Twins: Redundancy Reduction is Insufficient and Normalization is Key
Hans-Oliver Hansen, Marius Jahrens, Thomas Martinetz
https://doi.org/10.14428/esann/2025.ES2025-119
Abstract:
Barlow Twins is a feature-contrastive self-supervised learning framework built on the principle of redundancy reduction. The idea is to train a network by maximizing the correlation between corresponding features and minimizing the correlation between non-corresponding features in distorted views of the same image, thereby facilitating effective pretraining of a backbone network for a subsequent classification head. This is achieved by diagonalizing the cross-correlation matrix of the network’s representations and scaling it towards the identity matrix. We show that the cross-correlation matrix of distorted images is inherently symmetric, independent of the backbone network's weights, which leads to two key insights: (i) the cross-correlation matrix can always be diagonalized using a linear transformation (layer), and (ii) the core idea of maximizing correlations between corresponding features while minimizing them for non-corresponding features alone is insufficient for effective backbone network pretraining. Nevertheless, Barlow Twins provides highly effective pretraining. We show that this is due to the normalization of the cross-correlation matrix in the Barlow Twins cost function. This normalization leads to minima of the cost function which are equivalent to the minima of sample-contrastive approaches to enforce invariance.
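A minimal version of the Barlow Twins objective makes the paper's point concrete: the per-dimension batch standardization that normalizes the cross-correlation matrix is built into the loss itself. The function name and the off-diagonal weight `lam` follow the usual formulation but are assumptions here.

```python
import numpy as np

def barlow_twins_loss(ZA, ZB, lam=5e-3):
    """Barlow Twins loss on two batches of embeddings of distorted views.

    The per-dimension standardization below is the normalization the paper
    identifies as the key ingredient, beyond redundancy reduction alone.
    """
    N = ZA.shape[0]
    ZA = (ZA - ZA.mean(0)) / ZA.std(0)      # batch-normalize each embedding dim
    ZB = (ZB - ZB.mean(0)) / ZB.std(0)
    C = ZA.T @ ZB / N                       # cross-correlation matrix
    on_diag = ((np.diag(C) - 1.0) ** 2).sum()       # pull diagonal towards 1
    off_diag = (C ** 2).sum() - (np.diag(C) ** 2).sum()  # push off-diagonal to 0
    return on_diag + lam * off_diag

rng = np.random.default_rng(3)
Z = rng.normal(size=(128, 8))
aligned = barlow_twins_loss(Z, Z.copy())             # identical views: C ≈ I
misaligned = barlow_twins_loss(Z, rng.normal(size=(128, 8)))
assert aligned < 0.1 < misaligned
```

Without the standardization, trivially scaled embeddings could diagonalize the raw cross-covariance; with it, the minima coincide with those of sample-contrastive invariance objectives, which is the paper's claim.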
Benchmarking Data Augmentation for Contrastive Learning in Static Sign Language Recognition
Ariel Basso Madjoukeng, Jérôme Fink, Pierre Poitier, Bélise Kenmogne Edith, Benoit Frénay
https://doi.org/10.14428/esann/2025.ES2025-142
Abstract:
Sign language (SL) is a communication method used by deaf people. Static sign language recognition (SLR) is a challenging task aimed at identifying signs in images, for which the acquisition of annotated data is time-consuming. To leverage unannotated data, practitioners have turned to unsupervised methods. Contrastive representation learning has proved effective in capturing important features from unannotated data. It is known that the performance of a contrastive model depends on the data augmentation technique used during training. For various applications, sets of effective data augmentations have been identified, but this is not yet the case for SL. This paper identifies the most effective augmentations for static SLR. The results show a difference in accuracy of up to 30% between appearance-based augmentations combined with translations and augmentations based on rotations, erasing, or vertical flips.
Enhancing Image Classification in Quantum Computing: A Study on Preprocessing Techniques and Qubit Limitations
Henrique Alves Barbosa, Gustavo Augusto Pires, Juliana Assis Alves, Luiz Torres, Janier Arias García, Frederico Gualberto Ferreira Coelho
https://doi.org/10.14428/esann/2025.ES2025-195
Abstract:
Quantum algorithms present unique advantages over classical methods but remain constrained by the limited number of qubits in current quantum computers. This limitation hinders their effectiveness in machine learning tasks, such as image classification. Despite its relevance, the impact of these constraints on quantum machine learning remains underexplored. This study addresses this gap by analyzing preprocessing techniques for preparing images on quantum processors. We evaluated 10 dimensionality reduction methods across four standard datasets using three distinct quantum neural network architectures. The results provide valuable insights into optimizing classification efficiency under qubit constraints, paving the way for broader applications of quantum machine learning.
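The qubit constraint the study works around is easy to state: an n-qubit register amplitude-encodes only 2**n values, so any image must first be reduced to that length and L2-normalized to form a valid quantum state. The sketch below uses crude block averaging as a stand-in for the ten reduction methods the paper actually compares; the function name is ours.

```python
import numpy as np

def prepare_for_qubits(image, n_qubits):
    """Compress a flattened image to 2**n_qubits values and L2-normalize,
    as required for amplitude encoding on an n-qubit register."""
    target = 2 ** n_qubits
    x = np.asarray(image, dtype=float).ravel()
    # Crude dimensionality reduction: average consecutive pixel blocks.
    pad = (-len(x)) % target                 # pad so the length divides evenly
    x = np.pad(x, (0, pad)).reshape(target, -1).mean(axis=1)
    return x / np.linalg.norm(x)             # unit norm: valid amplitude vector

state = prepare_for_qubits(np.arange(28 * 28), n_qubits=4)   # MNIST-sized input
assert state.shape == (16,)
assert np.isclose(np.linalg.norm(state), 1.0)
```

A 28x28 image squeezed onto 4 qubits keeps only 16 of 784 values, which is why the choice of reduction method matters so much for downstream classification accuracy.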
JEPA for RL: Investigating Joint-Embedding Predictive Architectures for Reinforcement Learning
Tristan Kenneweg, Philip Kenneweg, Barbara Hammer
https://doi.org/10.14428/esann/2025.ES2025-19
Abstract:
Joint-Embedding Predictive Architectures (JEPA) have recently become popular as promising architectures for self-supervised learning. Vision transformers have been trained using JEPA to produce embeddings from images and videos, which have been shown to be highly suitable for downstream tasks like classification and segmentation. In this paper, we show how to adapt the JEPA architecture to reinforcement learning from images. We discuss model collapse, show how to prevent it, and provide exemplary data on the classical Cart Pole task.
A variational framework for local learning with probabilistic latent representations
Cabrel Teguemne Fokam, Khaleelulla Khan Nazeer, Christian Mayr, Anand Subramoney, David Kappel
https://doi.org/10.14428/esann/2025.ES2025-123
Abstract:
We introduce a novel method for distributed learning by dividing deep neural networks into blocks and incorporating feedback networks to propagate target information backwards, enabling auxiliary local losses. Forward and backward propagation operate in parallel with independent weights, addressing locking and weight transport problems. Our approach is rooted in a statistical view of training, treating block output activations as parameters of probability distributions to measure alignment between forward and backward passes. Error backpropagation is then performed locally within blocks, hence \emph{block-local learning}. Preliminary results across tasks and architectures showcase state-of-the-art performance, establishing a principled framework for asynchronous distributed learning.
Mask-Aware Cropping: Mitigating Mask Imbalance in Segmentation Tasks
Robin Ghyselinck, Valentin Delchevalerie, Benoit Frénay, Bruno Dumas
https://doi.org/10.14428/esann/2025.ES2025-3
Abstract:
Data imbalance can take various forms, such as uneven class distributions in the dataset. Solutions like data augmentation, sampling techniques, and weighted loss functions are commonly used to address this issue. However, in segmentation tasks, an additional type of imbalance may occur at the pixel level, with most pixels belonging to the background class. This work introduces Mask-Aware Cropping (MAC), a technique to reduce pixel-level imbalance by cropping image regions containing key information about the minority class.
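A minimal version of mask-aware cropping can be sketched by centering the crop window on a random foreground pixel of the mask, which guarantees the minority class appears in every crop. The exact cropping policy of MAC may differ; this is the simplest instance of the idea.

```python
import numpy as np

def mask_aware_crop(image, mask, size):
    """Crop a size×size window centered on a random foreground pixel of the mask,
    so the minority (non-background) class is guaranteed to appear in the crop."""
    ys, xs = np.nonzero(mask)
    if len(ys) == 0:                          # empty mask: fall back to a corner crop
        cy = cx = size // 2
    else:
        k = np.random.randint(len(ys))
        cy, cx = ys[k], xs[k]
    h, w = mask.shape
    y0 = np.clip(cy - size // 2, 0, h - size)  # keep the window inside the image
    x0 = np.clip(cx - size // 2, 0, w - size)
    return image[y0:y0 + size, x0:x0 + size], mask[y0:y0 + size, x0:x0 + size]

img = np.zeros((100, 100))
msk = np.zeros((100, 100), dtype=int)
msk[70:75, 80:85] = 1                          # tiny foreground object
crop_img, crop_msk = mask_aware_crop(img, msk, size=32)
assert crop_msk.sum() > 0                      # the object survives the crop
```

A naive random crop of this image would miss the 25-pixel object most of the time; the mask-aware version never does.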
Improving Robustness of Defect Detection models using Adversarial-based Data Augmentation
Daniel García, Aleix García, Diego Garcia-Perez, Ignacio Diaz-Blanco
https://doi.org/10.14428/esann/2025.ES2025-59
Abstract:
We propose an adversarial-based data augmentation method to improve the robustness of object detection models, specifically for industrial defect detection. Unlike prior approaches focused on classification or synthetic datasets, our method generates adversarial examples that target both classification and localization outputs. We further introduce controlled white noise to these examples, enhancing robustness against environmental variations. Empirical evaluation on a real-world dataset of defective laser welding images shows that our approach outperforms standard data augmentation and existing adversarial training methods, improving both model accuracy and resilience to diverse perturbations encountered in real-world settings.
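The recipe's two ingredients, an adversarial step plus controlled white noise, can be sketched with an FGSM-style perturbation on a linear scorer. The paper attacks full detection models through both classification and localization losses, so this scalar-loss skeleton (with our own function name and parameters) only illustrates the augmentation mechanics.

```python
import numpy as np

def fgsm_with_noise(x, w, y, eps=0.2, noise_std=0.05, rng=None):
    """Adversarial augmentation for a linear scorer f(x) = w·x, label y ∈ {-1, +1}:
    an FGSM step along the loss gradient sign, plus controlled white noise."""
    if rng is None:
        rng = np.random.default_rng()
    grad = -y * w                          # gradient of a margin loss wrt x
    x_adv = x + eps * np.sign(grad)        # FGSM: fixed-size step per coordinate
    return x_adv + rng.normal(scale=noise_std, size=x.shape)

rng = np.random.default_rng(4)
w = np.array([1.0, -2.0])
x, y = np.array([1.0, 0.0]), 1             # correctly classified: w·x = 1 > 0
x_adv = fgsm_with_noise(x, w, y, rng=rng)
assert y * (w @ x_adv) < y * (w @ x)       # the margin shrinks under the attack
```

Training on such samples exposes the model to both worst-case directions (the adversarial step) and benign environmental variation (the noise), which is the combination the paper evaluates.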
Multi-View Graph Neural Network for Image Segmentation: Intermediate vs Late Fusion
Elie Karam, Nisrine Jrad, Patty Coupeau, Doron Tobiano, Jean-Baptiste Fasquel, Fahed Abdallah
https://doi.org/10.14428/esann/2025.ES2025-62
Abstract:
Representing an image as a graph captures its spatial and contextual relationships effectively. Using Graph Neural Networks (GNNs) on graph-based images has considerably enhanced image segmentation.
This paper investigates Multi-View GNNs for image segmentation, comparing Intermediate and Late Fusion methods. Experiments show that Intermediate Fusion achieves high accuracy on synthetic data by integrating relational features upfront. On a real dataset, Late Fusion methods, particularly RVCons, outperform Intermediate Fusion by dynamically aggregating multi-view predictions. Indeed, Late Fusion effectively mitigates issues arising from view-specific noise and variance. The results underscore the complementary strengths of both fusion strategies.
Shallow convolution and attention-based models for micro-expression recognition
Tanmay Verlekar, Prateek Upadhya
https://doi.org/10.14428/esann/2025.ES2025-68
Abstract:
The use of deep learning models for micro-expression recognition is challenging because of the absence of large datasets. This paper proposes the construction of shallow models to address this problem. It explores block-wise 3D convolutions, 2D convolutions over frames, and 2D convolutional long short-term memory (ConvLSTM) over the video, and combines them with multi-headed self-attention to obtain three different models. The evaluation indicates that the proposed 2D ConvLSTM and attention-based model performs the best, beating the state-of-the-art by obtaining an accuracy of 74%. It also has a parameter count that is 10 times smaller than that of the state-of-the-art.
Solar Panel Segmentation on Aerial Images using Color and Elevation Information
Gerrit Luimstra, Kerstin Bunte
https://doi.org/10.14428/esann/2025.ES2025-81
Abstract:
The automatic detection of solar panels from aerial imagery is highly desirable for energy planning and urban development in the Netherlands, where such data has not been extensively explored.
To address this gap, we publicise a new annotated dataset, tailored for the Dutch landscape, and compare several state-of-the-art semantic segmentation models. While traditional approaches primarily utilize RGB data, we incorporate elevation and angle information in the model to analyse its potential benefit. We achieved satisfactory performance for automated solar panel detection and segmentation, with surface estimates diverging by only 1m within a 900m^2 area. The additional elevation information does not improve the performance significantly, but makes it more robust in certain cases.
Enhancing Computer Vision with Knowledge: a Rummikub Case Study
Simon Vandevelde, Laurent Mertens, Sverre Lauwers, Joost Vennekens
https://doi.org/10.14428/esann/2025.ES2025-98
Abstract:
Artificial Neural Networks excel at identifying individual components in an image. However, out of the box, they do not manage to correctly integrate and interpret these components as a whole. One way to alleviate this weakness is to expand the network with explicit knowledge and a separate reasoning component. In this paper, we evaluate an approach to this end, applied to solving the popular board game Rummikub. We demonstrate that, for this particular example, the added background knowledge is as valuable as two-thirds of the data set, and allows the training time to be halved.
Leveraging Segmentation Maps to improve Skin Lesion Classification
Simone Bonechi, Paolo Andreini, Fiamma Romagnoli
https://doi.org/10.14428/esann/2025.ES2025-106
Abstract:
We propose a novel approach for skin lesion classification that leverages a transformer architecture to integrate diverse clinical information (dermoscopic images, segmentation maps, and patient clinical information) for more accurate diagnosis. By incorporating semantic segmentation maps as input, we directly provide the model with border details critical for distinguishing between benign and malignant lesions. This integration improves classification performance compared to models that use only dermoscopic images or clinical data.
To the best of our knowledge, this is the first application of semantic segmentation maps to enhance skin lesion classification. Our experiments on the ISIC dataset yield promising results, highlighting the potential of combining advanced transformer models with multimodal data for improved dermatological diagnostics.
Semantic Segmentation for Waterbody Extraction Using Superpixels and Convolutional Neural Networks Classifier
Salim Iratni, Ferhat Attal, Yacine Amirat, Abdelgahni chibani, Moussa Diaf
https://doi.org/10.14428/esann/2025.ES2025-132
Abstract:
Waterbody extraction from satellite images is an important task for many applications, such as hydrological modeling, ecosystem monitoring and water reserve level tracking. To tackle this problem, several deep learning-based approaches have been proposed in the literature. However, these approaches have difficulty delineating water bodies due to their variations in color, size and shape. To overcome this limitation, a novel deep learning-based approach is proposed that leverages the power of Convolutional Neural Networks (CNNs) and the superpixel technique, using the Simple Linear Iterative Clustering (SLIC) algorithm. The proposed method involves an initial over-segmentation of the input satellite image into homogeneous zones using the SLIC algorithm. These zones are then further processed to extract Regions Of Interest (ROI), which are classified as either water or non-water using a pre-trained CNN model. Finally, each pixel within a homogeneous zone is assigned the predicted class of its associated ROI. The results obtained on the Gaofen Image Dataset show the effectiveness of the proposed approach, while highlighting its superiority over state-of-the-art (SOTA) approaches.
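The over-segment-then-classify pipeline the abstract describes can be sketched in a few lines. This is a toy illustration, not the authors' code: square blocks stand in for SLIC superpixels, and a plain callback stands in for the pre-trained CNN classifier.

```python
import numpy as np

def block_superpixels(image, block=4):
    """Toy stand-in for the over-segmentation step: partition the image
    into square blocks. (The paper uses the SLIC algorithm, which clusters
    pixels by colour and position; this only mimics its role.)"""
    h, w = image.shape[:2]
    rows = np.arange(h) // block
    cols = np.arange(w) // block
    n_cols = -(-w // block)  # ceil(w / block)
    return rows[:, None] * n_cols + cols[None, :]

def label_pixels(image, zones, classify):
    """Classify each homogeneous zone and assign its predicted class
    (e.g. water / non-water) to every pixel it contains."""
    out = np.zeros(zones.shape, dtype=int)
    for z in np.unique(zones):
        mask = zones == z
        out[mask] = classify(image[mask])  # the paper uses a CNN on an ROI here
    return out
```

For example, on an 8x8 image whose left half is dark "water" and right half is bright land, `label_pixels(img, block_superpixels(img), lambda px: int(px.mean() < 0.5))` labels the two halves consistently, zone by zone.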
A feedback-loop approach for galaxy physical properties estimation
Davide Zago, Giovanni Bonetta, Rossella Cancelliere, Mario Gai
https://doi.org/10.14428/esann/2025.ES2025-155
Abstract:
Ongoing and forthcoming surveys promise great advances in our understanding of the Universe's content and history, thanks to unprecedented improvements in the size and precision of observation datasets. On a cosmological scale, galaxy characteristics may be summarised by three main features, namely their redshift, stellar mass and star formation rate (SFR), which evolve throughout their lifetime. They are usually estimated from a set of photometric measurements mapping the galaxies' spectral emission. In this context, we propose a machine learning approach in which we first estimate redshift from the photometric data, and then merge it back with those data through a feedback loop for the subsequent estimation of the three desired parameters. In spite of its simplicity, our approach matches the performance of, and in some cases outperforms, significantly more complex previous tools that also exploit images. It achieves correct estimates on nearly all instances for redshift and stellar mass, decreasing to about 70% in the more difficult case of SFR estimation.
Generalized Stochastic Pooling
Francesco Landolfi, Davide Bacciu
https://doi.org/10.14428/esann/2025.ES2025-156
Abstract:
Pooling layers play a critical role in Convolutional Neural Networks by reducing spatial dimensions and enhancing translation invariance. While conventional methods like max pooling and average pooling are effective, they can respectively amplify noise or dilute important features. Stochastic pooling introduces probabilistic sampling to improve generalization but is susceptible to biases from outliers, often mimicking max pooling in such cases. To address these limitations, we propose a generalization of stochastic pooling that introduces a tunable parameter to control the balance between uniform sampling, stochastic pooling, and max pooling. Experiments on multiple datasets demonstrate that uniform sampling outperforms the biased one, achieving a favorable trade-off between regularization and performance.
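A minimal sketch of the interpolation the abstract describes, assuming the tunable parameter acts as an exponent on the activation-proportional sampling probabilities (the paper's exact parameterization may differ):

```python
import numpy as np

def generalized_stochastic_pool(window, beta, rng=None):
    """Sample one activation from a pooling window.

    beta = 0     -> uniform sampling over the window
    beta = 1     -> classic stochastic pooling (probabilities
                    proportional to the activations)
    beta -> inf  -> approaches max pooling
    """
    rng = np.random.default_rng() if rng is None else rng
    v = np.asarray(window, dtype=float).ravel()
    w = np.clip(v, 0.0, None) ** beta  # non-negative weights, as after a ReLU
    probs = w / w.sum() if w.sum() > 0 else np.full(v.size, 1.0 / v.size)
    return v[rng.choice(v.size, p=probs)]
```

With a window of `[1, 2, 3, 4]`, `beta=0` picks any entry with equal probability, while a large `beta` concentrates virtually all probability mass on the maximum.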
O-Net: a Brain Tumor segmentation architecture based on U-Net using alternated Pooling
Omar EL BARRAJ, Aya HAGE CHEHADE, Jean-Marie MARION, Mohamad Oueidat, Pierre CHAUVET, Nassib ABDALLAH
https://doi.org/10.14428/esann/2025.ES2025-168
Abstract:
Deep Learning (DL) offers promising tools for improving diagnostic processes in healthcare. Automated brain tumor segmentation using multi-parametric multimodal Magnetic Resonance Imaging (mpMRI) plays a vital role in the clinical management of brain tumor patients, enabling precise delineation of tumor regions. In this paper, we present O-Net, a deep learning model inspired by the U-Net architecture. O-Net employs an ensemble of two mirrored U-Nets with alternating pooling strategies (Max and Average Pooling) to enhance feature extraction. Our approach demonstrates the potential to improve segmentation accuracy using the BraTS 2021 training dataset and highlights the advantages of combining complementary pooling strategies for this task.
Hierarchical Residuals Exploit Brain-Inspired Compositionality
Francisco M. López, Jochen Triesch
https://doi.org/10.14428/esann/2025.ES2025-196
Abstract:
We present Hierarchical Residual Networks (HiResNets), deep convolutional neural networks with long-range residual connections between layers at different hierarchical levels. HiResNets draw inspiration from the organization of the mammalian brain by replicating the direct connections from subcortical areas to the entire cortical hierarchy. We show that the inclusion of hierarchical residuals in several architectures, including ResNets, results in a boost in accuracy and faster learning. A detailed analysis of our models reveals that they perform hierarchical compositionality by learning feature maps relative to the compressed representations provided by the skip connections.
Comparison of convolutional neural networks approaches applied to the diagnosis of Alzheimer’s disease
Leandro Coelho, Luiza Scapinello Aquino da Silva, Leonardo Alexandre de Geus, Viviana Cocco Mariani
https://doi.org/10.14428/esann/2025.ES2025-203
Abstract:
Alzheimer's disease (AD), a neurodegenerative disorder, progressively impairs memory and cognitive functions. Magnetic resonance imaging (MRI) is used as an AD diagnosis and progression monitoring method. A Convolutional Neural Network (CNN) is a data-driven deep learning model containing layers that transform the input data using convolution filters. The goal of this paper is to present an analysis of CNN architectures for classifying AD diagnoses using functional brain MRI scans from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. Results show that CNN variants such as InceptionV3 and InceptionResNetV2 are powerful computational tools for developing predictive neuroimaging biomarkers in AD diagnosis applications, with accuracy above 70%.
INAM: Image-Scale Neural Additive Models
Jana Hüls, Jan-Ole Perschewski, Sebastian Stober
https://doi.org/10.14428/esann/2025.ES2025-54
Abstract:
Neural Additive Models (NAMs) are inherently interpretable models that can be applied to tabular data. However, when applying these models to images, the value of a given pixel is not a meaningful feature for understanding the model. For that reason, we propose INAM (Image-Scale Neural Additive Models), a combination of trainable feature extractors and NAMs. We show that INAMs can be successfully applied to image data sets with low variability while allowing both global explanations of the models and data-point-specific explanations.
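The additive structure that makes NAM-style models interpretable can be shown in a short sketch (function names are ours, not the paper's): each extractor maps the image to one scalar feature, each feature passes through its own shape function, and the prediction is the plain sum of the contributions, so each one can be inspected individually.

```python
import numpy as np

def inam_predict(image, extractors, shape_fns):
    """NAM-style additive prediction over image-scale features.

    Returns both the prediction and the per-feature contributions;
    the latter are what make the model's decision inspectable."""
    feats = [f(image) for f in extractors]               # learned scalar features
    contributions = [g(z) for g, z in zip(shape_fns, feats)]  # one shape fn each
    return sum(contributions), contributions
```

For instance, with a brightness extractor and an edge-strength extractor, the two returned contributions separate how much each property drove the prediction, which is the kind of global and per-data-point explanation the abstract refers to.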
Foundation and Generative Models for Graphs
Foundation and Generative Models for Graphs
Davide Bacciu, Federico Errica, Stefano Moro, Luca Pasa, Davide Rigoni, Daniele Zambon
https://doi.org/10.14428/esann/2025.ES2025-24
Abstract:
The rapidly evolving field of machine learning for graph-structured data has gathered significant attention due to its ability to preserve critical information inherent in complex data structures. As a result, significant efforts have been dedicated to designing advanced architectures and foundational models optimized for graph-based operations. Research in this area explores methodologies for graph representation learning and graph generation, incorporating probabilistic models such as variational autoencoders and normalizing flows. Despite increasing interest from researchers as well as their efforts in solving graph-related problems, several issues and areas remain to be addressed to improve model generalization and reliability. This tutorial reviews foundational concepts and challenges in graph representation, structure learning, and graph generation, while also summarizing the contributions accepted for publication in the special session on this topic at the 33rd European Symposium on Artificial Neural Networks, Computational Intelligence, and Machine Learning (ESANN).
D4: Distance Diffusion for a Truly Equivariant Molecular Design
Samuel Cognolato, Davide Rigoni, Marco Ballarini, Luciano Serafini, Stefano Moro, Alessandro Sperduti
https://doi.org/10.14428/esann/2025.ES2025-80
Abstract:
Recent years have witnessed an increase in interest in leveraging generative models for de novo molecular design in drug discovery. Many State-of-the-Art (SotA) models incorporate the 3D structural information of the molecule, particularly atomic spatial coordinates. However, such approaches face challenges integrating SE(3) equivariance when trained on coordinates. This work explores the use of the distance matrix for molecular structures, which is natively SE(3) invariant, avoiding the issue altogether. Experimental evaluation shows that our proposed approach significantly improves upon MiDi, a SotA 3D molecule generator.
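The invariance property the abstract relies on is easy to check numerically: pairwise distances are unchanged by any rigid motion of the coordinates (in fact by reflections too, i.e. the full E(3) group). A small numpy check:

```python
import numpy as np

def distance_matrix(coords):
    """Pairwise Euclidean distances between atoms (an n x n matrix)."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

rng = np.random.default_rng(0)
coords = rng.normal(size=(5, 3))              # 5 "atoms" in 3D
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # a random orthogonal matrix
moved = coords @ q.T + np.array([1.0, -2.0, 0.5])  # rotate, then translate

# The distance matrix is identical before and after the rigid motion.
assert np.allclose(distance_matrix(coords), distance_matrix(moved))
```

A coordinate-based model, by contrast, sees `coords` and `moved` as different inputs and must learn or enforce the equivariance that the distance matrix provides for free.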
Encoding Graph Topology with Randomized Ising Models
Domenico Tortorella, Antonio Brau, Alessio Micheli
https://doi.org/10.14428/esann/2025.ES2025-109
Abstract:
The increasing popularity of deep learning on graphs has motivated the need for the co-design of hardware and graph representation models. We propose Randomized Ising Model (RIM), a reservoir computing model for encoding topological information of graph nodes, that is amenable to physical implementation via neuromorphic hardware. Our experiments demonstrate that RIM's node embeddings are able to provide sufficient topological information to be suitable to address node classification tasks, exhibiting an accuracy in line with Graph Echo State Networks.
Robustness in Protein-Protein Interaction Networks: A Link Prediction Approach
Alessandro Dipalma, Domenico Tortorella, Alessio Micheli
https://doi.org/10.14428/esann/2025.ES2025-153
Abstract:
Protein-protein interaction networks (PPINs) are indispensable in exploring complex biological systems, facilitating advancements in fields like drug discovery, protein function annotation, and disease mechanism elucidation. So far, predicting the dynamical properties of biochemical pathways has relied on costly numerical simulations. In this paper, we propose exploiting the topological information in PPINs to restate the problem of predicting pathway robustness as a link prediction task.
Our experiments show that the PPIN topology can supply information on inter-pathway relationships, significantly improving predictions of the graph-agnostic baseline relying only on protein sequence embeddings.
3-WL GNNs for Metric Learning on Graphs
Aldo Moscatelli, Maxime Berar, Pierre Héroux, Florian Yger, Sebastien Adam
https://doi.org/10.14428/esann/2025.ES2025-49
Abstract:
Since the advent of Graph Neural Networks (GNNs), many works have computed distances between graphs by embedding them in vector spaces using Message Passing GNNs (MPNNs). However, MPNNs are known for their lack of expressiveness as they are bounded by the first-order Weisfeiler-Lehman test. In this paper, we use higher-order GNNs to tackle the metric learning problem and show on benchmark datasets how they can improve performance by using a node-level strategy and the Wasserstein distance.
Towards Efficient Molecular Property Optimization with Graph Energy Based Models
Luca Miglior, Lorenzo Simone, Marco Podda, Davide Bacciu
https://doi.org/10.14428/esann/2025.ES2025-120
Abstract:
Optimizing chemical properties is a challenging task due to the vastness and complexity of chemical space. Here, we present a generative energy-based architecture for implicit chemical property optimization, designed to efficiently generate molecules that satisfy target properties without explicit conditional generation. We use Graph Energy Based Models and a training approach that does not require property labels. We validated our approach on well-established chemical benchmarks, showing superior results to state-of-the-art methods and demonstrating robustness and efficiency towards de novo drug design.
Machine learning and applied Artificial Intelligence in cognitive sciences and psychology
Machine Learning and applied Artificial Intelligence in cognitive sciences and psychology: a tutorial
Caroline König, Alfredo Vellido
https://doi.org/10.14428/esann/2025.ES2025-27
Abstract:
Artificial Intelligence (AI) both in general and in its current predominant version, mostly based on connectionist tenets, lives in the paradox of aiming to reproduce and simulate the workings of an immensely complex system, the biological brain, which are still to a large extent unknown. This gives us latitude for some interesting domain interplay: concepts from the cognitive sciences can be used to improve AI models, while AI can be used in data science mode to analyze cognitive processes in neuroscience, as well as brain pathologies from a medical standpoint.
Introducing Intrinsic Motivation in Elastic Decision Transformers
Leonardo Guiducci, Giovanna Maria Dimitri, Giulia Palma, Antonio Rizzo
https://doi.org/10.14428/esann/2025.ES2025-115
Abstract:
Effective decision-making is a key challenge in artificial intelligence, with Reinforcement Learning (RL) emerging as the main approach. However, RL often depends on complex reward functions, which are difficult to design. Intrinsic motivation, inspired by psychological concepts like curiosity, offers an alternative by generating agent-driven rewards to foster exploration. This paper introduces intrinsic motivation into the Elastic Decision Transformer (EDT) framework for Offline RL. By using an auxiliary intrinsic loss, we enhance representation learning without altering fixed reward signals. Experiments in locomotion tasks demonstrate improved performance, underscoring the potential of intrinsic motivation to advance RL in offline settings.
Direct versus intermediate multi-task transfer learning for dementia detection from unstructured conversations
Dan Kumpik, Yoav Ben-Shlomo, Elizabeth Coulthard, Alexander Hepburn, Raul Santos-Rodriguez
https://doi.org/10.14428/esann/2025.ES2025-210
Abstract:
Leveraging unstructured conversations for detecting early dementia may be possible through information transfer from more systematically constrained representations. To explore whether cross-domain (from semi-structured to unstructured) transfer learning improves dementia classification from conversational speech, we fine-tuned a BERT-family model using semi-structured narratives for which contextual information including speaker identity was available. We further fine-tuned on naturalistic conversations recorded in the home, but found that direct transfer from BERT to conversations was more effective for improving generalization. These findings show scope to directly leverage unstructured language samples for in-the-wild dementia detection.
The Regulatory Character of Boredom in AI - Towards a Self-Regulating System based on Spiking Neural Networks
Patrick Schoefer, James Danckert, Peter Stadler, Martin Bogdan
https://doi.org/10.14428/esann/2025.ES2025-5
Abstract:
Boredom is increasingly recognized as a functional emotion playing an important role in regulating human behavior. Despite continuous advances in the field of artificial intelligence, research on whether these models can enter emotional states such as boredom remains limited. However, emotions can be pivotal towards more human-like intelligence in AI. This paper transfers the regulatory function of boredom into a control loop modeled with spiking neural networks. Simulations demonstrate the successful replication of the regulatory mechanism of boredom based on simulated input. This work provides a foundation for future research and development towards a self-regulating system based on spiking neural networks capable of entering a state of boredom.
Sleep Staging with Gradient Boosting and DWT-PSD Features from EEG/EOG Signals
Luis Alfredo Moctezuma Pascual, Yoko Suzuki, Junya Furuki, Marta Molinas, Takashi Abe
https://doi.org/10.14428/esann/2025.ES2025-16
Abstract:
Advances in machine learning (ML) and deep learning (DL) have led to automated sleep staging approaches that achieve high accuracy but often require extensive computational resources and/or high-density electroencephalograms (EEG).
This paper presents a method for sleep staging using features extracted via the Discrete Wavelet Transform (DWT) and Power Spectral Density (PSD), followed by the Gradient Boosting (GB) classifier.
The study employs a private dataset and the sleep-EDF dataset, comprising EEG and electrooculograms (EOG) channels. The analysis includes configurations with varying numbers of subjects (75, 20, and 12), and the results demonstrate that the proposed method achieves competitive performance with existing approaches that use complex DL architectures, even with fewer subjects.
Feature importance analysis highlights the relevance of detail coefficients from the DWT and of PSD-based features from EEG signals.
The findings suggest that simplified methods using low-density EEG and EOG with well-selected features and GB classification can offer a viable alternative to DL approaches for sleep staging.
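The DWT detail-coefficient features mentioned in the abstract above can be illustrated with a one-level Haar transform. This is a minimal sketch under stated assumptions: the paper's actual wavelet family, decomposition depth, and PSD estimator are not given here, and `band_energy` is a hypothetical feature for illustration.

```python
import math

def haar_dwt(signal):
    """One-level Haar DWT: return (approximation, detail) coefficients
    for an even-length signal."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def band_energy(coeffs):
    """A simple energy feature computed from one coefficient band."""
    return sum(c * c for c in coeffs)

# toy 4-sample "EEG" segment, not real data
approx, detail = haar_dwt([1.0, 1.0, 2.0, 0.0])
print(band_energy(detail))  # energy of the detail band
```

In practice, features like these would be computed per epoch and per channel and passed to the gradient-boosting classifier.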
Multimodal Explainable Automated Diagnosis of Autistic Spectrum Disorder
Meryem Ben Yahia, Moncef Garouani, Julien Aligon
https://doi.org/10.14428/esann/2025.ES2025-72
Abstract:
Autism Spectrum Disorder (ASD) is a neurodevelopmental disorder characterized by symptoms affecting social interaction, communication, and behavior, with diagnosis complicated by significant individual variability and the absence of definitive biomarkers.
Current artificial intelligence methods have improved diagnostic accuracy, but their reliance on subjective assessments or single-modal data, coupled with their "black-box" nature, limits consistency and clinical applicability. Addressing current limitations, this paper introduces a multimodal ASD detection framework using deep neural networks (DNN) with explainable AI (xAI) to enhance model transparency.
Our model achieves a mean 5-fold cross-validation accuracy of 98.64%, surpassing existing methods and demonstrating potential for clinical dependability of ASD diagnoses. The source code is available at: https://github.com/mebenyahia/Multimodal-Explainable-Automated-Diagnosis-of-Autistic-Spectrum-Disorder
Altered emotion recognition from psychiatric patient profiles using Machine Learning
Pedro Jesús Copado, Martha Ivon Cardenas, Alfredo Vellido, Caroline König
https://doi.org/10.14428/esann/2025.ES2025-90
Abstract:
Mental illnesses influence the emotion recognition capabilities of those who suffer them. This article presents a study that involves the prediction, using multi-class classification models, of several human standard emotions from facial expressions. It is based on a publicly available dataset for emotion recognition that includes socio-demographic information and psychiatric profiles of individuals with mental illnesses. The study aims to explore how effectively these models can identify and classify emotions based on facial cues, considering the diverse psychiatric backgrounds of the subjects. It also aims to investigate to what extent the severity of the psychiatric condition affects the level of certainty of the predictions.
Screening Dyslexia for English: Impact of Heterogeneity in Demographic Variables
Enrique Romero, Luz Rello
https://doi.org/10.14428/esann/2025.ES2025-108
Abstract:
Dyslexia is a complex learning disorder, and its diagnosis can be challenging. Therefore, gathering knowledge about the impact of key attributes is crucial. This work focuses on demographic variables utilized in a pioneering computer-based linguistic game designed for the screening of dyslexia using Machine Learning. The analysis highlights the heterogeneity present in these variables and provides valuable insights for the development of future Machine Learning approaches. It emphasizes key contributions, such as strategies to mitigate biases and effectively address heterogeneity, suggesting the formation of subgroups based on interaction data collected.
Explainable deep learning reveals a behavioral strategy underlying human decisions in a spatial navigation task
Youri MARQUISE
https://doi.org/10.14428/esann/2025.ES2025-181
Abstract:
This paper uses a set of explainable AI (xAI) methods to study human behavior in a spatial navigation task. First, the locomotion and gaze dynamics of human subjects were reproduced in a virtual environment, and visual snapshots extracted from this simulation were used as a dataset. Second, the dataset was used to train a deep convolutional network to reproduce human decisions. Third, the network strategies used for image classification were analyzed using a combination of three xAI methods. Using this analysis, we discovered a specific oculomotor marker that indicated the behavioral strategy used by human participants in this task. We conclude that xAI is a promising approach to study human behavior in complex real-world tasks.
Reinforcement learning
TEA: Trajectory Encoding Augmentation for Robust and Transferable Policies in Offline Reinforcement Learning
Batıkan Bora Ormancı, Phillip Swazinna, Steffen Udluft, Thomas Runkler
https://doi.org/10.14428/esann/2025.ES2025-114
Abstract:
In this paper, we investigate offline reinforcement learning (RL) with the goal of training a single robust policy that generalizes effectively across environments with unseen dynamics. We propose a novel approach, Trajectory Encoding Augmentation (TEA), which extends the state space by integrating latent representations of environmental dynamics obtained from sequence encoders, such as autoencoders. Our findings show that incorporating these encodings with TEA improves the transferability of a single policy to novel environments with new dynamics, surpassing methods that rely solely on unmodified states. These results indicate that TEA captures critical, environment-specific characteristics, enabling RL agents to generalize effectively across dynamic conditions.
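The state-augmentation idea in TEA can be sketched as concatenating a latent code onto the raw state. This is an illustrative sketch only: `encode_trajectory` is a hypothetical stand-in, whereas the paper uses learned sequence encoders such as autoencoders.

```python
def encode_trajectory(trajectory):
    """Stand-in latent code summarizing recent dynamics:
    mean and spread of observed values (not a learned encoder)."""
    n = len(trajectory)
    mean = sum(trajectory) / n
    spread = max(trajectory) - min(trajectory)
    return [mean, spread]

def augment_state(state, trajectory):
    """Extend the state space with the trajectory encoding."""
    return list(state) + encode_trajectory(trajectory)

# toy state and toy trajectory of scalar observations
aug = augment_state([0.5, -0.2], [1.0, 2.0, 3.0])
print(aug)
```

The policy is then trained on the augmented state, so it can condition its actions on environment-specific dynamics.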
Robust Evolutionary Multi-Objective Neural Architecture Search for Reinforcement Learning (EMNAS-RL)
Nihal Acharya Adde, Alexandra Gianzina, Hanno Gottschalk, Andreas Ebert
https://doi.org/10.14428/esann/2025.ES2025-122
Abstract:
This paper introduces Evolutionary Multi-Objective Neural Architecture Search (EMNAS) for the first time to optimize neural network architectures in large-scale reinforcement learning for autonomous driving. EMNAS uses genetic algorithms to automate network design, tailored to enhance rewards and reduce model size without compromising performance. Additionally, parallelization techniques are employed to accelerate the search, and teacher-student methodologies are implemented to ensure scalable optimization. Experimental results demonstrate that tailored EMNAS outperforms manually designed models, achieving higher rewards with fewer parameters.
Is Q-learning an Ill-posed Problem?
Philipp Wissmann, Daniel Hein, Steffen Udluft, Thomas Runkler
https://doi.org/10.14428/esann/2025.ES2025-129
Abstract:
This paper investigates the instability of Q-learning in continuous environments, a challenge frequently encountered by practitioners. Traditionally, this instability is attributed to bootstrapping and regression model errors. Using a representative reinforcement learning benchmark, we systematically examine the effects of bootstrapping and model inaccuracies by incrementally eliminating these potential error sources. Our findings reveal that even in relatively simple benchmarks, the fundamental task of Q-learning -- iteratively learning a Q-function from policy-specific target values -- can be inherently ill-posed and prone to failure. These insights cast doubt on the reliability of Q-learning as a universal solution for reinforcement learning problems.
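The "policy-specific target values" the abstract refers to are the bootstrapped targets of the standard Q-learning update; a minimal tabular sketch (toy states, actions, and rewards, not the paper's benchmark):

```python
def q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9):
    """One tabular Q-learning step: regress Q(s, a) toward the
    bootstrapped target r + gamma * max_a' Q(s', a')."""
    target = r + gamma * max(Q[s_next].values())  # bootstrapped target value
    Q[s][a] += alpha * (target - Q[s][a])         # regression toward the target
    return Q[s][a]

# toy Q-table with two states and two actions
Q = {0: {"left": 0.0, "right": 0.0}, 1: {"left": 1.0, "right": 2.0}}
print(q_update(Q, s=0, a="right", r=1.0, s_next=1))
```

In continuous environments the table becomes a regression model, and it is this iterated fitting of the Q-function to moving targets whose well-posedness the paper questions.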
Reinforcement learning-based control system for biogas plants in laboratory scale
Alberto Meola, Sören Weinrich, Oliver Kiefner, Félix Delory
https://doi.org/10.14428/esann/2025.ES2025-187
Abstract:
Reinforcement learning techniques can be used to learn effective policies for complex tasks, but they are rarely applied to the control of biogas plants. While control of the anaerobic process is necessary for optimal plant operation, process complexity and instability prevent the use of advanced control mechanisms in industrial settings. In this study, a proximal policy optimization algorithm has been applied to the feeding schedule of a lab-scale biogas reactor for the conversion of biomethane to electrical energy depending on dynamic energy prices. The algorithm effectively optimizes feeding and selling strategies, outperforming traditional methods.
Data-Density guided Reinforcement Learning
Leon Lantz, Maximilian Schieder, Michel Tokic
https://doi.org/10.14428/esann/2025.ES2025-194
Abstract:
This paper investigates reinforcement learning by avoiding low-density state regions using modified reward functions. The approach leverages data-density models within the state space, enabling a custom reward function that penalizes transitions into sparse regions. Applied in the Pendulum environment, this method encourages exploration in well-sampled areas while avoiding less-explored states. Empirical results show that this method effectively balances reward optimization with state confidence, enabling robust policy learning in challenging environments.
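The density-penalized reward described above can be sketched with a simple histogram density over visited states. This is an assumption-laden illustration, not the authors' implementation: the density model, binning, and penalty weight are all hypothetical choices.

```python
import math
from collections import Counter

class DensityPenalty:
    """Histogram density over 1-D states; penalizes transitions
    into sparsely visited regions via a shaped reward."""

    def __init__(self, bin_width=0.5, weight=1.0):
        self.bin_width = bin_width
        self.weight = weight
        self.counts = Counter()
        self.total = 0

    def observe(self, state):
        """Record a visited state in the density model."""
        self.counts[math.floor(state / self.bin_width)] += 1
        self.total += 1

    def shaped_reward(self, reward, state):
        """Custom reward: subtract a penalty that grows as the
        visit density of the state's bin shrinks."""
        density = self.counts[math.floor(state / self.bin_width)] / max(self.total, 1)
        return reward - self.weight * (1.0 - density)

dp = DensityPenalty()
for s in [0.1, 0.2, 0.3, 2.0]:
    dp.observe(s)
print(dp.shaped_reward(1.0, state=0.2))  # well-sampled bin, small penalty
print(dp.shaped_reward(1.0, state=5.0))  # unvisited bin, full penalty
```

A real setup (e.g. for Pendulum) would use multi-dimensional states and a smoother density estimate, but the shaping principle is the same.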
Classification and statistical learning
Ranking the scores of algorithms with confidence
Adrien Foucart, Arthur Elskens, Christine Decaestecker
https://doi.org/10.14428/esann/2025.ES2025-39
Abstract:
Evaluating algorithms (particularly in the context of a competition) typically ends with a ranking from best to worst. While this ranking is sometimes accompanied by statistical significance tests on the assessment metrics, sometimes associated with confidence intervals, the ranks are usually presented as singular values. We argue that these ranks should themselves be accompanied by confidence intervals. We investigate different methods for computing such intervals, and measure their behaviour in simulated scenarios. Our results show that we can obtain robust confidence intervals for ranks using the Iman-Davenport test and the pairwise Wilcoxon signed-rank test with Holm's correction.
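Of the methods named above, Holm's correction is easy to show in isolation. A minimal sketch on hypothetical p-values (the Wilcoxon signed-rank tests producing them, and the Iman-Davenport test, are outside this snippet):

```python
def holm_correction(p_values, alpha=0.05):
    """Holm's step-down correction: sort p-values ascending and
    compare the i-th smallest against alpha / (m - i); stop at the
    first non-rejection. Returns a rejection flag per hypothesis."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            rejected[idx] = True
        else:
            break  # step-down: all larger p-values are also retained
    return rejected

# hypothetical p-values from four pairwise algorithm comparisons
pvals = [0.001, 0.04, 0.03, 0.005]
print(holm_correction(pvals))
```

The pattern of rejected/retained pairwise comparisons is what then bounds each algorithm's plausible rank interval.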
Reconciling Grokking with Statistical Learning Theory
Luca Oneto, Sandro Ridella, Andrea Coraddu, Davide Anguita
https://doi.org/10.14428/esann/2025.ES2025-10
Abstract:
In recent years, Artificial Intelligence, particularly Machine Learning (ML), has demonstrated remarkable success in addressing complex problems.
However, this progress has been accompanied by the emergence of unexpected, poorly understood, and elusive phenomena that characterize the behavior of machine intelligence and learning processes.
Researchers are often challenged to interpret these phenomena within the existing theoretical frameworks of ML, fostering a search for more complex or technical explanations.
One such phenomenon, known as "grokking", occurs when an ML model, after a long period of stagnant or even regressive learning, suddenly exhibits rapid and substantial improvement.
In this paper, we argue that grokking can be explained with the theoretical foundations of ML by leveraging Statistical Learning Theory, i.e., Algorithmic Stability theory.
We provide insights into how this theory can reconcile grokking with established principles of learning and generalization.
Coherence-based Sample Selection for Class-incremental Learning
Andrea Daou, Jean-Baptiste Pothin, Paul Honeine, Abdelaziz Bensrhair
https://doi.org/10.14428/esann/2025.ES2025-13
Abstract:
Class-Incremental Learning (Class-IL) is challenging as the model must adapt to new classes while retaining knowledge of old ones. To avoid catastrophic forgetting in knowledge distillation with a fixed-budget memory, exemplars from previously learned classes need to be stored. We propose a novel sample selection method based on the coherence measure to boost Class-IL performance. This is the first time coherence is investigated in a deep model, specifically for Class-IL. We define the coherence between two samples as a normalized inner product between their features from a deep feature extractor. Theoretical results and extensive experiments demonstrate the relevance of our approach.
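The coherence measure defined in this abstract, a normalized inner product between feature vectors, can be computed directly; a minimal sketch on plain vectors (the deep feature extractor producing them is assumed):

```python
import math

def coherence(u, v):
    """Normalized inner product (cosine similarity) between two
    feature vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(coherence([1.0, 0.0], [1.0, 0.0]))  # identical directions
print(coherence([1.0, 0.0], [0.0, 1.0]))  # orthogonal features
```

Exemplar selection would then rank a class's stored samples by such pairwise coherence scores in feature space.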
Hierarchical decomposition through "Mental Images" evaluation
Gianluca Coda, Massimo De Gregorio, Antonio Sorgente, Paolo Vanacore
https://doi.org/10.14428/esann/2025.ES2025-77
Abstract:
Hierarchical Decomposition Methods (HDMs) are techniques that handle multi-class classification problems by breaking them down into smaller, more manageable binary classification tasks, typically achieving better accuracy than flat classification approaches. In this work, a new HDM based on the exploitation of DRASiW "Mental Images" to construct the optimal tree model is presented. Through experiments performed on 26 standard datasets, we show how this approach improves the system classification performance with respect to the classical flat classification.
Investigating the Impact of Imbalanced Medical Data on the Performance of Self-Supervised Learning Approaches
Manuel Laufer, Felicitas Brokmann, Dominik Mairhöfer, Erhardt Barth, Thomas Martinetz
https://doi.org/10.14428/esann/2025.ES2025-127
Abstract:
In clinical practice, a substantial amount of data is generated on a daily basis for diagnostic purposes. Since expensive expert knowledge is required for data annotation in order to use this data for supervised learning, large amounts of data often remain unused. Self-supervised learning methods are well suited for using unlabeled data by pre-training networks to solve pretext tasks. As medical data follow an underlying uneven distribution of occurring diseases, they are inherently imbalanced. This could introduce an unwanted bias during pre-training, ultimately leading to negative consequences that may inhibit the benefits of fine-tuning. In this work we investigate the impact of the imbalance of 2D and 3D medical datasets used for pre-training, as well as the importance of the type and size of the dataset used for pre-training and the pretext task. Our findings indicate that the size of the dataset used for pre-training has greater impact on the final tasks than its balance.
Towards Learning Vector Quantization in the Setting of Homomorphic Encryption
Thomas Davies, Ronny Schubert, Mandy Lange-Geisler, Klaus Dohmen, Thomas Villmann
https://doi.org/10.14428/esann/2025.ES2025-47
Abstract:
With federated learning scenarios gaining popularity to outsource computationally heavy tasks or to increase the generalizability of machine learning models, there is also a rise of research in terms of the security and privacy of the respective data used for these tasks. While differential privacy is well studied for Learning Vector Quantization, we want to present steps towards Homomorphic Encryption. In this regard, we show theoretically how LVQ-1 can be adapted to be compatible with the TFHE encryption scheme and present experimental results.
Integrating Class Relation Knowledge in Probabilistic Learning Vector Quantization
Marika Kaden, Ronny Schubert, Tina Geweniger, Wieland Hermann, Thomas Villmann
https://doi.org/10.14428/esann/2025.ES2025-64
Abstract:
An interpretable approach to classification learning using cross-entropy loss is the Probabilistic Learning Vector Quantizer (PLVQ), a robust prototype-based classifier. We propose a variant of the PLVQ that allows the integration of domain knowledge. This strategy is becoming increasingly popular as a means of developing intelligent models that can enhance performance and gain acceptance from domain experts. In this paper, we put forth the idea of incorporating externally known class relations as supplementary information. We present theoretical aspects of the model and demonstrate its capabilities through numerical experiments.
Mitigating the Bias in Data for Fairness Using an Advanced Generalized Learning Vector Quantization Approach -- FA(IR)$^2$MA-GLVQ
Marika Kaden, Alexander Engelsberger, Ronny Schubert, Sofie Lövdal, Elina van den Brandhof, Michael Biehl, Thomas Villmann
https://doi.org/10.14428/esann/2025.ES2025-65
Abstract:
We propose a bias detection and mitigation scheme for data in the context of classification tasks, based on learning vector quantizers (LVQ) as classifiers. For this purpose, a generalized LVQ endowed with an advanced matrix adaptation scheme is used for bias detection. The bias is then removed from the data by applying a nullspace projection based on the adjusted matrix. The usefulness of the approach is demonstrated and illustrated on two real-world datasets.
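The nullspace projection step can be illustrated with a small sketch (a generic illustration, not the authors' implementation): given an adapted matrix whose rows are assumed to span the bias-relevant directions, each sample is projected onto the orthogonal complement of that row space.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))       # toy data: 100 samples, 5 features
Omega = rng.normal(size=(2, 5))     # stand-in for the adjusted matrix; its rows
                                    # are assumed to span the bias directions

# Orthonormal basis of Omega's row space via SVD
_, _, Vt = np.linalg.svd(Omega, full_matrices=False)

# Project each sample onto the nullspace of Omega: x -> (I - V^T V) x
P = np.eye(5) - Vt.T @ Vt
X_debiased = X @ P                  # P is symmetric, so X @ P projects rows

# After projection, the components along the bias directions vanish
print(np.abs(X_debiased @ Omega.T).max())
```

Any classifier trained on `X_debiased` then no longer sees the directions captured by `Omega`.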
Multiclass Adaptive Subspace Learning
Peter Preinesberger, Maximilian Münch, Frank-Michael Schleif
https://doi.org/10.14428/esann/2025.ES2025-4
Abstract:
In modern data analysis, there is an increasing trend towards the integration of information across diverse input formats and perspectives. When data is not available in large quantities, deep learning is generally impractical. The recently introduced Adaptive Subspace Kernel Fusion (ASKF) technique provides an efficient solution for binary classification, facilitating the effective integration of diverse views throughout the learning process. In this paper, we extend ASKF by employing a vector-labeled multi-class model, eliminating the need for the multiple individual models typically required in conventional one-vs-rest or one-vs-one approaches. We also evaluate the effect of using GPU-based numerical solvers, optimizing our problem formulation and the generated code for better efficiency. The approach is evaluated on various kernel functions, highlighting our method's ability to deal robustly with multi-view data.
The Role of the Learning Rate in Layered Neural Networks with ReLU Activation Function
Otavio Citton, Frederieke Richert, Michael Biehl
https://doi.org/10.14428/esann/2025.ES2025-94
Abstract:
Using the statistical physics framework, we study the online learning dynamics in a particular case of shallow feed-forward neural networks with ReLU activation. By expanding the activation in terms of Hermite polynomials we derive analytical results for the evolution of order parameters for any learning rate. Moreover, we compare our results with online gradient descent simulations and show how our method describes the typical learning curves. We also present results on how the learning rate affects the overall behavior of the network and its equilibria, showing the different learning regimes and critical values of the learning rate.
Growth strategies for arbitrary DAG neural architectures
Stella Douka, Manon Verbockhaven, Théo Rudkiewicz, Stéphane Rivaud, François P. Landes, Sylvain Chevallier, Guillaume Charpiat
https://doi.org/10.14428/esann/2025.ES2025-112
Abstract:
Deep learning has shown impressive results obtained at the cost of training huge neural networks. However, the larger the architecture, the higher the computational, financial, and environmental costs during training and inference. We aim at reducing both training and inference durations. We focus on Neural Architecture Growth, which can increase the size of a small model when needed, directly during training using information from the backpropagation. We expand existing work and freely grow neural networks in the form of any Directed Acyclic Graph by reducing expressivity bottlenecks in the architecture. We explore strategies to reduce excessive computations and steer network growth toward more parameter-efficient architectures.
Making Convolutional Neural Networks Energy-Efficient: An Introduction
Noémie Draguet, Benoit Frénay
https://doi.org/10.14428/esann/2025.ES2025-140
Abstract:
As convolutional neural networks (CNNs) have become mainstream for object recognition and image classification, the environmental impact caused by their high energy consumption (EC) is non-negligible. This paper examines techniques that can reduce the EC of CNNs. It also highlights the inconsistency of the metrics used for estimating or measuring EC, which reduces the comparability of these techniques. This review aims to shed light on the current situation and to provide a basis for future research in green machine learning.
Membership Inference Attack in Random Forests
Fatemeh Akbarian, Amir Aminifar
https://doi.org/10.14428/esann/2025.ES2025-184
Abstract:
Machine Learning (ML) offers many opportunities, but its reliance on personal data raises privacy concerns. One such example is the Membership Inference Attack (MIA), which aims to determine whether a specific data point was part of a model’s training dataset. In this paper, we investigate this attack on Random Forests (RFs) and propose a method to quantify their vulnerability to MIA. We also demonstrate that in collaborative setups like federated learning, a client with access to the model and part of the training dataset can mount an MIA against other clients’ training data. The effectiveness of our method is validated through experiments.
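A minimal confidence-thresholding sketch of a membership inference attack against a random forest (a generic illustration on synthetic data, not the paper's quantification method):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10))
y = (X[:, 0] + 0.3 * rng.normal(size=400) > 0).astype(int)
X_in, y_in = X[:200], y[:200]       # members (training data)
X_out, y_out = X[200:], y[200:]     # non-members

rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_in, y_in)

# Attack signal: the model's confidence in the true label, which an
# overfit model tends to inflate for points it was trained on
conf_in = rf.predict_proba(X_in)[np.arange(200), y_in]
conf_out = rf.predict_proba(X_out)[np.arange(200), y_out]

# Threshold attack: flag "member" when confidence exceeds tau
tau = 0.9
tpr = (conf_in > tau).mean()        # members correctly flagged
fpr = (conf_out > tau).mean()       # non-members wrongly flagged
print(f"TPR={tpr:.2f}  FPR={fpr:.2f}")
```

The gap between member and non-member confidence is exactly what such attacks exploit, and what a vulnerability score has to measure.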
Proactive Privacy Risk Assessment for Android Applications: A Machine Learning Based-Approach
Narjes Doggaz, Aissa Trad, Hella Kaffel Ben Ayed
https://doi.org/10.14428/esann/2025.ES2025-170
Abstract:
Mobile devices have become ubiquitous, collecting vast amounts of personal data through granted permissions. Privacy concerns arise when personal information is leaked to third parties without the user’s awareness or consent. To address this issue, we propose a proactive approach based on a Machine Learning model to predict privacy risk scores for Android applications. These scores are based on the requested permissions and allow the users to be aware of the potential leakage of sensitive information before installing an application. Experimental evaluations demonstrate the competitive performance of our model against existing state-of-the-art methods.
A new approach to multilayer SVMs
Lluis Belanche
https://doi.org/10.14428/esann/2025.ES2025-183
Abstract:
Despite the traditional high performance of Support Vector Machines (SVMs) in classification and regression tasks, modern data loads have introduced new efficiency challenges, rendering SVMs incapable of handling non-linear problems when the dataset size is large. On the other hand, neural architectures have shown excellent results when dealing with complex patterns in data. By leveraging kernel approximation techniques and linear optimizations, this work introduces a multilayer SVM architecture, presenting competitive performance against classical SVMs.
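The kernel-approximation idea can be sketched with random Fourier features feeding a linear SVM (an illustrative scikit-learn stand-in, not the paper's multilayer architecture): the explicit feature map makes a non-linear problem tractable for a linear solver, and stacking such maps is the route to a multilayer SVM.

```python
import numpy as np
from sklearn.kernel_approximation import RBFSampler
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
y = (X[:, 0] ** 2 + X[:, 1] ** 2 > 1).astype(int)   # non-linearly separable ring

# Random Fourier features approximate the RBF kernel, turning the
# non-linear problem into one a fast linear SVM can handle
model = make_pipeline(
    RBFSampler(gamma=1.0, n_components=200, random_state=0),
    LinearSVC(max_iter=5000),
)
model.fit(X, y)
print(f"training accuracy: {model.score(X, y):.2f}")
```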
Evolutionary Fault Localization Based on the Diversity of Suspiciousness Values
Willian Ferreira, Plinio S. Leitao-Junior, Deuslirio Silva-Junior, Rachel Harrison
https://doi.org/10.14428/esann/2025.ES2025-124
Abstract:
Context. Fault localization (FL) is a software lifecycle activity and its automation is a challenge for researchers and practitioners. Method. The study focuses on evolutionary fault localization and introduces a novel Genetic Programming (GP) approach that evolves FL heuristics based on the diversity of the suspiciousness score of program statements – a score to grade how faulty a statement is. Experimental analysis. The approach was evaluated against baselines, which include the canonical GP, in benchmarks with real programs and real faults. Conclusion. The results showed the competitiveness of the approach through evaluation metrics commonly used in the research field.
Time series
A Model of Memristive Nanowire Neuron for Recurrent Neural Networks
Veronica Pistolesi, Andrea Ceni, Gianluca Milano, Carlo Ricciardi, Claudio Gallicchio
https://doi.org/10.14428/esann/2025.ES2025-104
Abstract:
We propose a novel neural processing unit for artificial neural networks, inspired by the memristive properties of nanowires. Our analysis, framed within the Reservoir Computing paradigm, demonstrates the stability, short-term memory, and fading memory capabilities of the unit. Further experiments on assemblies of nanowire-inspired neurons show promising results in time-series classification tasks.
Our introduced approach bridges analog neuromorphic hardware and AI applications, enabling efficient time series processing.
Performance monitoring and wear comprehension through Neural Network
Thomas Binet, Hanane Azzag, Mustapha Lebbah, Jérôme Lacaille
https://doi.org/10.14428/esann/2025.ES2025-110
Abstract:
In this paper, we present a novel approach to modeling the wear of complex dynamic systems, exemplified by aircraft engines, through the construction of a structured latent space. Unlike traditional methods, our model does not rely on explicit wear data but instead leverages supervised training to minimize the error on observable system parameters. Beyond wear forecasting, this work offers a foundation for unsupervised diagnosis, risk prevention, and the quantification of repair impacts.
On Domain Generalization for Human Activity Recognition with Mix-Based Methods
Otávio Napoli, Edson Borin
https://doi.org/10.14428/esann/2025.ES2025-135
Abstract:
Domain generalization (DG) is a challenging problem that involves adapting a model trained on source domains to an unseen target domain. In human activity recognition (HAR), domain shifts often arise from differences in sensor placement, device specifications, or environmental factors, making generalization difficult. In this work, we investigate the effectiveness of mix-based methods like MixStyle and Exact Feature Distribution Mixing (EFDM) when integrated into state-of-the-art models like ResNet and TS2Vec for DG in HAR tasks, leveraging the DAGHAR benchmark. Our results demonstrate that MixStyle significantly outperforms both EFDM and Empirical Risk Minimization approaches, highlighting its effectiveness in addressing domain shifts.
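MixStyle is a published augmentation that mixes instance-level feature statistics across a batch; a minimal NumPy sketch (shapes and hyperparameters are illustrative, not the paper's configuration):

```python
import numpy as np

def mixstyle(x, alpha=0.1, rng=None):
    """Minimal MixStyle for a batch of 1-D feature maps of shape (B, C, T):
    normalize each instance's channel statistics, then re-apply a convex
    mixture of its own statistics and those of a shuffled instance."""
    rng = rng or np.random.default_rng(0)
    mu = x.mean(axis=2, keepdims=True)               # (B, C, 1)
    sig = x.std(axis=2, keepdims=True) + 1e-6
    x_norm = (x - mu) / sig

    lam = rng.beta(alpha, alpha, size=(x.shape[0], 1, 1))
    perm = rng.permutation(x.shape[0])
    mu_mix = lam * mu + (1 - lam) * mu[perm]
    sig_mix = lam * sig + (1 - lam) * sig[perm]
    return x_norm * sig_mix + mu_mix

x = np.random.default_rng(1).normal(size=(8, 4, 16))  # e.g. sensor features
print(mixstyle(x).shape)  # output keeps the input shape
```

Because only the first- and second-order statistics are perturbed, the semantic content of each instance is preserved while its "style" (here, the sensor-specific signature) is diversified.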
Investigating four deep learning approaches as candidates for unified models in time series forecasting and event prediction: application in anesthesia training
Quentin Victor, Ianis Clavier, Hugo Boisaubert, Fabien Picarougne, Corinne Lejus-Bourdeau, Christine Sinoquet
https://doi.org/10.14428/esann/2025.ES2025-158
Abstract:
This paper explores deep learning architectures for the purposes of unsupervised representation learning of hybrid asynchronous data, and joint prediction tasks. We aim to forecast short-term multivariate time series contextualized by events and to predict events contextualized by time series. Our proof-of-concept examines a real-world case of digitally assisted training in anesthesia. We evaluate four different architectures, using two strategies to integrate both time series and event sequences in the models. We assess the prediction quality of the models, and demonstrate that only one of the four architectures achieves performance outcomes compatible with our application objective.
Generate Polyphonic Music with Multivariate Masked Autoregressive Flow
Massimiliano Sirgiovanni, Daniele Castellana
https://doi.org/10.14428/esann/2025.ES2025-86
Abstract:
This paper explores the Masked Autoregressive Flow (MAF) model in the context of music generation. The choice of MAF was driven by its promising ability to handle temporal data with strong dependencies between variables. Unfortunately, MAF is suitable only for univariate time series and therefore cannot be directly applied to generate polyphonic melodies. We propose three different approaches to extend the MAF architecture to handle multivariate time series and we test them on the Lakh Pianoroll Dataset. The conducted experiments demonstrate good accuracy and the ability to generate pleasant and original melodies, highlighting the innovative potential of this interdisciplinary convergence.
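The core MAF transform, and why density evaluation is parallel while sampling is sequential, can be sketched with linear autoregressive conditioners (a toy stand-in for the paper's learned models):

```python
import numpy as np

rng = np.random.default_rng(0)
D = 6
# Linear autoregressive conditioners: strictly lower-triangular masks
# guarantee that mu_i and alpha_i depend only on x_1 .. x_{i-1}
W_mu = np.tril(rng.normal(size=(D, D)), k=-1)
W_a = np.tril(0.1 * rng.normal(size=(D, D)), k=-1)

def forward(x):
    """Density direction x -> u: computable in a single parallel pass."""
    return (x - W_mu @ x) * np.exp(-(W_a @ x))

def inverse(u):
    """Sampling direction u -> x: inherently sequential."""
    x = np.zeros_like(u)
    for i in range(D):
        x[i] = u[i] * np.exp(W_a[i] @ x) + W_mu[i] @ x
    return x

x = rng.normal(size=D)
print(np.allclose(inverse(forward(x)), x))  # True: the flow round-trips
```

Extending this to polyphony means each time step carries a vector of notes rather than a scalar, which is exactly the multivariate gap the paper addresses.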
Quantum, Quantum Inspired and Hybrid Machine Learning
Enhancing Machine Learning with Quantum Methods
M. Lautaro Hickmann, Markus Lange, Hans-Martin Rieser
https://doi.org/10.14428/esann/2025.ES2025-29
Abstract:
Quantum physics offers a new paradigm that promises to make certain computations faster and more efficient. The recent progress of quantum computers allows for more complex applications, which has led to rising interest in transferring machine learning methods to quantum hardware for practical use. However, the development of quantum computers is still in its early stages, and these approaches currently require synergy with classical computers. We present some methods where this quantum-classical interplay is used to enhance machine learning approaches.
Quantum Annealing based Feature Selection
Daniel Pranjic, Bharadwaj Chowdary Mummaneni, Christian Tutschku
https://doi.org/10.14428/esann/2025.ES2025-162
Abstract:
Feature selection is crucial for enhancing the accuracy and efficiency of machine learning models. Calculating the optimal feature set for maximum mutual information (MI) and conditional mutual information (CMI) remains computationally intractable for large datasets on classical computers, even with approximation methods. This study employs a Mutual Information Quadratic Unconstrained Binary Optimization (MIQUBO) formulation, enabling its solution on a quantum annealer. To showcase its real-world applicability, we apply MIQUBO to forecasting the price of used excavators. Our results demonstrate that the MIQUBO approach improves the predictions of machine learning models on datasets with a smaller MI concentration.
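The QUBO form of MI-based feature selection can be sketched as follows (illustrative random weights stand in for the MI estimates; a real pipeline would estimate MI from data and submit Q to an annealer):

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)
n = 6
relevance = rng.uniform(0.1, 1.0, size=n)                      # stand-in for MI(f_i; y)
redundancy = np.triu(rng.uniform(0.0, 0.3, size=(n, n)), k=1)  # stand-in for MI(f_i; f_j)

# QUBO: minimize x^T Q x over x in {0,1}^n; the diagonal rewards
# relevant features, off-diagonal terms penalize redundant pairs
Q = np.diag(-relevance) + redundancy

def energy(x):
    return x @ Q @ x

# A quantum annealer samples low-energy states of Q; for n = 6 the
# ground state can be brute-forced classically as a sanity check
best = min((np.array(b) for b in itertools.product([0, 1], repeat=n)),
           key=energy)
print("selected features:", np.flatnonzero(best), "energy:", energy(best))
```

The brute-force loop is what becomes intractable at scale (2^n states), which is precisely the motivation for annealing hardware.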
Encoding hyperspectral data with low-bond dimension quantum tensor networks
Fabian Fischbach, Hans-Martin Rieser, Oliver Sefrin
https://doi.org/10.14428/esann/2025.ES2025-91
Abstract:
Encoding data on a quantum computer poses a major challenge for data-intensive quantum applications like machine learning. In particular, data with a complex internal structure, such as emission spectra, needs to be adapted to reduce the encoding effort of quantum circuits. We empirically investigate the influence of compression on the encoding of hyperspectral data into quantum states, in order to make the encoding more efficient. To this end, we assess the effect of approximating states by low-bond-dimension matrix product states fed into a variational quantum classifier on the public Pavia University benchmark dataset.
CompactifAI: Extreme Compression of Large Language Models using Quantum-Inspired Tensor Networks
Saeed Jahromi, Uygar Kurt, Sukhi Singh, David Montero, Borja Aizpurua Altuna, Román Orús Lacort
https://doi.org/10.14428/esann/2025.ES2025-8
Abstract:
Large Language Models (LLMs) like ChatGPT and LlaMA offer immense opportunities but face challenges due to their vast size, leading to high training/inference costs and energy demands. Traditional compression methods focus on reducing neurons or precision. We introduce \textit{CompactifAI}, a novel LLM compression using quantum-inspired Tensor Networks to compress the model's correlation space. Testing on LlaMA-2 7B showed a 93% memory and 70% parameter reduction, with minimal accuracy loss (2-3%) and significant training (50%) and inference (25%) speedups.
Quantum Tensor Network Learning with DMRG
Gustav Jäger, Martin B. Plenio, Hans-Martin Rieser
https://doi.org/10.14428/esann/2025.ES2025-157
Abstract:
Tensor Networks are a relatively new machine learning approach. The architectures proposed initially are inspired by approaches from quantum many-body physics simulations. One common layout is the matrix product state (MPS), also known as a tensor train, optimized with gradient descent techniques. We introduce a global normalization condition so that the MPS represents a quantum state. We investigate two optimization methods that find the locally optimal tensors and compare their effectiveness. One is based on gradient descent and the other on an adaptation of DMRG.
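The global normalization condition can be illustrated on a small random tensor train (a brute-force toy for checking the norm; the paper's methods use local tensor updates instead):

```python
from itertools import product

import numpy as np

rng = np.random.default_rng(0)
L, d, chi = 5, 2, 4   # sites, physical dimension, bond dimension

# Random tensor-train cores A[i] of shape (bond_left, d, bond_right)
cores = [rng.normal(size=(1 if i == 0 else chi, d,
                          1 if i == L - 1 else chi)) for i in range(L)]

def amplitude(cores, bits):
    """Contract the train for one computational basis state |b1..bL>."""
    v = cores[0][:, bits[0], :]
    for A, b in zip(cores[1:], bits[1:]):
        v = v @ A[:, b, :]
    return v.item()

# Global norm <psi|psi>; brute force here (2^L terms), whereas a
# transfer-matrix contraction would compute the same in O(L) steps
norm2 = sum(amplitude(cores, b) ** 2 for b in product(range(d), repeat=L))

# Rescale one core so the MPS represents a normalized quantum state
cores[0] = cores[0] / np.sqrt(norm2)
norm2_after = sum(amplitude(cores, b) ** 2 for b in product(range(d), repeat=L))
print(round(norm2_after, 8))  # 1.0
```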
Expressivity vs. Generalization in Quantum Kernel Methods
Markus Gross, Markus Lange, Bogusz Bujnowski, Hans-Martin Rieser
https://doi.org/10.14428/esann/2025.ES2025-152
Abstract:
We analytically and numerically investigate the expressivity and generalization ability of quantum kernel models. We consider prototypical parallel encoding strategies and show that they give rise to simple universal forms of quantum kernels. By using qubit-dependent data re-scaling schemes, we can exponentially vary the spectral content of the kernel and thereby control its simplicity bias. We obtain analytical results on the kernel eigenspectrum and connect it to theories of kernel generalization, which allow us to study the influence of expressivity on generalization error.
Domain adaptation and federated learning
RAM: Retrieval Augmented Modelling for Tabular In-Context Few-Shot Domain Adaptation
Oleh Kostromin, Felix Kossak, Michael Zwick
https://doi.org/10.14428/esann/2025.ES2025-15
Abstract:
Transformer architectures have shown great success in natural language processing, sparking interest in their applications on tabular data. However, the potential of using transformer-like architectures for in-context domain adaptation in tabular settings remains underexplored. We introduce Retrieval-Augmented Modelling (RAM), a compact attention-based architecture specifically designed for this task. RAM utilises a Domain-Aligned Memory training strategy, which ensures that it always processes the data from the same domain at each training step, allowing the model to focus on domain-specific patterns. Evaluated on synthetic data simulating domain shifts, RAM outperforms traditional machine learning models, effectively adapting to unseen domains.
Adversarial Attacks for Drift Detection
Fabian Hinder, Valerie Vaquet, Barbara Hammer
https://doi.org/10.14428/esann/2025.ES2025-82
Abstract:
Concept drift refers to the change of data distributions over time. While drift poses a challenge for learning models, requiring their continual adaptation, it is also relevant in system monitoring to detect malfunctions, system failures, and unexpected behavior. In the latter case, the robust and reliable detection of drifts is imperative. This work studies the shortcomings of commonly used drift detection schemes. We show that they are prone to adversarial attacks, i.e., streams with undetected drift. In particular, we give necessary and sufficient conditions for their existence, provide methods for their construction, and demonstrate this behavior in experiments.
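As a toy illustration of the phenomenon (not the paper's construction), the sketch below pairs a simple two-window mean-difference detector with an adversarial stream whose distribution drifts at every time step yet never triggers an alarm, because every even-length window averages the two phases out:

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_drift_detector(stream, w=50, threshold=0.5):
    """Flag drift when two consecutive windows differ in mean by > threshold."""
    alarms = []
    for t in range(2 * w, len(stream), w):
        ref = stream[t - 2 * w : t - w]
        cur = stream[t - w : t]
        if abs(ref.mean() - cur.mean()) > threshold:
            alarms.append(t)
    return alarms

# Adversarial stream: the mean alternates between +2 and -2 at every
# step, so the distribution drifts constantly, but any window of even
# length contains both phases equally and the statistic stays near 0.
t = np.arange(10_000)
adversarial = rng.normal(loc=2.0 * (-1) ** t, scale=0.1)
print(mean_drift_detector(adversarial))  # [] -- undetected drift

# Control: a single abrupt mean shift is caught immediately.
abrupt = np.concatenate([rng.normal(0.0, 0.1, 5000), rng.normal(2.0, 0.1, 5000)])
print(mean_drift_detector(abrupt))  # fires just after the shift at t=5000
```

The adversarial stream exploits the detector's fixed windowing, which is exactly the kind of blind spot the paper characterizes.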
Conceptualizing Concept Drift
Isaac Roberts, Fabian Hinder, Valerie Vaquet, Alexander Schulz, Barbara Hammer
https://doi.org/10.14428/esann/2025.ES2025-117
Abstract:
Concept drift refers to the phenomenon that the underlying data distribution changes over time. While detection methods and model adjustment methods exist, a proper explanation of drift in high-dimensional settings remains largely unsolved. This problem is crucial since it enables an understanding of the most prominent drift characteristics. In this work, we propose to explain concept drift of high-dimensional data objects by means of concept activation vectors, which give rise to local, phase, and a novel global explanation called the $Concept^2$ Drift Distribution.
Resource-Aware Cooperation in Federated Learning
Manuel Röder, Fabian Geiger, Frank-Michael Schleif
https://doi.org/10.14428/esann/2025.ES2025-149
Abstract:
We present a novel Federated Learning framework, FedT4T, that systematically evaluates utility-driven client strategies under resource constraints. Recognizing the significant challenges in practical distributed learning environments, such as limited resources and non-cooperative behaviors, we model client interactions using the Iterated Prisoner’s Dilemma. Our framework enables clients to adapt their decision rules based on prior interactions and available resources, optimizing both individual utility and collective contribution to solve a global learning task. We apply FedT4T to a Federated Learning benchmark classification task and explore the dynamics of cooperation between clients driven by common strategies from cooperation theory under the impact of varying resource availability. The code is publicly available at https://github.com/cairo-thws/FedT4T.
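A hypothetical sketch of such a resource-aware tit-for-tat decision rule, with all function names, parameters, and values illustrative rather than taken from the paper:

```python
def tit_for_tat(history):
    """Cooperate on the first round, then mirror the peer's previous move."""
    return True if not history else history[-1]

def resource_aware_decision(history, budget, cost=1.0):
    """Cooperate (share a model update) only if tit-for-tat says so and
    enough of the resource budget remains to pay the communication cost."""
    return budget >= cost and tit_for_tat(history)

# One client iterating against a peer who defects at round 3; the
# client also runs out of budget, forcing defection at round 4.
peer_moves = [True, True, False, True, True]
budget = 3.0
decisions = []
for rnd, peer in enumerate(peer_moves):
    act = resource_aware_decision(peer_moves[:rnd], budget)
    if act:
        budget -= 1.0
    decisions.append(act)
print(decisions)  # [True, True, True, False, False]
```

The first defection mirrors the peer (tit-for-tat); the second is purely resource-driven, illustrating how the two constraints interact.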
SecureBFL: a Blockchain-enhanced federated learning architecture with MPC
Tanguy Vansnick, Leandro Collier, Saïd Mahmoudi
https://doi.org/10.14428/esann/2025.ES2025-160
Abstract:
The increasing demand for data in machine learning raises significant privacy concerns. Federated Learning (FL) enables multiple entities to train models collaboratively without sharing raw data. However, centralized FL (CFL) relies on a central server, making it vulnerable to poisoning attacks and single points of failure (SPOF). Decentralized FL (DFL) addresses these issues by removing the central server. This paper proposes a novel DFL architecture integrating blockchain for resisting attacks and Multi-Party Computation (MPC) for secure model parameter transfer. This architecture enhances security and confidentiality in collaborative learning without compromising result quality.
Explainable AI and representation learning
Encoding Higher-Order Logic in Spatio-Temporal Hypergraphs for Neuro-Symbolic Learning
Bikram Pratim BHUYAN, Amylia AIT SAADI, Amar RAMDANE-CHERIF
https://doi.org/10.14428/esann/2025.ES2025-78
Abstract:
This work integrates Monadic Second-Order (MSO) logic into Spatio-Temporal Heterogeneous Hypergraphs (STHH) to advance Neuro-Symbolic AI. By bridging higher-order symbolic logic with neural computations, STHH offers a novel framework for knowledge representation and learning. Evaluations on a custom agricultural dataset show that the proposed STHH outperforms state-of-the-art hypergraph models across F1-score, accuracy, and AUC metrics. Despite challenges such as limited standardized datasets, this study underscores the potential of integrating higher-order symbolic logic into neural systems to achieve robust and interpretable AI.
Implicit Neural Decision Trees
Francesco Spinnato, Antonio Mastropietro, Riccardo Guidotti
https://doi.org/10.14428/esann/2025.ES2025-85
Abstract:
Representation learning is a central topic in machine learning, with significant efforts dedicated to encoding structured data such as sequences, trees, and graphs for various downstream tasks. A branch of these studies focuses on functional data analysis, which views data not as discrete arrays but as continuous functions. When these functions are parameterized using neural networks, they are called Implicit Neural Representations (INR).
INRs have been successfully applied to represent diverse data types but, to the best of our knowledge, have not been used for encoding decision models. This work addresses the novel challenge of using INRs to represent decision trees.
We introduce a tailored coordinate system and train INRs to reconstruct decision trees with a loss function to minimize node reconstruction errors. We benchmark implicit neural decision trees on several datasets, showing that they can effectively represent individual trees, and show potential extensions to tree forests through meta-learning.
Project-Specific Code Summarization with Meta-Learning and Explainability Techniques
Quang-Huy Nguyen, Hoai-Phong Le, Bac Le
https://doi.org/10.14428/esann/2025.ES2025-128
Abstract:
Code summarization generates natural language descriptions for code snippets, enhancing readability and maintainability. While current methods perform well with large-scale datasets, they struggle in low-resource scenarios typical of smaller and newer projects. Additionally, developers need summaries that capture project-specific characteristics rather than generic descriptions. To address these challenges, we propose a meta-learning-based training framework that adapts the model to individual projects as distinct tasks, even with minimal data. We introduce a strategy for selecting support projects to boost the framework's effectiveness. Experiments on eight real-world projects show that our method outperforms the baseline approach. Furthermore, we use explainability techniques to clarify the prediction process and identify potential issues.
Trajectory-Embedded Matryoshka Representation Learning for Enhanced Similarity Analysis
Federico Pennino, Andrea Gurioli, Maurizio Gabbrielli
https://doi.org/10.14428/esann/2025.ES2025-121
Abstract:
This paper introduces Trajectory-Embedded Matryoshka Representation Learning (TE-MRL), a novel framework that combines trajectory representation learning with the adaptability and efficiency of Matryoshka Representation Learning (MRL). TE-MRL is engineered to generate adaptive, multi-granular embeddings that efficiently capture the spatial-temporal dynamics inherent in trajectory data. We evaluate TE-MRL on the Porto dataset, focusing on trajectory similarity and k-nearest trajectory similarity tasks. Our findings demonstrate that TE-MRL preserves critical features such as travel semantics and temporal regularities while significantly reducing computational time and memory footprint. The proposed approach matches existing methods' accuracy and efficiency while demonstrating robust adaptability under varying computational constraints.
Furthermore, we propose a two-stage retrieval pipeline that reduces computation time by 8x while maintaining state-of-the-art precision. The effectiveness of TE-MRL in handling the complexity of the Porto dataset underlines its potential for broader applications in urban computing and mobility analytics.
Encoding Matters: Impact of Categorical Variable Encoding on Performance and Bias
Daniel Kopp, Benjamin Maudet, Lisheng Sun-Hosoya, Kristin Bennett
https://doi.org/10.14428/esann/2025.ES2025-191
Abstract:
Encoding categorical variables impacts model performance and can introduce bias in supervised learning, particularly affecting fairness when some groups are under-represented. We analyze the effects of different encoding methods on synthetic and real datasets to mitigate unintended model reliance on specific variables. We propose CaVaR (Categorical Variable Reliance) to quantify model reliance on variables and an Availability Index to measure CaVaR's sensitivity to partial encoding changes. A high Availability Disparity, measured by the standard deviation of the Availability Index across encodings, highlights potential bias from mixed encodings. The results suggest encoding all categorical variables uniformly, regardless of their ordinal or nominal nature, may reduce bias, with the choice guided by computational and performance considerations.
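The idea of model reliance on a variable can be illustrated with a generic permutation-based measure, a simplified stand-in for the paper's CaVaR (all names, data, and the linear model are illustrative), comparing an ordinal and a one-hot encoding of the same categorical feature:

```python
import numpy as np

rng = np.random.default_rng(1)

def reliance(model_fn, X, col):
    """Permutation-based reliance: mean absolute change in predictions
    when one column is shuffled (a simplified stand-in for CaVaR)."""
    Xp = X.copy()
    Xp[:, col] = rng.permutation(Xp[:, col])
    return np.mean(np.abs(model_fn(X) - model_fn(Xp)))

# Toy data: a 3-level categorical feature (0/1/2) plus a numeric one.
cat = rng.integers(0, 3, 500)
num = rng.normal(size=500)
y = 1.5 * cat + num + rng.normal(scale=0.1, size=500)

# Ordinal encoding uses one column; one-hot uses three indicator columns.
X_ord = np.column_stack([cat.astype(float), num])
X_hot = np.column_stack([(cat == k).astype(float) for k in range(3)] + [num])

def fit_linear(X):
    # Least-squares fit with an intercept; returns a prediction function.
    w, *_ = np.linalg.lstsq(np.column_stack([X, np.ones(len(X))]), y, rcond=None)
    return lambda Z: np.column_stack([Z, np.ones(len(Z))]) @ w

m_ord, m_hot = fit_linear(X_ord), fit_linear(X_hot)
print(reliance(m_ord, X_ord, 0))                         # reliance on the single ordinal column
print(sum(reliance(m_hot, X_hot, k) for k in range(3)))  # reliance spread over one-hot columns
```

Because one-hot encoding splits the same information across several columns, per-column reliance values are not directly comparable across encodings, which is one motivation for encoding categorical variables uniformly.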
Explainable ensemble learning for structural damage prediction under seismic events
Michele Baldassini, Pierfrancesco Foglia, Beatrice Lazzerini, Francesco Pistolesi, Cosimo Antonio Prete
https://doi.org/10.14428/esann/2025.ES2025-198
Abstract:
This paper presents an explainable ensemble learning framework using Bootstrap Aggregating to predict structural damage in masonry buildings during seismic events. It estimates the peak ground acceleration (PGA) leading to the damage control limit state (significant damage) based on structural parameters. The model achieves high accuracy (R²=0.9536, MAE=0.0057) and interpretability through SHAP, aligning with structural engineering principles. Compared to finite element analyses, it offers faster computations (milliseconds) and scalability, enabling rapid intervention planning after earthquakes. Developed under the "MEDEA" project (EU Grant n. 10101236), it supports disaster response and enhances seismic resilience.
Generating Synthetic Spectral Data using Conditional DDPM
Fabian Kubiczek, Stefan Patzke, Jörg Thiem
https://doi.org/10.14428/esann/2025.ES2025-189
Abstract:
This study investigates the efficiency and effectiveness of Denoising Diffusion Probabilistic Models (DDPM) for generating synthetic spectral data. A modified DDPM was implemented and evaluated in comparison to a previously established model. Both models were trained with and without Classifier-Free Guidance (CFG). In addition, training duration and sample generation are compared. The results demonstrate that the synthetic spectral data exhibits a high degree of alignment with the training data, with only minor deviations. Furthermore, the influence of CFG on the generation process is evident. The findings indicate that the modified DDPM performs better on the given data.
Interpretable machine learning for the diagnosis of hyperkinetic movement disorders
Elina van den Brandhof, Jan W.J. Elting, Inge Tuitert, A.M. Madelein van der Stouwe, Jelle R. Dalenberg, Marina A.J. Tijssen, Michael Biehl
https://doi.org/10.14428/esann/2025.ES2025-73
Abstract:
We present a machine learning approach to the challenging differentiation of hyperkinetic movement disorders, based on accelerometric sensor data. We address the diagnosis of essential tremor and cortical myoclonus as a specific example. Generalized Matrix Relevance Learning Vector Quantization (GMLVQ) systems are applied directly to power spectra obtained from eight sensors recording upper body movements. We find excellent validation performance of the classifiers. Moreover, GMLVQ provides insight into the characteristic patterns of the phenotypes and the importance of particular frequency ranges in the spectra. We demonstrate that the explanatory power of the classifier is further enhanced when integrating information from several tasks per subject.
Natural language processing
Isotropy Matters: Soft-ZCA Whitening of Embeddings for Semantic Code Search
Andor Diera, Lukas Galke, Ansgar Scherp
https://doi.org/10.14428/esann/2025.ES2025-58
Abstract:
Low isotropy in an embedding space impairs performance on tasks involving semantic inference. Our study investigates the impact of isotropy on semantic code search performance and explores post-processing techniques to mitigate this issue. We analyze various code language models, examining the isotropy of their embedding spaces and its influence on search effectiveness. We propose a modified ZCA whitening technique to control isotropy levels in embeddings. Our results demonstrate that Soft-ZCA whitening improves the performance of pre-trained code language models and can complement contrastive fine-tuning.
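One plausible reading of such a softened whitening transform uses an exponent alpha to interpolate between no whitening and full ZCA whitening; the exponent-based rule below is an assumption for illustration, not necessarily the paper's exact formulation:

```python
import numpy as np

def soft_zca_whiten(E, alpha=0.5, eps=1e-5):
    """Soft-ZCA whitening of an embedding matrix E of shape (n, d).

    alpha=1 recovers full ZCA whitening (near-identity covariance);
    alpha=0 leaves the centred embeddings unchanged. Intermediate
    values only partially flatten the eigenvalue spectrum, raising
    isotropy without discarding all anisotropic structure.
    """
    Ec = E - E.mean(axis=0)
    cov = Ec.T @ Ec / (len(Ec) - 1)
    s, U = np.linalg.eigh(cov)
    W = U @ np.diag((s + eps) ** (-alpha / 2)) @ U.T  # ZCA keeps U on both sides
    return Ec @ W

# Strongly anisotropic toy embeddings: per-dimension scales 5 ... 0.1.
rng = np.random.default_rng(0)
E = rng.normal(size=(1000, 8)) * np.array([5, 3, 2, 1, 1, 0.5, 0.2, 0.1])
white = soft_zca_whiten(E, alpha=1.0)
print(np.linalg.cond(np.cov(white.T)))  # close to 1: near-isotropic
```

Unlike PCA whitening, the ZCA form rotates back into the original basis, so the whitened embeddings stay maximally close to the originals, which matters when downstream code search reuses the same coordinate system.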
Open-Vocabulary Robotic Object Manipulation using Foundation Models
Stig Griebenow, Ozan Özdemir, Cornelius Weber, Stefan Wermter
https://doi.org/10.14428/esann/2025.ES2025-35
Abstract:
Classical vision-language-action models are limited by unidirectional communication, hindering natural human-robot interaction. The recent CrossT5 embeds an efficient vision action pathway into an LLM, but lacks visual generalization, restricting actions to objects seen during training. We introduce OWL×T5, which integrates the OWLv2 object detection model into CrossT5 to enable robot actions on unseen objects. OWL×T5 is trained on a simulated dataset using the NICO humanoid robot and evaluated on the new CLAEO dataset featuring interactions with unseen objects. Results show that OWL×T5 achieves zero-shot object recognition for robotic manipulation, while efficiently integrating vision-language-action capabilities.
Improving Privacy Benefits of Redaction
Vaibhav Gusain, Douglas Leith
https://doi.org/10.14428/esann/2025.ES2025-36
Abstract:
We propose a novel redaction methodology that can be used to sanitize natural text data. Our technique provides better privacy benefits than other state-of-the-art techniques while maintaining lower redaction levels.
Evaluating Text Representations Techniques for Hypernymy Detection: The Case of Arabic Language
Randah Alharbi, Husni Al-Muhtaseb
https://doi.org/10.14428/esann/2025.ES2025-89
Abstract:
Text representation is a key performance component for any hypernymy-related task. In this study, we investigate representation techniques to understand which features best represent the hypernymy relation, focusing on three factors of representation: word embeddings, embedding combination techniques, and the use of additional features. The results indicate that different embeddings have different effects on performance; concatenation and 'addition and subtraction' lead to better performance, while using unsupervised measures has a negative effect on performance.
Early Prediction of Dynamic Sparsity in Large Language Models
Reza Sedghi, Amit Kumar Pal, Anand Subramoney, David Kappel
https://doi.org/10.14428/esann/2025.ES2025-97
Abstract:
Large language models are powerful but computationally very expensive. We investigate dynamic sparsity in attention mechanisms, using the OPT model as a case study. We explore the dynamic nature of redundancy in attention heads and analyze which components of the model provide sufficient information to predict sparsity effectively. Our findings highlight the norm of attention outputs as a reliable criterion for ranking head importance. We systematically evaluate embeddings across layers and time steps, showing that dynamic sparsity predictions can be achieved early in the model pipeline with minimal loss in accuracy. By elucidating the mechanisms underlying dynamic sparsity, this work lays a foundation for more efficient and scalable transformer models.
Unlocking Structured Thinking in Language Models with Cognitive Prompting
Oliver Kramer, Jill Baumann
https://doi.org/10.14428/esann/2025.ES2025-116
Abstract:
We propose cognitive prompting as a novel approach to guide problem-solving in large language models (LLMs) through structured, human-like cognitive operations, such as goal clarification, decomposition, filtering, abstraction, and pattern recognition. By employing systematic, step-by-step reasoning, cognitive prompting enables LLMs to tackle complex, multi-step tasks more efficiently. We introduce three variants: a deterministic sequence of cognitive operations, a self-adaptive variant in which the LLM dynamically selects the sequence of cognitive operations, and a hybrid variant that uses generated correct solutions as few-shot chain-of-thought prompts. Experiments with LLaMA, Gemma 2, and Qwen models, each in two sizes, on the arithmetic reasoning benchmark GSM8K demonstrate that cognitive prompting significantly improves performance compared to standard question answering.
Comparing Modern LLM Quantization Methods Across Natural Languages
Maksym Iakovenko, Stéphane Dupont
https://doi.org/10.14428/esann/2025.ES2025-190
Abstract:
Weight quantization has become a key tool for democratizing access to large language models (LLMs). Despite the technique's growing popularity and potential to aid speakers of diverse languages worldwide, new LLM quantization methods are predominantly validated in monolingual English contexts. This study explores ways to consistently evaluate the multilingual performance of a variety of LLaMA-based models under different quantization configurations. We identify links between the multilingual performance of widely adopted LLM quantization methods and multiple factors, such as a language's prevalence in the training set and its similarity to the model's dominant language.
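As background, the core round-trip of uniform symmetric weight quantization, which modern schemes such as GPTQ and AWQ elaborate on, can be sketched in a few lines. This is a generic illustration, not any specific method evaluated in the study.

```python
import numpy as np

def quantize_symmetric(w, n_bits=4):
    """Round-trip symmetric uniform quantization of a weight tensor:
    map floats onto 2**n_bits integer levels with one scale per tensor.
    Real LLM quantizers add tricks (per-group scales, error compensation),
    but share this core step."""
    qmax = 2 ** (n_bits - 1) - 1                      # e.g. 7 for 4-bit
    scale = np.max(np.abs(w)) / qmax                  # one scale per tensor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

w = np.array([0.31, -0.12, 0.07, -0.29])
q, scale = quantize_symmetric(w)
w_hat = q.astype(np.float64) * scale    # dequantized weights used at inference
```

The reconstruction error per weight is bounded by half the scale, which is what makes low-bit storage workable.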
Evaluating Concept Discovery Methods for Sensitive Attributes in Language Models
Sarah Schröder, Alexander Schulz, Barbara Hammer
https://doi.org/10.14428/esann/2025.ES2025-150
Abstract:
This paper examines how to improve interpretability of language models in the context of fairness. While traditional concept learning focuses on identifying the most important concepts for a task, this study explores how to locate the representation of sensitive attributes in pre-trained language models. We address challenges such as the potential low importance and sparsity of sensitive attributes in training data, and the limited amount of labeled data for this purpose. Our experiments evaluate potential methods to obtain such identity concepts, considering factors like label sparsity, generalizability, and the influence of different language models on the representation of sensitive attributes.
Dynamical systems and recurrent learning
Towards Adaptive and Stable Compositional Assemblies of Recurrent Neural Network Modules
Valerio De Caro, Andrea Ceni, Davide Bacciu, Claudio Gallicchio
https://doi.org/10.14428/esann/2025.ES2025-48
Abstract:
Recurrent neural networks (RNNs) are computational models regarded as dynamical systems. Modularity is a key ingredient of complex systems. Thus, the composition of RNN modules provides a simple paradigm for building complex computational models, with the potential to approach the capabilities of the human brain. We devise strategies for training RNNs assembled into a larger RNN of RNNs, provided with theoretical guarantees of stability that hold during training for the composed global network. Experiments on pixel-by-pixel image classification benchmarks prove the effectiveness of this approach.
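A generic sketch of the compositional idea, assuming an echo-state-style stability criterion (spectral radius below one) rather than the paper's actual training-time guarantees: module recurrent matrices go on the block diagonal of the composed network, weak random couplings fill the off-diagonal blocks, and the whole matrix is rescaled if composition breaks contractivity. The function name and coupling scheme are invented for illustration.

```python
import numpy as np

def compose_modules(recurrent_mats, coupling_scale=0.1, seed=0):
    """Assemble an 'RNN of RNNs' recurrent matrix: modules on the block
    diagonal, weak random inter-module couplings off-diagonal. As a crude
    stability heuristic, rescale so the spectral radius stays below 1."""
    rng = np.random.default_rng(seed)
    sizes = [m.shape[0] for m in recurrent_mats]
    offsets = np.concatenate([[0], np.cumsum(sizes)])
    n = offsets[-1]
    W = np.zeros((n, n))
    for i, m in enumerate(recurrent_mats):              # block diagonal
        W[offsets[i]:offsets[i + 1], offsets[i]:offsets[i + 1]] = m
    for i in range(len(sizes)):                         # inter-module couplings
        for j in range(len(sizes)):
            if i != j:
                W[offsets[i]:offsets[i + 1], offsets[j]:offsets[j + 1]] = (
                    coupling_scale * rng.standard_normal((sizes[i], sizes[j]))
                )
    radius = np.max(np.abs(np.linalg.eigvals(W)))
    if radius >= 1.0:
        W *= 0.95 / radius                              # restore contractivity
    return W
```

Keeping the composed spectral radius below one is the standard sufficient condition for a fading memory in linear(ized) recurrent dynamics.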
Solving Turbulent Rayleigh-Bénard Convection using Fourier Neural Operators
Michiel Straat, Thorben Markmann, Barbara Hammer
https://doi.org/10.14428/esann/2025.ES2025-131
Abstract:
We train Fourier Neural Operator (FNO) surrogate models for Rayleigh-Bénard Convection (RBC), a model for convection processes that occur in nature and industrial settings. We compare the prediction accuracy and model properties of FNO surrogates to two popular surrogates used in fluid dynamics: Dynamic Mode Decomposition (DMD) and the Linearly-Recurrent Autoencoder Network (LRAN). We regard Direct Numerical Simulations (DNS) of the RBC equations as the ground truth on which the models are trained and evaluated for different settings. The FNO performs favorably when compared to the DMD and LRAN and its predictions are fast and highly accurate for this task. Additionally, we show its zero-shot super-resolution ability for the convection dynamics. The FNO model has a high potential to be used in downstream tasks such as flow control in RBC.
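The core of a Fourier layer is a convolution applied in frequency space with the high modes truncated. Below is a toy single-channel 1D version of that idea (real FNO layers use multi-channel complex weights, a pointwise linear path, and a nonlinearity); because the weights live on modes rather than grid points, the same layer evaluates at any resolution, which is the mechanism behind zero-shot super-resolution.

```python
import numpy as np

def spectral_conv_1d(u, weights, n_modes):
    """Core of a Fourier layer: transform to frequency space, multiply the
    lowest n_modes modes by (learned) complex weights, drop the rest, and
    transform back to the grid."""
    u_hat = np.fft.rfft(u)
    out_hat = np.zeros_like(u_hat)
    out_hat[:n_modes] = u_hat[:n_modes] * weights    # truncated spectral multiply
    return np.fft.irfft(out_hat, n=len(u))

x = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
w = np.ones(4, dtype=complex)          # identity-like weights on 4 modes
y = spectral_conv_1d(np.sin(x), w, n_modes=4)
```

With identity weights the layer passes a low-frequency signal through unchanged, on a 64-point grid or a 256-point one alike.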
Predictive Coding Dynamics Enhance Model-Brain Similarity
Manshan Guo, Michael Samjatin, Bhavin Choksi, sari sadiya, Radoslaw Cichy, Gemma Roig
https://doi.org/10.14428/esann/2025.ES2025-143
Abstract:
Predictive coding, a popular theory in neuroscience, has garnered significant attention in the machine learning community as a way to incorporate brain-inspired components into neural networks. While various proposals have demonstrated the ability of predictive dynamics to confer robustness and produce human-like perception of illusions, it remains unclear whether they improve the alignment between brain and artificial representations. Here, we systematically investigate the conditions under which brain-inspired modifications based on predictive processing improve alignment between model and neural representations in the brain. Our results reveal that the feedback component significantly increases similarity between model representations and those found in higher-level visual brain areas, especially when processing complex visual scenes.
Efficient Training of Neural SDEs Using Stochastic Optimal Control
Rembert Daems, Manfred Opper, Guillaume Crevecoeur, Tolga Birdal
https://doi.org/10.14428/esann/2025.ES2025-182
Abstract:
We present a hierarchical, control-theory-inspired method for variational inference (VI) in neural stochastic differential equations (SDEs). While VI for neural SDEs is a promising avenue for uncertainty-aware reasoning in time series, it is computationally challenging due to the iterative nature of maximizing the ELBO. In this work, we propose to decompose the control term into linear and residual non-linear components and, using stochastic optimal control, derive an optimal control term for linear SDEs. Modeling the non-linear component with a neural network, we show how to train neural SDEs efficiently without sacrificing their expressive power. Since the linear part of the control term is optimal and does not need to be learned, training starts from a lower-cost initialization and we observe faster convergence.
The Reinforced Liquid State Machine: A New Training Architecture for Spiking Neural Networks
Dominik Krenzer, Martin Bogdan
https://doi.org/10.14428/esann/2025.ES2025-1
Abstract:
This work presents a novel Spiking Neural Network training architecture based on a deepened Liquid State Machine that integrates Winner-Takes-All computation and Reward-Modulated Synaptic Plasticity. The network's performance is evaluated on the Heidelberg dataset for spoken digit recognition. A two-layer liquid configuration that incorporates feedback between the liquid layers improves classification accuracy by 5% over a single-layer baseline. This architecture demonstrates that deep liquid models, combined with feedback and reward-driven learning, can effectively capture complex spatio-temporal patterns, offering significant accuracy advantages over traditional Liquid State Machines.
Motif-augmented classical music synthesis via recurrent neural networks
Alexandru-Ion Marinescu
https://doi.org/10.14428/esann/2025.ES2025-71
Abstract:
We propose a motif-augmented approach to classical music synthesis using LSTM-based recurrent neural networks trained on J. S. Bach violin compositions. By combining motif augmentation with temperature-based sampling, we improve the entropy alignment between generated sequences and ground-truth data. Our experiments show that motif augmentation significantly reduces entropy deviation and enhances sequence coherence, as confirmed by statistical analysis. This method advances generative music modeling, offering potential applications in music composition and sequence prediction tasks.
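Temperature-based sampling, one of the two ingredients above, is easy to sketch: logits are divided by a temperature before the softmax, so low temperatures make generation more conservative and high temperatures raise the entropy of the output. A minimal version (the variable names are invented; motif augmentation itself is a data-side step not shown here):

```python
import numpy as np

def sample_with_temperature(logits, temperature, rng):
    """Temperature-scaled sampling: divide logits by T before the softmax.
    T < 1 sharpens the distribution (safer, more repetitive continuations);
    T > 1 flattens it (more surprising notes, higher entropy)."""
    scaled = np.asarray(logits, dtype=float) / temperature
    scaled -= scaled.max()                         # numerical stability
    probs = np.exp(scaled) / np.exp(scaled).sum()
    return int(rng.choice(len(probs), p=probs))

rng = np.random.default_rng(0)
next_note = sample_with_temperature([2.0, 1.0, 0.1], temperature=0.8, rng=rng)
```

Tuning the temperature is how a generator's output entropy can be aligned with the entropy of ground-truth sequences.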
Analysing the impact of brain-inspired predictive coding dynamics through gradient based explainability methods
Bhavin Choksi, Gionata Paolo Zalaffi, Giovanna Maria Dimitri, Gemma Roig
https://doi.org/10.14428/esann/2025.ES2025-88
Abstract:
Multiple theories exist for the role of feedback connections in the brain and in artificial neural networks, but they remain largely untested with modern tools. In this work, we undertake this task by exploring the utility of explainability methods such as GradCAMs for investigating bio-inspired recurrent networks (provided by the predify package) that perform hierarchical updates inspired by the predictive coding theory in neuroscience. We report an extensive search across different levels of feedforward and feedback information. Our preliminary results show that the dynamics are able to recover the GradCAMs on noisy images, providing promising avenues for future work aiming to understand the role of recurrence.
A Pipeline based on Differential Evolution for Tuning Parameters of Synaptic Dynamics Models
Ferney Beltran-Velandia, Nico Scherf, Martin Bogdan
https://doi.org/10.14428/esann/2025.ES2025-111
Abstract:
Integrating the modulatory properties of Synaptic Dynamics (SD) into Spiking Neural Networks (SNNs) can enhance their computational capabilities. To improve this integration process, this paper presents a pipeline based on Differential Evolution for tuning the parameters of SD models. Using reference signals from in vitro experiments, the parameters of two models are tuned as case studies: the Tsodyks-Markram model and the Modified Stochastic Synaptic Model. The pipeline achieves an average success rate of 75% and 80%, respectively. The outcome is a distribution of parameters for each model, which can serve as prior knowledge to facilitate the integration of SD models into SNNs.
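A minimal rand/1/bin differential evolution loop gives the flavour of such a parameter-tuning pipeline. It is fitted here to a toy exponential-rise signal with invented parameters (amplitude and time constant), not to the Tsodyks-Markram or Modified Stochastic Synaptic Model equations used in the paper.

```python
import numpy as np

def differential_evolution(loss, bounds, pop_size=20, n_gen=60,
                           F=0.7, CR=0.9, seed=0):
    """Minimal rand/1/bin differential evolution. `bounds` is a list of
    (low, high) pairs, one per parameter; returns the best vector found."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T
    dim = len(bounds)
    pop = rng.uniform(lo, hi, size=(pop_size, dim))
    fit = np.array([loss(p) for p in pop])
    for _ in range(n_gen):
        for i in range(pop_size):
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, 3, replace=False)]
            mutant = np.clip(a + F * (b - c), lo, hi)    # mutation
            cross = rng.random(dim) < CR                 # binomial crossover
            cross[rng.integers(dim)] = True              # keep >= 1 mutant gene
            trial = np.where(cross, mutant, pop[i])
            f = loss(trial)
            if f < fit[i]:                               # greedy selection
                pop[i], fit[i] = trial, f
    return pop[np.argmin(fit)]

# Toy stand-in for a synaptic-response model: exponential rise with
# amplitude A and time constant tau (not the paper's SD models).
t = np.linspace(0.0, 1.0, 50)
reference = 2.0 * (1.0 - np.exp(-t / 0.2))      # synthetic "in vitro" trace

def mse(params):
    A, tau = params
    return float(np.mean((A * (1.0 - np.exp(-t / tau)) - reference) ** 2))

best = differential_evolution(mse, bounds=[(0.1, 5.0), (0.01, 1.0)])
```

Running many such fits from different seeds is one way to obtain the parameter distributions the abstract mentions.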
Influence of function nodes on automated generation of routing policies with genetic programming
Marko Đurasević, Francisco Javier Gil Gala
https://doi.org/10.14428/esann/2025.ES2025-138
Abstract:
Routing policies (RPs) are simple heuristics for solving the electric vehicle routing problem, suitable for large or dynamic problem instances. Designing efficient RPs is difficult, which is why researchers have started applying genetic programming (GP) to their automated development. For GP to generate efficient RPs, it must be supplied with appropriate building blocks, i.e., functions and problem features, from which to construct solutions. This study investigates the selection of appropriate function nodes for constructing RPs. The experiments demonstrate that the best results are obtained when the simplest arithmetic operators are enhanced with a few additional operators.
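The notion of function nodes as GP building blocks can be illustrated with a tiny expression-tree interpreter. The operator sets, feature names, and tree-growing scheme below are invented for illustration; the paper's actual node sets and problem features differ.

```python
import operator
import random

# Basic vs. extended function-node sets (illustrative, not the paper's sets).
BASIC_OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}

def protected_div(a, b):
    """Division guarded against near-zero denominators, a common GP operator."""
    return a / b if abs(b) > 1e-9 else 1.0

EXTENDED_OPS = {**BASIC_OPS, "div": protected_div, "min": min, "max": max}

def random_tree(ops, terminals, depth, rng):
    """Grow a random expression tree: internal nodes are function nodes
    drawn from `ops`, leaves are problem features from `terminals`."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(terminals)
    name = rng.choice(sorted(ops))
    return (name,
            random_tree(ops, terminals, depth - 1, rng),
            random_tree(ops, terminals, depth - 1, rng))

def evaluate(node, features, ops):
    """Evaluate a tree on a dict of problem-feature values."""
    if isinstance(node, str):
        return features[node]
    name, left, right = node
    return ops[name](evaluate(left, features, ops),
                     evaluate(right, features, ops))

rng = random.Random(0)
features = {"distance": 3.0, "battery": 0.5, "demand": 2.0}   # invented features
policy = random_tree(EXTENDED_OPS, sorted(features), depth=3, rng=rng)
priority = evaluate(policy, features, EXTENDED_OPS)   # routing priority score
```

Swapping `BASIC_OPS` for `EXTENDED_OPS` changes the search space GP explores, which is exactly the choice the study investigates.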
Towards metacognitive agents: integrating confidence in sequential decision-making
Baptiste Pesquet, Frédéric Alexandre
https://doi.org/10.14428/esann/2025.ES2025-176
Abstract:
In natural cognition, confidence is used to evaluate the quality of decisions and adapt one's behavior to the task at hand. For now, artificial agents lack this kind of metacognitive ability and interact with their environment in a purely reactive way. Inspired by recent findings about the cognitive modeling of confidence, we propose a novel architecture for sequential decision-making. It combines an evidence accumulation model with a metacognitive module that computes and exploits confidence to tune the decision process. The model has been assessed on a perceptual decision-making task, showing promise for more flexible artificial agents and a possible path towards artificial metacognition.
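One classical way to compute confidence alongside a decision, in the spirit of the evidence-accumulation model described (though not the paper's architecture), is a two-accumulator race where confidence is read out as the balance of evidence at decision time. All names and parameter values below are invented for illustration.

```python
import numpy as np

def race_decision(drift_a, drift_b, bound=1.0, noise=0.3, dt=0.01,
                  max_steps=5000, seed=0):
    """Two-accumulator race: each option integrates its own noisy evidence,
    and the first accumulator to reach the bound determines the choice.
    Confidence is the balance of evidence (winner minus loser at decision
    time), one classic proxy from the confidence-modeling literature."""
    rng = np.random.default_rng(seed)
    a = b = 0.0
    for step in range(1, max_steps + 1):
        a += drift_a * dt + noise * np.sqrt(dt) * rng.standard_normal()
        b += drift_b * dt + noise * np.sqrt(dt) * rng.standard_normal()
        if a >= bound or b >= bound:
            winner = "A" if a >= bound else "B"
            return winner, abs(a - b), step * dt
    return None, 0.0, max_steps * dt        # no decision before the deadline

# A metacognitive agent could, e.g., raise its decision bound on the next
# trial whenever the reported confidence falls below a threshold.
choice, confidence, rt = race_decision(drift_a=1.0, drift_b=0.1)
```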
Exploring Model Architectures for Real-Time Lung Sound Event Detection
Michiel Jacobs, Lode Vuegen, Tom Verresen, Marie Schouterden, David Ruttens, Peter Karsmakers
https://doi.org/10.14428/esann/2025.ES2025-201
Abstract:
Computerized detection of relevant lung sound events has the potential to assist physicians during auscultation and to monitor the severity of pulmonary diseases in ambulatory settings. In some cases, real-time detection of adventitious lung sounds is required to provide instant feedback to physicians, e.g. during autogenic drainage therapy.
State-of-the-art solutions for this task leverage deep learning models, which vary significantly in complexity. For real-time applications on resource-constrained devices, such as stethoscope-integrated hardware, both detection accuracy and model complexity are important to consider. While most existing research focuses primarily on accuracy, this work evaluates both accuracy and computational complexity.
The contributions of this work are threefold. First, the effect of using a full breathing cycle as input is studied to assess its impact on event detection performance. This approach introduces a computational cost due to the required segmentation process. Second, a transformer-based architecture is compared with two relatively simple convolutional models, each utilizing different input horizons. Evaluations are conducted on both public and in-house lung sound datasets. Third, recognizing that the event detection task aligns better with a multi-label setting than the commonly used multi-class setup, this study compares both approaches.
We conclude that a multi-label output outperforms a multi-class approach, that inputs segmented per breathing cycle are preferred, and that the high-complexity models perform similarly to the low-complexity models on unseen data.
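The multi-label vs. multi-class distinction above comes down to the output head: a softmax forces exactly one event per frame, while independent sigmoids allow overlapping events, such as a wheeze and a crackle occurring simultaneously. A minimal sketch with invented logits:

```python
import numpy as np

def multiclass_predict(logits):
    """Multi-class head: softmax over events, exactly one label per frame."""
    e = np.exp(np.asarray(logits, dtype=float) - np.max(logits))
    return int(np.argmax(e / e.sum()))

def multilabel_predict(logits, threshold=0.5):
    """Multi-label head: independent sigmoids, so any subset of events
    can be active in the same frame."""
    probs = 1.0 / (1.0 + np.exp(-np.asarray(logits, dtype=float)))
    return [i for i, p in enumerate(probs) if p >= threshold]

# Strong evidence for events 0 and 2 in the same frame (invented logits):
frame_logits = [3.0, -2.0, 2.5]
single = multiclass_predict(frame_logits)     # forced to pick one event
several = multilabel_predict(frame_logits)    # can report both events
```

Whenever adventitious sounds genuinely overlap, only the multi-label head can represent the ground truth, which is consistent with the reported result.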
Inferring Underwater Topography with Finite Volume Neural Networks
Cosku Horuz, Matthias Karlbauer, Timothy Praditia, Sergey Oladyshkin, Wolfgang Nowak, Sebastian Otte
https://doi.org/10.14428/esann/2025.ES2025-12
Abstract:
Partial differential equations (PDEs) find applications across various scientific and engineering fields. There is a growing trend for integrating physics-aware machine learning models to solve PDEs. Among them, the Finite Volume Neural Network (FINN) has proven to be efficient in uncovering latent structures in data. This study explores the capabilities of FINN in the investigation of shallow-water equations, which simulate wave dynamics in coastal regions. Specifically, we investigate the efficacy of FINN in reconstructing underwater topography. We find that FINN excels at inferring topography solely from wave dynamics, stressing the importance of application-specific inductive bias in neural network architectures.
Artificial Surrogate Model for Computational Fluid Dynamics
Abdallah Alfaham, Siegfried Mercelis
https://doi.org/10.14428/esann/2025.ES2025-70
Abstract:
Simulating fluid dynamics is challenging due to the computational complexity of processing high-dimensional data, which often requires significant time. Fluid behavior is typically governed by partial differential equations (PDEs), and the complexity escalates when obstacles disrupt the flow, reinforcing vorticity formation. Vorticity describes the local rotational motion of a fluid. In this paper, we present a data-driven approach to automate PDE simulations and develop surrogate models to generate fluid dynamics based on the Kármán vortex street. Our approach aims to generate accurate fluid simulations with faster computation through architectural adjustments.