Website overhaul

This commit is contained in:
2025-03-27 22:57:31 +01:00
parent 2b75326eac
commit 755fd297bb
70 changed files with 1389 additions and 709 deletions

View File

@ -0,0 +1,24 @@
---
layout: single
title: "Learned Trajectory Annotation"
categories: research
tags: geoinformatics machine-learning unsupervised-learning human-robot-interaction autoencoder
excerpt: "Unsupervised autoencoder learns spatial context from trajectory data for annotation."
header:
teaser: /assets/figures/0_trajectory_reconstruction_teaser.png
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
<center>
<img src="/assets/figures/0_trajectory_isovist.jpg" alt="Visualization of spatial perception field (e.g., isovist) from a point on a trajectory" style="width:48%; display: inline-block; margin: 1%;">
<img src="/assets/figures/0_trajectory_reconstruction.jpg" alt="Clustered or reconstructed trajectories based on learned spatial representations" style="width:48%; display: inline-block; margin: 1%;">
<figcaption>Learning spatial context representations (left) enables clustering and annotation of trajectories (right).</figcaption>
</center><br>
This research addresses the challenge of enabling more intuitive human-robot interaction in shared spaces, particularly focusing on grounding verbal communication in spatial understanding. The work introduces a novel unsupervised learning methodology based on neural autoencoders.
The core contribution is a system that learns continuous, low-dimensional representations of spatial context directly from trajectory data, without requiring explicit environmental maps or predefined regions. By processing sequences of spatial perceptions (analogous to visibility fields or isovists) along a path, the autoencoder captures salient environmental features relevant to movement.
These learned latent representations facilitate the effective clustering of trajectories based on shared spatial experiences. The outcome is a set of semantically meaningful encodings and prototypical representations of movement patterns within an environment. This approach lays essential groundwork for developing robotic systems capable of understanding, interpreting, and potentially describing movement through space in human-comprehensible terms, representing a promising direction for future human-robot collaboration. {% cite feld2018trajectory %}
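To make the pipeline concrete, here is a minimal, hypothetical sketch of the idea (not the paper's implementation): isovist-like perception vectors are compressed by a small autoencoder, and the resulting latent codes are clustered to derive annotations. All names and dimensions are illustrative.

```python
# Hedged sketch: encode isovist-like perception vectors with an autoencoder,
# then cluster the latent codes into annotation groups. Sizes are assumptions.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

class IsovistAutoencoder(nn.Module):
    def __init__(self, n_rays=360, latent_dim=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_rays, 128), nn.ReLU(), nn.Linear(128, latent_dim))
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, n_rays))

    def forward(self, x):
        z = self.encoder(x)
        return self.decoder(z), z

model = IsovistAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
perceptions = torch.rand(1024, 360)   # stand-in for isovists along trajectories

for _ in range(100):                  # plain reconstruction training
    recon, _ = model(perceptions)
    loss = nn.functional.mse_loss(recon, perceptions)
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    _, z = model(perceptions)
# Cluster the per-step codes (or per-trajectory averages) for annotation.
labels = KMeans(n_clusters=5, n_init=10).fit_predict(z.numpy())
```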

View File

@ -1,14 +0,0 @@
---
layout: single
title: "Trajectory annotation by spatial perception"
categories: research
excerpt: "We propose an approach to annotate trajectories using sequences of spatial perception."
header:
teaser: assets/figures/0_trajectory_reconstruction_teaser.png
---
<figure class="half">
<img src="/assets/figures/0_trajectory_isovist.jpg" alt="" style="width:48%">
<img src="/assets/figures/0_trajectory_reconstruction.jpg" alt="" style="width:48%">
</figure>
This work establishes a foundation for enhancing interaction between robots and humans in shared spaces by developing reliable systems for verbal communication. It introduces an unsupervised learning method using neural autoencoding to learn continuous spatial representations from trajectory data, enabling clustering of movements based on spatial context. The approach yields semantically meaningful encodings of spatio-temporal data for creating prototypical representations, setting a promising direction for future applications in robotic-human interaction. {% cite feld2018trajectory %}

View File

@ -0,0 +1,25 @@
---
layout: single
title: "Neural Self-Replication"
categories: research
tags: neural-networks artificial-life complex-systems self-organization
excerpt: "Neural networks replicating weights, inspired by biology and artificial life."
header:
teaser: /assets/figures/1_self_replication_pca_space.jpg
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
![Robustness of self-replicating networks under perturbation](/assets/figures/1_self_replication_robustness.jpg)
{:style="display:block; width:45%" .align-right}
Drawing inspiration from the fundamental process of self-replication in biological systems, this research explores the potential for implementing analogous mechanisms within neural networks. The objective is to develop computational models capable of autonomously reproducing their own structure (specifically, their connection weights), potentially leading to the emergence of complex, adaptive behaviors.
The study investigates various neural network architectures and learning paradigms suitable for achieving self-replication. A key finding highlights the efficacy of leveraging backpropagation-like mechanisms, not for a typical supervised task, but for navigating the weight space in a manner conducive to replication. This approach facilitates the development of non-trivial self-replicating networks.
Furthermore, the research extends this concept by proposing an "artificial chemistry" environment. This framework involves populations of interacting neural networks, where self-replication dynamics can lead to emergent properties and complex ecosystem behaviors. This work offers a novel computational perspective on self-replication, providing tools and insights for exploring artificial life and the principles of self-organization in computational systems. For a detailed discussion, please refer to the publication by {% cite gabor2019self %}.
<div style="clear: both;"></div>
<center>
<img src="/assets/figures/1_self_replication_pca_space.jpg" alt="PCA visualization showing clusters or trajectories of self-replicating networks in a latent space" style="display:block; width:100%">
<figcaption>Visualization of self-replicator populations evolving in a PCA-reduced weight space.</figcaption>
</center>
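As a rough illustration of the mechanism described above, the following hedged sketch trains a small network, via ordinary backpropagation, to output its own weight values when queried with per-weight coordinate codes; the encoding and architecture are assumptions, not the paper's exact setup.

```python
# Sketch: backpropagation used for self-replication rather than a supervised
# task. Each weight is addressed by a (layer-index, weight-index) query, and
# the network is trained so its output at that query matches the weight.
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(2, 16), nn.Tanh(), nn.Linear(16, 1))
opt = torch.optim.SGD(net.parameters(), lr=0.01)

def coords_and_values(model):
    """One (layer-index, weight-index) query per weight, plus its value."""
    coords, values = [], []
    for p_idx, p in enumerate(model.parameters()):
        for w_idx, w in enumerate(p.detach().flatten()):
            coords.append([float(p_idx), float(w_idx)])
            values.append(w.item())
    return torch.tensor(coords), torch.tensor(values).unsqueeze(1)

for step in range(1000):
    coords, targets = coords_and_values(net)   # moving target: own weights
    loss = nn.functional.mse_loss(net(coords), targets)
    opt.zero_grad(); loss.backward(); opt.step()
# A (near-)fixpoint is reached once the network reproduces its own weights.
```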

View File

@ -1,14 +0,0 @@
---
layout: single
title: "Self-Replication in Neural Networks"
categories: research
excerpt: "Introduction of NNs that are able to replicate their own weights."
header:
teaser: assets/figures/1_self_replication_pca_space.jpg
---
![Self-Replication Robustness](/assets/figures/1_self_replication_robustness.jpg){:style="display:block; width:40%" .align-right}
This text discusses the fundamental role of self-replication in biological structures and its application to neural networks for developing complex behaviors in computing. It explores different network types for self-replication, highlighting the effectiveness of backpropagation in navigating network weights and fostering the emergence of non-trivial self-replicators. The study further delves into creating an artificial chemistry environment comprising several neural networks, offering a novel approach to understanding and implementing self-replication in computational models. For in-depth insights, refer to the work by {% cite gabor2019self %}.
![Self-replicators in PCA Space (Soup)](/assets/figures/1_self_replication_pca_space.jpg){:style="display:block; width:80%" .align-center}

View File

@ -0,0 +1,18 @@
---
layout: single
title: "Deep Audio Baselines"
categories: research
tags: deep-learning audio-classification paralinguistics speech-analysis
excerpt: "Deep learning audio baseline for Interspeech 2019 ComParE challenge."
header:
teaser: /assets/figures/3_deep_neural_baselines_teaser.jpg
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
![Deep neural baseline architecture](/assets/figures/3_deep_neural_baselines.jpg)
{:style="display:block; width:30%" .align-right}
This research, presented as part of the Interspeech 2019 Computational Paralinguistics Challenge (ComParE), specifically addresses the Sleepiness Sub-Challenge. We introduced a robust, end-to-end deep learning methodology designed to serve as a strong baseline for audio classification tasks within the paralinguistics domain.
The core innovation lies in utilizing a deep neural network architecture (e.g., CNNs, potentially combined with recurrent layers) that directly processes raw or minimally processed audio data (such as spectrograms). This end-to-end approach bypasses the need for extensive, task-specific manual feature engineering, which is often a complex and time-consuming aspect of traditional audio analysis pipelines.
Our proposed baseline model achieved performance comparable to established state-of-the-art methods on the sleepiness detection task. Furthermore, the architecture was designed with adaptability in mind, demonstrating its potential applicability to a broader range of audio classification challenges beyond sleepiness detection. This work underscores the power of deep learning to automatically extract relevant features from audio signals for complex paralinguistic tasks. For further details, please consult the publication by {% cite elsner2019deep %}.
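A minimal sketch of what such an end-to-end baseline can look like; the layer sizes are illustrative, not the challenge submission's exact model.

```python
# Hedged sketch of an end-to-end CNN operating directly on mel-spectrograms,
# avoiding hand-crafted features. Dimensions are assumptions.
import torch
import torch.nn as nn

class AudioCNN(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1))
        self.classifier = nn.Linear(32, n_classes)

    def forward(self, x):             # x: (batch, 1, mel_bins, frames)
        h = self.features(x).flatten(1)
        return self.classifier(h)

logits = AudioCNN()(torch.rand(8, 1, 64, 256))   # 8 spectrogram clips
```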

View File

@ -1,11 +0,0 @@
---
layout: single
title: "Deep-Neural Baseline"
categories: research
excerpt: "Introduction a deep baseline for audio classification."
header:
teaser: assets/figures/3_deep_neural_baselines_teaser.jpg
---
![Deep Neural Baseline](/assets/figures/3_deep_neural_baselines.jpg){:style="display:block; width:30%" .align-right}
The study presents an innovative end-to-end deep learning method to identify sleepiness in spoken language, as part of the Interspeech 2019 ComParE challenge. This method utilizes a deep neural network architecture to analyze audio data directly, eliminating the need for specific feature engineering. This approach not only achieves performance comparable to state-of-the-art models but is also adaptable to various audio classification tasks. For more details, refer to the work by {% cite elsner2019deep %}.

View File

@ -1,11 +1,23 @@
---
layout: single
title: "Learning Soccer-Team Vecors"
categories: research
excerpt: "Team market value estimation, similarity search and rankings."
title: "Soccer Team Vectors"
categories: research
tags: machine-learning representation-learning sports-analytics similarity-search
excerpt: "STEVE learns soccer team embeddings from match data for analysis."
header:
teaser: assets/figures/2_steve_algo.jpg
teaser: /assets/figures/2_steve_algo.jpg
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
![STEVE Algorithm](/assets/figures/2_steve_algo.jpg){:style="display:block; width:60%" .align-center}
This research introduces **STEVE (Soccer Team Vectors)**, a novel methodology for learning meaningful, real-valued vector representations (embeddings) for professional soccer teams. The primary goal is to capture intrinsic team characteristics and relationships within a continuous vector space, such that teams with similar playing styles, strengths, or performance levels are positioned closely together.
This study introduces STEVE (Soccer Team Vectors), a novel method for generating real-valued vectors representing soccer teams, organized so that similar teams are proximate in vector space. Utilizing publicly available match data, these vectors facilitate various machine learning applications, notably excelling in team market value estimation and enabling effective similarity search and team ranking. STEVE demonstrates superior performance over competing models in these domains. For further details, please consult the work by {% cite muller2020soccer %}.
Leveraging widely available public data from soccer matches (e.g., results and basic performance statistics), STEVE employs machine learning techniques to generate these low-dimensional team vectors.
The utility of these learned representations is demonstrated through several downstream applications:
![STEVE algorithm overview](/assets/figures/2_steve_algo.jpg){:style="display:block; width:60%" .align-right}
* **Team Market Value Estimation:** The vectors serve as effective features for predicting team market values, outperforming baseline models.
* **Similarity Search:** The vector space allows for efficient identification of teams similar to a given query team based on proximity.
* **Team Ranking:** The embeddings provide a basis for generating data-driven team rankings.
Across these application domains, STEVE demonstrated superior performance compared to competing approaches evaluated in the study. This work provides a valuable tool for quantitative analysis in sports analytics, enabling various machine learning tasks related to team comparison and prediction. For a comprehensive description of the methodology and results, please refer to the publication by {% cite muller2020soccer %}.
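For illustration, a hedged sketch of how such team vectors can be used downstream; the embeddings and market values below are random stand-ins, not data from the paper.

```python
# Sketch: similarity search and market-value regression on team embeddings.
import numpy as np
from sklearn.neighbors import NearestNeighbors
from sklearn.linear_model import Ridge

teams = ["FC Bayern", "BVB", "Liverpool", "Ajax"]
vectors = np.random.rand(len(teams), 32)                 # learned embeddings
market_values = np.array([900.0, 600.0, 950.0, 400.0])   # stand-ins, M EUR

# Similarity search: nearest neighbours in embedding space.
index = NearestNeighbors(n_neighbors=2).fit(vectors)
_, idx = index.kneighbors(vectors[[0]])
print("most similar to", teams[0], "->", teams[idx[0][1]])

# Market-value estimation: embeddings as regression features.
model = Ridge().fit(vectors, market_values)
```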

View File

@ -0,0 +1,31 @@
---
layout: single
title: "3D Primitive Segmentation"
categories: research
tags: computer-vision 3d-processing point-clouds segmentation deep-learning genetic-algorithms
excerpt: "Hybrid method segments/fits primitives in large 3D point clouds."
header:
teaser: /assets/figures/4_point_cloud_segmentation_teaser.jpg
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
<center>
<img src="/assets/figures/4_point_cloud_pipeline.jpg" alt="Diagram illustrating the hybrid point cloud segmentation pipeline" style="display:block; width:100%">
<figcaption>Overview of the hybrid segmentation and primitive fitting pipeline.</figcaption>
</center><br>
This research addresses challenges in accurately segmenting large-scale 3D point clouds into meaningful geometric primitives, specifically spheres, cylinders, and cuboids. Existing methods often struggle with scalability or robustness when faced with diverse shapes and noisy real-world data.
We propose a novel **hybrid approach** that synergistically combines multiple techniques to overcome these limitations:
1. **Deep Learning Integration:** A learned component supports the early stages of the pipeline, e.g., pre-segmentation or feature extraction, ahead of the geometric fitting steps.
2. **RANSAC-based Primitive Fitting:** Employs the robust RANSAC algorithm for accurately fitting simpler geometric shapes like spheres and cylinders to subsets of the point cloud.
3. **DBSCAN Clustering:** Applied for grouping remaining points or refining segmentation boundaries, effectively handling noise and varying point densities.
4. **Specialized Genetic Algorithm:** A custom Genetic Algorithm is introduced specifically for the robust detection and fitting of cuboid primitives, which are often challenging for standard fitting methods.
This integrated pipeline demonstrates enhanced stability and robustness compared to methods relying on a single technique. It particularly excels in reconstructing the target primitives from large and complex point sets. The effectiveness of the approach is validated through quantitative performance metrics and qualitative visualizations, with a discussion acknowledging the method's scope and potential limitations. For a detailed technical description and evaluation, please refer to the publication by {% cite friedrich2020hybrid %}.
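As a simplified illustration of two of these stages (the deep-learning component and the cuboid genetic algorithm are omitted), the sketch below clusters a point cloud with DBSCAN and fits a sphere to one cluster with a small RANSAC loop:

```python
# Hedged sketch under simplified assumptions; parameters are illustrative.
import numpy as np
from sklearn.cluster import DBSCAN

def ransac_sphere(points, iters=200, tol=0.01):
    """Fit a sphere by random sampling; returns (center, radius) or None."""
    best_inliers, best_model = 0, None
    for _ in range(iters):
        sample = points[np.random.choice(len(points), 4, replace=False)]
        # Sphere |x - c|^2 = r^2 rewritten as |x|^2 = 2 c.x + d, linear in (c, d).
        A = np.hstack([2 * sample, np.ones((4, 1))])
        b = (sample ** 2).sum(axis=1)
        try:
            sol = np.linalg.solve(A, b)
        except np.linalg.LinAlgError:
            continue                      # degenerate (coplanar) sample
        c, d = sol[:3], sol[3]
        r2 = d + c @ c                    # since d = r^2 - |c|^2
        if r2 <= 0:
            continue
        r = np.sqrt(r2)
        inliers = np.abs(np.linalg.norm(points - c, axis=1) - r) < tol
        if inliers.sum() > best_inliers:
            best_inliers, best_model = int(inliers.sum()), (c, r)
    return best_model

cloud = np.random.rand(2000, 3)                        # stand-in point cloud
labels = DBSCAN(eps=0.1, min_samples=10).fit_predict(cloud)
cluster = cloud[labels == 0] if (labels == 0).any() else cloud
sphere = ransac_sphere(cluster)
```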
<center>
<img src="/assets/figures/4_point_cloud_segmentation.jpg" alt="Example result showing a point cloud segmented into different colored geometric primitives" style="display:block; width:890%">
<figcaption>Example segmentation result demonstrating primitive identification.</figcaption>
</center>

View File

@ -1,14 +0,0 @@
---
layout: single
title: "Point Cloud Segmentation"
categories: research
excerpt: "Segmetation of point clouds into primitive building blocks."
header:
teaser: assets/figures/4_point_cloud_segmentation_teaser.jpg
---
![Point Cloud Segmentation](/assets/figures/4_point_cloud_pipeline.jpg){:style="display:block; width:100%" .align-center}
This paper introduces a hybrid approach for segmenting and fitting solid primitives to 3D point clouds, overcoming limitations in handling large datasets and diverse primitive shapes. By integrating deep learning with RANSAC for primitive fitting, employing DBSCAN for clustering, and utilizing a specialized Genetic Algorithm for cuboid extraction, this method achieves enhanced stability and robustness. It excels in reconstructing spheres, cylinders, and cuboids from large point sets, with performance metrics and visualizations provided to demonstrate its effectiveness, alongside a discussion on its limitations. For more detailed insights, refer to {% cite friedrich2020hybrid %}.
![Point Cloud Segmentation](/assets/figures/4_point_cloud_segmentation.jpg){:style="display:block; width:80%" .align-center}

View File

@ -1,13 +0,0 @@
---
layout: single
title: "Policy Entropy for OOD Classification"
categories: research
excerpt: "PEOC for reliably detecting unencountered states in deep RL"
header:
teaser: assets/figures/6_ood_pipeline.jpg
---
![PEOC Performance](/assets/figures/6_ood_performance.jpg){:style="display:block; width:45%" .align-right}This work proposes PEOC, a policy entropy-based classifier for detecting unencountered states in deep reinforcement learning. Utilizing the agent's policy entropy as a score, PEOC effectively identifies out-of-distribution scenarios, which is crucial for ensuring safety in real-world applications. Evaluated against advanced one-class classifiers within procedurally generated environments, PEOC demonstrates competitive performance.
Additionally, a structured benchmarking process for out-of-distribution classification in reinforcement learning is presented, offering a comprehensive approach to evaluating such systems' reliability and effectiveness. {% cite sedlmeier2020policy %}
![PEOC Pipeline](/assets/figures/6_ood_pipeline.jpg){:style="display:block; width:90%" .align-center}

View File

@ -0,0 +1,29 @@
---
layout: single
title: "PEOC OOD Detection"
categories: research
tags: deep-reinforcement-learning out-of-distribution-detection ai-safety anomaly-detection
excerpt: "PEOC uses policy entropy for OOD detection in deep RL."
header:
teaser: /assets/figures/6_ood_pipeline.jpg
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
![Graph comparing PEOC performance against other OOD detection methods](/assets/figures/6_ood_performance.jpg)
{:style="display:block; width:45%" .align-right}
Ensuring the safety and reliability of deep reinforcement learning (RL) agents deployed in real-world environments necessitates the ability to detect when the agent encounters states significantly different from those seen during training (i.e., out-of-distribution or OOD states). This research introduces **PEOC (Policy Entropy-based OOD Classifier)**, a novel and computationally efficient method designed for this purpose.
The core idea behind PEOC is to leverage the entropy of the agent's learned policy as an intrinsic indicator of state familiarity. High policy entropy often correlates with uncertainty, suggesting the agent is in a less familiar or potentially OOD state. PEOC utilizes this readily available metric as a scoring function to distinguish between in-distribution and out-of-distribution inputs.
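A minimal sketch of this scoring idea; the threshold value is an assumption and would in practice be calibrated on in-distribution data.

```python
# Sketch: policy entropy as an OOD score, thresholded for classification.
import numpy as np

def policy_entropy(action_probs):
    p = np.clip(np.asarray(action_probs), 1e-12, 1.0)
    return float(-(p * np.log(p)).sum())

threshold = 1.2   # assumption: e.g., a high percentile of in-distribution entropies

def is_ood(action_probs):
    return policy_entropy(action_probs) > threshold

print(is_ood([0.97, 0.01, 0.01, 0.01]))  # confident policy -> in-distribution
print(is_ood([0.25, 0.25, 0.25, 0.25]))  # maximum entropy -> flagged as OOD
```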
PEOC's effectiveness was rigorously evaluated within procedurally generated environments, which allow for controlled introduction of novel states. Its performance was benchmarked against several state-of-the-art one-class classification methods adapted for the RL context. The results demonstrate that PEOC achieves competitive performance in identifying OOD states while being simple to implement and integrate into existing deep RL frameworks.
Furthermore, this work contributes a structured benchmarking process specifically designed for evaluating OOD classification methods within the context of reinforcement learning, providing a valuable framework for assessing the reliability of such safety-critical components. For a detailed methodology and evaluation, please refer to the publication by {% cite sedlmeier2020policy %}.
<div style="clear: both;"></div>
<figure style="display:block; width:90%; margin: 1em auto; text-align: center;">
<img src="/assets/figures/6_ood_pipeline.jpg" alt="Diagram showing the PEOC pipeline integrated with a deep RL agent" style="display:block; width:90%">
<figcaption>Conceptual pipeline of the PEOC method for OOD detection in deep RL.</figcaption>
</figure>

View File

@ -0,0 +1,31 @@
---
layout: single
title: "AV Meantime Coverage"
categories: research
tags: autonomous-vehicles shared-mobility transportation-systems urban-computing geoinformatics
excerpt: "Analyzing service coverage of parked AVs during downtime ('meantime')."
header:
teaser: /assets/figures/5_meantime_coverage.jpg
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
<center>
<img src="/assets/figures/5_meantime_coverage.jpg" alt="Map visualization showing estimated service coverage areas from parked autonomous vehicles" style="display:block; width:80%">
<figcaption>Visualization of estimated service coverage achievable by utilizing parked autonomous vehicles.</figcaption>
</center><br>
This research investigates a potential transitional model towards future transportation systems, focusing on **privately owned shared autonomous vehicles (SAVs)**. The central idea, termed "What to do in the Meantime," explores the feasibility of leveraging these vehicles for ride-sharing services during the significant portions of the day when they are typically parked and idle (e.g., while the owner is at work).
To assess the potential impact and viability of such a model, we developed and applied **two distinct reachability analysis methods**. These methods estimate the geographic area that could be effectively served by SAVs originating from their parking locations within given time constraints.
The analysis was conducted using a real-world dataset representing mobility patterns and parking durations in the greater **Munich metropolitan area**. Key findings reveal the significant influence of spatio-temporal factors on potential service coverage:
* **Time Dependency:** Service potential fluctuates considerably throughout the day, heavily impacted by rush hours which affect travel times and vehicle availability.
* **Location Dependency:** Marked differences in coverage potential were observed between dense urban centers and more dispersed suburban areas.
This study provides quantitative insights into the opportunities and limitations of utilizing the "meantime" of privately owned autonomous vehicles, contributing to the understanding of how future shared mobility systems might evolve. {% cite illium2020meantime %}
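As a toy illustration of the reachability idea (real road-network travel times would replace the crude travel-time discs used here):

```python
# Hedged sketch: approximate the area serviceable from each parking location
# as a travel-time disc and estimate the union's area by Monte Carlo.
import numpy as np

def coverage_km2(parking_xy, speed_kmh=30.0, budget_min=10.0, box=20.0):
    """Monte-Carlo union of travel-time discs inside a (2*box)^2 km area."""
    radius_km = speed_kmh * budget_min / 60.0
    pts = np.random.uniform(-box, box, size=(100_000, 2))
    covered = np.zeros(len(pts), dtype=bool)
    for p in parking_xy:
        covered |= np.linalg.norm(pts - p, axis=1) <= radius_km
    return covered.mean() * (2 * box) ** 2

parked = np.array([[0.0, 0.0], [5.0, 3.0], [-8.0, 6.0]])   # stand-in locations
print(f"~{coverage_km2(parked):.0f} km^2 reachable within 10 minutes")
```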
<center>
<img src="/assets/figures/5_meantime_availability.jpg" alt="Graph or map showing the temporal or spatial availability of parked vehicles" style="display:block; width:80%">
<figcaption>Analysis of spatio-temporal availability patterns of potentially shareable parked vehicles.</figcaption>
</center>

View File

@ -1,15 +0,0 @@
---
layout: single
title: "What to do in the Meantime"
categories: research
excerpt: "Service Coverage Analysis for Parked Autonomous Vehicles"
header:
teaser: assets/figures/5_meantime_coverage.jpg
---
![Estimated Service Coverage](/assets/figures/5_meantime_coverage.jpg){:style="display:block; width:80%" .align-center}
This analysis explores the concept of privately owned shared autonomous vehicles as a transitional phase towards a new transportation paradigm. It proposes two reachability analysis methods to assess the impact of utilizing privately owned cars during their typical long parking intervals, such as during an owner's work hours. By applying these methods to a dataset from the Munich area, the study reveals how time and location-dependent factors, like rush hours and urban vs. suburban differences, affect service coverage.
{% cite illium2020meantime %}
![Parked Vehicle Availability](/assets/figures/5_meantime_availability.jpg){:style="display:block; width:80%" .align-center}

View File

@ -1,14 +0,0 @@
---
layout: single
title: "Surgical Mask Detection"
categories: research audio deep-learning
excerpt: "Convolutional Neural Networks and Data Augmentations on Spectrograms"
header:
teaser: assets/figures/7_mask_models.jpg
---
![Mel-Spectrograms](/assets/figures/7_mask_mels.jpg){:style="display:block; width:80%" .align-center}
This study assesses the effectiveness of data augmentation in enhancing neural network models for audio data classification, focusing on mel-spectrogram representations. Specifically, it examines the role of data augmentation in improving the performance of convolutional neural networks for detecting the presence of surgical masks from human voice samples, testing across four different network architectures. The findings indicate a significant enhancement in model performance, surpassing many of the existing benchmarks established by the ComParE challenge. For further details, refer to {% cite illium2020surgical %}.
![Models](/assets/figures/7_mask_models.jpg){:style="display:block; width:80%" .align-center}

View File

@ -0,0 +1,26 @@
---
layout: single
title: "Surgical-Mask Detection"
categories: research
tags: audio-classification deep-learning data-augmentation computer-vision paralinguistics
excerpt: "CNN mask detection in speech using augmented spectrograms."
header:
teaser: /assets/figures/7_mask_models.jpg
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
This study investigates the efficacy of various **data augmentation techniques** applied directly to **mel-spectrogram representations** of audio data for improving classification performance. The specific task addressed is the detection of surgical mask usage based on human speech signals, a relevant problem in paralinguistics and audio analysis.
We systematically evaluated the impact of data augmentation when training **Convolutional Neural Networks (CNNs)** for this binary classification task. The input to the networks consisted of mel-spectrograms derived from voice samples. The effectiveness of augmentation strategies (such as frequency masking, time masking, or combined approaches like SpecAugment) was assessed across **four different CNN architectures**.
<center>
<img src="/assets/figures/7_mask_mels.jpg" alt="Examples of mel-spectrograms of speech with and without a surgical mask" style="display:block; width:80%">
<figcaption>Mel-spectrogram representations of speech signals used as input for CNNs.</figcaption>
</center><br>
The core finding of this research is that applying appropriate data augmentation directly to the spectrogram inputs significantly enhances the performance and generalization capabilities of the CNN models for surgical mask detection. The augmented models demonstrated improved accuracy, robustness, and notably **surpassed many established benchmark results** from the relevant ComParE (Computational Paralinguistics Challenge) tasks. This highlights the importance of data augmentation as a crucial component in building effective deep learning models for audio classification, particularly when dealing with limited or variable datasets. For a detailed description of the methods and results, please refer to {% cite illium2020surgical %}.
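A hedged sketch of the kind of spectrogram-level masking augmentation referred to above; mask sizes are illustrative.

```python
# Sketch of SpecAugment-family masking applied directly to a mel-spectrogram.
import numpy as np

def augment_mel(mel, max_f=8, max_t=20, rng=np.random):
    """Apply one random frequency mask and one random time mask."""
    mel = mel.copy()
    f0 = rng.randint(0, mel.shape[0] - max_f)          # frequency mask
    mel[f0:f0 + rng.randint(1, max_f + 1), :] = mel.mean()
    t0 = rng.randint(0, mel.shape[1] - max_t)          # time mask
    mel[:, t0:t0 + rng.randint(1, max_t + 1)] = mel.mean()
    return mel

spectrogram = np.random.rand(64, 256)   # (mel_bins, frames) stand-in
augmented = augment_mel(spectrogram)
```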
<center>
<img src="/assets/figures/7_mask_models.jpg" alt="Diagrams illustrating the different CNN architectures tested" style="display:block; width:100%">
<figcaption>Overview of the different Convolutional Neural Network architectures evaluated.</figcaption>
</center>

View File

@ -1,12 +0,0 @@
---
layout: single
title: "Anomalous Sound Detection"
categories: research audio deep-learning anomaly-detection
excerpt: "Analysis of Feature Representations for Anomalous Sound Detection"
header:
teaser: assets/figures/8_anomalous_sound_teaser.jpg
---
![Pipeline](/assets/figures/8_anomalous_sound_features.jpg){:style="display:block; width:40%" .align-right}
This study explores the use of pretrained neural networks as feature extractors for detecting anomalous sounds, utilizing these networks to derive semantically rich features for a Gaussian Mixture Model that estimates normality. It examines extractors trained on diverse data domains—images, environmental sounds, and music—applied to industrial noises from machinery. Surprisingly, features based on music data often surpass others, including an autoencoder baseline, suggesting that domain similarity between extractor training and application might not always correlate with performance improvement.
{% cite muller2020analysis %}

View File

@ -0,0 +1,30 @@
---
layout: single
title: "Anomalous Sound Features"
categories: research
tags: anomaly-detection audio-classification deep-learning transfer-learning feature-extraction
excerpt: "Pretrained networks extract features for anomalous industrial sound detection."
header:
teaser: /assets/figures/8_anomalous_sound_teaser.jpg
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
![Diagram showing features extracted by different pretrained networks visualized in a latent space](\assets\figures\8_anomalous_sound_features.jpg)
{:style="display:block; width:40%" .align-right}
Detecting anomalous sounds, particularly in industrial settings, is crucial for predictive maintenance and safety. This often involves unsupervised or semi-supervised approaches where models learn a representation of 'normal' sounds. This research explores the effectiveness of leveraging **transfer learning** for this task by using **pretrained deep neural networks** as fixed feature extractors.
The core methodology involves:
1. Taking pretrained networks trained on large datasets from various domains.
2. Using these networks to extract high-level, potentially semantically rich feature vectors from audio signals (specifically, industrial machine noises relevant to challenges like DCASE - Detection and Classification of Acoustic Scenes and Events).
3. Modeling the distribution of features extracted from 'normal' sounds using a **Gaussian Mixture Model (GMM)**.
4. Identifying anomalous sounds as those whose extracted features have low likelihood under the learned normality model.
A key aspect of this study was comparing feature extractors pretrained on distinctly different domains:
* **Images** (e.g., models trained on ImageNet)
* **Environmental Sounds** (e.g., models trained on AudioSet or ESC-50)
* **Music** (e.g., models trained on music tagging datasets)
These were evaluated alongside a baseline autoencoder trained directly on the target machine sound data.
Surprisingly, the results indicated that features derived from networks pretrained on **music data** often yielded the best performance in detecting anomalous industrial sounds, frequently surpassing features from environmental sound models and the autoencoder baseline. This counter-intuitive finding suggests that direct domain similarity between the pretraining dataset and the target application data is not necessarily the primary factor determining the utility of transferred features for anomaly detection. {% cite muller2020analysis %}
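A hedged sketch of the features-plus-GMM recipe: the torchvision ResNet below stands in for the image/audio/music extractors compared in the paper, and the diagonal-covariance GMM is a simplifying assumption.

```python
# Sketch: fixed pretrained backbone as feature extractor, GMM on 'normal'
# embeddings, anomaly score = negative log-likelihood.
import torch
from torchvision.models import resnet18
from sklearn.mixture import GaussianMixture

backbone = resnet18(weights="IMAGENET1K_V1")
backbone.fc = torch.nn.Identity()       # expose the 512-d embedding
backbone.eval()

def embed(batch):                       # spectrograms tiled to 3 channels
    with torch.no_grad():
        return backbone(batch).numpy()

normal = embed(torch.rand(64, 3, 224, 224))          # stand-in 'normal' sounds
gmm = GaussianMixture(n_components=4, covariance_type="diag").fit(normal)

test = embed(torch.rand(8, 3, 224, 224))
scores = -gmm.score_samples(test)       # higher = more anomalous
```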

View File

@ -1,14 +0,0 @@
---
layout: single
title: "Anomalous Image Transfer"
categories: research audio deep-learning anomaly-detection
excerpt: "Acoustic Anomaly Detection for Machine Sounds based on Image Transfer Learning"
header:
teaser: assets/figures/9_image_transfer_sound_teaser.jpg
---
![Workflow](/assets/figures/9_image_transfer_sound_workflow.jpg){:style="display:block; width:45%" .align-right}
This paper explores acoustic malfunction detection in industrial machinery using transfer learning, specifically leveraging neural networks pretrained on image classification to extract features.
These features, when used with anomaly detection models, outperform traditional convolutional autoencoders in noisy conditions across different machine types. The study highlights the superiority of features from ResNet architectures over AlexNet and Squeezenet, with Gaussian Mixture Models and One-Class Support Vector Machines showing the best performance in detecting anomalies.
{% cite muller2020acoustic %}
![Mels](/assets/figures/9_image_transfer_sound_mels.jpg){:style="display:block; width:85%" .align-center}

View File

@ -0,0 +1,38 @@
---
layout: single
title: "Sound Anomaly Transfer"
categories: research
tags: anomaly-detection audio-classification deep-learning transfer-learning feature-extraction computer-vision
excerpt: "Image nets detect acoustic anomalies in machinery via spectrograms."
header:
teaser: /assets/figures/9_image_transfer_sound_teaser.jpg
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
![Workflow diagram showing mel-spectrogram input, feature extraction via image network, and anomaly detection model](/assets/figures/9_image_transfer_sound_workflow.jpg)
{:style="display:block; width:45%" .align-right}
This study investigates an effective approach for **acoustic anomaly detection** in industrial machinery, focusing on identifying malfunctions through sound analysis. The core methodology leverages **transfer learning** by repurposing deep neural networks originally trained for large-scale **image classification** (e.g., on ImageNet) as powerful feature extractors for audio data represented as **mel-spectrograms**.
The process involves:
1. Converting audio signals from machinery into mel-spectrogram images.
2. Feeding these spectrograms into various pretrained image classification networks (specifically comparing **ResNet architectures** against **AlexNet** and **SqueezeNet**) to extract deep feature representations.
3. Training standard anomaly detection models, particularly **Gaussian Mixture Models (GMMs)** and **One-Class Support Vector Machines (OC-SVMs)**, on the features extracted from normal operation sounds.
4. Classifying new sounds as anomalous if their extracted features deviate significantly from the learned normality model.
Key findings from the experiments, conducted across different machine types and noise conditions, include:
* The proposed transfer learning approach significantly **outperforms baseline methods like traditional convolutional autoencoders**, especially in the presence of background noise.
* Features extracted using **ResNet architectures consistently yielded superior anomaly detection performance** compared to those from AlexNet and SqueezeNet.
* **GMMs and OC-SVMs proved highly effective** as anomaly detection classifiers when applied to these transferred features.
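For illustration, a minimal sketch of the OC-SVM variant of this recipe, under the assumption that 512-dimensional ResNet features have already been extracted:

```python
# Sketch: one-class SVM trained on features of normal machine sounds.
import numpy as np
from sklearn.svm import OneClassSVM

normal_features = np.random.rand(200, 512)    # stand-in ResNet features
ocsvm = OneClassSVM(nu=0.05, kernel="rbf").fit(normal_features)

test_features = np.random.rand(10, 512)
flags = ocsvm.predict(test_features)          # -1 = anomalous, +1 = normal
```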
<div style="clear: both;"></div>
<center>
<img src="/assets/figures/9_image_transfer_sound_mels.jpg" alt="Examples of mel-spectrograms from normal and anomalous machine sounds" style="display:block; width:85%">
<figcaption>Mel-spectrogram examples of normal vs. anomalous machine sounds.</figcaption>
</center>
This work demonstrates the surprising effectiveness of transferring knowledge from the visual domain to the acoustic domain for anomaly detection, offering a robust and readily implementable method for monitoring industrial equipment. {% cite muller2020acoustic %}

View File

@ -1,15 +0,0 @@
---
layout: single
title: "Acoustic Leak Detection"
categories: research audio deep-learning anomaly-detection
excerpt: "Anomalie based Leak Detection in Water Networks"
header:
teaser: assets/figures/10_water_networks_teaser.jpg
---
![Approach](/assets/figures/10_water_networks_approach.jpg){:style="display:block; width:40%" .align-right}
This study introduces a method for acoustic leak detection in water networks, focusing on energy efficiency and easy deployment. Utilizing recordings from microphones on a municipal water network, various anomaly detection models, both shallow and deep, were trained. The approach mimics human leak detection methods, allowing intermittent monitoring instead of constant surveillance. While detecting nearby leaks proved easy for most models, neural network-based methods excelled at identifying leaks from a distance, showcasing their effectiveness in practical applications.
{% cite muller2021acoustic %}
![Leak-Mels](/assets/figures/10_water_networks_mel.jpg){:style="display:block; width:85%" .align-center}

View File

@ -0,0 +1,30 @@
---
layout: single
title: "Acoustic Leak Detection"
categories: research
tags: anomaly-detection audio-processing deep-learning signal-processing real-world-application
excerpt: "Anomaly detection models for acoustic leak detection in water networks."
header:
teaser: /assets/figures/10_water_networks_teaser.jpg
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
![Diagram illustrating the anomaly detection approach for leak detection](/assets/figures/10_water_networks_approach.jpg)
{:style="display:block; width:40%" .align-right}
Detecting leaks in vast municipal water distribution networks is critical for resource conservation and infrastructure maintenance. This study introduces and evaluates an **anomaly detection approach for acoustic leak identification**, specifically designed with **energy efficiency** and **ease of deployment** as key considerations.
The methodology leverages acoustic recordings captured by microphones deployed directly on a section of a real-world **municipal water network**. Instead of requiring continuous monitoring, the proposed system mimics human inspection routines by performing **intermittent checks**, significantly reducing power consumption and data load.
Various **anomaly detection models**, ranging from traditional "shallow" methods (e.g., GMMs, OC-SVMs) to more complex **deep learning architectures** (e.g., autoencoders, potentially CNNs on spectrograms), were trained using data representing normal network operation. These models were then evaluated on their ability to distinguish anomalous sounds indicative of leaks.
Key findings include:
* Detecting leaks occurring acoustically **nearby** the sensor proved relatively straightforward for most evaluated models.
* **Neural network-based methods demonstrated superior performance** in identifying leaks originating **further away** from the sensor, showcasing their ability to capture more subtle acoustic signatures amidst background noise.
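As a minimal illustration of the deep variant (an autoencoder scored by reconstruction error; sizes and data are stand-ins):

```python
# Sketch: train an autoencoder on 'normal' network sounds, then score clips
# by reconstruction error during intermittent checks.
import torch
import torch.nn as nn

ae = nn.Sequential(nn.Linear(64, 16), nn.ReLU(), nn.Linear(16, 64))
opt = torch.optim.Adam(ae.parameters(), lr=1e-3)
normal = torch.rand(512, 64)            # stand-in spectrogram frames

for _ in range(200):
    loss = nn.functional.mse_loss(ae(normal), normal)
    opt.zero_grad(); loss.backward(); opt.step()

def leak_score(frame):                  # called only at check intervals
    with torch.no_grad():
        return nn.functional.mse_loss(ae(frame), frame).item()
```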
<center>
<img src="/assets/figures/10_water_networks_mel.jpg" alt="Mel-spectrogram examples showing acoustic signatures of normal operation versus leak sounds" style="display:block; width:90%">
<figcaption>Mel-spectrogram visualizations comparing normal sounds and leak-related acoustic patterns.</figcaption>
</center><br>
This research validates the feasibility of using anomaly detection for practical, energy-efficient acoustic leak monitoring in water networks, highlighting the advantages of deep learning techniques for detecting more challenging, distant leaks. {% cite muller2021acoustic %}

View File

@ -1,15 +0,0 @@
---
layout: single
title: "Primate Vocalization Classification"
categories: research audio deep-learning anomaly-detection
excerpt: "A Deep and Recurrent Architecture"
header:
teaser: assets/figures/11_recurrent_primate_workflow.jpg
---
![Primate Classification Workflow](/assets/figures/11_recurrent_primate_workflow.jpg){:style="display:block; width:40%" .align-right}
This study introduces a deep, recurrent architecture for classifying primate vocalizations, leveraging bidirectional Long Short-Term Memory networks and advanced techniques like normalized softmax and focal loss. Bayesian optimization was used to fine-tune hyperparameters, and the model was evaluated on a dataset of primate calls from an African sanctuary, showcasing the effectiveness of acoustic monitoring in wildlife conservation efforts.
{% cite muller2021deep %}
![Classification Results](/assets/figures/11_recurrent_primate_results.jpg){:style="display:block; width:85%" .align-center}

View File

@ -0,0 +1,30 @@
---
layout: single
title: "Primate Vocalization Classification"
categories: research
tags: deep-learning audio-classification bioacoustics conservation-technology recurrent-neural-networks
excerpt: "Deep BiLSTM classifies primate vocalizations for acoustic wildlife monitoring."
header:
teaser: /assets/figures/11_recurrent_primate_workflow.jpg
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
![Workflow diagram showing audio input, feature extraction, BiLSTM processing, and classification output](/assets/figures/11_recurrent_primate_workflow.jpg)
{:style="display:block; width:40%" .align-right}
Acoustic monitoring offers a powerful, non-invasive tool for wildlife conservation, enabling the study and tracking of animal populations through their vocalizations. This research focuses on improving the automated classification of **primate vocalizations**, a challenging task due to call variability and environmental noise.
We propose a novel **deep, recurrent neural network architecture** specifically designed for this purpose. The core of the model utilizes **bidirectional Long Short-Term Memory (BiLSTM) networks**, which are adept at capturing temporal dependencies within the audio signals (represented, for example, as spectrograms or MFCCs).
To further enhance classification performance, particularly in potentially imbalanced datasets common in bioacoustics, the architecture incorporates advanced techniques:
* **Normalized Softmax:** Improves calibration and potentially robustness.
* **Focal Loss:** Addresses class imbalance by focusing training on hard-to-classify examples.
Hyperparameter tuning, a critical step for optimizing deep learning models, was systematically performed using **Bayesian optimization**.
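A hedged sketch of this architecture family; layer sizes and the focal-loss gamma are illustrative assumptions, not the tuned values from the paper.

```python
# Sketch: BiLSTM over spectrogram frames, trained with a focal-loss objective.
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    def __init__(self, n_mels=64, hidden=128, n_classes=6):
        super().__init__()
        self.lstm = nn.LSTM(n_mels, hidden, batch_first=True,
                            bidirectional=True)
        self.head = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):               # x: (batch, frames, mel_bins)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])    # features at the final frame

def focal_loss(logits, targets, gamma=2.0):
    ce = nn.functional.cross_entropy(logits, targets, reduction="none")
    p_t = torch.exp(-ce)                # probability of the true class
    return ((1 - p_t) ** gamma * ce).mean()

model = BiLSTMClassifier()
logits = model(torch.rand(4, 100, 64))
loss = focal_loss(logits, torch.tensor([0, 1, 2, 3]))
```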
<center>
<img src="/assets/figures/11_recurrent_primate_results.jpg" alt="Graph or table showing classification accuracy or confusion matrix for primate calls" style="display:block; width:90%">
<figcaption>Performance results demonstrating classification accuracy.</figcaption>
</center><br>
The model's effectiveness was evaluated on a challenging real-world dataset comprising diverse primate calls recorded at an **African wildlife sanctuary**. The results demonstrate the capability of the proposed deep recurrent architecture for accurate primate vocalization classification, underscoring the potential of advanced deep learning techniques combined with automated acoustic monitoring for practical wildlife conservation efforts. {% cite muller2021deep %}

View File

@ -1,15 +0,0 @@
---
layout: single
title: "Mel-Vision Transformer"
categories: research audio deep-learning anomaly-detection
excerpt: "Attention based audio classification on Mel-Spektrograms"
header:
teaser: assets/figures/12_vision_transformer_teaser.jpg
---
![Approach](/assets/figures/12_vision_transformer_models.jpg){:style="display:block; width:80%" .align-center}
This work utilizes the vision transformer model on mel-spectrogram audio data, enhanced by mel-based data augmentation and sample weighting, to achieve notable performance in the ComParE21 challenge, surpassing many single model baselines. The introduction of overlapping vertical patching and the analysis of parameter configurations further refine the approach, demonstrating the model's adaptability and effectiveness in audio processing tasks.
{% cite illium2021visual %}

View File

@ -0,0 +1,29 @@
---
layout: single
title: "Audio Vision Transformer"
categories: research
tags: deep-learning audio-classification computer-vision attention-mechanisms transformers
excerpt: "Vision Transformer on spectrograms for audio classification, with data augmentation."
header:
teaser: /assets/figures/12_vision_transformer_teaser.jpg
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
This research explores the application of the **Vision Transformer (ViT)** architecture, originally designed for image processing, to the domain of audio classification by operating on **mel-spectrogram representations**. The ViT's attention mechanisms offer a potentially powerful alternative to convolutional approaches for capturing relevant patterns in spectrogram data.
<center>
<img src="/assets/figures/12_vision_transformer_models.jpg" alt="Diagram illustrating the Vision Transformer architecture adapted for mel-spectrogram input" style="display:block; width:80%">
<figcaption>Adapting the Vision Transformer architecture for processing mel-spectrograms.</figcaption>
</center><br>
Key aspects of the methodology include:
* **ViT Adaptation:** Applying the ViT model directly to mel-spectrograms treated as images.
* **Data Augmentation:** Employing **mel-based data augmentation** techniques (e.g., SpecAugment variants) to improve model robustness and generalization.
* **Sample Weighting:** Utilizing sample weighting strategies to address potential class imbalances or focus on specific aspects of the dataset.
* **Patching Strategy:** Introducing and evaluating an **overlapping vertical patching** method, potentially better suited for capturing temporal structures in spectrograms compared to standard non-overlapping patches.
The effectiveness of this "Mel-Vision Transformer" approach was demonstrated within the context of the **ComParE 2021 (Computational Paralinguistics Challenge)**. The proposed model achieved notable performance, **surpassing many established single-model baseline results** on the challenge tasks.
Furthermore, the study includes an analysis of different parameter configurations and architectural choices, providing insights into optimizing ViT models for audio processing tasks. This work showcases the adaptability and potential of transformer architectures, particularly ViT, for effectively tackling audio classification challenges. {% cite illium2021visual %}
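For illustration, a minimal sketch of the overlapping vertical patching idea: full-height strips along the time axis, extracted with a stride smaller than the strip width (sizes are illustrative):

```python
# Sketch: overlapping vertical strips of a mel-spectrogram as transformer tokens.
import torch

def vertical_patches(mel, width=16, stride=8):
    # mel: (mel_bins, frames) -> (n_patches, mel_bins * width)
    strips = mel.unfold(dimension=1, size=width, step=stride)  # (bins, n, width)
    return strips.permute(1, 0, 2).reshape(strips.shape[1], -1)

tokens = vertical_patches(torch.rand(64, 256))
print(tokens.shape)   # each row is one strip, fed to the transformer encoder
```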

View File

@ -1,13 +0,0 @@
---
layout: single
title: "Self-Replication Goals"
categories: research audio deep-learning anomaly-detection
excerpt: "Combining replication and auxiliary task for neural networks."
header:
teaser: assets/figures/13_sr_teaser.jpg
---
![Self-Replicator Analysis](/assets/figures/13_sr_analysis.jpg){:style="display:block; width:80%" .align-center}
This research delves into the innovative concept of self-replicating neural networks capable of performing secondary tasks alongside their primary replication function. By employing separate input/output vectors for dual-task training, the study demonstrates that additional tasks can complement and even stabilize self-replication. The dynamics within an artificial chemistry environment are explored, examining how varying action parameters affect the collective learning capability and how a specially developed 'guiding particle' can influence peers towards achieving goal-oriented behaviors, illustrating a method for steering network populations towards desired outcomes.
{% cite gabor2021goals %}

View File

@ -0,0 +1,26 @@
---
layout: single
title: "Tasked Self-Replication"
categories: research
tags: artificial-life complex-systems neural-networks self-organization multi-task-learning
excerpt: "Self-replicating networks perform tasks, exploring stabilization in artificial chemistry."
header:
teaser: /assets/figures/13_sr_teaser.jpg
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
Building upon the concept of self-replicating neural networks, this research explores the integration of **auxiliary functional goals** alongside the primary objective of self-replication. The aim is to create networks that can not only reproduce their own weights but also perform useful computations or interact meaningfully with an environment simultaneously.
<center>
<img src="/assets/figures/13_sr_analysis.jpg" alt="Analysis graphs or visualizations related to dual-task self-replicating networks" style="display:block; width:80%">
<figcaption>Analysis of networks balancing self-replication and auxiliary tasks.</figcaption>
</center><br>
The study introduces a methodology for **dual-task training**, utilizing distinct input/output vectors to manage both the replication process and the execution of a secondary task. A key finding is that the presence of an auxiliary task does not necessarily hinder self-replication; instead, it can sometimes **complement and even stabilize** the replication dynamics.
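A hedged sketch of such dual-task training; the weight "addressing" scheme and the loss weighting below are assumptions, not the paper's exact setup.

```python
# Sketch: a shared body with two heads, one regressing the model's own
# (flattened) weights, one solving an auxiliary task; losses are combined.
import torch
import torch.nn as nn

body = nn.Linear(4, 32)
repl_head = nn.Linear(32, 1)    # predicts one own-weight value per query
task_head = nn.Linear(32, 1)    # auxiliary regression task
params = (list(body.parameters()) + list(repl_head.parameters())
          + list(task_head.parameters()))
opt = torch.optim.Adam(params, lr=1e-3)

def own_weights():
    return torch.cat([p.detach().flatten() for p in params])

for step in range(500):
    w = own_weights()                                   # moving targets
    pos = torch.linspace(0, 1, len(w)).unsqueeze(1)     # one 'address' per weight
    queries = torch.cat([pos, torch.zeros(len(w), 3)], dim=1)
    repl_pred = repl_head(torch.tanh(body(queries))).squeeze(1)
    repl_loss = nn.functional.mse_loss(repl_pred, w)
    x = torch.rand(64, 4)                               # auxiliary task batch
    aux_pred = task_head(torch.tanh(body(x))).squeeze(1)
    aux_loss = nn.functional.mse_loss(aux_pred, x.sum(dim=1))
    loss = repl_loss + 0.5 * aux_loss
    opt.zero_grad(); loss.backward(); opt.step()
```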
Further investigations were conducted within the framework of an **"artificial chemistry" environment**, where populations of these dual-task networks interact:
* The impact of varying **action parameters** (related to the secondary task) on the collective learning or emergent behavior of the network population was examined.
* A concept of a specially designed **"guiding particle"** network was introduced. This network influences its peers, demonstrating a mechanism for potentially steering the population's evolution towards desired goal-oriented behaviors.
This work provides insights into how functional complexity can be integrated with self-replication in computational systems, offering potential pathways for developing more sophisticated artificial life models and exploring guided evolution within network populations. {% cite gabor2021goals %}

View File

@ -1,14 +0,0 @@
---
layout: single
title: "Anomaly Detection in RL"
categories: research audio deep-learning anomaly-detection
excerpt: "Towards Anomaly Detection in Reinforcement Learning"
header:
teaser: assets/figures/14_ad_rl_teaser.jpg
---
This work investigates anomaly detection (AD) within reinforcement learning (RL), highlighting its importance in safety-critical applications due to the complexity of sequential decision-making in RL. The study criticizes the simplicity of current AD research scenarios in RL, connecting AD to lifelong RL and generalization, discussing their interrelations and potential mutual benefits. It identifies non-stationarity as a crucial area for future AD research in RL, proposing a formal approach through the block contextual Markov decision process and outlining practical requirements for future studies.
{% cite muller2022towards %}
![Formal Definition](/assets/figures/14_ad_rl.jpg){:style="display:block; width:50%" .align-center}

View File

@ -0,0 +1,26 @@
---
layout: single
title: "RL Anomaly Detection"
categories: research
tags: reinforcement-learning anomaly-detection ai-safety lifelong-learning generalization
excerpt: "Perspective on anomaly detection challenges and future in reinforcement learning."
header:
teaser: /assets/figures/14_ad_rl_teaser.jpg
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
Anomaly Detection (AD) is crucial for the safe deployment of Reinforcement Learning (RL) agents, especially in safety-critical applications where encountering unexpected or out-of-distribution situations can lead to catastrophic failures. This work provides a perspective on the state and future directions of AD research specifically tailored for the complexities inherent in RL.
The paper argues that current AD research within RL often relies on overly simplified scenarios that do not fully capture the challenges of sequential decision-making under uncertainty. It establishes important conceptual connections between AD and other critical areas of RL research:
* **Lifelong Reinforcement Learning:** AD is framed as a necessary component for agents that must continually adapt to changing environments and tasks. Detecting anomalies signals the need for adaptation or learning updates.
* **Generalization:** The ability to detect anomalies is closely related to an agent's generalization capabilities; anomalies often represent situations outside the agent's learned experience manifold.
The study highlights **non-stationarity** (i.e., changes in the environment dynamics or reward structure over time) as a particularly critical and under-explored challenge for AD in RL. To address this formally, the paper proposes utilizing the framework of **block contextual Markov decision processes (BCMDPs)** as a suitable model for defining and analyzing non-stationary anomalies.
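For orientation, one common way to write such an object (a hedged notation sketch, not taken verbatim from the paper):

```latex
% Hedged sketch: a block contextual MDP as a context-indexed family of MDPs
% whose context is piecewise constant over blocks of time steps.
\[
  \mathcal{M}_{\mathrm{BC}}
    = \bigl\{\, (\mathcal{S},\, \mathcal{A},\, p^{c},\, r^{c})
      \;\bigm|\; c \in \mathcal{C} \,\bigr\},
  \qquad
  c_{t+1} = c_{t} \ \text{within a block (changes only at block boundaries).}
\]
```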
<center>
<img src="/assets/figures/14_ad_rl.jpg" alt="Mathematical formalism or diagram related to the block contextual MDP framework" style="display:block; width:50%">
<figcaption>Formalizing non-stationary anomalies using the BCMDP framework.</figcaption>
</center>
Finally, it outlines practical requirements and desiderata for future research in this area, advocating for more rigorous evaluation protocols and benchmark environments to advance the development of robust and reliable AD methods for RL agents. {% cite muller2022towards %}

View File

@ -1,15 +0,0 @@
---
layout: single
title: "Self-Replication in NNs"
categories: research audio deep-learning anomaly-detection
excerpt: "Elaboration and journal article of the initial paper"
header:
teaser: assets/figures/15_sr_journal_teaser.jpg
---
![Children Evolution](/assets/figures/15_sr_journal_children.jpg){:style="display:block; width:65%" .align-center}
This study extends previous work on self-replicating neural networks, focusing on backpropagation as a mechanism for facilitating non-trivial self-replication. It delves into the robustness of these self-replicators against noise and introduces artificial chemistry environments to observe emergent behaviors. Additionally, it provides a detailed analysis of fixpoint weight configurations and their attractor basins, enhancing the understanding of self-replication dynamics within neural networks.
{% cite gabor2022self %}
![Noise Levels](/assets/figures/15_noise_levels.jpg){:style="display:block; width:65%" .align-center}

View File

@ -0,0 +1,30 @@
---
layout: single
title: "Extended Self-Replication"
categories: research
tags: artificial-life complex-systems neural-networks self-organization dynamical-systems
excerpt: "Journal extension: self-replication, noise robustness, emergence, dynamical system analysis."
header:
teaser: /assets/figures/15_sr_journal_teaser.jpg
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
<center>
<img src="/assets/figures/15_sr_journal_children.jpg" alt="Visualization showing the evolution or diversity of 'child' networks generated through self-replication" style="display:block; width:65%">
<figcaption>Analyzing the lineage and diversity in populations of self-replicating networks.</figcaption>
</center><br>
This journal article provides an extended and more in-depth exploration of self-replicating neural networks, building upon earlier foundational work {% cite gabor2019self %}. The research further investigates the use of **backpropagation-like mechanisms** not for typical supervised learning, but as an effective means to enable **non-trivial self-replication**, where networks learn to reproduce their own connection weights.
Key extensions and analyses presented in this work include:
* **Robustness Analysis:** A systematic evaluation of the self-replicating networks' resilience and stability when subjected to various levels of **noise** during the replication process.
* **Artificial Chemistry Environments:** Further development and analysis of simulated environments where populations of self-replicating networks interact, leading to observable **emergent collective behaviors** and ecosystem dynamics.
* **Dynamical Systems Perspective:** A detailed theoretical analysis of the self-replication process viewed as a dynamical system. This includes identifying **fixpoint weight configurations** (networks that perfectly replicate themselves) and characterizing their **attractor basins** (the regions in weight space from which networks converge towards a specific fixpoint).
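As a toy illustration of the attractor-basin idea, the sketch below perturbs a fixpoint and applies a contraction map repeatedly; the map is a stand-in for one replication step near an attracting fixpoint, not the trained network itself.

```python
# Toy sketch: noise robustness of an attracting fixpoint under repeated
# self-application, with a contraction map standing in for replication.
import numpy as np

def self_apply(w, w_fix, contraction=0.5):
    # Deviation from the fixpoint shrinks by a constant factor per step.
    return w_fix + contraction * (w - w_fix)

w_fix = np.random.rand(50)                  # a hypothetical fixpoint
for sigma in [0.001, 0.01, 0.1]:
    w = w_fix + np.random.normal(0.0, sigma, size=w_fix.shape)
    for _ in range(20):                     # repeated self-application
        w = self_apply(w, w_fix)
    print(f"noise {sigma}: residual {np.linalg.norm(w - w_fix):.2e}")
```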
<center>
<img src="/assets/figures/15_noise_levels.jpg" alt="Graph showing the impact of different noise levels on self-replication fidelity or population dynamics" style="display:block; width:65%">
<figcaption>Investigating the influence of noise on the self-replication process.</figcaption>
</center><br>
By delving deeper into the mechanisms, robustness, emergent properties, and underlying dynamics, this study significantly enhances the understanding of how self-replication can be achieved and analyzed within neural network models, contributing valuable insights to the fields of artificial life and complex systems. {% cite gabor2022self %}

View File

@ -0,0 +1,34 @@
---
layout: single
title: "Organism Network Emergence"
categories: research
tags: artificial-life complex-systems neural-networks self-organization emergent-computation
excerpt: "Self-replicating networks collaborate forming higher-level Organism Networks with emergent functionalities."
header:
teaser: /assets/figures/16_on_teaser.jpg
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
This research investigates the transition from simple self-replication to higher levels of organization by exploring how populations of basic, self-replicating neural network units can form **"Organism Networks" (ONs)** through **collaboration and emergent differentiation**. Moving beyond the replication of individual networks, the focus shifts to the collective dynamics and functional capabilities that arise when these units interact within a shared environment (akin to an "artificial chemistry").
<center>
<img src="/assets/figures/16_on_architecture.jpg" alt="Diagram showing individual self-replicating units interacting to form a larger Organism Network structure" style="display:block; width:65%">
<figcaption>Conceptual architecture of an Organism Network emerging from interacting self-replicators.</figcaption>
</center><br>
The core hypothesis is that through local interactions and potentially shared environmental feedback, initially homogeneous populations of self-replicators can spontaneously develop specialized roles or structures, leading to a collective entity with capabilities exceeding those of individual units.
![Visualization potentially related to network robustness, differentiation, or communication channels.](/assets/figures/16_on_dropout.jpg)
{:style="display:block; width:45%" .align-right}
Key aspects explored in this work include:
* **Mechanisms for Collaboration:** Investigating how communication or resource sharing between self-replicating units can be established and influence collective behavior.
* **Emergent Differentiation:** Analyzing scenarios where units within the population begin to specialize, adopting different internal states (weight configurations) or functions, analogous to cellular differentiation in biological organisms.
* **Formation of Structure:** Studying how interactions lead to stable spatial or functional structures within the population, forming the basis of the Organism Network.
* **Functional Advantages:** Assessing whether these emergent ONs exhibit novel collective functionalities or improved problem-solving capabilities compared to non-interacting populations. (The role of dropout, as suggested by the image, might relate to promoting robustness or specialization within this context).
This study bridges the gap between single-unit self-replication and the emergence of complex, multi-unit systems in artificial life research, offering insights into how collaborative dynamics can lead to higher-order computational structures. For more detailed insights, refer to {% cite illium2022constructing %}.
<div style="clear: both;"></div>
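As a rough intuition for how local interaction can drive differentiation, the sketch below (a deliberate simplification; the paper's artificial-chemistry interactions are richer than this averaging rule) lets weight-vector "particles" repeatedly nudge one another and then measures whether the initially homogeneous population has spread into distinct states:

```python
import numpy as np

# Toy interaction "soup" (a simplification of the paper's artificial
# chemistry): particles are weight vectors; each step one particle is
# nudged toward a random peer, plus noise, and we measure whether the
# homogeneous starting population differentiates.
rng = np.random.default_rng(1)
P, D = 30, 16
soup = np.tile(rng.normal(size=D), (P, 1))        # identical particles
soup += 0.01 * rng.normal(size=soup.shape)        # tiny initial jitter

def interact(receiver, sender, lr=0.1, noise=0.05):
    # crude stand-in for one particle "training on" another's weights
    return receiver + lr * (sender - receiver) + noise * rng.normal(size=D)

for step in range(2000):
    i, j = rng.choice(P, size=2, replace=False)
    soup[i] = interact(soup[i], soup[j])

spread = np.linalg.norm(soup - soup.mean(axis=0), axis=1)
print(f"mean distance from population centroid: {spread.mean():.3f}")
```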

View File

@ -1,17 +0,0 @@
---
layout: single
title: "Organism Networks"
categories: research audio deep-learning anomaly-detection
excerpt: "Constructing ON from Collaborative Self-Replicators"
header:
teaser: assets/figures/16_on_teaser.jpg
---
![Organism Network Architecture](/assets/figures/16_on_architecture.jpg){:style="display:block; width:65%" .align-center}
This work delves into the concept of self-replicating neural networks, focusing on how backpropagation facilitates the emergence of complex, self-replicating behaviors.
![Dropout](/assets/figures/16_on_dropout.jpg){:style="display:block; width:45%" .align-right}
By evaluating different network types, the study highlights the natural emergence of robust self-replicators and explores their behavior in artificial chemistry environments.
A significant extension over a previous version, this research offers a deep analysis of fixpoint weight configurations and their attractor basins, advancing the understanding of neural network self-replication.
For more detailed insights, refer to {% cite illium2022constructing %}.

View File

@ -0,0 +1,40 @@
---
layout: single
title: "Voronoi Data Augmentation"
categories: research
tags: data-augmentation computer-vision deep-learning convolutional-neural-networks
excerpt: "VoronoiPatches improves CNN robustness via non-linear recombination augmentation."
header:
teaser: /assets/figures/17_vp_teaser.jpg
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
Data augmentation is essential for improving the performance and generalization of Convolutional Neural Networks (CNNs), especially when training data is limited. This research introduces **VoronoiPatches (VP)**, a novel data augmentation algorithm based on the principle of **non-linear recombination** of image information.
<center>
<img src="/assets/figures/17_vp_lion.jpg" alt="Example of an image augmented with VoronoiPatches, showing polygon patches blended onto a lion image" style="display:block; width:85%">
<figcaption>Visual example of the VoronoiPatches augmentation applied to an image.</figcaption>
</center><br>
Unlike traditional methods that often apply uniform transformations or cutout regions, VP operates by:
1. Generating a random layout of points within an image.
2. Creating a Voronoi diagram based on these points, partitioning the image into unique, convex polygon-shaped patches.
3. Redistributing information between these patches or blending information across patch boundaries (specific mechanism detailed in the paper).
This approach potentially allows for smoother transitions between augmented regions and the original image compared to sharp cutout methods. The core idea is to encourage the CNN to learn more robust features by exposing it to varied, non-linearly recombined versions of the input data.
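A rough sketch of the partition step is given below: Voronoi regions are obtained simply by assigning every pixel to its nearest random seed point. The recombination rule used here (painting each patch with the mean color of a randomly paired patch) is only an illustrative stand-in for the mechanism detailed in the paper:

```python
import numpy as np

# Sketch of a VoronoiPatches-style augmentation. The nearest-seed
# assignment below is exactly a Voronoi partition of the image; the
# patch-recombination rule is a placeholder, not the paper's mechanism.
rng = np.random.default_rng(42)

def voronoi_patch_augment(img, n_points=12):
    h, w = img.shape[:2]
    seeds = np.column_stack([rng.uniform(0, h, n_points),
                             rng.uniform(0, w, n_points)])
    yy, xx = np.mgrid[0:h, 0:w]
    # squared distance of every pixel to every seed -> nearest-seed label
    d2 = (yy[..., None] - seeds[:, 0]) ** 2 + (xx[..., None] - seeds[:, 1]) ** 2
    labels = d2.argmin(axis=-1)
    out = img.copy()
    perm = rng.permutation(n_points)
    for src in range(n_points):
        # crude recombination: fill each patch with the mean colour of a
        # randomly paired patch (illustrative only)
        out[labels == perm[src]] = img[labels == src].mean(axis=0)
    return out

img = rng.uniform(0, 1, size=(64, 64, 3))   # stand-in image
aug = voronoi_patch_augment(img)
print(aug.shape)
```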
---
<div style="text-align: center; margin: 1em 0; font-weight: bold; color: #D4AF37;">
:trophy: Best Poster Award - ICAART 2023 :trophy:<br>
<small>(<a href="https://icaart.scitevents.org/PreviousAwards.aspx?y=2024#2023" target="_blank" rel="noopener noreferrer">Official Link</a>)</small>
</div>
---
Evaluations demonstrate that VoronoiPatches can effectively **reduce model variance and combat overfitting**. Comparative studies indicate that VP **outperforms several existing state-of-the-art data augmentation techniques** in improving the robustness and generalization performance of CNN models on unseen data across various benchmarks. {% cite illium2023voronoipatches %}
<center>
<img src="/assets/figures/17_vp_results.jpg" alt="Graphs showing performance comparison (e.g., accuracy, loss) of VoronoiPatches against other augmentation methods" style="display:block; width:90%">
<figcaption>Comparative results illustrating the performance benefits of VoronoiPatches.</figcaption>
</center><br>

View File

@ -1,16 +0,0 @@
---
layout: single
title: "Voronoi Patches"
categories: research audio deep-learning anomaly-detection
excerpt: "Evaluating A New Data Augmentation Method"
header:
teaser: assets/figures/17_vp_teaser.jpg
---
![VoronoiPatches augmentation example (lion image)](/assets/figures/17_vp_lion.jpg){:style="display:block; width:85%" .align-center}
This study introduces VoronoiPatches (VP), a novel data augmentation algorithm that enhances Convolutional Neural Networks' performance by using non-linear recombination of image information. VP distinguishes itself by utilizing small, convex polygon-shaped patches in random layouts to redistribute information within an image, potentially smoothing transitions between patches and the original image. This method has shown to outperform existing data augmentation techniques in reducing model variance and overfitting, thus improving the robustness of CNN models on unseen data. {% cite illium2022voronoipatches %}
:trophy: Our work was awarded the [Best Poster Award](https://icaart.scitevents.org/PreviousAwards.aspx?y=2024#2023) at ICAART 2023 :trophy:
![Comparative results](/assets/figures/17_vp_results.jpg){:style="display:block; width:90%" .align-center}

View File

@ -0,0 +1,30 @@
---
layout: single
title: "Emergent Social Dynamics"
categories: research
tags: artificial-life complex-systems neural-networks self-organization emergent-behavior predictive-coding
excerpt: "Artificial chemistry networks develop predictive models via surprise minimization."
header:
teaser: /assets/figures/18_surprised_soup_teaser.jpg
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
This research extends the study of **artificial chemistry** systems populated by neural network "particles," focusing on the emergence of complex behaviors driven by **social interaction** rather than explicit programming. Building on systems where particles may exhibit self-replication, we introduce interactions based on principles of **predictive processing and surprise minimization** (akin to the Free Energy Principle).
![Schematic diagram illustrating interacting neural network particles in the 'social soup'](/assets/figures/18_surprised_soup_schematic.jpg)
{:style="display:block; width:40%" .align-right}
Specifically, particles are equipped with mechanisms enabling them to **recognize and build predictive models of their peers' behavior**. The learning process is driven by the minimization of prediction error, or "surprise," incentivizing particles to accurately anticipate the actions or state changes of others within the "soup."
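As a minimal illustration of this loop (a linear reduction; the paper's particles are neural networks), each particle below maintains a matrix model of one fixed peer and performs gradient steps on its squared prediction error, i.e., its surprise:

```python
import numpy as np

# Linear reduction of surprise minimisation: each particle i holds a
# matrix model of one peer and descends the gradient of its squared
# prediction error. The shared dynamics A are unknown to the particles.
rng = np.random.default_rng(7)
P, D = 10, 8
states = rng.normal(size=(P, D))
models = rng.normal(scale=0.1, size=(P, D, D))   # one predictor per particle
A = rng.normal(scale=0.3, size=(D, D))           # shared, unknown dynamics

def world_step(s):
    return np.tanh(s @ A)

lr = 0.05
for t in range(1000):
    nxt = world_step(states)
    for i in range(P):
        j = (i + 1) % P                    # the peer particle i observes
        pred = states[j] @ models[i]       # i's guess at j's next state
        err = pred - nxt[j]                # prediction error = "surprise"
        models[i] -= lr * np.outer(states[j], err)  # grad of 0.5*||err||^2
    states = nxt

surprise = np.mean([((states[(i + 1) % P] @ models[i]
                      - world_step(states)[(i + 1) % P]) ** 2).mean()
                    for i in range(P)])
print(f"mean residual surprise: {surprise:.4f}")
```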
Key observations from this setup include:
* The emergence of **stable behavioral patterns and population dynamics** purely from these local, predictive interactions. Notably, these emergent patterns often resemble the stability observed in systems where self-replication was an explicitly trained objective.
* The introduction of a unique **"catalyst" particle** designed to exert evolutionary pressure on the system, demonstrating how external influences or specialized agents can shape the collective dynamics.
<center>
<img src="/assets/figures/18_surprised_soup_trajec.jpg" alt="Trajectories or state space visualization of the particle population dynamics over time" style="display:block; width:90%">
<figcaption>Visualization of particle trajectories or population dynamics within the 'social soup'.</figcaption>
</center>
This study highlights how complex, seemingly goal-directed social behaviors and stable ecosystem structures can emerge from simple, local rules based on mutual prediction and surprise minimization among interacting agents, offering insights into the self-organization of complex adaptive systems. {% cite zorn23surprise %}

View File

@ -1,15 +0,0 @@
---
layout: single
title: "Social NN-Soup"
categories: research audio deep-learning anomaly-detection
excerpt: "Social interaction based on surprise minimization"
header:
teaser: assets/figures/18_surprised_soup_teaser.jpg
---
![Social Soup Schematics](/assets/figures/18_surprised_soup_schematic.jpg){:style="display:block; width:40%" .align-right}
This research explores artificial chemistry systems with neural network particles that exhibit self-replication. Introducing interactions that enable these particles to recognize and predict each other's behavior, the study observes emergent behaviors akin to stability patterns previously seen in explicit self-replication training. A unique catalyst particle introduces evolutionary pressure, demonstrating how 'social' interactions among particles can lead to complex, emergent outcomes.
{% cite zorn23surprise %}
![Soup Trajectories](/assets/figures/18_surprised_soup_trajec.jpg){:style="display:block; width:90%" .align-center}

View File

@ -1,16 +0,0 @@
---
layout: single
title: "Binary Presorting"
categories: research audio deep-learning anomaly-detection
excerpt: "Improving primate sounds classification by sublabeling"
header:
teaser: assets/figures/19_binary_primates_teaser.jpg
---
![Multiclass Training Pipeline](/assets/figures/19_binary_primates_pipeline.jpg){:style="display:block; width:40%" .align-right}
This study advances machine learning applications in wildlife observation by introducing a refined approach to audio classification. By meticulously relabeling subsegments of MEL spectrograms, it sharpens multi-class classification, which is crucial for identifying various primate species from audio recordings. Employing convolutional neural networks alongside innovative data augmentation techniques, the methodology delivers marked improvements in classification performance. Applied to the demanding ComParE 2021 dataset, the approach achieved substantially higher accuracy and UAR scores than existing baselines, demonstrating the potential of machine learning to handle datasets with weak labeling, varying lengths, and poor signal-to-noise ratios.
{% cite koelle23primate %}
![Thresholding](/assets/figures/19_binary_primates_thresholding.jpg){:style="display:block; width:70%" .align-center}
![Results](/assets/figures/19_binary_primates_results.jpg){:style="display:block; width:70%" .align-center}

View File

@ -0,0 +1,36 @@
---
layout: single
title: "Primate Subsegment Sorting"
categories: research
tags: bioacoustics audio-classification deep-learning data-labeling signal-processing
excerpt: "Binary subsegment presorting improves noisy primate sound classification."
header:
teaser: /assets/figures/19_binary_primates_teaser.jpg
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
![Diagram illustrating the multi-class training pipeline incorporating subsegment relabeling](/assets/figures/19_binary_primates_pipeline.jpg)
{:style="display:block; width:40%" .align-right}
Automated acoustic classification plays a vital role in wildlife monitoring and bioacoustics research. This study introduces a sophisticated pre-processing and training strategy to significantly enhance the accuracy of multi-class audio classification, specifically targeting the identification of different primate species from field recordings.
A key challenge in bioacoustics is dealing with datasets containing weak labels (where calls of interest occupy only a portion of a labeled segment), varying segment lengths, and poor signal-to-noise ratios (SNR). Our approach addresses this by:
1. **Subsegment Analysis:** Processing audio recordings represented as **MEL spectrograms**.
2. **Refined Labeling:** Meticulously **relabeling subsegments** within the spectrograms. This "binary presorting" step effectively identifies and isolates the actual vocalizations of interest within longer, weakly labeled recordings.
3. **CNN Training:** Training **Convolutional Neural Networks (CNNs)** on these refined, higher-quality subsegment inputs.
4. **Data Augmentation:** Employing innovative **data augmentation techniques** suitable for spectrogram data to further improve model robustness.
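A schematic version of the presorting step might look as follows; the energy-based window score and the threshold rule are placeholders (the actual work trains a binary classifier for this step), but the flow from weakly labeled spectrogram to relabeled subsegments is the same:

```python
import numpy as np

# Sketch of binary presorting on a weakly labelled MEL spectrogram
# (placeholder scorer; the paper uses a trained binary CNN instead).
rng = np.random.default_rng(3)

def presort(mel, parent_label, win=32, hop=16, thresh=None):
    """Slice a (mel-bands x frames) spectrogram into fixed windows,
    score each for 'call present', and keep only windows above the
    threshold, relabelled with the parent class."""
    n_frames = mel.shape[1]
    windows, scores = [], []
    for start in range(0, n_frames - win + 1, hop):
        seg = mel[:, start:start + win]
        windows.append(seg)
        scores.append(seg.mean())       # stand-in for a binary CNN score
    scores = np.array(scores)
    if thresh is None:
        thresh = scores.mean() + scores.std()   # assumed selection rule
    kept = [w for w, s in zip(windows, scores) if s >= thresh]
    return kept, [parent_label] * len(kept)

# weakly labelled recording: mostly noise, a few loud "call" frames
mel = rng.normal(size=(64, 256)) + 4.0 * (rng.uniform(size=(64, 256)) > 0.97)
subsegments, sublabels = presort(mel, parent_label="chimpanzee")
print(f"{len(subsegments)} subsegments kept for multi-class training")
```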
<center>
<img src="/assets/figures/19_binary_primates_thresholding.jpg" alt="Visualization related to the thresholding or selection process for subsegment labeling" style="display:block; width:70%">
<figcaption>Thresholding or selection criteria for subsegment refinement.</figcaption>
</center><br>
The effectiveness of this methodology was evaluated on the challenging **ComParE 2021 Primate dataset**. The results demonstrate remarkable improvements in classification performance, achieving substantially higher accuracy and Unweighted Average Recall (UAR) scores compared to existing baseline methods.
<center>
<img src="/assets/figures/19_binary_primates_results.jpg" alt="Graphs or tables showing improved classification results (accuracy, UAR) compared to baselines" style="display:block; width:70%">
<figcaption>Comparative performance results on the ComParE 2021 dataset.</figcaption>
</center><br>
This work represents a significant advancement in handling difficult, real-world bioacoustic data, showcasing how careful data refinement prior to deep learning model training can dramatically enhance classification outcomes. {% cite koelle23primate %}

View File

@ -0,0 +1,36 @@
---
layout: single
title: "Aquarium MARL Environment"
categories: research
tags: multi-agent-reinforcement-learning MARL simulation emergence complex-systems
excerpt: "Aquarium: Open-source MARL environment for predator-prey studies."
header:
teaser: /assets/figures/20_aquarium.png
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
![Diagram illustrating the multi-agent reinforcement learning cycle within the Aquarium environment](/assets/figures/20_aquarium.png){:style="display:block; width:40%" .align-right}
The study of complex interactions using Multi-Agent Reinforcement Learning (MARL), particularly **predator-prey dynamics**, often requires specialized simulation environments. To streamline research and avoid redundant development efforts, we introduce **Aquarium**: a versatile, open-source MARL environment specifically designed for investigating predator-prey scenarios and related **emergent behaviors**.
Key Features of Aquarium:
* **Framework Integration:** Built upon and seamlessly integrates with the popular **PettingZoo API**, allowing researchers to readily apply existing MARL algorithm implementations (e.g., from Stable-Baselines3, RLlib).
* **Physics-Based Movement:** Simulates agent movement on a two-dimensional, continuous plane with edge-wrapping boundaries, incorporating basic physics for more realistic interactions.
* **High Customizability:** Offers extensive configuration options for:
* **Agent-Environment Interactions:** Observation spaces, action spaces, and reward functions can be tailored to specific research questions.
* **Environmental Parameters:** Key dynamics like agent speeds, prey reproduction rates, predator starvation mechanisms, sensor ranges, and more are fully adjustable.
* **Visualization & Recording:** Includes a resource-efficient visualizer and supports video recording of simulation runs, facilitating qualitative analysis and understanding of agent behaviors.
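Because Aquarium follows the PettingZoo API, it can be driven with the stock agent-iteration loop shown below; the `aquarium.env()` constructor and its keyword arguments are hypothetical shorthand, while the reset/agent_iter/last/step calls are the standard PettingZoo AEC interface:

```python
# Driving a PettingZoo-style environment with random actions.
# `aquarium.env(...)` and its kwargs are hypothetical placeholders;
# the surrounding calls are the standard PettingZoo AEC API.
import aquarium  # hypothetical package name

env = aquarium.env(n_prey=8, n_predators=1)   # assumed constructor
env.reset(seed=42)

for agent in env.agent_iter():
    obs, reward, termination, truncation, info = env.last()
    if termination or truncation:
        action = None              # PettingZoo convention for done agents
    else:
        action = env.action_space(agent).sample()
    env.step(action)
env.close()
```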
<div style="display: flex; align-items: center; justify-content: center;">
<center>
<img src="/assets/figures/20_observation_vector.png" alt="Diagram detailing the construction of the observation vector for an agent" style="display:inline-table; width:85%">
<figcaption>Construction details of the agent observation vector.</figcaption>
</center>
<center>
<img src="/assets/figures/20_capture_statistics.png" alt="Graphs showing average captures or rewards per prey agent under different training regimes" style="display:inline-table; width:100%">
<figcaption>Performance metrics (e.g., average captures/rewards) comparing training strategies.</figcaption>
</center>
</div>
To demonstrate its capabilities, we conducted preliminary studies using **Proximal Policy Optimization (PPO)** to train multiple prey agents learning to evade a predator within Aquarium. Consistent with findings in existing MARL literature, our results showed that training agents with **individual policies led to suboptimal performance**, whereas utilizing **parameter sharing** among prey agents significantly improved coordination, sample efficiency, and overall evasion success. {% cite kolle2024aquarium %}

View File

@ -1,18 +0,0 @@
---
layout: single
title: "Aquarium"
categories: research MARL reinforcement-learning multi-agent
excerpt: "Exploring Predator-Prey Dynamics in multi-agent reinforcement-learning"
header:
teaser: assets/figures/20_aquarium.png
---
![Multi-Agent Reinforcement Learning Cycle](/assets/figures/20_aquarium.png){:style="display:block; width:40%" .align-right}
Recent advances in multi-agent reinforcement learning have enabled the modeling of complex interactions between agents in simulated environments. In particular, predator-prey dynamics have garnered significant interest, and various simulations have been adapted to meet unique requirements. To avoid further time-intensive development efforts, we introduce *Aquarium*, a versatile multi-agent reinforcement learning environment designed for studying predator-prey interactions and emergent behavior. *Aquarium* is open-source and seamlessly integrates with the PettingZoo framework, allowing for a quick start using established algorithm implementations. It features physics-based agent movement on a two-dimensional, edge-wrapping plane. Both the agent-environment interactions (observations, actions, rewards) and environmental parameters (agent speed, prey reproduction, predator starvation, and more) are fully customizable. In addition to providing a resource-efficient visualization, *Aquarium* supports video recording, facilitating a visual understanding of agent behavior.
To showcase the environment's capabilities, we conducted preliminary studies using proximal policy optimization (PPO) to train multiple prey agents to evade a predator. Consistent with existing literature, we found that individual learning leads to worse performance, while parameter sharing significantly improves coordination and sample efficiency.
{% cite kolle2024aquarium %}
![Average captures and rewards per prey agent](/assets/figures/20_capture_statistics.png){:style="display:block; width:70%" .align-center}
![Construction of the Observation Vector](/assets/figures/20_observation_vector.png){:style="display:block; width:70%" .align-center}

View File

@ -1,18 +0,0 @@
---
layout: single
title: "MAS Emergence"
categories: research multi-agent reinforcement-learning safety emergence
excerpt: "A safety perspective on emergence in multi-agent reinforcement-learning"
header:
teaser: assets/figures/21_coins_teaser.png
---
![Evaluation Environments](/assets/figures/21_envs.png){:style="display:block; width:40%" .align-right}
Emergent effects can occur in multi-agent systems (MAS), where decision-making is decentralized and based on local information. These effects may range from minor deviations in behavior to catastrophic system failures. To formally define these phenomena, we identify misalignments between the global inherent specification (the true specification) and its local approximation (e.g., the configuration of distinct reward components or observations). Leveraging established safety concepts, we develop a framework for understanding these emergent effects. To demonstrate the resulting implications, we examine two highly configurable gridworld scenarios, where inadequate specifications lead to unintended behavior deviations when derived independently. Acknowledging that a global solution may not always be practical, we propose adjusting the underlying parameterizations to mitigate these issues, thereby improving system alignment and reducing the risk of emergent failures.
{% cite altmann2024emergence %}
![Instances of emergent behavior](/assets/figures/21_coins.png){:style="display:block; width:70%" .align-center}
![Blocking behavior](/assets/figures/21_blocking.png){:style="display:block; width:70%" .align-center}

View File

@ -0,0 +1,31 @@
---
layout: single
title: "MAS Emergence Safety"
categories: research
tags: multi-agent-systems MARL AI-safety emergence system-specification
excerpt: "Formalized MAS emergence misalignment; proposed safety mitigation strategies."
header:
teaser: /assets/figures/21_coins_teaser.png
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---
![Diagrams of the gridworld environments used for evaluation](/assets/figures/21_envs.png)
{:style="display:block; width:40%" .align-right}
Multi-Agent Systems (MAS), particularly those employing decentralized decision-making based on local information (common in MARL), can exhibit **emergent effects**. These phenomena, arising from complex interactions, range from minor behavioral quirks to potentially catastrophic system failures, posing significant **safety challenges**.
This research provides a framework for understanding and mitigating undesirable emergence from a **safety perspective**. We propose a formal definition: emergent effects arise from **misalignments between the *global inherent specification*** (the intended overall system goal or behavior) **and its *local approximation*** used by individual agents (e.g., distinct reward components, limited observations).
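A toy instance of this definition (all values invented for illustration) is a two-agent coin scenario: the global specification rewards coordination, each local reward only counts the agent's own pickup, and greedy local optimization therefore produces exactly the kind of emergent failure pictured below:

```python
# Toy illustration of specification misalignment (values invented):
# the global specification rewards coordination, but each local reward
# only counts the agent's own pickup, so greedy agents both grab.
actions = [0, 1]                          # 0 = defer, 1 = grab the coin

def global_spec(a1, a2):
    # true objective: exactly one agent should take the coin
    return 1.0 if a1 + a2 == 1 else 0.0

def local_reward(a_self):
    # local approximation: grabbing always looks good individually
    return float(a_self)

greedy = (max(actions, key=local_reward), max(actions, key=local_reward))
best = max(global_spec(a, b) for a in actions for b in actions)

print("greedy joint action:", greedy)                  # (1, 1)
print("global value achieved:", global_spec(*greedy))  # 0.0
print("misalignment gap:", best - global_spec(*greedy))
```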
<center>
<img src="/assets/figures/21_coins.png" alt="Visualization showing agents exhibiting emergent coin-collecting behavior" style="display:block; width:70%">
<figcaption>Example of emergent behavior (e.g., coin hoarding) due to specification misalignment.</figcaption>
</center><br>
Leveraging established concepts from system safety engineering, we analyze how such misalignments can lead to deviations from intended global behavior. To illustrate the practical implications, we examine two highly configurable gridworld scenarios. These demonstrate how inadequate or independently derived local specifications (rewards/observations) can predictably result in unintended emergent behaviors, such as resource hoarding or inefficient coordination.
<center>
<img src="/assets/figures/21_blocking.png" alt="Visualization showing agents exhibiting emergent blocking behavior" style="display:block; width:60%">
<figcaption>Example of emergent behavior (e.g., mutual blocking) due to specification misalignment.</figcaption>
</center><br>
Recognizing that achieving a perfectly aligned global specification might be impractical in complex systems, we propose strategies focused on **adjusting the underlying local parameterizations** (e.g., reward shaping, observation design) to mitigate harmful emergence. By carefully tuning these local components, system alignment can be improved, reducing the risk of emergent failures and enhancing overall safety. {% cite altmann2024emergence %}
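One classical instance of such a local adjustment is potential-based reward shaping (Ng et al., 1999), which modifies rewards without changing the optimal policy; the distance-based potential in this sketch is an illustrative assumption, not the concrete parameterization used in the paper:

```python
# Potential-based reward shaping: r' = r + gamma * phi(s') - phi(s)
# leaves the optimal policy unchanged (Ng et al., 1999). The potential
# below (negative distance to goal) is an illustrative assumption.

GAMMA = 0.99

def potential(state, goal):
    # negative Manhattan distance of the agent to its intended goal
    (x, y), (gx, gy) = state, goal
    return -(abs(x - gx) + abs(y - gy))

def shaped_reward(r, state, next_state, goal):
    return r + GAMMA * potential(next_state, goal) - potential(state, goal)

# a step towards the goal now yields a small positive bonus, nudging
# local optimization towards the globally intended behaviour
print(shaped_reward(0.0, (0, 0), (1, 0), goal=(3, 0)))  # > 0
```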