---
title: "Sound Anomaly Transfer"
tags: [anomaly-detection, audio-classification, deep-learning, transfer-learning, feature-extraction, computer-vision, industrial-monitoring, machine-learning]
excerpt: "Image nets detect acoustic anomalies in machinery via spectrograms."
teaser: "/figures/9_image_transfer_sound_teaser.jpg"
venue: "ICAART 2021"
---
<FloatingImage src="/figures/9_image_transfer_sound_workflow.jpg" alt="Workflow of sound anomaly detection using image transfer learning" width={800} height={400} float="right" caption="Overall workflow for acoustic anomaly detection using transfer learning from image classification models." />
This study investigates an approach to **acoustic anomaly detection** in industrial machinery, identifying malfunctions from the sounds a machine emits during operation.
The core methodology leverages **transfer learning** by repurposing deep neural networks originally trained for large-scale **image classification** (e.g., on ImageNet) as powerful feature extractors for audio data represented as **mel-spectrograms**.
The process, sketched in code after this list, involves:
1. Converting audio signals from machinery into mel-spectrogram images.
2. Feeding these spectrograms into various pretrained image classification networks (specifically comparing **ResNet architectures** against **AlexNet** and **SqueezeNet**) to extract deep feature representations.
3. Training standard anomaly detection models, particularly **Gaussian Mixture Models (GMMs)** and **One-Class Support Vector Machines (OC-SVMs)**, on the features extracted from normal-operation sounds.
4. Classifying new sounds as anomalous if their extracted features deviate significantly from the learned normality model.
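As a concrete illustration of steps 1 and 2, the snippet below is a minimal sketch assuming librosa for mel-spectrograms and a frozen torchvision ResNet as the feature extractor; the function names, sample rate, spectrogram parameters, and choice of ResNet-18 are illustrative assumptions rather than the exact configuration used in the study.

```python
# Illustrative sketch: audio -> mel-spectrogram -> pretrained-ResNet features.
# Parameter values and the specific network are assumptions, not the paper's setup.
import librosa
import numpy as np
import torch
import torchvision.models as models
import torchvision.transforms as T

def audio_to_mel(path, sr=16000, n_mels=128):
    """Load a machine-sound recording and convert it to a log-mel spectrogram."""
    y, sr = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)

# Pretrained ResNet with its classification head removed, so the pooled
# convolutional activations serve as the deep feature representation.
resnet = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
feature_extractor = torch.nn.Sequential(*list(resnet.children())[:-1]).eval()

preprocess = T.Compose([
    T.ToTensor(),
    T.Resize((224, 224)),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def extract_features(path):
    """Map an audio file to a deep feature vector via its mel-spectrogram."""
    mel = audio_to_mel(path)
    # Rescale to [0, 1] and replicate to three channels so the spectrogram
    # can be fed to the network like an RGB image.
    mel = (mel - mel.min()) / (mel.max() - mel.min() + 1e-8)
    img = np.stack([mel] * 3, axis=-1).astype(np.float32)
    with torch.no_grad():
        feats = feature_extractor(preprocess(img).unsqueeze(0))
    return feats.flatten().numpy()
```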
Key findings from the experiments, conducted across different machine types and noise conditions, include:
* The proposed transfer learning approach significantly **outperforms baseline methods like traditional convolutional autoencoders**, especially in the presence of background noise.
* Features extracted using **ResNet architectures consistently yielded superior anomaly detection performance** compared to those from AlexNet and SqueezeNet.
* **GMMs and OC-SVMs proved highly effective** as anomaly detection classifiers when applied to these transferred features.
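To make the classifier side (steps 3 and 4) concrete, the following sketch fits a GMM and an OC-SVM on feature vectors from normal sounds and flags a new sound whose GMM log-likelihood falls below a threshold; the hyperparameters and thresholding scheme are illustrative assumptions, not the settings reported in the paper.

```python
# Illustrative sketch: fit normality models on features from normal sounds,
# then score new sounds against them. Hyperparameters are assumptions.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import OneClassSVM

def fit_detectors(X_normal):
    """X_normal: (n_samples, n_features) deep features from normal-operation sounds."""
    gmm = GaussianMixture(n_components=8, covariance_type="full").fit(X_normal)
    ocsvm = OneClassSVM(kernel="rbf", nu=0.05, gamma="scale").fit(X_normal)
    return gmm, ocsvm

def is_anomalous(gmm, x, threshold):
    """Flag a sound whose log-likelihood under the normality GMM is too low."""
    score = gmm.score_samples(x.reshape(1, -1))[0]
    return score < threshold
```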
<CenteredImage src="/figures/9_image_transfer_sound_mels.jpg" alt="Examples of mel-spectrograms from normal and anomalous machine sounds" width={1024} height={400} caption="Mel-spectrogram examples illustrating the difference between normal and anomalous machine sounds." maxWidth="75%" />
This work demonstrates the surprising effectiveness of transferring knowledge from the visual domain to the acoustic domain for anomaly detection, offering a robust and readily implementable method for monitoring industrial equipment. <Cite bibtexKey="muller2020acoustic" />