---
title: "Primate Vocalization Classification"
tags: [deep-learning, audio-classification, bioacoustics, conservation-technology, recurrent-neural-networks, machine-learning, wildlife-monitoring, pytorch, animal-conservation, bayesian-optimization]
excerpt: "Deep BiLSTM classifies primate vocalizations for acoustic wildlife monitoring."
teaser: /figures/11_recurrent_primate_workflow.jpg
venue: "Interspeech 2021"
---
# Primate Vocalization Classification
Acoustic monitoring offers a powerful, non-invasive tool for wildlife conservation, enabling the study and tracking of animal populations through their vocalizations.
This research focuses on improving the automated classification of **primate vocalizations**, a challenging task due to call variability and environmental noise.
<FloatingImage src="/figures/11_recurrent_primate_workflow.jpg" alt="Workflow diagram for recurrent neural network primate vocalization classification" width={400} height={225} caption="Overall workflow of the deep recurrent neural network for primate vocalization classification." float="right" />
We propose a novel **deep, recurrent neural network architecture** specifically designed for this purpose. The core of the model utilizes **bidirectional Long Short-Term Memory (BiLSTM) networks**, which are adept at capturing temporal dependencies within the audio signals (represented, for example, as spectrograms or MFCCs).
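As an illustrative sketch only (the exact architecture and hyperparameters are given in the paper), a BiLSTM classifier over per-frame features such as MFCCs could look as follows in PyTorch; the layer sizes and number of classes below are placeholders.

```python
import torch
import torch.nn as nn

class BiLSTMClassifier(nn.Module):
    """Illustrative BiLSTM over per-frame acoustic features (e.g. MFCCs).

    All hyperparameters are placeholders, not the values used in the paper.
    """

    def __init__(self, n_features=40, hidden_size=128, num_layers=2, n_classes=5):
        super().__init__()
        self.lstm = nn.LSTM(
            input_size=n_features,
            hidden_size=hidden_size,
            num_layers=num_layers,
            batch_first=True,
            bidirectional=True,
        )
        # Forward and backward hidden states are concatenated per time step.
        self.classifier = nn.Linear(2 * hidden_size, n_classes)

    def forward(self, x):
        # x: (batch, time, n_features)
        outputs, _ = self.lstm(x)
        # Average-pool over time, then classify the clip.
        return self.classifier(outputs.mean(dim=1))

# Example: a batch of 8 clips, 200 frames, 40 MFCC coefficients each.
logits = BiLSTMClassifier()(torch.randn(8, 200, 40))  # -> shape (8, 5)
```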
To further enhance classification performance, particularly on the class-imbalanced datasets common in bioacoustics, the architecture incorporates two additional techniques (sketched in code after the list):
- **Normalized Softmax:** Improves calibration and potentially robustness.
- **Focal Loss:** Addresses class imbalance by focusing training on hard-to-classify examples.
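A minimal sketch of how these two components might fit together, assuming the common formulations: cosine-similarity logits from L2-normalized embeddings and class weights (normalized softmax), and a focal term that down-weights well-classified examples. The scale, the focusing parameter `gamma`, and all dimensions are illustrative assumptions, not the paper's settings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NormalizedSoftmaxHead(nn.Module):
    """Cosine logits: L2-normalize both embeddings and class weight vectors."""

    def __init__(self, embed_dim=256, n_classes=5, scale=16.0):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(n_classes, embed_dim))
        self.scale = scale  # temperature; illustrative value

    def forward(self, embeddings):
        cosine = F.linear(F.normalize(embeddings, dim=-1),
                          F.normalize(self.weight, dim=-1))
        return self.scale * cosine

def focal_loss(logits, targets, gamma=2.0):
    """Focal loss: scales cross-entropy by (1 - p_t)^gamma per example."""
    log_probs = F.log_softmax(logits, dim=-1)
    log_pt = log_probs.gather(1, targets.unsqueeze(1)).squeeze(1)
    pt = log_pt.exp()
    return (-(1.0 - pt) ** gamma * log_pt).mean()

# Placeholder usage with random embeddings and labels:
head = NormalizedSoftmaxHead()
loss = focal_loss(head(torch.randn(8, 256)), torch.randint(0, 5, (8,)))
```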
Hyperparameter tuning, a critical step for optimizing deep learning models, was systematically performed using **Bayesian optimization**.
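The page does not tie the search to a particular toolkit, so as one possible setup the sketch below uses Optuna, whose default TPE sampler performs a Bayesian-style search; the search ranges and the `train_and_evaluate` helper are hypothetical placeholders.

```python
import optuna

def train_and_evaluate(params):
    # Placeholder for the real training loop: fit the BiLSTM with `params`
    # and return a validation metric such as unweighted average recall.
    return 0.0  # dummy value so the sketch runs end to end

def objective(trial):
    # Hypothetical search space; ranges are illustrative, not the paper's.
    params = {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 1e-2, log=True),
        "hidden_size": trial.suggest_categorical("hidden_size", [64, 128, 256]),
        "num_layers": trial.suggest_int("num_layers", 1, 3),
        "focal_gamma": trial.suggest_float("focal_gamma", 0.5, 5.0),
    }
    return train_and_evaluate(params)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```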
<CenteredImage
src="/figures/11_recurrent_primate_results.jpg"
alt="Graph showing classification accuracy for primate calls"
width={800}
height={450}
caption="Performance results demonstrating classification accuracy of the deep recurrent model on primate calls."
maxWidth="100%"
/>
The model was evaluated on a challenging real-world dataset of diverse primate calls recorded at an **African wildlife sanctuary**. The results demonstrate that the proposed deep recurrent architecture classifies primate vocalizations accurately, underscoring the potential of combining advanced deep learning with automated acoustic monitoring for practical wildlife conservation. <Cite bibtexKey="muller2021deep" />