---
title: "Primate Subsegment Sorting"
tags: [bioacoustics, audio-classification, deep-learning, data-labeling, signal-processing]
excerpt: "Binary subsegment presorting improves noisy primate sound classification."
teaser: /figures/19_binary_primates_teaser.jpg

---

<div className="not-prose flex flex-col md:flex-row gap-8 items-start">
  <div className="flex-1 md:max-w-2xl">
    Automated acoustic classification supports wildlife monitoring and bioacoustics research. This study introduces a pre-processing and training strategy that improves the accuracy of multi-class audio classification, here the identification of primate species from field recordings.
  </div>
  <div className="md:w-96 flex-shrink-0">
    <div className="mt-4 text-right">
      <Image
        src="/figures/19_binary_primates_pipeline.jpg"
        alt="Overview of the binary presorting pipeline"
        width={300}
        height={600}
        className="w-full h-auto rounded-md shadow-md"
      />
      <figcaption className="text-sm text-muted-foreground mt-2 block text-center md:text-right">
        Overview of the binary presorting pipeline.
      </figcaption>
    </div>
  </div>
</div>

A key challenge in bioacoustics is dealing with datasets that combine weak labels (where calls of interest occupy only a portion of a labeled segment), varying segment lengths, and poor signal-to-noise ratios (SNR). Our approach addresses these issues in four steps:
1.  **Subsegment Analysis:** Processing audio recordings as **mel spectrograms** and analyzing them at the subsegment level.
2.  **Refined Labeling:** **Relabeling subsegments** within the spectrograms. This "binary presorting" step identifies and isolates the actual vocalizations of interest within longer, weakly labeled recordings.
3.  **CNN Training:** Training **Convolutional Neural Networks (CNNs)** on these refined, higher-quality subsegment inputs.
4.  **Data Augmentation:** Applying **data augmentation techniques** suited to spectrogram data to further improve model robustness.
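
A minimal sketch of the subsegment step, assuming fixed-width windows cut along the time axis of a precomputed spectrogram. The function name `split_into_subsegments`, the window `width`, the `hop`, and the zero-padding of short recordings are illustrative assumptions, not the settings used in the study; a random array stands in for a real mel spectrogram.

```python
import numpy as np


def split_into_subsegments(spec, width, hop=None):
    """Split a (n_mels, n_frames) spectrogram into fixed-width windows along time.

    Hypothetical helper: window width and hop are free parameters here.
    """
    hop = hop or width
    n_mels, n_frames = spec.shape
    if n_frames < width:
        # Zero-pad short recordings on the right so they still yield one window.
        spec = np.pad(spec, ((0, 0), (0, width - n_frames)))
        n_frames = width
    starts = range(0, n_frames - width + 1, hop)
    return np.stack([spec[:, s:s + width] for s in starts])


# Stand-in for a mel spectrogram: 64 mel bands x 250 time frames.
spec = np.random.default_rng(0).random((64, 250))
subsegments = split_into_subsegments(spec, width=100, hop=50)

# Before any relabeling, every subsegment inherits the weak label of its parent.
weak_labels = np.full(len(subsegments), 3)
print(subsegments.shape)  # (4, 64, 100)
```

Choosing a hop smaller than the width makes neighbouring windows overlap, so a call that straddles one window boundary can still appear whole in another window.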

<div className="my-6 text-center">
  <Image src="/figures/19_binary_primates_thresholding.jpg" alt="Visualization related to the thresholding or selection process for subsegment labeling" width={800} height={600} className="w-3/4 mx-auto rounded-lg" />
  <figcaption className="text-sm text-muted-foreground mt-2">Thresholding or selection criteria for subsegment refinement.</figcaption>
</div>
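
The relabeling step can be sketched as follows, assuming a separately trained binary classifier has already produced a score per subsegment for "contains a call of interest". The `presort` helper, the `NOISE` label, the example scores, and the threshold of 0.5 are hypothetical and chosen only for illustration; the study's actual selection criteria may differ.

```python
import numpy as np

NOISE = -1  # hypothetical label for subsegments without a call of interest


def presort(weak_labels, call_scores, threshold=0.5):
    """Keep the inherited weak label where the binary score clears the
    threshold; otherwise relabel the subsegment as noise."""
    weak_labels = np.asarray(weak_labels)
    call_scores = np.asarray(call_scores, dtype=float)
    return np.where(call_scores >= threshold, weak_labels, NOISE)


# Five subsegments from two weakly labeled recordings (classes 2 and 0).
weak_labels = [2, 2, 2, 0, 0]
# Assumed outputs of a binary "call vs. background" model.
call_scores = [0.91, 0.12, 0.55, 0.08, 0.77]

refined = presort(weak_labels, call_scores, threshold=0.5)
print(refined.tolist())  # [2, -1, 2, -1, 0]
```

A multi-class model trained on `refined` instead of `weak_labels` no longer sees background-only windows labeled as a species.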

The method was evaluated on the **ComParE 2021 Primate dataset**. It achieves substantially higher accuracy and Unweighted Average Recall (UAR) than existing baseline methods.

<div className="my-6 text-center">
  <Image src="/figures/19_binary_primates_results.jpg" alt="Graphs or tables showing improved classification results (accuracy, UAR) compared to baselines" width={800} height={600} className="w-3/4 mx-auto rounded-lg" />
  <figcaption className="text-sm text-muted-foreground mt-2">Comparative performance results on the ComParE 2021 dataset.</figcaption>
</div>
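
UAR is the mean of the per-class recalls, so every class counts equally regardless of how many samples it has. The toy labels below are made up to show how UAR can diverge from accuracy on imbalanced data; they are not results from the study, and the function name is our own.

```python
import numpy as np


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    recalls = [np.mean(y_pred[y_true == c] == c) for c in np.unique(y_true)]
    return float(np.mean(recalls))


# Imbalanced toy example: eight samples of class 0, two of class 1.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]

accuracy = float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))
uar = unweighted_average_recall(y_true, y_pred)
print(accuracy, uar)  # 0.9 0.75
```

Accuracy is dominated by the majority class, while UAR drops because half of the minority class is missed. This is the same quantity as macro-averaged recall.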

This work shows that careful data refinement before deep learning model training can markedly improve classification on difficult, real-world bioacoustic data. <Cite bibtexKey="koelle23primate" />