---
layout: single
title: "Surgical-Mask Detection"
categories: research
tags: audio-classification deep-learning data-augmentation computer-vision paralinguistics
excerpt: "CNN mask detection in speech using augmented spectrograms."
header:
  teaser: /assets/figures/7_mask_models.jpg
scholar_link: "https://scholar.google.de/citations?user=NODAd94AAAAJ&hl=en"
---

This study investigates the efficacy of various **data augmentation techniques** applied directly to **mel-spectrogram representations** of audio data for improving classification performance. The specific task addressed is the detection of surgical mask usage based on human speech signals, a relevant problem in paralinguistics and audio analysis.

We systematically evaluated the impact of data augmentation when training **Convolutional Neural Networks (CNNs)** for this binary classification task. The input to the networks consisted of mel-spectrograms derived from voice samples. The effectiveness of augmentation strategies (such as frequency masking, time masking, or combined approaches like SpecAugment) was assessed across **four different CNN architectures**.
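To make the masking-style augmentations concrete, here is a minimal NumPy sketch of frequency and time masking applied to a single mel-spectrogram. It is an illustration under stated assumptions, not the implementation used in the study: the function name `mask_spectrogram`, the maximum mask widths, the choice of zero as the fill value, and the random stand-in input are all hypothetical. SpecAugment as usually described also includes time warping, which is omitted here.

```python
import numpy as np

def mask_spectrogram(mel, max_freq_width=8, max_time_width=16, rng=None):
    """Return a copy of `mel` (n_mels, n_frames) with one frequency band
    and one time span set to zero. Widths are illustrative assumptions."""
    rng = np.random.default_rng() if rng is None else rng
    out = mel.copy()
    n_mels, n_frames = out.shape

    # Frequency masking: zero a random band of consecutive mel bins.
    f = int(rng.integers(0, min(max_freq_width, n_mels) + 1))
    f0 = int(rng.integers(0, n_mels - f + 1))
    out[f0:f0 + f, :] = 0.0

    # Time masking: zero a random span of consecutive frames.
    t = int(rng.integers(0, min(max_time_width, n_frames) + 1))
    t0 = int(rng.integers(0, n_frames - t + 1))
    out[:, t0:t0 + t] = 0.0

    return out

# Stand-in for a real mel-spectrogram (64 mel bins, 128 frames);
# values are offset so that zeros can only come from the masks.
rng = np.random.default_rng(0)
mel = rng.random((64, 128)).astype(np.float32) + 1.0
augmented = mask_spectrogram(mel, rng=rng)
```

In a training loop, a function like this would typically be applied on the fly to each training sample, so the network sees a differently masked spectrogram every epoch while validation and test inputs stay unaugmented.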