base init

2023-12-03 18:05:58 +01:00
parent bc0c83c0c4
commit 04aff34e9d
709 changed files with 1137 additions and 18147 deletions
--- a/_posts/research/2021-03-05-Vision_Transformer.md
+++ b/_posts/research/2021-03-05-Vision_Transformer.md
@@ -0,0 +1,15 @@
+---
+layout: single
+title:  "Mel-Vision Transformer"
+categories: research audio deep-learning anomalie-detection 
+excerpt: "Attention-based audio classification on Mel-Spectrograms"
+header:
+  teaser: assets/figures/12_vision_transformer_teaser.jpg
+---
+
+![Leak-Mels](\assets\figures\12_vision_transformer_data.jpg){:style="display:block; margin-left:auto; margin-right:auto"}
+
+We apply the vision transformer, a deep machine learning model built around the attention mechanism, to mel-spectrogram representations of raw audio recordings. With mel-based data augmentation and sample weighting, we achieve comparable performance on both ComParE21 tasks (the PRS and CCS challenges), outperforming most single-model baselines. We further introduce overlapping vertical patching and evaluate the influence of parameter configurations.
+{% cite illium2021visual %}
+
+![Approach](\assets\figures\12_vision_transformer_models.jpg){:style="display:block; margin-left:auto; margin-right:auto"}