
dc.contributor.advisor: Widmer, Gerhard
dc.contributor.author: Paischer, Fabian
dc.date.accessioned: 2022-03-08T14:23:23Z
dc.date.available: 2022-03-08T14:23:23Z
dc.date.issued: 2018
dc.date.submitted: 2018-06-10
dc.identifier.uri: https://dspace.jcu.cz/handle/20.500.14390/38568
dc.description.abstract: In recent years, deep learning has become one of the most popular machine learning techniques for a wide variety of complex problems. One such task is to mirror the human auditory system by classifying audio recordings according to the location in which they were recorded. This work focuses mainly on the Acoustic Scene Classification task proposed by the IEEE DCASE Challenge. The dataset for Acoustic Scene Classification consists of recordings from distinct recording locations. The aim of the challenge is to classify an unseen test set of recordings. In the 2016 challenge, the training and test sets did not differ significantly. In the 2017 challenge, however, the test set originated from a different distribution, implying a strong need for generalization. In the course of this work, the initial implementation, a Deep Convolutional Neural Network written in Lasagne for the DCASE 2016 challenge submission, was re-implemented in Keras. An extension of the ADAM optimizer (AMSGrad) was investigated for improvement in generalization. Other submissions to the DCASE 2017 challenge suggest that different types of spectrograms might be key to better generalization. Therefore, experiments utilizing different kinds of spectrograms were conducted. Furthermore, different interpolation algorithms were used for data augmentation, with some of them yielding significant improvements in classification accuracy and generalization. For different spectrogram dimensions, slight adjustments to the network architecture also resulted in a performance gain. To better understand what different models "see" and what they focus on, their filters and activations were visualized and compared.
Finally, the adjustments that led to better generalization on the dataset of the DCASE 2016 challenge were tested on the dataset of the DCASE 2017 challenge, leading to an improvement over all submissions to the DCASE 2017 challenge from the Institute of Computational Perception. [cze]
dc.format: 60
dc.language.iso: eng
dc.publisher: Jihočeská univerzita [cze]
dc.rights: No restrictions
dc.title: Improving Generalization of Deep Convolutional Neural Networks for Acoustic Scene Classification [cze]
dc.title.alternative: Improving Generalization of Deep Convolutional Neural Networks for Acoustic Scene Classification [eng]
dc.type: bachelor's thesis
dc.identifier.stag: 54681
dc.date.accepted: 2018-06-13
dc.description.department: Přírodovědecká fakulta [cze]
dc.thesis.degree-discipline: Bioinformatics [cze]
dc.thesis.degree-grantor: Jihočeská univerzita. Přírodovědecká fakulta [cze]
dc.thesis.degree-name: Bc.
dc.thesis.degree-program: Applied Informatics [cze]
dc.description.grade: Completed thesis with successful defense
dc.contributor.referee: Hofmarcher, Markus

