ANALYZING AUDIO AESTHETICS: AN EMPIRICAL EXAMINATION OF MUSIC GENRE CLASSIFICATION USING MULTILAYER PERCEPTRON AND FEATURE EXTRACTION

Authors

  • Dr. Ahmad Yusril Abdul Gani, Faculty of Computer Science and Information Technology, Gunadarma University, Jakarta, Indonesia

Keywords:

Music genre classification, Multilayer Perceptron (MLP), Chroma Feature, Mel Frequency Cepstral Coefficients (MFCC)

Abstract

The exponential growth of music databases has led to the challenge of manual music categorization, making it difficult to search for specific music genres in vast collections. Digital music development, particularly in genre classification, has facilitated the study and retrieval of songs. Consequently, there is a need for a convenient and efficient genre classification method that optimizes the learning process and ensures accurate results. This study explores the comparison between two music genre classification approaches: one using the Multilayer Perceptron (MLP) model with Chroma Feature extraction, and the other with Mel Frequency Cepstral Coefficients (MFCC) extraction. The dataset utilized in this research comprises audio data from songs, drawn from the GTZAN music dataset available through http://opihi.cs.uvic.ca/.
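As a rough illustration of the Chroma Feature idea, the sketch below folds an FFT magnitude spectrum into 12 pitch classes using plain NumPy. It is a simplified stand-in, not the paper's actual extraction pipeline: the function name, the 22.05 kHz sample rate, and the FFT size are assumptions chosen for the example.

```python
import numpy as np

def chroma_from_spectrum(mag, sr, n_fft):
    """Fold an FFT magnitude spectrum into 12 pitch classes (simplified chroma)."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    chroma = np.zeros(12)
    for f, m in zip(freqs[1:], mag[1:]):      # skip the DC bin
        midi = 69 + 12 * np.log2(f / 440.0)   # map frequency to MIDI pitch number
        chroma[int(round(midi)) % 12] += m    # accumulate energy per pitch class
    total = chroma.sum()
    return chroma / total if total > 0 else chroma

# Sanity check: a pure 440 Hz tone should concentrate energy
# in pitch class A (index 9 when C = 0).
sr, n_fft = 22050, 2048          # assumed sample rate / window size
t = np.arange(n_fft) / sr
tone = np.sin(2 * np.pi * 440.0 * t)
mag = np.abs(np.fft.rfft(tone * np.hanning(n_fft)))
c = chroma_from_spectrum(mag, sr, n_fft)
print(np.argmax(c))  # → 9 (pitch class A)
```

Because chroma collapses all octaves into 12 bins, it captures harmonic and tonal content while discarding timbre, which is the complementary information MFCCs are designed to capture.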

Machine Learning, a branch of computer science, provides the theoretical foundation for automating the classification process. In particular, Deep Learning (DL), a subset of Machine Learning, is well suited to unstructured data such as language, sound, and images. Through such Artificial Intelligence (AI) techniques, computers can learn from patterns in data and retain the acquired knowledge. The study focuses on the MLP, a feedforward neural network architecture, to build the classification models.

The primary objective is to determine which extraction features yield better accuracy in classifying song genres. Both Chroma Feature and MFCC extraction features are evaluated, and the classification results obtained from the MLP models are compared. The dataset consists of 1000 sample songs spanning ten distinct music genres, each with 100 songs in WAV format. Each song is further divided into three sections of equal duration (10 seconds each), yielding a 3000-sample dataset with 300 samples per genre.
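The segmentation step described above can be sketched as follows. The 22.05 kHz sample rate is an assumption (GTZAN audio is commonly distributed at that rate, though the paper does not state it), and the helper name is hypothetical.

```python
import numpy as np

SAMPLE_RATE = 22050      # assumed: GTZAN clips are commonly 22.05 kHz mono
SEGMENT_SECONDS = 10     # each clip is cut into 10-second sections

def split_into_segments(waveform, sr=SAMPLE_RATE, seg_s=SEGMENT_SECONDS):
    """Split one mono waveform into equal, non-overlapping segments."""
    seg_len = sr * seg_s
    n_segments = len(waveform) // seg_len
    return [waveform[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]

# A 30-second clip yields three 10-second samples,
# which is how 1000 songs become 3000 samples.
clip = np.zeros(30 * SAMPLE_RATE)
segments = split_into_segments(clip)
print(len(segments))  # → 3
```

Segmenting each clip triples the number of training examples without collecting new audio, a common data-augmentation step for small audio datasets.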

To evaluate the models, 80% of the dataset (2400 samples) is used for training, and the remaining 20% (600 samples) is reserved for testing. The study demonstrates the potential of AI and deep learning techniques to classify music genres effectively, enabling more efficient music retrieval and analysis.
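A minimal sketch of the 80/20 split and an MLP-style forward pass over the ten genre classes is shown below. The randomly generated 13-dimensional feature vectors, the hidden-layer size, and all names are assumptions for illustration; the paper's actual model and feature dimensions may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in feature matrix: 3000 segments x 13 features (dimensions assumed).
X = rng.normal(size=(3000, 13))
y = rng.integers(0, 10, size=3000)   # ten genre labels

# 80/20 train/test split, as in the study: 2400 training / 600 test samples.
idx = rng.permutation(len(X))
cut = int(0.8 * len(X))
X_train, X_test = X[idx[:cut]], X[idx[cut:]]
y_train, y_test = y[idx[:cut]], y[idx[cut:]]

def mlp_forward(X, W1, b1, W2, b2):
    """One-hidden-layer MLP forward pass: ReLU hidden layer, softmax output."""
    h = np.maximum(0.0, X @ W1 + b1)
    logits = h @ W2 + b2
    e = np.exp(logits - logits.max(axis=1, keepdims=True))  # stable softmax
    return e / e.sum(axis=1, keepdims=True)

W1 = rng.normal(scale=0.1, size=(13, 64)); b1 = np.zeros(64)   # sizes assumed
W2 = rng.normal(scale=0.1, size=(64, 10)); b2 = np.zeros(10)
probs = mlp_forward(X_test, W1, b1, W2, b2)
print(probs.shape)  # (600, 10): one genre distribution per test segment
```

Each output row is a probability distribution over the ten genres; the predicted genre for a segment is the class with the highest probability.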

Published

2023-11-29

Section

Articles