Combining Oversampling and Undersampling to Handle Class Imbalance and Class Overlap in Bank Marketing Data Classification

Erlangga, Anak Agung Gde Wahyu Sukma (2024) Kombinasi Oversampling dan Undersampling dalam Menangani Class Unbalanced dan Overlapping pada Klasifikasi Data Bank Marketing. Masters thesis, Universitas Pendidikan Ganesha.

Abstract

Class imbalance occurs in many kinds of datasets, including the Bank Marketing dataset from the UCI Machine Learning Repository. Imbalance can cause a machine learning model to predict the majority class better than the minority class. This research addresses the problem with an oversampling method, SMOTE. However, applying SMOTE can introduce another problem, class overlap, which can also degrade model performance. This research therefore aims to handle both problems by combining SMOTE with an undersampling method; the three undersampling methods considered are Edited Nearest Neighbours (ENN), Neighbourhood Cleaning Rule (NCL), and Tomek links. The classifier is Logistic Regression, and performance is evaluated from the confusion matrix, from which the model's sensitivity, specificity, and G-mean are obtained. The results show that the SMOTE-ENN combination performs best on the Bank Marketing dataset, with sensitivity, specificity, and G-mean of 94.05%, 83.22%, and 88.47%, respectively.
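
As a rough illustration of how the three reported metrics relate to one another, the sketch below computes sensitivity, specificity, and G-mean from confusion-matrix counts. The function names and the example counts are hypothetical and are not taken from the thesis. The G-mean is assumed to be the geometric mean of sensitivity and specificity, an assumption that is consistent with the reported figures.

```python
import math


def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: share of actual positives predicted positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negative rate: share of actual negatives predicted negative."""
    return tn / (tn + fp)


def g_mean(sens: float, spec: float) -> float:
    """Geometric mean of sensitivity and specificity (assumed definition)."""
    return math.sqrt(sens * spec)


# Hypothetical confusion-matrix counts, for illustration only.
tp, fn, tn, fp = 90, 10, 160, 40
sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} g-mean={g_mean(sens, spec):.4f}")

# Consistency check on the figures reported in the abstract for SMOTE-ENN.
reported = g_mean(0.9405, 0.8322)
print(f"G-mean from reported sensitivity and specificity: {reported:.4f}")  # 0.8847
```

Because the G-mean is a product under a square root, it falls sharply when either class is predicted poorly, which is why it is often preferred over plain accuracy on imbalanced data.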

Item Type: Thesis (Masters)
Uncontrolled Keywords: bank marketing, class imbalance, class overlapping, oversampling, undersampling
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Divisions: Pascasarjana > Program Studi Ilmu Komputer (S2)
Depositing User: Anak Agung Gde Wahyu Sukma Erlangga
Date Deposited: 19 Feb 2024 07:35
Last Modified: 19 Feb 2024 07:35
URI: http://repo.undiksha.ac.id/id/eprint/18852
