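Erlangga, Anak Agung Gde Wahyu Sukma (2024) Kombinasi Oversampling dan Undersampling dalam Menangani Class Unbalanced dan Overlapping pada Klasifikasi Data Bank Marketing [Combining Oversampling and Undersampling to Handle Class Imbalance and Overlap in Bank Marketing Data Classification]. Master's thesis, Universitas Pendidikan Ganesha.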
Abstract
Class imbalance occurs in many kinds of datasets, including the bank marketing dataset from the UCI Machine Learning Repository. Imbalance can cause a machine learning model to predict the majority class better than the minority class. This research addresses the problem with an oversampling method, SMOTE. Applying SMOTE, however, can introduce a second problem, class overlap, which can also degrade model performance. The research therefore aims to handle both problems by combining SMOTE with an undersampling method; the undersampling methods considered are ENN, NCL, and Tomek Links. The machine learning algorithm used is Logistic Regression, and performance is evaluated with a confusion matrix, from which the sensitivity, specificity, and g-mean of the model are obtained. The results show that the SMOTE-ENN combination performs best on the bank marketing dataset, with sensitivity, specificity, and g-mean of 94.05%, 83.22%, and 88.47%, respectively.
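The three evaluation measures named in the abstract can be derived from the four cells of a binary confusion matrix. The following is a minimal sketch, not the thesis code: the function name `sensitivity_specificity_gmean` and the example counts are hypothetical, and it assumes the standard definitions (sensitivity = TP / (TP + FN), specificity = TN / (TN + FP), g-mean = the square root of their product). The last lines only check that the g-mean reported in the abstract agrees with the reported sensitivity and specificity under that definition.

```python
import math


def sensitivity_specificity_gmean(tp, fn, tn, fp):
    """Return (sensitivity, specificity, g-mean) from confusion-matrix counts.

    Hypothetical helper for illustration; assumes the positive class is the
    minority class and that both classes have at least one sample.
    """
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    g_mean = math.sqrt(sensitivity * specificity)
    return sensitivity, specificity, g_mean


# Made-up counts for illustration only; they are NOT results from the thesis.
sens, spec, g = sensitivity_specificity_gmean(tp=80, fn=20, tn=90, fp=10)
print(f"sensitivity={sens:.4f} specificity={spec:.4f} g-mean={g:.4f}")

# Consistency check on the rates reported in the abstract (94.05%, 83.22%).
reported_g = math.sqrt(0.9405 * 0.8322)
print(f"g-mean implied by reported rates: {reported_g * 100:.2f}%")
```

Because the g-mean is a geometric mean, it drops sharply when either class is predicted poorly, which is why it is a common summary measure for imbalanced classification.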
Item Type: | Thesis (Masters) |
---|---|
Uncontrolled Keywords: | bank marketing, class imbalance, class overlapping, oversampling, undersampling |
Subjects: | Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
Divisions: | Pascasarjana > Program Studi Ilmu Komputer (S2) |
Depositing User: | Anak Agung Gde Wahyu Sukma Erlangga |
Date Deposited: | 19 Feb 2024 07:35 |
Last Modified: | 19 Feb 2024 07:35 |
URI: | http://repo.undiksha.ac.id/id/eprint/18852 |