Abstract

Data mining is the process of analyzing data to generate useful knowledge. In data mining, classifiers are a widely accepted and effective technique for prediction. A well-balanced dataset is vital for classifiers to yield the best predictions, yet recent studies have shown that imbalanced datasets exist in many real-world applications. Resampling techniques are used to handle this issue. This paper proposes an enhanced hybrid model that balances the dataset by integrating undersampling and oversampling techniques. The model first applies undersampling to remove majority-class instances that carry little classification information, and then applies oversampling to the minority class using nearest neighbors. To evaluate the proposed model, experiments were conducted on datasets with different imbalance ratios taken from the UCI and KEEL repositories. The results show that classifiers achieved better performance on datasets balanced by the proposed model.
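
The two-stage idea described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the paper's algorithm: the abstract does not specify how "less classification information" is measured or how the nearest neighbors are used, so the sketch assumes (a) majority instances farthest from the minority class are the least informative, (b) a SMOTE-style interpolation between a minority instance and one of its k nearest minority neighbors, and (c) a target class size halfway between the two original class counts. The function name `hybrid_balance` and its parameters are hypothetical. It expects a binary problem with at least two minority instances.

```python
import numpy as np


def hybrid_balance(X, y, minority_label, k=3, seed=0):
    """Hypothetical hybrid resampler: undersample majority, then oversample minority."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    X_min = X[y == minority_label]
    X_maj = X[y != minority_label]
    y_maj = y[y != minority_label]

    # Assumption: both classes are brought to the midpoint of their counts.
    target = (len(X_min) + len(X_maj)) // 2

    # Undersampling (assumed criterion): keep the majority points closest to
    # the minority class; points far from it are treated as less informative.
    d = np.linalg.norm(X_maj[:, None, :] - X_min[None, :, :], axis=2).min(axis=1)
    keep = np.argsort(d)[:target]
    X_maj, y_maj = X_maj[keep], y_maj[keep]

    # Oversampling (SMOTE-style assumption): interpolate between a minority
    # point and one of its k nearest minority neighbors.
    n_new = target - len(X_min)
    dm = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    np.fill_diagonal(dm, np.inf)  # a point is not its own neighbor
    k_eff = min(k, len(X_min) - 1)
    nn = np.argsort(dm, axis=1)[:, :k_eff]
    base = rng.integers(0, len(X_min), size=n_new)
    neigh = nn[base, rng.integers(0, k_eff, size=n_new)]
    gap = rng.random((n_new, 1))
    synth = X_min[base] + gap * (X_min[neigh] - X_min[base])

    X_out = np.vstack([X_maj, X_min, synth])
    y_out = np.concatenate([y_maj, np.full(len(X_min) + n_new, minority_label)])
    return X_out, y_out
```

Calling `hybrid_balance(X, y, minority_label=1)` on a dataset with 20 majority and 4 minority rows would return 12 rows of each class under these assumptions. The midpoint target is a design choice made here so that neither step has to do all the work; the paper may use a different stopping rule.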

Article Details

How to Cite
S.Babu. (2019). Enhanced hybrid model for balancing dataset to improve the performance of the classifier. International Journal of Intellectual Advancements and Research in Engineering Computations, 7(1), 885–891. Retrieved from https://ijiarec.com/ijiarec/article/view/1020