CLASSIFICATION OF MISSING VALUES HANDLING METHOD DURING DATA MINING: REVIEW

Entin Hartini

Abstract


Missing data often occur in research and surveys. Many real-world datasets used in data mining contain missing values, which degrades the quality of the data. Incomplete data have various causes, such as manual data entry procedures, incorrect measurements, and equipment errors. Whatever the cause, missing values complicate data analysis, because many analysis algorithms work only when the data are complete. Missing data analysis can help resolve this problem: a missing value can be replaced with a plausible value estimated from the other information available, so that the dataset can be analyzed. Many specialists have worked on this issue and proposed more modern techniques. Many strategies are available for handling missing data; however, investigators have difficulty choosing the right technique when information about the strategies and their implementation is lacking. The purpose of this paper is to classify methods of missing data handling based on statistical methods and machine learning. The result of this study is a classification of missing data handling methods into ignoring techniques, model-based techniques, and imputation techniques, complemented with the advantages and disadvantages of each method.

Keywords: missing value, statistic, machine learning, classification, method 
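
As an illustration of two of the classes named in the abstract, the following minimal Python sketch contrasts an ignoring technique (listwise deletion) with a simple imputation technique (column-mean substitution). The toy dataset, the function names listwise_deletion and mean_imputation, and the use of None as the missing-value marker are all assumptions made for this example; they are not taken from the paper, and model-based techniques are not shown.

```python
from statistics import mean

# Hypothetical toy dataset (not from the paper): each row is (age, income),
# and None marks a missing value.
rows = [
    (25, 3000.0),
    (32, None),
    (None, 4200.0),
    (41, 5000.0),
]


def listwise_deletion(data):
    """Ignoring technique: drop every row that contains a missing value."""
    return [row for row in data if all(v is not None for v in row)]


def mean_imputation(data):
    """Imputation technique: replace each missing value with its column mean.

    Assumes every column has at least one observed value; otherwise
    statistics.mean raises StatisticsError.
    """
    columns = list(zip(*data))
    col_means = [mean(v for v in col if v is not None) for col in columns]
    return [
        tuple(col_means[j] if v is None else v for j, v in enumerate(row))
        for row in data
    ]


complete_rows = listwise_deletion(rows)   # keeps only fully observed rows
imputed_rows = mean_imputation(rows)      # keeps all rows, fills the gaps
```

Listwise deletion is simple but here discards half of the rows, while mean imputation keeps every row at the cost of reducing the variance of the imputed columns. This is one example of the kind of trade-off a comparison of advantages and disadvantages would weigh.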





Copyright © 2013
SIGMA EPSILON