A Comparative Study of Sampling Methods and Algorithms for Imbalanced Time Series Classification

Publisher: Springer-Verlag
Publication Type: Conference Proceeding
Citation: Lecture Notes in Computer Science, 2012, 7691, pp. 637-648
Issue Date: 2012-01
Mining time series data and imbalanced data are two of ten challenging problems in data mining research. Imbalanced time series classification (ITSC) combines both problems and arises in many real-world applications. In existing research, the structure-preserving over-sampling (SPO) method has been proposed for solving ITSC problems, and its authors claim that it outperforms other over-sampling methods and state-of-the-art methods in time series classification (TSC). However, it is unclear whether an under-sampling method combined with various learning algorithms is more effective for ITSC than over-sampling methods such as SPO, given that prior research has shown under-sampling methods to be more effective and efficient than over-sampling methods. We present a comparative study between an under-sampling method with various learning algorithms and over-sampling methods such as SPO. The Friedman test and a post-hoc test are applied to determine whether the differences between methods are statistically significant. The experimental results demonstrate that the under-sampling technique with KNN is the most effective method and achieves results superior to those of the more complicated SPO method for ITSC.
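As a rough illustration of the kind of pipeline the abstract compares, the sketch below pairs under-sampling with a KNN classifier. It rests on assumptions not stated in the abstract: the under-sampling is plain random under-sampling down to the size of the rarest class, the distance is Euclidean over equal-length series, and the names random_undersample and knn_predict are hypothetical. It is not the authors' implementation.

```python
import math
import random
from collections import Counter


def random_undersample(X, y, seed=0):
    """Randomly keep n_min series per class, where n_min is the rarest class size.

    Assumption: plain random under-sampling; the abstract does not name the variant.
    """
    rng = random.Random(seed)
    by_class = {}
    for series, label in zip(X, y):
        by_class.setdefault(label, []).append(series)
    n_min = min(len(items) for items in by_class.values())
    X_bal, y_bal = [], []
    for label, items in by_class.items():
        for series in rng.sample(items, n_min):
            X_bal.append(series)
            y_bal.append(label)
    return X_bal, y_bal


def knn_predict(X_train, y_train, query, k=1):
    """Majority vote among the k training series closest to query.

    Assumption: Euclidean distance on equal-length series.
    """
    ranked = sorted(
        zip(X_train, y_train),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy imbalanced data: four majority-class series (0), two minority-class series (1).
X = [
    [0.0, 0.1, 0.0], [0.1, 0.0, 0.1], [0.0, 0.0, 0.1], [0.1, 0.1, 0.0],
    [1.0, 1.1, 1.0], [1.1, 1.0, 1.1],
]
y = [0, 0, 0, 0, 1, 1]

X_bal, y_bal = random_undersample(X, y)
prediction = knn_predict(X_bal, y_bal, [1.0, 1.0, 1.0], k=1)
```

After balancing, each class contributes the same number of training series, so the majority class no longer dominates the neighbour vote.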