Online learning from trapezoidal data streams

Publication Type:
Journal Article
IEEE Transactions on Knowledge and Data Engineering, 2016, 28 (10), pp. 2709 - 2723
Issue Date:
Filename: Online learning from trapezoidal data streams.pdf
Description: Published Version
Size: 2.46 MB
Format: Adobe PDF
© 1989-2012 IEEE. In this paper, we study a new problem of continuous learning from doubly-streaming data where both data volume and feature space increase over time. We refer to the doubly-streaming data as trapezoidal data streams and the corresponding learning problem as online learning from trapezoidal data streams. The problem is challenging because both data volume and data dimension increase over time, and existing online learning [1], [2], online feature selection [3], and streaming feature selection algorithms [4], [5] are inapplicable. We propose a new Online Learning with Streaming Features algorithm (OLSF for short) and its two variants, which combine online learning [1], [2] and streaming feature selection [4], [5] to enable learning from trapezoidal data streams with infinite training instances and features. When a new training instance carrying new features arrives, the classifier updates the weights of existing features by following the passive-aggressive update rule [2] and the weights of new features by following the structural risk minimization principle. Feature sparsity is then introduced by using the projected truncation technique. We derive performance bounds for the OLSF algorithm and its variants. We also conduct experiments on real-world data sets to show the performance of the proposed algorithms.
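The update described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `olsf_step`, the `keep` parameter, the hinge loss, and the step size `loss / ||x||^2` (borrowed from the standard passive-aggressive rule) are assumptions made for illustration, and the sparsity step here is a plain top-k magnitude truncation standing in for the projected truncation the abstract mentions.

```python
import numpy as np

def olsf_step(w, x, y, keep=None):
    """One illustrative update on instance x with label y in {-1, +1}.

    w    : current weights over the features seen so far
    x    : new instance; it may carry more features than w has weights
    keep : hypothetical sparsity budget (number of weights to retain, >= 1)
    """
    x = np.asarray(x, dtype=float)
    # Grow the weight vector with zeros for features seen for the first time,
    # so the prediction depends only on the previously known features.
    w_ext = np.concatenate([np.asarray(w, dtype=float),
                            np.zeros(len(x) - len(w))])
    # Hinge loss on the current instance (assumed loss function).
    loss = max(0.0, 1.0 - y * np.dot(w_ext, x))
    sq_norm = np.dot(x, x)
    if loss > 0.0 and sq_norm > 0.0:
        # Passive-aggressive style step: move just far enough to reach
        # margin 1 on this instance, updating old and new features together.
        tau = loss / sq_norm
        w_ext = w_ext + tau * y * x
    if keep is not None and keep < len(w_ext):
        # Simplified truncation: zero all but the `keep` largest-magnitude weights.
        smallest = np.argsort(np.abs(w_ext))[: len(w_ext) - keep]
        w_ext[smallest] = 0.0
    return w_ext
```

Calling `olsf_step(np.zeros(2), [1.0, 0.0, 1.0], 1)` grows a two-feature model to three features in a single step; passing `keep=1` additionally limits the result to one non-zero weight.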