A Novel Sampling Strategy for Active Learning over Evolving Stream Data
Xuxu Zhang, Zhi Cao, Li Peng, Siqi Ren
Available Online July 2016.
- https://doi.org/10.2991/iccia-17.2017.57
- Active learning, Data streams, Evidence, Random strategy.
- In classification tasks, data labeling is expensive and time-consuming; hence active learning, which queries labels for a small representative portion of the data, is becoming increasingly important. However, few works consider the challenges of the data stream setting, because most active learning methods are designed for non-streaming settings. Building on the evidence-based uncertainty sampling strategy and the split sampling strategy, we propose a new sampling strategy for active learning over evolving stream data that combines the strengths of both. First, the original data stream is randomly divided into two sub-streams. Instances from one sub-stream are labeled according to the high evidence-focused uncertainty strategy, while instances from the other sub-stream are labeled by the random strategy in order to detect true concept drifts. Second, we introduce a sliding window into the high evidence-focused uncertainty strategy to determine whether an instance is a conflict-uncertainty instance. In this way, our strategy addresses the effective use of evidence in the data stream setting and can choose more representative instances over evolving data streams for training a model. Finally, experiments on four benchmark datasets show that our approach achieves good predictive performance compared with state-of-the-art active learning strategies.
- Open Access
- This is an open access article distributed under the CC BY-NC license.
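The two-step procedure outlined in the abstract (random split into two sub-streams, then a window-based check on the uncertainty sub-stream) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the class name `SplitSampler`, all parameter names and default values, the margin test around 0.5 for a binary classifier, and the rule "evidence at or above the mean of the recent window" are assumptions, since the abstract does not specify how evidence is scored or how the sliding window is applied.

```python
import random
from collections import deque


class SplitSampler:
    """Hypothetical sketch of a split sampling strategy (not the authors' code)."""

    def __init__(self, random_fraction=0.3, random_label_rate=0.5,
                 uncertainty_margin=0.2, window_size=50, seed=0):
        # All names and defaults below are assumptions for illustration.
        self.random_fraction = random_fraction        # share routed to the random sub-stream
        self.random_label_rate = random_label_rate    # labeling probability in that sub-stream
        self.uncertainty_margin = uncertainty_margin  # max distance from 0.5 to count as uncertain
        self.window = deque(maxlen=window_size)       # recent evidence scores of uncertain instances
        self.rng = random.Random(seed)

    def should_query(self, p_positive, evidence):
        """Return True if a label should be requested for this instance.

        p_positive: the model's predicted probability of the positive class.
        evidence:   an assumed scalar score of how much evidence the model has
                    for its prediction (how it is computed is left open here).
        """
        # Step 1: randomly route the instance to one of two sub-streams.
        if self.rng.random() < self.random_fraction:
            # Random sub-stream: label at random so that drift can be
            # observed even in regions where the model is confident.
            return self.rng.random() < self.random_label_rate

        # Step 2: uncertainty sub-stream. Only uncertain instances qualify.
        if abs(p_positive - 0.5) > self.uncertainty_margin:
            return False

        # Assumed window rule: treat the instance as conflict-uncertain when
        # its evidence is at least the mean evidence of recent uncertain instances.
        baseline = sum(self.window) / len(self.window) if self.window else evidence
        self.window.append(evidence)
        return evidence >= baseline
```

In use, `should_query` would be called once per arriving instance, and the model would be updated only with instances for which it returns True. A labeling budget, which stream-based active learning usually enforces, is deliberately left out of this sketch.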
Cite this article
TY - CONF
AU - Xuxu Zhang
AU - Zhi Cao
AU - Li Peng
AU - Siqi Ren
PY - 2016/07
DA - 2016/07
TI - A Novel Sampling Strategy for Active Learning over Evolving Stream Data
BT - 2nd International Conference on Computer Engineering, Information Science & Application Technology (ICCIA 2017)
PB - Atlantis Press
SN - 2352-538X
UR - https://doi.org/10.2991/iccia-17.2017.57
DO - https://doi.org/10.2991/iccia-17.2017.57
ID - Zhang2016/07
ER -