Proceedings of the 2nd International Conference on Computer Engineering, Information Science & Application Technology (ICCIA 2017)

A Novel Sampling Strategy for Active Learning over Evolving Stream Data

Authors
Xuxu Zhang, Zhi Cao, Li Peng, Siqi Ren
Corresponding Author
Xuxu Zhang
Available Online July 2016.
DOI
10.2991/iccia-17.2017.57
Keywords
Active learning, Data streams, Evidence, Random strategy.
Abstract

In classification tasks, data labeling is expensive and time-consuming; hence active learning, which queries labels for a small, representative portion of the data, is becoming increasingly important. However, few works consider the challenges of the data stream setting, because most active learning methods are designed for the non-streaming setting. To address this, we combine the evidence-based uncertainty sampling strategy with the split sampling strategy and propose a new sampling strategy for active learning over evolving stream data that takes advantage of the strengths of each. First, the original data stream is randomly divided into two sub-streams. Instances from one sub-stream are labeled according to the high evidence-focused uncertainty strategy, while instances from the other sub-stream are labeled by the random strategy in order to detect true concept drifts. Second, we introduce a sliding window into the high evidence-focused uncertainty strategy to determine whether an instance is a conflict-uncertainty instance. Our strategy thus addresses the effective use of evidence in the data stream setting and can choose more representative instances over evolving data streams for training a model. Finally, in experiments on four benchmark datasets, the proposed approach shows good predictive performance compared with state-of-the-art active learning strategies.
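
As an illustration only, the two-sub-stream routing described in the abstract might be sketched as follows. The function name `split_sampling`, its parameters, the labeling `budget`, and the rule of comparing a score against the sliding-window mean are all assumptions made for this sketch. The abstract does not specify the evidence-based score or the conflict-uncertainty test, so a caller-supplied `score` function stands in for them.

```python
import random
from collections import deque


def split_sampling(stream, score, split_prob=0.5, budget=0.1,
                   window_size=50, seed=0):
    """Return (instance, route, query) triples for a stream of instances.

    Hypothetical sketch: `score` maps an instance to an uncertainty value
    and is a placeholder for the paper's evidence-based measure.
    """
    rng = random.Random(seed)
    window = deque(maxlen=window_size)  # recent uncertainty scores
    decisions = []
    for x in stream:
        if rng.random() < split_prob:
            # Random sub-stream: query with probability `budget`, so labels
            # are requested across the whole input space and drift can be
            # noticed even where the model is confident.
            route = "random"
            query = rng.random() < budget
        else:
            # Uncertainty sub-stream: query when the instance is at least
            # as uncertain as the recent average (assumed rule). An empty
            # window always triggers a query.
            route = "uncertainty"
            s = score(x)
            query = not window or s >= sum(window) / len(window)
            window.append(s)
        decisions.append((x, route, query))
    return decisions


if __name__ == "__main__":
    # Toy stream of numbers whose value doubles as the uncertainty score.
    toy = [0.1, 0.9, 0.2, 0.7, 0.4]
    for item in split_sampling(toy, score=lambda v: v):
        print(item)
```

Setting `split_prob` to 0 or 1 sends every instance to a single sub-stream, which is a convenient way to inspect each rule in isolation.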

Copyright
© 2017, the Authors. Published by Atlantis Press.
Open Access
This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).

Volume Title
Proceedings of the 2nd International Conference on Computer Engineering, Information Science & Application Technology (ICCIA 2017)
Series
Advances in Computer Science Research
Publication Date
July 2016
ISBN
10.2991/iccia-17.2017.57
ISSN
2352-538X

Cite this article

TY  - CONF
AU  - Xuxu Zhang
AU  - Zhi Cao
AU  - Li Peng
AU  - Siqi Ren
PY  - 2016/07
DA  - 2016/07
TI  - A Novel Sampling Strategy for Active Learning over Evolving Stream Data
BT  - Proceedings of the 2nd International Conference on Computer Engineering, Information Science & Application Technology (ICCIA 2017)
PB  - Atlantis Press
SP  - 336
EP  - 342
SN  - 2352-538X
UR  - https://doi.org/10.2991/iccia-17.2017.57
DO  - 10.2991/iccia-17.2017.57
ID  - Zhang2016/07
ER  -