Data Cleaning Algorithms for Power Information Communication Assets Data Based on Self-coding
- DOI
- 10.2991/mbdasm-19.2019.39
- Keywords
- big data; data cleaning; self-coding network; residual analysis
- Abstract
With the continuing adoption of new information and communication technologies in the asset management of power information and communication equipment, large volumes of data are generated across all stages of asset management. During data acquisition, sensor faults and other causes inevitably introduce anomalies into the data set, which complicates subsequent data analysis. This paper proposes a data cleaning algorithm based on a stacked self-encoder (DCbS). The algorithm improves the ability to identify and recover outliers by preserving the short-term correlation between data within sliding windows and by analyzing the residual between noisy data and lossless data. Finally, the advantages of the algorithm are demonstrated in two respects: data recovery and outlier identification.
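The residual-analysis idea in the abstract can be illustrated with a minimal sketch. It is not the authors' DCbS implementation: the abstract does not specify the network, so a centered rolling median stands in for the trained self-encoder as the reconstructor. The function names `reconstruct` and `clean` and the parameters `width` and `threshold` are assumptions introduced only for this illustration.

```python
from statistics import median


def reconstruct(series, width):
    """Placeholder reconstructor: centered rolling median over a sliding window.

    ASSUMPTION: this stands in for the trained self-encoder, whose
    architecture is not given in the abstract.
    """
    half = width // 2
    out = []
    for i in range(len(series)):
        lo = max(0, i - half)                 # window is truncated at the edges
        hi = min(len(series), i + half + 1)
        out.append(median(series[lo:hi]))
    return out


def clean(series, width=5, threshold=1.0):
    """Flag samples with a large residual and replace them with the reconstruction.

    `threshold` is a hypothetical cut-off in the same units as the data.
    """
    recon = reconstruct(series, width)
    # Residual between the observed (possibly noisy) value and its reconstruction.
    residuals = [abs(x - r) for x, r in zip(series, recon)]
    flags = [res > threshold for res in residuals]
    # Outlier identification gives `flags`; data recovery gives `cleaned`.
    cleaned = [r if f else x for x, r, f in zip(series, recon, flags)]
    return cleaned, flags


readings = [10.0, 10.1, 10.2, 50.0, 10.4, 10.5, 10.6]
cleaned, flags = clean(readings, width=5, threshold=1.0)
```

In this toy series only the spike at index 3 has a residual above the threshold, so it is the only sample flagged and replaced. A learned reconstructor would take the place of `reconstruct` while the residual step stays the same.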
- Copyright
- © 2019, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Qiang Wang
AU - Jianliang Zhang
AU - Jian Wu
AU - Feng Gao
AU - Xuewu Ren
PY - 2019/10
DA - 2019/10
TI - Data Cleaning Algorithms for Power Information Communication Assets Data Based on Self-coding
BT - Proceedings of the 2019 International Conference on Mathematics, Big Data Analysis and Simulation and Modelling (MBDASM 2019)
PB - Atlantis Press
SP - 170
EP - 173
SN - 2352-538X
UR - https://doi.org/10.2991/mbdasm-19.2019.39
DO - 10.2991/mbdasm-19.2019.39
ID - Wang2019/10
ER -