Load Forecasting of Power SCADA Based on Spark MLlib
- DOI
- 10.2991/msota-16.2016.108
- Keywords
- component; spark; decision tree; random forest; k-means
- Abstract
To improve the accuracy and speed of power forecasting in power SCADA systems, a distributed real-time streaming forecasting model is designed based on the K-means and Random Forest algorithms in the Spark machine learning library (MLlib). The model uses a sliding window mechanism to segment the incoming data stream. K-means clustering is used to correct abnormal data, and Random Forest regression then performs the forecast. The algorithms are implemented on Spark RDDs, and their performance is verified by sending data through a daemon process that simulates a message queue. The results show that the forecasting accuracy of the algorithm is superior to that of traditional serial Random Forest forecasting and that the algorithm satisfies the real-time requirement.
- Copyright
- © 2017, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
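The first two stages named in the abstract, sliding-window segmentation and K-means based correction of abnormal readings, can be illustrated with a minimal single-machine sketch. This is not the authors' Spark implementation: it uses plain Python rather than Spark RDDs, and the abstract does not say how abnormal points are identified or repaired. The rule used here (readings that fall in a cluster holding less than `min_share` of the window are replaced by the mean of the remaining readings) is an assumption, as are all function names and the parameters `size`, `step`, `k`, and `min_share`. The Random Forest regression stage is omitted; in Spark it would plausibly be handled by MLlib's random forest regressor.

```python
from typing import Iterator, List, Sequence, Tuple


def sliding_windows(stream: Sequence[float], size: int, step: int) -> Iterator[List[float]]:
    """Yield fixed-size windows over the stream, advancing by `step` readings."""
    for start in range(0, len(stream) - size + 1, step):
        yield list(stream[start:start + size])


def assign(values: Sequence[float], centroids: Sequence[float]) -> List[int]:
    """Label each value with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda c: abs(v - centroids[c]))
            for v in values]


def kmeans_1d(values: Sequence[float], k: int, iterations: int = 20) -> Tuple[List[float], List[int]]:
    """Plain Lloyd's algorithm on scalars; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted values.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        labels = assign(values, centroids)
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = sum(members) / len(members)
    return centroids, assign(values, centroids)


def correct_window(window: Sequence[float], k: int = 2, min_share: float = 0.2) -> List[float]:
    """Replace readings in sparsely populated clusters (assumed abnormal)
    with the mean of the readings in the well-populated clusters."""
    _, labels = kmeans_1d(window, k)
    counts = [labels.count(c) for c in range(k)]
    is_normal = [counts[lab] / len(window) >= min_share for lab in labels]
    normal = [v for v, ok in zip(window, is_normal) if ok]
    if not normal:  # nothing to anchor a correction on; leave the window as-is
        return list(window)
    fill = sum(normal) / len(normal)
    return [v if ok else fill for v, ok in zip(window, is_normal)]
```

A caller would feed each window through the correction step before forecasting, for example `cleaned = [correct_window(w) for w in sliding_windows(readings, size=8, step=4)]`, where `readings` is a hypothetical list of load values.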
Cite this article
TY - CONF
AU - Tao Lin
AU - Chong Jiang
PY - 2016/12
DA - 2016/12
TI - Load Forecasting of Power SCADA Based on Spark MLlib
BT - Proceedings of 2016 International Conference on Modeling, Simulation and Optimization Technologies and Applications (MSOTA2016)
PB - Atlantis Press
SP - 480
EP - 484
SN - 2352-538X
UR - https://doi.org/10.2991/msota-16.2016.108
DO - 10.2991/msota-16.2016.108
ID - Lin2016/12
ER -