Distributed Data Streams Processing Based on Flume/Kafka/Spark
Authors
Jun Wang, Wenhao Wang, Renfei Chen
Corresponding Author
Jun Wang
Available Online October 2015.
- DOI
- 10.2991/icmii-15.2015.167
- Keywords
- Distributed System, Stream Processing, Kafka, Flume, Spark
- Abstract
This paper presents the design and implementation of a distributed data stream processing system based on Flume, Kafka, and Spark that fetches and analyzes data streams and mines business intelligence information efficiently, reliably, and in real time. Because Flume is highly scalable and reliable, data from multiple sources can be collected accurately and the collection layer can be extended easily. Kafka's high throughput, scalability, and distributed design meet the distribution requirements of massive data. Spark Streaming provides an efficient, fault-tolerant, real-time framework for large-scale stream processing. As a result, enterprise services and strategy can be improved.
- Copyright
- © 2015, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
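The three-stage flow described in the abstract (collection, buffering, micro-batch processing) can be illustrated with a minimal, standard-library-only sketch. It does not use Flume, Kafka, or Spark, and it is not the authors' implementation: the names `collect`, `TopicBuffer`, `process_micro_batches`, and `run_pipeline`, as well as the word-count analysis, are hypothetical stand-ins chosen only to show how the stages hand data to one another.

```python
from collections import Counter, deque


def collect(sources):
    """Flume-like stage (toy): merge lines from several named sources into one event stream."""
    for name, lines in sources.items():
        for line in lines:
            yield {"source": name, "body": line}


class TopicBuffer:
    """Kafka-like stage (toy): an in-memory FIFO standing in for a single topic."""

    def __init__(self):
        self._log = deque()

    def publish(self, event):
        self._log.append(event)

    def poll(self, max_events):
        # Hand out at most max_events events, oldest first.
        batch = []
        while self._log and len(batch) < max_events:
            batch.append(self._log.popleft())
        return batch


def process_micro_batches(topic, batch_size):
    """Spark-Streaming-like stage (toy): count words in each micro-batch."""
    results = []
    while True:
        batch = topic.poll(batch_size)
        if not batch:
            break
        counts = Counter(word for event in batch for word in event["body"].split())
        results.append(dict(counts))
    return results


def run_pipeline(sources, batch_size=2):
    """Wire the three toy stages together: collect, buffer, then process."""
    topic = TopicBuffer()
    for event in collect(sources):
        topic.publish(event)
    return process_micro_batches(topic, batch_size)
```

Calling `run_pipeline({"web": ["login ok", "login fail"], "app": ["login ok"]})` yields one word-count dictionary per micro-batch. In a real deployment each stage would run as a separate distributed service, and the buffer would decouple the rate of collection from the rate of processing.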
Cite this article
TY - CONF
AU - Jun Wang
AU - Wenhao Wang
AU - Renfei Chen
PY - 2015/10
DA - 2015/10
TI - Distributed Data Streams Processing Based on Flume/Kafka/Spark
BT - Proceedings of the 3rd International Conference on Mechatronics and Industrial Informatics
PB - Atlantis Press
SP - 948
EP - 952
SN - 2352-538X
UR - https://doi.org/10.2991/icmii-15.2015.167
DO - 10.2991/icmii-15.2015.167
ID - Wang2015/10
ER -