Implement an Effective Sensor Data Pre-processing System to Securely Store Real-time Sensor Stream Data

[ 31 May 2021 | vol. 14 | no. 1 | pp. 33-42 ]

About Authors:

Serhan Dagtas
-University of Arkansas, USA

Abstract:

In recent years, monitoring systems built on sensor networks have gained significant traction. One prominent example is the warehouse monitoring system, which continuously measures environmental conditions such as temperature, humidity, and carbon dioxide (CO2) levels within storage facilities. Each sensor reading is small, typically around 100 bytes. The difficulty lies in scale: with many warehouses storing different products, each equipped with hundreds of sensors, the aggregate volume of sensing data becomes substantial and is hard to manage and process efficiently. The conventional method stores each data point in a separate file, which causes an excessive number of file system accesses and leads to performance bottlenecks. To address this issue, we propose a data input system that instead aggregates multiple sensing data readings into larger, consolidated files before saving them to the Hadoop Distributed File System (HDFS). This reduces the number of data files created and optimizes storage and retrieval. Furthermore, we will use the MapReduce framework to access and process the consolidated chunks of sensing data stored in HDFS. MapReduce's parallel processing streamlines data analysis, enabling timely insights and decisions based on the environmental data collected from multiple warehouses. This approach improves the efficiency and responsiveness of the warehouse monitoring system, contributing to better product preservation and operational management.
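The aggregation idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `ReadingAggregator`, the `sink` callback, the `chunk_bytes` threshold, and the reading fields (`warehouse`, `sensor`, `temperature`) are all hypothetical. The `sink` stands in for whatever would write a consolidated file to HDFS, and `map_reading` / `reduce_mean` are plain-Python stand-ins for a map and a reduce step, not calls into the Hadoop API.

```python
import json
from typing import Callable, Dict, Iterable, List, Tuple


class ReadingAggregator:
    """Buffer small sensor readings and emit them as one consolidated chunk."""

    def __init__(self, sink: Callable[[bytes], None], chunk_bytes: int = 64 * 1024):
        self.sink = sink                # assumption: would write one file to HDFS
        self.chunk_bytes = chunk_bytes  # flush threshold, hypothetical value
        self._lines: List[bytes] = []
        self._size = 0

    def add(self, reading: Dict) -> None:
        # One reading becomes one newline-terminated JSON line.
        line = json.dumps(reading, sort_keys=True).encode("utf-8") + b"\n"
        self._lines.append(line)
        self._size += len(line)
        if self._size >= self.chunk_bytes:
            self.flush()

    def flush(self) -> None:
        # Hand the buffered readings to the sink as a single chunk.
        if not self._lines:
            return
        self.sink(b"".join(self._lines))
        self._lines = []
        self._size = 0


def map_reading(line: bytes) -> Tuple[str, float]:
    """Map step: emit (warehouse, temperature) for one stored reading."""
    record = json.loads(line)
    return record["warehouse"], record["temperature"]


def reduce_mean(pairs: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Reduce step: average the values emitted for each key."""
    totals: Dict[str, Tuple[float, int]] = {}
    for key, value in pairs:
        running_sum, count = totals.get(key, (0.0, 0))
        totals[key] = (running_sum + value, count + 1)
    return {key: s / n for key, (s, n) in totals.items()}


# Usage: ten readings end up in far fewer chunks than readings.
chunks: List[bytes] = []  # in-memory stand-in for consolidated files
agg = ReadingAggregator(sink=chunks.append, chunk_bytes=300)
for i in range(10):
    agg.add({
        "warehouse": "W1" if i % 2 == 0 else "W2",
        "sensor": i,
        "temperature": 20.0 + i,
    })
agg.flush()  # write out the final, partially filled chunk

pairs = [map_reading(line) for chunk in chunks for line in chunk.splitlines()]
means = reduce_mean(pairs)
```

With a realistic threshold close to the HDFS block size, thousands of readings of about 100 bytes each would share a single file, which is the reduction in file count the abstract describes.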

Keywords:

Distributed sensor stream data, HDFS, Sensor data, Real-time sensor stream

 

About this Article: