This repository contains code for time series data analysis. We use an LSTM to predict future sensor readings, generate a "health index" for each component of the system, and finally generate an overall "health index" for the whole system, which indicates the system's future health status.
The data consists of twelve time series from twelve different sensors, including temperature, pressure and vibration sensors. The raw datasets are located in the folder "original". File names follow the format time_sensorName.csv, e.g. 1705_MAIN_FILTER_OIL_TEMP.csv is the dataset for the Main Filter Oil Temperature sensor in May 2017. The time interval between readings is not uniform in the raw datasets, so they need to be resampled before use. The preprocessed (sampled) datasets are located in the folder "sampled". They are resampled to several different fixed time intervals.
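The resampling step described above can be sketched with pandas. This is only an illustration, not the code in utils/: the column names `time` and `value`, the function name `resample_sensor`, and the choice of bin averaging plus time-based interpolation are all assumptions.

```python
import pandas as pd


def resample_sensor(df, interval="60min"):
    """Average irregularly spaced readings into fixed-width time bins.

    Assumes hypothetical columns "time" (datetime) and "value" (float).
    """
    series = df.set_index("time")["value"].sort_index()
    # Mean of all readings that fall into each bin; empty bins become NaN.
    resampled = series.resample(interval).mean()
    # Fill empty bins by linear interpolation along the time axis.
    return resampled.interpolate(method="time")


# Toy data with uneven gaps between readings.
raw = pd.DataFrame({
    "time": pd.to_datetime([
        "2017-05-01 00:05", "2017-05-01 00:40",
        "2017-05-01 01:10", "2017-05-01 03:20",
    ]),
    "value": [40.0, 42.0, 44.0, 50.0],
})
hourly = resample_sensor(raw, "60min")
print(hourly)
```

Passing a different `interval` (for example `"1D"`) would produce the coarser sampled variants.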
A long short-term memory (LSTM) network was implemented for data prediction. The schematic diagram of an LSTM cell, the building block of an LSTM network, is shown in the following figures.
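For readers without the figure, one forward step of a standard LSTM cell can be written in a few lines of NumPy. This is a sketch of the textbook formulation, not the model used in this repository (which relies on Keras). The function name `lstm_cell_step`, the stacked weight layout and the gate order are assumptions made for the example.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def lstm_cell_step(x_t, h_prev, c_prev, W, U, b):
    """One forward step of a standard LSTM cell.

    x_t: input vector, shape (D,)
    h_prev, c_prev: previous hidden and cell state, shape (H,)
    W: (4*H, D), U: (4*H, H), b: (4*H,)
    Gates are stacked in the (assumed) order: input, forget, candidate, output.
    """
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0:H])            # input gate
    f = sigmoid(z[H:2 * H])        # forget gate
    g = np.tanh(z[2 * H:3 * H])    # candidate cell state
    o = sigmoid(z[3 * H:4 * H])    # output gate
    c_t = f * c_prev + i * g       # keep part of the old state, add new content
    h_t = o * np.tanh(c_t)         # exposed hidden state
    return h_t, c_t


# Run a short random sequence through the cell.
rng = np.random.default_rng(0)
D, H = 1, 4
W = rng.normal(scale=0.1, size=(4 * H, D))
U = rng.normal(scale=0.1, size=(4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x in [0.1, 0.2, 0.3]:
    h, c = lstm_cell_step(np.array([x]), h, c, W, U, b)
print(h)
```

The forget gate `f` controls how much of the previous cell state survives, which is what lets the network carry information across many time steps.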
- utils/: data reading, sampling and writing to CSV
- Sensor.py: predicts future sensor values
- train_batch.py: trains the LSTM models
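Training an LSTM on a single sensor series usually requires turning the series into input windows and target values. The sketch below shows one common way to do this. It is not taken from Sensor.py or train_batch.py, and the helper name `make_windows` and its parameters `n_in` and `n_out` are hypothetical.

```python
import numpy as np


def make_windows(series, n_in, n_out=1):
    """Split a 1-D series into (input window, future values) pairs.

    Returns X with shape (n_samples, n_in, 1), the layout Keras LSTM
    layers expect, and y with shape (n_samples, n_out).
    """
    series = np.asarray(series, dtype=float)
    n_samples = len(series) - n_in - n_out + 1
    if n_samples <= 0:
        raise ValueError("series is too short for the requested window sizes")
    X = np.stack([series[i:i + n_in] for i in range(n_samples)])
    y = np.stack([series[i + n_in:i + n_in + n_out] for i in range(n_samples)])
    return X[..., np.newaxis], y


X, y = make_windows(np.arange(6), n_in=3, n_out=1)
print(X.shape, y.shape)
```

Setting `n_out` greater than 1 gives targets for multi-step prediction.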
Python (3.5)
TensorFlow (>1.0)
Keras
numpy
pandas
scikit-learn
datetime
matplotlib
scipy
pickle
- Change the configuration in the script train_batch.py, e.g. the number of epochs and the dataset path.
- Run the script train_batch.py. The generated results will be saved in the path that you specified.
This is the single-step prediction result for the oil return temperature sensor, sampled at one-day intervals. This is the RMSE comparison of multi-step prediction for the oil return temperature sensor, sampled at different time intervals.
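For reference, the RMSE of a multi-step prediction can be computed separately for each step of the forecast horizon. The snippet below is a minimal sketch with made-up numbers. The function name `rmse_per_step` and the (samples, steps) array layout are assumptions and may differ from how the repository computes its comparison.

```python
import numpy as np


def rmse_per_step(y_true, y_pred):
    """Root mean squared error for each forecast step.

    y_true, y_pred: arrays of shape (n_samples, n_steps).
    Returns an array of shape (n_steps,).
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))


# Illustrative values only, not results from this project.
y_true = [[1.0, 2.0], [3.0, 4.0]]
y_pred = [[1.0, 4.0], [3.0, 0.0]]
errors = rmse_per_step(y_true, y_pred)
print(errors)
```

Computing the error per step makes it visible how accuracy degrades as the prediction reaches further into the future.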
This is the predicted health index of each sensor and of the whole system in November.