Skip to content

cemretok/DDTW

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

28 Commits
 
 
 
 
 
 
 
 

Repository files navigation

DDTW

Implementation of the system described in: "Time Series Similarity Search for StreamingData in Distributed Systems" http://www.redaktion.tu-berlin.de/fileadmin/fg131/Publikation/Papers/Time_Series_Similarity_Search_Darli-2019_crv.pdf

Note:

  • The current implementation has been changed to derive the distance threshold from the query, therefore the time it takes for finding the pattern in "One patient, one day, Best Match" mode will take a bit longer than the 37 seconds reported in the paper, the current time is around 62 s.
  • The changes included in this code are intended to improve the performance in "Range query" mode, big sets and in a cluster.

The code can be run locally (tested in a laptop), import the code in Intellij or generate the jar with maven and execute it in the command line as explained below:

Running with Apache Flink on the command line:

We test this example with flink-1.8.0

After starting Flink (see basics in: https://ci.apache.org/projects/flink/flink-docs-release-1.8/tutorials/local_setup.html)

$FLINK_PATH/bin/start-cluster.sh 

Generate the target (we use: maven 3.6 and Java 8):

mvn clean package

Run the program:

$FLINK_PATH/bin/flink run -p 8 -c simsearch.examples.ECGExample target/DTW-1.1-SNAPSHOT.jar \
                     --pathInput $USER_PATH/src/main/resources/ecgSmall.txt \
                     --pathOutput $USER_PATH/src/main/resources/ecg_result \
                     --sampleRate 1000 \
                     --pathPattern $USER_PATH/src/main/resources/ecg_query.txt

Output result in ecg_result, looks like:

#(startTime, endTime, distance, tsKey, latency)
...
(2474640,2475102,7.91,ecg_patient1,6)
(2661120,2661582,3.26,ecg_patient1,8)  <-- best match
(2781324,2781786,4.6,ecg_patient1,6)
(2866248,2866710,7.29,ecg_patient1,5)
(7140504,7140966,4.14,ecg_patient1,6)
(14398020,14398482,6.33,ecg_patient1,7)

Data

The data included in this repo (ecgSmall.txt) is a short excerpt of the original file that can be found in: http://www.cs.ucr.edu/~eamonn/ECG_one_day.zip

About

Distributed DTW

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Java 100.0%