Dynamic Time Warping (DTW)
distance has been effectively used in mining time series data in a
multitude of domains. However, in its original formulation, DTW is
extremely inefficient at comparing long, sparse time series that contain
mostly zeros and a few unevenly spaced non-zero observations. The original
DTW distance does not take advantage of this sparsity, leading to
redundant calculations and a prohibitively large computational cost for
long time series.
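To make the sparsity argument concrete, here is a minimal Python sketch of run-length encoding a sparse series. It is an illustration only, not the project's code: the function names are hypothetical, and the convention of storing a run of k zeros as the single entry -k is an assumption of this sketch that may differ from the format the released code expects.

```python
def encode_zero_runs(series):
    """Collapse each run of zeros into one entry; keep non-zero values as-is.

    Assumed convention (not taken from the project): a run of k zeros is
    stored as -k, so the sketch only works for non-negative data.
    """
    encoded = []
    zeros = 0
    for value in series:
        if value < 0:
            raise ValueError("this sketch assumes non-negative observations")
        if value == 0:
            zeros += 1
            continue
        if zeros:
            encoded.append(-zeros)  # flush the pending run of zeros
            zeros = 0
        encoded.append(value)
    if zeros:
        encoded.append(-zeros)  # trailing run of zeros
    return encoded


def decode_zero_runs(encoded):
    """Expand the assumed encoding back into the dense series."""
    series = []
    for entry in encoded:
        if entry < 0:
            series.extend([0] * (-entry))
        else:
            series.append(entry)
    return series


dense = [0, 0, 0, 3, 0, 0, 2, 0, 0, 0, 0]
sparse = encode_zero_runs(dense)
print(sparse)                    # [-3, 3, -2, 2, -4]
print(len(dense), len(sparse))   # 11 5
```

The encoded length grows with the number of non-zero observations rather than with the length of the series, which is the property a sparsity-aware distance can exploit.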
A PDF version of the paper is available here.
The source code can be downloaded here.
The main part of the project is implemented in C++. To run the code, compile it with the "mex" function in Matlab as follows (after unzipping source.zip):
Given two run-length encoded time series X and Y:
To calculate the global AWarp distance using the UBCost algorithm:
To calculate the global AWarp distance using the LBCost algorithm:
To calculate the constrained AWarp distance:
where win is the size of the window, in number of points.
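For intuition about what a window constraint does, the following Python sketch computes ordinary DTW on dense (unencoded) series with a band of half-width win around the diagonal. This is plain windowed DTW, not AWarp; the function name constrained_dtw, the squared-difference cost, and the final square root are assumptions of this sketch rather than details taken from the project's source.

```python
import math


def constrained_dtw(x, y, win):
    """Windowed DTW on dense series: cell (i, j) is allowed only if |i - j| <= win."""
    n, m = len(x), len(y)
    # The band must be at least |n - m| wide, or no path can reach the last cell.
    win = max(win, abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        # Only visit the columns inside the band for this row.
        for j in range(max(1, i - win), min(m, i + win) + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j],      # repeat y[j-1]
                                 cost[i][j - 1],      # repeat x[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return math.sqrt(cost[n][m])


x = [0, 0, 1, 0]
y = [0, 1, 0, 0]
print(constrained_dtw(x, y, 1))  # 0.0: one step of warping aligns the two spikes
print(constrained_dtw(x, y, 0))  # about 1.414: win = 0 forces point-to-point matching
```

A smaller win restricts how far the two series may shift against each other and reduces the number of cells visited from n*m to roughly n*(2*win + 1).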
A sample of the house dataset (used in the paper) can be downloaded here. The first number in each row is the label, and the rest of the row is the time series of the washing machine's power consumption over one day, in run-length encoded format.
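A row laid out this way could be split into its label and encoded series roughly as follows. This Python sketch is hypothetical: the function name parse_labeled_row, the assumption that values are separated by commas or whitespace, and the example row are all invented for illustration and should be checked against the actual file.

```python
import re


def parse_labeled_row(line):
    """Split one row into (label, encoded_series).

    Assumes comma- or whitespace-separated numbers; the real file may
    use a different delimiter.
    """
    tokens = [t for t in re.split(r"[,\s]+", line.strip()) if t]
    if len(tokens) < 2:
        raise ValueError("expected a label followed by at least one value")
    label = int(float(tokens[0]))             # first number is the class label
    encoded = [float(t) for t in tokens[1:]]  # rest is the encoded time series
    return label, encoded


# Made-up row, not taken from the dataset.
label, encoded = parse_labeled_row("2 5 120.5 3 80")
print(label)    # 2
print(encoded)  # [5.0, 120.5, 3.0, 80.0]
```

The encoded values are left untouched here, since interpreting them depends on the exact run-length convention used by the dataset.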
A sample of the Twitter dataset can be downloaded here. Each row is the activity of one Twitter user, at millisecond resolution, in run-length encoded format.