k-Shape: Efficient and Accurate Clustering of Time Series
jopa@cs.columbia.edu, Columbia University
gravano@cs.columbia.edu, Columbia University
This is the website for our ACM SIGMOD 2015 research paper, "k-Shape: Efficient and Accurate Clustering of Time Series."
We make our source code publicly available here and provide details on how to obtain free access to the datasets used in our experimental results.
Datasets
We used the world's largest collection of class-labeled time-series datasets, namely the UCR Time-Series Repository.
To obtain free access to the datasets, please refer to the repository's website at http://www.cs.ucr.edu/~eamonn/time_series_data/.
Please note that some of the time-series datasets, namely, Beef, Coffee, Cricket_X, Cricket_Y, Cricket_Z, Fish, OSULeaf, and OliveOil, are either not properly z-normalized or not z-normalized at all. Therefore, for our experiments we also perform the z-normalization step for all datasets. Importantly, due to this issue, our results might differ from what has been reported in the literature, as several works assumed all datasets are already z-normalized.
With Prof. Eamonn Keogh, we wrote a more detailed analysis of this matter here.
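As an illustration of the z-normalization step mentioned above, the following is a minimal sketch, not the code used in the paper. It rescales a single time series to zero mean and unit standard deviation, and checks whether a series already satisfies that property. The function names `z_normalize` and `is_z_normalized`, the tolerance, and the choice of the population standard deviation (`ddof=0`) are assumptions made for this example; some tools divide by n - 1 instead, which gives slightly different values.

```python
import math


def _mean_and_std(series, ddof=0):
    """Return (mean, standard deviation) of a sequence of numbers.

    ddof=0 gives the population standard deviation (an assumption of this
    sketch); ddof=1 gives the sample standard deviation.
    """
    n = len(series)
    if n - ddof <= 0:
        raise ValueError("series is too short for the requested ddof")
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series) / (n - ddof)
    return mean, math.sqrt(variance)


def z_normalize(series, ddof=0):
    """Return a new list rescaled to zero mean and unit standard deviation."""
    mean, std = _mean_and_std(series, ddof)
    if std == 0.0:
        # A constant series has no shape to preserve; map it to all zeros
        # rather than dividing by zero.
        return [0.0] * len(series)
    return [(x - mean) / std for x in series]


def is_z_normalized(series, tol=1e-6, ddof=0):
    """Check whether a series has mean ~0 and standard deviation ~1."""
    mean, std = _mean_and_std(series, ddof)
    return abs(mean) <= tol and abs(std - 1.0) <= tol


if __name__ == "__main__":
    raw = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
    print(is_z_normalized(raw))               # check the raw series
    print(is_z_normalized(z_normalize(raw)))  # check after normalization
```

Running a check of this kind on every series before clustering is one way to detect datasets that are not normalized as expected, instead of assuming that they are.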
Source code
You can obtain the source code of all approaches compared in our paper, including our approach, here.
Please contact the first author (jopa@cs.columbia.edu) to obtain the password.
Publication
You can download our paper here.
3. K-SHAPE CLUSTERING ALGORITHM
3.1 Time-Series Shape Similarity
3.2 Time-Series Shape Extraction