Souikaina Filali will be completing this work as her intern project for summer 2018. Details from her proposal:
The project goal is to extend the KMeans clustering and KNN classification algorithms for use with the time series data model. The extension comprises implementing different time series distance measures for use with KMeans and KNN. In addition, different data normalization strategies, and imputation strategies for missing values, will be added.
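As a rough illustration of how a distance measure could plug into KNN, the Python sketch below pairs a textbook Dynamic Time Warping (an elastic measure) and a Euclidean distance (a lock-step measure) with a small nearest-neighbour classifier. It is not the ECL-ML implementation: the function names, the `(series, label)` training format, and the use of absolute difference as the local cost are all assumptions made for the example.

```python
import math
from collections import Counter

def euclidean_distance(a, b):
    """Lock-step measure: compares points index by index, so lengths must match."""
    if len(a) != len(b):
        raise ValueError("lock-step measures need series of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(a, b):
    """Elastic measure: classic O(n*m) Dynamic Time Warping (assumed local cost: |x - y|)."""
    n, m = len(a), len(b)
    # cost[i][j] = cheapest cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # repeat b[j-1]
                                     cost[i][j - 1],      # repeat a[i-1]
                                     cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def knn_predict(train, query, k=1, distance=dtw_distance):
    """Majority vote among the k training series closest to `query`.

    `train` is a list of (series, label) pairs (hypothetical format).
    """
    neighbours = sorted(train, key=lambda item: distance(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

train = [([0, 0, 1, 2, 1, 0], "peak"), ([2, 2, 1, 0, 1, 2], "valley")]
print(knn_predict(train, [0, 1, 2, 1, 0, 0]))
```

Passing `distance=euclidean_distance` switches the same classifier to a lock-step measure, which is the kind of interchangeability the extension aims for.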
The second part of the project involves designing a new time series classifier better suited to long time series data. The classifier finds motifs (discriminative subsequences) and uses them to transform the high-dimensional data series into single-value data points, which allows all the existing classifiers in ECL-ML to be used.
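One common way to turn a series into a single value given a motif is to take the smallest distance between the motif and any window of the series. The Python sketch below shows only that transform step, under the assumption that a Euclidean window distance is used. The motif here is hand-picked for illustration, the mining step is not shown, and the function names are hypothetical rather than taken from the proposal.

```python
import math

def subsequence_distance(series, motif):
    """Smallest Euclidean distance between `motif` and any equal-length window of `series`."""
    w = len(motif)
    if w == 0 or w > len(series):
        raise ValueError("motif must be non-empty and no longer than the series")
    best = math.inf
    for start in range(len(series) - w + 1):
        window = series[start:start + w]
        d = math.sqrt(sum((x - y) ** 2 for x, y in zip(window, motif)))
        best = min(best, d)
    return best

def motif_transform(dataset, motif):
    """Map every series in `dataset` to one number: its distance to the motif."""
    return [subsequence_distance(series, motif) for series in dataset]

dataset = [[0, 0, 1, 2, 1, 0], [0, 0, 0, 0, 0, 0]]
print(motif_transform(dataset, [1, 2, 1]))
```

The resulting one-value-per-series output is ordinary tabular data, so any conventional classifier can be trained on it.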
- Extend lock-step measures to the time series data model
- Implement elastic measures (Dynamic Time Warping, Move Split Merge and Longest Common Subsequence)
- Implement missing-value imputation techniques:
  - KNN imputation
  - Mean/median imputation
- Implement normalization methods:
  - Min-max normalization
  - Decimal scaling
- Extend KMeans to be used with elastic and lock-step time series distance measures.
- Extend KNN to be used with elastic and lock-step time series distance measures.
- Implement the proposed time series classifier that mines time series motifs and use it for classification
- Testing and documentation
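For reference, the simpler preprocessing items in the list above can be sketched in a few lines of Python. This is an illustration of the standard definitions only, not the planned ECL code: representing a missing value as `None` and all function names are assumptions, and KNN imputation is left out.

```python
from statistics import mean, median

def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant series has no range to scale; map everything to new_min.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    return [new_min + (v - lo) * scale for v in values]

def decimal_scale(values):
    """Divide by the smallest power of ten that brings every |value| below 1."""
    largest = max(abs(v) for v in values)
    j = 0
    while largest / (10 ** j) >= 1:
        j += 1
    return [v / (10 ** j) for v in values]

def impute(values, strategy="mean"):
    """Replace missing entries (assumed to be None) with the mean or median of the rest."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a series with no observed values")
    fill = mean(observed) if strategy == "mean" else median(observed)
    return [fill if v is None else v for v in values]

print(min_max_normalize([2, 4, 6]))
print(decimal_scale([120, -45, 7]))
print(impute([1.0, None, 3.0]))
```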