diff --git a/R/rucrdtw-package.r b/R/rucrdtw-package.r
index 5a1f55e..0044b28 100644
--- a/R/rucrdtw-package.r
+++ b/R/rucrdtw-package.r
@@ -1,6 +1,6 @@
 #' rucrdtw: Fast time series subsequence search in R
 #'
-#' @description Dynamic Time Warping (DTW) methods provide algorithms to optimally map a given time series onto all or part of another time series. The remaining cumulative distance between the series after the alignement is a useful distance metric in time series data mining applications for tasks such as classification, clustering, and anomaly detection. A broad suite of DTW algorithms is implemented in R in the \strong{dtw} package (Giorgino 2009).
+#' @description Dynamic Time Warping (DTW) methods provide algorithms to optimally map a given time series onto all or part of another time series. The remaining cumulative distance between the series after the alignment is a useful distance metric in time series data mining applications for tasks such as classification, clustering, and anomaly detection. A broad suite of DTW algorithms is implemented in R in the \strong{dtw} package (Giorgino 2009).
 #'
 #' Calculating a DTW alignment is computationally relatively expensive, and as a consequence DTW is often a bottleneck in time series data mining applications. The UCR Suite (Rakthanmanon et al. 2012) provides a highly optimized algorithm for best-match subsequence searches that avoids unnecessary distance computations and thereby enables fast DTW and Euclidean Distance queries even in data sets containing trillions of observations.
 #'
diff --git a/man/rucrdtw.Rd b/man/rucrdtw.Rd
index d242728..592d434 100644
--- a/man/rucrdtw.Rd
+++ b/man/rucrdtw.Rd
@@ -5,7 +5,7 @@
 \alias{rucrdtw}
 \title{rucrdtw: Fast time series subsequence search in R}
 \description{
-Dynamic Time Warping (DTW) methods provide algorithms to optimally map a given time series onto all or part of another time series. The remaining cumulative distance between the series after the alignement is a useful distance metric in time series data mining applications for tasks such as classification, clustering, and anomaly detection. A broad suite of DTW algorithms is implemented in R in the \strong{dtw} package (Giorgino 2009).
+Dynamic Time Warping (DTW) methods provide algorithms to optimally map a given time series onto all or part of another time series. The remaining cumulative distance between the series after the alignment is a useful distance metric in time series data mining applications for tasks such as classification, clustering, and anomaly detection. A broad suite of DTW algorithms is implemented in R in the \strong{dtw} package (Giorgino 2009).
 
 Calculating a DTW alignment is computationally relatively expensive, and as a consequence DTW is often a bottleneck in time series data mining applications. The UCR Suite (Rakthanmanon et al. 2012) provides a highly optimized algorithm for best-match subsequence searches that avoids unnecessary distance computations and thereby enables fast DTW and Euclidean Distance queries even in data sets containing trillions of observations.
 
diff --git a/paper.md b/paper.md
index 59bed74..a289200 100644
--- a/paper.md
+++ b/paper.md
@@ -14,7 +14,7 @@ bibliography: vignettes/rucrdtw.bib
 ---
 
 # Summary
-Dynamic Time Warping (DTW) methods provide algorithms to optimally map a given time series onto all or part of another time series [@berndt1994using]. The remaining cumulative distance between the series after the alignement is a useful distance metric in time series data mining applications for tasks such as classification, clustering, and anomaly detection.
+Dynamic Time Warping (DTW) methods provide algorithms to optimally map a given time series onto all or part of another time series [@berndt1994using]. The remaining cumulative distance between the series after the alignment is a useful distance metric in time series data mining applications for tasks such as classification, clustering, and anomaly detection.
 
 Calculating a DTW alignment is computationally relatively expensive, and as a consequence DTW is often a bottleneck in time series data mining applications. The [UCR Suite](http://www.cs.ucr.edu/~eamonn/UCRsuite.html) [@rakthanmanon2012searching] provides a highly optimized algorithm for best-match subsequence searches that avoids unnecessary distance computations and thereby enables fast DTW and Euclidean Distance queries even in data sets containing trillions of observations.
 
diff --git a/vignettes/using_rucrdtw.Rmd b/vignettes/using_rucrdtw.Rmd
index 7482d81..4bdf344 100644
--- a/vignettes/using_rucrdtw.Rmd
+++ b/vignettes/using_rucrdtw.Rmd
@@ -11,7 +11,7 @@ bibliography: rucrdtw.bib
 ---
 
 ## Introduction
-Dynamic Time Warping (DTW) methods provide algorithms to optimally map a given time series onto all or part of another time series [@berndt1994using]. The remaining cumulative distance between the series after the alignement is a useful distance metric in time series data mining applications for tasks such as classification, clustering, and anomaly detection.
+Dynamic Time Warping (DTW) methods provide algorithms to optimally map a given time series onto all or part of another time series [@berndt1994using]. The remaining cumulative distance between the series after the alignment is a useful distance metric in time series data mining applications for tasks such as classification, clustering, and anomaly detection.
 
 Calculating a DTW alignment is computationally relatively expensive, and as a consequence DTW is often a bottleneck in time series data mining applications. The [UCR Suite](http://www.cs.ucr.edu/~eamonn/UCRsuite.html) [@rakthanmanon2012searching] provides a highly optimized algorithm for best-match subsequence searches that avoids unnecessary distance computations and thereby enables fast DTW and Euclidean Distance queries even in data sets containing trillions of observations.
 
@@ -89,7 +89,7 @@ legend("topright", legend = c("query", "DTW match", "ED match"), col=c("red", "b
 ```
 
 ## Comparison with a naive DTW sub-sequence search
-We can compare the speed-up achived with the UCR algorithm by comparing it to a naive sliding-window comparison with the `dtw` function from the [`dtw` package](https://CRAN.R-project.org/package=dtw) [@giorgino2009computing]. We create another time series and load `dtw`.
+We can compare the speed-up achieved with the UCR algorithm by comparing it to a naive sliding-window comparison with the `dtw` function from the [`dtw` package](https://CRAN.R-project.org/package=dtw) [@giorgino2009computing]. We create another time series and load `dtw`.
 
 ```{r dtw-comparison, message=FALSE}
 set.seed(123)
diff --git a/vignettes/using_rucrdtw.html b/vignettes/using_rucrdtw.html
index ba62ca3..52833da 100644
--- a/vignettes/using_rucrdtw.html
+++ b/vignettes/using_rucrdtw.html
@@ -310,7 +310,7 @@

2016-11-06

Introduction

-Dynamic Time Warping (DTW) methods provide algorithms to optimally map a given time series onto all or part of another time series (Berndt and Clifford 1994). The remaining cumulative distance between the series after the alignement is a useful distance metric in time series data mining applications for tasks such as classification, clustering, and anomaly detection.
+Dynamic Time Warping (DTW) methods provide algorithms to optimally map a given time series onto all or part of another time series (Berndt and Clifford 1994). The remaining cumulative distance between the series after the alignment is a useful distance metric in time series data mining applications for tasks such as classification, clustering, and anomaly detection.

Calculating a DTW alignment is computationally relatively expensive, and as a consequence DTW is often a bottleneck in time series data mining applications. The UCR Suite (Rakthanmanon et al. 2012) provides a highly optimized algorithm for best-match subsequence searches that avoids unnecessary distance computations and thereby enables fast DTW and Euclidean Distance queries even in data sets containing trillions of observations.

A broad suite of DTW algorithms is implemented in R in the dtw package (Giorgino 2009). The rucrdtw R package provides complementary functionality for fast similarity searches by providing R bindings for the UCR Suite via Rcpp (Eddelbuettel and Francois 2011). In addition to queries and data stored in text files, rucrdtw also implements methods for queries and/or data that are held in memory as R objects, as well as a method to do fast similarity searches against reference libraries of time series.

@@ -333,12 +333,12 @@

Examples

Since both query and data are R vectors, we use the vector-vector methods for the search.

system.time(dtw_search <- ucrdtw_vv(data = rwalk, query = query, dtwwindow = 0.05))
##    user  system elapsed 
-##    1.71    0.02    1.72
+##    1.59    0.00    1.61
all.equal(qstart, dtw_search$location)
## [1] TRUE
system.time(ed_search <- ucred_vv(data = rwalk, query = query))
##    user  system elapsed 
-##    1.72    0.00    1.76
+##    1.61    0.01    1.65
all.equal(qstart, ed_search$location)
## [1] TRUE

And in a matter of seconds we have searched 10 million data points and rediscovered our query!

@@ -368,7 +368,7 @@

Examples