diff --git a/R/rucrdtw-package.r b/R/rucrdtw-package.r
index 5a1f55e..0044b28 100644
--- a/R/rucrdtw-package.r
+++ b/R/rucrdtw-package.r
@@ -1,6 +1,6 @@
 #' rucrdtw: Fast time series subsequence search in R
 #'
-#' @description Dynamic Time Warping (DTW) methods provide algorithms to optimally map a given time series onto all or part of another time series. The remaining cumulative distance between the series after the alignement is a useful distance metric in time series data mining applications for tasks such as classification, clustering, and anomaly detection. A broad suite of DTW algorithms is implemented in R in the \strong{dtw} package (Giorgino 2009).
+#' @description Dynamic Time Warping (DTW) methods provide algorithms to optimally map a given time series onto all or part of another time series. The remaining cumulative distance between the series after the alignment is a useful distance metric in time series data mining applications for tasks such as classification, clustering, and anomaly detection. A broad suite of DTW algorithms is implemented in R in the \strong{dtw} package (Giorgino 2009).
 #'
 #' Calculating a DTW alignment is computationally relatively expensive, and as a consequence DTW is often a bottleneck in time series data mining applications. The UCR Suite (Rakthanmanon et al. 2012) provides a highly optimized algorithm for best-match subsequence searches that avoids unnecessary distance computations and thereby enables fast DTW and Euclidean Distance queries even in data sets containing trillions of observations.
 #'
diff --git a/man/rucrdtw.Rd b/man/rucrdtw.Rd
index d242728..592d434 100644
--- a/man/rucrdtw.Rd
+++ b/man/rucrdtw.Rd
@@ -5,7 +5,7 @@
 \alias{rucrdtw}
 \title{rucrdtw: Fast time series subsequence search in R}
 \description{
-Dynamic Time Warping (DTW) methods provide algorithms to optimally map a given time series onto all or part of another time series. The remaining cumulative distance between the series after the alignement is a useful distance metric in time series data mining applications for tasks such as classification, clustering, and anomaly detection. A broad suite of DTW algorithms is implemented in R in the \strong{dtw} package (Giorgino 2009).
+Dynamic Time Warping (DTW) methods provide algorithms to optimally map a given time series onto all or part of another time series. The remaining cumulative distance between the series after the alignment is a useful distance metric in time series data mining applications for tasks such as classification, clustering, and anomaly detection. A broad suite of DTW algorithms is implemented in R in the \strong{dtw} package (Giorgino 2009).
 
 Calculating a DTW alignment is computationally relatively expensive, and as a consequence DTW is often a bottleneck in time series data mining applications. The UCR Suite (Rakthanmanon et al. 2012) provides a highly optimized algorithm for best-match subsequence searches that avoids unnecessary distance computations and thereby enables fast DTW and Euclidean Distance queries even in data sets containing trillions of observations.
 
diff --git a/paper.md b/paper.md
index 59bed74..a289200 100644
--- a/paper.md
+++ b/paper.md
@@ -14,7 +14,7 @@ bibliography: vignettes/rucrdtw.bib
 ---
 
 # Summary
-Dynamic Time Warping (DTW) methods provide algorithms to optimally map a given time series onto all or part of another time series [@berndt1994using]. The remaining cumulative distance between the series after the alignement is a useful distance metric in time series data mining applications for tasks such as classification, clustering, and anomaly detection.
+Dynamic Time Warping (DTW) methods provide algorithms to optimally map a given time series onto all or part of another time series [@berndt1994using]. The remaining cumulative distance between the series after the alignment is a useful distance metric in time series data mining applications for tasks such as classification, clustering, and anomaly detection.
 
 Calculating a DTW alignment is computationally relatively expensive, and as a consequence DTW is often a bottleneck in time series data mining applications. The [UCR Suite](http://www.cs.ucr.edu/~eamonn/UCRsuite.html) [@rakthanmanon2012searching] provides a highly optimized algorithm for best-match subsequence searches that avoids unnecessary distance computations and thereby enables fast DTW and Euclidean Distance queries even in data sets containing trillions of observations.
 
diff --git a/vignettes/using_rucrdtw.Rmd b/vignettes/using_rucrdtw.Rmd
index 7482d81..4bdf344 100644
--- a/vignettes/using_rucrdtw.Rmd
+++ b/vignettes/using_rucrdtw.Rmd
@@ -11,7 +11,7 @@ bibliography: rucrdtw.bib
 ---
 
 ## Introduction
-Dynamic Time Warping (DTW) methods provide algorithms to optimally map a given time series onto all or part of another time series [@berndt1994using]. The remaining cumulative distance between the series after the alignement is a useful distance metric in time series data mining applications for tasks such as classification, clustering, and anomaly detection.
+Dynamic Time Warping (DTW) methods provide algorithms to optimally map a given time series onto all or part of another time series [@berndt1994using]. The remaining cumulative distance between the series after the alignment is a useful distance metric in time series data mining applications for tasks such as classification, clustering, and anomaly detection.
 
 Calculating a DTW alignment is computationally relatively expensive, and as a consequence DTW is often a bottleneck in time series data mining applications. The [UCR Suite](http://www.cs.ucr.edu/~eamonn/UCRsuite.html) [@rakthanmanon2012searching] provides a highly optimized algorithm for best-match subsequence searches that avoids unnecessary distance computations and thereby enables fast DTW and Euclidean Distance queries even in data sets containing trillions of observations.
 
@@ -89,7 +89,7 @@ legend("topright", legend = c("query", "DTW match", "ED match"), col=c("red", "b
 ```
 
 ## Comparison with a naive DTW sub-sequence search
-We can compare the speed-up achived with the UCR algorithm by comparing it to a naive sliding-window comparison with the `dtw` function from the [`dtw` package](https://CRAN.R-project.org/package=dtw) [@giorgino2009computing]. We create another time series and load `dtw`.
+We can compare the speed-up achieved with the UCR algorithm by comparing it to a naive sliding-window comparison with the `dtw` function from the [`dtw` package](https://CRAN.R-project.org/package=dtw) [@giorgino2009computing]. We create another time series and load `dtw`.
 
 ```{r dtw-comparison, message=FALSE}
 set.seed(123)
diff --git a/vignettes/using_rucrdtw.html b/vignettes/using_rucrdtw.html
index ba62ca3..52833da 100644
--- a/vignettes/using_rucrdtw.html
+++ b/vignettes/using_rucrdtw.html
@@ -310,7 +310,7 @@
-Dynamic Time Warping (DTW) methods provide algorithms to optimally map a given time series onto all or part of another time series (Berndt and Clifford 1994). The remaining cumulative distance between the series after the alignement is a useful distance metric in time series data mining applications for tasks such as classification, clustering, and anomaly detection.
+Dynamic Time Warping (DTW) methods provide algorithms to optimally map a given time series onto all or part of another time series (Berndt and Clifford 1994). The remaining cumulative distance between the series after the alignment is a useful distance metric in time series data mining applications for tasks such as classification, clustering, and anomaly detection.
Calculating a DTW alignment is computationally relatively expensive, and as a consequence DTW is often a bottleneck in time series data mining applications. The UCR Suite (Rakthanmanon et al. 2012) provides a highly optimized algorithm for best-match subsequence searches that avoids unnecessary distance computations and thereby enables fast DTW and Euclidean Distance queries even in data sets containing trillions of observations.
A broad suite of DTW algorithms is implemented in R in the dtw package (Giorgino 2009). The rucrdtw R package provides complementary functionality for fast similarity searches by providing R bindings for the UCR Suite via Rcpp (Eddelbuettel and Francois 2011). In addition to queries and data stored in text files, rucrdtw also implements methods for queries and/or data that are held in memory as R objects, as well as a method to do fast similarity searches against reference libraries of time series.
Since both query and data are R vectors, we use the vector-vector methods for the search.
## user system elapsed
-## 1.71 0.02 1.72
+## 1.59 0.00 1.61
## [1] TRUE
## user system elapsed
-## 1.72 0.00 1.76
+## 1.61 0.01 1.65
## [1] TRUE
And in a matter of seconds we have searched 10 million data points and rediscovered our query!
@@ -368,7 +368,7 @@
-We can compare the speed-up achived with the UCR algorithm by comparing it to a naive sliding-window comparison with the dtw function from the dtw package (Giorgino 2009). We create another time series and load dtw.
+We can compare the speed-up achieved with the UCR algorithm by comparing it to a naive sliding-window comparison with the dtw function from the dtw package (Giorgino 2009). We create another time series and load dtw.
set.seed(123)
rwalk <- cumsum(runif(5e3, min = -0.5, max = 0.5))
qstart <- 876
@@ -405,7 +405,7 @@ Comparison with a naive DTW sub-sequence search
legend("topright", legend = c("naive DTW", "UCR DTW"), fill = c("#33a02c","#1f78b4"), bty="n")
}
## Loading required package: rbenchmark
-The speed-up is approximately 3 orders of magnitude.