diff --git a/.gitignore b/.gitignore
index d4e7113..0c10aba 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,8 +1,8 @@
 inst/
-Meta
 .Rproj.user
 .Rhistory
 .RData
 *.el
 notes.org
 doc
+/Meta/
diff --git a/R/FRD_lp.R b/R/FRD_lp.R
index cca1145..79aae6e 100644
--- a/R/FRD_lp.R
+++ b/R/FRD_lp.R
@@ -128,10 +128,10 @@ FRDHonest <- function(formula, data, subset, weights, cutoff=0, M,
 #'     class \code{"RDBW"} is a list containing the following components:
 #'
 #'     \describe{
-#'     \item{\code{hp}}{bandwidth for observations above cutoff}
+#'     \item{\code{hp}}{bandwidth for observations weakly above cutoff}
 #'
-#'     \item{\code{hm}}{bandwidth for observations below cutoff, equal to
-#'     \code{hp} unless \code{bw.equal==FALSE}}
+#'     \item{\code{hm}}{bandwidth for observations strictly below cutoff, equal
+#'     to \code{hp} unless \code{bw.equal==FALSE}}
 #'
 #'     \item{\code{sigma2m}, \code{sigma2p}}{estimate of conditional variance
 #'     just above and just below cutoff, \eqn{\sigma^2_+(0)} and
diff --git a/R/RD_lp.R b/R/RD_lp.R
index 6278c60..93b0bee 100644
--- a/R/RD_lp.R
+++ b/R/RD_lp.R
@@ -135,9 +135,9 @@ RDHonest <- function(formula, data, subset, weights, cutoff=0, M,
 #'     class \code{"RDBW"} is a list containing the following components:
 #'
 #'     \describe{
-#'     \item{\code{hp}}{bandwidth for observations above cutoff}
+#'     \item{\code{hp}}{bandwidth for observations weakly above cutoff}
 #'
-#'     \item{\code{hm}}{bandwidth for observations below cutoff, equal to
+#'     \item{\code{hm}}{bandwidth for observations strictly below cutoff, equal to
 #'     \code{hp} unless \code{bw.equal==FALSE}}
 #'
 #'     \item{\code{sigma2m}, \code{sigma2p}}{estimate of conditional variance
diff --git a/doc/RDHonest.R b/doc/RDHonest.R
index 7381e7c..7233095 100644
--- a/doc/RDHonest.R
+++ b/doc/RDHonest.R
@@ -78,7 +78,7 @@ RDHonest(voteshare ~ margin, data=lee08, kern="uniform", M=M, sclass="H", opt.cr
 
 
 ## -----------------------------------------------------------------------------
-## Add variance estimate to the lee data so that the RDSmoothnessBound
+## Add variance estimates to the Lee (2008) data so that the RDSmoothnessBound
 ## function doesn't have to compute them each time
 dl <- NPRPrelimVar.fit(dl, se.initial="nn")
 
diff --git a/doc/RDHonest.Rmd b/doc/RDHonest.Rmd
index 431249f..74e5347 100644
--- a/doc/RDHonest.Rmd
+++ b/doc/RDHonest.Rmd
@@ -47,10 +47,10 @@ In the sharp regression discontinuity model, we observe units $i=1,\dotsc,n$,
 with the outcome $y_i$ for the $i$th unit given by $$ y_i = f(x_i) + u_i, $$
 where $f(x_i)$ is the expectation of $y_i$ conditional on the running variable
 $x_i$ and $u_i$ is the regression error. A unit is treated if and only if the
-running variable $x_{i}$ lies above a known cutoff $c_{0}$. The parameter of
-interest is given by the jump of $f$ at the cutoff, $$ \beta=\lim_{x\downarrow
-c_{0}}f(x)-\lim_{x\uparrow c_{0}}f(x).$$ Let $\sigma^2(x_i)$ denote the
-conditional variance of $u_i$.
+running variable $x_{i}$ lies weakly above a known cutoff $c_{0}$, that is,
+$x_{i}\geq c_{0}$. The parameter of interest is given by the jump of $f$ at the
+cutoff, $$ \beta=\lim_{x\downarrow c_{0}}f(x)-\lim_{x\uparrow c_{0}}f(x).$$ Let
+$\sigma^2(x_i)$ denote the conditional variance of $u_i$.
 
 In the @lee08 dataset, the running variable corresponds to the margin of victory of
 a Democratic candidate in a US House election, and the treatment corresponds to
@@ -63,7 +63,8 @@ occurred in 1947. The running variable is the year in which the individual turne
 14, with the cutoff equal to 1947 so that the "treatment" is being subject to a
 higher minimum school-leaving age. The outcome is log earnings in 1998.
 
-Some of the functions in the package require the data to be transformed into a custom `RDData` format. This can be accomplished with the `RDData` function:
+Some of the functions in the package require the data to be transformed into a
+custom `RDData` format. This can be accomplished with the `RDData` function:
 
 ```{r}
 library("RDHonest")
@@ -241,6 +242,10 @@ variable is discrete, with $G$ support points: their construction makes no
 assumptions on the nature of the running variable (see Section 5.1 in @KoRo16
 for more detailed discussion).
 
+Note that units that lie exactly at the cutoff are considered treated, since
+the definition of treatment is that the running variable satisfies
+$x_i\geq c_0$.
+
 As an example, consider the @oreopoulos06 data, in which the running variable is age in years:
 ```{r}
 ## Replicate Table 2, column (10)
@@ -393,7 +398,7 @@ The package also implements lower-bound estimates for the smoothness constant
 $M$ for the Taylor and Hölder smoothness class, as described in the supplements to @KoRo16 and @ArKo16optimal
 
 ```{r}
-## Add variance estimate to the lee data so that the RDSmoothnessBound
+## Add variance estimates to the Lee (2008) data so that the RDSmoothnessBound
 ## function doesn't have to compute them each time
 dl <- NPRPrelimVar.fit(dl, se.initial="nn")
 
@@ -443,8 +448,9 @@ different, but the worst-case bias and the point estimate are identical.
 
 ## Model
 
-In a fuzzy RD design, the treatment $d_{i}$ is not entirely determined by
-whether the running variable $x_{i}$ exceeds a cutoff. Instead, the cutoff
+In a fuzzy RD design, units are assigned to treatment if their running variable
+$x_{i}$ weakly exceeds a cutoff $c_{0}$, that is, $x_i\geq c_{0}$. However, the actual treatment
+$d_{i}$ does not perfectly comply with the treatment assignment. Instead, the cutoff
 induces a jump in the treatment probability. The resulting reduced-form and
 first-stage regressions are given by
 \begin{align*}
@@ -454,8 +460,12 @@ See Section 3.3 in @ArKo16honest for a more detailed description.
 
 In the @battistin09 dataset, the treatment variable is an indicator for
 retirement, and the running variable is number of years since being eligible to
-retire. The cutoff is $0$. (individuals exactly at the cutoff are dropped).
-Similarly to the `RDData` function, the `FRDData` function transforms the data into an appropriate format:
+retire. The cutoff is $0$. Individuals exactly at the cutoff are dropped from
+the dataset. If there were individuals exactly at the cutoff, they would be
+assumed to be assigned to the treatment group.
+
+Similarly to the `RDData` function, the `FRDData` function transforms the data
+into an appropriate format:
 
 ```{r}
 ## Assumes first column in the data frame corresponds to outcome,
diff --git a/doc/RDHonest.pdf b/doc/RDHonest.pdf
index 2216df5..ea5e604 100644
Binary files a/doc/RDHonest.pdf and b/doc/RDHonest.pdf differ
diff --git a/doc/lpkernels.pdf b/doc/lpkernels.pdf
index 4c1a6e1..f47c66b 100644
Binary files a/doc/lpkernels.pdf and b/doc/lpkernels.pdf differ
diff --git a/doc/manual.pdf b/doc/manual.pdf
index 89112f3..b35d2f4 100644
Binary files a/doc/manual.pdf and b/doc/manual.pdf differ
diff --git a/man-roxygen/RDBW.R b/man-roxygen/RDBW.R
index 4235f68..9022801 100644
--- a/man-roxygen/RDBW.R
+++ b/man-roxygen/RDBW.R
@@ -1,6 +1,6 @@
 #' @param h bandwidth, a scalar parameter. For fuzzy or sharp RD, it can be a
 #'     named vector of length two with names \code{"p"} and \code{"m"}, in which
-#'     case the bandwidth \code{h["m"]} is used for observations below the
-#'     cutoff, and the bandwidth \code{h["p"]} is used for observations above
-#'     the cutoff. If not supplied, optimal bandwidth is computed according to
-#'     criterion given by \code{opt.criterion}.
+#'     case the bandwidth \code{h["m"]} is used for observations strictly below
+#'     the cutoff, and the bandwidth \code{h["p"]} is used for observations
+#'     weakly above the cutoff. If not supplied, optimal bandwidth is computed
+#'     according to criterion given by \code{opt.criterion}.
diff --git a/man/FRDHonest.Rd b/man/FRDHonest.Rd
index 2f408a6..6a3259c 100644
--- a/man/FRDHonest.Rd
+++ b/man/FRDHonest.Rd
@@ -80,10 +80,10 @@ cutoff should be constrained to equal to each other.}
 
 \item{h}{bandwidth, a scalar parameter. For fuzzy or sharp RD, it can be a
 named vector of length two with names \code{"p"} and \code{"m"}, in which
-case the bandwidth \code{h["m"]} is used for observations below the
-cutoff, and the bandwidth \code{h["p"]} is used for observations above
-the cutoff. If not supplied, optimal bandwidth is computed according to
-criterion given by \code{opt.criterion}.}
+case the bandwidth \code{h["m"]} is used for observations strictly below
+the cutoff, and the bandwidth \code{h["p"]} is used for observations
+weakly above the cutoff. If not supplied, optimal bandwidth is computed
+according to criterion given by \code{opt.criterion}.}
 
 \item{se.method}{Vector with methods for estimating standard error of
 estimate. If \code{NULL}, standard errors are not computed. The elements of
diff --git a/man/FRDOptBW.Rd b/man/FRDOptBW.Rd
index f0ca4a0..b9eaaf6 100644
--- a/man/FRDOptBW.Rd
+++ b/man/FRDOptBW.Rd
@@ -125,10 +125,10 @@ Returns an object of class \code{"RDBW"}. The function \code{print}
     class \code{"RDBW"} is a list containing the following components:
 
     \describe{
-    \item{\code{hp}}{bandwidth for observations above cutoff}
+    \item{\code{hp}}{bandwidth for observations weakly above cutoff}
 
-    \item{\code{hm}}{bandwidth for observations below cutoff, equal to
-    \code{hp} unless \code{bw.equal==FALSE}}
+    \item{\code{hm}}{bandwidth for observations strictly below cutoff, equal
+    to \code{hp} unless \code{bw.equal==FALSE}}
 
     \item{\code{sigma2m}, \code{sigma2p}}{estimate of conditional variance
     just above and just below cutoff, \eqn{\sigma^2_+(0)} and
diff --git a/man/LPPHonest.Rd b/man/LPPHonest.Rd
index 2f62f43..a1cb639 100644
--- a/man/LPPHonest.Rd
+++ b/man/LPPHonest.Rd
@@ -76,10 +76,10 @@ contain \code{NA}s. The default is set by the \code{na.action} setting of
 
 \item{h}{bandwidth, a scalar parameter. For fuzzy or sharp RD, it can be a
 named vector of length two with names \code{"p"} and \code{"m"}, in which
-case the bandwidth \code{h["m"]} is used for observations below the
-cutoff, and the bandwidth \code{h["p"]} is used for observations above
-the cutoff. If not supplied, optimal bandwidth is computed according to
-criterion given by \code{opt.criterion}.}
+case the bandwidth \code{h["m"]} is used for observations strictly below
+the cutoff, and the bandwidth \code{h["p"]} is used for observations
+weakly above the cutoff. If not supplied, optimal bandwidth is computed
+according to criterion given by \code{opt.criterion}.}
 
 \item{se.method}{Vector with methods for estimating standard error of
 estimate. If \code{NULL}, standard errors are not computed. The elements of
diff --git a/man/NPRHonest.fit.Rd b/man/NPRHonest.fit.Rd
index 371655f..9ca5f73 100644
--- a/man/NPRHonest.fit.Rd
+++ b/man/NPRHonest.fit.Rd
@@ -35,10 +35,10 @@ either be a string equal to \code{"triangular"} (\eqn{k(u)=(1-|u|)_{+}}),
 
 \item{h}{bandwidth, a scalar parameter. For fuzzy or sharp RD, it can be a
 named vector of length two with names \code{"p"} and \code{"m"}, in which
-case the bandwidth \code{h["m"]} is used for observations below the
-cutoff, and the bandwidth \code{h["p"]} is used for observations above
-the cutoff. If not supplied, optimal bandwidth is computed according to
-criterion given by \code{opt.criterion}.}
+case the bandwidth \code{h["m"]} is used for observations strictly below
+the cutoff, and the bandwidth \code{h["p"]} is used for observations
+weakly above the cutoff. If not supplied, optimal bandwidth is computed
+according to criterion given by \code{opt.criterion}.}
 
 \item{opt.criterion}{Optimality criterion that bandwidth is designed to
     optimize. The options are:
diff --git a/man/NPRreg.fit.Rd b/man/NPRreg.fit.Rd
index 4397e98..c39f79c 100644
--- a/man/NPRreg.fit.Rd
+++ b/man/NPRreg.fit.Rd
@@ -20,10 +20,10 @@ NPRreg.fit(
 
 \item{h}{bandwidth, a scalar parameter. For fuzzy or sharp RD, it can be a
 named vector of length two with names \code{"p"} and \code{"m"}, in which
-case the bandwidth \code{h["m"]} is used for observations below the
-cutoff, and the bandwidth \code{h["p"]} is used for observations above
-the cutoff. If not supplied, optimal bandwidth is computed according to
-criterion given by \code{opt.criterion}.}
+case the bandwidth \code{h["m"]} is used for observations strictly below
+the cutoff, and the bandwidth \code{h["p"]} is used for observations
+weakly above the cutoff. If not supplied, optimal bandwidth is computed
+according to criterion given by \code{opt.criterion}.}
 
 \item{kern}{specifies kernel function used in the local regression. It can
 either be a string equal to \code{"triangular"} (\eqn{k(u)=(1-|u|)_{+}}),
diff --git a/man/RDHonest.Rd b/man/RDHonest.Rd
index 9f945f5..3f9bf49 100644
--- a/man/RDHonest.Rd
+++ b/man/RDHonest.Rd
@@ -79,10 +79,10 @@ cutoff should be constrained to equal to each other.}
 
 \item{h}{bandwidth, a scalar parameter. For fuzzy or sharp RD, it can be a
 named vector of length two with names \code{"p"} and \code{"m"}, in which
-case the bandwidth \code{h["m"]} is used for observations below the
-cutoff, and the bandwidth \code{h["p"]} is used for observations above
-the cutoff. If not supplied, optimal bandwidth is computed according to
-criterion given by \code{opt.criterion}.}
+case the bandwidth \code{h["m"]} is used for observations strictly below
+the cutoff, and the bandwidth \code{h["p"]} is used for observations
+weakly above the cutoff. If not supplied, optimal bandwidth is computed
+according to criterion given by \code{opt.criterion}.}
 
 \item{se.method}{Vector with methods for estimating standard error of
 estimate. If \code{NULL}, standard errors are not computed. The elements of
diff --git a/man/RDHonestBME.Rd b/man/RDHonestBME.Rd
index 6c2b4bc..412e87a 100644
--- a/man/RDHonestBME.Rd
+++ b/man/RDHonestBME.Rd
@@ -42,10 +42,10 @@ contain \code{NA}s. The default is set by the \code{na.action} setting of
 
 \item{h}{bandwidth, a scalar parameter. For fuzzy or sharp RD, it can be a
 named vector of length two with names \code{"p"} and \code{"m"}, in which
-case the bandwidth \code{h["m"]} is used for observations below the
-cutoff, and the bandwidth \code{h["p"]} is used for observations above
-the cutoff. If not supplied, optimal bandwidth is computed according to
-criterion given by \code{opt.criterion}.}
+case the bandwidth \code{h["m"]} is used for observations strictly below
+the cutoff, and the bandwidth \code{h["p"]} is used for observations
+weakly above the cutoff. If not supplied, optimal bandwidth is computed
+according to criterion given by \code{opt.criterion}.}
 
 \item{alpha}{determines confidence level, \eqn{1-\alpha}{1-alpha}}
 
diff --git a/man/RDOptBW.Rd b/man/RDOptBW.Rd
index 32c307d..7160288 100644
--- a/man/RDOptBW.Rd
+++ b/man/RDOptBW.Rd
@@ -121,9 +121,9 @@ Returns an object of class \code{"RDBW"}. The function \code{print}
     class \code{"RDBW"} is a list containing the following components:
 
     \describe{
-    \item{\code{hp}}{bandwidth for observations above cutoff}
+    \item{\code{hp}}{bandwidth for observations weakly above cutoff}
 
-    \item{\code{hm}}{bandwidth for observations below cutoff, equal to
+    \item{\code{hm}}{bandwidth for observations strictly below cutoff, equal to
     \code{hp} unless \code{bw.equal==FALSE}}
 
     \item{\code{sigma2m}, \code{sigma2p}}{estimate of conditional variance
diff --git a/tests/testthat/test_rd.R b/tests/testthat/test_rd.R
index 1696d95..689cd73 100644
--- a/tests/testthat/test_rd.R
+++ b/tests/testthat/test_rd.R
@@ -104,8 +104,8 @@ test_that("Honest inference in Lee and LM data",  {
     expect_equal(r$maxbias, ff(r$hp, "uniform", "supplied.var")$maxbias)
 
     r <- es("triangular", "nn")
-    expect_equal(r$hm, 22.80882408)
-    expect_equal(unname(r$estimate+r$hl), 0.05476609)
+    expect_lt(abs(r$hm- 22.80882408), 5e-7)
+    expect_lt(abs(unname(r$estimate+r$hl- 0.05476609)), 1e-7)
     ## End replication
 
     ## Replicate 1511.06028v2
diff --git a/vignettes/RDHonest.Rmd b/vignettes/RDHonest.Rmd
index 136ff91..74e5347 100644
--- a/vignettes/RDHonest.Rmd
+++ b/vignettes/RDHonest.Rmd
@@ -47,10 +47,10 @@ In the sharp regression discontinuity model, we observe units $i=1,\dotsc,n$,
 with the outcome $y_i$ for the $i$th unit given by $$ y_i = f(x_i) + u_i, $$
 where $f(x_i)$ is the expectation of $y_i$ conditional on the running variable
 $x_i$ and $u_i$ is the regression error. A unit is treated if and only if the
-running variable $x_{i}$ lies above a known cutoff $c_{0}$. The parameter of
-interest is given by the jump of $f$ at the cutoff, $$ \beta=\lim_{x\downarrow
-c_{0}}f(x)-\lim_{x\uparrow c_{0}}f(x).$$ Let $\sigma^2(x_i)$ denote the
-conditional variance of $u_i$.
+running variable $x_{i}$ lies weakly above a known cutoff $c_{0}$, that is,
+$x_{i}\geq c_{0}$. The parameter of interest is given by the jump of $f$ at the
+cutoff, $$ \beta=\lim_{x\downarrow c_{0}}f(x)-\lim_{x\uparrow c_{0}}f(x).$$ Let
+$\sigma^2(x_i)$ denote the conditional variance of $u_i$.
 
 In the @lee08 dataset, the running variable corresponds to the margin of victory of
 a Democratic candidate in a US House election, and the treatment corresponds to
@@ -63,7 +63,8 @@ occurred in 1947. The running variable is the year in which the individual turne
 14, with the cutoff equal to 1947 so that the "treatment" is being subject to a
 higher minimum school-leaving age. The outcome is log earnings in 1998.
 
-Some of the functions in the package require the data to be transformed into a custom `RDData` format. This can be accomplished with the `RDData` function:
+Some of the functions in the package require the data to be transformed into a
+custom `RDData` format. This can be accomplished with the `RDData` function:
 
 ```{r}
 library("RDHonest")
@@ -241,6 +242,10 @@ variable is discrete, with $G$ support points: their construction makes no
 assumptions on the nature of the running variable (see Section 5.1 in @KoRo16
 for more detailed discussion).
 
+Note that units that lie exactly at the cutoff are considered treated, since
+the definition of treatment is that the running variable satisfies
+$x_i\geq c_0$.
+
 As an example, consider the @oreopoulos06 data, in which the running variable is age in years:
 ```{r}
 ## Replicate Table 2, column (10)
@@ -443,8 +448,9 @@ different, but the worst-case bias and the point estimate are identical.
 
 ## Model
 
-In a fuzzy RD design, the treatment $d_{i}$ is not entirely determined by
-whether the running variable $x_{i}$ exceeds a cutoff. Instead, the cutoff
+In a fuzzy RD design, units are assigned to treatment if their running variable
+$x_{i}$ weakly exceeds a cutoff $c_{0}$, that is, $x_i\geq c_{0}$. However, the actual treatment
+$d_{i}$ does not perfectly comply with the treatment assignment. Instead, the cutoff
 induces a jump in the treatment probability. The resulting reduced-form and
 first-stage regressions are given by
 \begin{align*}
@@ -454,8 +460,12 @@ See Section 3.3 in @ArKo16honest for a more detailed description.
 
 In the @battistin09 dataset, the treatment variable is an indicator for
 retirement, and the running variable is number of years since being eligible to
-retire. The cutoff is $0$. (individuals exactly at the cutoff are dropped).
-Similarly to the `RDData` function, the `FRDData` function transforms the data into an appropriate format:
+retire. The cutoff is $0$. Individuals exactly at the cutoff are dropped from
+the dataset. If there were individuals exactly at the cutoff, they would be
+assumed to be assigned to the treatment group.
+
+Similarly to the `RDData` function, the `FRDData` function transforms the data
+into an appropriate format:
 
 ```{r}
 ## Assumes first column in the data frame corresponds to outcome,