Update documentation
kolesarm committed Jun 18, 2021
1 parent 1a51291 commit e105d0c
Showing 19 changed files with 81 additions and 61 deletions.
2 changes: 1 addition & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
inst/
Meta
.Rproj.user
.Rhistory
.RData
*.el
notes.org
doc
/Meta/
6 changes: 3 additions & 3 deletions R/FRD_lp.R
@@ -128,10 +128,10 @@ FRDHonest <- function(formula, data, subset, weights, cutoff=0, M,
#' class \code{"RDBW"} is a list containing the following components:
#'
#' \describe{
#' \item{\code{hp}}{bandwidth for observations above cutoff}
#' \item{\code{hp}}{bandwidth for observations weakly above cutoff}
#'
#' \item{\code{hm}}{bandwidth for observations below cutoff, equal to
#' \code{hp} unless \code{bw.equal==FALSE}}
#' \item{\code{hm}}{bandwidth for observations strictly below cutoff, equal
#' to \code{hp} unless \code{bw.equal==FALSE}}
#'
#' \item{\code{sigma2m}, \code{sigma2p}}{estimate of conditional variance
#' just above and just below cutoff, \eqn{\sigma^2_+(0)} and
4 changes: 2 additions & 2 deletions R/RD_lp.R
@@ -135,9 +135,9 @@ RDHonest <- function(formula, data, subset, weights, cutoff=0, M,
#' class \code{"RDBW"} is a list containing the following components:
#'
#' \describe{
#' \item{\code{hp}}{bandwidth for observations above cutoff}
#' \item{\code{hp}}{bandwidth for observations strictly above cutoff}
#'
#' \item{\code{hm}}{bandwidth for observations below cutoff, equal to
#' \item{\code{hm}}{bandwidth for observations weakly below cutoff, equal to
#' \code{hp} unless \code{bw.equal==FALSE}}
#'
#' \item{\code{sigma2m}, \code{sigma2p}}{estimate of conditional variance
2 changes: 1 addition & 1 deletion doc/RDHonest.R
@@ -78,7 +78,7 @@ RDHonest(voteshare ~ margin, data=lee08, kern="uniform", M=M, sclass="H", opt.cr


## -----------------------------------------------------------------------------
## Add variance estimate to the lee data so that the RDSmoothnessBound
## Add variance estimate to the Lee (2008) data so that the RDSmoothnessBound
## function doesn't have to compute them each time
dl <- NPRPrelimVar.fit(dl, se.initial="nn")

30 changes: 20 additions & 10 deletions doc/RDHonest.Rmd
@@ -47,10 +47,10 @@ In the sharp regression discontinuity model, we observe units $i=1,\dotsc,n$,
with the outcome $y_i$ for the $i$th unit given by $$ y_i = f(x_i) + u_i, $$
where $f(x_i)$ is the expectation of $y_i$ conditional on the running variable
$x_i$ and $u_i$ is the regression error. A unit is treated if and only if the
running variable $x_{i}$ lies above a known cutoff $c_{0}$. The parameter of
interest is given by the jump of $f$ at the cutoff, $$ \beta=\lim_{x\downarrow
c_{0}}f(x)-\lim_{x\uparrow c_{0}}f(x).$$ Let $\sigma^2(x_i)$ denote the
conditional variance of $u_i$.
running variable $x_{i}$ lies weakly above a known cutoff $c_{0}$, that is,
$x_{i}\geq c_{0}$. The parameter of interest is given by the jump of $f$ at the
cutoff, $$ \beta=\lim_{x\downarrow c_{0}}f(x)-\lim_{x\uparrow c_{0}}f(x).$$ Let
$\sigma^2(x_i)$ denote the conditional variance of $u_i$.

In the @lee08 dataset, the running variable corresponds to the margin of victory of
a Democratic candidate in a US House election, and the treatment corresponds to
@@ -63,7 +63,8 @@ occurred in 1947. The running variable is the year in which the individual turne
14, with the cutoff equal to 1947 so that the "treatment" is being subject to a
higher minimum school-leaving age. The outcome is log earnings in 1998.

Some of the functions in the package require the data to be transformed into a custom `RDData` format. This can be accomplished with the `RDData` function:
Some of the functions in the package require the data to be transformed into a
custom `RDData` format. This can be accomplished with the `RDData` function:

```{r}
library("RDHonest")
@@ -241,6 +242,10 @@ variable is discrete, with $G$ support points: their construction makes no
assumptions on the nature of the running variable (see Section 5.1 in @KoRo16
for more detailed discussion).

Note that units that lie exactly at the cutoff are considered treated, since
the definition of treatment is that the running variable satisfies
$x_i\geq c_0$.

As an example, consider the @oreopoulos06 data, in which the running variable is age in years:
```{r}
## Replicate Table 2, column (10)
Expand Down Expand Up @@ -393,7 +398,7 @@ The package also implements lower-bound estimates for the smoothness constant
$M$ for the Taylor and Hölder smoothness class, as described in the supplements to @KoRo16 and @ArKo16optimal

```{r}
## Add variance estimate to the lee data so that the RDSmoothnessBound
## Add variance estimate to the Lee (2008) data so that the RDSmoothnessBound
## function doesn't have to compute them each time
dl <- NPRPrelimVar.fit(dl, se.initial="nn")
Expand Down Expand Up @@ -443,8 +448,9 @@ different, but the worst-case bias and the point estimate are identical.

## Model

In a fuzzy RD design, the treatment $d_{i}$ is not entirely determined by
whether the running variable $x_{i}$ exceeds a cutoff. Instead, the cutoff
In a fuzzy RD design, units are assigned to treatment if their running variable
$x_{i}$ weakly exceeds a cutoff $c_{0}$, that is, $x_i\geq c_{0}$. However, the actual
treatment $d_{i}$ does not perfectly comply with the treatment assignment. Instead, the cutoff
induces a jump in the treatment probability. The resulting reduced-form and
first-stage regressions are given by
\begin{align*}
@@ -454,8 +460,12 @@ See Section 3.3 in @ArKo16honest for a more detailed description.

In the @battistin09 dataset, the treatment variable is an indicator for
retirement, and the running variable is number of years since being eligible to
retire. The cutoff is $0$. (individuals exactly at the cutoff are dropped).
Similarly to the `RDData` function, the `FRDData` function transforms the data into an appropriate format:
retire. The cutoff is $0$. Individuals exactly at the cutoff are dropped from
the dataset. If there were individuals exactly at the cutoff, they are assumed
to be assigned to the treatment group.

Similarly to the `RDData` function, the `FRDData` function transforms the data
into an appropriate format:

```{r}
## Assumes first column in the data frame corresponds to outcome,
Binary file modified doc/RDHonest.pdf
Binary file not shown.
Binary file modified doc/lpkernels.pdf
Binary file not shown.
Binary file modified doc/manual.pdf
Binary file not shown.
8 changes: 4 additions & 4 deletions man-roxygen/RDBW.R
@@ -1,6 +1,6 @@
#' @param h bandwidth, a scalar parameter. For fuzzy or sharp RD, it can be a
#' named vector of length two with names \code{"p"} and \code{"m"}, in which
#' case the bandwidth \code{h["m"]} is used for observations below the
#' cutoff, and the bandwidth \code{h["p"]} is used for observations above
#' the cutoff. If not supplied, optimal bandwidth is computed according to
#' criterion given by \code{opt.criterion}.
#' case the bandwidth \code{h["m"]} is used for observations strictly below
#' the cutoff, and the bandwidth \code{h["p"]} is used for observations
#' weakly above the cutoff. If not supplied, optimal bandwidth is computed
#' according to criterion given by \code{opt.criterion}.
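
To illustrate the convention documented in this hunk, here is a minimal sketch in base R. It is not the package's implementation: the helper `assign_bandwidth` and its arguments are hypothetical, and it only shows how a named vector `h` could map to observations strictly below versus weakly above the cutoff.

```r
## Hypothetical helper (not part of RDHonest): pick the bandwidth that
## applies to each observation given a named vector h = c(p=..., m=...).
assign_bandwidth <- function(x, h, cutoff = 0) {
    ## A scalar bandwidth is used on both sides of the cutoff
    if (length(h) == 1) h <- c(p = unname(h), m = unname(h))
    ## Observations with x >= cutoff (weakly above) get h["p"];
    ## observations with x < cutoff (strictly below) get h["m"]
    unname(ifelse(x >= cutoff, h["p"], h["m"]))
}

x <- c(-2, -0.5, 0, 0.5, 2)
## The observation at x == 0 sits exactly at the cutoff and gets h["p"]
assign_bandwidth(x, h = c(p = 1, m = 3))
```

Under this convention an observation exactly at the cutoff is grouped with the treated side, matching the `x_i >= c_0` definition of treatment used in the vignette.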
8 changes: 4 additions & 4 deletions man/FRDHonest.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 3 additions & 3 deletions man/FRDOptBW.Rd

8 changes: 4 additions & 4 deletions man/LPPHonest.Rd

8 changes: 4 additions & 4 deletions man/NPRHonest.fit.Rd

8 changes: 4 additions & 4 deletions man/NPRreg.fit.Rd

8 changes: 4 additions & 4 deletions man/RDHonest.Rd

8 changes: 4 additions & 4 deletions man/RDHonestBME.Rd

4 changes: 2 additions & 2 deletions man/RDOptBW.Rd

4 changes: 2 additions & 2 deletions tests/testthat/test_rd.R
@@ -104,8 +104,8 @@ test_that("Honest inference in Lee and LM data", {
expect_equal(r$maxbias, ff(r$hp, "uniform", "supplied.var")$maxbias)

r <- es("triangular", "nn")
expect_equal(r$hm, 22.80882408)
expect_equal(unname(r$estimate+r$hl), 0.05476609)
expect_lt(abs(r$hm- 22.80882408), 5e-7)
expect_lt(unname(r$estimate+r$hl- 0.05476609), 1e-7)
## End replication

## Replicate 1511.06028v2
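
A note on tolerance checks of the kind used in this hunk: a comparison of the form `difference < tol` is one-sided unless the difference is wrapped in `abs()`. The sketch below uses only base R and made-up numbers; the helper `within_tol` is hypothetical and is not part of the package or its test suite.

```r
## Hypothetical helper: TRUE if `actual` is within `tol` of `expected`
within_tol <- function(actual, expected, tol) {
    abs(actual - expected) < tol
}

expected <- 0.05476609
too_small <- expected - 1      # clearly wrong value, far below the target

## One-sided check: a large negative difference still passes
(too_small - expected) < 1e-7            # TRUE
## Two-sided check: the same value is rejected
within_tol(too_small, expected, 1e-7)    # FALSE
```

Wrapping the difference in `abs()` makes the check fail for deviations in either direction, which is usually what a replication test intends.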
28 changes: 19 additions & 9 deletions vignettes/RDHonest.Rmd
@@ -47,10 +47,10 @@ In the sharp regression discontinuity model, we observe units $i=1,\dotsc,n$,
with the outcome $y_i$ for the $i$th unit given by $$ y_i = f(x_i) + u_i, $$
where $f(x_i)$ is the expectation of $y_i$ conditional on the running variable
$x_i$ and $u_i$ is the regression error. A unit is treated if and only if the
running variable $x_{i}$ lies above a known cutoff $c_{0}$. The parameter of
interest is given by the jump of $f$ at the cutoff, $$ \beta=\lim_{x\downarrow
c_{0}}f(x)-\lim_{x\uparrow c_{0}}f(x).$$ Let $\sigma^2(x_i)$ denote the
conditional variance of $u_i$.
running variable $x_{i}$ lies weakly above a known cutoff $c_{0}$, that is,
$x_{i}\geq c_{0}$. The parameter of interest is given by the jump of $f$ at the
cutoff, $$ \beta=\lim_{x\downarrow c_{0}}f(x)-\lim_{x\uparrow c_{0}}f(x).$$ Let
$\sigma^2(x_i)$ denote the conditional variance of $u_i$.

In the @lee08 dataset, the running variable corresponds to the margin of victory of
a Democratic candidate in a US House election, and the treatment corresponds to
@@ -63,7 +63,8 @@ occurred in 1947. The running variable is the year in which the individual turne
14, with the cutoff equal to 1947 so that the "treatment" is being subject to a
higher minimum school-leaving age. The outcome is log earnings in 1998.

Some of the functions in the package require the data to be transformed into a custom `RDData` format. This can be accomplished with the `RDData` function:
Some of the functions in the package require the data to be transformed into a
custom `RDData` format. This can be accomplished with the `RDData` function:

```{r}
library("RDHonest")
@@ -241,6 +242,10 @@ variable is discrete, with $G$ support points: their construction makes no
assumptions on the nature of the running variable (see Section 5.1 in @KoRo16
for more detailed discussion).

Note that units that lie exactly at the cutoff are considered treated, since
the definition of treatment is that the running variable satisfies
$x_i\geq c_0$.

As an example, consider the @oreopoulos06 data, in which the running variable is age in years:
```{r}
## Replicate Table 2, column (10)
Expand Down Expand Up @@ -443,8 +448,9 @@ different, but the worst-case bias and the point estimate are identical.

## Model

In a fuzzy RD design, the treatment $d_{i}$ is not entirely determined by
whether the running variable $x_{i}$ exceeds a cutoff. Instead, the cutoff
In a fuzzy RD design, units are assigned to treatment if their running variable
$x_{i}$ weakly exceeds a cutoff $c_{0}$, that is, $x_i\geq c_{0}$. However, the actual
treatment $d_{i}$ does not perfectly comply with the treatment assignment. Instead, the cutoff
induces a jump in the treatment probability. The resulting reduced-form and
first-stage regressions are given by
\begin{align*}
Expand All @@ -454,8 +460,12 @@ See Section 3.3 in @ArKo16honest for a more detailed description.

In the @battistin09 dataset, the treatment variable is an indicator for
retirement, and the running variable is number of years since being eligible to
retire. The cutoff is $0$. (individuals exactly at the cutoff are dropped).
Similarly to the `RDData` function, the `FRDData` function transforms the data into an appropriate format:
retire. The cutoff is $0$. Individuals exactly at the cutoff are dropped from
the dataset. If there were individuals exactly at the cutoff, they are assumed
to be assigned to the treatment group.

Similarly to the `RDData` function, the `FRDData` function transforms the data
into an appropriate format:

```{r}
## Assumes first column in the data frame corresponds to outcome,