Commit

Merge pull request #51 from Robinlovelace/more-readme-changes-evt
more changes to README
eugenividal authored Aug 22, 2024
2 parents 27a7ace + b0a6745 commit 2abc2a8
Showing 3 changed files with 65 additions and 70 deletions.
104 changes: 47 additions & 57 deletions README.md
@@ -5,37 +5,29 @@
[![R-CMD-check](https://github.com/Robinlovelace/spanish_od_data/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/Robinlovelace/spanish_od_data/actions/workflows/R-CMD-check.yaml)
<!-- badges: end -->

**spanishoddata** provides functions for downloading and formatting
Spanish origin-destination (OD) data from the Ministry of Transport and
Sustainable Mobility of Spain.

The Spanish OD data is primarily sourced from mobile phone positioning
data and includes matrices for overnight stays, individual movements,
and trips of Spanish residents at different geographical levels.

This data is part of the basic studies of the ‘Open Data Mobility’
project and it is released monthly on the
[website](https://www.transportes.gob.es/ministerio/proyectos-singulares/estudios-de-movilidad-con-big-data/opendata-movilidad)
of the Ministry of Transport and Sustainable Mobility of Spain from
January 2022.

The data is provided in the ‘estudios_basicos’ folder (at the end of the
website) which has three levels of nested subfolders: the first for
geographic area levels (districts, large urban agglomerations,
municipalities); the second for the OD matrix type (overnight stays,
individuals, trips); and the third for temporal aggregation (daily and
full month).
**spanishoddata** is an R package that provides functions for
downloading and formatting Spanish origin-destination (OD) data from the
Ministry of Transport and Sustainable Mobility of Spain.

It supports the two versions of the Spanish OD data. [The first
version](https://www.transportes.gob.es/ministerio/proyectos-singulares/estudios-de-movilidad-con-big-data/estudios-de-movilidad-anteriores/covid-19/opendata-movilidad)
covers data from 2020 and 2021, including the period of the COVID-19
pandemic. [The second
version](https://www.transportes.gob.es/ministerio/proyectos-singulares/estudios-de-movilidad-con-big-data/opendata-movilidad)
contains data from January 2022 onwards and is updated monthly on the
fifteenth of each month. Both versions of the data primarily consist of
mobile phone positioning data, and include matrices for overnight stays,
individual movements, and trips of Spanish residents at different
geographical levels.

**spanishoddata** is designed to save people time by providing the data
in analysis-ready formats. Automating the process of downloading,
cleaning and importing the data can also reduce the risk of errors in
the laborious process of data preparation.

**spanishoddata** also reduces computational resources by using
computationally efficient packages behind the scenes. To effectively
work with multiple data files, it’s recommended you set up a data
directory where the package can search for the data, and download only
the files that are not already present.
cleaning, and importing the data can also reduce the risk of errors in
the laborious process of data preparation. It also reduces computational
resources by using computationally efficient packages behind the scenes.
To effectively work with multiple data files, it’s recommended you set
up a data directory where the package can search for the data and
download only the files that are not already present.

# Installation

@@ -46,16 +38,6 @@ if (!require("remotes")) install.packages("remotes")
remotes::install_github("Robinlovelace/spanishoddata")
```

parallelly (1.37.1 -> 1.38.0 ) [CRAN]
duckdb (0.10.2 -> 1.0.0-2) [CRAN]
── R CMD build ─────────────────────────────────────────────────────────────────
* checking for file ‘/tmp/RtmpIfAw5i/remotes5ff8143b636a6/Robinlovelace-spanishoddata-b5ca4d9/DESCRIPTION’ ... OK
* preparing ‘spanishoddata’:
* checking DESCRIPTION meta-information ... OK
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
* building ‘spanishoddata_0.0.0.9000.tar.gz’

Load it as follows:

``` r
Expand Down Expand Up @@ -150,14 +132,6 @@ Zones can be downloaded as follows:

``` r
distritos <- spod_get_zones("distritos", ver = 2)
```

Deleting source `/tmp/RtmpIfAw5i/clean_data/v2/zones/distritos_mitma.gpkg' failed
Writing layer `distritos_mitma' to data source
`/tmp/RtmpIfAw5i/clean_data/v2/zones/distritos_mitma.gpkg' using driver `GPKG'
Writing 3909 features with 3 fields and geometry type Unknown (any).

``` r
distritos_wgs84 <- distritos |>
sf::st_simplify(dTolerance = 200) |>
sf::st_transform(4326)
@@ -166,7 +140,7 @@ plot(sf::st_geometry(distritos_wgs84))

![](man/figures/README-distritos-1.png)

## Estudios básicos
## OD data

``` r
od_db <- spod_get_od(
Expand Down Expand Up @@ -252,14 +226,14 @@ od1_head |>
knitr::kable()
```

| fecha | periodo | origen | destino | distancia | actividad_origen | actividad_destino | estudio_origen_posible | estudio_destino_posible | residencia | renta | edad | sexo | viajes | viajes_km |
|---:|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|---:|---:|
| 20240301 | 19 | 01009_AM | 01001 | 0.5-2 | frecuente | casa | no | no | 01 | 10-15 | NA | NA | 5.124 | 6.120 |
| 20240301 | 15 | 01002 | 01001 | 10-50 | frecuente | casa | no | no | 01 | 10-15 | NA | NA | 2.360 | 100.036 |
| 20240301 | 00 | 01009_AM | 01001 | 10-50 | frecuente | casa | no | no | 01 | 10-15 | NA | NA | 1.743 | 22.293 |
| 20240301 | 05 | 01009_AM | 01001 | 10-50 | frecuente | casa | no | no | 01 | 10-15 | NA | NA | 2.404 | 24.659 |
| 20240301 | 06 | 01009_AM | 01001 | 10-50 | frecuente | casa | no | no | 01 | 10-15 | NA | NA | 5.124 | 80.118 |
| 20240301 | 09 | 01009_AM | 01001 | 10-50 | frecuente | casa | no | no | 01 | 10-15 | NA | NA | 7.019 | 93.938 |
| fecha | periodo | origen | destino | distancia | actividad_origen | actividad_destino | estudio_origen_posible | estudio_destino_posible | residencia | renta | edad | sexo | viajes | viajes_km |
|---------:|:--------|:---------|:--------|:----------|:-----------------|:------------------|:-----------------------|:------------------------|:-----------|:------|:-----|:-----|-------:|----------:|
| 20240301 | 19 | 01009_AM | 01001 | 0.5-2 | frecuente | casa | no | no | 01 | 10-15 | NA | NA | 5.124 | 6.120 |
| 20240301 | 15 | 01002 | 01001 | 10-50 | frecuente | casa | no | no | 01 | 10-15 | NA | NA | 2.360 | 100.036 |
| 20240301 | 00 | 01009_AM | 01001 | 10-50 | frecuente | casa | no | no | 01 | 10-15 | NA | NA | 1.743 | 22.293 |
| 20240301 | 05 | 01009_AM | 01001 | 10-50 | frecuente | casa | no | no | 01 | 10-15 | NA | NA | 2.404 | 24.659 |
| 20240301 | 06 | 01009_AM | 01001 | 10-50 | frecuente | casa | no | no | 01 | 10-15 | NA | NA | 5.124 | 80.118 |
| 20240301 | 09 | 01009_AM | 01001 | 10-50 | frecuente | casa | no | no | 01 | 10-15 | NA | NA | 7.019 | 93.938 |

``` r
DBI::dbDisconnect(con)
@@ -276,7 +250,7 @@ od_multi_list[[1]]
```

# Source: SQL [?? x 18]
# Database: DuckDB v1.0.0 [robin@Linux 6.8.0-40-generic:R 4.4.1/:memory:]
# Database: DuckDB v1.0.0 [eugeni@Linux 5.15.0-118-generic:R 4.4.1/:memory:]
fecha periodo origen destino distancia actividad_origen actividad_destino
<dbl> <chr> <chr> <chr> <chr> <chr> <chr>
1 20240307 00 01009_… 01001 0.5-2 frecuente casa
Expand Down Expand Up @@ -342,7 +316,8 @@ od_national_interzonal <- od_national_aggregated |>
filter(id_origin != id_destination)
```

We can convert these to geographic data with the {od} package:
We can convert these to geographic data with the {od} package (Lovelace
and Morgan 2024):

``` r
od_national_sf <- od::od_to_sf(
Expand Down Expand Up @@ -410,3 +385,18 @@ For more information on the package, see:
- The [uses
vignette](https://robinlovelace.github.io/spanishoddata/articles/uses.html)
which documents use cases

# References

<div id="refs" class="references csl-bib-body hanging-indent"
entry-spacing="0">

<div id="ref-lovelace_od_2024" class="csl-entry">

Lovelace, Robin, and Malcolm Morgan. 2024. “Od: Manipulate and Map
Origin-Destination Data,” August.
<https://cran.r-project.org/web/packages/od/od.pdf>.

</div>

</div>
21 changes: 8 additions & 13 deletions README.qmd
@@ -1,5 +1,6 @@
---
format: gfm
bibliography: references.bib
execute:
message: false
warning: false
@@ -12,19 +13,11 @@ knitr:
[![R-CMD-check](https://github.com/Robinlovelace/spanish_od_data/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/Robinlovelace/spanish_od_data/actions/workflows/R-CMD-check.yaml)
<!-- badges: end -->

**spanishoddata** provides functions for downloading and formatting Spanish origin-destination (OD) data from the Ministry of Transport and Sustainable Mobility of Spain.
**spanishoddata** is an R package that provides functions for downloading and formatting Spanish origin-destination (OD) data from the Ministry of Transport and Sustainable Mobility of Spain.

The Spanish OD data is primarily sourced from mobile phone positioning data and includes matrices for overnight stays, individual movements, and trips of Spanish residents at different geographical levels.
It supports the two versions of the Spanish OD data. [The first version](https://www.transportes.gob.es/ministerio/proyectos-singulares/estudios-de-movilidad-con-big-data/estudios-de-movilidad-anteriores/covid-19/opendata-movilidad) covers data from 2020 and 2021, including the period of the COVID-19 pandemic. [The second version](https://www.transportes.gob.es/ministerio/proyectos-singulares/estudios-de-movilidad-con-big-data/opendata-movilidad) contains data from January 2022 onwards and is updated monthly on the fifteenth of each month. Both versions of the data primarily consist of mobile phone positioning data, and include matrices for overnight stays, individual movements, and trips of Spanish residents at different geographical levels.

This data is part of the basic studies of the 'Open Data Mobility' project and it is released monthly on the [website](https://www.transportes.gob.es/ministerio/proyectos-singulares/estudios-de-movilidad-con-big-data/opendata-movilidad) of the Ministry of Transport and Sustainable Mobility of Spain from January 2022.

The data is provided in the ‘estudios_basicos’ folder (at the end of the website) which has three levels of nested subfolders: the first for geographic area levels (districts, large urban agglomerations, municipalities); the second for the OD matrix type (overnight stays, individuals, trips); and the third for temporal aggregation (daily and full month).

**spanishoddata** is designed to save people time by providing the data in analysis-ready formats.
Automating the process of downloading, cleaning and importing the data can also reduce the risk of errors in the laborious process of data preparation.

**spanishoddata** also reduces computational resources by using computationally efficient packages behind the scenes.
To effectively work with multiple data files, it's recommended you set up a data directory where the package can search for the data, and download only the files that are not already present.
**spanishoddata** is designed to save people time by providing the data in analysis-ready formats. Automating the process of downloading, cleaning, and importing the data can also reduce the risk of errors in the laborious process of data preparation. It also reduces computational resources by using computationally efficient packages behind the scenes. To effectively work with multiple data files, it’s recommended you set up a data directory where the package can search for the data and download only the files that are not already present.

# Installation

Expand Down Expand Up @@ -126,7 +119,7 @@ distritos_wgs84 <- distritos |>
plot(sf::st_geometry(distritos_wgs84))
```

## Estudios básicos
## OD data

```{r}
od_db <- spod_get_od(
Expand Down Expand Up @@ -229,7 +222,7 @@ od_national_interzonal <- od_national_aggregated |>
filter(id_origin != id_destination)
```

We can convert these to geographic data with the {od} package:
We can convert these to geographic data with the {od} package [@lovelace_od_2024]:

```{r}
#| label: desire-lines
Expand Down Expand Up @@ -317,3 +310,5 @@ usethis::use_tidy_description()
# Create new vignette called 'uses' with the title "Use cases":
usethis::use_vignette("uses", title = "Use cases")
```

# References
10 changes: 10 additions & 0 deletions references.bib
@@ -0,0 +1,10 @@

@article{lovelace_od_2024,
title = {od: {Manipulate} and {Map} {Origin}-{Destination} {Data}},
url = {https://cran.r-project.org/web/packages/od/od.pdf},
language = {en},
author = {Lovelace, Robin and Morgan, Malcolm},
month = aug,
year = {2024},
file = {Lovelace and Morgan - od Manipulate and Map Origin-Destination Data.pdf:/home/eugeni/Zotero/storage/UR5W7NFU/Lovelace and Morgan - od Manipulate and Map Origin-Destination Data.pdf:application/pdf},
}
