V1 full support based on district level data only #52

Merged: 119 commits, Aug 27, 2024


e-kotov
Member

@e-kotov e-kotov commented Aug 22, 2024

ok, all done, full support for V1 data and much more cool stuff.

Important: to simplify a few things, the v1 data is now downloaded as csv.gz instead of txt.gz. So before testing, please nuke your cache folder or rename all txt.gz files to csv.gz files. txt.gz files should not get in the way as the functions look specifically for csv.gz files, but if you do not remove them, they will just take up storage space.
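For anyone who prefers renaming over deleting the cache, the step above could be scripted roughly as follows. This is only a sketch: the helper name `rename_txt_gz_to_csv_gz()` and the demo file name are made up for illustration, and you need to pass the path to your own cache folder, since its location is not stated here.

```r
# Hypothetical helper (not part of the package): rename every *.txt.gz file
# under `cache_dir` (searched recursively) to *.csv.gz.
rename_txt_gz_to_csv_gz <- function(cache_dir, dry_run = TRUE) {
  old_paths <- list.files(
    cache_dir,
    pattern = "\\.txt\\.gz$",
    recursive = TRUE,
    full.names = TRUE
  )
  new_paths <- sub("\\.txt\\.gz$", ".csv.gz", old_paths)

  # Only touch the file system when explicitly asked to.
  if (!dry_run && length(old_paths) > 0) {
    file.rename(old_paths, new_paths)
  }

  # Return the planned (or performed) renames for inspection.
  invisible(data.frame(from = old_paths, to = new_paths))
}

# Demo on a throwaway directory with a made-up file name.
demo_dir <- file.path(tempdir(), "spod_cache_demo")
dir.create(file.path(demo_dir, "v1"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(demo_dir, "v1", "example.txt.gz"))

renamed <- rename_txt_gz_to_csv_gz(demo_dir, dry_run = FALSE)
```

Running it first with `dry_run = TRUE` and printing the returned data frame shows what would be renamed without changing anything.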

Please install with

remotes::install_github(
 "Robinlovelace/spanishoddata@v1-full-support-based-on-district-level-data-only",
 build_vignettes = TRUE,
 force = TRUE
)

If the above fails for you because of vignettes, do:

Sys.setenv(PKG_BUILD_VIGNETTES=TRUE)
pak::pkg_install(
    "Robinlovelace/spanishoddata@v1-full-support-based-on-district-level-data-only",
    upgrade = TRUE
)

Here's an overview:

flowchart TB
    A["For quick analysis of few dates\nwork with raw CSV.gz data"] -->|"spod_get(
    type = 'origin-destination',
    zones = 'districts',
    dates = c(start = '2020-02-14', end = '2020-03-14') )"
    | F["'tbl' object with 'id' for origins and destinations"]
    
    C["Analyse longer periods (several months)\nor even the whole dataset over several years"]
    -->|"spod_convert_for_analysis(
type = 'origin-destination',
    zones = 'districts',
    dates = c(start = '2020-02-14', end = '2021-05-09') )"| D["path to converted data"]
    D -->|"spod_connect_to_converted_data()" | F["'tbl' object with 'id' for origins and destinations"]
    
    F -->|"dplyr functions: select(), filter(), mutate(), group_by(), summarise(), etc..."| G["dplyr::collect()"]
    G --> H["Result: data.frame / tibble"] --> R[spatial data matched by 'id' with aggregated mobility flows]

    X["spatial data with zones"] --> |"spod_get_zones(
    zones = 'districts',
    ver = 1 )"| Y["polygons with zones in sf object\nwith 'id' that match with origins and destinations"] --> R

See the new vignette for testing all the functions.

@eugenividal I am very interested in your opinion on how well the vignette explains the concept of working with the virtual duckdb table that you get from spod_get() and spod_connect_to_converted_data(): https://github.com/Robinlovelace/spanishoddata/blob/v1-full-support-based-on-district-level-data-only/vignettes/v1-2020-2021-mitma-data-codebook.qmd . I would also welcome suggestions for better function names, specifically for the user-facing functions covered in the vignette.
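To make the "virtual table" idea concrete for reviewers, here is a minimal, self-contained sketch that does not use the package at all. It assumes the DBI, duckdb, dplyr and dbplyr packages are installed, and it uses a toy table in place of the real data; the column names (`id_origin`, `id_destination`, `n_trips`) and the zone names are hypothetical, not taken from the codebook.

```r
library(DBI)
library(dplyr)

# Toy origin-destination table; column names are assumptions for illustration.
toy_od <- data.frame(
  id_origin = c("01", "01", "02", "02"),
  id_destination = c("02", "03", "01", "03"),
  n_trips = c(10, 5, 7, 3)
)

# In-memory DuckDB database standing in for the real data source.
con <- dbConnect(duckdb::duckdb())
dbWriteTable(con, "od", toy_od)

# A lazy table: no rows are loaded into R at this point.
od <- tbl(con, "od")

# dplyr verbs only build up a query against the database...
outflows_query <- od |>
  group_by(id_origin) |>
  summarise(total_trips = sum(n_trips, na.rm = TRUE))

# ...and collect() runs it and returns an ordinary tibble.
outflows <- collect(outflows_query)

dbDisconnect(con, shutdown = TRUE)

# Match the aggregated flows to zones by 'id'. A plain data frame stands in
# for the sf object with zone polygons.
zones <- data.frame(
  id = c("01", "02", "03"),
  name = c("Zone A", "Zone B", "Zone C")
)
zones_with_flows <- merge(
  zones, outflows,
  by.x = "id", by.y = "id_origin",
  all.x = TRUE
)
```

The point of the design is that filtering and aggregation happen inside DuckDB, so only the small summarised result has to fit in memory.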

e-kotov added 30 commits August 15, 2024 17:11
@Robinlovelace
Collaborator

Great, let me know if you want me to postpone putting it out there but sounds like we're on-track for launch.

@Robinlovelace
Collaborator

All on track to merge this by 15:00 UK time (16:00 in Germany I believe) @e-kotov ?

@e-kotov
Member Author

e-kotov commented Aug 27, 2024

@Robinlovelace yes, cleaned up 99%. Ready. We don't have a logo)

@e-kotov e-kotov merged commit e0f8a81 into main Aug 27, 2024
2 checks passed
@e-kotov e-kotov deleted the v1-full-support-based-on-district-level-data-only branch August 27, 2024 11:13
@e-kotov
Member Author

e-kotov commented Aug 27, 2024

Done. Let's see how the pkgdown website renders and looks live; perhaps we will make a few tweaks.

@e-kotov
Member Author

e-kotov commented Aug 27, 2024

OK, now we are done. We have a working package,

a logo

and an opengraph card for social media

@e-kotov
Member Author

e-kotov commented Aug 27, 2024

@Robinlovelace
Collaborator

Sure!

@eugenividal
Collaborator

eugenividal commented Aug 27, 2024

@e-kotov The logo is super cool. I loved it! 🤩

Just a quick question: is that plot real? I mean, does the arrow width represent the volume of trips? I'd have expected most trips in the triangle Madrid - Barcelona - Valencia.

@e-kotov
Member Author

e-kotov commented Aug 27, 2024

@eugenividal it is real. I would have spent the whole day to make it by hand)
I will post the code a bit later; I want to find a good place to put it. It probably should not be in the package itself in the R folder, but in some other folder in the repo. These flows are aggregated, so what you see could be more of a visual artifact. They are not between cities, but between provinces.

@eugenividal
Collaborator

eugenividal commented Aug 28, 2024

@e-kotov thanks for your quick reply. That's interesting!

@e-kotov
Member Author

e-kotov commented Aug 28, 2024

@eugenividal ah, also I did some flow aggregation, as there's some bug in flowmapper that I did not have time to deal with
Screenshot 2024-08-28 at 10 22 29

So the lines in the logo actually combine flows for just 10 nodes, and some provinces are aggregated into a single point... The image above is with 19 nodes (max allowed by the package)

@eugenividal
Collaborator

eugenividal commented Aug 28, 2024

@e-kotov There are 17 autonomous communities and two autonomous cities (Ceuta and Melilla) in Spain. Perfect for the max allowed in the package!

Collaborator

@eugenividal eugenividal left a comment


Hi @e-kotov, I had a chance to take a look at the vignette "Downloading and converting OD datasets".
I like the new function names. I think they are effectively balanced between being descriptive and not overly long.
The text is generally clear and easy to understand. However, I believe that restructuring some sections could improve the overall clarity.
This and other minor comments (rephrasing suggestions) are explained below.
I also got some errors when running the code. I could provide you with details later if you need them.

vignettes/convert.qmd (10 review comments, resolved)
@Robinlovelace
Collaborator

Great set of comments @eugenividal. PR welcome, look forward to seeing @e-kotov's comments, your suggestions seem good to me, thank you!

@e-kotov
Member Author

e-kotov commented Aug 29, 2024

@eugenividal Thanks for the thorough review and feedback. I am working on an updated version in a new branch here: https://github.com/Robinlovelace/spanishoddata/blob/minor-fixes-ek/vignettes/convert.qmd . It is a work in progress, so no new review is necessary. I will do a PR when I'm ready.

@eugenividal
Collaborator

@e-kotov No problem. Great! I'll be working on other stuff until Tuesday. Yes, please, send a new PR when it is ready and let me know from Tuesday if you would like me to look at any other aspects.

Labels
enhancement New feature or request

3 participants