create a warning for actions that may load more than X GB into memory #34

Closed
e-kotov opened this issue Aug 14, 2024 · 3 comments

@e-kotov
Member

e-kotov commented Aug 14, 2024

With lazy tables being a relatively new concept for most users, it is super easy for the user to accidentally try to dplyr::collect() too much data into memory.

Perhaps warn novice users when they create a DuckDB lazy table based on more than X days of data: they should not dplyr::collect() it into memory, and should instead learn to use DuckDB/Arrow with dplyr and pull only the results of aggregation.

Expert users would be able to silence this warning with a package option.
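
A minimal sketch of how such a check could look, using only base R. The function name `check_collect_size()` and the option names `mypkg.max_days_warn` and `mypkg.quiet_size_warning` are hypothetical placeholders, not part of the package, and the 7-day default is an arbitrary assumption:

```r
# Hypothetical helper: all names and the default threshold are illustrative.
check_collect_size <- function(n_days,
                               max_days = getOption("mypkg.max_days_warn", 7)) {
  # Expert users can silence the check with a package option.
  if (isTRUE(getOption("mypkg.quiet_size_warning", FALSE))) {
    return(invisible(FALSE))
  }
  # Warn when the lazy table spans more days than the threshold.
  if (n_days > max_days) {
    warning(
      "This lazy table covers ", n_days, " days of data. ",
      "Avoid dplyr::collect() on the full table; aggregate first ",
      "and collect only the result.",
      call. = FALSE
    )
    return(invisible(TRUE))
  }
  invisible(FALSE)
}
```

Under these assumptions, `check_collect_size(30)` would raise the warning, and `options(mypkg.quiet_size_warning = TRUE)` would switch it off for the session.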

@e-kotov e-kotov self-assigned this Aug 14, 2024
@Robinlovelace
Collaborator

I think this is a good idea, but also that we shouldn't spend too much time on 'defensive programming'. If we state in the docs that the package can download lots of data and that we assume the user will take care with their settings, that could save some lines of code, right?

@e-kotov
Member Author

e-kotov commented Aug 14, 2024

Sure, just noting down some nice-to-haves so that I don't forget. Not a big priority, and I guess we are already doing some explaining in the README on how to work with lazy tables. A side effect of the package is that we are educating users about cutting-edge approaches to working with large datasets on their laptops.
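
For reference, the pattern we want users to learn looks roughly like this. It is a sketch with a tiny made-up table (`trips`, `day`, `n_trips` are illustrative names, not the package's actual schema), assuming the DBI, duckdb, dplyr and dbplyr packages are installed:

```r
library(dplyr)

# In-memory DuckDB database with a small made-up table (illustrative data only).
con <- DBI::dbConnect(duckdb::duckdb())
DBI::dbWriteTable(
  con, "trips",
  data.frame(day = c(1, 1, 2), n_trips = c(10, 20, 5))
)

# Lazy table: nothing is loaded into R memory at this point.
trips <- tbl(con, "trips")

# Aggregate inside DuckDB, then collect only the small result.
daily <- trips %>%
  group_by(day) %>%
  summarise(total = sum(n_trips, na.rm = TRUE)) %>%
  arrange(day) %>%
  collect()

DBI::dbDisconnect(con, shutdown = TRUE)
```

Only the two-row summary in `daily` ends up in memory; the full table stays in DuckDB.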

@e-kotov
Member Author

e-kotov commented Aug 27, 2024

Closed by #52
