Modernize PUDL packaging #2140

Closed
zaneselvans opened this issue Dec 16, 2022 · 4 comments
Assignees: zaneselvans
Labels: dependencies (Pull requests that update a dependency file), packaging (Software packaging and distribution of PUDL via pypi, etc.)

Comments

@zaneselvans
Member

zaneselvans commented Dec 16, 2022

Installation using setup.py has been deprecated in favor of PEP 517 / PEP 518 / PEP 621 / PEP 660 based systems, and we are getting warnings about it, so we need to overhaul the packaging setup. We should probably configure the packaging entirely using pyproject.toml instead.

In addition, the pudl.metadata.templates directory is a "package" with no code (just jinja templates) and it needs to be specifically called out for inclusion in the package -- otherwise it'll be left out in future packaging setups.
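One way to check that template files actually ship with a package is to list them through `importlib.resources`. The following is a minimal sketch, not PUDL's own tooling: `demo_pkg.templates` and `resource.rst.jinja` are hypothetical stand-ins created in a temporary directory, and the `.jinja` suffix is an assumption about how the templates are named.

```python
import importlib
import importlib.resources
import sys
import tempfile
from pathlib import Path


def list_template_files(package: str, suffix: str = ".jinja") -> list[str]:
    """Return the sorted names of files ending in ``suffix`` inside ``package``."""
    root = importlib.resources.files(package)
    return sorted(entry.name for entry in root.iterdir() if entry.name.endswith(suffix))


def make_demo_package(base: Path) -> str:
    """Write a throwaway package that holds only a template (hypothetical names)."""
    templates = base / "demo_pkg" / "templates"
    templates.mkdir(parents=True)
    (base / "demo_pkg" / "__init__.py").write_text("")
    (templates / "__init__.py").write_text("")
    (templates / "resource.rst.jinja").write_text("{{ title }}\n")
    return "demo_pkg.templates"


with tempfile.TemporaryDirectory() as tmp:
    package_name = make_demo_package(Path(tmp))
    sys.path.insert(0, tmp)  # make the demo package importable
    importlib.invalidate_caches()
    try:
        found = list_template_files(package_name)
    finally:
        sys.path.remove(tmp)

print(found)
```

Running the same helper against an installed (non-editable) build would show whether the templates directory was left out: an empty list means the data files did not make it into the package.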

We may also want to look at using another build system like poetry, and explicitly lock dependencies to particular versions. There is a blog post from last year about the changes.

@zaneselvans added the packaging and dependencies labels Dec 16, 2022
@zaneselvans self-assigned this Dec 16, 2022
@jdangerx
Member

jdangerx commented Jan 4, 2023

Looks like we can use conda-lock to pin our transitive dependencies, which will help with our dev env reproducibility/upgradability. In theory we won't have to recreate our environments when switching branches :)

There's a blog post about it here, too: https://pythonspeed.com/articles/conda-dependency-management/

@zaneselvans
Member Author

How does using a lockfile keep us from needing to recreate environments when we switch branches? If the branches have different dependencies wouldn't that be reflected in the different lockfiles, and wouldn't the python environment need to be rebuilt to match?

While we use conda for Python environment isolation and for installing some finicky binary / system dependencies, almost all of the software installation is actually done by pip. The overwhelming majority of the packages in the conda environment are installed by the pip section of the file, which installs the actual catalystcoop.pudl package (from the repo, with extras, using --editable). Would those pip-installed packages end up being captured by the conda environment? I guess they show up as installed when you do conda list, but they come from the pypi "channel".
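To see how an environment splits between installers, one option is to read the `INSTALLER` file that installers may write into each distribution's metadata. This is a rough sketch using only the standard library. pip records `pip` there; whether a given conda package records anything is an assumption to verify in your own environment, so distributions without the file are grouped under `unknown`.

```python
from collections import defaultdict
from importlib import metadata


def group_by_installer() -> dict[str, list[str]]:
    """Group installed distribution names by the tool recorded in INSTALLER."""
    groups = defaultdict(list)
    for dist in metadata.distributions():
        try:
            name = dist.metadata.get("Name") or "unknown"
        except (KeyError, TypeError, AttributeError):
            # Broken or missing metadata: keep going rather than fail.
            name = "unknown"
        # read_text returns None when the file is absent.
        installer = (dist.read_text("INSTALLER") or "").strip() or "unknown"
        groups[installer].append(name)
    return {installer: sorted(names) for installer, names in groups.items()}


groups = group_by_installer()
for installer, names in sorted(groups.items()):
    print(f"{installer}: {len(names)} distributions")
```

The counts give a quick sense of how much of the environment pip is responsible for, which bears on whether a conda-level lockfile would capture those packages.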

@jdangerx
Member

jdangerx commented Jan 5, 2023

The python environment would be updated to match the lockfile, yeah. Though it'd be an incremental rebuild, not a full one.

In theory, different branches will only have different library versions if we've intentionally changed direct dependency versions. Do we anticipate adding/removing/updating direct dependencies frequently? I just took a look at git log setup.py and it seems like we've been bumping versions a couple of times a week over the last few months, which seems kind of annoying but maybe not terrible.

I'm not sure how conservative the solver is re: trying to change as few library versions as possible when re-solving, but I'd assume that it is still faster than "re-do everything every time."
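The idea of an incremental rebuild can be pictured as a diff between two sets of pins: only packages whose pinned version differs need to change. The sketch below illustrates that idea only. It is not how conda or conda-lock works internally, and the package names and versions are made-up examples.

```python
def plan_update(current: dict[str, str], target: dict[str, str]) -> dict[str, list[str]]:
    """Compare two name -> version mappings and list what would have to change."""
    install = sorted(name for name in target if name not in current)
    remove = sorted(name for name in current if name not in target)
    update = sorted(
        name for name in target if name in current and current[name] != target[name]
    )
    return {"install": install, "update": update, "remove": remove}


# Hypothetical pins for two branches.
current = {"pandas": "1.5.2", "sqlalchemy": "1.4.45", "old-lib": "0.1"}
target = {"pandas": "1.5.3", "sqlalchemy": "1.4.45", "new-lib": "2.0"}

plan = plan_update(current, target)
print(plan)
```

Packages whose pins match in both lockfiles (here `sqlalchemy`) are left alone, which is why switching branches would touch only the changed entries rather than the whole environment.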

conda-lock does appear to support pip dependencies within pyproject.toml, see this section of the docs - though I think we'll probably have to try moving everything to pyproject.toml first (looks like conda-lock also allows for the non-Python dependencies to be defined in pyproject.toml, but I wonder about how to propagate that stuff to downstream users...)

@zaneselvans
Member Author

This should have been closed by #2479

Opening a new issue to deal with lockfiles in particular: #2896

2 participants