Merge branch 'main' into update-metadata-docstring
chinandrew authored Oct 30, 2020
2 parents 7d0105f + 472ae36 commit e7c5a39
Showing 102 changed files with 1,615 additions and 976 deletions.
12 changes: 6 additions & 6 deletions .github/workflows/python_ci.yml
@@ -11,8 +11,10 @@ on:

jobs:
build:

runs-on: ubuntu-latest
defaults:
run:
working-directory: Python-packages/covidcast-py/
strategy:
matrix:
python-version: [3.6]
@@ -25,12 +27,10 @@ jobs:
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -r Python-packages/covidcast-py/requirements.txt
pip install -r requirements_ci.txt
- name: Lint with pylint and mypy
run: |
pylint Python-packages/covidcast-py/covidcast/ --rcfile Python-packages/covidcast-py/.pylintrc
mypy Python-packages/covidcast-py/covidcast --config-file Python-packages/covidcast-py/mypy.ini
make lint
- name: Test with pytest
run: |
pytest Python-packages/covidcast-py/ -W ignore::UserWarning
make test
48 changes: 48 additions & 0 deletions .github/workflows/r_ci.yml
@@ -0,0 +1,48 @@
# This workflow uses actions that are not certified by GitHub.
# They are provided by a third-party and are governed by
# separate terms of service, privacy policy, and support
# documentation.
#
# See https://github.com/r-lib/actions/tree/master/examples#readme for
# additional example workflows available for the R community.

name: R

on:
push:
branches: [ main ]
pull_request:
branches: [ main ]

jobs:
build:
runs-on: ubuntu-latest
defaults:
run:
working-directory: R-packages/covidcast/
strategy:
matrix:
r-version: [3.5]

steps:
- uses: actions/checkout@v2
- name: Set up R ${{ matrix.r-version }}
uses: r-lib/actions/setup-r@ffe45a39586f073cc2e9af79c4ba563b657dc6e3
with:
r-version: ${{ matrix.r-version }}
- name: Install libcurl
run: sudo apt-get install libcurl4-openssl-dev
- name: Cache R packages
uses: actions/cache@v2
with:
path: ${{ env.R_LIBS_USER }}
key: ${{ runner.os }}-r-1-
- name: Install dependencies
run: |
install.packages(c("remotes", "rcmdcheck"))
remotes::install_deps(dependencies = TRUE)
shell: Rscript {0}
- name: Check
run: |
rcmdcheck::rcmdcheck(args = c("--no-manual", "--ignore-vignettes", "--as-cran"), build_args = c("--no-build-vignettes"), error_on = "error")
shell: Rscript {0}
66 changes: 53 additions & 13 deletions Python-packages/covidcast-py/DEVELOP.md
@@ -1,33 +1,73 @@
# Developing the covidcast package

## Structure
From `covidcast/Python-packages/covidcast-py`, the Python library files are located in the
`covidcast/` folder, with corresponding tests in `tests/covidcast/`.
Currently, primary user facing functions across the modules are being imported in `covidcast/__init__.py`
for organization and namespace purposes.

Sphinx documentation is in the `docs/` folder. See "Building the Package and Documentation" below
for information on how to build the documentation.

The CI workflow is stored in the repo's top-level directory, in `.github/workflows/python_ci.yml`.

## Development
These are general recommendations for developing. They do not have to be strictly followed,
but are encouraged.

__Environment__
- A virtual environment is recommended, which can be started with the following commands:

```sh
python3 -m venv env
source env/bin/activate
```
This creates an `env/` folder containing the files required by the environment, which
is gitignored. The environment can be deactivated by running `deactivate` and reactivated by
rerunning `source env/bin/activate`. To create a new environment, either delete the
`env/` folder and rerun the above commands (if you no longer need the old one),
or rerun the above commands with a new environment name in place of `env`.

__Style__
- Run `make lint` from `Python-packages/covidcast-py/` to run the lint commands.
- `mypy`, `pylint`, and `pydocstyle` are used for linting, with associated configurations for
`pylint` in `.pylintrc` and for `mypy` in `mypy.ini`.

__Testing__
- Run `make test` from `Python-packages/covidcast-py/` to run the test commands.
- `pytest` is the framework used in this package.
- Each function should have corresponding unit tests.
- Tests should be deterministic.
- Similarly, tests should not make network calls.
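
The testing guidelines above (deterministic, no network calls) can be met by passing the fetching function in as a parameter and substituting a stub in tests. The following is a minimal sketch using only `unittest.mock` from the standard library; `fetch_raw` and `count_rows` are hypothetical names for illustration and are not part of the covidcast package.

```python
from unittest import mock


def fetch_raw(day):
    """Hypothetical stand-in for a network call; unit tests should never reach it."""
    raise RuntimeError("network access is not allowed in unit tests")


def count_rows(day, fetch=fetch_raw):
    """Hypothetical function under test: count the rows returned for one day."""
    response = fetch(day)
    return len(response["epidata"])


def test_count_rows_without_network():
    # A canned response makes the test deterministic and keeps it offline.
    fake_fetch = mock.Mock(return_value={
        "result": 1,
        "message": "success",
        "epidata": [{"value": 1.0}, {"value": 2.0}],
    })
    assert count_rows("20200601", fetch=fake_fetch) == 2
    fake_fetch.assert_called_once_with("20200601")
```

Injecting the dependency keeps the stub explicit at the call site; patching a module attribute with `mock.patch` is a common alternative when the function signature cannot change.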

__Documentation__
- New public methods should have comprehensive docstrings and
an entry in the Sphinx documentation.
- Usage examples in Sphinx are recommended.

## Building the Package and Documentation
The package is fairly straightforward in structure, following the basic
[packaging
documentation](https://packaging.python.org/tutorials/packaging-projects/) and a
few other pieces I found.

When you develop a new package version, there are several steps to consider:
When you develop a new package version, there are several steps to consider.
These are written from the `Python-packages/covidcast-py/` directory:

1. Increment the package version in `setup.py` and in Sphinx's `conf.py`.
2. Rebuild the package. You will need to install the `wheel` package:

```sh
python3 setup.py clean
python3 setup.py sdist bdist_wheel
```

Verify the build worked without errors.
3. Locally install the package with `python3 setup.py install`.
4. Install dependencies with `pip3 install -r requirements.txt`
2. Install the requirements needed to build the package and documentation with `make install-requirements`
3. Rebuild and install the package locally with `make build-and-install`
5. Rebuild the documentation. The documentation lives in `docs/` and is built by
[Sphinx](https://www.sphinx-doc.org/en/master/), which automatically reads
the function docstrings and formats them. `docs/index.rst` contains the main
documentation and the `.. autofunction::` directives insert documentation of
specified functions.
To rebuild the documentation, install the `sphinx` package and run
To rebuild the documentation, run
```sh
cd docs/
make clean
make html
```
@@ -36,7 +76,7 @@ When you develop a new package version, there are several steps to consider:
If you make changes to `index.rst`, you can simply run `make html` to
rebuild without needing to reinstall the package.
4. Upload to PyPI. It should be as easy as
6. Upload to PyPI. It should be as easy as
```sh
twine upload dist/covidcast-0.0.9*
18 changes: 18 additions & 0 deletions Python-packages/covidcast-py/Makefile
@@ -0,0 +1,18 @@
.PHONY: lint test install-requirements build-and-install

install-requirements:
pip install -r requirements_dev.txt
pip install -r requirements_ci.txt

build-and-install: install-requirements
python3 setup.py clean
python3 setup.py sdist bdist_wheel
pip3 install -e .

lint:
pylint covidcast/ --rcfile .pylintrc
mypy covidcast --config-file mypy.ini
pydocstyle covidcast/

test:
pytest tests/ -W ignore::UserWarning
Binary file added Python-packages/covidcast-py/bubble.png
2 changes: 1 addition & 1 deletion Python-packages/covidcast-py/covidcast/__init__.py
@@ -13,6 +13,6 @@
"""

from .covidcast import signal, metadata, aggregate_signals
from .plotting import plot_choropleth, get_geo_df, animate
from .plotting import plot, plot_choropleth, get_geo_df, animate
from .geography import (fips_to_name, cbsa_to_name, abbr_to_name,
name_to_abbr, name_to_cbsa, name_to_fips)
51 changes: 29 additions & 22 deletions Python-packages/covidcast-py/covidcast/covidcast.py
@@ -7,6 +7,8 @@
import pandas as pd
from delphi_epidata import Epidata

from .errors import NoDataWarning

# Point API requests to the AWS endpoint
Epidata.BASE_URL = "https://api.covidcast.cmu.edu/epidata/api.php"

@@ -101,53 +103,59 @@ def signal(data_source: str,
columns:
``geo_value``
identifies the location, such as a state name or county FIPS code. The
Identifies the location, such as a state name or county FIPS code. The
geographic coding used by COVIDcast is described in the `API
documentation here
<https://cmu-delphi.github.io/delphi-epidata/api/covidcast_geography.html>`_.
``signal``
Name of the signal, same as the value of the ``signal`` input argument. Used for
downstream functions to recognize where this signal is from.
``time_value``
contains a `pandas Timestamp object
Contains a `pandas Timestamp object
<https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Timestamp.html>`_
identifying the date this estimate is for.
``issue``
contains a `pandas Timestamp object
Contains a `pandas Timestamp object
<https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Timestamp.html>`_
identifying the date this estimate was issued. For example, an estimate
with a ``time_value`` of June 3 might have been issued on June 5, after
the data for June 3rd was collected and ingested into the API.
``lag``
an integer giving the difference between ``issue`` and ``time_value``,
Integer giving the difference between ``issue`` and ``time_value``,
in days.
``value``
the signal quantity requested. For example, in a query for the
The signal quantity requested. For example, in a query for the
``confirmed_cumulative_num`` signal from the ``usa-facts`` source,
this would be the cumulative number of confirmed cases in the area, as
of the ``time_value``.
``stderr``
the value's standard error, if available.
The value's standard error, if available.
``sample_size``
indicates the sample size available in that geography on that day;
Indicates the sample size available in that geography on that day;
sample size may not be available for all signals, due to privacy or
other constraints.
``direction``
uses a local linear fit to estimate whether the signal in this region is
currently increasing or decreasing (reported as -1 for decreasing, 1 for
increasing, and 0 for neither).
``geo_type``
Geography type for the signal, same as the value of the ``geo_type`` input argument.
Used for downstream functions to parse ``geo_value`` correctly.
``data_source``
Name of the signal source, same as the value of the ``data_source`` input argument. Used for
downstream functions to recognize where this signal is from.
Consult the `signal documentation
<https://cmu-delphi.github.io/delphi-epidata/api/covidcast_signals.html>`_
for more details on how values and standard errors are calculated for
specific signals.
"""

if geo_type not in VALID_GEO_TYPES:
raise ValueError("geo_type must be one of " + ", ".join(VALID_GEO_TYPES))

@@ -250,7 +258,6 @@ def metadata() -> pd.DataFrame:
``max_lag``
Largest lag from observation to issue, in days.
"""

meta = Epidata.covidcast_meta()

if meta["result"] != 1:
@@ -362,7 +369,6 @@ def _fetch_single_geo(data_source: str,
entries.
"""

as_of_str = _date_to_api_string(as_of) if as_of is not None else None
issues_strs = _dates_to_api_strings(issues) if issues is not None else None

@@ -379,10 +385,14 @@ def _fetch_single_geo(data_source: str,
issues=issues_strs, lag=lag)

# Two possible error conditions: no data or too much data.
if day_data["message"] != "success":
warnings.warn("Problem obtaining data on {day}: {message}".format(
day=day_str,
message=day_data["message"]))
if day_data["message"] == "no results":
warnings.warn(f"No {data_source} {signal} data found on {day_str} "
f"for geography '{geo_type}'",
NoDataWarning)
if day_data["message"] not in {"success", "no results"}:
warnings.warn(f"Problem obtaining {data_source} {signal} data on {day_str} "
f"for geography '{geo_type}': {day_data['message']}",
RuntimeWarning)
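
Splitting the single warning into two categories lets callers silence the routine "no results" case while still seeing real problems. A minimal sketch of that usage with the standard `warnings` module follows; the `NoDataWarning` class below is a local stand-in for the one this commit adds in `covidcast/errors.py`, and `classify_message` and `fetch_day` are hypothetical helpers that only imitate the branching above.

```python
import warnings


class NoDataWarning(Warning):
    """Local stand-in for the class added in covidcast/errors.py."""


def classify_message(message):
    """Hypothetical helper: map an API message to a warning category, or None."""
    if message == "no results":
        return NoDataWarning
    if message != "success":
        return RuntimeWarning
    return None


def fetch_day(message):
    """Hypothetical fetch that only emits the warning for the given API message."""
    category = classify_message(message)
    if category is not None:
        warnings.warn(f"API message: {message}", category)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")                 # record every warning...
    warnings.simplefilter("ignore", NoDataWarning)  # ...except the no-data case
    fetch_day("no results")
    fetch_day("success")
    fetch_day("some other error")

# Only the unexpected message is left in the recorded list.
categories = [w.category for w in caught]
```

Because the later `simplefilter` call is inserted at the front of the filter list, the `ignore` rule for `NoDataWarning` is matched before the catch-all `always` rule.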

# In the too-much-data case, we continue to try putting the truncated
# data in our results. In the no-data case, skip this day entirely,
@@ -394,7 +404,7 @@ def _fetch_single_geo(data_source: str,

if len(dfs) > 0:
out = pd.concat(dfs)

out.drop("direction", axis=1, inplace=True)
out["time_value"] = pd.to_datetime(out["time_value"], format="%Y%m%d")
out["issue"] = pd.to_datetime(out["issue"], format="%Y%m%d")
out["geo_type"] = geo_type
@@ -409,7 +419,6 @@ def _signal_metadata(data_source: str,
signal: str, # pylint: disable=W0621
geo_type: str) -> dict:
"""Fetch metadata for a single signal as a dict."""

meta = metadata()

mask = ((meta.data_source == data_source) &
@@ -434,13 +443,11 @@ def _signal_metadata(data_source: str,

def _date_to_api_string(date: date) -> str: # pylint: disable=W0621
"""Convert a date object to a YYYYMMDD string expected by the API."""

return date.strftime("%Y%m%d")


def _dates_to_api_strings(dates: Union[date, list, tuple]) -> str:
"""Convert a date object, or pair of (start, end) objects, to YYYYMMDD strings."""

if not isinstance(dates, (list, tuple)):
return _date_to_api_string(dates)
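
For reference, the `YYYYMMDD` conversion used by these helpers, and its inverse (the same format string appears in the `pd.to_datetime` calls earlier in this file), can be sketched with the standard library alone. The function names below are hypothetical and chosen for illustration; only the `"%Y%m%d"` format comes from the diff.

```python
from datetime import date, datetime

API_DATE_FORMAT = "%Y%m%d"  # format string taken from the diff above


def to_api_string(day: date) -> str:
    """Format a date as the YYYYMMDD string the API expects."""
    return day.strftime(API_DATE_FORMAT)


def from_api_string(text: str) -> date:
    """Parse a YYYYMMDD string back into a date."""
    return datetime.strptime(text, API_DATE_FORMAT).date()


print(to_api_string(date(2020, 6, 3)))  # prints 20200603
```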

5 changes: 5 additions & 0 deletions Python-packages/covidcast-py/covidcast/errors.py
@@ -0,0 +1,5 @@
"""Custom warnings and exceptions for covidcast functions."""


class NoDataWarning(Warning):
"""Warning raised when no data is returned on a given day by covidcast.signal()."""
3 changes: 2 additions & 1 deletion Python-packages/covidcast-py/covidcast/geography.py
@@ -1,3 +1,4 @@
"""Functions for converting and mapping between geographic types."""
import re
import warnings
from typing import Union, Iterable
@@ -237,7 +238,7 @@ def _lookup(key: Union[str, Iterable],


def _get_first_tie(dict_list: list) -> list:
"""Return a list with the first value for the first key for each of the input dicts
"""Return a list with the first value for the first key for each of the input dicts.
Needs to be Python 3.6+ for this to work, since earlier versions don't preserve insertion order.
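
The body of `_get_first_tie` is not shown in this hunk, so the following is only a sketch of what the docstring describes, under two assumptions: each input dict maps a key to a list of candidate values, and empty inputs yield `None`. The name `first_tie` is hypothetical. Note that dict insertion order is a language guarantee from Python 3.7 onward and a CPython implementation detail in 3.6.

```python
from typing import Any, Dict, List, Optional


def first_tie(dict_list: List[Optional[Dict[str, List[Any]]]]) -> List[Any]:
    """Return the first value under the first key of each dict (None if empty)."""
    result = []
    for mapping in dict_list:
        if not mapping:
            # Assumption: an empty or missing dict contributes None.
            result.append(None)
            continue
        first_key = next(iter(mapping))  # relies on insertion order
        result.append(mapping[first_key][0])
    return result


# Placeholder data for illustration only.
example = [{"a": ["first", "second"], "b": ["third"]}, {}, {"c": ["only"]}]
print(first_tie(example))  # prints ['first', None, 'only']
```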