This repository has been archived by the owner on Nov 4, 2024. It is now read-only.

Commit

Merge pull request #2 from allianz-direct/version1
Version1
timvink authored Jan 20, 2022
2 parents 26b2841 + 07939ab commit 82bbfe7
Showing 31 changed files with 961 additions and 529 deletions.
7 changes: 3 additions & 4 deletions .github/workflows/publish.yml
@@ -34,7 +34,6 @@ jobs:
python setup.py sdist bdist_wheel
twine upload dist/*
# - name: Deploy mkdocs site
# run: |
# pip install mkdocs-git-authors-plugin
# mkdocs gh-deploy --force
- name: Deploy mkdocs site
run: |
mkdocs gh-deploy --force
4 changes: 2 additions & 2 deletions .github/workflows/unit_tests.yml
@@ -35,11 +35,11 @@ jobs:
- name: Static code checking with pyflakes
run: |
pyflakes precommit_nbconvert_rename
pyflakes nb_prep
- name: Run unit tests
run: |
git config --global user.name "Github Action"
git config --global user.email "[email protected]"
pytest --cov=precommit_nbconvert_rename
pytest --cov=nb_prep
5 changes: 3 additions & 2 deletions .pre-commit-config.yaml
@@ -24,7 +24,7 @@ repos:
language: system
types: [python]
exclude: tests/
args: [--max-line-length=120, --docstring-convention=google, "--ignore=D100,D104,D212,D200,E203,W293,D412,W503,E731"]
args: [--max-line-length=120, --docstring-convention=google, "--ignore=D100,D104,D212,D200,D301,E203,W293,D412,W503,E731"]
- repo: local
hooks:
- id: pytest
@@ -41,4 +41,5 @@ repos:
# E203
# W293 blank line contains whitespace
# W503 line break before binary operator (for compatibility with black)
# E731: allow lambdas to be used, do not enforce def
# E731: allow lambdas to be used, do not enforce def
# D301 Use r""" if any backslashes in a docstring
17 changes: 9 additions & 8 deletions .pre-commit-hooks.yaml
@@ -1,17 +1,18 @@
- id: nbconvert_rename_precommit
name: precommit_nbconvert_rename (pre-commit; run nbconvert)
description: 'Converts .ipynb to .html and adds date prefix and hash placeholder.'
entry: nbconvert_rename
- id: nb_prep_precommit
name: nb_prep (pre-commit; process notebooks)
description: 'Converts .ipynb to .html and adds date prefix and hash placeholder. Strips notebook of outputs.'
entry: nb_prep process
language: python
language_version: python3
types: [jupyter]
stages: [commit]
- id: nbconvert_rename_postcommit
name: precommit_nbconvert_rename (post-commit; replace commithash in .html filenames)
- id: nb_prep_postcommit
name: nb_prep (post-commit; replace hash placeholder in .html filenames)
description: 'Replaces NBCONVERT_RENAME_COMMITHASH_PLACEHOLDER with commit hash in any .html filenames.'
entry: rename_commithash
entry: nb_prep rename
types: [html]
language: python
language_version: python3
# always_run because .html files are probably gitignored
always_run: true
stages: [post-commit]
stages: [post-commit]
13 changes: 7 additions & 6 deletions CONTRIBUTING.md
@@ -3,13 +3,14 @@
## Dev setup

```shell
pip install -r dev_requirements.txt
pip install -r tests/test_requirements.txt
pre-commit install
```

## Edit drawings

We used Excalidraw; you can edit the vector image [here](https://excalidraw.com/#json=5272425855975424,sXm3L5A8Yr5EH9nkuENJIQ).

## Testing

There are some unit tests you can run with `pytest`.
@@ -20,13 +21,13 @@ In a workspace directory, assuming already have a local clone of this repo:

```shell
mkdir test_prj
cp precommit_nbconvert_rename/tests/data/example.ipynb test_prj/
cp nb_prep/tests/data/example.ipynb test_prj/
cd test_prj
git init
git add --all
pre-commit try-repo ..\precommit_nbconvert_rename\ --verbose
pre-commit try-repo ../nb_prep --verbose
git commit -m "test"
pre-commit try-repo ..\precommit_nbconvert_rename\ --verbose --hook-stage post-commit
pre-commit try-repo ../nb_prep/ --verbose --hook-stage post-commit
```

### manually test a precommit config
@@ -35,8 +36,8 @@ In a workspace directory, assuming already have a local clone of this repo:

```shell
mkdir test_precommit_prj
cp precommit_nbconvert_rename/tests/data/example.ipynb test_precommit_prj/
cp precommit_nbconvert_rename/tests/data/pre-commit-test-config.yaml test_precommit_prj/.pre-commit-config.yaml
cp nb_prep/tests/data/example.ipynb test_precommit_prj/
cp nb_prep/tests/data/pre-commit-test-config.yaml test_precommit_prj/.pre-commit-config.yaml
cd test_precommit_prj
git init
pre-commit install
151 changes: 18 additions & 133 deletions README.md
@@ -1,145 +1,30 @@
[![Unit tests](https://github.com/allianz-direct/precommit_nbconvert_rename/actions/workflows/unit_tests.yml/badge.svg)](https://github.com/allianz-direct/precommit_nbconvert_rename/actions/workflows/unit_tests.yml)
[![Unit tests](https://github.com/allianz-direct/nb_prep/actions/workflows/unit_tests.yml/badge.svg)](https://github.com/allianz-direct/nb_prep/actions/workflows/unit_tests.yml)
![PyPI - Python Version](https://img.shields.io/pypi/pyversions/nb-prep)
![PyPI](https://img.shields.io/pypi/v/nb-prep)
![PyPI - Downloads](https://img.shields.io/pypi/dm/nb-prep)
![GitHub contributors](https://img.shields.io/github/contributors/timvink/nb-prep)
![PyPI - License](https://img.shields.io/pypi/l/nb-prep)

# precommit_nbconvert_rename
# nb_prep

Use `nbconvert` and `nbstripout` together as precommit hooks.
`nb_prep` makes it easier to prepare jupyter notebooks for storing in git and sharing with stakeholders.

A pre-commit hook that converts any changed Jupyter notebooks (`.ipynb`) to `.html` files, adding a YYYYMMDD date prefix and a commit hash suffix:
You can use the `nb_prep` CLI to:

`my_notebook.ipynb` -> `20211026_my_notebook_eac9e43.html`
- Convert jupyter notebooks to HTML (using [`nbconvert`](https://nbconvert.readthedocs.io/)) and:
    - add a date prefix to the filename.
    - add a git hash suffix to the filename.
    - move the HTML file to a configured output directory.
- Strip all cell outputs (using [`nbstripout`](https://github.com/kynan/nbstripout))
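
As a rough illustration of the naming scheme (date prefix plus git hash suffix), the sketch below builds an output filename from a notebook name, a date and a short commit hash. The `html_name` helper is hypothetical and not part of `nb_prep`; the real tool may derive the date and the hash differently.

```python
from datetime import date
from pathlib import Path


def html_name(notebook: str, commit_hash: str, day: date) -> str:
    """Build '<YYYYMMDD>_<notebook stem>_<hash>.html' (hypothetical helper)."""
    stem = Path(notebook).stem  # 'my_notebook.ipynb' -> 'my_notebook'
    return f"{day:%Y%m%d}_{stem}_{commit_hash}.html"


print(html_name("my_notebook.ipynb", "eac9e43", date(2021, 10, 26)))
# prints 20211026_my_notebook_eac9e43.html
```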

## Use case

Jupyter notebooks contain not only code but also outputs (tables, plots, interactive elements) and execution counts. You should not commit data to git (partly for security reasons), so a common solution for Jupyter notebooks is to use [nbstripout](https://github.com/kynan/nbstripout) as a [pre-commit](https://pre-commit.com/) hook. This has the added benefit that your notebooks are now more easily version-controlled, as re-running a cell does not lead to a `git diff`. The downside is having to re-execute notebooks every time you want to view or share them.
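
To make "stripping outputs" concrete, here is a simplified sketch of the idea, operating on the notebook's JSON structure. It is not `nbstripout`'s actual implementation (which handles more cases, such as metadata); the `strip_outputs` function and the example notebook are made up for illustration.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with code-cell outputs and execution counts cleared."""
    stripped = json.loads(json.dumps(notebook))  # cheap deep copy of JSON-compatible data
    for cell in stripped.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    return stripped


# A tiny, hand-written notebook in nbformat 4 layout (illustrative only).
example = {
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {
            "cell_type": "code",
            "metadata": {},
            "source": ["1 + 1"],
            "execution_count": 3,
            "outputs": [
                {
                    "output_type": "execute_result",
                    "data": {"text/plain": ["2"]},
                    "execution_count": 3,
                    "metadata": {},
                }
            ],
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

clean = strip_outputs(example)
print(clean["cells"][1]["outputs"])  # prints []
```

Because the outputs and execution counts are gone, re-running a cell no longer changes the committed file.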

`precommit_nbconvert_rename` runs [nbconvert](https://github.com/jupyter/nbconvert) each time you make a commit that touches a Jupyter notebook, and adds a date prefix and commit hash suffix to the filename. Having the commit hash in the filename has the added benefit that you can always find the corresponding changes in git. These `.html` files should remain local and not be committed to `git`, so make sure to add `*.html` to your `.gitignore` file. Here's an overview of the workflow:

<img src="images/schema_workflow.png" width="700px">

> Note: `nbstripout` pre-commit hooks will edit your notebook files and fail the pre-commit. When you add the stripped notebook and commit again, `nbconvert-rename` will not run `nbconvert` again because there is already an `.html` file.
You can also configure `nb_prep` once as a pre-commit hook and have notebook output automatically prepared every time you `git commit`.

## Installation

```bash
pip install precommit_nbconvert_rename
```

## Usage

You need to update the `.pre-commit-config.yaml` in your repository. We'll assume you want to use `nbconvert_rename` with [nbstripout](https://github.com/kynan/nbstripout#using-nbstripout-as-a-pre-commit-hook) and include that here:

```yaml
default_stages: [commit]
repos:
- repo: local
  hooks:
  - id: nbconvert_rename_precommit
    name: precommit_nbconvert_rename (pre-commit; run nbconvert)
    description: 'Converts .ipynb to .html and adds date prefix and hash placeholder.'
    entry: nbconvert_rename
    language: python
    language_version: python3
    types: [jupyter]
    stages: [commit]
  - id: nbconvert_rename_postcommit
    name: precommit_nbconvert_rename (post-commit; replace commithash in .html filenames)
    description: 'Replaces NBCONVERT_RENAME_COMMITHASH_PLACEHOLDER with commit hash in any .html filenames.'
    entry: rename_commithash
    types: [html]
    language: python
    language_version: python3
    always_run: true
    stages: [post-commit]
- repo: local
  hooks:
  - id: nbstripout
    name: nbstripout
    entry: nbstripout
    language: system
```
You need to install the pre-commit and the post-commit hooks separately:
```shell
pre-commit install
pre-commit install --hook-type post-commit
```

When you commit a notebook, you might see something like:

```shell
git add notebook.ipynb
git commit -m "Add notebook"
# precommit_nbconvert_rename (pre-commit; run nbconvert)............................Passed
# nbstripout........................................................................Failed
# - hook id: nbstripout
# - files were modified by this hook
# precommit_nbconvert_rename (post-commit; replace commithash in .html filenames)...Passed
```

`nbstripout` has overwritten `notebook.ipynb` and `nbconvert-rename` has created a file named something like `20211026_notebook_NBCONVERT_RENAME_COMMITHASH_PLACEHOLDER.html`.
Make sure to avoid committing HTML files by adding `*.html` to your `.gitignore` file. Next:

```shell
git add notebook.ipynb
git commit -m "Add notebook"
# precommit_nbconvert_rename (pre-commit; run nbconvert)............................Passed
# nbstripout........................................................................Passed
# precommit_nbconvert_rename (post-commit; replace commithash in .html filenames)...Passed
pip install nb_prep
```

Now, you've committed a clean, stripped version of `notebook.ipynb` and you have a local snapshot of your notebook named something like `20211026_notebook_eac9e43.html`.
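
The post-commit step described above can be sketched roughly as follows: find `.html` files whose names still contain the placeholder and rename them with the commit hash. This is a minimal illustration, not the hook's actual code; `replace_placeholder` is a hypothetical helper, and the hash is passed in by hand here.

```python
import tempfile
from pathlib import Path
from typing import List

PLACEHOLDER = "NBCONVERT_RENAME_COMMITHASH_PLACEHOLDER"


def replace_placeholder(directory: Path, commit_hash: str) -> List[Path]:
    """Rename every '*PLACEHOLDER*.html' file under `directory`, inserting the commit hash."""
    renamed = []
    # sorted() materialises the matches first, so renaming does not disturb the iteration
    for html in sorted(directory.rglob(f"*{PLACEHOLDER}*.html")):
        target = html.with_name(html.name.replace(PLACEHOLDER, commit_hash))
        html.rename(target)
        renamed.append(target)
    return renamed


# Demo in a throwaway directory
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / f"20211026_notebook_{PLACEHOLDER}.html").write_text("<html></html>")
    renamed_names = [p.name for p in replace_placeholder(Path(tmp), "eac9e43")]

print(renamed_names)  # prints ['20211026_notebook_eac9e43.html']
```

In a real repository the short hash could be obtained with `git rev-parse --short HEAD`; whether the hook does exactly that is an assumption.
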
## Documentation

## Options

### Using templates

If you want to specify a different template for `nbconvert`, you can add an argument to the `nbconvert_rename_precommit` hook:

```yaml
- repo: local
  hooks:
  - id: nbconvert_rename_precommit
    entry: nbconvert_rename
    ...
    args: ["--template","reveal"]
```
### Removing cell blocks
You can also choose to remove input code blocks (equivalent to `jupyter nbconvert --no-input`)

```yaml
- repo: local
  hooks:
  - id: nbconvert_rename_precommit
    entry: nbconvert_rename
    ...
    args: ["--no-input"]
```

### Specifying an output directory

You might want to output all HTML notebooks in a specific folder. You can specify a relative (to project root) or absolute path using `--output-dir`:

```yaml
- repo: local
  hooks:
  - id: nbconvert_rename_precommit
    entry: nbconvert_rename
    ...
    args: ["--output-dir","../data/notebooks"]
```

### Excluding directories and files

You can ignore certain notebooks or even entire directories with [globs](https://docs.python.org/3/library/glob.html), using a relative (to project root) or absolute path with `--exclude`. For example:

```yaml
- repo: local
  hooks:
  - id: nbconvert_rename_precommit
    entry: nbconvert_rename
    ...
    args: ["--exclude","../data/notebooks/*", "a_notebook.ipynb"]
```
See [allianz-direct.github.io/nb_prep](https://allianz-direct.github.io/nb_prep).
Binary file added docs/assets/images/schema_workflow.png
File renamed without changes
6 changes: 6 additions & 0 deletions docs/index.md
@@ -0,0 +1,6 @@
---
hide:
- navigation
---

--8<-- "README.md"
107 changes: 107 additions & 0 deletions docs/options.md
@@ -0,0 +1,107 @@
---
hide:
- navigation
---

# Options

Options are also documented in the CLI tool, see

```shell
nb_prep --help
```

## Using templates

If you want to specify a different template for `nbconvert`, you can add an argument to the `nb_prep process` hook:

=== "CLI"

    ```bash
    nb_prep process --nbconvert-template 'reveal' .
    ```

=== "Pre-commit hook"

    ```yaml
    # .pre-commit-config.yaml
    repos:
    - repo: https://github.com/allianz-direct/nb_prep
      rev: main
      hooks:
      - id: nb_prep_precommit
        args: ["--nbconvert-template","reveal"]
      - id: nb_prep_postcommit
    ```

## Removing cell blocks

You can also choose to remove input code blocks from the converted HTML (equivalent to `jupyter nbconvert --no-input`).


=== "CLI"

    ```bash
    nb_prep process --nbconvert-no-input .
    ```

=== "Pre-commit hook"

    ```yaml
    # .pre-commit-config.yaml
    repos:
    - repo: https://github.com/allianz-direct/nb_prep
      rev: main
      hooks:
      - id: nb_prep_precommit
        args: ["--nbconvert-no-input"]
      - id: nb_prep_postcommit
    ```

## Specifying an output directory

You might want to write all HTML notebooks to a specific folder. By default, each HTML file is written to the same folder as its notebook. You can specify a different folder, either relative to the project root or as an absolute path, using `--output-dir`:

=== "CLI"

    ```bash
    nb_prep process --output-dir "~/workspace/notebook_output" .
    nb_prep rename --output-dir "~/workspace/notebook_output" .
    ```

=== "Pre-commit hook"

    ```yaml
    # .pre-commit-config.yaml
    repos:
    - repo: https://github.com/allianz-direct/nb_prep
      rev: main
      hooks:
      - id: nb_prep_precommit
        args: ["--output-dir","~/workspace/notebook_output"]
      - id: nb_prep_postcommit
        args: ["--output-dir","~/workspace/notebook_output"]
    ```
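
The "relative to the project root or absolute" rule can be sketched as below. Note that a quoted `~` is not expanded by the shell, and pre-commit passes `args` without a shell, so the tool receives the literal string `~/workspace/notebook_output`. Whether `nb_prep` expands it is not shown here; the hypothetical `resolve_output_dir` helper simply assumes it would.

```python
from pathlib import Path


def resolve_output_dir(output_dir: str, project_root: str) -> Path:
    """Resolve an --output-dir value (hypothetical helper, not nb_prep's code)."""
    candidate = Path(output_dir).expanduser()  # turn a leading '~' into the home directory
    if candidate.is_absolute():
        return candidate  # absolute paths are used as given
    return Path(project_root) / candidate  # relative paths hang off the project root


print(resolve_output_dir("notebook_output", "/repo"))
print(resolve_output_dir("~/workspace/notebook_output", "/repo"))
```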

## Excluding directories and files

You can ignore certain notebooks or even entire directories with [globs](https://docs.python.org/3/library/glob.html), using a relative (to project root) or absolute path with `--exclude`. For example:

=== "CLI"

    ```bash
    nb_prep process --exclude "templates/*", "a_notebook.ipynb" .
    ```

=== "Pre-commit hook"

    ```yaml
    # .pre-commit-config.yaml
    repos:
    - repo: https://github.com/allianz-direct/nb_prep
      rev: main
      hooks:
      - id: nb_prep_precommit
        args: ["--exclude","templates/*", "a_notebook.ipynb"]
      - id: nb_prep_postcommit
    ```
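
To get a feel for how glob patterns such as `templates/*` select files, here is a small sketch using Python's standard `fnmatch` module. How `nb_prep` matches `--exclude` patterns internally is not shown in this commit, so treat `is_excluded` as a hypothetical stand-in rather than the tool's behaviour.

```python
from fnmatch import fnmatch
from typing import Iterable


def is_excluded(path: str, patterns: Iterable[str]) -> bool:
    """Return True if `path` matches any of the glob patterns (illustrative only)."""
    return any(fnmatch(path, pattern) for pattern in patterns)


patterns = ["templates/*", "a_notebook.ipynb"]

for candidate in ["templates/report.ipynb", "a_notebook.ipynb", "analysis/eda.ipynb"]:
    print(candidate, is_excluded(candidate, patterns))
```

With `fnmatch`, `*` also matches path separators, so `templates/*` covers nested files such as `templates/sub/x.ipynb`.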