(feat) #2198 add postgres backend similarity functions to fully supported #2199

Closed
26 commits
58d6d9c
Make spellcheck work cross-platform
zmbc Apr 3, 2024
2fbcced
Merge branch 'master' into spellcheck-cross-platform
zmbc Apr 4, 2024
f159e05
Merge branch 'master' into spellcheck-cross-platform
zslade Apr 5, 2024
75e752f
Make spellchecker script executable
zmbc Apr 9, 2024
99838f3
Include task in pyspelling call
zmbc Apr 9, 2024
71844d5
Include sentence to encourage contributions
zmbc Apr 9, 2024
ab17375
Update documentation on settings validation in response to code changes
ThomasHepworth Apr 23, 2024
ebba34b
Update predict.py
samnlindsay Apr 24, 2024
86f955c
Merge pull request #2152 from moj-analytical-services/bugfix_predict_…
RobinL Apr 25, 2024
ad28a62
Merge pull request #2149 from moj-analytical-services/docs/updating_s…
ThomasHepworth Apr 30, 2024
3d7cf00
Fixing spurious error messages with Databricks enable_splink
aymonwuolanne May 1, 2024
7dccd66
format
RobinL May 1, 2024
268f77e
remove ref to github action
zslade May 2, 2024
8425395
Merge pull request #2163 from moj-analytical-services/docs_tweak
zslade May 2, 2024
e252813
Reword script instructions
zmbc May 6, 2024
df49f62
Merge branch 'master' into spellcheck-cross-platform
zmbc May 7, 2024
be6a9ad
Merge pull request #2159 from aymonwuolanne/master
RobinL May 8, 2024
5c6df64
Merge branch 'master' into spellcheck-cross-platform
zmbc May 8, 2024
6c0437c
Fix Splink 4 blog post link
probjects May 9, 2024
1c69ebc
Merge pull request #2172 from probjects/master
RobinL May 10, 2024
0a10a93
Merge branch 'master' into spellcheck-cross-platform
zslade May 10, 2024
479cef8
Merge pull request #2131 from zmbc/spellcheck-cross-platform
zslade May 10, 2024
0684423
(feat) #2198 add postgres backend similarity functions to fully suppo…
vflumeris May 31, 2024
139cf12
fix missing import, run format again?
vflumeris May 31, 2024
026a90d
update broken documentation build
vflumeris Jun 3, 2024
50d8cac
fix documententation bad on postgres_docker
vflumeris Jun 3, 2024
11 changes: 11 additions & 0 deletions .github/workflows/documentation.yml
@@ -35,6 +35,17 @@ jobs:
         with:
           python-version: 3.9.10
 
+      #----------------------------------------------
+      # Install GLIBC - error occured 2024-06-01 on ubuntu-latest with python3.9
+      # Temporary as ubuntu-latest could change or a different python version could
+      # change this current error.
+      #----------------------------------------------
+      - name: Install GLIBC
+
+        run: |
+          sudo apt-get update
+          sudo apt-get install -y libc6
+
       #----------------------------------------------
       # -- save a few section by caching poetry --
       #----------------------------------------------
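
The workflow comment above does not say which GLIBC version the runner ships or which one the failing package expected. As a rough diagnostic sketch (not part of this PR), the standard-library call `platform.libc_ver()` can report the C library that the running Python interpreter is linked against. The helper name `describe_libc` is hypothetical, and whether this would have pinpointed the error is an assumption.

```python
import platform


def describe_libc() -> str:
    """Return a label such as 'glibc 2.35', or 'unknown' if it cannot be detected."""
    # libc_ver() returns a (lib, version) tuple; both are empty strings
    # on platforms where the C library cannot be identified.
    name, version = platform.libc_ver()
    if not name:
        return "unknown"
    return f"{name} {version}".strip()


print(describe_libc())
```

Running this in a workflow step before and after the `apt-get install` would show whether the step changes the reported version at all.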
2 changes: 2 additions & 0 deletions .gitignore
@@ -179,3 +179,5 @@ cython_debug/
splink_db
splink_db_log
spark-warehouse
+
+scripts/pyspelling/dictionary.dic
2 changes: 1 addition & 1 deletion README.md
@@ -7,7 +7,7 @@
[![Documentation](https://img.shields.io/badge/API-documentation-blue)](https://moj-analytical-services.github.io/splink/)

> [!IMPORTANT]
-> Development has begun on Splink 4 on the `splink4_dev` branch. Splink 3 is in maintenance mode and we are no longer accepting new features. We welcome contributions to Splink 4. Read more on our latest [blog](https://moj-analytical-services.github.io/splink/blog/2024/03/19/splink4.html).
+> Development has begun on Splink 4 on the `splink4_dev` branch. Splink 3 is in maintenance mode and we are no longer accepting new features. We welcome contributions to Splink 4. Read more on our latest [blog](https://moj-analytical-services.github.io/splink/blog/2024/04/02/splink-3-updates-and-splink-4-development-announcement---april-2024.html).

# Fast, accurate and scalable probabilistic data linkage

18 changes: 12 additions & 6 deletions docs/dev_guides/changing_splink/contributing_to_docs.md
@@ -16,29 +16,35 @@ Once you've finished updating Splink documentation we ask that you run our spell

## Spellchecking docs

-When updating Splink documentation, we ask that you run our spellchecker before submitting a pull request. This is to help ensure quality and consistency across the documentation. Please note, the spellchecker _only works on markdown files_ and currently only works on systems which support `Homebrew` package manager. Instructions for other operating systems will be released later.
+When updating Splink documentation, we ask that you run our spellchecker before submitting a pull request. This is to help ensure quality and consistency across the documentation. If for whatever reason you can't run the spellchecker on your system, please don't let this prevent you from contributing to the documentation. Please note, the spellchecker _only works on markdown files_.

-To run the spellchecker on either a single markdown file or folder of markdown files, you can use the following script:
+If you are a Mac user with the `Homebrew` package manager installed, the script below will automatically install
+the required system dependency, `aspell`.
+If you've created your development environment [using conda](./development_quickstart.md), `aspell` will have been installed as part of that
+process.
+Instructions for installing `aspell` through other means may be added here in the future.
+
+To run the spellchecker on either a single markdown file or folder of markdown files, you can run the following bash script:

```sh
-source scripts/pyspelling/spellchecker.sh <path_to_file_or_folder>
+./scripts/pyspelling/spellchecker.sh <path_to_file_or_folder>
```

Omitting the file/folder path will run the spellchecker on all markdown files contained in the `docs` folder. We recommend running the spellchecker only on files that you have created or edited.

The spellchecker uses the Python package [PySpelling](https://facelessuser.github.io/pyspelling/) and its underlying spellchecking tool, Aspell. Running the above script will automatically install these packages along with any other necessary dependencies.
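
Since the script depends on the `aspell` system package, it can help to confirm that `aspell` is on the `PATH` before running it. The following is a minimal sketch using only the standard library; the helper name `missing_tools` is hypothetical and is not part of the Splink scripts.

```python
import shutil
from typing import Iterable, List


def missing_tools(names: Iterable[str]) -> List[str]:
    """Return the subset of `names` with no executable found on PATH."""
    # shutil.which returns None when no matching executable is found.
    return [name for name in names if shutil.which(name) is None]


missing = missing_tools(["aspell"])
if missing:
    print("Not found on PATH: " + ", ".join(missing))
else:
    print("aspell is available")
```

If `aspell` is reported as missing, installing it through Homebrew or the conda environment described above should resolve it.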

-The spellchecker compares words to a [standard British English dictionary](https://github.com/LibreOffice/dictionaries/blob/master/en/en_GB.aff) and a custom dictionary (`scripts/pyspelling/custom_dictionary.txt`) of words. If no spelling mistakes are found, you will see the following terminal printout:
+The spellchecker compares words to a standard British English dictionary and a custom dictionary (`scripts/pyspelling/custom_dictionary.txt`) of words. If no spelling mistakes are found, you will see the following terminal printout:

-```sh
+```

Spelling check passed :)

```

otherwise, PySpelling will printout the spelling mistakes found in each file.

-Correct spellings of words not found in a standard dictionary (e.g. Splink) can be recorded as such by adding them to `scripts/pyspelling/custom_dictionary.txt`. (Don't worry about adding them in alphabetical order or accidental duplication as this will be handled automatically by a GitHub Action future.)
+Correct spellings of words not found in a standard dictionary (e.g. "Splink") can be recorded as such by adding them to `scripts/pyspelling/custom_dictionary.txt`.

Please correct any mistakes found or update the custom dictionary to ensure the spellchecker passes before putting in a pull request containing updates to the documentation.

2 changes: 1 addition & 1 deletion docs/dev_guides/changing_splink/development_quickstart.md
@@ -114,7 +114,7 @@ and the teardown script each time you want to stop it:
```

Included in the docker-compose file is a [pgAdmin](https://www.pgadmin.org/) container to allow easy exploration of the database as you work, which can be accessed in-browser on the default port.
-The default username is `[email protected]` with password `b`.
+The default url: http://localhost:80/ username is `[email protected]` with password `b`.

## Step 3, Conda install option: Install system dependencies
