Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent
b82e2a4
commit db042fe
Showing
4,035 changed files
with
10,968,784 additions
and
0 deletions.
The diff you're trying to view is too large. We only load the first 3000 changed files.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,74 @@ | ||
Directions to Create Description Audit Executable: | ||
|
||
------------------------ | ||
|
||
# FAQ: | ||
|
||
## Who is this guide for? | ||
|
||
This guide is intended for readers with at least a working knowledge of Python programming, | ||
command line interfaces, and Git. Before reproducing the executable, first check whether the project | ||
already provides an up-to-date executable for your operating system. | ||
|
||
## Why do operating systems matter? | ||
|
||
PyInstaller is not a cross-compiler: an executable can only be packaged on the same operating system that will run it. | ||
A machine running Linux, for example, can create an executable that runs on other | ||
Linux machines, but there is no easily accessible way to create an executable for other operating systems | ||
from that one machine. Given the licensing restrictions on running macOS on | ||
non-Apple hardware in particular, the best workaround we have found is to provide directions so that prospective users and other interested | ||
parties can build the executable for their own operating systems. | ||
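As a rough illustration of why the build is tied to the platform, the sketch below guesses the file name of the one-file executable on the current machine. The helper `expected_executable_name` is hypothetical and not part of this project; it assumes PyInstaller's default naming, where Windows builds get an `.exe` suffix and Linux and macOS console builds get no extension.

```python
import platform


def expected_executable_name(base_name, system=None):
    """Return the file name a one-file build is expected to have on `system`.

    `system` uses the values returned by platform.system(), for example
    "Windows", "Linux", or "Darwin" (macOS). It defaults to the current machine.
    """
    system = system or platform.system()
    if system == "Windows":
        return base_name + ".exe"
    # Linux and macOS console executables carry no file extension by default.
    return base_name


# Report the name expected on the machine running this snippet.
print(expected_executable_name("CLI"))
```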
|
||
If you create an executable for your operating system and the project doesn't have it available yet, | ||
we'd love it if you submitted the executable back to us with a merge request for others to use! This | ||
project is approaching a natural closing point at the Rubenstein Library, but updates, additions, etc. | ||
are welcomed from other interested parties. | ||
|
||
## Why is this project distributed as an executable? | ||
|
||
This project is primarily intended to support library archivists and other members of the library community | ||
focused on anti-racism and social justice in archival work. Although library archivists have a number of | ||
niche skills which are very valuable on this project, a complex knowledge of computer science isn't exactly | ||
in their job descriptions! We have chosen to create an executable with a graphical user interface (GUI) for | ||
ease of use by the people most likely to be engaging with this work. An executable created with PyInstaller | ||
allows users to click on the executable and run the project in minutes, without having to independently download | ||
Python or any other dependencies used in the project. All of the code is 'under the hood' to present an easy, | ||
accessible interface to end users with limited computer science background. | ||
|
||
This project can still be run from the command line or from a Python IDE. Please see our README for installation instructions | ||
for the full project, command line argument descriptions, etc. | ||
|
||
------------------------ | ||
|
||
**If you have already forked the project onto your local machine and installed dependencies, skip to step 5** | ||
|
||
1. Fork the project, then clone your fork to your local machine with Git. | ||
2. Create a virtual environment to house the dependencies for this project. I typically use venv on the command line within | ||
the project directory, as shown below, but feel free to use your preferred virtual environment tool: | ||
    on Linux or macOS: python3 -m venv env | ||
    on Windows: py -m venv env | ||
(venv ships with Python 3. If you would rather use the third-party virtualenv package, you can install it via pip on Linux/macOS or Windows using | ||
python3 -m pip install --user virtualenv OR py -m pip install --user virtualenv, respectively.) | ||
3. Activate the virtual environment. While it is active, pip installs packages into | ||
this specific environment, which prevents the dependencies for this project from interacting with others | ||
on your device. Activate it as follows, and deactivate it by typing 'deactivate' on your command line: | ||
    on Linux or macOS: source env/bin/activate | ||
    on Windows: .\env\Scripts\activate | ||
4. Install the project dependencies from requirements.txt. This installs every library and package listed in that file, | ||
which the project needs in order to run normally: | ||
    pip install -r requirements.txt | ||
5. Install PyInstaller to create the project executable: | ||
    pip install PyInstaller | ||
6. Building with PyInstaller generally takes two passes. The first creates the executable as a directory of files. Make sure to tell | ||
PyInstaller to look in the project directory for the manually created hook-spacy.py, which collects spaCy's hidden dependencies: | ||
    pyinstaller CLI.py --additional-hooks-dir=. | ||
7. PyInstaller's output is very verbose, and it can take a few minutes to run depending on your machine. Once it has completed, repeat the build | ||
with the --onefile flag to create a single executable for distribution: | ||
    pyinstaller CLI.py --additional-hooks-dir=. --onefile | ||
8. The up-to-date executable for your operating system should now be in the 'dist' folder in your local project directory. The file | ||
extension varies by operating system; on Windows there is a single executable file, CLI.exe, that runs the entire project. | ||
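Steps 6 and 7 can also be driven from Python rather than typed by hand. The following is a minimal sketch and not part of the project: `build_pyinstaller_args` and `run_build` are hypothetical helper names, and the only PyInstaller entry point assumed is `PyInstaller.__main__.run`, which accepts the same arguments as the `pyinstaller` command.

```python
def build_pyinstaller_args(script="CLI.py", hooks_dir=".", onefile=False):
    """Build the argument list used in steps 6 and 7 of this guide."""
    args = [script, "--additional-hooks-dir=" + hooks_dir]
    if onefile:
        # Step 7: bundle everything into a single distributable file.
        args.append("--onefile")
    return args


def run_build(onefile=False):
    """Invoke PyInstaller in-process with the arguments built above."""
    # Imported lazily so this module can be loaded without PyInstaller installed.
    import PyInstaller.__main__

    PyInstaller.__main__.run(build_pyinstaller_args(onefile=onefile))
```

Calling `run_build()` and then `run_build(onefile=True)` from the activated virtual environment would mirror the two passes described above.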
|
||
------------------------ | ||
|
||
Good luck! If you have questions, please open an issue on the project's Git repository. | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,69 @@ | ||
import argparse | ||
import os | ||
import sys | ||
from scripts.description_audit_driver import main | ||
from scripts.description_audit_GUI import main as run_gui | ||
|
||
guiparser = argparse.ArgumentParser() | ||
guiparser.add_argument('--nogui', default=False, action="store_true") | ||
|
||
if __name__ == '__main__': | ||
|
||
    # First pass: only look for --nogui and ignore every other argument for now. | ||
    preargs = guiparser.parse_known_args() | ||
|
||
    if not preargs[0].nogui: | ||
        # launch GUI | ||
        args_from_gui = run_gui() | ||
        lexicon_csv_path = args_from_gui[0] | ||
        lexicon_test = args_from_gui[1] | ||
        hatebase_include = args_from_gui[2] | ||
        output_path = args_from_gui[3] | ||
        ead_path = args_from_gui[4] | ||
        marcxml_path = args_from_gui[5] | ||
|
||
    else: | ||
        # Second pass: extend the same parser with the required positional arguments. | ||
        noguiparser = guiparser | ||
        noguiparser.add_argument('lexicon_csv_path', type=str, help="Path to CSV file containing lexicons") | ||
        noguiparser.add_argument('lexicon_test', type=str, help="Headers to CSV indicating lexicons to match to. " | ||
                                                                "To use multiple, separate by underscores. To use all, " | ||
                                                                "type 'ALL'.") | ||
        # If you have any particularly lengthy or false positive-prone lexicons that you want to only include if | ||
        # they are explicitly declared, modify references to the below variable in parse_lexicon() driver function. | ||
        noguiparser.add_argument('hatebase_include', type=int, help="Integer flag (1 = include, 0 = exclude) " | ||
                                                                    "indicating whether the lengthy HateBase " | ||
                                                                    "lexicons should be included.") | ||
        noguiparser.add_argument('output_path', type=str, help="Path to folder where CSV reports should be stored") | ||
        noguiparser.add_argument('ead_path', type=str, help="Path to folder comprised of EAD archive files in XML.") | ||
        noguiparser.add_argument('marcxml_path', type=str, help="Path to XML file containing MARCXML archive.") | ||
|
||
        args = noguiparser.parse_args() | ||
        print(args) | ||
        lexicon_csv_path = args.lexicon_csv_path | ||
        lexicon_test = args.lexicon_test | ||
        hatebase_include = args.hatebase_include | ||
        output_path = args.output_path | ||
        ead_path = args.ead_path | ||
        marcxml_path = args.marcxml_path | ||
|
||
    # Validate the inputs; exit with a non-zero status so callers can detect the failure. | ||
    if not os.path.isfile(lexicon_csv_path): | ||
        print("The lexicon CSV file specified does not exist on this path.") | ||
        sys.exit(1) | ||
|
||
    if not os.path.isdir(output_path): | ||
        print("The output path given is not a file directory.") | ||
        sys.exit(1) | ||
|
||
    # The two paths should only match when both are the placeholder "NONE". | ||
    if marcxml_path == ead_path: | ||
        print("Path to at least one archival structure must be specified") | ||
        sys.exit(1) | ||
|
||
    if not (os.path.isdir(ead_path) or (ead_path == "NONE")): | ||
        print("The EAD path given does not lead to a directory of archival information.") | ||
        sys.exit(1) | ||
|
||
    if not (os.path.isfile(marcxml_path) or (marcxml_path == "NONE")): | ||
        print("The MARCXML archival structure does not exist on this path.") | ||
        sys.exit(1) | ||
|
||
    main(lexicon_csv_path, lexicon_test, hatebase_include, output_path, ead_path, marcxml_path) |
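The script above parses its arguments in two stages: it first checks only for `--nogui`, then registers the positional arguments if the GUI is skipped. The standalone sketch below isolates that pattern so it can be tried without the project installed. `parse_cli` is a hypothetical name, and returning `None` stands in for launching the GUI.

```python
import argparse


def parse_cli(argv):
    """Two-stage parse: detect --nogui first, then require the positionals."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--nogui", default=False, action="store_true")

    # Stage 1: parse_known_args returns (namespace, leftover_args) and does not
    # fail on arguments the parser has not been told about yet.
    known, _leftover = parser.parse_known_args(argv)
    if not known.nogui:
        return None  # the real script would launch the GUI here

    # Stage 2: register the positional arguments on the same parser and
    # parse the full argument list again.
    parser.add_argument("lexicon_csv_path", type=str)
    parser.add_argument("lexicon_test", type=str)
    parser.add_argument("hatebase_include", type=int)
    parser.add_argument("output_path", type=str)
    parser.add_argument("ead_path", type=str)
    parser.add_argument("marcxml_path", type=str)
    return parser.parse_args(argv)


# Example invocation with placeholder paths.
print(parse_cli(["--nogui", "lexicons.csv", "ALL", "0", "reports", "NONE", "marc.xml"]))
```

Reusing one parser keeps `--nogui` valid in the second pass, so the flag does not have to be stripped from the argument list by hand.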
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,34 @@ | ||
# -*- mode: python ; coding: utf-8 -*- | ||
|
||
|
||
block_cipher = None | ||
|
||
|
||
a = Analysis(['description_audit.py'], | ||
pathex=['C:\\Users\\msham\\OneDrive\\Documents\\description-audit'], | ||
binaries=[], | ||
datas=[], | ||
hiddenimports=[], | ||
hookspath=['.'], | ||
runtime_hooks=[], | ||
excludes=[], | ||
win_no_prefer_redirects=False, | ||
win_private_assemblies=False, | ||
cipher=block_cipher, | ||
noarchive=False) | ||
pyz = PYZ(a.pure, a.zipped_data, | ||
cipher=block_cipher) | ||
exe = EXE(pyz, | ||
a.scripts, | ||
a.binaries, | ||
a.zipfiles, | ||
a.datas, | ||
[], | ||
name='description_audit', | ||
debug=False, | ||
bootloader_ignore_signals=False, | ||
strip=False, | ||
upx=True, | ||
upx_exclude=[], | ||
runtime_tmpdir=None, | ||
console=True ) |
Git LFS file not shown
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,42 @@ | ||
from PyInstaller.utils.hooks import collect_all | ||
|
||
# ----------------------------- SPACY ----------------------------- | ||
data = collect_all('spacy') | ||
|
||
datas = data[0] | ||
binaries = data[1] | ||
hiddenimports = data[2] | ||
|
||
# ----------------------------- THINC ----------------------------- | ||
data = collect_all('thinc') | ||
|
||
datas += data[0] | ||
binaries += data[1] | ||
hiddenimports += data[2] | ||
|
||
# ----------------------------- CYMEM ----------------------------- | ||
data = collect_all('cymem') | ||
|
||
datas += data[0] | ||
binaries += data[1] | ||
hiddenimports += data[2] | ||
|
||
# ----------------------------- PRESHED ----------------------------- | ||
data = collect_all('preshed') | ||
|
||
datas += data[0] | ||
binaries += data[1] | ||
hiddenimports += data[2] | ||
|
||
# ----------------------------- BLIS ----------------------------- | ||
|
||
data = collect_all('blis') | ||
|
||
datas += data[0] | ||
binaries += data[1] | ||
hiddenimports += data[2] | ||
# This hook file is a bit of a hack - really, each of these libraries should be collected in its own separate hook file. | ||
|
||
# ----------------------------- OTHER ----------------------------- | ||
|
||
hiddenimports += ['bs4', 'pandas', 'srsly.msgpack.util'] |
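The hook above repeats the same three accumulation lines for each package. One possible way to express that more compactly is a small loop, sketched below. `merge_collected` is a hypothetical helper, not something PyInstaller provides; it only assumes that the collector it is given returns a `(datas, binaries, hiddenimports)` tuple, as `collect_all` does.

```python
def merge_collected(packages, collector):
    """Call `collector` on each package and concatenate the three result lists."""
    datas, binaries, hiddenimports = [], [], []
    for package in packages:
        pkg_datas, pkg_binaries, pkg_hidden = collector(package)
        datas += pkg_datas
        binaries += pkg_binaries
        hiddenimports += pkg_hidden
    return datas, binaries, hiddenimports


# Inside the hook file this would be used roughly as follows (assumption:
# PyInstaller is installed in the build environment):
#
#   from PyInstaller.utils.hooks import collect_all
#   datas, binaries, hiddenimports = merge_collected(
#       ['spacy', 'thinc', 'cymem', 'preshed', 'blis'], collect_all)
#   hiddenimports += ['bs4', 'pandas', 'srsly.msgpack.util']
```

Passing the collector in as a parameter keeps the helper testable without PyInstaller, at the cost of one extra argument.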
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,54 @@ | ||
Aggrandizement,RaceEuphemisms,RaceTerms,SlaveryTerms,GenderTerms | ||
acclaimed,color blind,aboriginal,abolition,miss | ||
ambitious,colored,aboriginals,abolitionist,mistress | ||
celebrated,coloured,aborigines,antislavery,mrs. | ||
distinguished,negro,aliens,anti-slavery,muse | ||
eminent,race relations,arab,bill of sale,spouse | ||
esteemed,race situation,arabs,bills of sale,wife | ||
expert,race-based,asians,enslaved, | ||
father of,racial,asiatic,freed slave, | ||
foremost,racism,blacks,freed slaves, | ||
founding father,riot,bushman,freedman, | ||
genius,troubles,bushmen,freedmen, | ||
gentleman,unruly,bushwoman,manumission, | ||
important,,chink,manumitted, | ||
influential,,civilized,negro, | ||
man of letters,,coolie,overseer, | ||
masterpiece,,coolies,plantation, | ||
notable,,creole,planter, | ||
pioneer,,creoles,runaway slave, | ||
plantation owner,,dyke,runaway slaves, | ||
planter,,ethnic,slave, | ||
preeminent,,exotic,slave holder, | ||
prestigious,,fag,slave master, | ||
prolific,,gook,slave owner, | ||
prominent,,gypsies,slave owner, | ||
renowned,,gypsy,slaveholder, | ||
respected,,hispanics,slavery, | ||
revolutionary,,illegal alien,slaves, | ||
seminal,,illegal aliens,, | ||
successful,,illegal immigrant,, | ||
wealthy,,illegal immigrants,, | ||
,,illegals,, | ||
,,indian,, | ||
,,indians,, | ||
,,japs,, | ||
,,mammy,, | ||
,,mulatto,, | ||
,,mulattoes,, | ||
,,mulattos,, | ||
,,native americans,, | ||
,,natives,, | ||
,,negoes,, | ||
,,negro,, | ||
,,negros,, | ||
,,oriental,, | ||
,,primitive people,, | ||
,,primitives,, | ||
,,pygmies,, | ||
,,pygmy,, | ||
,,sambo,, | ||
,,savages,, | ||
,,squaw,, | ||
,,squaws,, | ||
,,uncivilized,, |
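The lexicon file stores one lexicon per column, with shorter columns padded by empty cells. The sketch below shows one way such a file could be read into per-lexicon term lists, following the CLI help text's convention of underscore-separated header names or `ALL`. `load_lexicons` is a hypothetical function written for illustration; the project's own `parse_lexicon()` is not shown in this diff and may work differently.

```python
import csv
import io


def load_lexicons(csv_file, selection="ALL"):
    """Read a column-per-lexicon CSV into {header: [terms]}, skipping empty cells.

    `selection` is either "ALL" or header names joined by underscores,
    for example "Aggrandizement_GenderTerms".
    """
    reader = csv.DictReader(csv_file)
    headers = reader.fieldnames or []
    wanted = headers if selection == "ALL" else selection.split("_")
    # Ignore requested names that are not actual column headers.
    lexicons = {name: [] for name in wanted if name in headers}
    for row in reader:
        for name in lexicons:
            term = (row.get(name) or "").strip()
            if term:
                lexicons[name].append(term)
    return lexicons


# A tiny in-memory sample using a few rows from the file above.
sample = io.StringIO(
    "Aggrandizement,SlaveryTerms,GenderTerms\n"
    "acclaimed,abolition,miss\n"
    "ambitious,abolitionist,\n"
)
print(load_lexicons(sample, "Aggrandizement_GenderTerms"))
```

Splitting the selection on underscores only works because none of the headers in this file contain an underscore themselves.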