diff --git a/examples/spouse/README.md b/examples/spouse/README.md
index 78727c686..65308cb67 100644
--- a/examples/spouse/README.md
+++ b/examples/spouse/README.md
@@ -96,6 +96,8 @@ deepdive do articles
 
 DeepDive will output an execution plan, which will pop up in your default text editor; save and exit to accept, and DeepDive will run, creating the table and then fetching & loading the data!
 
+You can inspect a few example tuples with `deepdive query`.
+
 ### 1.2. Adding NLP markups
 
 Next, we'll use Stanford's [CoreNLP](http://stanfordnlp.github.io/CoreNLP/) natural language processing (NLP) system to add useful markups and structure to our input data. This step will split up our articles into sentences, and their component _tokens_ (roughly, the words).
@@ -148,6 +150,8 @@ deepdive compile
 deepdive do sentences
 ```
 
+You can inspect a few example tuples with `deepdive query`.
+
 Note that the previous steps—here, loading the articles—will _not_ be re-run unless we specify that they should be, using, e.g.:
 
 ```bash
@@ -211,6 +215,9 @@ Again, to run, just compile & execute as in previous steps:
 deepdive compile && deepdive do person_mention
 ```
 
+You can inspect a few example tuples with `deepdive query`.
+
+
 #### Mentions of spouses (pairs of people)
 
 Next, we'll take all pairs of **non-overlapping person mentions that co-occur in a sentence with less than 5 people total,** and consider these as the set of potential ('candidate') spouse mentions. We thus filter out sentences with large numbers of people for the purposes of this tutorial; however these could be included if desired.
@@ -247,6 +254,9 @@ Again, to run, just compile & execute as in previous steps.
 deepdive compile && deepdive do spouse_candidate
 ```
 
+You can inspect a few example tuples with `deepdive query`.
+
+
 ### 1.4. Extracting features for each candidate
 
 Finally, we will extract a set of **features** for each candidate:
@@ -305,6 +315,8 @@ Again, to run, just compile & execute as in previous steps.
 deepdive compile && deepdive do spouse_feature
 ```
 
+You can inspect a few example tuples with `deepdive query`.
+
 Now we have generated what looks more like the standard input to a machine learning problem—a set of objects, represented by sets of features, which we want to classify (here, as true or false mentions of a spousal relation). However, we **don't have any supervised labels** (i.e., a set of correct answers) for a machine learning algorithm to learn from! In most real world applications, a sufficiently large set of supervised labels is _not_ in fact available.
 
@@ -315,7 +327,6 @@ With DeepDive, we take the approach sometimes referred to as _distant supervisio
 
 
-
 ## 2. Distant supervision with data & rules
 
 In this section, we'll use _distant supervision_ (or '_data programming_') to provide a noisy set of labels to supervise our candidate relation mentions, based on which we can train a machine learning model.
@@ -374,6 +385,9 @@ Notice that for DeepDive to load the data to the corresponding database table th
 deepdive do spouses_dbpedia
 ```
 
+You can inspect a few example tuples with `deepdive query`.
+
+
 #### Supervising spouse candidates with DBpedia data
 
 First we'll declare a new table where we'll store the labels (referring to the spouse candidate mentions), with an integer value (`True=1, False=-1`) and a description (`rule_id`):
@@ -481,6 +495,7 @@ deepdive compile && deepdive do has_spouse
 
 Recall that `deepdive do` will execute all upstream tasks as well, so this will execute all of the previous steps!
 
+You can inspect a few example tuples with `deepdive query`.
 
 
@@ -546,18 +561,17 @@ has_spouse(p1_id, p2_id) => has_spouse(p1_id, p3_id) :-
 ```
 
-
+_**TODO: show the learning/inference steps and example results**_
 
 ## 4. Error analysis & debugging
 
-_**TODO**_
 
-### Corpus exploration with Mindbender
+### 4.1. Browsing data with Mindbender
 
-This part of the tutorial is optional and focuses on how the user can browse through the input corpus via an automatically generated web-interface. The reader can safely skip this part.
+_**TODO: write this section**_
 
 #### DDlog annotations for browsing data
@@ -571,35 +585,29 @@ articles(
 ).
 ```
 
-#### Installing Mindbender
-**_TODO: Put in proper way to do this!?_**
-Given that `DEEPDIVE_ROOT` is a variable containing the path to the root of the deepdive repo, if you are on linux run:
-
-```bash
-wget -O ${DEEPDIVE_ROOT}/dist/stage/bin/mindbender https://github.com/HazyResearch/mindbender/releases/download/v0.2.1/mindbender-v0.2.1-Linux-x86_64.sh
-```
-
-for other versions see [the releases page](https://github.com/HazyResearch/mindbender/releases). Then make sure that this location is on your path:
+### 4.2. Estimating precision with Mindtagger
 
-```bash
-export PATH=${DEEPDIVE_ROOT}/dist/stage/bin:$PATH
-```
+_**TODO: write this section**_
 
-#### Running Mindbender
-First, generate the input for mindtagger. You can edit the template for the data generated by editing `generate-input.sql`, and the template for displaying the data in `mindtagger.conf` and `template.html` (for more detail, see the [documentation](http://deepdive.stanford.edu/doc/basics/labeling.html))then run:
+#### Running Mindtagger
+First, generate the input for Mindtagger.
+You can edit the template for the data generated by editing `generate-input.sql`, and the template for displaying the data in `mindtagger.conf` and `template.html` (for more detail, see the [documentation](http://deepdive.stanford.edu/doc/basics/labeling.html)) then run:
 
 ```bash
 cd mindtagger
-psql -d deepdive_spouse -f generate-input.sql > input.csv
+deepdive sql eval "$(cat generate-input.sql)" format=csv header=1 > input.csv
 ```
 
 Next, start mindtagger:
 
 ```bash
-PORT=$PORT ./start-mindtagger.sh
+mindbender tagger mindtagger.conf
 ```
 
 Then navigate to the URL displayed in your browser.
-2. Describe how to setup mindbender.
-3. Describe which commands to run to get the mindbender environment up and running.
+
+
+### 4.3. Monitoring statistics with Dashboard
+
+_**TODO: write this section**_
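
The `deepdive query` steps this patch adds could later be fleshed out along these lines. This is only a sketch, not part of the patch: it assumes an `articles(id, content)` relation declared in `app.ddlog`, a configured database connection, and DeepDive's DDlog query form `HEAD_VARS ?- BODY.`; adjust the relation names and columns to match the actual schema.

```bash
# Hypothetical preview of loaded tuples (assumes articles(id, content)
# in app.ddlog and a working database connection); prints ten article ids.
deepdive query 'id ?- articles(id, _).' | head -n 10
```

A similar query against `sentences`, `person_mention`, or `spouse_candidate` would preview the output of the later pipeline steps.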