Sets outline for Tutorial Sec 4 and adds some todos for showing example data
netj committed Feb 14, 2016
1 parent e7c9b2f commit 4bdc90b
Showing 1 changed file with 31 additions and 23 deletions.
54 changes: 31 additions & 23 deletions examples/spouse/README.md
@@ -96,6 +96,8 @@ deepdive do articles
DeepDive will output an execution plan, which will pop up in your default text editor;
save and exit to accept, and DeepDive will run, creating the table and then fetching & loading the data!

<todo>show example tuples with `deepdive query`</todo>

### 1.2. Adding NLP markups
Next, we'll use Stanford's [CoreNLP](http://stanfordnlp.github.io/CoreNLP/) natural language processing (NLP) system to add useful markups and structure to our input data.
This step will split up our articles into sentences, and their component _tokens_ (roughly, the words).
@@ -148,6 +150,8 @@ deepdive compile
deepdive do sentences
```

<todo>show example tuples with `deepdive query`</todo>

Note that the previous steps—here, loading the articles—will _not_ be re-run unless we specify that they should be, using, e.g.:

```bash
@@ -211,6 +215,9 @@ Again, to run, just compile & execute as in previous steps:
deepdive compile && deepdive do person_mention
```

<todo>show example tuples with `deepdive query`</todo>


#### Mentions of spouses (pairs of people)
Next, we'll take all pairs of **non-overlapping person mentions that co-occur in a sentence with fewer than 5 people in total,** and consider these as the set of potential ('candidate') spouse mentions.
We thus filter out sentences with large numbers of people for the purposes of this tutorial; however, these could be included if desired.
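
The candidate rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the tutorial's actual extractor: the `(mention_id, begin_index, end_index)` tuple layout and the `spouse_candidates` name are hypothetical, each person mention is counted as one person, and each unordered pair is emitted only once.

```python
from itertools import combinations

# Assumption: the threshold of 5 people is a strict upper bound.
MAX_PEOPLE = 5


def spouse_candidates(mentions, max_people=MAX_PEOPLE):
    """Return pairs of non-overlapping person mentions from one sentence.

    `mentions` is a list of (mention_id, begin_index, end_index) tuples with
    inclusive token indexes. This layout is hypothetical.
    """
    if len(mentions) >= max_people:
        return []  # skip sentences that mention too many people
    pairs = []
    for (id1, b1, e1), (id2, b2, e2) in combinations(mentions, 2):
        # Two inclusive spans overlap when each starts before the other ends.
        overlaps = b1 <= e2 and b2 <= e1
        if not overlaps:
            pairs.append((id1, id2))
    return pairs
```

Raising `max_people` would bring the filtered sentences back in, at the cost of a number of candidate pairs that grows quadratically with the number of mentions in a sentence.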
@@ -247,6 +254,9 @@ Again, to run, just compile & execute as in previous steps.
deepdive compile && deepdive do spouse_candidate
```

<todo>show example tuples with `deepdive query`</todo>


### 1.4. Extracting features for each candidate
Finally, we will extract a set of **features** for each candidate:

@@ -305,6 +315,8 @@ Again, to run, just compile & execute as in previous steps.
deepdive compile && deepdive do spouse_feature
```

<todo>show example tuples with `deepdive query`</todo>

Now we have generated what looks more like the standard input to a machine learning problem—a set of objects, represented by sets of features, which we want to classify (here, as true or false mentions of a spousal relation).
However, we **don't have any supervised labels** (i.e., a set of correct answers) for a machine learning algorithm to learn from!
In most real-world applications, a sufficiently large set of supervised labels is _not_ in fact available.
@@ -315,7 +327,6 @@ With DeepDive, we take the approach sometimes referred to as _distant supervisio




## 2. Distant supervision with data & rules
In this section, we'll use _distant supervision_ (or '_data programming_') to provide a noisy set of labels to supervise our candidate relation mentions, based on which we can train a machine learning model.
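
As a rough illustration of the idea, the sketch below labels a candidate pair as positive when both names appear as a known couple in an external knowledge base. Everything here is hypothetical: the function name, the `from_kb` rule id, and the example names. Only the label value `1` follows the `True=1, False=-1` convention used for the label table later in this section.

```python
def label_candidate(p1_name, p2_name, known_spouses):
    """Return (label, rule_id) for a candidate pair, or None if no rule fires.

    `known_spouses` is a set of frozensets of lower-cased names, so the
    order of the two people does not matter. This layout is an assumption.
    """
    pair = frozenset((p1_name.lower(), p2_name.lower()))
    if pair in known_spouses:
        return (1, "from_kb")  # hypothetical rule id for a knowledge-base match
    return None  # unlabeled; a negative label (-1) would come from other rules


# Hypothetical knowledge base with one known couple.
kb = {frozenset(("alice example", "bob example"))}
print(label_candidate("Bob Example", "Alice Example", kb))
print(label_candidate("Bob Example", "Carol Sample", kb))
```

Labels produced this way are noisy by construction: a sentence can mention a married couple without expressing the spousal relation, which is why they serve as training signal rather than ground truth.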

@@ -374,6 +385,9 @@ Notice that for DeepDive to load the data to the corresponding database table th
deepdive do spouses_dbpedia
```

<todo>show example tuples with `deepdive query`</todo>


#### Supervising spouse candidates with DBpedia data
First we'll declare a new table where we'll store the labels (referring to the spouse candidate mentions), with an integer value (`True=1, False=-1`) and a description (`rule_id`):

@@ -481,6 +495,7 @@ deepdive compile && deepdive do has_spouse
Recall that `deepdive do` will execute all upstream tasks as well, so this will execute all of the previous steps!


<todo>show example tuples with `deepdive query`</todo>



@@ -546,18 +561,17 @@ has_spouse(p1_id, p2_id) => has_spouse(p1_id, p3_id) :-
```



<todo>show learning/inference steps, example results</todo>





## 4. Error analysis & debugging
_**TODO**_

### Corpus exploration with Mindbender
### 4.1. Browsing data with Mindbender

This part of the tutorial is optional and focuses on how the user can browse through the input corpus via an automatically generated web interface. The reader can safely skip this part.
<todo>write</todo>

#### DDlog annotations for browsing data

@@ -571,35 +585,29 @@ articles(
).
```

#### Installing Mindbender
**_TODO: Put in proper way to do this!?_**
Given that `DEEPDIVE_ROOT` is a variable containing the path to the root of the DeepDive repo, if you are on Linux, run:

```bash
wget -O ${DEEPDIVE_ROOT}/dist/stage/bin/mindbender https://github.com/HazyResearch/mindbender/releases/download/v0.2.1/mindbender-v0.2.1-Linux-x86_64.sh
```

For other versions, see [the releases page](https://github.com/HazyResearch/mindbender/releases). Then make sure that this location is on your path:
### 4.2. Estimating precision with Mindtagger

```bash
export PATH=${DEEPDIVE_ROOT}/dist/stage/bin:$PATH
```
<todo>write</todo>

#### Running Mindbender
First, generate the input for mindtagger. You can edit the template for the data generated by editing `generate-input.sql`, and the template for displaying the data in `mindtagger.conf` and `template.html` (for more detail, see the [documentation](http://deepdive.stanford.edu/doc/basics/labeling.html))then run:
#### Running Mindtagger
First, generate the input for Mindtagger.
You can change the generated data by editing `generate-input.sql`, and change how it is displayed by editing `mindtagger.conf` and `template.html` (for more detail, see the [documentation](http://deepdive.stanford.edu/doc/basics/labeling.html)). Then run:

```bash
cd mindtagger
psql -d deepdive_spouse -f generate-input.sql > input.csv
deepdive sql <generate-input.sql >input.csv
```

Next, start Mindtagger:

```bash
PORT=$PORT ./start-mindtagger.sh
mindbender tagger mindtagger.conf
```

Then open the displayed URL in your browser.

2. Describe how to set up Mindbender.
3. Describe which commands to run to get the Mindbender environment up and running.


### 4.3. Monitoring statistics with Dashboard

<todo>write</todo>
