The LIT team maintains a number of hosted demos, as well as pre-built launchers for some common tasks and model types.
For publicly-visible demos hosted on Google Cloud, see https://pair-code.github.io/lit/demos/.
Hosted instance: https://pair-code.github.io/lit/demos/glue.html
Code: https://github.com/PAIR-code/lit/blob/main/lit_nlp/examples/glue_demo.py
- Multi-task demo:
  - Sentiment analysis as a binary classification task (SST-2) on single sentences.
  - Natural Language Inference (NLI) using MultiNLI, as a three-way classification task with two-segment input (premise, hypothesis).
  - STS-B textual similarity task (see Regression / Scoring below).
  - Switch tasks using the Settings (⚙️) menu.
- BERT models of different sizes, built on HuggingFace TF2 (Keras).
- Supports the widest range of LIT interpretability features:
  - Model output probabilities, custom thresholds, and multiclass metrics.
  - Jitter plot of output scores, to find confident examples or ones near the margin.
  - Embedding projector to find clusters in representation space.
  - Integrated Gradients, LIME, and other salience methods.
  - Attention visualization.
  - Counterfactual generators, including HotFlip for targeted adversarial perturbations.
Tip: check out a case study for this demo on the public LIT website: https://pair-code.github.io/lit/tutorials/sentiment
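As a rough illustration of the custom-threshold and near-the-margin ideas in the feature list above, the following sketch applies a decision threshold to positive-class probabilities and picks out examples whose score sits close to that threshold. The function names, the default margin width, and the scores are hypothetical; this is not LIT's API, only a minimal picture of what the jitter plot helps you find.

```python
from typing import List


def classify(prob_positive: float, threshold: float = 0.5) -> int:
    """Return 1 if the positive-class probability meets the threshold, else 0."""
    return 1 if prob_positive >= threshold else 0


def near_margin(probs: List[float], threshold: float = 0.5,
                width: float = 0.1) -> List[int]:
    """Return indices of examples whose score lies within `width` of the threshold."""
    return [i for i, p in enumerate(probs) if abs(p - threshold) <= width]


# Made-up positive-class probabilities for five examples.
probs = [0.97, 0.55, 0.48, 0.05, 0.71]

# Raising the threshold to 0.6 flips the second example to the negative class.
print([classify(p, threshold=0.6) for p in probs])  # [1, 0, 0, 0, 1]

# Examples 1 and 2 are the uncertain ones around the default threshold.
print(near_margin(probs))  # [1, 2]
```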
Code: https://github.com/PAIR-code/lit/blob/main/lit_nlp/examples/xnli_demo.py
- XNLI dataset translates a subset of MultiNLI into 14 different languages.
- Specify the --languages=en,jp,hi,... flag to select which languages to load.
- NLI as a three-way classification task with two-segment input (premise, hypothesis).
- Fine-tuned multilingual BERT model.
- Salience methods work with non-whitespace-delimited text, by using the model's wordpiece tokenization.
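To make the wordpiece point concrete, here is a minimal sketch of greedy longest-match wordpiece segmentation, the general scheme used by BERT-style tokenizers. The vocabulary is a toy, not the multilingual BERT vocabulary, and the function name is hypothetical. The takeaway is that a salience method can attach one score to each piece, so it does not rely on whitespace to define tokens.

```python
from typing import List, Set


def wordpiece(text: str, vocab: Set[str], unk: str = "[UNK]") -> List[str]:
    """Split `text` into the longest matching vocabulary pieces, left to right."""
    pieces = []
    start = 0
    while start < len(text):
        end = len(text)
        piece = None
        # Shrink the candidate span until it matches a vocabulary entry.
        while start < end:
            candidate = text[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation-piece marker
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no segmentation exists under this vocabulary
        pieces.append(piece)
        start = end
    return pieces


toy_vocab = {"un", "##aff", "##able"}  # toy vocabulary, for illustration only
tokens = wordpiece("unaffable", toy_vocab)
print(tokens)  # ['un', '##aff', '##able']

# A salience method would then produce one score per piece, for example:
hypothetical_scores = dict(zip(tokens, [0.1, 0.7, 0.2]))  # made-up values
```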
Hosted instance: https://pair-code.github.io/lit/demos/glue.html?models=stsb&dataset=stsb_dev
Code: https://github.com/PAIR-code/lit/blob/main/lit_nlp/examples/glue_demo.py
- STS-B textual similarity task, predicting scores on a range from 0 (unrelated) to 5 (very similar).
- BERT models built on HuggingFace TF2 (Keras).
- Supports a wide range of LIT interpretability features:
  - Model output scores and metrics.
  - Scatter plot of scores and error, and jitter plot of true labels for quick filtering.
  - Embedding projector to find clusters in representation space.
  - Integrated Gradients, LIME, and other salience methods.
  - Attention visualization.
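For a sense of what the score-versus-error scatter plot displays, this small sketch computes the signed error for each example and selects the ones that are far off. The predictions and labels are made up, and the helper names are hypothetical rather than taken from LIT.

```python
from typing import List


def score_errors(preds: List[float], labels: List[float]) -> List[float]:
    """Signed error (prediction minus label) for each example."""
    return [p - y for p, y in zip(preds, labels)]


def mean_squared_error(preds: List[float], labels: List[float]) -> float:
    """Average of the squared errors over all examples."""
    errors = score_errors(preds, labels)
    return sum(e * e for e in errors) / len(errors)


# Made-up STS-B style scores on the 0 to 5 similarity scale.
preds = [4.5, 1.0, 3.0]
labels = [5.0, 0.0, 3.0]

errors = score_errors(preds, labels)
# Indices of examples with an absolute error of at least one point.
large_errors = [i for i, e in enumerate(errors) if abs(e) >= 1.0]
print(errors, large_errors, mean_squared_error(preds, labels))
```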
Code: https://github.com/PAIR-code/lit/blob/main/lit_nlp/examples/t5_demo.py
- Supports HuggingFace TF2 (Keras) models as well as TensorFlow SavedModel formats.
- Visualize beam candidates and highlight diffs against references.
- Visualize per-token decoder hypotheses to see where the model veers away from desired output.
- Filter examples by ROUGE score against reference.
- Embeddings from last layer of model, visualized with UMAP or PCA.
- Task wrappers to handle pre- and post-processing for summarization and machine translation tasks.
- Pre-loaded eval sets for CNNDM and WMT.
Tip: check out a case study for this demo on the public LIT website: https://pair-code.github.io/lit/tutorials/generation
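The idea of filtering by ROUGE score can be sketched with a simplified unigram-overlap (ROUGE-1 style) F-score. This is an assumption-laden stand-in: it lowercases and splits on whitespace, and it is not necessarily the ROUGE variant or implementation the demo uses. The example records and field names are hypothetical.

```python
from collections import Counter


def rouge1_f(candidate: str, reference: str) -> float:
    """Unigram-overlap F-score between a candidate and a reference string."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Made-up generation outputs paired with references.
examples = [
    {"output": "the cat sat on the mat", "reference": "the cat sat on the mat"},
    {"output": "a dog barked", "reference": "the cat sat on the mat"},
]

# Keep only examples that score poorly against their reference.
low_scoring = [ex for ex in examples
               if rouge1_f(ex["output"], ex["reference"]) < 0.5]
print(len(low_scoring))
```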
Code: https://github.com/PAIR-code/lit/blob/main/lit_nlp/examples/lm_demo.py
- Compare multiple BERT and GPT-2 models side-by-side on a variety of plain-text corpora.
- LM visualization supports different modes:
  - BERT masked language model: click-to-mask, and query model at that position.
  - GPT-2 shows left-to-right hypotheses for each target token.
- Embedding projector to show latent space of the model.
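Click-to-mask amounts to replacing the token at a chosen position with BERT's `[MASK]` token before querying the model for predictions at that position. The helper below is a hypothetical sketch of that input transformation only; it does not call a model and is not LIT's implementation.

```python
from typing import List


def mask_at(tokens: List[str], position: int, mask: str = "[MASK]") -> List[str]:
    """Return a copy of `tokens` with the token at `position` replaced by `mask`."""
    if not 0 <= position < len(tokens):
        raise IndexError(f"position {position} is out of range")
    return tokens[:position] + [mask] + tokens[position + 1:]


tokens = ["the", "cat", "sat", "on", "the", "mat"]
masked = mask_at(tokens, 1)
print(masked)  # ['the', '[MASK]', 'sat', 'on', 'the', 'mat']
```

A masked language model would then be asked for its top predictions at index 1, which is what the visualization shows for the clicked token.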
Code: https://github.com/PAIR-code/lit/blob/main/lit_nlp/examples/coref/coref_demo.py
- Gold-mention coreference model, trained on OntoNotes.
- Evaluate on the Winogender schemas (Rudinger et al. 2018), which test for gendered associations with profession names.
- Visualizations of coreference edges, as well as binary classification between two candidate referents.
- Stratified metrics for quantifying model bias as a function of pronoun gender or Bureau of Labor Statistics profession data.
Tip: check out a case study for this demo on the public LIT website: https://pair-code.github.io/lit/tutorials/coref
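Stratified metrics simply mean computing a metric separately for each value of some attribute. The sketch below computes accuracy per pronoun group on made-up records; the field names (`pronoun`, `pred`, `label`) and the function are hypothetical and do not reflect the demo's actual data schema.

```python
from collections import defaultdict
from typing import Dict, List


def stratified_accuracy(examples: List[dict], key: str) -> Dict[str, float]:
    """Accuracy computed separately for each distinct value of `key`."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for ex in examples:
        group = ex[key]
        total[group] += 1
        correct[group] += int(ex["pred"] == ex["label"])
    return {group: correct[group] / total[group] for group in total}


# Made-up predictions; real Winogender results will differ.
examples = [
    {"pronoun": "female", "pred": 1, "label": 1},
    {"pronoun": "female", "pred": 0, "label": 1},
    {"pronoun": "male", "pred": 1, "label": 1},
    {"pronoun": "male", "pred": 0, "label": 0},
    {"pronoun": "neutral", "pred": 0, "label": 1},
]

by_pronoun = stratified_accuracy(examples, "pronoun")
print(by_pronoun)
```

A gap between groups in such a table is the kind of signal the stratified view is meant to surface.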
Code: https://github.com/PAIR-code/lit/blob/main/lit_nlp/examples/penguin_demo.py
- Binary classification on penguin dataset.
- Shows the use of LIT on non-text data (numeric and categorical features).
- Use partial-dependence plots to understand feature importance on individual examples, selections, or the entire evaluation dataset.
- Use binary classifier threshold setters to find best thresholds for slices of examples to achieve specific fairness constraints, such as demographic parity.
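To illustrate what per-slice thresholds for demographic parity mean, the sketch below chooses a separate threshold for each group so that roughly the same fraction of each group is predicted positive. It is a simplified assumption of how such a setter could work, not LIT's algorithm; the group names and scores are made up, each group is assumed non-empty, and tied scores can push a group's rate above the target.

```python
from typing import Dict, List


def positive_rate(scores: List[float], threshold: float) -> float:
    """Fraction of scores at or above the threshold."""
    return sum(s >= threshold for s in scores) / len(scores)


def threshold_for_rate(scores: List[float], target_rate: float) -> float:
    """Threshold that marks roughly `target_rate` of the scores as positive."""
    ranked = sorted(scores, reverse=True)
    k = min(round(target_rate * len(ranked)), len(ranked))
    if k <= 0:
        return float("inf")  # no example is predicted positive
    return ranked[k - 1]


def parity_thresholds(scores_by_group: Dict[str, List[float]],
                      target_rate: float) -> Dict[str, float]:
    """One threshold per group, all aiming at the same positive rate."""
    return {group: threshold_for_rate(scores, target_rate)
            for group, scores in scores_by_group.items()}


# Made-up classifier scores for two slices of the data.
scores_by_group = {
    "A": [0.9, 0.8, 0.7, 0.2],
    "B": [0.6, 0.5, 0.3, 0.1],
}

# A single shared threshold of 0.5 gives unequal positive rates (0.75 vs 0.5).
shared = {g: positive_rate(s, 0.5) for g, s in scores_by_group.items()}

# Per-group thresholds equalize the positive rate at the 0.5 target.
thresholds = parity_thresholds(scores_by_group, target_rate=0.5)
rates = {g: positive_rate(scores_by_group[g], t) for g, t in thresholds.items()}
print(shared, thresholds, rates)
```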
Code: https://github.com/PAIR-code/lit/blob/main/lit_nlp/examples/image_demo.py
- Classification on ImageNet labels using a MobileNet model.
- Shows the use of LIT on image data.
- Explore results of multiple gradient-based image saliency techniques in the Salience Maps module.