This repository has been archived by the owner on Jan 15, 2024. It is now read-only.

Commit

[Numpy] Benchmark the backbone models + Some fixes + Always use python3 + Fix conversion tool (#1292)

* update

update

Update benchmark_hf.py

Update benchmark_hf.py

Update benchmark_hf.py

Create requirements.txt

Update benchmark_hf.py

Update benchmark_hf.py

Update benchmark_hf.py

Update benchmark_hf.py

Update benchmark_hf.py

Update benchmark_hf.py

Update benchmark_hf.py

Update benchmark_hf.py

Update benchmark_hf.py

Update benchmark_hf.py

Update benchmark_hf.py

Update benchmark_hf.py

Update benchmark_hf.py

Update requirements.txt

update

Update README.md

Update benchmark_hf.py

Update benchmark_hf.py

Update benchmark_hf.py

Update benchmark_hf.py

Update benchmark_hf.py

Update benchmark_hf.py

fix

fix

Update test_models_bart.py

Update test_models_bart.py

Update bart.py

update

Update __init__.py

Update electra.py

update

update

Update convert_bert_from_tf_hub.sh

update

Update unittests.yml

fix conversion

update

fix bert conversion

update

fix

fix

Update __init__.py

fix bug

fix css

Update benchmark_utils.py

Update benchmark_utils.py

update

update

Update misc.py

Update benchmark_utils.py

Update benchmark_utils.py

Update benchmark_utils.py

Update benchmark_utils.py

Update benchmark_utils.py

Update benchmark_utils.py

Update benchmark_utils.py

Update benchmark_utils.py

no multiprocessing

Update benchmark_utils.py

Update benchmark_utils.py

Update benchmark_utils.py

fix bug

Update benchmark_utils.py

Update benchmark_utils.py

try to use mxnet profiler

Update benchmark_utils.py

Update benchmark_utils.py

Update benchmark_utils.py

Update benchmark_utils.py

Update benchmark_utils.py

Update benchmark_utils.py

fix

update

Update benchmark_utils.py

Update benchmark_utils.py

Update benchmark_utils.py

Update benchmark_utils.py

fix

Update benchmark_utils.py

Update bart.py

Update bart.py

fix

fix

Update benchmark_utils.py

Update benchmark_utils.py

Update benchmark_utils.py

Update benchmark_utils.py

Update benchmark_gluonnlp.py

Update benchmark_gluonnlp.py

Update benchmark_gluonnlp.py

Update benchmark_utils.py

Update benchmark_utils.py

Update benchmark_utils.py

Update README.md

* Update benchmark_utils.py

* Update benchmark_utils.py

* Update requirements.txt

* Update benchmark_utils.py

* Update benchmark_utils.py

* Update benchmark_utils.py

* Update benchmark_utils.py

* Update benchmark_utils.py

* Update benchmark_utils.py

* debug

* Update benchmark_utils.py

* Update benchmark_gluonnlp.py

* Update benchmark_gluonnlp.py

* Update benchmark_utils.py

* Update pretraining_utils.py

* Update benchmark_utils.py

* update

* Update benchmark_utils.py

* Update benchmark_utils.py

* fix convert

* tiny fix

* python3

* fix

* lower tolerance for albert large and xlarge

* Update benchmark_utils.py

* fix xlmr

* lower tolerance for albert large

* update

* Update benchmark_utils.py

* Update benchmark_utils.py

* Update benchmark_utils.py

* Update benchmark_utils.py

* fix

* Squashed commit of the following:

commit bd05969
Author: ZheyuYe <[email protected]>
Date:   Tue Aug 11 23:44:53 2020 +0800

    lower tolerance for albert large

commit f0f9cd6
Author: ZheyuYe <[email protected]>
Date:   Tue Aug 11 14:59:06 2020 +0800

    fix xlmr

commit edd6655
Author: ZheyuYe <[email protected]>
Date:   Tue Aug 11 14:49:36 2020 +0800

    lower tolerance for albert large and xlarge

commit d651730
Author: ZheyuYe <[email protected]>
Date:   Tue Aug 11 14:34:55 2020 +0800

    fix

commit e097c3b
Author: ZheyuYe <[email protected]>
Date:   Tue Aug 11 14:02:13 2020 +0800

    python3

commit d6f3fc4
Author: ZheyuYe <[email protected]>
Date:   Tue Aug 11 14:00:28 2020 +0800

    tiny fix

commit 93bd659
Author: ZheyuYe <[email protected]>
Date:   Tue Aug 11 13:08:34 2020 +0800

    fix convert

commit 9238d56
Author: Xingjian Shi <[email protected]>
Date:   Mon Aug 10 21:03:13 2020 -0700

    Update benchmark_utils.py

commit 9bbc581
Author: Xingjian Shi <[email protected]>
Date:   Mon Aug 10 12:58:04 2020 -0700

    Update benchmark_utils.py

commit b1f5955
Author: Xingjian Shi <[email protected]>
Date:   Mon Aug 10 11:18:43 2020 -0700

    update

commit a43e65b
Author: Xingjian Shi <[email protected]>
Date:   Mon Aug 10 10:32:55 2020 -0700

    Update benchmark_utils.py

commit 13db82f
Author: Xingjian Shi <[email protected]>
Date:   Mon Aug 10 10:16:46 2020 -0700

    Update pretraining_utils.py

commit fdd9df5
Author: Xingjian Shi <[email protected]>
Date:   Mon Aug 10 08:49:17 2020 -0700

    Update benchmark_utils.py

commit 44f9c8b
Author: Xingjian Shi <[email protected]>
Date:   Mon Aug 10 05:07:45 2020 -0700

    Update benchmark_gluonnlp.py

commit 45c58b6
Author: Xingjian Shi <[email protected]>
Date:   Mon Aug 10 05:06:05 2020 -0700

    Update benchmark_gluonnlp.py

commit f0ae933
Author: Xingjian Shi <[email protected]>
Date:   Mon Aug 10 05:04:41 2020 -0700

    Update benchmark_utils.py

commit 9735edb
Author: Xingjian Shi <[email protected]>
Date:   Mon Aug 10 04:59:58 2020 -0700

    debug

commit d9daf58
Author: Xingjian Shi <[email protected]>
Date:   Mon Aug 10 04:57:17 2020 -0700

    Update benchmark_utils.py

commit 9e0f631
Author: Xingjian Shi <[email protected]>
Date:   Mon Aug 10 04:56:52 2020 -0700

    Update benchmark_utils.py

commit 37f224f
Author: Xingjian Shi <[email protected]>
Date:   Mon Aug 10 04:56:06 2020 -0700

    Update benchmark_utils.py

commit 1cf5c7b
Author: Xingjian Shi <[email protected]>
Date:   Mon Aug 10 04:54:34 2020 -0700

    Update benchmark_utils.py

commit 15272f1
Author: Xingjian Shi <[email protected]>
Date:   Mon Aug 10 04:49:28 2020 -0700

    Update benchmark_utils.py

commit 8215df6
Author: Xingjian Shi <[email protected]>
Date:   Mon Aug 10 04:48:20 2020 -0700

    Update benchmark_utils.py

commit 1451f03
Author: Xingjian Shi <[email protected]>
Date:   Mon Aug 10 04:42:21 2020 -0700

    Update requirements.txt

commit 626739d
Author: Xingjian Shi <[email protected]>
Date:   Mon Aug 10 04:38:54 2020 -0700

    Update benchmark_utils.py

commit 1955197
Author: Xingjian Shi <[email protected]>
Date:   Mon Aug 10 04:31:30 2020 -0700

    Update benchmark_utils.py

commit 2fd7e3b
Author: Xingjian Shi <[email protected]>
Date:   Thu Aug 6 23:56:49 2020 -0700

    update

    update

    Update benchmark_hf.py

    Update benchmark_hf.py

    Update benchmark_hf.py

    Create requirements.txt

    Update benchmark_hf.py

    Update benchmark_hf.py

    Update benchmark_hf.py

    Update benchmark_hf.py

    Update benchmark_hf.py

    Update benchmark_hf.py

    Update benchmark_hf.py

    Update benchmark_hf.py

    Update benchmark_hf.py

    Update benchmark_hf.py

    Update benchmark_hf.py

    Update benchmark_hf.py

    Update benchmark_hf.py

    Update requirements.txt

    update

    Update README.md

    Update benchmark_hf.py

    Update benchmark_hf.py

    Update benchmark_hf.py

    Update benchmark_hf.py

    Update benchmark_hf.py

    Update benchmark_hf.py

    fix

    fix

    Update test_models_bart.py

    Update test_models_bart.py

    Update bart.py

    update

    Update __init__.py

    Update electra.py

    update

    update

    Update convert_bert_from_tf_hub.sh

    update

    Update unittests.yml

    fix conversion

    update

    fix bert conversion

    update

    fix

    fix

    Update __init__.py

    fix bug

    fix css

    Update benchmark_utils.py

    Update benchmark_utils.py

    update

    update

    Update misc.py

    Update benchmark_utils.py

    Update benchmark_utils.py

    Update benchmark_utils.py

    Update benchmark_utils.py

    Update benchmark_utils.py

    Update benchmark_utils.py

    Update benchmark_utils.py

    Update benchmark_utils.py

    no multiprocessing

    Update benchmark_utils.py

    Update benchmark_utils.py

    Update benchmark_utils.py

    fix bug

    Update benchmark_utils.py

    Update benchmark_utils.py

    try to use mxnet profiler

    Update benchmark_utils.py

    Update benchmark_utils.py

    Update benchmark_utils.py

    Update benchmark_utils.py

    Update benchmark_utils.py

    Update benchmark_utils.py

    fix

    update

    Update benchmark_utils.py

    Update benchmark_utils.py

    Update benchmark_utils.py

    Update benchmark_utils.py

    fix

    Update benchmark_utils.py

    Update bart.py

    Update bart.py

    fix

    fix

    Update benchmark_utils.py

    Update benchmark_utils.py

    Update benchmark_utils.py

    Update benchmark_utils.py

    Update benchmark_gluonnlp.py

    Update benchmark_gluonnlp.py

    Update benchmark_gluonnlp.py

    Update benchmark_utils.py

    Update benchmark_utils.py

    Update benchmark_utils.py

    Update README.md

* fix squad

* fix typo

* Update benchmark_utils.py

* Update benchmark_hf.py

* Update benchmark_gluonnlp.py

* Update benchmark_gluonnlp.py

* Update benchmark_gluonnlp.py

* Update benchmark_utils.py

* Update benchmark_gluonnlp.py

* update

* Update benchmark_gluonnlp.py

* Update benchmark_gluonnlp.py

* Update benchmark_gluonnlp.py

* Update benchmark_gluonnlp.py

* Update README.md

* update

* Update benchmark_hf.py

* Update benchmark_hf.py

* Update requirements.txt

* Update benchmark_hf.py

* Delete conversion_tool_test.yml

* Update README.md

* Update README.md

* Update README.md

* move python --> python3

* try to fix test

* fix test case

* add test cases

* Update README.md

* update

* update logging config

* fix logging config

Co-authored-by: ZheyuYe <[email protected]>
sxjscience and zheyuye authored Aug 14, 2020
1 parent 9e268c0 commit 32e87d4
Showing 46 changed files with 2,008 additions and 313 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/unittests.yml
@@ -35,7 +35,7 @@ jobs:
python -m pip install --user --upgrade pip
python -m pip install --user setuptools pytest pytest-cov contextvars
python -m pip install --upgrade cython
-python -m pip install --pre --user "mxnet>=2.0.0b20200716" -f https://dist.mxnet.io/python
+python -m pip install --pre --user "mxnet>=2.0.0b20200802" -f https://dist.mxnet.io/python
python -m pip install --user -e .[extras]
- name: Test project
run: |
23 changes: 10 additions & 13 deletions README.md
@@ -20,35 +20,32 @@ First of all, install the latest MXNet. You may use the following commands:

```bash
# Install the version with CUDA 10.0
-pip install -U --pre "mxnet-cu100>=2.0.0b20200802" -f https://dist.mxnet.io/python
+python3 -m pip install -U --pre "mxnet-cu100>=2.0.0b20200802" -f https://dist.mxnet.io/python

# Install the version with CUDA 10.1
-pip install -U --pre "mxnet-cu101>=2.0.0b20200802" -f https://dist.mxnet.io/python
+python3 -m pip install -U --pre "mxnet-cu101>=2.0.0b20200802" -f https://dist.mxnet.io/python

# Install the version with CUDA 10.2
-pip install -U --pre "mxnet-cu102>=2.0.0b20200802" -f https://dist.mxnet.io/python
+python3 -m pip install -U --pre "mxnet-cu102>=2.0.0b20200802" -f https://dist.mxnet.io/python

# Install the cpu-only version
-pip install -U --pre "mxnet>=2.0.0b20200802" -f https://dist.mxnet.io/python
+python3 -m pip install -U --pre "mxnet>=2.0.0b20200802" -f https://dist.mxnet.io/python
```


-To install, use
+To install GluonNLP, use

```bash
-pip install -U -e .
+python3 -m pip install -U -e .

# Also, you may install all the extra requirements via
-pip install -U -e .[extras]
-
-# In case you are using zsh, try to use the following command for installing
-pip install -U -e ."[extras]"
+python3 -m pip install -U -e ."[extras]"
```

If you find that you do not have the permission, you can also install to the user folder:

```bash
-pip install -U -e . --user
+python3 -m pip install -U -e . --user
```

For Windows users, we recommend to use the [Windows Subsystem for Linux](https://docs.microsoft.com/en-us/windows/wsl/about).
@@ -68,8 +65,8 @@ nlp_data help
nlp_preprocess help

# Also, you can use `python -m` to access the toolkits
-python -m gluonnlp.cli.data help
-python -m gluonnlp.cli.preprocess help
+python3 -m gluonnlp.cli.data help
+python3 -m gluonnlp.cli.preprocess help

```

12 changes: 7 additions & 5 deletions docs/_static/custom.css
@@ -20,9 +20,11 @@
}

@media (max-width: 650px) {
.install .option, .install .title {
width: 90%;
}
.install .title {
margin-top: 1em;
.install .option, .install .title {
width: 90%;
}

.install .title {
margin-top: 1em;
}
}
45 changes: 45 additions & 0 deletions scripts/benchmarks/README.md
@@ -0,0 +1,45 @@
# Benchmarking the Performance of NLP Backbones

We benchmark the latency and peak memory usage of a single training step (forward + backward) and a single inference step (forward only)
for each NLP backbone.
For comparison, we also report the same numbers for the corresponding models in HuggingFace.
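
As a rough illustration of what timing a single step involves, here is a minimal sketch in plain Python. It is not the code in `benchmark_utils.py`: the names `measure_latency` and `dummy_step`, and the warm-up and repeat counts, are hypothetical and chosen only for this example.

```python
import statistics
import time


def measure_latency(step_fn, num_warmup=3, num_repeat=10):
    """Return the mean wall-clock latency (in seconds) of calling step_fn()."""
    # Warm-up calls are excluded from the timing so that one-off costs
    # (lazy initialization, caches) do not distort the result.
    for _ in range(num_warmup):
        step_fn()
    timings = []
    for _ in range(num_repeat):
        start = time.perf_counter()
        step_fn()
        timings.append(time.perf_counter() - start)
    return statistics.mean(timings)


def dummy_step():
    # Hypothetical stand-in for a forward (or forward + backward) pass.
    sum(i * i for i in range(10_000))


latency = measure_latency(dummy_step)
print('mean latency: {:.6f}s'.format(latency))
```

With an asynchronous engine such as MXNet, `step_fn` would also need to wait for the computation to finish (for example with a synchronization call such as `mx.nd.waitall()`) before the timer stops; otherwise only the time to enqueue the work is measured.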

## Backbones in HuggingFace

We use the [HuggingFace benchmark](https://github.com/huggingface/transformers/tree/master/examples/benchmarking)
to measure the training and inference speed of common NLP workloads.

```bash
python3 -m pip install -U -r requirements.txt --user
python3 benchmark_hf.py
```

This generates the following CSV files:

```
├── pytorch_train_fp32.csv
├── pytorch_train_fp16.csv
├── pytorch_infer_fp32.csv
├── pytorch_infer_fp16.csv
├── pytorch_infer_fp32_ts.csv
```

## GluonNLP Backbones based on MXNet-2.0

We profile three options: `NT` layout, `NT` layout with `TN` as the compute layout,
and `TN` layout. Here `N` is the batch axis and `T` is the sequence (time) axis.
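
To make the two layouts concrete, the sketch below transposes a tiny batch from `NT` (batch-major) to `TN` (time-major) using plain Python lists. The helper name `nt_to_tn` and the token ids are made up for illustration; the real models operate on MXNet arrays, where this is an ordinary axis transpose.

```python
def nt_to_tn(batch):
    """Transpose a batch-major (N, T) nested list into time-major (T, N)."""
    return [list(step) for step in zip(*batch)]


# Two sequences (N = 2) with three token ids each (T = 3); the ids are made up.
nt_batch = [[101, 7592, 102],
            [101, 2088, 102]]
tn_batch = nt_to_tn(nt_batch)
print(tn_batch)  # [[101, 101], [7592, 2088], [102, 102]]
```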

```bash
python3 -m pip install -U -r requirements.txt --user
bash benchmark_gluonnlp.sh
```

This generates CSV files prefixed with `gluonnlp_`:
```
├── gluonnlp_train_fp32_NT_NT.csv
├── gluonnlp_train_fp32_NT_TN.csv
├── gluonnlp_train_fp32_TN_TN.csv
├── gluonnlp_infer_fp32_NT_NT.csv
├── gluonnlp_infer_fp32_NT_TN.csv
├── gluonnlp_infer_fp32_TN_TN.csv
```
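
Once the files exist, the three layouts can be compared per workload. The following is a minimal sketch using only the standard library. It assumes each CSV has the `model`, `batch_size`, `sequence_length`, and `latency` columns that `benchmark_gluonnlp.py` sets up, and that every `latency` value is numeric; the helpers `load_latencies` and `fastest_layout` are hypothetical names for this example.

```python
import csv
import os

LAYOUTS = ['NT_NT', 'NT_TN', 'TN_TN']


def load_latencies(path):
    """Map (model, batch_size, sequence_length) -> latency for one CSV file."""
    latencies = {}
    with open(path, newline='') as f:
        for row in csv.DictReader(f):
            # int(float(...)) tolerates values written as "8" or "8.0".
            key = (row['model'],
                   int(float(row['batch_size'])),
                   int(float(row['sequence_length'])))
            latencies[key] = float(row['latency'])
    return latencies


def fastest_layout(results):
    """Given {layout: {key: latency}}, return {key: layout with lowest latency}."""
    best = {}
    for layout, latencies in results.items():
        for key, latency in latencies.items():
            if key not in best or latency < results[best[key]][key]:
                best[key] = layout
    return best


if __name__ == '__main__':
    results = {}
    for layout in LAYOUTS:
        path = 'gluonnlp_infer_fp32_{}.csv'.format(layout)
        if os.path.exists(path):  # skip layouts that were not benchmarked
            results[layout] = load_latencies(path)
    for key, layout in sorted(fastest_layout(results).items()):
        print(key, layout)
```

Rows that record a failed run (for example an out-of-memory workload) would need extra handling before the `float` conversion.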
130 changes: 130 additions & 0 deletions scripts/benchmarks/benchmark_gluonnlp.py
@@ -0,0 +1,130 @@
import mxnet as mx
import argparse
import os
import pandas as pd
from benchmark_utils import GluonNLPBackboneBenchmark
import multiprocessing as mp
from multiprocessing import Process
mx.npx.set_np()


MODELS = [
    'google_en_uncased_bert_base',
    'google_en_uncased_bert_large',
    'google_albert_base_v2',
    'google_albert_large_v2',
    'google_albert_xlarge_v2',
    'google_albert_xxlarge_v2',
    'google_electra_small',
    'google_electra_base',
    'google_electra_large',
    'google_uncased_mobilebert',
    'fairseq_bart_base',
    'fairseq_bart_large'
]

# (batch_size, seq_length)
train_workloads =\
    [(4, 128),
     (8, 128),
     (16, 128),
     (32, 128),
     (1, 512),
     (2, 512),
     (4, 512),
     (8, 512)]


inference_workloads = [
    (1, 128),
    (1, 384),
    (1, 512),
    (8, 32),
    (8, 128),
    (8, 512),
    (32, 512),
    (256, 128),
    (400, 100),
]


def get_parser():
    parser = argparse.ArgumentParser(description='Process some integers.')
    parser.add_argument('--layout', type=str, default='NT',
                        help='The layout of the computation')
    parser.add_argument('--compute_layout', type=str, default=None,
                        help='The compute layout of the computation')
    parser.add_argument('--mode', type=str, default='train',
                        choices=['train', 'inference'])
    return parser


def run_benchmark(workload, model_name, out_file_name, is_train):
    if is_train:
        benchmark = GluonNLPBackboneBenchmark(
            workloads=workload,
            model_names=model_name,
            profile_inference=False,
            profile_train=True,
            to_csv=True,
            train_out_csv_file=out_file_name)
        benchmark.run()
    else:
        benchmark = GluonNLPBackboneBenchmark(
            workloads=workload,
            model_names=model_name,
            profile_inference=True,
            profile_train=False,
            to_csv=True,
            inference_out_csv_file=out_file_name)
        benchmark.run()
    return


if __name__ == '__main__':
    mp.set_start_method('spawn')
    parser = get_parser()
    args = parser.parse_args()
    if args.compute_layout is None:
        args.compute_layout = args.layout
    for layout, compute_layout in [(args.layout, args.compute_layout)]:
        if compute_layout != layout:
            profile_models = [ele for ele in MODELS if 'bart' not in ele]
        else:
            profile_models = [ele for ele in MODELS]
        if args.mode == 'inference':
            out_dir = 'infer_fp32_{}_{}'.format(layout, compute_layout)
            df = pd.DataFrame(columns=['model', 'batch_size', 'sequence_length',
                                       'latency', 'memory'])
            os.makedirs(out_dir, exist_ok=True)
            for model_name in profile_models:
                for workload in inference_workloads:
                    out_path = os.path.join(out_dir, '{}_{}_{}.csv'.format(model_name, workload[0],
                                                                           workload[1]))
                    process = Process(
                        target=run_benchmark,
                        args=(workload, model_name, out_path, False))
                    process.start()
                    process.join()
                    new_df = pd.read_csv(out_path)
                    df = df.append(new_df, ignore_index=True)
            df.to_csv('gluonnlp_infer_fp32_{}_{}.csv'.format(layout, compute_layout))
        elif args.mode == 'train':
            out_dir = 'train_fp32_{}_{}'.format(layout, compute_layout)
            df = pd.DataFrame(columns=['model', 'batch_size', 'sequence_length',
                                       'latency', 'memory'])
            os.makedirs(out_dir, exist_ok=True)
            for model_name in profile_models:
                for workload in train_workloads:
                    out_path = os.path.join(out_dir, '{}_{}_{}.csv'.format(model_name, workload[0],
                                                                           workload[1]))
                    process = Process(
                        target=run_benchmark,
                        args=(workload, model_name, out_path, True))
                    process.start()
                    process.join()
                    new_df = pd.read_csv(out_path)
                    df = df.append(new_df, ignore_index=True)
            df.to_csv('gluonnlp_train_fp32_{}_{}.csv'.format(layout, compute_layout))
        else:
            raise NotImplementedError
14 changes: 14 additions & 0 deletions scripts/benchmarks/benchmark_gluonnlp.sh
@@ -0,0 +1,14 @@
for mode in train inference
do
    python3 benchmark_gluonnlp.py --layout NT --compute_layout NT --mode $mode
done

for mode in train inference
do
    python3 benchmark_gluonnlp.py --layout NT --compute_layout TN --mode $mode
done

for mode in train inference
do
    python3 benchmark_gluonnlp.py --layout TN --compute_layout TN --mode $mode
done