The code to train and evaluate the agent.
You need Python 3 (preferably 3.6) installed, as well as the requirements from requirements.txt:
$ pip install -r requirements.txt
To evaluate the agent, we require Object-Detection-Metrics. Because it is not available as a pip package, you need to make it available as a module named detection_metrics in the agent directory:
git clone https://github.com/rafaelpadilla/Object-Detection-Metrics
cp -r Object-Detection-Metrics/lib text-localization-agent/detection_metrics
Furthermore, you need to install the text-localization-environment by following its Installation instructions.
Training an agent requires different files based on the dataset used. For the simple dataset, you need two files:
- A text file where each line contains the path to one image in the training dataset
- A NumPy file (.npy) that contains the bounding boxes associated with each image. For n images, this file contains a list with n entries, where each entry is a list of bounding boxes in the format ((xtopleft, ytopleft), (xbottomright, ybottomright))
Datasets generated by the dataset generator fulfill these requirements.
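As a minimal sketch of how these two files could be produced by hand: the file names train_images.txt and train_boxes.npy, the example data, and all helper names below are hypothetical and not part of this repository. Saving the boxes as a NumPy object array is also an assumption about what the loader accepts, so compare the result against a file produced by the dataset generator before relying on it.

```python
import os
import tempfile


def validate_boxes(boxes_per_image):
    """Check that every box is ((x_top_left, y_top_left), (x_bottom_right, y_bottom_right))."""
    for boxes in boxes_per_image:
        for (x1, y1), (x2, y2) in boxes:
            # In image coordinates the bottom-right corner has the larger x and y.
            if x2 <= x1 or y2 <= y1:
                raise ValueError(f"degenerate box: {((x1, y1), (x2, y2))}")
    return True


def write_image_list(image_paths, list_file):
    """Write one image path per line."""
    with open(list_file, "w") as f:
        for path in image_paths:
            f.write(path + "\n")


def save_boxes(boxes_per_image, npy_file):
    """Save one list of boxes per image; assumes NumPy is installed."""
    import numpy as np  # imported here so the helpers above work without NumPy

    # An object array keeps images with different numbers of boxes intact.
    entries = np.empty(len(boxes_per_image), dtype=object)
    for i, boxes in enumerate(boxes_per_image):
        entries[i] = boxes
    np.save(npy_file, entries, allow_pickle=True)


# Hypothetical example data: two images, the second with two boxes.
image_paths = ["images/0.png", "images/1.png"]
boxes_per_image = [
    [((10, 20), (110, 60))],
    [((5, 5), (50, 30)), ((60, 40), (120, 80))],
]

out_dir = tempfile.mkdtemp()
list_file = os.path.join(out_dir, "train_images.txt")
validate_boxes(boxes_per_image)
write_image_list(image_paths, list_file)
# Uncomment once NumPy is installed:
# save_boxes(boxes_per_image, os.path.join(out_dir, "train_boxes.npy"))
```

The entry at index i of the .npy file must belong to the image on line i of the text file, so both should be written from the same ordered data.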
You specify the dataset in a config file similar to example.ini using the dataset and dataset_path options. The dataset loader currently supports simple, sign, and synthtext.
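To illustrate the two options, the sketch below parses a config with Python's configparser and checks the dataset value against the supported loaders. The section name [DATASET], the path value, and the helper name are assumptions made for illustration; check example.ini for the actual layout.

```python
import configparser

SUPPORTED_DATASETS = {"simple", "sign", "synthtext"}

# Hypothetical config text; the section name and the path are assumptions.
EXAMPLE_CONFIG = """
[DATASET]
dataset = simple
dataset_path = ./data/simple
"""


def read_dataset_options(config_text, section="DATASET"):
    """Return (dataset, dataset_path) after validating the dataset name."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    dataset = parser.get(section, "dataset")
    dataset_path = parser.get(section, "dataset_path")
    if dataset not in SUPPORTED_DATASETS:
        raise ValueError(f"unsupported dataset: {dataset!r}")
    return dataset, dataset_path


dataset, dataset_path = read_dataset_options(EXAMPLE_CONFIG)
print(dataset, dataset_path)
```

Checking the dataset name up front turns a typo in the config into an immediate error rather than a failure later in the loader.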
Overview of Python executables:
File | Purpose | Example Command |
---|---|---|
train_agent.py | Train the agent. This creates a new folder in ./experiments/<experiment_id> using the experiment_id specified in the config. This output directory contains saved models, logfiles (incl. tensorboard logs) and training plots. | python train_agent.py --config ./config/example.ini |
eval_agent.py | Evaluate the agent. Tests the agent with n images and creates a report under <path-to-experiment>/evaluation containing metrics and visualizations. | python eval_agent.py -e <path-to-experiment> --config ./config/example.ini |
scripts/render_graph.py | Creates a .dot file of the computational graph of the agent | python ./scripts/visualize_agent_graph.py --config ./config/example.ini |
If you would like the program to generate log-files appropriate for visualization in TensorBoard, you need to:
- Install tensorflow:
  $ pip install tensorflow
  If you use Python 3.7 and the installation fails, use the following instead:
  $ pip install --upgrade https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.12.0-py3-none-any.whl
- Run the text-localization-agent program with the --tensorboard flag (or set use_tensorboard = True in the config):
  $ python train_agent.py --config ./config/example.ini --tensorboard
- Start TensorBoard, pointing it to the tensorboard/ directory inside the experiment's directory:
  $ tensorboard --logdir=<path-to-text-localization-agent>/experiments/<experiment>/tensorboard/
  When running tensorboard on the chair's server, you additionally need to pass --bind_all with the command. If you want to compare multiple experiments, copy their tensorboard logs into a single folder and point tensorboard to it.
- Open the TensorBoard UI via the link that is printed when the tensorboard program starts (http://localhost:6006 or http://<hostname>:6006).
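Comparing several experiments requires their tensorboard logs to be copied into a single folder. A minimal sketch of that step is shown here. The function name, the combined_logs folder name, and the throwaway demo directories are hypothetical; only the experiments/<experiment>/tensorboard layout comes from the instructions above.

```python
import os
import shutil
import tempfile


def collect_tensorboard_logs(experiment_dirs, combined_dir):
    """Copy each experiment's tensorboard/ folder into combined_dir/<experiment name>/."""
    copied = []
    for experiment_dir in experiment_dirs:
        source = os.path.join(experiment_dir, "tensorboard")
        if not os.path.isdir(source):
            continue  # no tensorboard/ folder, e.g. logging was disabled
        name = os.path.basename(os.path.normpath(experiment_dir))
        target = os.path.join(combined_dir, name)
        # copytree creates the target (and its parents) and fails if it already exists.
        shutil.copytree(source, target)
        copied.append(target)
    return copied


# Demonstration with throwaway directories standing in for real experiments.
root = tempfile.mkdtemp()
experiments = []
for name in ("exp_a", "exp_b"):
    log_dir = os.path.join(root, "experiments", name, "tensorboard")
    os.makedirs(log_dir)
    open(os.path.join(log_dir, "events.out.tfevents.demo"), "w").close()
    experiments.append(os.path.join(root, "experiments", name))

combined = os.path.join(root, "combined_logs")
copied = collect_tensorboard_logs(experiments, combined)
print(sorted(os.path.basename(path) for path in copied))
```

Pointing tensorboard --logdir at the combined folder should then list each subfolder as a separate run, which keeps the experiments distinguishable in the UI.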
To run the training on one of the chair's servers you need to:
- Clone the necessary repositories.
- Create a new virtual environment. Note that the Python version needs to be at least 3.6 for everything to run. The default might be a lower version; if so, make sure that the correct version is used. You can pass the correct Python version to virtualenv via the -p parameter, for example:
  $ virtualenv -p python3.6 <envname>
- Activate the environment via:
  $ source <envname>/bin/activate
- Install the required packages (see section "Prerequisites"). Don't forget cupy, tb_chainer and tensorflow!
- Prepare the training data (either generate it using the dataset-generator or transfer existing data to the server).
- To avoid stopping the training when you disconnect from the server, you might want to use a terminal multiplexer such as tmux or screen.
- Set the CUDA_PATH and LD_LIBRARY_PATH variables if they are not already set. The commands should be something like:
  $ export CUDA_PATH=/usr/local/cuda
  $ export LD_LIBRARY_PATH=$CUDA_PATH/lib64:$LD_LIBRARY_PATH
- Download the ResNet-50 caffemodel (it isn't downloaded automatically) and save it where necessary (an error will tell you where if you try to create a TextLocEnv).
- Start training!
These instructions assume you are starting from scratch. If, for example, a suitable virtual environment already exists, you don't need to create a new one.
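Before starting a long training run, it can help to confirm the Python version and the CUDA variables from the steps above. The following is a rough sketch of such a check. The function name and the messages are hypothetical, and the check assumes that LD_LIBRARY_PATH should contain $CUDA_PATH/lib64, as in the export example.

```python
import os
import sys


def check_server_setup(environ, version_info=sys.version_info):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []
    if tuple(version_info[:2]) < (3, 6):
        problems.append("Python 3.6 or newer is required")
    cuda_path = environ.get("CUDA_PATH")
    if not cuda_path:
        problems.append("CUDA_PATH is not set")
    library_path = environ.get("LD_LIBRARY_PATH", "")
    if cuda_path:
        # Assumption: the CUDA libraries live in $CUDA_PATH/lib64.
        expected = os.path.join(cuda_path, "lib64")
        if expected not in library_path.split(os.pathsep):
            problems.append("LD_LIBRARY_PATH does not contain " + expected)
    elif not library_path:
        problems.append("LD_LIBRARY_PATH is not set")
    return problems


for problem in check_server_setup(os.environ):
    print("WARNING:", problem)
```

Running this inside the activated virtual environment checks the interpreter that training will actually use.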