text-localization-agent

The code to train and evaluate the agent.

Prerequisites

You need Python 3 (preferably 3.6) installed, as well as the requirements from requirements.txt:

$ pip install -r requirements.txt

To evaluate the agent, we require Object-Detection-Metrics. Because it is not available as a pip package, you need to make it available as a module named detection_metrics in the agent directory:

git clone https://github.com/rafaelpadilla/Object-Detection-Metrics
cp -r Object-Detection-Metrics/lib text-localization-agent/detection_metrics

Furthermore, you need to install the text-localization-environment by following its Installation instructions.

Usage

Training an agent requires different files based on the dataset used. For the simple dataset, you need two files:

A text file where each line contains the path to one image in the training dataset
A numpy file (.npy) that contains the bounding boxes associated with each image. For n images this file contains a list with n entries where each entry is a list of bounding boxes in the format ((x_topleft, y_topleft), (x_bottomright, y_bottomright))

Datasets generated by the dataset generator fulfill these requirements.
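If you want to build the two files for the simple dataset by hand, the following is a minimal sketch. The helper name `write_simple_dataset` and the output file names are assumptions made for illustration, not part of this repository. Storing the boxes as a NumPy object array is also an assumption (it allows a different number of boxes per image); check the dataset loader if it expects a different layout.

```python
import os

import numpy as np


def write_simple_dataset(image_paths, boxes_per_image, out_dir):
    """Write the image list and the bounding-box file for the simple dataset.

    boxes_per_image[i] is the list of boxes for image_paths[i]; each box is
    ((x_topleft, y_topleft), (x_bottomright, y_bottomright)).
    """
    if len(image_paths) != len(boxes_per_image):
        raise ValueError("need exactly one list of boxes per image")

    # Hypothetical file names; use whatever your config points to.
    list_path = os.path.join(out_dir, "image_locations.txt")
    boxes_path = os.path.join(out_dir, "bounding_boxes.npy")

    # One image path per line.
    with open(list_path, "w") as handle:
        for path in image_paths:
            handle.write(path + "\n")

    # Object array: one entry per image, each entry a list of boxes.
    boxes = np.empty(len(boxes_per_image), dtype=object)
    for index, image_boxes in enumerate(boxes_per_image):
        boxes[index] = [tuple(tuple(corner) for corner in box) for box in image_boxes]
    np.save(boxes_path, boxes, allow_pickle=True)

    return list_path, boxes_path
```

Because the array has dtype `object`, it has to be read back with `np.load(path, allow_pickle=True)`.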

You specify the dataset in a config file similar to example.ini using the dataset and dataset_path options. The dataset loader currently supports simple, sign, and synthtext.
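As a rough illustration of how these options fit together, the sketch below parses an INI snippet and checks the dataset name. The section layout and the example values are assumptions; only the option names `dataset`, `dataset_path`, `experiment_id` and `use_tensorboard` come from this README, so treat example.ini as the reference for the real layout.

```python
import configparser

# Dataset names the loader currently supports, as listed in this README.
SUPPORTED_DATASETS = ("simple", "sign", "synthtext")

# Hypothetical config contents; the real example.ini may be organized differently.
EXAMPLE_INI = """
[DEFAULT]
experiment_id = my_experiment
dataset = simple
dataset_path = ./data/simple
use_tensorboard = True
"""


def read_dataset_options(ini_text):
    """Return (dataset, dataset_path, use_tensorboard) from INI text."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser["DEFAULT"]

    dataset = section.get("dataset")
    if dataset not in SUPPORTED_DATASETS:
        raise ValueError(
            "unsupported dataset %r, expected one of %s" % (dataset, SUPPORTED_DATASETS)
        )

    dataset_path = section.get("dataset_path")
    use_tensorboard = section.getboolean("use_tensorboard", fallback=False)
    return dataset, dataset_path, use_tensorboard
```

Validating the dataset name before training starts surfaces a typo in the config immediately instead of during data loading.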

Overview of python executables:

| File | Purpose | Example Command |
| --- | --- | --- |
| `train_agent.py` | Train the agent. This creates a new folder in `./experiments/<experiment_id>` using the `experiment_id` specified in the config. This output directory contains saved models, logfiles (incl. tensorboard logs) and training plots. | `python train_agent.py --config ./config/example.ini` |
| `eval_agent.py` | Evaluate the agent. Tests the agent with `n` images and creates a report under `<path-to-experiment>/evaluation` containing metrics and visualizations. | `python eval_agent.py -e <path-to-experiment> --config ./config/example.ini` |
| `scripts/render_graph.py` | Creates a .dot file of the computational graph of the agent. | `python ./scripts/visualize_agent_graph.py --config ./config/example.ini` |

Monitoring with TensorBoard

If you would like the program to generate log-files appropriate for visualization in TensorBoard, you need to:

Install tensorflow
```
pip install tensorflow
```
(If you use Python 3.7 and the installation fails, use `pip install --upgrade https://storage.googleapis.com/tensorflow/mac/cpu/tensorflow-1.12.0-py3-none-any.whl` instead. See here for the reason.)
Run the text-localization-agent program with the --tensorboard flag (or use_tensorboard = True in the config)
```
python train_agent.py --config ./config/example.ini --tensorboard
```
Start TensorBoard pointing to the tensorboard/ directory inside the experiment's directory
```
tensorboard --logdir=<path-to-text-localization-agent>/experiments/<experiment>/tensorboard/
```
When running tensorboard on the chair's server, you need to additionally pass --bind_all with the command.

If you want to compare multiple experiments, you need to copy their tensorboard logs into a single folder and point tensorboard to it.
Open the TensorBoard UI via the link that is provided when the tensorboard program is started (http://localhost:6006 or http://<hostname>:6006)
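Copying the logs of several experiments into one folder can be scripted. The following is a minimal sketch; the function name `collect_tensorboard_logs` and the target folder are assumptions, and it relies only on the `tensorboard/` subdirectory inside each experiment directory described above.

```python
import shutil
from pathlib import Path


def collect_tensorboard_logs(experiment_dirs, combined_dir):
    """Copy each experiment's tensorboard/ folder into combined_dir/<experiment name>."""
    combined = Path(combined_dir)
    combined.mkdir(parents=True, exist_ok=True)

    copied = []
    for experiment in map(Path, experiment_dirs):
        source = experiment / "tensorboard"
        if not source.is_dir():
            # Experiment was trained without TensorBoard logging; skip it.
            continue
        target = combined / experiment.name
        if target.exists():
            # Replace an older copy so reruns do not fail.
            shutil.rmtree(str(target))
        shutil.copytree(str(source), str(target))
        copied.append(target)
    return copied
```

Afterwards, point TensorBoard at the combined folder with `tensorboard --logdir=<combined_dir>`; each experiment then appears as a separate run.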

Training on the chair's servers

To run the training on one of the chair's servers you need to:

Clone the necessary repositories
Create a new virtual environment. Note that the Python version needs to be at least 3.6 for everything to run. The default might be a lower version so if that is the case you must make sure that the correct version is used. You can pass the correct python version to virtualenv via the -p parameter, for example
```
$ virtualenv -p python3.6 <envname>
```
Activate the environment via
```
$ source <envname>/bin/activate
```
Install the required packages (see section "Prerequisites"). Don't forget cupy, tb_chainer and tensorflow!
Prepare the training data (either generate it using the dataset-generator or transfer existing data to the server)
To avoid stopping the training after disconnecting from the server, you might want to use a terminal-multiplexer such as tmux or screen
Set the CUDA_PATH and LD_LIBRARY_PATH variables if they are not already set. The command should be something like
```
$ export CUDA_PATH=/usr/local/cuda
$ export LD_LIBRARY_PATH=$CUDA_PATH/lib64:$LD_LIBRARY_PATH
```
Download the ResNet-50 caffemodel (it isn't downloaded automatically; see link) and save it where necessary (an error will tell you where if you try to create a TextLocEnv).
Start training!

These instructions are for starting from scratch. If, for example, a suitable virtual environment already exists, you don't need to create a new one.
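Before starting a long training run on a server, it can help to check the points from the list above in one go. The sketch below is a hypothetical helper (`check_training_environment` is not part of this repository) that only verifies the Python version and the two CUDA-related environment variables mentioned above.

```python
import os
import sys


def check_training_environment(environ=None, version=None):
    """Return a list of problems found; an empty list means the checks passed."""
    environ = os.environ if environ is None else environ
    version = sys.version_info[:2] if version is None else tuple(version)

    problems = []

    # The README requires at least Python 3.6.
    if version < (3, 6):
        problems.append("Python %d.%d found, but at least 3.6 is required" % version)

    cuda_path = environ.get("CUDA_PATH")
    if not cuda_path:
        problems.append("CUDA_PATH is not set")
    else:
        # LD_LIBRARY_PATH should contain $CUDA_PATH/lib64.
        lib_dir = cuda_path.rstrip("/") + "/lib64"
        entries = [
            entry.rstrip("/")
            for entry in environ.get("LD_LIBRARY_PATH", "").split(":")
            if entry
        ]
        if lib_dir not in entries:
            problems.append("LD_LIBRARY_PATH does not contain " + lib_dir)

    return problems
```

Calling `check_training_environment()` without arguments inspects the current interpreter and environment; printing the returned list inside your tmux or screen session shows what still needs to be fixed.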
