BASELINE
This repository contains the baseline models used for Task A and Task B of SemEval Task 5 (MAMI).
Each baseline loads the challenge data (e.g. the CSV file containing the meme transcriptions and labels, and the raw images). The data paths are specified at the beginning of each script and can be customized according to the user's preferences.
Each baseline learns a model on the training data and makes predictions on the test set. The macro-average F1-measure is computed for Task A, while the weighted-average F1-measure is computed for Task B, both using the script 'evaluation.py'.
In addition to the prediction file, each baseline script produces a dump of the trained model, the performance measures, and an image showing the model structure.
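As an illustration of the two averaging strategies, here is a minimal sketch using scikit-learn (the toy labels are invented for illustration only; this is not the code of 'evaluation.py'):

from sklearn.metrics import f1_score

# Task A: one binary label per meme (misogynous or not) -> macro-average F1.
y_true_a = [1, 0, 0, 1, 1]
y_pred_a = [1, 0, 1, 1, 0]
print("Task A macro F1:", f1_score(y_true_a, y_pred_a, average="macro"))

# Task B: several binary sub-category labels per meme -> weighted-average F1.
y_true_b = [[1, 0, 0, 1], [0, 0, 0, 0], [0, 1, 0, 1]]
y_pred_b = [[1, 0, 0, 1], [0, 1, 0, 0], [0, 1, 0, 1]]
print("Task B weighted F1:", f1_score(y_true_b, y_pred_b, average="weighted"))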
EVALUATION
To estimate the performance measure for each subtask on the test instances, the evaluation script uses the gold labels contained in the 'truth.txt' file.
This file is structured as follows:
filename[tab]{0|1}[tab]{0|1}[tab]{0|1}[tab]{0|1}[tab]{0|1}
Example:
1.jpg 1 1 0 0 1
2.jpg 0 0 0 0 0
3.jpg 0 0 1 0 1
Model predictions are saved in 'answer.txt'.
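These tab-separated files can be read with a few lines of Python; the sketch below only illustrates the format and is not part of the released scripts:

# Sketch: read 'truth.txt' or 'answer.txt' into a dictionary
# mapping file name -> list of five 0/1 labels.
def read_labels(path):
    labels = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                parts = line.strip().split("\t")
                labels[parts[0]] = [int(v) for v in parts[1:]]
    return labels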
EXECUTION
To run 'evaluation.py' or any baseline script, the file 'truth.txt' must be placed in a folder named 'ref'. To execute the baselines, it is necessary to satisfy the requirements listed in the corresponding file ('requirements.txt'); they can be installed via the command "pip install -r requirements.txt".
To run each baseline model, the following items are required in the same folder:
- the evaluation script ('evaluation.py')
- the train and test data as CSV files
- the ground-truth file ('truth.txt') in the folder 'ref'
- the folder 'TRAINING' containing the memes provided by the challenge
The execution folder for evaluating the baselines should be structured as follows (a sketch for checking this layout is given after the tree):
input/
|_ baseline_name.py
|_ evaluation.py
|_ train.csv
|_ test.csv
|_ ref/
|  |_ truth.txt
|_ TRAINING/
   |_ 1.jpg
   |_ 2.jpg
   |_ ...
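A quick way to verify this layout before launching a baseline (a minimal sketch, assuming it is run from inside the 'input/' folder):

# Sketch: check that the items required by the baselines are in place.
import os

required = ["evaluation.py", "train.csv", "test.csv",
            os.path.join("ref", "truth.txt"), "TRAINING"]
missing = [p for p in required if not os.path.exists(p)]
if missing:
    raise SystemExit("Missing required items: " + ", ".join(missing))
print("Execution folder looks complete.")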
If you want to evaluate the performance of your own model, the folder containing 'evaluation.py' should also include a subfolder 'res' with the prediction file 'answer.txt' and a subfolder 'ref' with the gold labels in the file 'truth.txt'.
The execution folder for evaluating your model should be structured as follows (a sketch of the scoring step is given after the tree):
input/
|_ evaluation.py
|_ ref/
|  |_ truth.txt
|_ res/
   |_ answer.txt
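The scoring step could then look roughly like the sketch below. This is not the actual 'evaluation.py'; it assumes the first label column is the Task A flag and the remaining columns are the sub-category labels, and it requires scikit-learn:

# Sketch: score res/answer.txt against ref/truth.txt and write scores.txt.
from sklearn.metrics import f1_score

def read_labels(path):
    # file name -> list of 0/1 labels, as in the format shown above
    with open(path, encoding="utf-8") as f:
        rows = [line.strip().split("\t") for line in f if line.strip()]
    return {r[0]: [int(v) for v in r[1:]] for r in rows}

truth = read_labels("ref/truth.txt")
preds = read_labels("res/answer.txt")

names = sorted(truth)  # align predictions and gold labels by file name
y_true = [truth[n] for n in names]
y_pred = [preds[n] for n in names]

task_a = f1_score([r[0] for r in y_true], [r[0] for r in y_pred], average="macro")
task_b = f1_score([r[1:] for r in y_true], [r[1:] for r in y_pred], average="weighted")

with open("scores.txt", "w") as out:
    out.write("taskA_macro_f1: %.4f\n" % task_a)
    out.write("taskB_weighted_f1: %.4f\n" % task_b)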
OUTPUT
At the end of the execution, a folder for each specific baseline will be created. Each folder will contain:
- a dump of the model (.h5)
- a .png file showing the architecture of the model
- the scores obtained from the evaluation script ('scores.txt')
- a 'res' folder containing the predictions for each meme ('answer.txt')
The output folder will be structured as follows:
output/
|_ model.h5
|_ model.png
|_ scores.txt
|_ res/
   |_ answer.txt
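For a custom Keras baseline, these artefacts could be produced roughly as follows (a minimal sketch with a toy model; the real baselines build and train their own models, and 'plot_model' additionally requires pydot and graphviz to be installed):

# Sketch: write the output artefacts listed above for a (toy) Keras model.
import os
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import plot_model

model = Sequential([Dense(1, activation="sigmoid", input_shape=(10,))])  # toy stand-in

os.makedirs("output/res", exist_ok=True)
model.save("output/model.h5")                  # dump of the trained model
plot_model(model, to_file="output/model.png")  # image of the model architecture

with open("output/scores.txt", "w") as f:      # performance measures (placeholder value)
    f.write("taskA_macro_f1: 0.0\n")

with open("output/res/answer.txt", "w") as f:  # predictions, one tab-separated line per meme
    f.write("1.jpg\t1\t0\t0\t0\t0\n")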