Skip to content

Commit

Permalink
first and final commit
Browse files Browse the repository at this point in the history
  • Loading branch information
timurcarstensen committed Aug 7, 2022
1 parent 2e9c28b commit a7f6aa1
Show file tree
Hide file tree
Showing 264 changed files with 38,317 additions and 1 deletion.
23 changes: 23 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
node_modules
typings
*.pyc
.DS_Store
package-lock.json
mtp-ai-turing-tumble.iml
.idea/
out
reinforcement_learning/wandb
reinforcement_learning/tmp
/out/
/reinforcement_learning/tmp/
.idea/*
reinforcement_learning/wandb/*
out/*
reinforcement_learning/tmp/*
*/META-INF/*
MANIFEST.MF
wandb_key_file
LogFile.txt
State.txt
/reinforcement_learning/dataset_generators/rl_training_set.csv
/reinforcement_learning/environments/envs/bugbit_env_backup.py
2 changes: 1 addition & 1 deletion LICENSE
Original file line number Diff line number Diff line change
Expand Up @@ -18,4 +18,4 @@ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
SOFTWARE.
64 changes: 64 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
# European Master Team Project - AI Turing Tumble

> This repository holds the code that was developed during the European Master Team Project
> in the spring semester of 2022 (EMTP 22). The project was supervised by
> [Dr. Christian Bartelt](https://www.uni-mannheim.de/en/ines/about-us/researchers/dr-christian-bartelt/) and
> [Jannik Brinkmann](https://www.linkedin.com/in/brinkmann-jannik/). The project team was
> composed of students from the [Babeș-Bolyai University](https://www.ubbcluj.ro/en/)
> in Cluj-Napoca, Romania, and the [University of Mannheim](https://www.uni-mannheim.de/), Germany.
## Introduction

In the game Turing Tumble, players construct mechanical computers that use the flow of marbles along a board to solve
logic problems. As the board and its parts are Turing complete, which means that they can be used to express any
mathematical function, an intelligent agent taught to solve a Turing Tumble challenge essentially learns how to write
code according to a given specification.

Following this logic, we taught an agent how to write a simple programme according to a minimal specification, using
an abstracted version of the Turing Tumble board as reinforcement learning training environment. This is related to
the emerging field of programme synthesis, as is for example applied in
[GitHub’s CoPilot](https://github.com/features/copilot).

## Participants

### Babeș-Bolyai University

* [Tudor Esan](https://github.com/TudorEsan) - B.Sc. Computer Science
* [Raluca Diana Chis](https://github.com/RalucaChis) - M.Sc. Applied Computational Intelligence

### University of Mannheim

* [Roman Hess](https://github.com/romanhess98) - M.Sc. Data Science
* [Timur Carstensen](https://github.com/timurcarstensen) - M.Sc. Data Science
* [Julie Naegelen](https://github.com/jnaeg) - M.Sc. Data Science
* [Tobias Sesterhenn](https://github.com/Tsesterh) - M.Sc. Data Science

## Contents of this repository

The project directory is organised in the following way:

| Path | Role |
|---------------------------|----------------------------------------------|
| `docs/` | Supporting material to document the project |
| `reinforcement_learning/` | Everything related to Reinforcement Learning |
| `src/` | Java sources |
| `ttsim/` | Source Code of the Turing Tumble Simulator |

## Weights & Biases (wandb)
We used [Weights & Biases](https://wandb.ai/) to log the results of our training:
1. [Reinforcement Learning](https://wandb.ai/mtp-ai-board-game-engine/ray-tune-bugbit)
2. [Pretraining](https://wandb.ai/mtp-ai-board-game-engine/Pretraining)
3. [Connect Four](https://wandb.ai/mtp-ai-board-game-engine/connect-four)

## Credits

We used third-party software to implement the project. Namely:

- **BugPlus** - [Dr. Christian Bartelt](https://www.uni-mannheim.de/en/ines/about-us/researchers/dr-christian-bartelt/)
- **Turing Tumble Simulator** - [Jesse Crossen](https://github.com/jessecrossen/ttsim)

## Final Project Presentation

Link to video:
[![Final Project Presentation Video](https://img.youtube.com/vi/w501gf2MLFM/0.jpg)](https://www.youtube.com/watch?v=w501gf2MLFM)

111 changes: 111 additions & 0 deletions docs/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,111 @@
# Setup

> **DISCLAIMER: this project is meant to be run on Linux or macOS machines. Windows is not supported
> due to a scheduler conflict between JPype and Ray.**
## Prerequisites

1. An x86 machine running Linux or macOS (tested on Ubuntu 20.04 and macOS Monterey)
2. A working, clean (i.e new / separate) conda ([miniconda3](https://docs.conda.io/en/latest/miniconda.html)
or [anaconda3](https://docs.anaconda.com/anaconda/install/)) installation
3. [IntelliJ IDEA](https://www.jetbrains.com/idea/) (CE / Ultimate)

## Installing dependencies

1. Create and activate a new conda environment for python3.8 (`conda create -n mtp python=3.8` & `conda activate mtp`)
2. Navigate to the project root and run `pip install -r requirements.txt`
3. Run `pip install "ray[all]"` to install all dependencies for Ray

## Python Setup

1. In IntelliJ, open the project structure dialogue: `File -> Project Structure`
2. In Modules, select the project and click `Add` and select `Python`
3. In the `Python` tab, add a new Python interpreter by clicking on `...`
4. In the newly opened dialogue, click on `+` and click `Add Python SDK...`
5. In the dialogue, click on `Conda environment` and select the existing environment we created in the previous step
6. Select the newly registered interpreter as the project interpreter and close out of the dialogue after
clicking `apply`
7. Still in `Project Structure`, navigate to `Modules` and select the project: select the directories `src`
and `reinforcement_learning` and mark them as `Sources`. Click `Apply` and close out of the dialogue.

## Compiling the project

1. Make sure that the SDK and Language Level in the Project tab are set to 17 (i.e. openjdk-17)
2. Open the Project Structure Dialogue in IntelliJ `File -> Project Structure`
3. Select `Artifacts`
4. Add a JAR file with dependencies
5. Click on the folder icon and select `CF_Translated` in the next dialogue and click OK
6. Click OK again and then in the artifacts overview, in the Output Layout tab, select the Python library and remove it
7. Click on apply and OK
8. Build the artifact: `Build -> Build Artifacts`
9. In `reinforcement_learning/utilities/utilities.py`, make sure that the variable `artifact_directory` is set to the
folder that contains the compiled artifact. The variable `artifact_file_name` should be set to the name of the jar
file. (cf. image below: `artifact_directory = mtp_testing_jar` and `artifact_file_name = mtp-testing.jar`)

<p align="center">
<img src="assets/out_directory.png" alt="example challenge" width="300"/>
</p>

## Weights & Biases

We used [Weights & Biases](https://wandb.ai/) (wandb) to document our training progress and results. All training files
in this project rely on wandb for logging. For the following you will need a wandb account and your wandb
[API key](https://docs.wandb.ai/quickstart):

1. Create a file named `wandb_key_file` in the `reinforcement_learning` directory
2. Paste your wandb API key in the newly created file
3. In the training files, adjust the wandb configuration to your wandb account (i.e. arguments such as `entity`,
`project`, and `group` should be modified accordingly)

## Running the project

> DISCLAIMER: the number of bugs used in pretraining, reinforcement learning, and the environment configuration **must**
> be the same
> To run the individual parts of the final project pipeline, follow the steps outlined below.
> DISCLAIMER: the generation of Pretraining and RL training sets may take some time for *num_bugs > 6* and a large
> number
> of samples.
### Pretraining

#### Pretraining Sample Generation

Run the `reinforcement_learning/dataset_generators > pretraining_dataset_generation.py` script in the terminal or
execute the file in IntelliJ.
The generated dataset(s) will be saved in `data/training_sets/pretraining_training_sets` as different `.pkl` files.
Each training set is identified by the number of bugs and whether multiple_actions are allowed or not. If multiple
actions
are allowed, this means that in the training set there are samples where more then one edge have to be removed from the
CF Matrix.

#### Model Pretraining

Run the `reinforcement_learning/custom_torch_models > rl_network_pretraining.py` script in the terminal or execute
the file in IntelliJ.
The script loads the created training dataset depending on the number of bugs and whether multiple actions are allowed.
After the pretraining, the model is saved in `data/model_weights`.

### Reinforcement Learning (RL)

The RL part of the project is split up into two components: sample generation and training.

#### Reinforcement Learning Sample Generation

To generate RL samples, adjust the number of bugs and number of samples in the main function of
`reinforcement_learning/dataset_generators > rl_trainingset_generation.py` and run the file. The generated training set
will be saved
as a `.pkl` file in `data/training_sets/rl_training_sets`.

#### Reinforcement Learning Training

Reinforcement learning training is done in `reinforcement_learning > train.py`:

1. Set the path to your RL training set from the previous step in `global_config["training_set_path]`
2. If you did generate a pretraining set and did pretrain, also adjust `global_config["pretrained_model_path"]`
3. ***Optional***: if you want to initialise the learner with the pretrained model's weights, set the config parameter
in `global_config["pretraining"]` to `True`.

### Connect Four

For Connect Four please refer to the [Connect Four](connect-four.md) documentation.
Binary file added docs/assets/Example_Challenge.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/assets/TT_architecture_diagram.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/assets/TT_env_subdiagram.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/assets/TT_transl_diagram.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
64 changes: 64 additions & 0 deletions docs/assets/blue_bit.svg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading

0 comments on commit a7f6aa1

Please sign in to comment.