* changes added for masking- flag option in compile_llama.sh
* frontend added, not tested
* added num_threads in OnnxBridge->LLAMA and updated ezpc-cli
* num_threads added to ezpc-cli aswell
* readme updated for ezpc-cli num_threads
* ezpc-cli-spp script added, yes to test complete setup
* Readme updated
* onnxbridge bug fix in sytorchbackendrep
* testing changes for mlinf- revert before merge
* testing changes for mlinf- revert before merge
* dealer.py bug fix [cause: frontend merge]
* added chmod +x in ezpc scripts & added preprocess.py
* testing changes for mlinf- revert before merge
* testing changes for mlinf- revert before merge
* fixed dealer bug & app.py code added
* changes from mling repo, gpt-branch commit
* minor bug fix in tensor.h
* inference-app Readme updated & mask -> encrypt
* sample image download link added
* sample image download link added
* reverting changes for mlinf branch switch
1 parent 9b72396 · commit 8ab7a90
Showing 30 changed files with 2,191 additions and 428 deletions.
@@ -0,0 +1,19 @@
from PIL import Image
import numpy as np
import sys
import os
from torchvision import transforms


preprocess = transforms.Compose(
    [
        transforms.Resize(320),
        transforms.CenterCrop(320),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ]
)


def get_arr_from_image(img):
    arr = preprocess(img).unsqueeze(0).cpu().detach().numpy()
    return arr
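
For orientation, the helper above can be used standalone to turn an image into the NumPy array that the client-side script later consumes by absolute path. This is a minimal sketch under assumptions: the image path and the output file name `input.npy` are placeholders, not names required by the scripts.

```python
from PIL import Image
import numpy as np

from preprocess import get_arr_from_image  # the file shown above, saved as preprocess.py

# Open the raw image; convert("RGB") guards against grayscale inputs,
# since the Normalize transform above expects three channels.
img = Image.open("cardiomegaly.jpg").convert("RGB")

# Produces a float32 array of shape (1, 3, 320, 320): Resize(320) + CenterCrop(320)
# give a 320x320 image, ToTensor gives 3 channels, unsqueeze adds the batch dimension.
arr = get_arr_from_image(img)

# Save it so it can be passed (by absolute path) to the client-side inference script.
np.save("input.npy", arr)
```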
@@ -0,0 +1,23 @@
## Inference App - LLAMA

Given a model ONNX file, OnnxBridge can be used to generate an executable that is run across two VMs, a Server and a Client (owning the model weights and the input image respectively), together with a Dealer (which pre-generates the randomness for the inference), to obtain the secure inference output. On top of this, the Inference-App provides a GUI for inference. To generate the scripts involved in the Inference-App, use the `ezpc-cli-app.sh` script by running the following command locally (not necessarily on a VM):

```bash
./ezpc-cli-app.sh -m /absolute/path/to/model.onnx -s server-ip -d dealer-ip [-nt num_threads]
```

In the above command, the paths are not local; they are the locations on the respective VMs. That is, `/absolute/path/to/model.onnx` is the path of the model.onnx file on the server VM. <br/>
We also have to write the preprocessing script for our use case; refer to the preprocessing file of the [chexpert demo](../frontend/Assets/preprocess.py). If your preprocessing script uses additional Python packages, make sure they are installed on the frontend VM. Also, ensure that the client can communicate with the server through the IP address provided, on ports in the range 42002-42100. Optionally, you can also pass the following arguments:

- `-scale <scale>`: the scaling factor for the model input (default: `15`)
- `-bl <bitlength>`: the bitlength to use for the MPC computation (default: `40`)
- `-nt <numthreads>`: the number of threads to use for the MPC computation (default: `4`)

The script generates four scripts:

- `server.sh` - Transfer this script to the server VM, into any empty directory. Running this script (without any argument) reads the ONNX file, strips the model weights out of it, dumps sytorch code, zips the code that needs to be sent to the client and the dealer, and waits for the client to download the zip. Once the zip is transferred, the script waits for the dealer to generate the randomness and then starts the inference once the client connects. After an inference completes, it downloads fresh randomness generated by the dealer and again waits for the client; this happens in a loop to support multiple inferences.
- `client-offline.sh` - Transfer this script to the client VM, into any empty directory. Running this script fetches the stripped code from the server and compiles the model. It must be run on the client VM in parallel while the server VM is running its server script. It downloads the keys from the dealer, then starts a Flask server on port 5000 to listen for inference requests from the frontend; it receives an image as a numpy array, initiates secure inference with the server, returns the result to the frontend, and starts receiving keys from the dealer again.
- `client-online.sh` - Takes as input the absolute path of the numpy array of the image to run inference on. Transfer this script to the client VM, into the same directory. Running this script downloads randomness from the dealer, preprocesses the input, connects with the server, and starts the inference. After the secure inference is complete, the inference output is printed and saved in the `output.txt` file. This script needs to be run every time a new inference is done with a new input.
- `dealer.sh` - Transfer this script to the dealer VM, into any empty directory. Running this script waits for the server to send the zip file, after which it generates the correlated randomness and allows the client and server scripts to download their shares automatically. In parallel, the frontend also downloads masks from the dealer. Once transferred, it generates a fresh pair of correlated randomness keys and again allows the server and client to download them, in a loop, to support multiple inferences. While the dealer is generating keys, none of the Client/Server/Frontend is allowed to download keys or masks.

- Use `script.sh clean` with any of the above scripts to clean the setup. This removes all files created by the script from the current directory, except the script itself. [Note: **This might remove all files from the current directory; keep a backup of any important file.**]
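
The exact HTTP interface the frontend uses to talk to the client's Flask listener on port 5000 is not specified here; the sketch below only illustrates the general shape of such a request. The endpoint path `/inference` and the raw-bytes payload are assumptions for illustration, not the actual API.

```python
import io
import numpy as np
import requests

# Load a preprocessed image array (see preprocess.py above) and serialize it.
arr = np.load("input.npy")
buf = io.BytesIO()
np.save(buf, arr)

# Hypothetical request to the client's Flask server on port 5000.
# Replace <CLIENT-IP> with the client VM's address; the endpoint name is assumed.
resp = requests.post(
    "http://<CLIENT-IP>:5000/inference",
    data=buf.getvalue(),
    headers={"Content-Type": "application/octet-stream"},
)
print(resp.text)
```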
@@ -0,0 +1,154 @@
# Inference App
This Gradio app provides a frontend to [EzPC](https://github.com/mpc-msri/EzPC) and lets you run secure inference on images with a pretrained model and view the results in a UI-based setup. <br/>
Below are the system requirements and the steps to run the Inference-App for secure inference on X-ray images with a CheXpert model.


# System Requirements
To successfully execute this demo we need three **Ubuntu** VMs [tested on Ubuntu 20.04.6 LTS]:
1. **Dealer** : Generates pre-computed randomness and sends it to the Client and the Server for each inference.
2. **Server** : This party owns the model and _does not share its model weights with the Dealer/Client_; it uses EzPC secure multi-party computation (SMPC) to achieve secure inference.
3. **Client** : This party acts as the client but _does not hold any data by itself_; it gets the masked image from the frontend, so this party itself _cannot see the image data in cleartext_. On receiving the masked image, it starts the secure inference with the Server and returns the result to the frontend.

Additionally, we need a machine to run the frontend on. This is independent of OS and can be the Client machine as well (if a UI is available on the Client VM), since the frontend runs in a browser.

Notes (a small reachability sketch follows this list):
- The Frontend should be able to communicate with the Dealer and the Client over port 5000.
- The Server should be able to communicate with the Dealer and the Client over port 8000.
- The Dealer should be able to communicate with the Server and the Client over port 9000.
- The Server and the Client should be able to communicate over ports 42003-42005.
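
A quick way to confirm these connectivity assumptions, once the corresponding service is actually listening, is a plain TCP connect test. This is a generic sketch, not part of the EzPC tooling; the hosts and port below are placeholders.

```python
import socket

def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Example: from the frontend machine, check the Dealer and the Client on port 5000.
# A failure can mean either a blocked port or simply that nothing is listening yet.
for host in ("<DEALER-IP>", "<CLIENT-IP>"):
    print(host, 5000, can_reach(host, 5000))
```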


# Setup

1. On all Ubuntu VMs, install the dependencies:
```bash
sudo apt update
sudo apt install libeigen3-dev cmake build-essential git zip
```

2. On all Ubuntu VMs, install the Python dependencies in a virtual environment.
```bash
# Demo directory where we will install our dependencies and follow all the further steps.
mkdir CHEXPERT-DEMO
cd CHEXPERT-DEMO

sudo apt install python3.8-venv
python3 -m venv venv
source venv/bin/activate

wget https://raw.githubusercontent.com/mpc-msri/EzPC/master/OnnxBridge/requirements.txt
pip install --upgrade pip
sudo apt-get install python3-dev build-essential
pip install -r requirements.txt
pip install tqdm pyftpdlib flask
```

3. **SERVER** : Download the ONNX file for the CheXpert model and make a temporary directory.
```bash
# while inside CHEXPERT-DEMO
wget "https://github.com/bhatuzdaname/models/raw/main/chexpert.onnx" -O chexpert.onnx
mkdir play
cd play
```

4. **CLIENT** : Make a temporary directory.
```bash
# while inside CHEXPERT-DEMO
mkdir play
cd play
```

5. **DEALER** : Make a temporary directory.
```bash
# while inside CHEXPERT-DEMO
mkdir play
cd play
```

6. **FRONTEND** : On the system being used as the frontend, follow the instructions below to set up the webapp.
```bash
# clone repo
git clone https://github.com/mpc-msri/EzPC
cd EzPC

# create virtual environment and install dependencies
sudo apt update
sudo apt install python3.8-venv
python3 -m venv mlinf
source mlinf/bin/activate
pip install --upgrade pip
sudo apt-get install python3-dev build-essential
pip install -r inference-app/requirements.txt
```

7. **FRONTEND** : Generate the scripts and transfer them to the respective machines. If the server, client and dealer are in the same virtual network, pass the private network IPs in the `ezpc-cli-app.sh` command.
```bash
cd inference-app
chmod +x ezpc-cli-app.sh
./ezpc-cli-app.sh -m /home/<user>/CHEXPERT-DEMO/chexpert.onnx -s <SERVER-IP> -d <DEALER-IP> [ -nt <num_threads> ]
scp server.sh <SERVER-IP>:/home/<user>/CHEXPERT-DEMO/play/
scp dealer.sh <DEALER-IP>:/home/<user>/CHEXPERT-DEMO/play/
scp client-offline.sh <CLIENT-IP>:/home/<user>/CHEXPERT-DEMO/play/
scp client-online.sh <CLIENT-IP>:/home/<user>/CHEXPERT-DEMO/play/
```
In the above commands in step 7, the file paths and directories are absolute paths on the Ubuntu VMs used. To learn more about the `ezpc-cli-app.sh` script, see [link](/inference-app/Inference-App.md). <br/><br/>
On all Ubuntu VMs, make the bash scripts executable and execute them.

```bash
# (on server)
chmod +x server.sh
./server.sh

# (on dealer)
chmod +x dealer.sh
./dealer.sh

# (on client)
chmod +x client-offline.sh client-online.sh
./client-offline.sh
```

8. **FRONTEND** : Set up and run the webapp.
#### Create a `.env` file inside the `EzPC/inference-app` directory to store the secrets as environment variables ( `_URL` is the IP address of the Dealer ); the file should look as below:

    _URL = "X.X.X.X"
    _USER = "frontend"
    _PASSWORD = "frontend"
    _FILE_NAME = "masks.dat"
    _CLIENT_IP = "X.X.X.X"
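
To sanity-check the `.env` values before starting the app, a small standalone check is sketched below. It assumes the `python-dotenv` package (install with `pip install python-dotenv` if needed); whether `app.py` itself uses this package is not specified here.

```python
import os
from dotenv import load_dotenv  # assumes python-dotenv is installed

# Load EzPC/inference-app/.env into the process environment and echo the values.
load_dotenv()
for key in ("_URL", "_USER", "_PASSWORD", "_FILE_NAME", "_CLIENT_IP"):
    print(key, "=", os.getenv(key))
```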

Download the preprocessing file for the image (specific to the model) inside the `inference-app` directory:
```bash
# This file takes in an image as <class 'PIL.Image.Image'>,
# preprocesses it, and returns it as a numpy array of the size required by the model.
wget "https://raw.githubusercontent.com/mpc-msri/EzPC/master/inference-app/Assets/preprocess.py" -O preprocess.py
```

```bash
# Next we download an example image for the app.
cd Assets
mkdir examples && cd examples
wget "https://raw.githubusercontent.com/drunkenlegend/ezpc-warehouse/main/Chexpert/cardiomegaly.jpg" -O 1.jpg
cd ../..
```
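
Optionally, you can verify locally that `preprocess.py` produces an array of the expected shape for the example image just downloaded. This is an optional check, run from the `inference-app` directory inside the `mlinf` environment; the (1, 3, 320, 320) shape follows from the Resize/CenterCrop(320) pipeline in `preprocess.py`.

```python
from PIL import Image
from preprocess import get_arr_from_image

# The example X-ray downloaded above; convert to RGB in case the file is grayscale.
img = Image.open("Assets/examples/1.jpg").convert("RGB")
arr = get_arr_from_image(img)
print(arr.shape, arr.dtype)  # expected: (1, 3, 320, 320) float32
```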

***Note:***

If you use some other model for the demo and want to customise the webapp to fit your model, modify the `USER_INPUTS` in the `constants.py` file in the `inference-app` directory.

```bash
# while inside the inference-app directory
python app.py
```

Open the URL shown after running the last command on the inference-app machine and play along:
1. Upload the X-ray image.
2. Get Encryption Keys.
3. Encrypt Image.
4. Start Inference.