Documentation improvements
odow committed Oct 20, 2024
1 parent 99d544b commit 4724026
Showing 5 changed files with 86 additions and 11 deletions.
39 changes: 39 additions & 0 deletions README.md
@@ -29,6 +29,45 @@ import Pkg
Pkg.add(; url = "https://github.com/lanl-ansi/MathOptAI.jl")
```

## Getting started

Here's an example of using MathOptAI to embed a trained neural network from Flux
into a JuMP model. The vector of JuMP variables `x` is fed as input to the
neural network. The output `y` is a vector of JuMP variables that represents the
output layer of the neural network. The `formulation` object stores the
additional variables and constraints that were added to `model`.

```julia
julia> using JuMP, MathOptAI, Flux

julia> predictor = Flux.Chain(
Flux.Dense(28^2 => 32, Flux.sigmoid),
Flux.Dense(32 => 10),
Flux.softmax,
);

julia> #= Train the Flux model. Code not shown for simplicity =#

julia> model = JuMP.Model();

julia> JuMP.@variable(model, 0 <= x[1:28^2] <= 1);

julia> y, formulation = MathOptAI.add_predictor(model, predictor, x);

julia> y
10-element Vector{VariableRef}:
moai_SoftMax[1]
moai_SoftMax[2]
moai_SoftMax[3]
moai_SoftMax[4]
moai_SoftMax[5]
moai_SoftMax[6]
moai_SoftMax[7]
moai_SoftMax[8]
moai_SoftMax[9]
moai_SoftMax[10]
```
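Once embedded, `y` behaves like any other vector of JuMP variables. As a
minimal sketch of a downstream use (the objective and the choice of Ipopt are
illustrative assumptions, not part of this commit):

```julia
julia> using Ipopt

julia> JuMP.set_optimizer(model, Ipopt.Optimizer)

julia> JuMP.@objective(model, Max, y[10]);  # maximize one predicted class probability

julia> JuMP.optimize!(model)
```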

## Documentation

Documentation is available at
9 changes: 3 additions & 6 deletions docs/src/developers/design_principles.md
@@ -28,8 +28,8 @@ MathOptAI chooses to use "predictor" as the synonym for the machine learning
model. Hence, we have `AbstractPredictor`, `add_predictor`, and
`build_predictor`.

-In contrast, gurobi-machinelearning tennds to use "regression model" and OMLT
-does not have a single unified API.
+In contrast, gurobi-machinelearning tends to use "regression model" and OMLT
+uses "formulation."

We choose "predictor" because all models we implement are of the form
``y = f(x)``.
@@ -167,9 +167,6 @@ y, formulation = MathOptAI.add_predictor(model, MathOptAI.ReLU(), x)
```
for any size of `x`.

-We choose this decision to simplify the implementation, and because we think
-deleting a predictor is an uncommon operation.

## Activations are predictors

OMLT makes a distinction between layers, like `full_space_dense_layer`, and
@@ -196,7 +193,7 @@ Many predictors have multiple ways that they can be formulated in an
optimization model. For example, [`ReLU`](@ref) implements the non-smooth
nonlinear formulation ``y = \max\{x, 0\}``, while [`ReLUQuadratic`](@ref)
implements the complementarity formulation
-``x = y - slack; y, slack \\ge 0; y * slack == 0``.
+``x = y - slack; y, slack \ge 0; y * slack = 0``.

Choosing the appropriate formulation for the combination of model and solver
can have a large impact on performance.
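Because every formulation is exposed through the same [`add_predictor`](@ref)
call, swapping formulations is a one-line change. A minimal sketch (the model,
bounds, and dimension below are illustrative assumptions):

```julia
using JuMP, MathOptAI

model = JuMP.Model()
JuMP.@variable(model, -1 <= x[1:3] <= 1)
# Non-smooth nonlinear formulation: y = max{x, 0}.
y1, _ = MathOptAI.add_predictor(model, MathOptAI.ReLU(), x)
# Complementarity reformulation: x = y - slack; y, slack >= 0; y * slack = 0.
y2, _ = MathOptAI.add_predictor(model, MathOptAI.ReLUQuadratic(), x)
```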
39 changes: 39 additions & 0 deletions docs/src/index.md
@@ -17,6 +17,45 @@ for details.
_Despite the name similarity, this project is not affiliated with [OMLT](https://github.com/cog-imperial/OMLT),
the Optimization and Machine Learning Toolkit._

## Getting started

Here's an example of using MathOptAI to embed a trained neural network from Flux
into a JuMP model. The vector of JuMP variables `x` is fed as input to the
neural network. The output `y` is a vector of JuMP variables that represents the
output layer of the neural network. The `formulation` object stores the
additional variables and constraints that were added to `model`.

```julia
julia> using JuMP, MathOptAI, Flux

julia> predictor = Flux.Chain(
Flux.Dense(28^2 => 32, Flux.sigmoid),
Flux.Dense(32 => 10),
Flux.softmax,
);

julia> #= Train the Flux model. Code not shown for simplicity =#

julia> model = JuMP.Model();

julia> JuMP.@variable(model, 0 <= x[1:28^2] <= 1);

julia> y, formulation = MathOptAI.add_predictor(model, predictor, x);

julia> y
10-element Vector{VariableRef}:
moai_SoftMax[1]
moai_SoftMax[2]
moai_SoftMax[3]
moai_SoftMax[4]
moai_SoftMax[5]
moai_SoftMax[6]
moai_SoftMax[7]
moai_SoftMax[8]
moai_SoftMax[9]
moai_SoftMax[10]
```

## Getting help

This package is under active development. For help, questions, comments, and
2 changes: 1 addition & 1 deletion src/predictors/Pipeline.jl
@@ -10,7 +10,7 @@
An [`AbstractPredictor`](@ref) that represents a pipeline (composition) of
nested layers:
```math
-f(x) = (l_1 \\cdots l_N)(x)
+f(x) = l_1(\\cdots(l_N(x)))
```
## Example
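As a hedged sketch of composing predictors into a `Pipeline` (assuming the
constructor accepts predictors as positional arguments and that
`Affine(A, b)` builds the affine layer; the arguments below are illustrative,
not the docstring's elided example):

```julia
using JuMP, MathOptAI

# Layers are applied in sequence: the affine map, then the ReLU.
predictor = MathOptAI.Pipeline(
    MathOptAI.Affine([1.0 2.0], [0.0]),
    MathOptAI.ReLU(),
)
model = JuMP.Model()
JuMP.@variable(model, x[1:2])
y, formulation = MathOptAI.add_predictor(model, predictor, x)
```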
8 changes: 4 additions & 4 deletions src/predictors/ReLU.jl
@@ -8,7 +8,7 @@
ReLU() <: AbstractPredictor
An [`AbstractPredictor`](@ref) that implements the ReLU constraint
-\$y = \\max(0, x)\$ as a non-smooth nonlinear constraint.
+\$y = \\max\\{0, x\\}\$ as a non-smooth nonlinear constraint.
## Example
@@ -77,7 +77,7 @@ end
ReLUBigM(M::Float64) <: AbstractPredictor
An [`AbstractPredictor`](@ref) that implements the ReLU constraint
-\$y = \\max(0, x)\$ via a big-M MIP reformulation.
+\$y = \\max\\{0, x\\}\$ via a big-M MIP reformulation.
## Example
@@ -151,7 +151,7 @@ end
ReLUSOS1() <: AbstractPredictor
An [`AbstractPredictor`](@ref) that implements the ReLU constraint
-\$y = \\max(0, x)\$ by the reformulation:
+\$y = \\max\\{0, x\\}\$ by the reformulation:
```math
\\begin{aligned}
x = y - z \\\\
@@ -219,7 +219,7 @@ end
ReLUQuadratic() <: AbstractPredictor
An [`AbstractPredictor`](@ref) that implements the ReLU constraint
-\$y = \\max(0, x)\$ by the reformulation:
+\$y = \\max\\{0, x\\}\$ by the reformulation:
```math
\\begin{aligned}
x = y - z \\\\

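As a sketch of how the big-M variant above might be used (the bounds and the
value `M = 2.0` are illustrative assumptions; `M` must be a valid bound on the
inputs):

```julia
using JuMP, MathOptAI

model = JuMP.Model()
JuMP.@variable(model, -2 <= x[1:2] <= 2)
# Big-M MIP reformulation of y = max{0, x}; M = 2.0 matches the bounds on x.
y, formulation = MathOptAI.add_predictor(model, MathOptAI.ReLUBigM(2.0), x)
```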