
JSO Compliance: R2 #139

Merged: 24 commits merged into JuliaSmoothOptimizers:master from MaxenceGollier:RegularizedModels on Aug 15, 2024

Conversation

@MaxenceGollier (Contributor)

Rewrote R2_alg.jl to be more in line with the other JSO solvers.

  • Added a function SolverCore.solve!(R2Solver, RegularizedNLPModel, stats) that implements the algorithm (a sketch of the resulting workflow appears below).
  • Callbacks are now available.
  • Changed the implementations of the previous R2 functions to fit this new approach, but kept the previous signatures and return values so that nothing else in this package is impacted. This makes the code messy in some places, and these compatibility layers should eventually be removed (for example, there is no need for Fhist, Hhist, etc., in my opinion). Also, the documentation I added does not cover these functions.
  • On a small benchmark, this code seems to allocate noticeably less.

I initially just wanted to add callbacks, so I don't know if all of this is relevant. What do you think @dpo, @geoffroyleconte, @MohamedLaghdafHABIBOULLAH?
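
For illustration, here is a minimal sketch of the workflow the new interface enables. The RegularizedNLPModel and R2Solver constructors, the stats object, and the callback signature below follow general JSO/SolverCore conventions and are assumptions, not necessarily the exact API in R2_alg.jl:

using LinearAlgebra, NLPModels, RegularizedProblems, ProximalOperators
using SolverCore, RegularizedOptimization

bpdn = bpdn_model(1)[1]
λ = norm(grad(bpdn, zeros(bpdn.meta.nvar)), Inf) / 10
h = NormL1(λ)
reg_nlp = RegularizedNLPModel(bpdn, h)   # assumed constructor name

solver = R2Solver(reg_nlp)               # assumed constructor name
stats = GenericExecutionStats(bpdn)

# The callback runs once per iteration; setting the status to :user stops the solver.
cb(nlp, solver, stats) = stats.iter ≥ 50 && set_status!(stats, :user)

solve!(solver, reg_nlp, stats, callback = cb)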

@dpo (Member) commented Jun 8, 2024

Thank you, @MaxenceGollier! We'll look at it in detail.

Yes, callbacks are necessary. See #131.

PS: you can mark functions that should eventually be removed with Base.@deprecate (https://docs.julialang.org/en/v1/base/base/#Base.@deprecate).
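
For instance, a minimal self-contained sketch with hypothetical function names (not from this package):

new_solve(x; tol = 1e-6) = x * tol                         # stand-in for the new API
Base.@deprecate old_solve(x, tol) new_solve(x; tol = tol)  # forwards the call and warns

old_solve(1.0, 1e-3)  # still works, but emits a deprecation warning

Calls to the old signature keep working during a transition period, and the method can be deleted in a later breaking release.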

@dpo (Member) left a comment

Just a few quick, simple comments for now.

(three resolved inline review threads on src/R2_alg.jl)
@dpo (Member) left a comment

Thank you. Here are a few more comments.

I would like to see a comparison of the old and new R2 on the problems we have, to make sure the results are the same (number of iterations and final values of f, h, ξ, …).

(six resolved inline review threads on src/R2_alg.jl)
@MaxenceGollier (Contributor, Author)

First, this version:

julia> using LinearAlgebra, NLPModels, NLPModelsModifiers, RegularizedProblems, ProximalOperators, Random
julia> Random.seed!(0)
julia> bpdn = bpdn_model(1)[1]
julia> λ = norm(grad(bpdn, zeros(bpdn.meta.nvar)), Inf) / 10
julia> h = NormL1(λ)

julia> import Pkg; Pkg.add(url = "https://github.com/MaxenceGollier/RegularizedOptimization.jl.git", rev = "RegularizedModels")
julia> using RegularizedOptimization
julia> options = ROSolverOptions(ν = 1.0, β = 1e16, ϵa = 1e-6, ϵr = 1e-6, verbose = 1)

julia> R2(bpdn, h, options)
[ Info:   iter     f(x)     h(x)   √(ξ/ν)        ρ        σ      ‖x‖      ‖s‖
[ Info:      0  1.8e+00  0.0e+00  1.3e+00  5.5e-01  1.0e+00  0.0e+00  1.3e+00               =
[ Info:      1  2.5e-01  5.8e-01  3.3e-01  9.5e-01  1.0e+00  1.3e+00  3.2e-01               ↘
[ Info:      2  2.1e-01  5.2e-01  2.5e-01  8.5e-01  3.3e-01  1.5e+00  7.3e-01               =
[ Info:      3  1.1e-01  4.5e-01  1.8e-01  8.6e-01  3.3e-01  2.0e+00  5.3e-01               =
[ Info:      4  1.3e-01  3.5e-01  1.5e-01  6.5e-01  3.3e-01  2.2e+00  4.4e-01               =
[ Info:      5  5.0e-02  3.9e-01  6.1e-02  7.1e-01  3.3e-01  2.6e+00  1.4e-01               =
[ Info:      6  4.9e-02  3.8e-01  2.8e-02  3.4e-01  3.3e-01  2.7e+00  8.3e-02               =
[ Info:      7  3.9e-02  3.9e-01  1.4e-02  2.0e-01  3.3e-01  2.7e+00  4.1e-02               =
[ Info:      8  4.1e-02  3.9e-01  8.9e-03  1.8e-01  3.3e-01  2.7e+00  2.7e-02               =
[ Info:      9  3.9e-02  3.9e-01  5.8e-03  1.7e-01  3.3e-01  2.7e+00  1.7e-02               =
[ Info:     10  4.0e-02  3.9e-01  3.8e-03  1.7e-01  3.3e-01  2.7e+00  1.2e-02               =
[ Info:     11  4.0e-02  3.9e-01  2.5e-03  1.7e-01  3.3e-01  2.7e+00  7.6e-03               =
[ Info:     12  4.0e-02  3.9e-01  1.7e-03  1.7e-01  3.3e-01  2.7e+00  5.0e-03               =
[ Info:     13  4.0e-02  3.9e-01  1.1e-03  1.7e-01  3.3e-01  2.7e+00  3.3e-03               =
[ Info:     14  4.0e-02  3.9e-01  7.4e-04  1.7e-01  3.3e-01  2.7e+00  2.2e-03               =
[ Info:     15  4.0e-02  3.9e-01  4.9e-04  1.7e-01  3.3e-01  2.7e+00  1.5e-03               =
[ Info:     16  4.0e-02  3.9e-01  3.2e-04  1.7e-01  3.3e-01  2.7e+00  9.7e-04               =
[ Info:     17  4.0e-02  3.9e-01  2.1e-04  1.7e-01  3.3e-01  2.7e+00  6.4e-04               =
[ Info:     18  4.0e-02  3.9e-01  1.4e-04  1.7e-01  3.3e-01  2.7e+00  4.2e-04               =
[ Info:     19  4.0e-02  3.9e-01  9.3e-05  1.7e-01  3.3e-01  2.7e+00  2.8e-04               =
[ Info:     20  4.0e-02  3.9e-01  6.2e-05  1.7e-01  3.3e-01  2.7e+00  1.8e-04               =
[ Info:     21  4.0e-02  3.9e-01  4.1e-05  1.7e-01  3.3e-01  2.7e+00  1.2e-04               =
[ Info:     22  4.0e-02  3.9e-01  2.7e-05  1.7e-01  3.3e-01  2.7e+00  8.1e-05               =
[ Info:     23  4.0e-02  3.9e-01  1.8e-05  1.7e-01  3.3e-01  2.7e+00  5.4e-05               =
[ Info:     24  4.0e-02  3.9e-01  1.2e-05  1.7e-01  3.3e-01  2.7e+00  3.5e-05               =
[ Info:     25  4.0e-02  3.9e-01  7.8e-06  1.7e-01  3.3e-01  2.7e+00  2.3e-05               =
[ Info:     26  4.0e-02  3.9e-01  5.2e-06  1.7e-01  3.3e-01  2.7e+00  1.5e-05               =
[ Info:     27  4.0e-02  3.9e-01  3.4e-06  1.7e-01  3.3e-01  2.7e+00  1.0e-05               =
[ Info: R2: terminating with √(ξ/ν) = 2.2587837268169823e-6
"Execution stats: first-order stationary"

Restarting Julia, the old version gives:

julia> using LinearAlgebra, NLPModels, NLPModelsModifiers, RegularizedProblems, ProximalOperators, Random
julia> Random.seed!(0)
julia> bpdn = bpdn_model(1)[1]
julia> λ = norm(grad(bpdn, zeros(bpdn.meta.nvar)), Inf) / 10
julia> h = NormL1(λ)

julia> import Pkg; Pkg.add(url = "https://github.com/JuliaSmoothOptimizers/RegularizedOptimization.jl.git")
julia> using RegularizedOptimization
julia> options = ROSolverOptions(ν = 1.0, β = 1e16, ϵa = 1e-6, ϵr = 1e-6, verbose = 10)

julia> R2(bpdn, h, options)
[ Info:   iter     f(x)     h(x)  √(ξ/ν)        ρ       σ     ‖x‖     ‖s‖
[ Info:      1  1.8e+00  0.0e+00 1.3e+00  5.5e-01 1.0e+00 0.0e+00 1.3e+00 =
[ Info:      2  2.5e-01  5.8e-01 3.3e-01  9.5e-01 1.0e+00 1.3e+00 3.2e-01 ↘
[ Info:      3  2.1e-01  5.2e-01 2.5e-01  8.5e-01 3.3e-01 1.5e+00 7.3e-01 =
[ Info:      4  1.1e-01  4.5e-01 1.8e-01  8.6e-01 3.3e-01 2.0e+00 5.3e-01 =
[ Info:      5  1.3e-01  3.5e-01 1.5e-01  6.5e-01 3.3e-01 2.2e+00 4.4e-01 =
[ Info:      6  5.0e-02  3.9e-01 6.1e-02  7.1e-01 3.3e-01 2.6e+00 1.4e-01 =
[ Info:      7  4.9e-02  3.8e-01 2.8e-02  3.4e-01 3.3e-01 2.7e+00 8.3e-02 =
[ Info:      8  3.9e-02  3.9e-01 1.4e-02  2.0e-01 3.3e-01 2.7e+00 4.1e-02 =
[ Info:      9  4.1e-02  3.9e-01 8.9e-03  1.8e-01 3.3e-01 2.7e+00 2.7e-02 =
[ Info:     10  3.9e-02  3.9e-01 5.8e-03  1.7e-01 3.3e-01 2.7e+00 1.7e-02 =
[ Info:     11  4.0e-02  3.9e-01 3.8e-03  1.7e-01 3.3e-01 2.7e+00 1.2e-02 =
[ Info:     12  4.0e-02  3.9e-01 2.5e-03  1.7e-01 3.3e-01 2.7e+00 7.6e-03 =
[ Info:     13  4.0e-02  3.9e-01 1.7e-03  1.7e-01 3.3e-01 2.7e+00 5.0e-03 =
[ Info:     14  4.0e-02  3.9e-01 1.1e-03  1.7e-01 3.3e-01 2.7e+00 3.3e-03 =
[ Info:     15  4.0e-02  3.9e-01 7.4e-04  1.7e-01 3.3e-01 2.7e+00 2.2e-03 =
[ Info:     16  4.0e-02  3.9e-01 4.9e-04  1.7e-01 3.3e-01 2.7e+00 1.5e-03 =
[ Info:     17  4.0e-02  3.9e-01 3.2e-04  1.7e-01 3.3e-01 2.7e+00 9.7e-04 =
[ Info:     18  4.0e-02  3.9e-01 2.1e-04  1.7e-01 3.3e-01 2.7e+00 6.4e-04 =
[ Info:     19  4.0e-02  3.9e-01 1.4e-04  1.7e-01 3.3e-01 2.7e+00 4.2e-04 =
[ Info:     20  4.0e-02  3.9e-01 9.3e-05  1.7e-01 3.3e-01 2.7e+00 2.8e-04 =
[ Info:     21  4.0e-02  3.9e-01 6.2e-05  1.7e-01 3.3e-01 2.7e+00 1.8e-04 =
[ Info:     22  4.0e-02  3.9e-01 4.1e-05  1.7e-01 3.3e-01 2.7e+00 1.2e-04 =
[ Info:     23  4.0e-02  3.9e-01 2.7e-05  1.7e-01 3.3e-01 2.7e+00 8.1e-05 =
[ Info:     24  4.0e-02  3.9e-01 1.8e-05  1.7e-01 3.3e-01 2.7e+00 5.4e-05 =
[ Info:     25  4.0e-02  3.9e-01 1.2e-05  1.7e-01 3.3e-01 2.7e+00 3.5e-05 =
[ Info:     26  4.0e-02  3.9e-01 7.8e-06  1.7e-01 3.3e-01 2.7e+00 2.3e-05 =
[ Info:     27  4.0e-02  3.9e-01 5.2e-06  1.7e-01 3.3e-01 2.7e+00 1.5e-05 =
[ Info:     28  4.0e-02  3.9e-01 3.4e-06  1.7e-01 3.3e-01 2.7e+00 1.0e-05 =
[ Info:     29  4.0e-02  3.9e-01 2.3e-06          3.3e-01 2.7e+00 6.8e-06
[ Info: R2: terminating with √(ξ/ν) = 2.2587837268169823e-6
"Execution stats: first-order stationary"

The only small difference is that the iteration counter now starts at 0 and the last iterate is not printed. I think printing the last one is useless because, except for the stationarity measure, nothing changes.

@MaxenceGollier (Contributor, Author)

Then, the benchmarks.

For this version:

julia> using LinearAlgebra, NLPModels, NLPModelsModifiers, RegularizedProblems, ProximalOperators, Random, BenchmarkTools
julia> Random.seed!(0)
julia> bpdn = bpdn_model(1)[1]
julia> λ = norm(grad(bpdn, zeros(bpdn.meta.nvar)), Inf) / 10
julia> h = NormL1(λ)

julia> import Pkg; Pkg.add(url = "https://github.com/MaxenceGollier/RegularizedOptimization.jl.git", rev = "RegularizedModels")
julia> using RegularizedOptimization
julia> options = ROSolverOptions(ν = 1.0, β = 1e16, ϵa = 1e-6, ϵr = 1e-6, verbose = 0)

julia>  @benchmark R2(bpdn, h, options)
BenchmarkTools.Trial: 6852 samples with 1 evaluation.
 Range (min … max):  345.100 μs … 404.146 ms  ┊ GC (min … max):  0.00% … 99.76%
 Time  (median):     558.700 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   720.358 μs ±   4.890 ms  ┊ GC (mean ± σ):  11.14% ±  6.57%

          ▃▅█▇▆▄▃▂▁    ▂▄▃▂▁  ▁▂▂▂▁▂▁▁                          ▂
  ▇▃▁▅▃▅▅▁███████████████████████████████▇▇█▇▇▇▇▆▅▆▇▆▇▇▆▅▅▅▃▅▁▅ █
  345 μs        Histogram: log(frequency) by time       1.49 ms <

 Memory estimate: 407.41 KiB, allocs estimate: 7756.

and for the old version:

julia> using LinearAlgebra, NLPModels, NLPModelsModifiers, RegularizedProblems, ProximalOperators, Random, BenchmarkTools
julia> Random.seed!(0)
julia> bpdn = bpdn_model(1)[1]
julia> λ = norm(grad(bpdn, zeros(bpdn.meta.nvar)), Inf) / 10
julia> h = NormL1(λ)

julia> import Pkg; Pkg.add(url = "https://github.com/JuliaSmoothOptimizers/RegularizedOptimization.jl.git")
julia> using RegularizedOptimization
julia> options = ROSolverOptions(ν = 1.0, β = 1e16, ϵa = 1e-6, ϵr = 1e-6, verbose = 0)

julia> @benchmark R2(bpdn, h, options)
BenchmarkTools.Trial: 276 samples with 1 evaluation.
 Range (min … max):  13.377 ms … 457.888 ms  ┊ GC (min … max): 0.00% … 96.51%
 Time  (median):     15.351 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   18.084 ms ±  26.749 ms  ┊ GC (mean ± σ):  9.85% ±  7.25%

   █▆▆▄▁▂▂ ▁      ▃ ▂▂ ▂▄▄▃
  ▆█████████▆▆▅▆▆███████████▁▅▆▅▆▅▁▅▆▅▅▁▆▅▆▁▅▆▁▁▁▁▁▁▅▁▁▁▁▁▁▅▆▅ ▆
  13.4 ms       Histogram: log(frequency) by time      27.2 ms <

 Memory estimate: 3.68 MiB, allocs estimate: 222912.

@MaxenceGollier (Contributor, Author)

Changing the line @benchmark R2(bpdn, h, options) to @benchmark R2($bpdn, $h, $options) (interpolating the globals so BenchmarkTools treats them as local variables) doesn't change anything.

@dpo (Member) commented Jun 11, 2024

@MaxenceGollier I fixed the demos. Could you please rebase your branch? That way, you'll have the latest fixes (builds on FreeBSD and the demos working again). The procedure goes like this:

> git checkout master

If origin in your clone points to this repository (you can find out with git remote -v), do

> git pull origin master

If origin points to your fork instead, first do

> git remote add upstream https://github.com/JuliaSmoothOptimizers/RegularizedOptimization.jl.git
> git pull upstream master

Now head back to your branch and rebase

> git checkout RegularizedModels
> git rebase master

Hopefully, there are no conflicts to fix.

@MaxenceGollier (Contributor, Author)

There are still remnants of the older version, kept only so as not to break other implementations in this package that rely on it.
For example, I think the constructor

function R2Solver(
  x0::S,
  options::ROSolverOptions,
  l_bound::S,
  u_bound::S;
  ψ = nothing
)

is useless. Also, I don't see why we should use ROSolverOptions instead of keyword arguments.
Finally, storing

Fobj_hist::Vector{R}
Hobj_hist::Vector{R}
Complex_hist::Vector{Int}

in R2Solver is expensive and useless. Should we just keep it this way for the moment and maybe try to remove it later?
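
For reference, here is a minimal sketch of the keyword-argument style used by other JSO solvers; the function name and defaults below are hypothetical (they mirror the ϵa/ϵr/verbose fields of ROSolverOptions), not the actual R2 signature:

using NLPModels  # for AbstractNLPModel

function r2_kwargs_demo(nlp::AbstractNLPModel, h; atol = 1e-6, rtol = 1e-6, verbose = 0)
  verbose > 0 && @info "solving with tolerances" atol rtol
  return (atol = atol, rtol = rtol)  # stand-in for building an R2Solver and calling solve!
end

With keyword arguments, callers only specify what they change, and the defaults live next to the solver instead of in a separate options struct.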

@dpo (Member) commented Jun 12, 2024

Your branch still has conflicts.

Let’s keep the “old” features for now; we’ll remove them later. I opened a PR a while ago to remove Fhist, etc. We can update it after this one’s been merged.

@MaxenceGollier (Contributor, Author)

Tests should pass now with 754feea

Here are the demos-results

@dpo (Member) left a comment

Just a couple of minor comments.

(two resolved inline review threads on src/R2_alg.jl)
Here are the demos-results

MaxenceGollier and others added 2 commits on June 19, 2024 (each Co-authored-by: Dominique <[email protected]>)
codecov bot commented Jun 22, 2024

Codecov Report

Attention: Patch coverage is 90.32258% with 12 lines in your changes missing coverage. Please review.

Project coverage is 60.21%. Comparing base (e0f214d) to head (6dfcb53).
Report is 1 commit behind head on master.

Current head 6dfcb53 differs from the pull request's most recent head f0a273c.

Please upload reports for the commit f0a273c to get more accurate results.

Files           Patch %   Missing lines
src/R2_alg.jl   90.32%    12 ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #139      +/-   ##
==========================================
- Coverage   61.53%   60.21%   -1.32%     
==========================================
  Files          11       11              
  Lines        1292     1302      +10     
==========================================
- Hits          795      784      -11     
- Misses        497      518      +21     


(one resolved inline review thread on src/R2_alg.jl)
Here are the demos-results

Co-authored-by: Dominique <[email protected]>
Here are the demos-results

@dpo (Member) commented Jun 22, 2024

There are lots of large differences in the demos, in particular in the plots showing the number of inner iterations. I wonder whether other solvers are able to call the new R2 and whether the number of iterations is reported properly. Could you please check?

Among others, these tests show differences:

  • bpdn-inner-outer-constr-lm-r2-l0-linf
  • bpdn-inner-outer-constr-lm-r2-l1-linf
  • bpdn-inner-outer-constr-lmtr-r2-l0-linf
  • bpdn-inner-outer-constr-lmtr-r2-l1-linf
  • bpdn-inner-outer-r2-l1-l2 (this is R2 alone; the number of iterations goes from 20 to 35)
  • bpdn-inner-outer-r2-l1-linf (R2 alone)
  • and others.

Those problems are generated randomly but we initialize the seed, so they should be the same.

@MaxenceGollier (Contributor, Author) commented Jun 23, 2024

Yes, they do: when I test locally, all inner iterations are reported correctly. Can you show me what you are comparing against? I only see the demos-results for this branch (and not for the base one?).

For example, for bpdn-inner-outer-r2-l1-l2, I see fewer than 35 outer iterations on the plot, so we do have between 20 and 35 iterations.

@dpo (Member) commented Jun 23, 2024

I was comparing to the demos in this PR: #138. There have been no changes to the solvers since that PR.
But you could easily regenerate the demo results locally from the main branch.

@MaxenceGollier (Contributor, Author)

It is weird, because my local branch is up to date with this one, and when I run bpdn-inner-outer-constr-lm-r2-l0-linf locally, I get:

┌ Info:  using LM to solve with
└   h = NormL0{Float64}(0.04974454739546478)
[ Info:  outer    inner     f(x)     h(x)     √ξ1      √ξ        ρ       σ     ‖x‖     ‖s‖   ‖Jₖ‖² reg
[ Info:      1        2  2.1e+00  0.0e+00 4.9e-01 3.2e-01  2.8e+00 1.0e+00 0.0e+00 6.0e-01 2.0e+00 ↘
[ Info:      2        7  1.6e+00  2.0e-01 7.5e-01 7.0e-01  1.6e+00 3.3e-01 6.0e-01 1.3e+00 1.3e+00 ↘
[ Info:      3        8  6.0e-01  4.0e-01 5.5e-01 6.0e-01  1.3e+00 1.1e-01 1.8e+00 1.4e+00 1.1e+00 ↘
[ Info:      4        9  4.4e-02  5.0e-01 1.6e-01 1.8e-01  1.1e+00 3.7e-02 2.8e+00 3.9e-01 1.0e+00 ↘
[ Info:      5       16  9.1e-03  5.0e-01 1.7e-02 2.0e-02  1.0e+00 1.2e-02 3.1e+00 4.6e-02 1.0e+00 ↘
[ Info:      6       15  8.7e-03  5.0e-01 5.9e-04 7.1e-04  1.0e+00 4.1e-03 3.1e+00 1.7e-03 1.0e+00 ↘
[ Info:      7        5  8.7e-03  5.0e-01 8.1e-06 9.7e-06  1.0e+00 1.4e-03 3.1e+00 2.1e-05 1.0e+00 ↘
[ Info:      8        1  8.7e-03  5.0e-01 1.1e-06 1.1e-06          4.6e-04 3.1e+00 1.1e-06 1.0e+00
[ Info: LM: terminating with √ξ1 = 1.069826582248765e-6

So I should have the same results as in #138. Where exactly are these plots generated? I don't really see it. The stats.solver_specific[:SubsolverCounter] contains the correct values as well…

@dpo (Member) commented Jun 23, 2024

They’re generated by the scripts in the examples folder.

@MaxenceGollier (Contributor, Author)

The issue should be solved with f0a273c now.

src/R2_alg.jl (docstring excerpt under review):

* `Fobj_hist`: an array with the history of values of the smooth objective
* `Hobj_hist`: an array with the history of values of the nonsmooth objective
* `Complex_hist`: an array with the history of the number of inner iterations.
ψ(s; xₖ) is either h(xₖ + s) or an approximation of h(xₖ + s), ‖⋅‖ is a user-defined norm and σₖ > 0 is the regularization parameter.
@MaxenceGollier (Contributor, Author) commented Jul 3, 2024

Is $\|\cdot\|$ really a user-defined norm? Isn't it always the $\ell_2$ norm?

Suggested change:
- ψ(s; xₖ) is either h(xₖ + s) or an approximation of h(xₖ + s), ‖⋅‖ is a user-defined norm and σₖ > 0 is the regularization parameter.
+ ψ(s; xₖ) is either h(xₖ + s) or an approximation of h(xₖ + s), ‖⋅‖ is the ℓ₂ norm and σₖ > 0 is the regularization parameter.

@dpo (Member)

Yes, indeed, it's the 2-norm.
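
For context, here is the subproblem R2 minimizes at each iteration, written out from the docstring excerpt above with ‖⋅‖ taken to be the 2-norm (a sketch of the standard R2 model, not copied verbatim from R2_alg.jl):

\min_{s \in \mathbb{R}^n} \; \varphi(s; x_k) + \psi(s; x_k), \qquad
\varphi(s; x_k) = f(x_k) + \nabla f(x_k)^T s + \tfrac{\sigma_k}{2} \|s\|_2^2,

where ψ(s; xₖ) models h(xₖ + s) and σₖ > 0 is the regularization parameter.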

@dpo merged commit e4c69af into JuliaSmoothOptimizers:master on Aug 15, 2024 (4 checks passed).
@MaxenceGollier deleted the RegularizedModels branch on August 20, 2024 at 09:10.