Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TR: fix definition of ν #124

Merged
merged 7 commits into from
Feb 20, 2024
Merged

TR: fix definition of ν #124

merged 7 commits into from
Feb 20, 2024

Conversation

dpo
Copy link
Member

@dpo dpo commented Oct 7, 2023

The definition in the paper is

$$ \nu_k = 1 / (L(x_k) + \alpha^{-1} \Delta_k^{-1}), $$

and $L(x_k) = \Vert B_k \Vert$.
There is no $\theta$ involved.

@codecov
Copy link

codecov bot commented Oct 7, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (20fb633) 61.40% compared to head (3e191d1) 63.55%.
Report is 3 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #124      +/-   ##
==========================================
+ Coverage   61.40%   63.55%   +2.14%     
==========================================
  Files          11       11              
  Lines        1293     1295       +2     
==========================================
+ Hits          794      823      +29     
+ Misses        499      472      -27     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@github-actions
Copy link
Contributor

github-actions bot commented Oct 7, 2023

Here are the
demos-results

@geoffroyleconte
Copy link
Member

geoffroyleconte commented Oct 9, 2023

Isn't it

$$ \nu_k \le 1 / (L(x_k) + \alpha^{-1} \Delta_k^{-1}) $$

in the paper?
I've tried to launch the benchmark tables and I experienced instabilities with unconstrained FH on every random seed I've tried.

@dpo
Copy link
Member Author

dpo commented Oct 9, 2023

In principle, there is no reason to take a smaller step size than necessary.

What kind of instabilities?

@geoffroyleconte
Copy link
Member

Issues with DifferentialEquations.jl

@rjbaraldi
Copy link
Collaborator

This is probably because FH doesn't actually satisfy our problem assumptions. I recall now that my experience is also that DifferentialEquations.jl fails occasionally.

Copy link
Contributor

github-actions bot commented Nov 1, 2023

Here are the
demos-results

@dpo
Copy link
Member Author

dpo commented Nov 3, 2023

@geoffroyleconte @rjbaraldi What do you think of these numerical results?

src/TR_alg.jl Outdated Show resolved Hide resolved
@dpo dpo requested a review from geoffroyleconte November 6, 2023 13:19
Co-authored-by: geoffroyleconte <[email protected]>
Copy link
Contributor

github-actions bot commented Nov 6, 2023

Here are the
demos-results

@dpo
Copy link
Member Author

dpo commented Nov 8, 2023

@geoffroyleconte What do you think of these demo results?

@geoffroyleconte
Copy link
Member

I am wondering whether we should use
$$\nu_k = \frac{\alpha \Delta_k}{1 + \Vert B_k \Vert (1 + \alpha \Delta_k)}$$
or
$$\nu_k = \frac{1}{\alpha^{-1} \Delta_k^{-1} +\Vert B_k \Vert (1 + \alpha^{-1} \Delta_k^{-1})}$$
?
I observe different results with these 2 expressions. I would choose the 2nd because we set $\alpha$ to 1 / eps(), but I'm not 100% sure.
I both cases, this changes the benchmarks of the TRDH paper (I've created a branch on my fork with the current state of this repo to keep the results). Maybe we should also experiment more with the value of $\alpha$ in another PR (the old implementation used the parameter $\theta$ set to $10^{-3}$).

@dpo
Copy link
Member Author

dpo commented Nov 8, 2023

I observe different results with these 2 expressions. I would choose the 2nd because we set $\alpha$
to 1 / eps(), but I'm not 100% sure.

I agree that if $\alpha$ is that small, the second expression would be better.

The old and new formulae coincide if $\alpha = \theta^{-1} \Delta^{-1}$. If we assume that $\Delta$ remains $\Theta(1)$, that means $\alpha \approx \theta^{-1}$, i.e., $\alpha \approx 10^3$ in this case.

@geoffroyleconte
Copy link
Member

I observe different results with these 2 expressions. I would choose the 2nd because we set α
to 1 / eps(), but I'm not 100% sure.

I agree that if α is that small, the second expression would be better.

You mean $\alpha$ large?

The old and new formulae coincide if α=θ−1Δ−1. If we assume that Δ remains Θ(1), that means α≈θ−1, i.e., α≈103 in this case.

Yes, the old expression is

$$\nu_k = \frac{1}{\alpha^{-1} \Delta_k^{-1} + \Vert B_k \Vert(1 + \theta)}$$

But the first term of the denominator will change as well if we decrease $\alpha$ (maybe it is not really probablematic).

Should I commit the changes I suggested? And may be tune $\alpha$ differently in the benchmarks in another PR?

@dpo
Copy link
Member Author

dpo commented Nov 12, 2023

Yes, let's merge this if you think the results look reasonable. I would really like to introduce proper benchmarks in this repo so we can have a clear view of the performance without skimming through the demos.

src/TR_alg.jl Outdated Show resolved Hide resolved
src/TR_alg.jl Outdated Show resolved Hide resolved
src/TR_alg.jl Outdated Show resolved Hide resolved
src/TR_alg.jl Outdated Show resolved Hide resolved
dpo and others added 4 commits November 13, 2023 17:19
Co-authored-by: geoffroyleconte <[email protected]>
Co-authored-by: geoffroyleconte <[email protected]>
Co-authored-by: geoffroyleconte <[email protected]>
Co-authored-by: geoffroyleconte <[email protected]>
Copy link
Contributor

Here are the
demos-results

@geoffroyleconte
Copy link
Member

I have instabilities with FH, we would need to change the value of $\alpha$ and / or increase $\epsilon$ (but I can do it in another PR if you want).

@dpo dpo merged commit 7e53ad2 into master Feb 20, 2024
@dpo dpo deleted the tr-nu branch February 20, 2024 17:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants