Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Protection against malicious recovery spells is not sufficient in case of multiple recovery spells #14

Closed
howlbot-integration bot opened this issue Oct 27, 2024 · 8 comments
Labels
bug Something isn't working downgraded by judge Judge downgraded the risk level of this issue grade-b nullified Issue is high quality, but not accepted primary issue Highest quality submission among a set of duplicates QA (Quality Assurance) Assets are not at risk. State handling, function incorrect as to spec, issues with clarity, syntax 🤖_38_group AI based duplicate group recommendation sponsor disputed Sponsor cannot duplicate the issue, or otherwise disagrees this is an issue sufficient quality report This report is of sufficient quality

Comments

@howlbot-integration
Copy link

Lines of code

https://github.com/code-423n4/2024-10-kleidi/blob/ab89bcb443249e1524496b694ddb19e298dca799/src/RecoverySpellFactory.sol#L44-L78
https://github.com/code-423n4/2024-10-kleidi/blob/ab89bcb443249e1524496b694ddb19e298dca799/src/RecoverySpell.sol#L165-L317
https://github.com/code-423n4/2024-10-kleidi/blob/ab89bcb443249e1524496b694ddb19e298dca799/src/Timelock.sol#L687-L700

Vulnerability details

Vulnerability Details

Impact

To protect against malicious recovery spells, the documentation states that it is important to have a longer time delay on the recovery spells than the timelock. This way the safe can remove the recovery spell with a proposal before it becomes executable.

However, if the protocol had to be protected by the pause guardian because the safe keys were compromised, a malicious recovery spell will be executable before it can be removed.

The issue is not possible when there is only one recovery spell, but the protocol clearly aims to set up multiple recovery spells and by restricting oneself to one recovery spell there can be times when cold signer key loss can not be recovered.
So the issue is clearly present in live deployments and can not be explained away by the system configuration.

Proof of Concept

Consider the following scenario:

timelockDelay = 5 days
pauseDuration = 15 days
RecoverySpellA delay = 10 days
malicious RecoverySpellM delay = 10 days
  1. The cold signers are compromised.

  2. RecoverySpellA gets created to rotate the signing keys.

  3. The pause guardian calls pause() to cancel the malicious proposals the compromised cold signers scheduled.

  4. One day later RecoverySpellM gets created as well.

  5. The contract is paused for 15 days which means that, among other things, it is not possible to schedule proposals.

  6. 10 days later RecoverySpellA gets executed to rotate the signing keys.

  7. One more day later, the malicious recovery spell will be executed and the safe will be lost, as there is no protection by the safe keys.

Recommended Mitigation Steps

While the contest documentation claims that the wallet can be protected against malicious recovery spells, the finding shows that this protection is limited.

It is not possible to make minor changes in such a way that the intended protection is achieved.

Instead, the protocol design as a whole needs to be reconsidered or at least the safety guarantees that the protocol aims for must be lowered.

Assessed type

Context

@howlbot-integration howlbot-integration bot added 2 (Med Risk) Assets not at direct risk, but function/availability of the protocol could be impacted or leak value 🤖_38_group AI based duplicate group recommendation bug Something isn't working primary issue Highest quality submission among a set of duplicates sufficient quality report This report is of sufficient quality labels Oct 27, 2024
howlbot-integration bot added a commit that referenced this issue Oct 27, 2024
@ElliotFriedman
Copy link

Already covered in the guardian issues section, we expect users to only have a single recovery key https://github.com/code-423n4/2024-10-kleidi/blob/main/docs/EDGECASES.md#guardian

@ElliotFriedman ElliotFriedman added the sponsor disputed Sponsor cannot duplicate the issue, or otherwise disagrees this is an issue label Oct 29, 2024
@GalloDaSballo
Copy link

I believe I also had flagged the issue around non-deterministic behaviour when more than one spell is armed
As you could either go from A -> B -> C or from A -> C -> B

This seems to be a known issue

@c4-judge
Copy link

GalloDaSballo marked the issue as nullified

@c4-judge c4-judge added nullified Issue is high quality, but not accepted downgraded by judge Judge downgraded the risk level of this issue QA (Quality Assurance) Assets are not at risk. State handling, function incorrect as to spec, issues with clarity, syntax and removed 2 (Med Risk) Assets not at direct risk, but function/availability of the protocol could be impacted or leak value labels Oct 30, 2024
@c4-judge
Copy link

GalloDaSballo changed the severity to QA (Quality Assurance)

@GalloDaSballo
Copy link

Having a hard time finding my comment about different order of operations

Downgrading to QA due to that

@HollaDieWaldfee100
Copy link

HollaDieWaldfee100 commented Nov 5, 2024

The sponsor has commented that the scenario described in this finding is already covered in the guardian issues section.
But the section that the sponsor has linked does not contain this sentence or a similar statement:

we expect users to only have a single recovery key

We have provided evidence in different submissions how the safety mechanisms of the protocol can break. See this finding but also:

Saying for every malicious scenario, there is a safe configuration vs saying there is a safe configuration that protects against all malicious scenarios are two different statements.

I agree with the first statement that for every malicious scenario, there is a safe configuration, but there is no overall safe configuration, and this is the point of our issues.

We tried to separate the different concerns which we believe point to distinct weaknesses.

With these high level thoughts out of the way, let's go back to what the sponsor said:

we expect users to only have a single recovery key

Again, this is not stated in the documentation and in fact there is reference to multiple recovery spells all over the documentation.
Quoting from just the page that the sponsor linked:

Recovery spells cannot be deployed unless a safe is deployed first

This is why it is important to have a longer time delay on the recovery spells than on the timelock.

It is recommended to have the timelock delay be shorter than the recovery spell delay so that the timelock can cancel malicious recovery spells.

Saying one timelock can cancel malicious recovery spells (-> plural!) means the system is not restricted to one recovery spell.

The scenario described in the report is then straightforward, respects all the rules with which the system should be set up, and shows that there is insufficient protection against a malicious recovery spell.
Also note that we just need one malicious recovery spell since the one malicious one can wait for a non-malicious recovery to start.

You introduced another argument why this issue should be considered known:

I believe I also had flagged the issue around non-deterministic behaviour when more than one spell is armed
As you could either go from A -> B -> C or from A -> C -> B

Non-deterministic behavior is different from the system getting compromised. Non-deterministic behavior just means we can't know which signers the Safe will end up with after recovery but not that the system can just easily be compromised. I don't think this statement alone is sufficient to discount the issue and point of view introduced in this finding.

Overall I believe this issue is valid because

  1. it is not a known issue
  2. assets can be compromised (i will agree that we have conditions that need to be satisfied so it is an M rather than H)

I will also agree it is hard to find a solution for this problem (see recommendation section above) but this doesn't reduce the validity of the problem. Maybe it just means the system must be advertised as less safe than expected.

@GalloDaSballo
Copy link

As discussed above and discussed in the EDGECASES:

## Malicious Recovery Spell Scenario
If the time delay on a recovery spell is shorter than the timelock, the recovery spell can kick the safe owners without the current safe owners being able to veto this change. This is why it is important to have a longer time delay on the recovery spells than on the timelock.

## Malicious Safe Signer Takeover Scenario
Conversely, if the Safe keys are compromised, and the recovery spell time delay is longer than the timelock, then the attacker can rotate the keys on the Safe through a timelocked transaction and remove the recovery spell as a module. However, if the guardian is set, and the pause duration is longer than the recovery spell period, the guardian can pause the timelock, cancelling all malicious proposals, which stops them from being executed, and then the recovery spell can execute and rotate signing keys.

The scenario listed would require both signers and malicious spell being compromised

If a spell and the signers are compromised, the system has been taken over and there is no recovery scenario covered by the system

Adding more spells doesn't allow to shrink the sequence to these 2 core pieces
Meaning that this case is covered by the README

It's worth noting that in this scenario, if a bening and malicious recovery spell where present, the non deterministic behaviour of spells may actually play in favour of the benign side, since as long as they execute the spell last they will regain ownership of the Safe

However all of these discussions fall into cases that can be reduced to known

For this reason I believe it's most appropriate to maintain the severity as QA

@HollaDieWaldfee100
Copy link

The scenario listed would require both signers and malicious spell being compromised

Okay, if this makes the finding fall under known edge cases I agree this is invalid. Thanks for checking.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working downgraded by judge Judge downgraded the risk level of this issue grade-b nullified Issue is high quality, but not accepted primary issue Highest quality submission among a set of duplicates QA (Quality Assurance) Assets are not at risk. State handling, function incorrect as to spec, issues with clarity, syntax 🤖_38_group AI based duplicate group recommendation sponsor disputed Sponsor cannot duplicate the issue, or otherwise disagrees this is an issue sufficient quality report This report is of sufficient quality
Projects
None yet
Development

No branches or pull requests

5 participants