Cloud spring cleaning 2024 04 30 #97

Merged May 2, 2024 · 21 commits
Changes from 9 commits
4 changes: 2 additions & 2 deletions README.md
@@ -27,8 +27,8 @@ Here are some of the most-visited sections:
- [How to get started as a Code4rena warden](roles/wardens#joining-an-audit)
- [Submission policy](roles/wardens/submission-policy.md) and [reporting guidelines](roles/wardens/submission-guidelines.md)
- [Becoming Certified (KYC’d): benefits and process](roles/certified-contributors)
- [+Backstage warden role: overview, criteria and process](roles/certified-contributors/backstage-wardens.md)
- [Lookout role: overview, criteria and process](roles/certified-contributors/lookouts.md)
- [SR role: overview, criteria and process](roles/certified-contributors/sr-backstage-wardens.md)
- [Validator role: overview, criteria and process](roles/certified-contributors/validators.md)
- [Scout role: overview and selection process](roles/certified-contributors/scouts.md)
- Awarding [model](awarding/incentive-model-and-awards) and [process](awarding/incentive-model-and-awards/awarding-process.md)
- [Judging criteria](awarding/judging-criteria) and [severity categorization](awarding/judging-criteria/severity-categorization.md)
104 changes: 64 additions & 40 deletions awarding/incentive-model-and-awards/README.md
@@ -1,6 +1,6 @@
# Incentive model and awards

To incentivize **wardens**, C4 uses a unique scoring system with two primary goals: reward contestants for finding unique bugs and also to make the audit resistant to Sybil attack. A secondary goal of the scoring system is to encourage contestants to form teams and collaborate.
To incentivize **wardens**, C4 uses a unique scoring system with two primary goals: to reward participants for finding unique bugs, and to make the audit resistant to Sybil attacks. A secondary goal of the scoring system is to encourage participants to form teams and collaborate.

**Judges** are incentivized to review findings and decide their severity, validity, and quality by receiving a share of the prize pool themselves.

@@ -11,7 +11,7 @@ To incentivize **wardens**, C4 uses a unique scoring system with two primary goals:

## High and Medium Risk bugs

Contestants are given shares for bugs discovered based on severity, and those shares give the owner a pro rata piece of the pie:
Wardens are given shares for bugs discovered based on severity, and those shares give the owner a pro rata piece of the pie:

`Med Risk Slice: 3 * (0.9 ^ (split - 1)) / split`\
`High Risk Slice: 10 * (0.9 ^ (split - 1)) / split`
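
To make the math concrete, here is a minimal sketch of those formulas (illustrative only, not Code4rena's actual awarding code); `split` is the number of wardens credited with the finding:

```python
def med_risk_slice(split: int) -> float:
    # Per-warden shares for a Medium-risk finding with `split` duplicates
    return 3 * (0.9 ** (split - 1)) / split

def high_risk_slice(split: int) -> float:
    # Per-warden shares for a High-risk finding with `split` duplicates
    return 10 * (0.9 ** (split - 1)) / split

# A High shared by 3 wardens: 10 * 0.9^2 / 3 = 2.7 shares per warden
print(round(high_risk_slice(3), 2))  # 2.7
```

The `0.9 ^ (split - 1)` factor shrinks the pie as more wardens share a finding, which is part of what makes Sybil-style duplicate submissions unprofitable.
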
@@ -40,6 +40,14 @@ The resulting awards are:
| 'Warden B' | 'H-02' | '3' | 8.91 | 3 | 2.70 | 1000 |
| 'Warden C' | 'H-02' | '3' | 8.91 | 3 | 2.70 | 1000 |

### Bonuses for top competitors
For audits starting on or after April 30, 2024, there are two bonuses for top-performing wardens:

1. **Hunter bonus:** 10% of the HM pool will be awarded to the warden or team who identifies the greatest number of unique HMs.
2. **Gatherer bonus:** 10% of the HM pool will be awarded to the warden or team who identifies the greatest number of valid HMs.

Both bonuses weigh Highs more heavily than Mediums, similarly to Code4rena's standard awarding mechanism.
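
As a rough illustration of the ranking only: the exact weighting is not specified here, so this hypothetical sketch assumes the 10:3 High-to-Medium ratio from the slice formulas above.

```python
HIGH_WEIGHT, MED_WEIGHT = 10, 3  # assumed weights, mirroring the slice bases

def weighted_hm_count(highs: int, mediums: int) -> int:
    # Hypothetical weighted count for ranking wardens/teams for the bonuses
    return HIGH_WEIGHT * highs + MED_WEIGHT * mediums

# 2 Highs + 5 Mediums (35) would outrank 1 High + 8 Mediums (34)
print(weighted_hm_count(2, 5), weighted_hm_count(1, 8))
```
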

### Duplicates getting partial credit

All issues which identify the same functional vulnerability will be considered duplicates regardless of effective rationalization of severity or exploit path.
@@ -103,66 +111,46 @@ We can see here that the logic behind the `partial-` labels only impacts the award

Only the award amounts for "partial" findings have been reduced, in line with expectations. The aim of this adjustment is to recalibrate the rewards allocated for these specific findings. Meanwhile, the awards for full-credit findings remain unchanged.
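
As an illustration, assuming the `partial-25`/`partial-50`/`partial-75` label convention, the adjustment can be sketched as a simple scaling of a duplicate's shares:

```python
# Assumed partial-credit multipliers (sketch only)
PARTIAL_CREDIT = {"partial-25": 0.25, "partial-50": 0.50, "partial-75": 0.75}

def adjusted_shares(base_shares: float, label: str = "") -> float:
    # Full-credit duplicates keep their shares; partial ones are scaled down
    return base_shares * PARTIAL_CREDIT.get(label, 1.0)

print(adjusted_shares(2.70))                # 2.7  (full credit, unchanged)
print(adjusted_shares(2.70, "partial-50"))  # 1.35 (half credit)
```
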

## Bot races

The first hour of each Code4rena audit is devoted to a bot race, to incentivize high quality automated findings as the first wave of the audit.

- The winning bot report is selected and shared with all wardens within 24 hours of the audit start time.
- The full set of issues identified by the best automated tools are considered out of scope for the audit and ineligible for awards.

Doing this eliminates the enormous overlapping effort of all wardens needing to document common low-hanging issues. And because the best bot report is shared with auditors at the start of the audit, these findings serve as a thorough starting place for understanding the codebase and where weaknesses may exist.

**Ultimately, the bot race ensures human auditors are focused on things humans can do.**

By designating a portion of the pool in this direction, Code4rena creates a separate lane for the significant investment of effort that many auditors already make in automated tooling -- and rather than awarding 100 people for identifying the same issue, we award the best automated tools.

## Analyses

Each warden is encouraged to submit an Analysis alongside their findings for each audit, to share high-level advice and insights from their review of the code.

Where individual findings are the "trees" in an audit, the Analysis is a "forest"-level view.
### Validator-improved submissions

Advanced-level Analyses compete for a portion of each audit's award pool, and are graded and awarded similarly to QA and Gas Optimization reports.
[Validators](https://docs.code4rena.com/roles/certified-contributors/validators.md) may enhance submissions (add PoC, increase quality of report, etc.) in exchange for a % of the finding’s payout.

For Validator-improved submissions: if the judge believes the Validator added a measurable enhancement, the Validator receives a cut of the issue's value, as sketched after this list:
- 25% cut → small enhancement
- 50% cut → med enhancement
- 75% cut → large enhancement
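
A minimal sketch of that split, using the cut sizes above (illustrative only):

```python
VALIDATOR_CUTS = {"small": 0.25, "medium": 0.50, "large": 0.75}

def split_payout(issue_value: float, enhancement: str) -> tuple[float, float]:
    # Returns (validator_share, warden_share) for an enhanced submission
    cut = VALIDATOR_CUTS[enhancement]
    return issue_value * cut, issue_value * (1 - cut)

# A $1,000 issue with a medium enhancement pays $500 to each party
print(split_payout(1000.0, "medium"))  # (500.0, 500.0)
```
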

## QA and Gas Optimization Reports

In order to incentivize wardens to focus efforts on high and medium severity findings while also ensuring quality coverage, the pool’s allocation is capped for low severity, non-critical, and gas optimization findings.
In order to incentivize wardens to focus efforts on high and medium severity findings while also ensuring quality coverage, the pool’s allocation is capped for low severity, governance, and gas optimization findings.

Low and non-critical findings are submitted as a **single** QA report. Similarly, gas optimizations are submitted as a single gas report. For more on reports, see [Judging criteria](/awarding/judging-criteria/README.md).
Low and governance findings are submitted as a **single** QA report. Similarly, gas optimizations are submitted as a single gas report. For more on reports, see [Judging criteria](/awarding/judging-criteria/README.md).

QA and gas optimization reports are awarded on a curve based on the judge’s score.

- QA reports compete for a share of 2.5% of the prize pool (e.g. $1,250 for a $50,000 audit);
- The gas optimization pool varies from audit to audit, but is typically 2.5% of the total prize pool (e.g. $1,250 for a $50,000 audit);
- QA and Gas optimization reports are scored by judges using A/B/C grades (with C = unsatisfactory), and awarded on a curve.
- QA reports compete for a share of 4% of the prize pool (e.g. $2,000 for a $50,000 audit);
- The gas optimization pool varies from audit to audit;
- QA and Gas optimization reports are awarded on a curve.

QA and gas optimization reports are held to a very high bar of quality and value. Only submissions that demonstrate full effort, worthy of consideration for inclusion in the audit report, will be eligible for rewards.

It is highly recommended to clearly spell out the impact of proposed gas optimizations.

Historically, Code4rena valued non-critical findings at 0; the intent of the QA report is not to increase the value of non-criticals, but rather to allow them to be consolidated in reports alongside low severity issues.

**Note:** Audits pre-dating February 3, 2022 awarded low risk and gas optimization shares as: `Low Risk Shares: 1 * (0.9 ^ (findingCount - 1)) / findingCount`
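
For comparison with the current slice formulas, a quick sketch of that historical formula:

```python
def low_risk_shares(finding_count: int) -> float:
    # Pre-February 3, 2022: per-warden shares for a shared low-risk finding
    return 1 * (0.9 ** (finding_count - 1)) / finding_count

# 5 wardens reporting the same low-risk issue: ~0.13 shares each
print(round(low_risk_shares(5), 2))  # 0.13
```
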

In the unlikely event that zero high- or medium-risk vulnerabilities are found, the HM award pool will be divided based on the QA Report curve.

## Grades for Analyses, QA and Gas reports
### Ranks for QA and Gas reports

Analyses, QA reports and Gas reports are graded A, B, or C.
_These guidelines apply to all audits starting on or after April 30, 2024._

C scores are unsatisfactory and ineligible for awards.
After post-judging QA is complete, the Judge and Validators vote to select the top 3 QA reports and Gas reports. (In the case of a tie vote, there may be a 4th place report.)

All A-grade reports receive a score of 2; All B-grade reports get a 1. Awarding for QA and Gas reports is on a curve that's described [here](https://docs.code4rena.com/awarding/incentive-model-and-awards/curve-logic).
The 1st, 2nd, and 3rd place winners are awarded using a curve model that will be documented here ASAP.

### Bonus for best / selected for report
Judges choose the best report in each category (Analysis, QA report, and Gas report), each of which earns the same 30% share bonus described under "High and Medium Risk bugs."
Satisfactory reports not among the winning reports will not be awarded -- but will count towards wardens' accuracy scores.

**Note:** if the `selected for report` submission has a B-grade label, it will still be treated as A-grade and given proportionally more than B-grade, plus the 30% bonus for being `selected for report`.
In the unlikely event that zero high- or medium-risk vulnerabilities are found, the HM award pool will be divided based on the QA Report curve, unless otherwise stated in the audit repo.

## Satisfactory / unsatisfactory submissions

Any submissions deemed unsatisfactory are ineligible for awards.
Any submissions deemed unsatisfactory are ineligible for awards, and count against wardens' accuracy scores.

The bar for satisfactory submissions is that they are roughly at a level that could be found in a draft report by a professional auditor: specifically on the merits of technical substance, with writing quality considered only where it interferes with comprehension of the technical message.

@@ -176,3 +164,39 @@ It is possible for a submission to be *technically* valid and still unsatisfactory
- approach is disrespectful of sponsors’ and judges’ time in some way

Any submissions that appear to be direct copies of other reports in the current audit will be collectively deemed unsatisfactory.

## Other submission types

As of April 30, 2024, the following submission types are paused:

### Bot reports

The first hour of each Code4rena audit is devoted to a bot race, to incentivize high quality automated findings as the first wave of the audit.

- The winning bot report is selected and shared with all wardens within 24 hours of the audit start time.
- The full set of issues identified by the best automated tools are considered out of scope for the audit and ineligible for awards.

Doing this eliminates the enormous overlapping effort of all wardens needing to document common low-hanging issues. And because the best bot report is shared with auditors at the start of the audit, these findings serve as a thorough starting place for understanding the codebase and where weaknesses may exist.

**Ultimately, the bot race ensures human auditors are focused on things humans can do.**

By designating a portion of the pool in this direction, Code4rena creates a separate lane for the significant investment of effort that many auditors already make in automated tooling -- and rather than awarding 100 people for identifying the same issue, we award the best automated tools.

### Analyses

Analyses share high-level advice and insights from wardens' review of the code.

Where individual findings are the "trees" in an audit, the Analysis is a "forest"-level view.

Analyses compete for a portion of each audit's award pool, and are graded and awarded similarly to QA and Gas Optimization reports.

### Understanding historical grading for QA, Gas, and Analysis reports

For audits that started before April 30, 2024:

- Analyses, QA reports, and Gas reports in this period were graded A, B, or C.
- C scores were unsatisfactory and ineligible for awards.
- All A-grade reports received a score of 2; all B-grade reports received a 1. Awarding for QA and Gas reports followed the curve described [here](https://docs.code4rena.com/awarding/incentive-model-and-awards/curve-logic).
- Judges chose the best report in each category (Analysis, QA report, and Gas report), each of which earned the same 30% share bonus described under "High and Medium Risk bugs."

**Note:** if a `selected for report` submission had a B-grade label, it was still treated as A-grade and given proportionally more than B-grade, plus the 30% bonus for being `selected for report`.
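
A sketch of that historical scoring, for illustration only (actual awards followed the linked curve, and the 30%-bonus treatment here is an assumption about how the bonus composed with the grade score):

```python
def report_score(grade: str, selected_for_report: bool = False) -> float:
    if selected_for_report and grade == "B":
        grade = "A"  # B-grade `selected for report` was treated as A-grade
    base = {"A": 2.0, "B": 1.0, "C": 0.0}[grade]  # C = unsatisfactory
    return base * (1.3 if selected_for_report else 1.0)  # 30% report bonus

print(report_score("A"))                            # 2.0
print(report_score("B", selected_for_report=True))  # 2.6
```
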
6 changes: 3 additions & 3 deletions awarding/incentive-model-and-awards/awarding-process.md
Original file line number Diff line number Diff line change
@@ -5,7 +5,7 @@ description: >-

# Awarding process

At the conclusion of an audit, sponsors review wardens’ findings and express their opinions with regard to severity of issues. Judges evaluate input from both and make the ultimate decision in terms of severity and validity of issues. (See [How to judge an audit](../../roles/judges/how-to-judge-a-contest.md) for more detail.)
At the conclusion of an audit, sponsors review wardens’ findings and express their opinions with regard to severity of issues. Judges evaluate input from both and make the ultimate decision in terms of severity and validity of issues. (See [How to judge an audit](https://docs.code4rena.com/roles/judges/how-to-judge-a-contest.md) for more detail.)

In making their determination, judges add labels to Github issues, while the original submission data (including the warden's proposed severity rating) is preserved via a JSON data file.

@@ -16,7 +16,7 @@ Judging data is used to generate the awards using Code4rena's award calculation
- Risk level
- Validity
- Number of duplicates
- Grade (A, B, C; Satisfactory/Unsatisfactory)
- Rank (1st, 2nd, 3rd; Satisfactory/Unsatisfactory)
- In some cases, "partial duplicate" status

It should be possible to reverse engineer awards using a combination of two CSV files:
@@ -40,7 +40,7 @@ If you still don’t see the award in your wallet, please [open a help desk ticket]

We are occasionally asked how wardens should declare Code4rena earnings for tax (or other financial/legal) purposes. Due to the nascent nature of DAOs, we are unable to provide reliable information in this area. You must assess and determine your own best course of action.

Audit contest rewards are distributed by the DAO, which does not have a legal personality.
Audit rewards are distributed by the DAO, which does not have a legal personality.

The DAO has designated Code4rena Foundation as its agent via [a governance action](https://github.com/code-423n4/org/discussions/13) [approved by DAO members](https://polygonscan.com/tx/0x8fbe178e34a7ae03a5e0d1f49f23e38f3a1c0d1186a67920d33196a89f79da98) for purposes of entering into contractual agreements. However, wardens are not in any contractual agreement with the Foundation [unless they are certified](https://code4rena.com/certified-contributor-summary/).

18 changes: 7 additions & 11 deletions awarding/incentive-model-and-awards/qa-gas-report-faq.md
@@ -1,32 +1,28 @@
# FAQ about QA and Gas Reports

This FAQ pertains to the award mechanism update that takes effect February 3, 2022, which changes the submission guidelines for low-risk, non-critical, and gas optimization reports. For more details, see [Judging Criteria](https://docs.code4rena.com/roles/wardens/judging-criteria).
This FAQ pertains to the award mechanism update that takes effect April 30, 2024, which changes the submission guidelines for low-risk, non-critical, and gas optimization reports. For more details, see [Judging Criteria](https://docs.code4rena.com/roles/wardens/judging-criteria).

### What happens to the award pool if no Med/High vulns are found?

Unless otherwise stipulated in the audit repo, the full pool would then be divided based on the QA Report curve.

### Will non-critical findings hold some weight? Just want to know if it's worth spending a considerable amount of time writing this part of the report.
### Can I still include non-critical findings in my QA report?

The full QA report will be graded on a curve against the other reports. We'll be experimenting together as a community with this, but we think we'll learn a lot and it will be interesting to see the best practices emerge.
Non-critical findings are discouraged for QA reports.

We are intentionally not providing an "example," as we are eager to see what approaches folks take and to be able to learn from a variety of approaches.

### What if a low-impact QA report turns out to be a high-impact report? How does that work with the 10% prize pool? Would the report be upgraded?
### What if a low-impact QA report turns out to be a high-impact report? Would the report be upgraded?

It's conceivable it could be upgraded, though it's important to consider that part of auditing is demonstrating proper theory of how an issue could be exploited. If a warden notices something is "off" but is unable to articulate why it could lead to loss of funds, for example, the job is only half-done; without understanding the implications, a developer could very well overlook or deprioritize the issue.

The tl;dr for determining severity is relatively clear with regard to separating by impact.
The tl;dr for [determining severity](https://docs.code4rena.com/awarding/judging-criteria/severity-categorization.md) is relatively clear with regard to separating by impact.

### What happens when an issue submitted by the warden as part of their QA report (an L or N) *DOES* get bumped up to Med/High by the judge after review?
### What happens when an issue submitted by the warden as part of their QA report (an L or C) *DOES* get bumped up to Med/High by the judge after review?

If it seemed appropriate to do so based on a judge's assessment of the issue, they could certainly choose to do this.

The judge could create a new separate Github issue in the findings repo that contains the relevant portions of the warden's QA report, and add that to the respective H or M level bucket.

However, QA items may be marked as a duplicate of another finding *without* being granted an upgrade, since making the case for *how* an issue can be exploited, and providing a thorough description and proof of concept, is part of what merits a finding properly earning medium or high severity.

### Conversely, if an issue submitted by wardens as H/M level is subsequently downgraded to QA level by the judge during their review, would the penalty just be excluding the overrated submission from consideration with regard to the QA rewards?

We'll need to see how it works in reality, but our current assumption is that (a) low severity findings attempted to get pushed into med/high would essentially get zero (just logically so since they wouldn't be high or med), and then (b) their QA report would be lower quality as a result, and so they wouldn't score as highly as they could have. Judges could also decide to mark off points in someone's QA report if they saw behavior that seemed like it might be trying to game for higher rewards by inflating severity, so it could have a negative consequence as well.
In theory, findings downgraded to QA are grouped together with the warden's QA report (if one exists). In practice, however, we have found that downgraded issues do not have a significant impact on wardens' overall QA score. Judges can also decide to mark off points in someone's QA report if they see behavior that seems like an attempt to game for higher rewards by inflating severity, so it can have a negative consequence as well.
