Commit
add model based
Lightbridge-KS committed Sep 28, 2024
1 parent 823d761 commit 3ddeeac
Showing 3 changed files with 106 additions and 0 deletions.
15 changes: 15 additions & 0 deletions _freeze/contents/ml-model-based/execute-results/html.json
@@ -0,0 +1,15 @@
{
"hash": "b80d99b897693d5926839e3b73476605",
"result": {
"engine": "knitr",
"markdown": "# Model Performace Difference\n\n## One-Sample Proportion Test for Machine Learning Research\n\n### Scenario:\nSuppose you are evaluating the performance of a new machine learning classifier that predicts whether patients have a particular disease. Previous research using similar classifiers has shown an AUC (Area Under the Curve) of 0.90. You want to see if your new classifier performs significantly better or worse than this standard. To do so, you will use a one-sample proportion test to compare your classifier's AUC against the hypothesized value of 0.90.\n\n### Setting Up the One-Sample Proportion Test\nIn this context:\n\n- **Null Hypothesis ($H_0$)**: The AUC of the new classifier is equal to 0.90. \n$$\nH_0: p = 0.90\n$$\n\n- **Alternative Hypothesis ($H_a$)**: The AUC of the new classifier is not equal to 0.90 (it could be higher or lower). \n$$\nH_a: p \\neq 0.90\n$$\n\n- **Significance Level ($\\alpha$)**: 0.05 (5% chance of Type I error — rejecting$H_0$when it's true).\n \n- **Power ($1 - \\beta$)**: 0.80 (80% chance of correctly rejecting$H_0$when it’s false).\n\n- **Effect Size ($\\Delta$)**: You want to detect a difference of at least 0.05, meaning that you want to see if the new AUC is 0.95 or greater, or 0.85 or lower.\n\n- **Observed Proportion ($p$)**: This is the proportion you will calculate based on the model’s performance on a test set (e.g., by using a ROC curve to compute the AUC).\n\n\n### Conducting the One-Sample Proportion Test\nAfter determining the sample size, you can conduct the experiment and calculate the observed AUC of your classifier on the test set.\n\nFor example:\n\n- **Test Set Size**: 150 cases\n- **Observed AUC**: 0.92\n\nYou can then use the one-sample proportion test to check if this observed AUC of 0.92 is significantly different from 0.90.\n\n### Performing the Test in R (`pwr`)\n\nHere's how you could perform this test in R using the `pwr.p.test()` function:\n\n\n\n\n::: {.cell}\n\n```{.r 
.cell-code}\nlibrary(pwr)\n```\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\n# Define parameters\np0 <- 0.90 # Null hypothesis proportion\npa <- 0.95 # Alternative hypothesis proportion\neffect_size <- abs(pa - p0) # Effect size (absolute difference)\npower <- 0.80 # Desired power\nalpha <- 0.05 # Significance level\n\n# Calculate the sample size using pwr.p.test()\nsample_size_result <- pwr.p.test(h = ES.h(p0, pa), \n sig.level = alpha, \n power = power, \n alternative = \"two.sided\")\n\n# Print the result\nprint(sample_size_result)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n\n proportion power calculation for binomial distribution (arcsine transformation) \n\n h = 0.1924743\n n = 211.8659\n sig.level = 0.05\n power = 0.8\n alternative = two.sided\n```\n\n\n:::\n:::\n\n\n\n\n\n**Explanation of the parameters** in `pwr.p.test()`:\n\n- `h = ES.h(p0, pa)`: The effect size for proportion tests calculated using Cohen's h formula.\n- `sig.level`: The significance level (alpha).\n- `power`: The desired power of the test.\n- `alternative`: Specifies whether the test is \"two.sided\", \"greater\", or \"less\".\n- `n`: the calculated total sample size required to detect the specified effect size with the given power and significance level (this is a one-sample test, so there is only one group).\n\n\n\n\n### Use Case Summary\n\n| Parameter | Value |\n|------------------------------|----------------------------------|\n| Hypothesized AUC ($p_0$) | 0.90 |\n| Observed AUC | 0.92 |\n| Significance Level ($\\alpha$) | 0.05 |\n| Power ($1 - \\beta$) | 0.80 |\n| Test Set Size | 150 |\n| Null Hypothesis ($H_0$) | AUC = 0.90 |\n| Alternative Hypothesis ($H_a$) | AUC ≠ 0.90 |\n\nIf the one-sample proportion test shows a statistically significant result, you can confidently say that the AUC of the new classifier is different from 0.90 and assess whether the new model performs better or worse than the previous standard.",
"supporting": [],
"filters": [
"rmarkdown/pagebreak.lua"
],
"includes": {},
"engineDependencies": {},
"preserve": {},
"postProcess": true
}
}
1 change: 1 addition & 0 deletions _quarto.yml
@@ -18,6 +18,7 @@ book:
- part: "ML Studies"
chapters:
- "contents/ml.qmd"
- "contents/ml-model-based.qmd"
- references.qmd

bibliography: assets/ref.bib
90 changes: 90 additions & 0 deletions contents/ml-model-based.qmd
@@ -0,0 +1,90 @@
# Model Performance Difference

## One-Sample Proportion Test for Machine Learning Research

### Scenario:
Suppose you are evaluating the performance of a new machine learning classifier that predicts whether patients have a particular disease. Previous research using similar classifiers has shown an AUC (Area Under the Curve) of 0.90. You want to see if your new classifier performs significantly better or worse than this standard. To do so, you will use a one-sample proportion test to compare your classifier's AUC against the hypothesized value of 0.90. (Strictly speaking, an AUC is not a binomial proportion, so treating it as one is an approximation; it does, however, allow a quick and transparent power analysis.)

### Setting Up the One-Sample Proportion Test
In this context:

- **Null Hypothesis ($H_0$)**: The AUC of the new classifier is equal to 0.90.
$$
H_0: p = 0.90
$$

- **Alternative Hypothesis ($H_a$)**: The AUC of the new classifier is not equal to 0.90 (it could be higher or lower).
$$
H_a: p \neq 0.90
$$

- **Significance Level ($\alpha$)**: 0.05 (a 5% chance of a Type I error, i.e., rejecting $H_0$ when it is true).

- **Power ($1 - \beta$)**: 0.80 (an 80% chance of correctly rejecting $H_0$ when it is false).

- **Effect Size ($\Delta$)**: You want to detect a difference of at least 0.05, meaning that you want to see if the new AUC is 0.95 or greater, or 0.85 or lower.

- **Observed Proportion ($p$)**: This is the proportion you will calculate based on the model’s performance on a test set (e.g., by using a ROC curve to compute the AUC).
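
The `pwr` package expresses the gap between two proportions as Cohen's h, an arcsine-transformed difference. As an illustrative sketch outside the chapter's R workflow, the transformation behind `ES.h()` can be reproduced in Python with only the standard library (the function name `cohens_h` is ours, not part of any package):

```python
import math

def cohens_h(p0: float, pa: float) -> float:
    """Cohen's h: absolute difference of arcsine-transformed proportions."""
    return abs(2 * math.asin(math.sqrt(pa)) - 2 * math.asin(math.sqrt(p0)))

# Effect size for detecting AUC 0.95 against the hypothesized 0.90
print(round(cohens_h(0.90, 0.95), 4))  # ~0.1925, matching ES.h(0.90, 0.95)
```

The transformation stabilizes the variance of a proportion, so the same h means roughly the same detectability regardless of where the proportions sit between 0 and 1.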


### Conducting the One-Sample Proportion Test
After determining the sample size, you can conduct the experiment and calculate the observed AUC of your classifier on the test set.

For example:

- **Test Set Size**: 150 cases
- **Observed AUC**: 0.92

You can then use the one-sample proportion test to check if this observed AUC of 0.92 is significantly different from 0.90.
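
As an illustrative sketch (not the chapter's R code), the test itself can be approximated with an exact binomial test by treating the AUC of 0.92 over 150 cases as 138 "successes" out of 150. That count interpretation is a simplifying assumption of this sketch, since an AUC is not literally a proportion of correctly classified cases. In Python with `scipy`:

```python
from scipy.stats import binomtest

# Simplifying assumption: treat AUC 0.92 on 150 cases as 138/150 "successes"
successes = round(0.92 * 150)  # 138
result = binomtest(successes, n=150, p=0.90, alternative="two-sided")

print(f"observed proportion: {successes / 150:.2f}")
print(f"p-value: {result.pvalue:.3f}")
# A p-value above 0.05 means 0.92 is not distinguishable from 0.90 at n = 150
```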

### Performing the Test in R (`pwr`)

Here's how you could perform this test in R using the `pwr.p.test()` function:

```{r}
library(pwr)
```

```{r}
# Define parameters
p0 <- 0.90     # Proportion under the null hypothesis
pa <- 0.95     # Proportion under the alternative hypothesis
power <- 0.80  # Desired power
alpha <- 0.05  # Significance level

# Calculate the required sample size using pwr.p.test();
# ES.h() converts the two proportions to Cohen's h effect size
sample_size_result <- pwr.p.test(h = ES.h(p0, pa),
                                 sig.level = alpha,
                                 power = power,
                                 alternative = "two.sided")

# Print the result
print(sample_size_result)
```


**Explanation of the parameters** in `pwr.p.test()`:

- `h = ES.h(p0, pa)`: Cohen's h effect size for the difference between two proportions, $h = 2\arcsin\sqrt{p_a} - 2\arcsin\sqrt{p_0}$.
- `sig.level`: The significance level (alpha).
- `power`: The desired power of the test.
- `alternative`: Specifies whether the test is "two.sided", "greater", or "less".
- `n`: the calculated total sample size required to detect the specified effect size with the given power and significance level (this is a one-sample test, so there is only one group).
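
The normal-approximation formula underlying this calculation, $n = \left(\frac{z_{1-\alpha/2} + z_{1-\beta}}{h}\right)^2$, can be checked by hand. A sketch in Python (assuming `scipy` for the normal quantiles; this mirrors, but is not, the `pwr` implementation):

```python
import math
from scipy.stats import norm

p0, pa, alpha, power = 0.90, 0.95, 0.05, 0.80

# Cohen's h via the arcsine transformation (as in pwr's ES.h)
h = abs(2 * math.asin(math.sqrt(pa)) - 2 * math.asin(math.sqrt(p0)))

# n = ((z_{1-alpha/2} + z_{1-beta}) / h)^2 for a two-sided one-sample test
z_alpha = norm.ppf(1 - alpha / 2)
z_beta = norm.ppf(power)
n = ((z_alpha + z_beta) / h) ** 2

print(f"h = {h:.7f}, n = {n:.4f}")  # matches pwr.p.test: h ≈ 0.1924743, n ≈ 211.87
```

Note that roughly 212 cases are required, while the planned test set has only 150, so the study as described would be underpowered for detecting a difference of 0.05.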




### Use Case Summary

| Parameter | Value |
|------------------------------|----------------------------------|
| Hypothesized AUC ($p_0$) | 0.90 |
| Observed AUC | 0.92 |
| Significance Level ($\alpha$) | 0.05 |
| Power ($1 - \beta$) | 0.80 |
| Test Set Size | 150 |
| Null Hypothesis ($H_0$) | AUC = 0.90 |
| Alternative Hypothesis ($H_a$) | AUC ≠ 0.90 |

If the one-sample proportion test yields a statistically significant result, you can conclude that the AUC of the new classifier differs from 0.90 and then judge whether the new model performs better or worse than the previous standard.
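
Conversely, one can ask how much power the 150-case test set actually provides against the same effect size. A sketch in Python, using the standard approximation $1-\beta \approx \Phi\!\left(h\sqrt{n} - z_{1-\alpha/2}\right)$ for the dominant tail of a two-sided test:

```python
import math
from scipy.stats import norm

# Cohen's h for 0.95 vs. the hypothesized 0.90
h = abs(2 * math.asin(math.sqrt(0.95)) - 2 * math.asin(math.sqrt(0.90)))
n, alpha = 150, 0.05

# Dominant term of two-sided power; the opposite-tail term is negligible here
achieved_power = norm.cdf(h * math.sqrt(n) - norm.ppf(1 - alpha / 2))

print(f"achieved power with n = 150: {achieved_power:.3f}")  # roughly 0.65
```

With only about 65% power, a non-significant result on this test set would be weak evidence that the classifier merely matches the 0.90 standard.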
