konung-yaropolk/AutoStatLib

AutoStatLib - Python library for automated statistical analysis

To install, run the command:

pip install autostatlib

Example use case:

See the /demo directory in the Git repository, or use the following example:

import numpy as np
import AutoStatLib

# generate random data:
groups = 2
n = 30

# normal data
data_norm = [list(np.random.normal(.5*i + 4, abs(1-.2*i), n))
        for i in range(groups)]

# non-normal data
data_uniform = [list(np.random.uniform(i+1, i+3, n)) for i in range(groups)]


# set the parameters:
paired = False     # whether the groups are dependent (paired) or not
tails = 2          # two-tailed (2) or one-tailed (1) result
popmean = 0        # population mean - needed only for single-sample tests

# initiate the analysis
analysis = AutoStatLib.StatisticalAnalysis(
    data_norm, paired=paired, tails=tails, popmean=popmean)
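
The tails parameter selects between a two-tailed and a one-tailed result. As a reminder of how the two relate, the sketch below converts a two-tailed p-value into a one-tailed one for tests with a symmetric null distribution, such as the t-test. one_tailed_p is a hypothetical helper, not an AutoStatLib function, and the library may compute its one-tailed values differently.

```python
def one_tailed_p(p_two_tailed, effect_in_expected_direction=True):
    """Convert a two-tailed p-value to a one-tailed one.

    Valid only for tests with a symmetric null distribution (e.g. t-test).
    Hypothetical helper for illustration; not part of AutoStatLib.
    """
    if not 0.0 <= p_two_tailed <= 1.0:
        raise ValueError("p-value must lie in [0, 1]")
    half = p_two_tailed / 2.0
    # If the effect points the "wrong" way, the one-tailed p is large.
    return half if effect_in_expected_direction else 1.0 - half


print(one_tailed_p(0.04))  # 0.02
```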

Now you can perform automated statistical test selection:

analysis.RunAuto()

Or you can choose specific tests:

# 2 groups independent:
analysis.RunTtest()
analysis.RunMannWhitney()

# 2 groups paired:
analysis.RunTtestPaired()
analysis.RunWilcoxon()

# 3 or more independent groups comparison:
analysis.RunOnewayAnova()
analysis.RunKruskalWallis()

# 3 or more dependent groups comparison:
analysis.RunOnewayAnovaRM()
analysis.RunFriedman()

# single group tests:
analysis.RunTtestSingleSample()
analysis.RunWilcoxonSingleSample()
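
The grouping above can be read as a lookup from study design to test method. The sketch below is a hypothetical helper (pick_method is not part of AutoStatLib) that maps the number of groups, the pairing, and whether a parametric test is wanted onto the method names listed above. It only illustrates the grouping and makes no claim about how RunAuto() decides internally.

```python
# Hypothetical helper: maps a study design to one of the method names
# listed above. Not part of AutoStatLib; RunAuto() may use different rules.

# (design, parametric?) -> method name
METHODS = {
    ("single", True): "RunTtestSingleSample",
    ("single", False): "RunWilcoxonSingleSample",
    ("two_independent", True): "RunTtest",
    ("two_independent", False): "RunMannWhitney",
    ("two_paired", True): "RunTtestPaired",
    ("two_paired", False): "RunWilcoxon",
    ("multi_independent", True): "RunOnewayAnova",
    ("multi_independent", False): "RunKruskalWallis",
    ("multi_paired", True): "RunOnewayAnovaRM",
    ("multi_paired", False): "RunFriedman",
}


def pick_method(n_groups, paired, parametric):
    """Return the name of the method matching the design."""
    if n_groups < 1:
        raise ValueError("at least one group is required")
    if n_groups == 1:
        design = "single"  # pairing is irrelevant for a single group
    elif n_groups == 2:
        design = "two_paired" if paired else "two_independent"
    else:
        design = "multi_paired" if paired else "multi_independent"
    return METHODS[(design, parametric)]


print(pick_method(2, paired=False, parametric=True))   # RunTtest
print(pick_method(3, paired=True, parametric=False))   # RunFriedman
```

With an analysis object at hand, the returned name could be invoked as getattr(analysis, pick_method(2, False, True))().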

The test summary is printed to the console. You can also get it as a Python string via the GetSummary() method.


Test results are accessible as a dictionary via the GetResult() method:

results = analysis.GetResult()

The keys of the results dictionary and the types of their values:

{
    'p-value':                     String
    'Significance(p<0.05)':        Boolean
    'Stars_Printed':               String
    'Test_Name':                   String
    'Groups_Compared':             Integer
    'Population_Mean':             Float   (taken from the input)
    'Data_Normaly_Distributed':    Boolean
    'Parametric_Test_Applied':     Boolean
    'Paired_Test_Applied':         Boolean
    'Tails':                       Integer (taken from the input)
    'p-value_exact':               Float
    'Stars':                       Integer
    'Warnings':                    String
    'Groups_N':                    List of integers
    'Groups_Median':               List of floats
    'Groups_Mean':                 List of floats
    'Groups_SD':                   List of floats
    'Groups_SE':                   List of floats
    'Samples':                     List of input values by groups
                                           (taken from the input)
}
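
To make the per-group entries concrete, here is a minimal sketch of how such descriptives can be computed with the Python standard library. describe_groups is a hypothetical helper, and it assumes that SD is the sample standard deviation (n - 1 denominator) and that SE is SD / sqrt(n). AutoStatLib's own definitions may differ.

```python
import math
import statistics


def describe_groups(samples):
    """Compute per-group descriptives for a list of groups of numbers.

    Assumption: SD uses the n - 1 denominator and SE = SD / sqrt(n).
    Hypothetical helper for illustration; not part of AutoStatLib.
    """
    summary = {
        "Groups_N": [],
        "Groups_Median": [],
        "Groups_Mean": [],
        "Groups_SD": [],
        "Groups_SE": [],
    }
    for group in samples:
        n = len(group)
        sd = statistics.stdev(group)  # needs at least 2 values per group
        summary["Groups_N"].append(n)
        summary["Groups_Median"].append(statistics.median(group))
        summary["Groups_Mean"].append(statistics.mean(group))
        summary["Groups_SD"].append(sd)
        summary["Groups_SE"].append(sd / math.sqrt(n))
    return summary


samples = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0]]
stats = describe_groups(samples)
print(stats["Groups_N"])     # [4, 3]
print(stats["Groups_Mean"])  # [2.5, 4.0]
```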

If errors occurred, GetResult() returns an empty dictionary.
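
Because a failed analysis yields an empty dictionary, calling code should check for that before reading any keys. A minimal sketch, assuming only the key names listed above. The report function and the example values are made up for illustration.

```python
def report(results):
    """Return a one-line report, or a notice if the result is empty."""
    if not results:  # an empty dict signals that the analysis failed
        return "Analysis failed: no results available"
    line = "{}: p = {} {}".format(
        results["Test_Name"], results["p-value"], results["Stars_Printed"]
    )
    return line.rstrip()  # drop the trailing space when there are no stars


# Stand-in dictionary with made-up values, shaped like the key listing above.
example = {"Test_Name": "example test", "p-value": "0.0123", "Stars_Printed": "*"}
print(report(example))  # example test: p = 0.0123 *
print(report({}))       # Analysis failed: no results available
```

In real use the argument would be the return value of analysis.GetResult().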


Pre-Alpha dev status.

TODO:

-- Kruskal-Wallis test - add Dunn's multiple comparisons
-- ANOVA: add 2-way and 3-way ANOVA
-- one-way ANOVA: add repeated measures (for normal dependent values) with and without Geisser-Greenhouse correction
-- one-way ANOVA: add Brown-Forsythe and Welch (for normal independent values with unequal SDs between groups)
-- paired t-test: add ratio-paired t-test (ratios of paired values are consistent)
-- add Welch test (for normal data with unequal variances)
-- add Kolmogorov-Smirnov test (unpaired nonparametric 2-sample, compares cumulative distributions)
-- add independent t-test with Welch correction (do not assume equal SDs in groups)
-- add correlation test, correlation diagram
-- add linear regression, regression diagram
-- add QQ plot
-- n-sample tests: add one-tailed option

✅ done -- detailed normality test results

Checked tests:
1-sample:
-- Wilcoxon 2,1 tails - ok
-- t-tests 2,1 tails - ok

2-sample:
-- Wilcoxon 2,1 tails - ok
-- Mann-Whitney 2,1 tails - ok
-- t-tests 2,1 tails - ok

n-sample:
-- Kruskal-Wallis 2 tail - ok
-- Friedman 2 tail - ok
-- one-way ANOVA 2 tail - ok