
Commit

added information theory toolbox
dainis-boumber committed Sep 16, 2018
1 parent ffdde5b commit d657bd9
Showing 58 changed files with 13,287 additions and 0 deletions.
6 changes: 6 additions & 0 deletions ite-in-python/CHANGELOG.txt
@@ -0,0 +1,6 @@
v1.1 (Feb 20, 2018):
-Analytical values & demo for MMD: added ('demo_d_mmd.py', 'analytical_value_d_mmd.py').
-New examples for expected kernel: added ('demo_k_expected.py', 'analytical_value_k_expected.py').
-Documentation: Table 6: first 2 rows: 'Cost name': 'BKProbProd_KnnK' <--changed--> 'BKExpected' [Thanks to Matthew D. Harris for noticing the typo.]

v1.0 (Nov 18, 2016): initial release.
674 changes: 674 additions & 0 deletions ite-in-python/LICENSE.txt

39 changes: 39 additions & 0 deletions ite-in-python/README.md
@@ -0,0 +1,39 @@
# Information Theoretical Estimators (ITE) in Python

It

* is the redesigned Python implementation of the [Matlab/Octave ITE](https://bitbucket.org/szzoli/ite/) toolbox.
* can estimate numerous entropy, mutual information, divergence, association measures, cross quantities, and kernels on distributions.
* can be used to solve information theoretical optimization problems in a high-level way.
* comes with several demos.
* is free and open source: GNU GPLv3 (or later).

Estimated quantities (a minimal usage sketch follows this list):

* `entropy (H)`: Shannon entropy, Rényi entropy, Tsallis entropy (Havrda and Charvát entropy), Sharma-Mittal entropy, Phi-entropy (f-entropy).
* `mutual information (I)`: Shannon mutual information (total correlation, multi-information), Rényi mutual information, Tsallis mutual information, chi-square mutual information (squared-loss mutual information, mean square contingency), L2 mutual information, copula-based kernel dependency, kernel canonical correlation analysis, kernel generalized variance, multivariate version of Hoeffding's Phi, Hilbert-Schmidt independence criterion, distance covariance, distance correlation, Lancaster three-variable interaction.
* `divergence (D)`: Kullback-Leibler divergence (relative entropy, I directed divergence), Rényi divergence, Tsallis divergence, Sharma-Mittal divergence, Pearson chi-square divergence (chi-square distance), Hellinger distance, L2 divergence, f-divergence (Csiszár-Morimoto divergence, Ali-Silvey distance), maximum mean discrepancy (kernel distance, current distance), energy distance (N-distance; specifically the Cramer-Von Mises distance), Bhattacharyya distance, non-symmetric Bregman distance (Bregman divergence), symmetric Bregman distance, J-distance (symmetrised Kullback-Leibler divergence, J divergence), K divergence, L divergence, Jensen-Shannon divergence, Jensen-Rényi divergence, Jensen-Tsallis divergence.
* `association measures (A)`: multivariate extensions of Spearman's rho (Spearman's rank correlation coefficient, grade correlation coefficient), multivariate conditional version of Spearman's rho, lower and upper tail dependence via conditional Spearman's rho.
* `cross quantities (C)`: cross-entropy.
* `kernels on distributions (K)`: expected kernel (summation kernel, mean map kernel, set kernel, multi-instance kernel, ensemble kernel; specific convolution kernel), probability product kernel, Bhattacharyya kernel (Bhattacharyya coefficient, Hellinger affinity), Jensen-Shannon kernel, Jensen-Tsallis kernel, exponentiated Jensen-Shannon kernel, exponentiated Jensen-Rényi kernels, exponentiated Jensen-Tsallis kernels.
* `conditional entropy (condH)`: conditional Shannon entropy.
* `conditional mutual information (condI)`: conditional Shannon mutual information.
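
Each quantity is estimated through a cost object: `co_factory` builds the estimator by name, and its `estimation` method returns the estimate. The snippet below is a minimal sketch following the bundled `demo_d_chi_square.py`; it assumes the package is importable as `ite` and picks the `BDChi2_KnnK` (Pearson chi-square divergence) estimator purely for illustration.

```python
from numpy import eye
from numpy.random import multivariate_normal, rand

from ite.cost.x_factory import co_factory

dim, n = 2, 5000
y1 = multivariate_normal(rand(dim), eye(dim), n)  # sample 1 ~ N(m1, I)
y2 = multivariate_normal(rand(dim), eye(dim), n)  # sample 2 ~ N(m2, I)

co = co_factory('BDChi2_KnnK', mult=True)  # cost object (divergence estimator)
d_hat = co.estimation(y1, y2)              # estimated chi-square divergence
print(d_hat)
```

The demos shipped with the toolbox (several are shown below) follow this pattern and compare the estimates against their analytical values.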

* * *

**Citing**: If you use the ITE toolbox in your research, please cite it \[[.bib](http://www.cmap.polytechnique.fr/~zoltan.szabo/ITE.bib)\].

**Download** the latest release:

- code: [zip](https://bitbucket.org/szzoli/ite-in-python/downloads/ITE-1.1_code.zip), [tar.bz2](https://bitbucket.org/szzoli/ite-in-python/downloads/ITE-1.1_code.tar.bz2),
- documentation: [pdf](https://bitbucket.org/szzoli/ite-in-python/downloads/ITE-1.1_documentation.pdf).

**Note**: the evolution of the code is briefly summarized in CHANGELOG.txt.

* * *

**ITE mailing list**: You can [sign up](https://groups.google.com/d/forum/itetoolbox) here.

**Follow ITE**: on [Bitbucket](https://bitbucket.org/szzoli/ite-in-python/follow), on [Twitter](https://twitter.com/ITEtoolbox).

**ITE applications**: [Wiki](https://bitbucket.org/szzoli/ite/wiki). Feel free to add yours.
74 changes: 74 additions & 0 deletions ite-in-python/demos/analytical_values/demo_c_cross_entropy.py
@@ -0,0 +1,74 @@
#!/usr/bin/env python3

""" Demo for cross-entropy estimators.
Analytical vs estimated value is illustrated for normal random variables.
"""

from numpy.random import rand, multivariate_normal
from numpy import arange, zeros, dot, ones
import matplotlib.pyplot as plt

from ite.cost.x_factory import co_factory
from ite.cost.x_analytical_values import analytical_value_c_cross_entropy


def main():
    # parameters:
    dim = 1  # dimension of the distribution
    num_of_samples_v = arange(100, 12*1000+1, 500)
    cost_name = 'BCCE_KnnK'  # dim >= 1

    # initialization:
    distr = 'normal'  # fixed
    num_of_samples_max = num_of_samples_v[-1]
    length = len(num_of_samples_v)
    co = co_factory(cost_name, mult=True)  # cost object
    c_hat_v = zeros(length)  # vector of estimated cross-entropies

    # distr, dim -> samples (y1,y2), distribution parameters (par1,par2),
    # analytical value (c):
    if distr == 'normal':
        # mean (m1,m2):
        m2 = rand(dim)
        m1 = m2

        # (random) linear transformation applied to the data (l1,l2) ->
        # covariance matrix (c1,c2):
        l2 = rand(dim, dim)
        l1 = rand(1) * l2
        # Note: the (m2,l2) => (m1,l1) choice guarantees y1<<y2 (in
        # practice, too).

        c1 = dot(l1, l1.T)
        c2 = dot(l2, l2.T)

        # generate samples (y1~N(m1,c1), y2~N(m2,c2)):
        y1 = multivariate_normal(m1, c1, num_of_samples_max)
        y2 = multivariate_normal(m2, c2, num_of_samples_max)

        par1 = {"mean": m1, "cov": c1}
        par2 = {"mean": m2, "cov": c2}
    else:
        raise Exception('Distribution=?')

    c = analytical_value_c_cross_entropy(distr, distr, par1, par2)

    # estimation:
    for (tk, num_of_samples) in enumerate(num_of_samples_v):
        c_hat_v[tk] = co.estimation(y1[0:num_of_samples],
                                    y2[0:num_of_samples])  # broadcast
        print("tk={0}/{1}".format(tk+1, length))

    # plot:
    plt.plot(num_of_samples_v, c_hat_v, num_of_samples_v, ones(length)*c)
    plt.xlabel('Number of samples')
    plt.ylabel('Cross-entropy')
    plt.legend(('estimation', 'analytical value'), loc='best')
    plt.title("Estimator: " + cost_name)
    plt.show()


if __name__ == "__main__":
    main()
63 changes: 63 additions & 0 deletions ite-in-python/demos/analytical_values/demo_d_bregman.py
@@ -0,0 +1,63 @@
#!/usr/bin/env python3

""" Demo for Bregman divergence estimators.
Analytical vs estimated value is illustrated for uniform random variables.
"""

from numpy.random import rand
from numpy import arange, zeros, ones
import matplotlib.pyplot as plt

from ite.cost.x_factory import co_factory
from ite.cost.x_analytical_values import analytical_value_d_bregman


def main():
    # parameters:
    alpha = 0.7  # parameter of Bregman divergence, \ne 1
    dim = 1  # dimension of the distribution
    num_of_samples_v = arange(1000, 10*1000+1, 1000)
    cost_name = 'BDBregman_KnnK'  # dim >= 1

    # initialization:
    distr = 'uniform'  # fixed
    num_of_samples_max = num_of_samples_v[-1]
    length = len(num_of_samples_v)
    co = co_factory(cost_name, mult=True, alpha=alpha)  # cost object
    d_hat_v = zeros(length)  # vector of estimated divergence values

    # distr, dim -> samples (y1<<y2), distribution parameters (par1,par2),
    # analytical value (d):
    if distr == 'uniform':
        b = 3 * rand(dim)
        a = b * rand(dim)
        y1 = rand(num_of_samples_max, dim) * a  # U[0,a]
        y2 = rand(num_of_samples_max, dim) * b
        # Note: y2 ~ U[0,b], a<=b (coordinate-wise) => y1<<y2

        par1 = {"a": a}
        par2 = {"a": b}
    else:
        raise Exception('Distribution=?')

    d = analytical_value_d_bregman(distr, distr, alpha, par1, par2)

    # estimation:
    for (tk, num_of_samples) in enumerate(num_of_samples_v):
        d_hat_v[tk] = co.estimation(y1[0:num_of_samples],
                                    y2[0:num_of_samples])  # broadcast
        print("tk={0}/{1}".format(tk+1, length))

    # plot:
    plt.plot(num_of_samples_v, d_hat_v, num_of_samples_v, ones(length)*d)
    plt.xlabel('Number of samples')
    plt.ylabel('Bregman divergence')
    plt.legend(('estimation', 'analytical value'), loc='best')
    plt.title("Estimator: " + cost_name)
    plt.show()


if __name__ == "__main__":
    main()
75 changes: 75 additions & 0 deletions ite-in-python/demos/analytical_values/demo_d_chi_square.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
#!/usr/bin/env python3

""" Demo for chi^2 divergence estimators.
Analytical vs estimated value is illustrated for uniform and spherical
normal random variables.
"""

from numpy.random import rand, multivariate_normal
from numpy import arange, zeros, ones, eye
import matplotlib.pyplot as plt

from ite.cost.x_factory import co_factory
from ite.cost.x_analytical_values import analytical_value_d_chi_square


def main():
    # parameters:
    distr = 'normalI'  # 'uniform', 'normalI' (isotropic normal, Id cov.)
    dim = 1  # dimension of the distribution
    num_of_samples_v = arange(1000, 50*1000+1, 1000)
    cost_name = 'BDChi2_KnnK'  # dim >= 1

    # initialization:
    num_of_samples_max = num_of_samples_v[-1]
    length = len(num_of_samples_v)
    co = co_factory(cost_name, mult=True)  # cost object
    d_hat_v = zeros(length)  # vector of estimated divergence values

    # distr, dim -> samples (y1<<y2), distribution parameters (par1,par2),
    # analytical value (d):
    if distr == 'uniform':
        b = 3 * rand(dim)
        a = b * rand(dim)

        y1 = rand(num_of_samples_max, dim) * a  # U[0,a]
        y2 = rand(num_of_samples_max, dim) * b
        # U[0,b], a<=b (coordinate-wise) => y1<<y2

        par1 = {"a": a}
        par2 = {"a": b}
    elif distr == 'normalI':
        # mean (m1,m2):
        m1 = 2 * rand(dim)
        m2 = 2 * rand(dim)

        # generate samples (y1~N(m1,I), y2~N(m2,I)):
        y1 = multivariate_normal(m1, eye(dim), num_of_samples_max)
        y2 = multivariate_normal(m2, eye(dim), num_of_samples_max)

        par1 = {"mean": m1}
        par2 = {"mean": m2}
    else:
        raise Exception('Distribution=?')

    d = analytical_value_d_chi_square(distr, distr, par1, par2)

    # estimation:
    for (tk, num_of_samples) in enumerate(num_of_samples_v):
        d_hat_v[tk] = co.estimation(y1[0:num_of_samples],
                                    y2[0:num_of_samples])  # broadcast
        print("tk={0}/{1}".format(tk+1, length))

    # plot:
    plt.plot(num_of_samples_v, d_hat_v, num_of_samples_v, ones(length)*d)
    plt.xlabel('Number of samples')
    plt.ylabel('Chi square divergence')
    plt.legend(('estimation', 'analytical value'), loc='best')
    plt.title("Estimator: " + cost_name)
    plt.show()


if __name__ == "__main__":
    main()
68 changes: 68 additions & 0 deletions ite-in-python/demos/analytical_values/demo_d_hellinger.py
@@ -0,0 +1,68 @@
#!/usr/bin/env python3

""" Demo for Hellinger distance estimators.
Analytical vs estimated value is illustrated for normal random variables.
"""

from numpy.random import rand, multivariate_normal
from numpy import arange, zeros, dot, ones
import matplotlib.pyplot as plt

from ite.cost.x_factory import co_factory
from ite.cost.x_analytical_values import analytical_value_d_hellinger


def main():
    # parameters:
    dim = 2  # dimension of the distribution
    num_of_samples_v = arange(1000, 50 * 1000 + 1, 2000)
    cost_name = 'BDHellinger_KnnK'  # dim >= 1

    # initialization:
    distr = 'normal'  # fixed
    num_of_samples_max = num_of_samples_v[-1]
    length = len(num_of_samples_v)
    co = co_factory(cost_name, mult=True)  # cost object
    d_hat_v = zeros(length)  # vector of estimated divergence values

    # distr, dim -> samples (y1,y2), distribution parameters (par1,par2),
    # analytical value (d):
    if distr == 'normal':
        # mean (m1,m2):
        m1, m2 = rand(dim), rand(dim)

        # (random) linear transformation applied to the data (l1,l2) ->
        # covariance matrix (c1,c2):
        l1, l2 = rand(dim, dim), rand(dim, dim)
        c1, c2 = dot(l1, l1.T), dot(l2, l2.T)

        # generate samples (y1~N(m1,c1), y2~N(m2,c2)):
        y1 = multivariate_normal(m1, c1, num_of_samples_max)
        y2 = multivariate_normal(m2, c2, num_of_samples_max)

        par1, par2 = {"mean": m1, "cov": c1}, {"mean": m2, "cov": c2}
    else:
        raise Exception('Distribution=?')

    d = analytical_value_d_hellinger(distr, distr, par1, par2)

    # estimation:
    for (tk, num_of_samples) in enumerate(num_of_samples_v):
        # with broadcasting:
        d_hat_v[tk] = co.estimation(y1[0:num_of_samples],
                                    y2[0:num_of_samples])
        print("tk={0}/{1}".format(tk + 1, length))

    # plot:
    plt.plot(num_of_samples_v, d_hat_v, num_of_samples_v, ones(length) * d)
    plt.xlabel('Number of samples')
    plt.ylabel('Hellinger distance')
    plt.legend(('estimation', 'analytical value'), loc='best')
    plt.title("Estimator: " + cost_name)
    plt.show()


if __name__ == "__main__":
    main()
65 changes: 65 additions & 0 deletions ite-in-python/demos/analytical_values/demo_d_jensen_renyi.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
#!/usr/bin/env python3

""" Demo Jensen-Renyi divergence estimators.
Analytical vs estimated value is illustrated for spherical normal random
variables.
"""

from numpy.random import rand, multivariate_normal, randn
from numpy import arange, zeros, ones, array, eye
import matplotlib.pyplot as plt

from ite.cost.x_factory import co_factory
from ite.cost.x_analytical_values import analytical_value_d_jensen_renyi


def main():
    # parameters:
    dim = 2  # dimension of the distribution
    w = array([1/3, 2/3])  # weight in the Jensen-Renyi divergence
    num_of_samples_v = arange(100, 12*1000+1, 500)
    cost_name = 'MDJR_HR'  # dim >= 1

    # initialization:
    alpha = 2  # parameter of the Jensen-Renyi divergence, \ne 1; fixed
    distr = 'normal'  # fixed
    num_of_samples_max = num_of_samples_v[-1]
    length = len(num_of_samples_v)
    co = co_factory(cost_name, mult=True, alpha=alpha, w=w)  # cost object
    d_hat_v = zeros(length)  # vector of estimated divergence values

    # distr, dim -> samples (y1,y2), distribution parameters (par1,par2),
    # analytical value (d):
    if distr == 'normal':
        # generate samples (y1,y2); y1~N(m1,s1^2xI), y2~N(m2,s2^2xI):
        m1, s1 = randn(dim), rand(1)
        m2, s2 = randn(dim), rand(1)
        y1 = multivariate_normal(m1, s1**2 * eye(dim), num_of_samples_max)
        y2 = multivariate_normal(m2, s2**2 * eye(dim), num_of_samples_max)

        par1 = {"mean": m1, "std": s1}
        par2 = {"mean": m2, "std": s2}
    else:
        raise Exception('Distribution=?')

    d = analytical_value_d_jensen_renyi(distr, distr, w, par1, par2)

    # estimation:
    for (tk, num_of_samples) in enumerate(num_of_samples_v):
        d_hat_v[tk] = co.estimation(y1[0:num_of_samples],
                                    y2[0:num_of_samples])  # broadcast
        print("tk={0}/{1}".format(tk+1, length))

    # plot:
    plt.plot(num_of_samples_v, d_hat_v, num_of_samples_v, ones(length)*d)
    plt.xlabel('Number of samples')
    plt.ylabel('Jensen-Renyi divergence')
    plt.legend(('estimation', 'analytical value'), loc='best')
    plt.title("Estimator: " + cost_name)
    plt.show()


if __name__ == "__main__":
    main()