
Commit

added information theory toolbox
dainis-boumber committed Sep 16, 2018
1 parent ffdde5b commit d657bd9
Showing 58 changed files with 13,287 additions and 0 deletions.
6 changes: 6 additions & 0 deletions ite-in-python/CHANGELOG.txt
@@ -0,0 +1,6 @@
v1.1 (Feb 20, 2018):
-Analytical values & demo for MMD: added ('demo_d_mmd.py', 'analytical_value_d_mmd.py').
-New examples for expected kernel: added ('demo_k_expected.py', 'analytical_value_k_expected.py').
-Documentation: Table 6: first 2 rows: 'Cost name': 'BKProbProd_KnnK' <--changed--> 'BKExpected' [Thanks to Matthew D. Harris for noticing the typo.]

v1.0 (Nov 18, 2016): initial release.
674 changes: 674 additions & 0 deletions ite-in-python/LICENSE.txt

39 changes: 39 additions & 0 deletions ite-in-python/README.md
@@ -0,0 +1,39 @@
# Information Theoretical Estimators (ITE) in Python

It

* is the redesigned Python implementation of the [Matlab/Octave ITE](https://bitbucket.org/szzoli/ite/) toolbox.
* can estimate numerous entropy, mutual information, divergence, association measures, cross quantities, and kernels on distributions.
* can be used to solve information theoretical optimization problems in a high-level way.
* comes with several demos.
* is free and open source: GNU GPLv3 (or later).

Estimated quantities (a minimal usage sketch follows this list):

* `entropy (H)`: Shannon entropy, Rényi entropy, Tsallis entropy (Havrda and Charvát entropy), Sharma-Mittal entropy, Phi-entropy (f-entropy).
* `mutual information (I)`: Shannon mutual information (total correlation, multi-information), Rényi mutual information, Tsallis mutual information, chi-square mutual information (squared-loss mutual information, mean square contingency), L2 mutual information, copula-based kernel dependency, kernel canonical correlation analysis, kernel generalized variance, multivariate version of Hoeffding's Phi, Hilbert-Schmidt independence criterion, distance covariance, distance correlation, Lancaster three-variable interaction.
* `divergence (D)`: Kullback-Leibler divergence (relative entropy, I directed divergence), Rényi divergence, Tsallis divergence, Sharma-Mittal divergence, Pearson chi-square divergence (chi-square distance), Hellinger distance, L2 divergence, f-divergence (Csiszár-Morimoto divergence, Ali-Silvey distance), maximum mean discrepancy (kernel distance, current distance), energy distance (N-distance; specifically the Cramer-Von Mises distance), Bhattacharyya distance, non-symmetric Bregman distance (Bregman divergence), symmetric Bregman distance, J-distance (symmetrised Kullback-Leibler divergence, J divergence), K divergence, L divergence, Jensen-Shannon divergence, Jensen-Rényi divergence, Jensen-Tsallis divergence.
* `association measures (A)`: multivariate extensions of Spearman's rho (Spearman's rank correlation coefficient, grade correlation coefficient), multivariate conditional version of Spearman's rho, lower and upper tail dependence via conditional Spearman's rho.
* `cross quantities (C)`: cross-entropy.
* `kernels on distributions (K)`: expected kernel (summation kernel, mean map kernel, set kernel, multi-instance kernel, ensemble kernel; specific convolution kernel), probability product kernel, Bhattacharyya kernel (Bhattacharyya coefficient, Hellinger affinity), Jensen-Shannon kernel, Jensen-Tsallis kernel, exponentiated Jensen-Shannon kernel, exponentiated Jensen-Rényi kernels, exponentiated Jensen-Tsallis kernels.
* `conditional entropy (condH)`: conditional Shannon entropy.
* `conditional mutual information (condI)`: conditional Shannon mutual information.
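
Each quantity is estimated through a cost object: `co_factory` builds the estimator by name, and its `estimation` method returns the estimate. The snippet below is a minimal sketch following the bundled `demo_d_chi_square.py`; it assumes the package is importable as `ite` and picks the `BDChi2_KnnK` (Pearson chi-square divergence) estimator purely for illustration.

```python
from numpy import eye
from numpy.random import multivariate_normal, rand

from ite.cost.x_factory import co_factory

dim, n = 2, 5000
y1 = multivariate_normal(rand(dim), eye(dim), n)  # sample 1 ~ N(m1, I)
y2 = multivariate_normal(rand(dim), eye(dim), n)  # sample 2 ~ N(m2, I)

co = co_factory('BDChi2_KnnK', mult=True)  # cost object (divergence estimator)
d_hat = co.estimation(y1, y2)              # estimated chi-square divergence
print(d_hat)
```

The demos shipped with the toolbox (several are shown below) follow this pattern and compare the estimates against their analytical values.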

* * *

**Citing**: If you use the ITE toolbox in your research, please cite it \[[.bib](http://www.cmap.polytechnique.fr/~zoltan.szabo/ITE.bib)\].

**Download** the latest release:

- code: [zip](https://bitbucket.org/szzoli/ite-in-python/downloads/ITE-1.1_code.zip), [tar.bz2](https://bitbucket.org/szzoli/ite-in-python/downloads/ITE-1.1_code.tar.bz2),
- documentation: [pdf](https://bitbucket.org/szzoli/ite-in-python/downloads/ITE-1.1_documentation.pdf).

**Note**: the evolution of the code is briefly summarized in CHANGELOG.txt.

* * *

**ITE mailing list**: You can [sign up](https://groups.google.com/d/forum/itetoolbox) here.

**Follow ITE**: on [Bitbucket](https://bitbucket.org/szzoli/ite-in-python/follow), on [Twitter](https://twitter.com/ITEtoolbox).

**ITE applications**: [Wiki](https://bitbucket.org/szzoli/ite/wiki). Feel free to add yours.
74 changes: 74 additions & 0 deletions ite-in-python/demos/analytical_values/demo_c_cross_entropy.py
@@ -0,0 +1,74 @@
#!/usr/bin/env python3

""" Demo for cross-entropy estimators.
Analytical vs estimated value is illustrated for normal random variables.
"""

from numpy.random import rand, multivariate_normal
from numpy import arange, zeros, dot, ones
import matplotlib.pyplot as plt

from ite.cost.x_factory import co_factory
from ite.cost.x_analytical_values import analytical_value_c_cross_entropy


def main():
    # parameters:
    dim = 1  # dimension of the distribution
    num_of_samples_v = arange(100, 12*1000+1, 500)
    cost_name = 'BCCE_KnnK'  # dim >= 1

    # initialization:
    distr = 'normal'  # fixed
    num_of_samples_max = num_of_samples_v[-1]
    length = len(num_of_samples_v)
    co = co_factory(cost_name, mult=True)  # cost object
    c_hat_v = zeros(length)  # vector of estimated cross-entropies

    # distr, dim -> samples (y1,y2), distribution parameters (par1,par2),
    # analytical value (c):
    if distr == 'normal':
        # mean (m1,m2):
        m2 = rand(dim)
        m1 = m2

        # (random) linear transformation applied to the data (l1,l2) ->
        # covariance matrix (c1,c2):
        l2 = rand(dim, dim)
        l1 = rand(1) * l2
        # Note: the (m2,l2) => (m1,l1) choice guarantees y1<<y2 (in
        # practice, too).

        c1 = dot(l1, l1.T)
        c2 = dot(l2, l2.T)

        # generate samples (y1~N(m1,c1), y2~N(m2,c2)):
        y1 = multivariate_normal(m1, c1, num_of_samples_max)
        y2 = multivariate_normal(m2, c2, num_of_samples_max)

        par1 = {"mean": m1, "cov": c1}
        par2 = {"mean": m2, "cov": c2}
    else:
        raise Exception('Distribution=?')

    c = analytical_value_c_cross_entropy(distr, distr, par1, par2)

    # estimation:
    for (tk, num_of_samples) in enumerate(num_of_samples_v):
        c_hat_v[tk] = co.estimation(y1[0:num_of_samples],
                                    y2[0:num_of_samples])  # broadcast
        print("tk={0}/{1}".format(tk+1, length))

    # plot:
    plt.plot(num_of_samples_v, c_hat_v, num_of_samples_v, ones(length)*c)
    plt.xlabel('Number of samples')
    plt.ylabel('Cross-entropy')
    plt.legend(('estimation', 'analytical value'), loc='best')
    plt.title("Estimator: " + cost_name)
    plt.show()


if __name__ == "__main__":
    main()
63 changes: 63 additions & 0 deletions ite-in-python/demos/analytical_values/demo_d_bregman.py
@@ -0,0 +1,63 @@
#!/usr/bin/env python3

""" Demo for Bregman divergence estimators.
Analytical vs estimated value is illustrated for uniform random variables.
"""

from numpy.random import rand
from numpy import arange, zeros, ones
import matplotlib.pyplot as plt

from ite.cost.x_factory import co_factory
from ite.cost.x_analytical_values import analytical_value_d_bregman


def main():
    # parameters:
    alpha = 0.7  # parameter of Bregman divergence, \ne 1
    dim = 1  # dimension of the distribution
    num_of_samples_v = arange(1000, 10*1000+1, 1000)
    cost_name = 'BDBregman_KnnK'  # dim >= 1

    # initialization:
    distr = 'uniform'  # fixed
    num_of_samples_max = num_of_samples_v[-1]
    length = len(num_of_samples_v)
    co = co_factory(cost_name, mult=True, alpha=alpha)  # cost object
    d_hat_v = zeros(length)  # vector of estimated divergence values

    # distr, dim -> samples (y1<<y2), distribution parameters (par1,par2),
    # analytical value (d):
    if distr == 'uniform':
        b = 3 * rand(dim)
        a = b * rand(dim)
        y1 = rand(num_of_samples_max, dim) * a  # U[0,a]
        y2 = rand(num_of_samples_max, dim) * b
        # Note: y2 ~ U[0,b], a<=b (coordinate-wise) => y1<<y2

        par1 = {"a": a}
        par2 = {"a": b}
    else:
        raise Exception('Distribution=?')

    d = analytical_value_d_bregman(distr, distr, alpha, par1, par2)

    # estimation:
    for (tk, num_of_samples) in enumerate(num_of_samples_v):
        d_hat_v[tk] = co.estimation(y1[0:num_of_samples],
                                    y2[0:num_of_samples])  # broadcast
        print("tk={0}/{1}".format(tk+1, length))

    # plot:
    plt.plot(num_of_samples_v, d_hat_v, num_of_samples_v, ones(length)*d)
    plt.xlabel('Number of samples')
    plt.ylabel('Bregman divergence')
    plt.legend(('estimation', 'analytical value'), loc='best')
    plt.title("Estimator: " + cost_name)
    plt.show()


if __name__ == "__main__":
    main()
75 changes: 75 additions & 0 deletions ite-in-python/demos/analytical_values/demo_d_chi_square.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,75 @@
#!/usr/bin/env python3

""" Demo for chi^2 divergence estimators.
Analytical vs estimated value is illustrated for uniform and spherical
normal random variables.
"""

from numpy.random import rand, multivariate_normal
from numpy import arange, zeros, ones, eye
import matplotlib.pyplot as plt

from ite.cost.x_factory import co_factory
from ite.cost.x_analytical_values import analytical_value_d_chi_square


def main():
    # parameters:
    distr = 'normalI'  # 'uniform', 'normalI' (isotropic normal, Id cov.)
    dim = 1  # dimension of the distribution
    num_of_samples_v = arange(1000, 50*1000+1, 1000)
    cost_name = 'BDChi2_KnnK'  # dim >= 1

    # initialization:
    num_of_samples_max = num_of_samples_v[-1]
    length = len(num_of_samples_v)
    co = co_factory(cost_name, mult=True)  # cost object
    d_hat_v = zeros(length)  # vector of estimated divergence values

    # distr, dim -> samples (y1<<y2), distribution parameters (par1,par2),
    # analytical value (d):
    if distr == 'uniform':
        b = 3 * rand(dim)
        a = b * rand(dim)

        y1 = rand(num_of_samples_max, dim) * a  # U[0,a]
        y2 = rand(num_of_samples_max, dim) * b
        # U[0,b], a<=b (coordinate-wise) => y1<<y2

        par1 = {"a": a}
        par2 = {"a": b}
    elif distr == 'normalI':
        # mean (m1,m2):
        m1 = 2 * rand(dim)
        m2 = 2 * rand(dim)

        # generate samples (y1~N(m1,I), y2~N(m2,I)):
        y1 = multivariate_normal(m1, eye(dim), num_of_samples_max)
        y2 = multivariate_normal(m2, eye(dim), num_of_samples_max)

        par1 = {"mean": m1}
        par2 = {"mean": m2}
    else:
        raise Exception('Distribution=?')

    d = analytical_value_d_chi_square(distr, distr, par1, par2)

    # estimation:
    for (tk, num_of_samples) in enumerate(num_of_samples_v):
        d_hat_v[tk] = co.estimation(y1[0:num_of_samples],
                                    y2[0:num_of_samples])  # broadcast
        print("tk={0}/{1}".format(tk+1, length))

    # plot:
    plt.plot(num_of_samples_v, d_hat_v, num_of_samples_v, ones(length)*d)
    plt.xlabel('Number of samples')
    plt.ylabel('Chi square divergence')
    plt.legend(('estimation', 'analytical value'), loc='best')
    plt.title("Estimator: " + cost_name)
    plt.show()


if __name__ == "__main__":
    main()
68 changes: 68 additions & 0 deletions ite-in-python/demos/analytical_values/demo_d_hellinger.py
@@ -0,0 +1,68 @@
#!/usr/bin/env python3

""" Demo for Hellinger distance estimators.
Analytical vs estimated value is illustrated for normal random variables.
"""

from numpy.random import rand, multivariate_normal
from numpy import arange, zeros, dot, ones
import matplotlib.pyplot as plt

from ite.cost.x_factory import co_factory
from ite.cost.x_analytical_values import analytical_value_d_hellinger


def main():
    # parameters:
    dim = 2  # dimension of the distribution
    num_of_samples_v = arange(1000, 50 * 1000 + 1, 2000)
    cost_name = 'BDHellinger_KnnK'  # dim >= 1

    # initialization:
    distr = 'normal'  # fixed
    num_of_samples_max = num_of_samples_v[-1]
    length = len(num_of_samples_v)
    co = co_factory(cost_name, mult=True)  # cost object
    d_hat_v = zeros(length)  # vector of estimated divergence values

    # distr, dim -> samples (y1,y2), distribution parameters (par1,par2),
    # analytical value (d):
    if distr == 'normal':
        # mean (m1,m2):
        m1, m2 = rand(dim), rand(dim)

        # (random) linear transformation applied to the data (l1,l2) ->
        # covariance matrix (c1,c2):
        l1, l2 = rand(dim, dim), rand(dim, dim)
        c1, c2 = dot(l1, l1.T), dot(l2, l2.T)

        # generate samples (y1~N(m1,c1), y2~N(m2,c2)):
        y1 = multivariate_normal(m1, c1, num_of_samples_max)
        y2 = multivariate_normal(m2, c2, num_of_samples_max)

        par1, par2 = {"mean": m1, "cov": c1}, {"mean": m2, "cov": c2}
    else:
        raise Exception('Distribution=?')

    d = analytical_value_d_hellinger(distr, distr, par1, par2)

    # estimation:
    for (tk, num_of_samples) in enumerate(num_of_samples_v):
        # with broadcasting:
        d_hat_v[tk] = co.estimation(y1[0:num_of_samples],
                                    y2[0:num_of_samples])
        print("tk={0}/{1}".format(tk + 1, length))

    # plot:
    plt.plot(num_of_samples_v, d_hat_v, num_of_samples_v, ones(length) * d)
    plt.xlabel('Number of samples')
    plt.ylabel('Hellinger distance')
    plt.legend(('estimation', 'analytical value'), loc='best')
    plt.title("Estimator: " + cost_name)
    plt.show()


if __name__ == "__main__":
    main()
65 changes: 65 additions & 0 deletions ite-in-python/demos/analytical_values/demo_d_jensen_renyi.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
#!/usr/bin/env python3

""" Demo Jensen-Renyi divergence estimators.
Analytical vs estimated value is illustrated for spherical normal random
variables.
"""

from numpy.random import rand, multivariate_normal, randn
from numpy import arange, zeros, ones, array, eye
import matplotlib.pyplot as plt

from ite.cost.x_factory import co_factory
from ite.cost.x_analytical_values import analytical_value_d_jensen_renyi


def main():
    # parameters:
    dim = 2  # dimension of the distribution
    w = array([1/3, 2/3])  # weight in the Jensen-Renyi divergence
    num_of_samples_v = arange(100, 12*1000+1, 500)
    cost_name = 'MDJR_HR'  # dim >= 1

    # initialization:
    alpha = 2  # parameter of the Jensen-Renyi divergence, \ne 1; fixed
    distr = 'normal'  # fixed
    num_of_samples_max = num_of_samples_v[-1]
    length = len(num_of_samples_v)
    co = co_factory(cost_name, mult=True, alpha=alpha, w=w)  # cost object
    d_hat_v = zeros(length)  # vector of estimated divergence values

    # distr, dim -> samples (y1,y2), distribution parameters (par1,par2),
    # analytical value (d):
    if distr == 'normal':
        # generate samples (y1,y2); y1~N(m1,s1^2xI), y2~N(m2,s2^2xI):
        m1, s1 = randn(dim), rand(1)
        m2, s2 = randn(dim), rand(1)
        y1 = multivariate_normal(m1, s1**2 * eye(dim), num_of_samples_max)
        y2 = multivariate_normal(m2, s2**2 * eye(dim), num_of_samples_max)

        par1 = {"mean": m1, "std": s1}
        par2 = {"mean": m2, "std": s2}
    else:
        raise Exception('Distribution=?')

    d = analytical_value_d_jensen_renyi(distr, distr, w, par1, par2)

    # estimation:
    for (tk, num_of_samples) in enumerate(num_of_samples_v):
        d_hat_v[tk] = co.estimation(y1[0:num_of_samples],
                                    y2[0:num_of_samples])  # broadcast
        print("tk={0}/{1}".format(tk+1, length))

    # plot:
    plt.plot(num_of_samples_v, d_hat_v, num_of_samples_v, ones(length)*d)
    plt.xlabel('Number of samples')
    plt.ylabel('Jensen-Renyi divergence')
    plt.legend(('estimation', 'analytical value'), loc='best')
    plt.title("Estimator: " + cost_name)
    plt.show()


if __name__ == "__main__":
    main()