NOTE: THIS IS STILL HIGHLY EXPERIMENTAL AND MAY BE PRONE TO ERROR, USE WITH CAUTION!
A Matlab implementation of Hellinger Distance Decision Trees and Forests for binary decision problems with imbalanced data and numeric attributes.
This library is based on the paper:
Cieslak, David A., et al. "Hellinger distance decision trees are robust and skew-insensitive." Data Mining and Knowledge Discovery 24.1 (2012): 136-158.
@article{cieslak2012hellinger,
  title={Hellinger distance decision trees are robust and skew-insensitive},
  author={Cieslak, David A and Hoens, T Ryan and Chawla, Nitesh V and Kegelmeyer, W Philip},
  journal={Data Mining and Knowledge Discovery},
  volume={24},
  number={1},
  pages={136--158},
  year={2012},
  publisher={Springer}
}
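For orientation, the split criterion described in the paper cited above can be sketched in a few lines of Matlab. This is a minimal illustration under stated assumptions, not code taken from this library: the toy data, the variable names (x, y, bestThreshold, bestDist), and the midpoint threshold search are all hypothetical and chosen only for the example.

```matlab
% Hypothetical toy data: one numeric attribute x and binary labels y
% (1 = minority/positive class, 0 = majority/negative class).
x = [0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0];
y = [0   0   0   0   0   0   0   1   1   1  ];

nPos = sum(y == 1);
nNeg = sum(y == 0);

% Candidate thresholds: midpoints between consecutive distinct values
% (an assumption of this sketch, not necessarily what the library does).
v = unique(x);
thresholds = (v(1:end-1) + v(2:end)) / 2;

bestDist = -Inf;
bestThreshold = NaN;
for t = thresholds
    left = x <= t;
    % Fraction of each class that falls on each side of the split.
    posLeft  = sum(left & y == 1) / nPos;
    negLeft  = sum(left & y == 0) / nNeg;
    posRight = 1 - posLeft;
    negRight = 1 - negLeft;
    % Hellinger distance between the two class-conditional distributions
    % over the branches {left, right}.
    d = sqrt((sqrt(posLeft) - sqrt(negLeft))^2 + ...
             (sqrt(posRight) - sqrt(negRight))^2);
    if d > bestDist
        bestDist = d;
        bestThreshold = t;
    end
end

fprintf('Best threshold: %.2f (Hellinger distance %.4f)\n', bestThreshold, bestDist);
```

The distance is computed from within-class fractions only, so the class priors do not enter the criterion. A split that separates the two classes perfectly reaches the maximum value of sqrt(2).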
The author of this software has no affiliation with the researchers mentioned above, and the software is not an exact replication of the methods described in the paper. The author makes no guarantee about the correctness or functionality of this code. To cite this software, you can use the following citation:
Daniels, Zachary A. "Hellinger Decision Trees and Forests for Matlab." https://github.com/ZDanielsResearch/HellingerTreesMatlab.
@misc{daniels2015hellinger,
  author = {Daniels, Zachary A},
  title = {Hellinger Decision Trees and Forests for Matlab},
  howpublished = {\url{https://github.com/ZDanielsResearch/HellingerTreesMatlab}}
}
The code was developed on Matlab R2014b for Mac OS X 10.10 (Yosemite).
The examples use data packaged with the Statistics and Machine Learning Toolbox and the Neural Network Toolbox for Matlab.
The examples are very basic and are only intended to demonstrate how to use the library (and provide basic tests to check that the code is operating as it should). THEY ARE NOT PROPER EXPERIMENTS AND SHOULD NOT BE TREATED AS SUCH!
If you find any issues or have questions, the author can be contacted at: zad7[at]cs[dot]rutgers[dot]edu