CMPS140, Winter 2012, Section 01: Lecture 18

Building nearest neighbor tables is trivial unless
you want to apply further organization (e.g., k-d trees) to speed retrieval.
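A minimal sketch of the brute-force version of such a table: a list of (point, label) pairs scanned linearly at query time. The sample data and the Euclidean metric are assumptions for illustration; a k-d tree would replace the linear scan.

```python
from math import dist  # Euclidean distance (Python 3.8+)

def nearest_neighbor(table, query):
    """Return the (point, label) entry whose point is closest to query."""
    # Brute force: compare the query against every stored example.
    return min(table, key=lambda entry: dist(entry[0], query))

# Hypothetical training examples.
table = [((0.0, 0.0), "neg"), ((1.0, 1.0), "pos"), ((5.0, 5.0), "pos")]
point, label = nearest_neighbor(table, (0.9, 1.2))  # closest is (1.0, 1.0)
```

This is O(n) per query; the point of a k-d tree is to cut the expected cost to O(log n) for low-dimensional data.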

Building optimal decision trees is NP-hard, so there is no known
efficient way to build the smallest tree.

So we use a greedy algorithm that attempts to minimize entropy by
choosing, at each step, the question expected to be most informative.
Assuming there are P positive examples and N negative examples:

    Entropy = sum_i p_i * log2(1/p_i), so
    E(P, N) = P/(P+N) * log2((P+N)/P) + N/(P+N) * log2((P+N)/N)

Once we choose that attribute, we recursively call this algorithm on
the resulting subtrees.
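The entropy formula and the greedy criterion above can be sketched as follows. The function names and the splits representation (one (p_i, n_i) pair per attribute value) are my own; the math is E(P, N) from the notes, with the attribute chosen to maximize the expected entropy drop (information gain).

```python
from math import log2

def entropy(p, n):
    """E(P, N) = P/(P+N)*log2((P+N)/P) + N/(P+N)*log2((P+N)/N)."""
    total = p + n
    e = 0.0
    for count in (p, n):
        if count:  # skip empty classes: 0 * log2(1/0) is taken as 0
            frac = count / total
            e += frac * log2(1 / frac)
    return e

def information_gain(p, n, splits):
    """Entropy drop from splitting (p, n) examples into the given
    subsets, where splits is a list of (p_i, n_i) pairs, one per
    attribute value."""
    total = p + n
    remainder = sum((pi + ni) / total * entropy(pi, ni)
                    for pi, ni in splits)
    return entropy(p, n) - remainder
```

For example, with 2 positives and 2 negatives, an attribute that splits them perfectly into (2, 0) and (0, 2) has gain 1.0 bit; the greedy step picks the attribute with the largest gain, then recurses on each subset.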