Mitchell, Ch. 3
- Give a non-empty set that has entropy 0.
- Give a non-empty set that has entropy 1.
- Compute: Entropy[17+,5-] (complete expression only, no need for calculator).
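A sketch of the kind of expression being asked for, assuming 17 positives and 5 negatives out of 22 examples:
  Entropy([17+,5-]) = -(17/22)log2(17/22) - (5/22)log2(5/22)
A quick Python check (the entropy2 helper is mine, not from the text):

    # Two-class entropy from positive/negative counts.
    from math import log2

    def entropy2(p, n):
        total = p + n
        # Skip empty classes, since 0*log2(0) is taken to be 0.
        return -sum(c / total * log2(c / total) for c in (p, n) if c > 0)

    print(entropy2(17, 5))   # about 0.773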
- Give the formula for the information gain, Gain(S,A), that results from decomposing a set S on a given attribute A.
- Given the entropy of a set S, calculate Gain(S,A), the gain that results from decomposing S on a given attribute A.
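For reference, the formula from the chapter, written in plain text:
  Gain(S,A) = Entropy(S) - sum over v in Values(A) of (|Sv|/|S|) * Entropy(Sv)
where Values(A) is the set of possible values of attribute A and Sv is the subset of S for which A has value v.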
- For the data on text page 59, use variables, the function log2, and the fact that Sunny has entropy .970 to compute a precise expression that evaluates to Gain(Sunny,Wind).
Wind when Sunny:
- Weak: 2N, 1Y, proportion = 3/5
  entropyWeak = -(2/3)log2(2/3) - (1/3)log2(1/3) = .9183
- Strong: 1N, 1Y, proportion = 2/5
  entropyStrong = -(1/2)log2(1/2) - (1/2)log2(1/2) = 1
- Gain(Sunny,Wind) = .97 - (3/5)(entropyWeak) - (2/5)(entropyStrong)
  = .97 - (3/5)(.9183) - (2/5)(1) = .97 - .55098 - .4 = .01902
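A quick Python check of this arithmetic (a sketch; the entropy helper below is mine, not from the text):

    # Verify Gain(Sunny, Wind) for the Sunny subset of the page-59 data.
    from math import log2

    def entropy(counts):
        total = sum(counts)
        return -sum(c / total * log2(c / total) for c in counts if c > 0)

    sunny = entropy([2, 3])                           # 2 Yes, 3 No -> about .971
    weak, strong = entropy([1, 2]), entropy([1, 1])   # Wind=Weak: 1Y, 2N; Wind=Strong: 1Y, 1N
    gain = sunny - (3/5) * weak - (2/5) * strong
    print(round(gain, 3))   # prints 0.02; the .019 above comes from rounding the Sunny entropy to .970 first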
- Describe the inductive bias of ID3.
shorter trees are preferred over longer ones, and trees that place attributes with high information gain close to the root are preferred
- Calculate the entropy for a given variable that has more than two values (like V = {1,2,3,2,1,1}).
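A sketch of the multi-valued case in Python (the entropy_of helper is mine), using the example V above:

    # Entropy of a multi-valued variable: sum over every distinct value, not just +/-.
    from collections import Counter
    from math import log2

    def entropy_of(values):
        counts = Counter(values)
        total = len(values)
        return -sum(c / total * log2(c / total) for c in counts.values())

    # -(3/6)log2(3/6) - (2/6)log2(2/6) - (1/6)log2(1/6), about 1.459
    print(entropy_of([1, 2, 3, 2, 1, 1]))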
- Describe approaches to avoiding overfitting in decision trees (p68).
- What is pruning of a decision tree node?
- How does a trainer determine that it is desirable to prune a node?
- Give an example of a decision-tree rule.
- Give rules for the four leaves in the decision tree on p53.
- Describe the components of a decision-tree rule, defining the terms if portion, antecedent, then portion, and postcondition.
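For the form, an example rule of the kind used in the chapter (corresponding to the Sunny, high-humidity leaf of the p53 tree):
  IF (Outlook = Sunny) AND (Humidity = High) THEN PlayTennis = No
The if portion, (Outlook = Sunny) AND (Humidity = High), is the antecedent (precondition); the then portion, PlayTennis = No, is the postcondition.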
- List the steps of Rule Post-Pruning (p71).
- Explain why rule post-pruning is more powerful than node post-pruning (p72).
(Pruning a rule can remove a single path through a node rather than the entire node; a node can be removed, even near the root, without restructuring the tree; and the resulting rules are easier to read.)