Decision trees are normally built in a greedy manner. One problem with this approach is that the best feature to split on at a given node might not be apparent without considering its interactions with each of the other remaining features. In this assignment you will add one-step lookahead to the ID3 algorithm that is included in the Weka code.
You will need to create a new decision tree method based on the given ID3 code. You should extract the Java code for ID3 from the weka-src.jar file (which should have been included when you downloaded the code). You can implement your lookahead version of ID3 either by copying the ID3 code into a new decision tree file and changing the name, or by extending the existing ID3 code.
Your lookahead should work by simply changing how information gain is used to select the best feature:
BestFeature(AvailableFeatures, RemainingExamples)
  BestGain1 = 0
  BestGainFeature = ?
  NumExamples = |RemainingExamples|
  For each feature F1 from AvailableFeatures
    ThisInfoGain = Entropy(RemainingExamples)
    For each feature value V1 of F1
      ExamplesForThisValue = RemainingExamples with value V1 of F1
      If (ExamplesForThisValue need not or cannot be further subdivided)
        ThisInfoGain = ThisInfoGain - |ExamplesForThisValue| / NumExamples * Entropy(ExamplesForThisValue)
      Else
        BestSubEntropy = infinity
        For each feature F2 from (AvailableFeatures - F1)
          SubEntropy = 0
          For each feature value V2 of F2
            SubExamplesForThisValue = ExamplesForThisValue with value V2 of F2
            SubEntropy = SubEntropy + |SubExamplesForThisValue| / NumExamples * Entropy(SubExamplesForThisValue)
          EndFor
          If (SubEntropy < BestSubEntropy)
            BestSubEntropy = SubEntropy
          EndIf
        EndFor
        ThisInfoGain = ThisInfoGain - BestSubEntropy
      EndIf
    EndFor
    If (ThisInfoGain > BestGain1)
      BestGain1 = ThisInfoGain
      BestGainFeature = F1
    EndIf
  EndFor
  Return BestGainFeature
End
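The selection step can be sketched in plain Java, independent of Weka's classes. This is a minimal sketch under stated assumptions, not the required implementation: examples are rows of nominal strings with the class label in the last column, features are column indices, a subset "need not be further subdivided" when its entropy is zero, and for each branch the lookahead keeps the second-level feature that leaves the least weighted entropy. The names LookaheadSketch, entropy, partition, and bestFeature are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class for illustration; not part of Weka.
class LookaheadSketch {

    // Entropy (in bits) of the class label, assumed to be the last column of each row.
    static double entropy(List<String[]> rows) {
        Map<String, Integer> counts = new HashMap<>();
        for (String[] r : rows) counts.merge(r[r.length - 1], 1, Integer::sum);
        double h = 0.0;
        for (int c : counts.values()) {
            double p = (double) c / rows.size();
            h -= p * Math.log(p) / Math.log(2);
        }
        return h;
    }

    // Groups rows by their value in column f.
    static Map<String, List<String[]>> partition(List<String[]> rows, int f) {
        Map<String, List<String[]>> parts = new HashMap<>();
        for (String[] r : rows) parts.computeIfAbsent(r[f], k -> new ArrayList<>()).add(r);
        return parts;
    }

    // Returns the column index with the best one-step-lookahead gain,
    // or -1 if no feature has positive gain.
    static int bestFeature(List<Integer> available, List<String[]> rows) {
        double bestGain = 0.0;
        int best = -1;
        double n = rows.size();
        for (int f1 : available) {
            double gain = entropy(rows);
            for (List<String[]> sub : partition(rows, f1).values()) {
                double h = entropy(sub);
                // Weighted entropy left in this branch if it is not split again.
                double remaining = sub.size() / n * h;
                if (h > 0.0) { // impure branch: look one split ahead
                    for (int f2 : available) {
                        if (f2 == f1) continue;
                        double subEntropy = 0.0;
                        for (List<String[]> subSub : partition(sub, f2).values()) {
                            subEntropy += subSub.size() / n * entropy(subSub);
                        }
                        // Keep the second-level split that leaves the least entropy.
                        remaining = Math.min(remaining, subEntropy);
                    }
                }
                gain -= remaining;
            }
            if (gain > bestGain) {
                bestGain = gain;
                best = f1;
            }
        }
        return best;
    }
}
```

For a table with three feature columns, bestFeature(Arrays.asList(0, 1, 2), rows) returns the chosen column index. When F1 is the only available feature, the inner loop does nothing and the branch keeps its own weighted entropy, which covers the "cannot be further subdivided" case.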
Once you get your code working, you should test it on the datasets you used in program 1. Compare the resulting trees to the trees that would be produced with the original ID3. Note that if you included features with unknown values, you should replace each unknown with one of that feature's values.
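One way to replace unknowns is to substitute the most frequent known value of the feature. The sketch below assumes the data is held as rows of strings with "?" marking an unknown (the marker used in ARFF files); the class and method names are hypothetical, and any other replacement value would satisfy the requirement equally well.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Weka.
class MissingValueSketch {

    // Replaces every "?" in column f, in place, with that column's most frequent known value.
    static void fillWithMode(String[][] rows, int f) {
        Map<String, Integer> counts = new HashMap<>();
        for (String[] r : rows) {
            if (!r[f].equals("?")) counts.merge(r[f], 1, Integer::sum);
        }
        String mode = null;
        int best = 0;
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            if (e.getValue() > best) {
                best = e.getValue();
                mode = e.getKey();
            }
        }
        if (mode == null) return; // no known values in this column; leave it unchanged
        for (String[] r : rows) {
            if (r[f].equals("?")) r[f] = mode;
        }
    }
}
```

Weka also ships a ReplaceMissingValues filter that can be applied to a dataset before training instead of writing this by hand.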
Also, design a dataset that is guaranteed to have different results for ID3 and your ID3WithLookAhead method. Hint: you may want to look at problems that are not linearly separable for inspiration.
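One reading of the hint: XOR is a standard example of a problem that is not linearly separable, and on XOR data each input feature taken alone has zero information gain, so a greedy choice sees nothing to prefer. The sketch below checks this for a two-input XOR table with binary features and a binary label in the last column. The class and method names are hypothetical, and the table is only a starting point, not a complete answer to the exercise.

```java
// Hypothetical illustration; not part of Weka.
class XorGainSketch {

    // Label (last column) = A XOR B.
    static final int[][] XOR = {
        {0, 0, 0},
        {0, 1, 1},
        {1, 0, 1},
        {1, 1, 0},
    };

    // Binary entropy in bits for `pos` positive examples out of `total`.
    static double entropy(int pos, int total) {
        if (total == 0 || pos == 0 || pos == total) return 0.0;
        double p = (double) pos / total;
        return -(p * Math.log(p) + (1 - p) * Math.log(1 - p)) / Math.log(2);
    }

    // Plain (greedy) information gain of binary feature column f.
    static double infoGain(int[][] rows, int f) {
        int label = rows[0].length - 1;
        int[] total = new int[2]; // examples per feature value
        int[] pos = new int[2];   // positive examples per feature value
        int allPos = 0;
        for (int[] r : rows) {
            total[r[f]]++;
            pos[r[f]] += r[label];
            allPos += r[label];
        }
        double gain = entropy(allPos, rows.length);
        for (int v = 0; v < 2; v++) {
            gain -= (double) total[v] / rows.length * entropy(pos[v], total[v]);
        }
        return gain;
    }
}
```

Here infoGain(XOR, 0) and infoGain(XOR, 1) both come out as zero, even though splitting on A and then B classifies every row correctly, which is exactly the kind of interaction a one-step lookahead can detect.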