GABIL is a method for learning a set of classification rules using a genetic algorithm. In this assignment you will implement a variation of the GABIL algorithm adapted to handle continuous features.
GABIL provides a mechanism for generating zero or more rules. To classify an example with such a set of rules, treat the rules as a decision list: start with the first rule and check whether it applies; if it does, use its prediction. If not, move on to the second rule, and if that rule applies, use its prediction. Continue in this way until the last rule has been tried. If no rule applies, assume a default rule as the actual last rule, and make that rule predict the majority class.
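The decision-list procedure can be sketched in a few lines of Python. This is only an illustration: the representation of a rule as a (condition, prediction) pair and the names classify and majority_class are assumptions made for the example, not part of the assignment. In your system the condition would be the bit-string match for that rule.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence, Tuple

# Hypothetical representation: a rule is a (condition, predicted_class) pair,
# where condition is any function from an example to True/False.
Rule = Tuple[Callable[[dict], bool], Hashable]


def majority_class(labels: Sequence[Hashable]) -> Hashable:
    """Return the most frequent training label (used by the default rule)."""
    return Counter(labels).most_common(1)[0][0]


def classify(rules: Sequence[Rule], example: dict, default: Hashable) -> Hashable:
    """Apply the rules in order; the first rule that applies decides the class."""
    for condition, prediction in rules:
        if condition(example):
            return prediction
    # No rule applied, so the default (majority-class) rule fires.
    return default
```

Because the rule set may be empty, the default rule is what guarantees that every example receives a prediction.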
To generate a random rule set, start by deciding how many rules it will contain. You will have parameters for the minimum and maximum number of starting rules. For example, if these values are 3 and 7, you would first pick randomly among 3, 4, 5, 6, or 7 rules. Then you would generate each rule. A rule should have a fixed number of bits for each feature, as discussed in class (and below for split points). So a rule is simply the bits needed for each feature, with bits for the output class added at the end. Each hypothesis will then start with some multiple of that number of bits, plus two bits for the AA and DC bits.
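A minimal sketch of generating one random hypothesis is shown here. The function name, the argument names, and the choice to store the AA and DC bits after the rule bits are all assumptions made for illustration; use whatever layout was discussed in class.

```python
import random
from typing import List, Optional


def random_hypothesis(bits_per_feature: List[int], class_bits: int,
                      min_rules: int, max_rules: int,
                      rng: Optional[random.Random] = None) -> List[int]:
    """Build a random bit string holding between min_rules and max_rules rules."""
    if rng is None:
        rng = random.Random()
    # One rule = the bits for every feature followed by the output-class bits.
    rule_len = sum(bits_per_feature) + class_bits
    # randint is inclusive at both ends, so 3 and 7 gives 3, 4, 5, 6, or 7.
    n_rules = rng.randint(min_rules, max_rules)
    bits = [rng.randint(0, 1) for _ in range(n_rules * rule_len)]
    # Assumption: the AA and DC bits are appended after the rules.
    bits += [rng.randint(0, 1), rng.randint(0, 1)]
    return bits
```

With two features of 3 and 4 bits and 2 class bits, each rule is 9 bits long, so a hypothesis has 9 * n + 2 bits for some n between the minimum and maximum number of rules.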
To deal with continuous features, you will start by discretizing each continuous feature. To do this, consider all of the candidate split points for that feature and choose the one with the largest information gain. Repeat this process until no split point yields positive information gain or until a fixed number of split points has been identified (this maximum will be a parameter to your system).
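One possible reading of this procedure is sketched below. Several details are assumptions rather than requirements: candidate split points are taken to be the midpoints between adjacent distinct values, the gain of a new split point is measured relative to the partition formed by the split points already chosen, and the function names are invented for the example.

```python
import math
from bisect import bisect_right
from collections import Counter
from typing import List, Sequence


def entropy(labels: Sequence) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def partition_entropy(values, labels, thresholds) -> float:
    """Weighted label entropy of the bins induced by sorted thresholds."""
    bins = {}
    for v, y in zip(values, labels):
        bins.setdefault(bisect_right(thresholds, v), []).append(y)
    n = len(labels)
    return sum(len(b) / n * entropy(b) for b in bins.values())


def choose_split_points(values, labels, max_splits: int) -> List[float]:
    """Greedily add the split point with the largest information gain."""
    distinct = sorted(set(values))
    # Assumption: candidates are midpoints between adjacent distinct values.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    chosen: List[float] = []
    while candidates and len(chosen) < max_splits:
        current = partition_entropy(values, labels, chosen)
        gain, best = max(
            (current - partition_entropy(values, labels, sorted(chosen + [t])), t)
            for t in candidates
        )
        if gain <= 1e-12:  # no candidate gives positive information gain
            break
        chosen = sorted(chosen + [best])
        candidates.remove(best)
    return chosen
```

For example, with values 1, 2, 3, 10, 11, 12 and labels 0, 0, 0, 1, 1, 1, the midpoint 6.5 separates the classes perfectly, so it is chosen first and no further split point has positive gain.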
You should handle unknown values by adding an unknown bit for every feature and using this bit to represent cases where the feature value is not known.
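As an illustration of how the unknown bit might fit into a feature's encoding, the sketch below assumes that a feature with k split points uses k + 1 interval bits followed by one unknown bit, and that a rule accepts a value when the corresponding bit is 1. This layout, the use of None for a missing value, and the function names are assumptions for the example; follow the encoding given in class if it differs.

```python
from bisect import bisect_right
from typing import Optional, Sequence


def value_index(value: Optional[float], split_points: Sequence[float]) -> int:
    """Index of the bit that a feature value selects.

    Indices 0..k cover the k + 1 intervals defined by k sorted split points;
    index k + 1 is the unknown bit (assumed to be stored last).
    """
    if value is None:  # assumption: None marks an unknown value
        return len(split_points) + 1
    return bisect_right(split_points, value)


def feature_matches(rule_bits: Sequence[int], value: Optional[float],
                    split_points: Sequence[float]) -> bool:
    """A rule's test on one feature passes if the selected bit is set."""
    return rule_bits[value_index(value, split_points)] == 1
```

With split points 2.5 and 4.5, the feature needs four bits: three for the intervals and one for unknown.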
Your genetic algorithm should have parameters for the size of the population, the percentage of the population to be replaced at each step, the number of population iterations to perform, the minimum and maximum number of rules in the initial random rule sets, and the maximum number of split points for each feature.
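One way to keep these parameters together is a small configuration object, sketched below. The class name, the field names, and the default values are placeholders chosen for illustration (only 3 and 7 echo the earlier example); they are not recommended settings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GabilParams:
    """Hypothetical container for the parameters listed above."""
    population_size: int = 100         # placeholder value
    replacement_fraction: float = 0.6  # fraction replaced at each step (placeholder)
    generations: int = 50              # number of population iterations (placeholder)
    min_initial_rules: int = 3
    max_initial_rules: int = 7
    max_split_points: int = 4          # per continuous feature (placeholder)

    def __post_init__(self) -> None:
        if not 0.0 <= self.replacement_fraction <= 1.0:
            raise ValueError("replacement_fraction must be between 0 and 1")
        if self.min_initial_rules > self.max_initial_rules:
            raise ValueError("min_initial_rules exceeds max_initial_rules")

    def num_replaced(self) -> int:
        """Number of population members replaced at each step."""
        return round(self.replacement_fraction * self.population_size)
```

Validating the values once at construction time means a bad setting is reported before a long experiment starts.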
You should hand in a documented copy of your code (including your class files).
You should perform experiments to see how well your algorithm performs. Create a set of test values for each of the parameters, and then perform five 10-fold cross-validation experiments with each setting. Then, for each dataset, present a table covering all of the parameter combinations you try, showing the average error and the standard deviation of the error. You should test on coolcars, your own dataset, the promoters-936 dataset, and the iris dataset.
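A rough sketch of the experimental loop follows. It makes several assumptions: the folds are unstratified random partitions, the mean and standard deviation are taken over all 50 fold error rates (you could instead average within each of the five runs first), and train_fn is a hypothetical stand-in for your learner that takes training examples and labels and returns a prediction function. The dataset must contain at least k examples so that no fold is empty.

```python
import random
import statistics
from typing import Callable, List, Sequence, Tuple


def kfold_indices(n: int, k: int, rng: random.Random) -> List[List[int]]:
    """Shuffle 0..n-1 and deal the indices into k folds of near-equal size."""
    idx = list(range(n))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]


def repeated_cv_error(examples: Sequence, labels: Sequence,
                      train_fn: Callable, repeats: int = 5, k: int = 10,
                      seed: int = 0) -> Tuple[float, float]:
    """Return (mean, standard deviation) of the error over repeats * k folds."""
    rng = random.Random(seed)
    errors = []
    for _ in range(repeats):
        for fold in kfold_indices(len(examples), k, rng):
            held_out = set(fold)
            train = [i for i in range(len(examples)) if i not in held_out]
            # train_fn is hypothetical: it returns a function example -> label.
            predict = train_fn([examples[i] for i in train],
                               [labels[i] for i in train])
            wrong = sum(predict(examples[i]) != labels[i] for i in fold)
            errors.append(wrong / len(fold))
    return statistics.mean(errors), statistics.stdev(errors)
```

Fixing the seed makes the fold assignments repeatable, so different parameter settings are compared on the same partitions.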
You must also submit your code electronically. To do this, create a tar file of all of your code and submit it to the class webdrop by going to https://webdrop.d.umn.edu/ and, after logging in, picking the webdrop for 8751.