Some sample exam 1 questions:

1. Briefly define the following terms: (6 points each)
   Concept Learning
   Continuous-Valued Attribute
   Discrete-Valued Attribute
   Inductive Learning
   The Inductive Learning Hypothesis
   Version Space
   Inductive Bias
   Noise
   Decision Tree
   Entropy
   Information Gain
   Gain Ratio (in decision trees)
   Overfitting
   Gradient Descent
   Artificial Neural Network
   Linear Threshold Unit
   Sigmoid Unit
   Eager Learning
   Lazy Learning
   Curse of Dimensionality
   kd Tree
   Single Point Crossover
   Two Point Crossover
   Uniform Crossover
   Point Mutation
   The Baldwin Effect
   Inverted Deduction

2. Outline the four key questions that must be answered when designing a machine learning algorithm. Give an example of an answer for each question. (20 points)

3. Define the following algorithms:
   Find-S (20 points)
   List-Then-Eliminate (Version Space) (20 points)
   Candidate Elimination (Version Space) (25 points)
   ID3 (25 points)
   Perceptron Training Algorithm (assuming linear artificial neurons) (25 points)
   Backpropagation (assuming sigmoidal artificial neurons) (25 points)

4. For each of the algorithms above, show how it works on a specific problem (examples of these may be found in the book or in the notes).

5. Why is inductive bias important for a machine learning algorithm? Give some examples of ML algorithms and their corresponding inductive biases. (20 points)

6. How would you represent the following concepts in a decision tree: (15 points)
   A OR B
   A AND NOT B
   (A AND B) OR (C OR NOT D)

7. What problem does reduced-error pruning address? How do we decide when to prune a decision tree? (20 points)

8. How do you translate a decision tree into a corresponding set of rules? (20 points)

9. What mechanism was suggested in class for dealing with continuous-valued attributes in a decision tree? (20 points)

10. What mechanism was suggested in class for dealing with missing attribute values in a decision tree? (20 points)

11. What types of concepts can be learned with a perceptron using linear units? Give an example of a concept that could not be learned by this type of artificial neural network. (15 points)

12. A multi-layer perceptron with sigmoid units can learn (using an algorithm like backpropagation) concepts that cannot be learned by artificial neural networks that lack hidden units or sigmoid activation functions. Give an example of a concept that could be learned by such a network and what the weights of a learned representation of this concept might be. (20 points)

13. An artificial neural network uses gradient descent to search for a local minimum in weight space. How is a local minimum different from the global minimum? Why doesn't gradient descent find the global minimum? (20 points)

14. A concept is represented in C4.5 format with the following files.

    The .names file is:

      Class1,Class2,Class3. | Classes
      FeatureA: continuous
      FeatureB: BValue1, BValue2, BValue3, BValue4
      FeatureC: continuous
      FeatureD: Yes, No

    The data file is as follows:

      2.5,BValue2,100.0,No,Class2
      1.1,BValue4,300.0,Yes,Class1
      2.3,BValue3,150.0,No,Class3
      1.4,BValue1,350.0,No,Class2

    What input and output representation would you use to learn this problem using an artificial neural network? Give the input and output vectors for each of the data points shown above. What are the advantages and disadvantages of your representation? (20 points)

15. How does a k-Nearest Neighbor learner make predictions about new data points? How does a distance-weighted k-Nearest Neighbor learner differ from a standard k-Nearest Neighbor learner? What is locally weighted regression? (25 points)

16. How does a Radial Basis Function network work? How does a kernel function work? (20 points)

17. How are concepts represented in a genetic algorithm? Give an example of a concept represented in a GA. (20 points)

18. What operators are used in a genetic algorithm to produce new concepts? Give an example of a mechanism that can be used to judge a GA concept. (15 points)

19. Give pseudo-code for a general genetic algorithm. Make sure to outline the way concepts are represented, the operators used to create new concepts, how concepts are chosen to reproduce, and how concepts are evaluated. (25 points)

20. Give two different mechanisms for selecting which members of a GA population should reproduce. What are the advantages and disadvantages of your mechanisms? (20 points)

21. How does genetic programming work? How is a genetic program defined? What genetic operators can be applied to a genetic program? (25 points)

22. How does the sequential covering algorithm work to generate a set of rules for a concept? (20 points)

23. How does FOIL work to generate first-order logic rules for a concept? (20 points)

24. What does it mean to view induction as inverted deduction? Give a deduction rule and explain how that rule can be inverted to induce new rules. (20 points)
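
Note on the file layout in question 14: for readers unfamiliar with C4.5 files, the following is a minimal Python sketch of how the .names and data text shown in that question could be read into records. The function names (parse_names, parse_data) are hypothetical and not part of C4.5. The sketch assumes that text after a "|" is a comment and that a trailing "." ends an entry. It only reads the files as given and does not choose a neural network encoding, which is what the question asks for.

```python
# Illustrative sketch only: parse_names and parse_data are hypothetical helpers.
# The file contents below are copied from question 14.
NAMES_TEXT = """\
Class1,Class2,Class3. | Classes
FeatureA: continuous
FeatureB: BValue1, BValue2, BValue3, BValue4
FeatureC: continuous
FeatureD: Yes, No
"""

DATA_TEXT = """\
2.5,BValue2,100.0,No,Class2
1.1,BValue4,300.0,Yes,Class1
2.3,BValue3,150.0,No,Class3
1.4,BValue1,350.0,No,Class2
"""


def parse_names(text):
    """Return (classes, attributes) from .names text.

    attributes is a list of (name, spec) pairs, where spec is the string
    "continuous" or a list of the allowed discrete values.
    """
    lines = []
    for raw in text.splitlines():
        # Assumption: everything after '|' on a line is a comment.
        line = raw.split("|", 1)[0].strip()
        if line:
            lines.append(line.rstrip("."))
    # The first non-empty line lists the class labels.
    classes = [c.strip() for c in lines[0].split(",")]
    attributes = []
    for line in lines[1:]:
        name, spec = line.split(":", 1)
        spec = spec.strip()
        if spec != "continuous":
            spec = [v.strip() for v in spec.split(",")]
        attributes.append((name.strip(), spec))
    return classes, attributes


def parse_data(text, attributes):
    """Return a list of (attribute_dict, class_label) records."""
    records = []
    for raw in text.splitlines():
        if not raw.strip():
            continue
        # The last comma-separated field is the class label.
        *values, label = [f.strip() for f in raw.split(",")]
        row = {}
        for (name, spec), value in zip(attributes, values):
            # Continuous attributes become floats; discrete ones stay strings.
            row[name] = float(value) if spec == "continuous" else value
        records.append((row, label))
    return records


classes, attributes = parse_names(NAMES_TEXT)
records = parse_data(DATA_TEXT, attributes)
```

Each record pairs a dictionary of attribute values with its class label, so the continuous attributes (FeatureA, FeatureC) and the discrete ones (FeatureB, FeatureD) stay distinguishable when deciding how to encode them.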