HW1

TM textbook, Pg. 48 - 2.4

Note: For part d) indicate a set of examples that will learn the corresponding concept. Also draw a picture containing the concept and the examples.

Consider the instance space consisting of integer points in the x , y plane and the set of hypotheses H consisting of rectangles. More precisely, hypotheses are of the form a <= x <=b , c <=y <=d , where a , b, c, and d can be any integers.

(a) Consider the version space with respect to the set of positive (+) and negative (-) training examples shown below. What is the S boundary of the version space in this case? Write out the hypotheses and draw them in on the diagram.

(b) What is the G boundary of this version space? Write out the hypotheses and draw them in.

(c) Suppose the learner may now suggest a new x , y instance and ask the trainer for its classification. Suggest a query guaranteed to reduce the size of the version space, regardless of how the trainer classifies it. Suggest one that will not.

(d) Now assume you are a teacher, attempting to teach a particular target concept (e.g., 3 <=x <=5 , 2 <=y <=9). What is the smallest number of training examples you can provide so that the CANDIDATE-ELIMINATION algorithm will perfectly learn the target concept?

For part (d) and for the other parts as well, it might be easier to use the grid from below to try out soultions.

Hint on how to start part (d): the concept to be learn is a rectangle, meaning that S and G are the rectangle (of course, we have zero version space in this situation of perfectly learning the concept which in our case is the rectangle). Start by placing ~8 positive data inside the rectangle, including on its edges; and place ~8 negative examples outside the rectangle. Now, think how to use the minimum numebr of positives and negatives so that the concept rectangle is learned (again, this means S and G are the rectangle).