Artificial intelligence questions and answers, Exercises of Artificial Intelligence

General Learnings, Questions with Answers.

Typology: Exercises

2021/2022

Uploaded on 02/24/2022 by kavinsky

CMPSCI 683 Artificial Intelligence
Questions & Answers
1. General Learning
Consider the following modification to the restaurant example described in class, which includes missing and
partially specified attributes:
⇒ The outcomes for X1 and X7 are reversed.
⇒ X3 has a missing attribute value for "Pat".
⇒ X5 has a missing attribute value for "Hun".
⇒ X10's value for the attribute "Type" could be either Italian or French.
Define an algorithm for dealing with missing attributes and partially specified attributes, which includes the
modified calculation for information gain used to make splitting decisions.
Generate a decision tree for this example using your new algorithm.
Answer
There are many ways of answering this question. One algorithm is as follows:
For a training instance with multi-valued attributes, I will duplicate that instance by the number of values of that
attribute, but each duplicated instance will be weighted down by the number of times I have seen each value in the
other training examples.
For example, in the restaurant example, X10 will now become X10′ and X10′′. X10′ will have a value of French,
with a weight of 2/3 (note this is 2/3 because there are only 3 examples with either French or Italian, of which 2
are French). X10′′ will have a weight of 1/3 when learning my decision tree.
For a missing attribute, I will treat it like a multi-valued attribute, using all possible values of the missing
attribute.
For example, X3 will become X3′, X3′′ and X3′′′. X3′ will have the value None for Pat, with a weight of 2/11. X3′′ will
have the value Some for Pat, with a weight of 3/11. X3′′′ will have the value Full for Pat, with a weight of 6/11.
Note that these weights are independent of each other. So, if X10 also had the value of Pat missing, I would have
to generate 6 new training instances; X10′ would be French for Type and None for Pat, with a weight of 2/3 *
2/11.
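The duplication-and-weighting scheme described above can be sketched in a few lines of Python. This is only an illustration of the idea, and the function and variable names are mine, not from the course materials:

```python
from itertools import product

def expand_instance(instance, value_fractions):
    """Expand one training instance into weighted copies, one per
    combination of candidate values for its unknown attributes.

    An unknown attribute maps to a list of candidate values (all values
    if missing, a subset if partially specified).  value_fractions[attr][val]
    is the fraction of the other training examples taking that value.
    Weights for different unknown attributes multiply, since the scheme
    treats them as independent.
    """
    unknown = {a: vals for a, vals in instance.items() if isinstance(vals, list)}
    copies = []
    for combo in product(*unknown.values()):
        copy, weight = dict(instance), 1.0
        for attr, val in zip(unknown, combo):
            copy[attr] = val
            weight *= value_fractions[attr][val]
        copies.append((copy, weight))
    return copies

# X10's Type is either French or Italian; 2 of the 3 other French/Italian
# examples are French, so the two copies get weights 2/3 and 1/3.
x10 = {"Type": ["French", "Italian"], "Pat": "Full"}
fractions = {"Type": {"French": 2 / 3, "Italian": 1 / 3}}
for copy, weight in expand_instance(x10, fractions):
    print(copy["Type"], round(weight, 3))
```

With two unknown attributes the Cartesian product yields all 6 combinations, matching the X10-with-missing-Pat case in the answer.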
We can now use this modification of the algorithm given in class to compute our decision tree by calculating the
information gain. I will show the formulas for some of the important ones here:
FRI = 5/12 * I(2/5, 3/5) + 7/12 * I(4/7, 3/7)
HUN = (7 + 7/11)/12 * I[4/(7 + 7/11), (3 + 7/11)/(7 + 7/11)] + (4 + 4/11)/12 * I[2/(4 + 4/11), (2 + 4/11)/(4 + 4/11)]
TYPE = 4/12 * I(2/4, 2/4) + 4/12 * I(3/4, 1/4) + (2 + 2/3)/12 * I(1, 0) + (1 + 1/3)/12 * I[1/(1 + 1/3), (1/3)/(1 + 1/3)]
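For concreteness, these remainder terms can be evaluated numerically; here I(p1, p2) is the usual entropy function from class. A quick sketch (the helper names are mine), using the FRI and TYPE formulas above:

```python
import math

def I(*ps):
    """Entropy of a discrete distribution; 0 * log2(0) is treated as 0."""
    return -sum(p * math.log2(p) for p in ps if p > 0)

# Remainder for splitting on FRI, mirroring the formula above.
fri = 5/12 * I(2/5, 3/5) + 7/12 * I(4/7, 3/7)

# Remainder for TYPE, including the fractional (weighted) X10 copies.
type_ = (4/12 * I(2/4, 2/4) + 4/12 * I(3/4, 1/4)
         + (2 + 2/3)/12 * I(1, 0)
         + (1 + 1/3)/12 * I(1/(1 + 1/3), (1/3)/(1 + 1/3)))

# A lower remainder means a higher information gain for that split.
print(round(fri, 3), round(type_, 3))
```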
And so on. The final decision tree will look something like this:

2. Neural Networks

Apply the backpropagation algorithm to the following network. Your training example is: E1: I1 = 1, I2 = 3, o = 0. All weights are set to the value 1 initially; the learning rate is set to 0.2. Show the updated weight values after the training of the above two examples. (In the figure below, "in^2" means the square of the sum of the inputs.)

Answer

g(in) = in^2, so g′(in) = 2 · in.
To answer this question, I will normalize my inputs: I1 and I2 become 0.25 and 0.75 respectively. a1 and a2 are then both equal to 1. I will normalize them again, so a1 and a2 become 0.5 and 0.5 respectively, and O = 1. In my training example, o was 0. I will compute the new weights using the backpropagation method:
Δo = (training output − calculated output) · g′(input) = (0 − 1) · 2 · 1 = −2
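The forward pass and output delta in this answer can be reproduced in a short sketch. The normalization steps and the activation g(in) = in² follow the answer above; the variable names are mine:

```python
def g(x):
    """Activation from the question: square of the summed input."""
    return x * x

def g_prime(x):
    return 2 * x

eta = 0.2                                  # learning rate from the question
i1, i2 = 1.0, 3.0
n1, n2 = i1 / (i1 + i2), i2 / (i1 + i2)    # normalized inputs: 0.25, 0.75

# Hidden layer: all weights start at 1, so each unit sees n1 + n2 = 1.
a1 = a2 = g(n1 * 1.0 + n2 * 1.0)           # a1 = a2 = g(1) = 1
h1, h2 = a1 / (a1 + a2), a2 / (a1 + a2)    # normalized again: 0.5, 0.5

o_in = h1 * 1.0 + h2 * 1.0                 # output unit's summed input = 1
o = g(o_in)                                # calculated output O = 1

# Output delta: (target - output) * g'(input) = (0 - 1) * 2 * 1 = -2
delta_o = (0.0 - o) * g_prime(o_in)
print(delta_o)

# A full update would then be w <- w + eta * delta_o * h_i for each output
# weight; the answer above stops at the delta.
```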
3. Utility Theory

In class we discussed an example of decision trees that involved buying a car, with tests that could be performed to assess the necessity of repair. What would be the value of information for performing each of the tests in the context of the existence of the other test? What would be the value of information associated with the two tests?

Answer

First, we need to calculate some probabilities using Bayes' rule, as we can't use the given conditional probabilities directly.
Given:
P(c1 = good) = 0.7
P(c2 = good) = 0.8
P(T1 = pass | c1 = good) = 0.8
P(T1 = pass | c1 = bad) = 0.35
P(T2 = pass | c2 = good) = 0.75
P(T2 = pass | c2 = bad) = 0.3
Transformed using Bayes' rule:
P(c1 = good | T1 = fail) = 0.418
P(c1 = bad | T1 = fail) = 0.582
P(c1 = good | T1 = pass) = 0.842
P(c1 = bad | T1 = pass) = 0.158
P(c2 = good | T2 = fail) = 0.588
P(c2 = bad | T2 = fail) = 0.412
P(c2 = good | T2 = pass) = 0.909
P(c2 = bad | T2 = pass) = 0.091
P(T1 = fail) = 0.335
P(T1 = pass) = 0.665
P(T2 = fail) = 0.340

P(T2 = pass) = 0.660
To determine the value of the tests, I deducted the price of whichever test we perform first from the profits, since we would have paid that already.
The case where we have already done test 1 is shown in Fig. 1. We can see that the expected profit is the same whether or not we do test 2, so the value of information of test 2, given that we have already done test 1, is 0. As test 2 is not free, we would never do test 2 after test 1.
The case where we have already done test 2 is shown in Fig. 2. The expected profit with test 2 only is 270; if we add test 1 it is 312.6, so the value of information of test 1 is 42.6. Unfortunately, test 1 is not free either but comes at a price of 50, so we would never do test 1 after test 2 either.
Without either of the two tests, the expected profit is max{500 * 0.7 − 200 * 0.3 = 290, 250 * 0.8 + 100 * 0.2 = 220} = 290. With both tests (assuming they are both free) it's 332.6, so the value of information of both tests is 42.6.
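The Bayes-rule transformations listed above can be checked with a few lines of Python (the helper name is mine):

```python
def posteriors(p_good, p_pass_good, p_pass_bad):
    """Return P(test passes) and P(car good | pass / fail) via Bayes' rule."""
    p_pass = p_pass_good * p_good + p_pass_bad * (1 - p_good)
    p_good_pass = p_pass_good * p_good / p_pass
    p_good_fail = (1 - p_pass_good) * p_good / (1 - p_pass)
    return p_pass, p_good_pass, p_good_fail

# Car 1 / test T1 and car 2 / test T2, using the givens above.
for label, args in [("T1", (0.7, 0.8, 0.35)), ("T2", (0.8, 0.75, 0.3))]:
    p_pass, p_gp, p_gf = posteriors(*args)
    print(label, round(p_pass, 3), round(p_gp, 3), round(p_gf, 3))
```

This reproduces the table: P(T1 = pass) = 0.665 with posteriors 0.842 / 0.418, and P(T2 = pass) = 0.660 with posteriors 0.909 / 0.588.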

[Figures 1 and 2: decision trees for performing T1 and T2.]
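The no-test baseline and the value of information of both tests can be verified numerically. The payoffs 500/−200 for car 1 and 250/100 for car 2 come from the max{...} expression above, and the 332.6 both-tests profit is quoted from the answer (the full decision tree needed to derive it is in the figures):

```python
# Expected profit with no tests: choose the better car outright.
ev_car1 = 500 * 0.7 - 200 * 0.3      # = 290
ev_car2 = 250 * 0.8 + 100 * 0.2      # = 220
ev_no_test = max(ev_car1, ev_car2)   # = 290

# With both (free) tests the answer quotes an expected profit of 332.6,
# so the value of information of the two tests together is:
voi_both = 332.6 - ev_no_test        # = 42.6
print(round(ev_no_test), round(voi_both, 1))
```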

1. Build a decision tree for this problem. Only draw the branch that corresponds to T1 being performed first. Note: you do not have to repeatedly draw similar branches; just draw a representative branch and explain what the other branches you did not end up drawing might look like.

Answer
2. Consider an example branch from my decision tree. Determine the utility of each node in this branch. This is one of the branches for performing the tests T1 and T2 simultaneously.

Answer

The utility of the + branch of chance node C = 500 − 125 = 375
The utility of the − branch of chance node C = 500 − 700 − 125 = −325
The utility of the chance node C = P(C1|~T1) * 375 + P(~C1|~T1) * (−325)
Note here that C1 is independent of C2 and T2.
P(C1|~T1) = P(~T1|C1) * P(C1) / P(~T1)
P(~T1) = P(~T1|C1) * P(C1) + P(~T1|~C1) * P(~C1) = 0.2 * 0.7 + 0.65 * 0.3 = 0.335
P(C1|~T1) = 0.2 * 0.7 / 0.335 = 0.418
The utility of the chance node is −32.4.
The utility of the + branch of chance node C = 250 − 125 = 125
The utility of the − branch of chance node C = 250 − 150 − 125 = −25

The utility of the chance node C = P(C2|~T2) * 125 + P(~C2|~T2) * (−25)
P(C2|~T2) = P(~T2|C2) * P(C2) / P(~T2)
P(~T2) = P(~T2|C2) * P(C2) + P(~T2|~C2) * P(~C2) = 0.25 * 0.8 + 0.70 * 0.2 = 0.34
P(C2|~T2) = 0.25 * 0.8 / 0.34 = 0.588
The utility of the chance node is 63.2. This 63.2 value will then be propagated all the way to the top.
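Assuming the reconstructed signs and probabilities above (the minus signs were lost in the original scan), both chance-node utilities can be recomputed directly. Note that exact arithmetic gives about −32.5 for the first node; the −32.4 in the text comes from using the rounded posterior 0.418:

```python
def chance_node_utility(p_good, p_fail_good, p_fail_bad, u_plus, u_minus):
    """Utility of the chance node reached after the corresponding test fails."""
    p_fail = p_fail_good * p_good + p_fail_bad * (1 - p_good)
    p_good_fail = p_fail_good * p_good / p_fail
    return p_good_fail * u_plus + (1 - p_good_fail) * u_minus

# Car 1 node: + branch 500 - 125 = 375, - branch 500 - 700 - 125 = -325.
u1 = chance_node_utility(0.7, 0.2, 0.65, 375, -325)
# Car 2 node: + branch 250 - 125 = 125, - branch 250 - 150 - 125 = -25.
u2 = chance_node_utility(0.8, 0.25, 0.70, 125, -25)
print(round(u1, 1), round(u2, 1))
```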

drop the ball. Some probabilities that will be used in the calculations below: P(¬B) = 0.1. The probability that Orville's battery is low, given that the observer saw Orville drop the ball, is 0.263.