STK4051 - Spring 2020

Q and A for Exam

Q1: Do all subtasks have equal weight?

A1: Yes.
Additional comment for STK4051: You need 40% to pass. If you solve 40% of the problems, all of them must be 100% correct (difficult to achieve). If you solve 80% of the problems, you need an average of 50% correct to pass the exam [this is easier; most of you should manage]. If you solve 100% of the problems and have these mostly correct, you have done excellent work. [Many of you should be able to do this as well :-)]

Q2: Where are the data files?

A2: There is a link in Inspira, but you can also find them here.

Q3: In problem 3b, expression (7), it is not clearly defined which mode the given answer for \(C_i\) corresponds to. Can you make this explicit?

A3: The distribution in expression (6) has two modes: one centered around \( \mu \), the other centered around \( -\mu \). The class membership \(C_i\) enumerates these modes. Expression (7) gives the probability of the class that corresponds to the mode centered around \( \mu \).

Q4: In problem 4a I'm a bit confused by the question. Should we rather derive the conditional distributions \(P(C_i|y_i,\nu,\mu)\), \(p(\nu|{\bf C,y},\mu)\) and \(p(\mu |{\bf C,y},\nu)\), where \({\bf C}=[C_1,C_2,...,C_n]\) and \({\bf y}=[y_1,y_2,...,y_n]\)?

A4: Yes. These distributions (\(P(C_i|y_i,\nu,\mu)\), \(p(\nu|{\bf C,y},\mu)\) and \(p(\mu |{\bf C,y},\nu)\)) are what is needed for the Gibbs sampler. The problem text of the exam reads as if you should use only one data point, but you should include all the data in the posterior.
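
As a sketch of why all the data enter: assume that the pairs \((C_i, y_i)\) are conditionally independent given \(\mu\) and \(\nu\), and that \(\mu\) and \(\nu\) have independent priors \(p(\mu)\) and \(p(\nu)\). Both are assumptions made here for illustration; the exam text defines the actual model. Under these assumptions the full conditionals have the generic form

\[ P(C_i \mid y_i, \nu, \mu) \propto P(C_i \mid \nu, \mu)\, p(y_i \mid C_i, \nu, \mu), \]

\[ p(\nu \mid {\bf C,y}, \mu) \propto p(\nu) \prod_{i=1}^{n} P(C_i \mid \nu, \mu)\, p(y_i \mid C_i, \nu, \mu), \]

\[ p(\mu \mid {\bf C,y}, \nu) \propto p(\mu) \prod_{i=1}^{n} P(C_i \mid \nu, \mu)\, p(y_i \mid C_i, \nu, \mu). \]

Only the update of \(C_i\) involves a single data point; the updates of \(\nu\) and \(\mu\) take the product over all \(n\) observations.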

Q5: About problem 1a. I'm not sure what you mean by "how many samples do you need?"

A5: What is important in this exercise is that you use Monte Carlo integration, and that you show you can control the Monte Carlo variability through the number of samples used in the integration.
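
One way to make "control the Monte Carlo variability" concrete is to report the standard error of the estimate and pick the number of samples from a pilot run. The sketch below is in Python rather than R, and the integrand `h`, the sampler, the target standard error and the helper names are all hypothetical choices for illustration, not the quantities from problem 1a.

```python
import math
import random


def mc_estimate(h, sampler, n, rng):
    """Return the Monte Carlo estimate of E[h(X)] and its standard error."""
    values = [h(sampler(rng)) for _ in range(n)]
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var / n)


def samples_needed(h, sampler, target_se, rng, pilot_n=1000):
    """Use a pilot run to estimate the n that gives roughly target_se."""
    _, pilot_se = mc_estimate(h, sampler, pilot_n, rng)
    sd = pilot_se * math.sqrt(pilot_n)  # estimated sd of h(X)
    return math.ceil((sd / target_se) ** 2)


rng = random.Random(4051)


def h(x):
    # Hypothetical integrand: E[X^2] = 1 when X ~ N(0, 1).
    return x ** 2


def sampler(r):
    # Hypothetical sampling distribution: standard normal.
    return r.gauss(0.0, 1.0)


n = samples_needed(h, sampler, target_se=0.01, rng=rng)
estimate, se = mc_estimate(h, sampler, n, rng)
print(n, estimate, se)
```

Because the standard error scales as \(1/\sqrt{n}\), halving it requires four times as many samples.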

Q6: In 1a, expression (4), what is the dimension of x on the left-hand side of the equation?

A6: In this expression \(x\) has two dimensions, i.e. \(x = (x_1, x_2)\). It would have been clearer if the x on the left-hand side had been bold, i.e. \(q({\bf x})=q_1(x_1)\cdot q_2(x_2) \).

Q7: In problem 4b you write: "select the prior distribution as you see fit, but give an argument for your choice". Should we use these priors in problem 4a as well?

A7: In problem 4a you can use generic expressions for your prior distributions, i.e. \(p(\mu)\) and \(p(\nu)\).

Q8: About problem 2b: can we simply modify the script to fit our task, or should we rewrite the code from scratch?

A8: You can keep the code as it is if it does the job you need it to do. You do, however, have full responsibility for the code you present in the end. [If there are errors in the code and you just copy it, you are the one who makes the error.] You also need to explain the purpose of the different parts of the code in the context of the genetic algorithm.
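
To illustrate what "the purpose of the different parts" can mean, here is a minimal sketch of a genetic algorithm with its usual building blocks labeled. It is written in Python, it is not the course script, and the toy fitness function, the target vector and all parameter values are hypothetical stand-ins; in a variable-selection problem the fitness would typically be a model criterion such as minus the AIC.

```python
import random


def fitness(chromosome, target):
    """Toy fitness: number of genes that agree with a hypothetical target."""
    return sum(c == t for c, t in zip(chromosome, target))


def select_parent(population, scores, rng):
    """Selection: rank-based, so fitter individuals are picked more often."""
    order = sorted(range(len(population)), key=lambda i: scores[i])
    ranks = [0] * len(population)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return rng.choices(population, weights=ranks, k=1)[0]


def crossover(a, b, rng):
    """Crossover: one-point split, head from parent a and tail from parent b."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(chromosome, rate, rng):
    """Mutation: flip each gene independently with probability rate."""
    return [1 - g if rng.random() < rate else g for g in chromosome]


def genetic_algorithm(target, pop_size=20, generations=60, rate=0.01, seed=1):
    rng = random.Random(seed)
    # Initialization: random binary chromosomes (1 = variable included).
    population = [[rng.randint(0, 1) for _ in range(len(target))]
                  for _ in range(pop_size)]
    best = max(population, key=lambda c: fitness(c, target))
    for _ in range(generations):
        scores = [fitness(c, target) for c in population]
        # New generation: select two parents, cross them, mutate the child.
        population = [
            mutate(crossover(select_parent(population, scores, rng),
                             select_parent(population, scores, rng), rng),
                   rate, rng)
            for _ in range(pop_size)
        ]
        generation_best = max(population, key=lambda c: fitness(c, target))
        if fitness(generation_best, target) > fitness(best, target):
            best = generation_best  # keep track of the best solution seen
    return best


target = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]  # hypothetical "true" model
best = genetic_algorithm(target)
print(best, fitness(best, target))
```

Each choice here (rank-based selection, one-point crossover, the mutation rate, the population size) is one option among several, and stating which one you used is part of explaining the code.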

Q9: There are some choices in the "baseball_genetic.r" code that I find strange. Do I need to follow the same setup?

A9: The code "baseball_genetic.r" contains many options. Some are commented out, and the one that is active need not be the best there is. In your answer, state explicitly which choices you have made, and present code that matches these choices. When you apply the code to the data in the text, your choices may turn out to be suboptimal. It is more important that you are able to see the weakness (if there is one) and discuss it than that you find an even better solution. [You don't have time to test every possible improvement.]

Q10: Do you want me to include all of the code or just the parts relevant for the text?

A10: In general I find it useful if you present a selection of the code in the main text and add the full code in an appendix. I do not always read the full code, but in cases where you have not succeeded in getting the correct answers, I generally check your code to see whether the error is due to a programming slip or lies at a more fundamental level.

Published June 9, 2020 9:11 AM - Last modified June 13, 2020 1:30 PM