Date: 2019-06-13 10:12

Spring 2019

Take-Home Final Exam

Due Date: Wednesday, June 12, 2019 at 12:00 p.m. (NOON) - upload on Learn

Instructions

The Take-Home Final is to be completed individually. If you have questions, please let me know. I will only answer questions to clarify wording or possible misunderstandings (i.e., I will not look at your code to help debug it; therefore, do not send code in your emails).

Please submit your code, with comments marked by a # sign and/or as part of an R Markdown file, and submit a separate Word or PDF document with code, plots, and answers (appropriately labeled; e.g., Part 1, Question a). NOTE: Do NOT submit results of help files or str output. I should be able to run your code without having to manually comment out your output. Also, properly label each Part of the Exam as well as each Question.

Problem 1 (10 points)

a) Simulate 500 datasets with n = 100 paired observations $(x_i, y_i)$, such that

$$y_i = 1.5 + 0.6\,x_i + \epsilon_i \qquad (1)$$

where $x_i$ is normally distributed with mean = 3 and SD = 1, and $\epsilon_i$ is normally distributed with mean = 0 and SD = 0.8. Note that this is a simple linear regression model with $\beta_0 = 1.5$, $\beta_1 = 0.6$, and population correlation ρ = 0.6. Store your simulation data in two matrices called simx and simy (one row per sample).
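One possible way to set up the simulation in part (a) is sketched below. The seed value and the helper names n.sim and eps are illustrative assumptions, not part of the exam; only simx and simy are named in the problem.

```r
set.seed(2019)   # arbitrary seed, chosen only so the sketch is reproducible
n.sim <- 500     # number of simulated datasets (assumed helper name)
n     <- 100     # paired observations per dataset

# One row per dataset, one column per observation
simx <- matrix(rnorm(n.sim * n, mean = 3, sd = 1), nrow = n.sim, ncol = n)

# Error terms with mean 0 and SD 0.8, same layout as simx
eps <- matrix(rnorm(n.sim * n, mean = 0, sd = 0.8), nrow = n.sim, ncol = n)

# Equation (1), applied elementwise to every (dataset, observation) pair
simy <- 1.5 + 0.6 * simx + eps

dim(simx)        # rows = datasets, columns = observations
```

Storing each dataset as a row means that a single sample can later be pulled out with simx[i, ] and simy[i, ].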

b) The goal is to assess the actual coverage probabilities (the probability of containing the true value of 0.6) of 95% confidence intervals for ρ based on the sample Pearson correlation coefficient r, using Fisher's Z-transform method. The sample correlation coefficient r does not have a normal sampling distribution, but the transformation

$$z' = 0.5\,[\log(1 + r) - \log(1 - r)],$$

where log is the natural logarithm, has an approximately normal distribution with standard error $1/\sqrt{n - 3}$. Using the standard error and the appropriate critical value (quantile) from the standard normal distribution, a 95% C.I. for $z'$ can be created. To obtain a C.I. for ρ, apply the inverse transformation

$$r = \frac{e^{2z'} - 1}{e^{2z'} + 1} \qquad (2)$$

to the upper and lower limits of the C.I. Write a function f.confidence (using the code from part (b) above) that takes the collection of samples and the desired confidence level as inputs. The output should be the percentage of cases in which ρ = 0.6 lies within the confidence interval. What would you expect this value to be in theory? Apply the same method to create 99% and 90% confidence intervals and report the actual coverage probabilities.
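The Fisher Z-transform procedure described in part (b) could be wrapped in a function along the following lines. This is a minimal sketch: the argument names, the default values, and the regenerated simx and simy matrices are assumptions made so the block runs on its own. It uses tanh for the inverse transformation, which is algebraically the same as (e^(2z) - 1)/(e^(2z) + 1).

```r
# Hypothetical signature: the exam fixes only the function name and its two kinds of input
f.confidence <- function(simx, simy, level = 0.95, rho = 0.6) {
  n <- ncol(simx)                                    # observations per sample
  # Sample correlation for each row (one row = one simulated dataset)
  r <- sapply(seq_len(nrow(simx)),
              function(i) cor(simx[i, ], simy[i, ]))
  z     <- 0.5 * (log(1 + r) - log(1 - r))           # Fisher's Z-transform of r
  se    <- 1 / sqrt(n - 3)                           # approximate standard error of z
  crit  <- qnorm(1 - (1 - level) / 2)                # two-sided standard normal quantile
  lower <- tanh(z - crit * se)                       # back-transform lower limit
  upper <- tanh(z + crit * se)                       # back-transform upper limit
  100 * mean(lower <= rho & rho <= upper)            # percentage of intervals covering rho
}

# Stand-in data so the sketch is self-contained (seed is arbitrary)
set.seed(1)
simx <- matrix(rnorm(500 * 100, mean = 3, sd = 1), nrow = 500)
simy <- 1.5 + 0.6 * simx + matrix(rnorm(500 * 100, mean = 0, sd = 0.8), nrow = 500)

# Empirical coverage at the three requested confidence levels
coverage <- sapply(c(0.90, 0.95, 0.99), function(l) f.confidence(simx, simy, l))
coverage
```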

c) Next, using your first simulated sample, create a simple bootstrap confidence interval with B = 10000 resamples. For each resample, compute the untransformed sample correlation coefficient. Create a histogram for the empirical sampling distribution of r based on your bootstrap estimates. Compute the upper and lower limits of a 95% confidence interval for ρ using the bootstrapped values of r from the resamples.
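The bootstrap in part (c) might look roughly like the sketch below. It assumes that "simple bootstrap confidence interval" means the percentile interval, and it generates a stand-in sample x, y rather than reusing the first rows of simx and simy; the names r.boot, idx, and ci are illustrative.

```r
set.seed(42)                                       # arbitrary seed for reproducibility
n <- 100
x <- rnorm(n, mean = 3, sd = 1)                    # stand-in for the first simulated sample
y <- 1.5 + 0.6 * x + rnorm(n, mean = 0, sd = 0.8)

B      <- 10000                                    # number of bootstrap resamples
r.boot <- numeric(B)                               # storage for bootstrap correlations
for (b in seq_len(B)) {
  # Resample observation indices so that each (x, y) pair stays together
  idx       <- sample.int(n, size = n, replace = TRUE)
  r.boot[b] <- cor(x[idx], y[idx])                 # untransformed correlation
}

# Empirical sampling distribution of r
hist(r.boot, breaks = 50, main = "Bootstrap distribution of r", xlab = "r")

# Percentile interval: 2.5% and 97.5% quantiles of the bootstrap values (an assumption)
ci <- quantile(r.boot, probs = c(0.025, 0.975))
ci
```

Resampling indices rather than x and y separately matters here: shuffling the two variables independently would destroy the pairing and push every bootstrap correlation toward zero.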

Problem 2 (10 points)

For each simulation below you must:

begin with an initial seed;

comment on every line of the algorithm to describe each action;

not make use of the predefined random number generators in R for each distribution described below (unless otherwise noted).

a) The Pareto(a, b) distribution has cdf

$$F(x) = 1 - \left(\frac{b}{x}\right)^{a}, \qquad x \ge b > 0,\ a > 0.$$

Develop an algorithm to simulate a random sample of size 1000 from the Pareto(4, 2) distribution. Write out the density of X, and create a histogram that displays this density. NOTE: The only random number generator allowed for use in this problem is runif.
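As an illustration of the inverse transform method for part (a), the sketch below assumes the common parameterization F(x) = 1 - (b/x)^a for x >= b, with a as the shape and b as the scale. Under that assumption the inverse cdf is b(1 - u)^(-1/a) and the density is a b^a / x^(a+1). The seed and plotting choices are arbitrary.

```r
set.seed(123)                        # arbitrary initial seed
a <- 4                               # shape parameter (assumed meaning of the first argument)
b <- 2                               # scale parameter (assumed meaning of the second argument)
n <- 1000                            # required sample size

u <- runif(n)                        # uniform draws on (0, 1)
x <- b * (1 - u)^(-1 / a)            # apply the inverse cdf to each uniform draw

# Histogram on the density scale so the theoretical curve is comparable
hist(x, breaks = 40, freq = FALSE, main = "Pareto(4, 2)", xlab = "x")

# Density under the assumed parameterization: f(x) = a * b^a / x^(a + 1), x >= b
curve(a * b^a / x^(a + 1), from = b, to = max(x), add = TRUE)
```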

b) A discrete random variable X has probability mass function:

x      0     1     2     3     4
p(x)   0.1   0.1   0.3   0.2   0.3

Develop an algorithm to generate a random sample of size 5000 from the distribution of X.

NOTE: The only random number generator allowed for use in this problem is runif. Do

the relative sample frequencies agree closely with the theoretical probability distribution?
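One way to generate from this discrete distribution with runif alone is the inverse transform on the cumulative probabilities, sketched below. The variable names and the seed are illustrative assumptions.

```r
set.seed(321)                                   # arbitrary initial seed
vals <- 0:4                                     # support of X
p    <- c(0.1, 0.1, 0.3, 0.2, 0.3)              # probability mass function from the table
cp   <- cumsum(p)                               # cumulative probabilities F(0), ..., F(4)

u <- runif(5000)                                # uniform draws on (0, 1)

# Count how many of the interior breakpoints F(0), ..., F(3) are <= u,
# then shift by one to get the index of the smallest value with F(value) > u
x <- vals[findInterval(u, cp[-length(cp)]) + 1]

# Relative sample frequencies next to the theoretical probabilities
freq <- table(factor(x, levels = vals)) / length(x)
rbind(sample = as.numeric(freq), theory = p)
```

Dropping the last cumulative value from the breakpoints avoids any dependence on whether the floating-point sum of the probabilities is exactly 1.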

c) The Rayleigh density is as follows:

$$f(x) = \frac{x}{\sigma^2}\, e^{-x^2/(2\sigma^2)}, \qquad x \ge 0,\ \sigma > 0.$$

Develop an algorithm to generate random samples of size 2000 from a Rayleigh(σ) distribution. Within your algorithm, consider various values for σ using a for-loop. Display the relationship between each σ value considered and the random samples drawn. NOTE: The only random number generator allowed for use in this problem is runif.
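A possible sketch for part (c) follows, assuming the standard Rayleigh density f(x) = (x / σ^2) exp(-x^2 / (2σ^2)) for x >= 0. Its cdf is 1 - exp(-x^2 / (2σ^2)), so the inverse transform is σ sqrt(-2 log(1 - u)). The σ values, the seed, and the choice of a boxplot are illustrative assumptions.

```r
set.seed(99)                                      # arbitrary initial seed
sigmas  <- c(0.5, 1, 2, 4)                        # illustrative sigma values (not given in the exam)
n       <- 2000                                   # required sample size
samples <- matrix(NA_real_, nrow = n, ncol = length(sigmas))  # one column per sigma

for (j in seq_along(sigmas)) {
  u <- runif(n)                                   # uniform draws on (0, 1)
  samples[, j] <- sigmas[j] * sqrt(-2 * log(1 - u))  # inverse cdf of Rayleigh(sigma)
}

# Side-by-side boxplots show how the samples spread out as sigma grows
boxplot(samples, names = sigmas, xlab = "sigma", ylab = "sampled values")

# Sample means against the theoretical mean sigma * sqrt(pi / 2)
rbind(sample.mean = colMeans(samples), theory.mean = sigmas * sqrt(pi / 2))
```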


d) Generate a random sample of size 1000 from the Beta(3,2) distribution by the acceptance-rejection method. Create a histogram that displays the sample with the Beta(3,2) density superimposed. NOTE: The only random number generator allowed for use in this problem is runif.
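The acceptance-rejection step in part (d) could be sketched as follows, assuming a Uniform(0, 1) proposal. The Beta(3,2) density is 12 x^2 (1 - x) on (0, 1), which peaks at x = 2/3 with height 16/9, so 16/9 serves as the bounding constant. The names f, c.bound, and iter are illustrative.

```r
set.seed(7)                                  # arbitrary initial seed
f <- function(x) 12 * x^2 * (1 - x)          # Beta(3,2) density, since 1 / B(3, 2) = 12
c.bound <- 16 / 9                            # maximum of f on (0, 1), attained at x = 2/3
n    <- 1000                                 # required sample size
x    <- numeric(n)                           # storage for accepted values
k    <- 0                                    # number of accepted values so far
iter <- 0                                    # total number of proposals tried

while (k < n) {
  y    <- runif(1)                           # candidate from the Uniform(0, 1) proposal
  u    <- runif(1)                           # uniform draw for the accept/reject decision
  iter <- iter + 1                           # count this proposal
  if (u <= f(y) / c.bound) {                 # accept with probability f(y) / c.bound
    k    <- k + 1                            # advance the accepted counter
    x[k] <- y                                # store the accepted candidate
  }
}

# Histogram on the density scale with the Beta(3,2) density drawn on top
hist(x, breaks = 30, freq = FALSE, main = "Beta(3,2) by acceptance-rejection", xlab = "x")
curve(f, from = 0, to = 1, add = TRUE)
```

With this proposal the expected acceptance rate is 1 / c.bound = 9/16, so roughly 1800 proposals are needed for 1000 accepted values.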

Problem 3 (10 points)

For example 9.2 from Suess and Trumbo (presented in lecture 8), modify the prior distribution

for the mean height difference such that the variance of the prior distribution for μ is

assumed to be 25.

a) Note that the parameters of the prior distribution for θ were selected such that the prior probability that the standard deviation of height differences is between 0.3 mm and 20 mm is approximately 95%. Choose new parameters for the prior on θ such that the prior probability that the SD is between 0.3 mm and 50 mm is approximately 95%.

b) Assuming the sample mean and SD stay the same, evaluate the mean of the posterior distribution for sample sizes 10, 20, 30, ..., 90, 100. Plot the value of the posterior mean vs. the sample size. Plot the width of the 95% posterior interval for μ vs. the sample size.

c) Explain the results from part b) based on the relationships between the likelihood, prior

for μ, and posterior distribution for μ.
