MAST30025: Linear Statistical Models
Assignment 2, 2019
Due: 5pm Friday, May 3 (week 8)

This assignment is worth 7% of your total mark.

You may use R for this assignment, including the lm function unless specified. If you do, include your R commands and output.

Your assignment must be submitted to Turnitin on the LMS as a single PDF document only. You may choose to either typeset your assignment or handwrite and scan it to produce an electronic version.

Turnitin will not accept late submissions.

Turnitin gives you an option to preview your work prior to submission. Please check this preview carefully to ensure you are submitting the correct document. After a successful submission to Turnitin, you will see a submission ID. This confirmation will also be sent to your University email address. If you do not see a submission ID, you should assume that your assignment has not been submitted successfully. Either try to submit again or contact the tutor co-ordinator (Rheanna Mainzer) immediately to arrange an alternate means of submission. Issues with Turnitin are not a valid excuse for submitting a late assignment or an incorrect version of an assignment.

(1 mark) Your assignment must clearly show your name and student ID number, your tutor's name and the time and day of your tutorial class. Your assignment must be submitted in the correct format and the correct orientation. Your answers must be clearly numbered and in the same order as the assignment questions.

1. Prove Theorem 4.8: show that the maximum likelihood estimator of the error variance $\sigma^2$ is $SS_{Res}/n$.

2. An experiment is conducted to estimate the annual demand for cars, based on their cost, the current unemployment rate, and the current interest rate.
A survey is conducted and the following measurements obtained:

Cars sold (×10^3)   Cost ($k)   Unemployment rate (%)   Interest rate (%)
       5.5             7.2              8.7                   5.5
       5.9            10.0              9.4                   4.4
       6.5             9.0             10.0                   4.0
       5.9             5.5              9.0                   7.0
       8.0             9.0             12.0                   5.0
       9.0             9.8             11.0                   6.2
      10.0            14.5             12.0                   5.8
      10.8             8.0             13.7                   3.9

For this question, you may NOT use the lm function in R.

(a) Fit a linear model to the data and estimate the parameters and variance.
(b) Which two of the parameters have the highest (in magnitude) covariance in their estimators?
(c) Find a 99% confidence interval for the average number of $8,000 cars sold in a year which has unemployment rate 9% and interest rate 5%.
(d) A prediction interval for the number of cars sold in such a year is calculated to be (4012, 7087). Find the confidence level used.
(e) Test for model relevance using a corrected sum of squares.

3. Consider two full rank linear models $y = X_1\gamma_1 + \varepsilon_1$ and $y = X\beta + \varepsilon_2$, where all predictors in the first model ($\gamma_1$) are also contained in the second model ($\beta$). Show that the $SS_{Res}$ for the first model is at least the $SS_{Res}$ for the second model.

4. In this question, we study a dataset of 50 US states. This dataset contains the variables:

    Population: population estimate as of July 1, 1975
    Income: per capita income (1974)
    Illiteracy: illiteracy (1970, percent of population)
    Life.Exp: life expectancy in years (1969–71)
    Murder: murder and non-negligent manslaughter rate per 100,000 population (1976)
    HS.Grad: percentage of high-school graduates (1970)
    Frost: mean number of days with minimum temperature below freezing (1931–1960) in capital or large city
    Area: land area in square miles

The dataset is distributed with R. Open it with the following commands:

> data(state)
> statedata

We wish to use a linear model to model the murder rate in terms of the other variables.

(a) Plot the data and comment.
Should we consider any variable transformations?
(b) Perform model selection using forward selection, using all variable transformations which may be relevant.
(c) Starting from the full model, perform model selection using stepwise selection with the AIC.
(d) Write down your final fitted model (including any variable transformations used).
(e) Produce diagnostic plots for your final model and comment.

5. For ridge regression, we choose parameter estimators $b$ which minimise

    $(y - Xb)^T (y - Xb) + \lambda b^T b$,

where $\lambda$ is a constant penalty parameter.

(a) Show that these estimators are given by

    $b = (X^T X + \lambda I)^{-1} X^T y$.

(b) Calculate the ridge regression estimates for the data from Q2 with penalty parameter $\lambda = 0.5$. In order to avoid penalising some parameters unfairly, we must first scale every predictor variable so that it is standardised (mean 0, variance 1), and centre the response variable (mean 0), in which case an intercept parameter is not used. (Hint: This can be done with the scale function.)
(c) One way to calculate the optimal value for the penalty parameter is to minimise the AIC. Since the number of parameters $p$ does not change, we use a slightly modified version:

    $\mathrm{AIC} = n \ln\left(\frac{SS_{Res}}{n}\right) + 2\,\mathrm{df}$,

where df is the "effective degrees of freedom" defined by

    $\mathrm{df} = \mathrm{tr}(H) = \mathrm{tr}\left(X (X^T X + \lambda I)^{-1} X^T\right)$.

For the data from Q2, construct a plot of $\lambda$ against AIC. Thereby find the optimal value for $\lambda$.
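
As a rough illustration of the matrix quantities in question 5, the R sketch below computes the ridge estimates, the effective degrees of freedom and the modified AIC over a grid of penalty values. It is only a sketch under stated assumptions: the data are simulated (not the data from Q2), and the function name ridge_fit, the coefficient values and the grid of penalties are hypothetical choices made for the example.

```r
# Simulated toy data, for illustration only (NOT the data from question 2).
set.seed(1)
n <- 20
X <- scale(matrix(rnorm(n * 3), nrow = n, ncol = 3))  # standardise each predictor
signal <- X %*% c(1, -2, 0.5) + rnorm(n)              # hypothetical coefficients
y <- scale(signal, scale = FALSE)                     # centre the response only

# Hypothetical helper: ridge estimates, effective df and modified AIC
# for a single penalty value lambda (no intercept, as in question 5(b)).
ridge_fit <- function(X, y, lambda) {
  n <- nrow(X)
  XtX_pen <- crossprod(X) + lambda * diag(ncol(X))  # X^T X + lambda I
  b <- solve(XtX_pen, crossprod(X, y))              # (X^T X + lambda I)^{-1} X^T y
  H <- X %*% solve(XtX_pen, t(X))                   # X (X^T X + lambda I)^{-1} X^T
  ss_res <- sum((y - X %*% b)^2)                    # residual sum of squares
  df <- sum(diag(H))                                # effective degrees of freedom, tr(H)
  aic <- n * log(ss_res / n) + 2 * df               # modified AIC
  list(b = b, df = df, aic = aic)
}

# Evaluate the AIC over an (assumed) grid of penalties and plot it.
lambdas <- seq(0, 5, by = 0.1)
aics <- sapply(lambdas, function(l) ridge_fit(X, y, l)$aic)
plot(lambdas, aics, type = "l", xlab = "lambda", ylab = "AIC")
lambdas[which.min(aics)]  # grid value with the smallest AIC
```

A quick sanity check on such a function is that with lambda = 0 the effective degrees of freedom equal the number of predictor columns (3 in this toy example), and that they decrease as lambda increases.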