REGRESSION MODELLING (STAT2008/STAT4038/STAT6014/STAT6038)
Assignment 1 for Semester 1, 2019

INSTRUCTIONS:

This assignment is worth 15% of your overall marks for this course. Please submit your assignment on Wattle. When uploading to Wattle you must submit the following, combined into a single document:

1. Your assignment/report in a pdf document.
2. An ‘.R’ file containing the R code you have used for the assignment. Failure to upload the R code will result in a penalty.

Assignments should be typed. Your assignment may include some carefully edited computer output (e.g. graphs, tables) showing the results of your data analysis and a discussion of these results, as well as some carefully selected code. Please be selective about what you present and only include as many pages and as much computer output as necessary to justify your solution. It is important to be concise in your discussion of the results. Clearly label each part of your report with the part of the question that it refers to.

Unless otherwise advised, use a significance level of 5%.

Marks may be deducted if these instructions are not strictly adhered to, and marks will certainly be deducted if the total report is of an unreasonable length, i.e. more than 10 pages including graphs and tables. You may include an appendix that is in addition to the above page limits; however, the appendix will not be assessed. It will only be used if there is some question about what you have actually done.

You may ask me (Abhinav Mehta) questions about this assignment up to 24 hours before the submission time. This will allow me enough time to respond to your questions.

Late submissions will attract a penalty of 5% of your mark for each day of delay. No assignments will be accepted 10 days beyond the due date.

Extensions will usually be granted on medical or compassionate grounds on production of appropriate evidence, but must have my permission by no later than 24 hours before the submission date.
If you are granted an extension and submit your assignment after the extended deadline, then the late submission penalty will still apply.

Assignment 1 - Sem 1, 2019 Page 1 of 3

Question 1 [50 Marks]

Data on eruptions of Old Faithful Geyser in October 1980 was collected and stored in a .csv file ‘oldfaithful’. Variables are the duration in seconds of the current eruption, and the interval time in minutes to the next eruption. Data was not collected between approximately midnight and 6 AM. It is suspected that Duration is associated with the Interval.

(a) [5 marks] Conduct an exploratory data analysis to assess whether the two variables are associated. Is there a statistically significant correlation between the variables? Use the cor.test() function to conduct a suitable hypothesis test. Clearly specify the hypotheses you are testing and present and interpret the results.

(b) [20 marks] Fit a simple linear regression (SLR) model with Interval as the response variable and Duration as the predictor. Construct a plot of the residuals against the fitted values, a normal Q-Q plot of the residuals, a bar plot of the leverages for each observation and a bar plot of Cook’s distances for each observation. Use these plots (and other means) to comment on the model assumptions and on any unusual data points.

(c) [10 marks] Produce the ANOVA (Analysis of Variance) table for the SLR model and interpret the results of the F-test. What is the coefficient of determination for this model and how should you interpret this summary measure?

(d) [10 marks] What are the estimated coefficients of the SLR model in part (b) and the standard errors associated with these coefficients? Interpret the values of these estimated coefficients and perform t-tests to test whether or not these coefficients differ significantly from zero. What do you conclude as a result of these t-tests?

(e) [5 marks] If there is an eruption which lasted for 120 seconds, then what will be the interval of time before the next eruption, as predicted by your model?
Construct an appropriate interval estimate for the length of this interval.

Question 2 [50 Marks]

On March 1, 1984, the Wall Street Journal published a survey of television advertisements conducted by Video Board Test, Inc., a New York ad-testing company that interviewed 4000 adults. These respondents were regular product users who were asked to cite a commercial they had seen for that product category in the past week. In this case, the response is the number of millions of retained impressions per week (return). The predictor (spend) is the amount of money (in $ millions) spent by the firm on advertising. The data is available on Wattle in a .csv file called advertising.

(a) [10 marks] Is there a linear association between the two variables? You may want to experiment with some transformations, like the natural log (log()) and the square root transformation (sqrt()), applied to one or both of your variables to assess the linear association. Make a choice at this stage for your transformed variables and provide justification for this choice.

(b) [15 marks] With your chosen transformations, fit a simple linear regression (SLR) model. Construct a plot of the residuals against the fitted values, a normal Q-Q plot of the residuals, a bar plot of the leverages for each observation and a bar plot of Cook’s distances for each observation. Use these plots (and other means) to comment on the model assumptions and on any unusual data points.

(c) [10 marks] Produce the ANOVA (Analysis of Variance) table for the SLR model and interpret the results of the F-test. What is the coefficient of determination for this model and how should you interpret this summary measure?

(d) [15 marks] Based on the model fit in part (b), write the mathematical expression for the regression model in the original untransformed variables. Interpret the effect of the coefficients on the response variable.
In particular, for every $1 million increase in spending, how much increase is expected in the retained impressions, based on your chosen model fit?
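
The steps named in Questions 1 and 2 can be sketched in R roughly as follows. This is an illustration on simulated data only, not a solution: the data frames geyser and ads, the column name retained, the helper diagnostic_plots, and every simulated coefficient are assumptions invented for the example. The real analysis would read the supplied .csv files instead.

```r
# Illustration only: simulated stand-in data, NOT the assignment data.
set.seed(2019)
n <- 100
geyser <- data.frame(Duration = runif(n, 100, 300))  # seconds (assumed range)
geyser$Interval <- 35 + 0.18 * geyser$Duration + rnorm(n, sd = 6)  # minutes

# Test H0: rho = 0 against H1: rho != 0 using Pearson's correlation.
ct <- cor.test(geyser$Duration, geyser$Interval)

# SLR with Interval as the response and Duration as the predictor.
fit <- lm(Interval ~ Duration, data = geyser)

# Hypothetical helper: draws the four diagnostic plots named in the questions.
diagnostic_plots <- function(model) {
  op <- par(mfrow = c(2, 2))
  on.exit(par(op))
  plot(fitted(model), resid(model),
       xlab = "Fitted values", ylab = "Residuals")
  abline(h = 0, lty = 2)
  qqnorm(resid(model))
  qqline(resid(model))
  barplot(hatvalues(model), main = "Leverage")
  barplot(cooks.distance(model), main = "Cook's distance")
}

# ANOVA table, coefficient of determination, and coefficient table
# (estimates, standard errors, t statistics, p-values).
aov_tab <- anova(fit)
r2 <- summary(fit)$r.squared
coef_tab <- coef(summary(fit))

# 95% prediction interval for one new eruption lasting 120 seconds.
pred <- predict(fit, newdata = data.frame(Duration = 120),
                interval = "prediction", level = 0.95)

# Question 2 style: a log-log fit on simulated data (one possible choice of
# transformation, assumed here purely for illustration).
ads <- data.frame(spend = runif(30, 5, 150))  # $ millions (assumed range)
ads$retained <- exp(1 + 0.6 * log(ads$spend) + rnorm(30, sd = 0.3))
fit_log <- lm(log(retained) ~ log(spend), data = ads)
b <- coef(fit_log)
# On the original scale the fitted curve is retained = exp(b[1]) * spend^b[2].
```

Calling diagnostic_plots(fit) draws the four panels. In predict(), interval = "prediction" gives an interval for a single future observation, whereas interval = "confidence" gives one for the mean response at that Duration; which is appropriate depends on how the question is read.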