Weeks 1-4:
MCQ:
• Types of data
• Predictive versus prescriptive (causal) analysis
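cross-section data sets: many units (e.g. countries) observed at the same point in time
time series data: one unit (e.g. one country) observed at different points in time
panel data (or longitudinal data): the same set of units (e.g. many countries) observed at different points in time
pooled cross-section/time-series data sets: separate cross sections drawn at different points in time (the units may differ from one period to the next)
Difference:
Cross-section data sets are not ordered, whereas the order of observations in time series data sets conveys important information.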
i.i.d. -> independent and identically distributed
Time series observations are NOT always i.i.d.
In predictive modeling, the variables used as predictors (x) need not cause the variable they are used to predict (y).
Correlation is not causation. If X and Y are correlated, we can still infer that information about X is useful for predicting Y.
How do we compute the slope coefficient in OLS?
Simple linear regression in matrix form
Geometric interpretation of least squares
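The three items above (slope formula, matrix form, geometric interpretation) can be checked numerically. The following is a minimal sketch on simulated data; the data-generating values and all variable names are assumptions made for illustration, not taken from the course material. It computes the slope as sample cov(x, y) / sample var(x), recomputes it from the normal equations (X'X) b = X'y, and confirms that the residual vector is orthogonal to the columns of X.

```python
import numpy as np

# Hypothetical simulated data: y = 2 + 0.5 * x + noise
rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = 2.0 + 0.5 * x + rng.normal(size=n)

# Scalar formula for simple regression: slope = cov(x, y) / var(x)
slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
intercept = y.mean() - slope * x.mean()

# Matrix form: solve the normal equations (X'X) b = X'y
X = np.column_stack([np.ones(n), x])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Geometric view: OLS projects y onto the column space of X,
# so the residuals are orthogonal to every column of X
residuals = y - X @ beta_hat
orthogonality = X.T @ residuals  # numerically a zero vector

print("scalar formula:", intercept, slope)
print("matrix form:   ", beta_hat)
print("X'u_hat:       ", orthogonality)
```

Both routes give the same intercept and slope, and X'u_hat is zero up to floating-point error.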
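Given the sample se^2 of y (its sample variance) and the se^2 of the regression, how do we compute R^2? SST = SSE + SSR and R^2 = 1 - (SSR/SST), where SSE is the explained sum of squares and SSR is the residual sum of squares. With n observations and k slope coefficients, SST = (n-1) * (sample variance of y) and SSR = (n-k-1) * (se^2 of the regression).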
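A minimal sketch of this calculation, assuming "se^2 of the regression" means the squared standard error of the regression, SSR / (n - k - 1), with k slope coefficients. The data are simulated and every name here is hypothetical.

```python
import numpy as np

# Hypothetical simulated data with k = 2 slope coefficients
rng = np.random.default_rng(1)
n, k = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
y = X @ np.array([1.0, 0.8, -0.3]) + rng.normal(size=n)

beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
y_hat = X @ beta_hat

sst = np.sum((y - y.mean()) ** 2)      # total sum of squares
sse = np.sum((y_hat - y.mean()) ** 2)  # explained sum of squares
ssr = np.sum((y - y_hat) ** 2)         # residual sum of squares
r2 = 1 - ssr / sst

# Same R^2 recovered from the two summary statistics only
s2_y = np.var(y, ddof=1)               # sample variance of y = SST / (n - 1)
sigma2_hat = ssr / (n - k - 1)         # squared standard error of the regression
r2_from_variances = 1 - (n - k - 1) * sigma2_hat / ((n - 1) * s2_y)

print(sst, sse + ssr)
print(r2, r2_from_variances)
```

The decomposition SST = SSE + SSR holds here because the regression includes an intercept.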
The OLS estimator and its properties
unbiased
BLUE
Classical linear model (CLM)
Effects of rescaling
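If x1 is multiplied by c -> beta1 is divided by c (the other coefficients are unchanged).
If y is multiplied by c (a change of units), all coefficients, including the intercept, are multiplied by c.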
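The two rescaling rules can be verified with a short sketch. The data are simulated and the helper `ols` is a hypothetical convenience wrapper around `np.linalg.lstsq`, not something from the course.

```python
import numpy as np

def ols(X, y):
    """Return the OLS coefficient vector for the design matrix X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Hypothetical simulated data
rng = np.random.default_rng(2)
n, c = 150, 10.0
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)
ones = np.ones(n)

base = ols(np.column_stack([ones, x1, x2]), y)
scaled_x = ols(np.column_stack([ones, c * x1, x2]), y)  # x1 multiplied by c
scaled_y = ols(np.column_stack([ones, x1, x2]), c * y)  # y multiplied by c

print("base:        ", base)
print("x1 scaled:   ", scaled_x)  # only the x1 coefficient changes: divided by c
print("y scaled:    ", scaled_y)  # every coefficient multiplied by c
```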
Omitted variable bias (the effect of leaving out an important variable):
If we run the regression implied by Figure 1 rather than the one implied by Figure 2, we omit the variable edu, which is important in predicting wages. By omitting it we run the risk that the coefficients on experience and/or tenure become biased. The size of the bias depends on the correlation between edu and the other two variables and on the size of the coefficient attached to edu. This seems to be what causes the negative sign on the coefficient of experience in the regression implied by Figure 1.
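A simplified sketch of this mechanism, with one included regressor instead of two. The data are simulated, the numbers are invented, and the names `wage`, `edu`, and `exper` only mirror the variables mentioned above. It uses the in-sample identity: slope in the short regression = slope in the long regression + (coefficient on the omitted variable) * (slope from regressing the omitted variable on the included one).

```python
import numpy as np

def ols(X, y):
    """Return the OLS coefficient vector for the design matrix X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Hypothetical simulated data: experience is negatively correlated with education
rng = np.random.default_rng(3)
n = 500
edu = rng.normal(12, 2, size=n)
exper = 20 - 0.8 * edu + rng.normal(0, 2, size=n)
wage = 5 + 1.5 * edu + 0.2 * exper + rng.normal(size=n)
ones = np.ones(n)

long_fit = ols(np.column_stack([ones, exper, edu]), wage)  # edu included
short_fit = ols(np.column_stack([ones, exper]), wage)      # edu omitted
delta = ols(np.column_stack([ones, exper]), edu)[1]        # edu regressed on exper

# Omitted variable identity for the coefficient on exper
implied_short_slope = long_fit[1] + long_fit[2] * delta

print("exper coefficient, edu included:", long_fit[1])
print("exper coefficient, edu omitted: ", short_fit[1])
print("implied by the identity:        ", implied_short_slope)
```

In this simulated setup the true effect of experience is positive, yet the short regression reports a negative coefficient because experience partly stands in for the missing education variable.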
Short-answer questions:
R^2 interpretation:
The R^2 is very low, so there is a lot of unexplained variation. Note that in a regression with only one explanatory variable, R^2 is the square of the sample correlation coefficient between the dependent variable and the explanatory variable.
XXX% of the variation in y is explained by the Xs.
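The claim that R^2 equals the squared sample correlation in a one-regressor model can be checked with a short sketch on simulated (hypothetical) data:

```python
import numpy as np

# Hypothetical simulated data with a weak relationship, so R^2 is low
rng = np.random.default_rng(4)
n = 120
x = rng.normal(size=n)
y = 1.0 + 0.3 * x + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]

ssr = np.sum((y - X @ beta_hat) ** 2)  # residual sum of squares
sst = np.sum((y - y.mean()) ** 2)      # total sum of squares
r2 = 1 - ssr / sst

corr = np.corrcoef(x, y)[0, 1]         # sample correlation of x and y

print("R^2:        ", r2)
print("corr(x,y)^2:", corr ** 2)
```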
Intercept interpretation:
The intercept in eq01 does not have any meaningful interpretation because there is no individual with an IQ of zero.
Once IQ is measured as a deviation from 100, however, the intercept is meaningful. It shows the predicted wage for a person with an IQ of 100, i.e. average IQ.
Every extra year of education increases the predicted wage by $42.06, keeping IQ constant.
If we subtract a constant from one of the explanatory variables, only the OLS estimator of the intercept will change.
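A minimal sketch of the centering result. The data are simulated and the coefficients are invented; the names `iq`, `educ`, and `wage` only echo the example above and do not reproduce the course data set.

```python
import numpy as np

def ols(X, y):
    """Return the OLS coefficient vector for the design matrix X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Hypothetical simulated data
rng = np.random.default_rng(5)
n = 300
iq = rng.normal(100, 15, size=n)
educ = rng.normal(13, 2, size=n)
wage = 100 + 5 * iq + 40 * educ + rng.normal(0, 50, size=n)
ones = np.ones(n)

raw = ols(np.column_stack([ones, iq, educ]), wage)             # IQ as measured
centered = ols(np.column_stack([ones, iq - 100, educ]), wage)  # IQ minus 100

print("raw:     ", raw)
print("centered:", centered)
```

The slopes are identical in both fits. The centered intercept equals the raw intercept plus 100 times the IQ slope, i.e. the predicted wage at IQ = 100 with the other regressor set to zero.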
How is the unbiasedness assumption used?
The Gauss-Markov theorem tells us that, under the Gauss-Markov assumptions, OLS has the smallest variance among all linear unbiased estimators.
Drawback of scatter plots:
Scatter plots cannot tell us anything about the correlation of y and x1 after the influence of x2 has been taken out.
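What a multiple regression adds over a scatter plot is the "partialling out" of x2. The sketch below uses simulated, hypothetical data to show it: the coefficient on x1 in the regression of y on x1 and x2 equals the slope obtained by first removing the influence of x2 from both y and x1 and then regressing residuals on residuals. The raw slope of y on x1, which is what a scatter plot displays, is different.

```python
import numpy as np

def ols(X, y):
    """Return the OLS coefficient vector for the design matrix X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Hypothetical simulated data: x1 and x2 are correlated
rng = np.random.default_rng(6)
n = 400
x2 = rng.normal(size=n)
x1 = 0.7 * x2 + rng.normal(size=n)
y = 1.0 + 0.5 * x1 + 2.0 * x2 + rng.normal(size=n)
ones = np.ones(n)

# Multiple regression of y on x1 and x2
full = ols(np.column_stack([ones, x1, x2]), y)

# Partial out x2: residuals of y and of x1 after regressing each on (1, x2)
Z = np.column_stack([ones, x2])
y_res = y - Z @ ols(Z, y)
x1_res = x1 - Z @ ols(Z, x1)
partial_slope = (x1_res @ y_res) / (x1_res @ x1_res)

# Simple regression of y on x1 only (the relationship a scatter plot shows)
raw_slope = ols(np.column_stack([ones, x1]), y)[1]

print("multiple regression coefficient on x1:", full[1])
print("slope after partialling out x2:       ", partial_slope)
print("raw slope of y on x1:                 ", raw_slope)
```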