美文网首页
R成精系列-R语言与门限自回归模型(TAR)

R成精系列-R语言与门限自回归模型(TAR)

作者: 果果哥哥BBQ | 来源:发表于2018-10-14 22:26 被阅读647次

门限自回归模型(threshold autoregressive model),又称阈模型,简称TAR模型,它是一种非线性模型,门限自回归模型由汤家豪(Tong)1978年提出,用来解决一类非线性题。
--百度百科

作为非线性回归模型,门限回归模型在经济计量分析中的应用越来越广泛。在实际操作层面,门限模型的关键是确定门限变量和受门限变量影响的解释变量组。
在R中有许多包可以进行门限分析:

  • pdR
  • TSA
  • tsDyn
  • TAR
  • BAYSTAR
  • Hansen提供的函数

pdR

pdR包中进行面板门限回归的主函数为

ptm(dep, ind1, ind2, d, bootn, trimn, qn, conf_lev, t, n)

主要参数包括:

  • dep:被解释变量,1列单变量,只能以;
  • d:门限变量,1列单变量;
  • ind1:受门限变量影响的解释变量,多列变量组;
  • ind2不受门限变量影响的解释变量,多列变量组;
  • bootn:Boostrap重复次数,门限数量通过bootn和trimn参数控制,如bootn=10、bootn=c(10,10)以及bootn=c(10,10,10)分别表示单门限、双门限和三门限模型;
  • trimn:数据剪切比例;
  • qn:待检验分位数个数;
  • conf_lev:置信水平,比如0.95;
  • t:时间跨度;
  • n:横截面的个体数。

注意:
depdind1ind2只能以矩阵形式存储,不能以向量形式存储,否者程序会报错。
ptm()函数的运行结果最终保存为list形式,list中元素的个数等于门限数量。

TSA

TSA包中可以通过tar函数进行门限回归:

tar(y, p1, p2, d, is.constant1 = TRUE, is.constant2 = TRUE, transform = "no",
 center = FALSE, standard = FALSE, estimate.thd = TRUE, threshold, 
method = c("MAIC", "CLS")[1], a = 0.05, b = 0.95, order.select = TRUE, print = FALSE)

主要参数包括:

y
time series
p1
AR order of the lower regime
p2
AR order of the upper regime
d
delay parameter
is.constant1
if True, intercept included in the lower regime, otherwise the intercept is fixed at zero
is.constant2
similar to is.constant1 but for the upper regime
transform
available transformations: "no" (i.e. use raw data), "log", "log10" and "sqrt"
center
if set to be True, data are centered before analysis
standard
if set to be True, data are standardized before analysis
estimate.thd
if True, threshold parameter is estimated, otherwise it is fixed at the value supplied by threshold
threshold
known threshold value, only needed to be supplied if estimate.thd is set to be False.
method
"MAIC": estimate the TAR model by minimizing the AIC; "CLS": estimate the TAR model by the method of Conditional Least Squares.
a
lower percent; the threshold is searched over the interval defined by the a100 percentile to the b100 percentile of the time-series variable
b :upper percent
order.select:If method is "MAIC", setting order.select to True will enable the function to further select the AR order in each regime by minimizing AIC
print: if True, the estimated model will be printed

Hansen提供的函数

Hansen门限模型工具变量的R程序
#####################################################
##  IVTAR.R
##  for questions, contact
##  Bruce E. Hansen
##  Department of Economics
##  Social Science Building
##  University of Wisconsin
##  Madison, WI 53706-1393
##  behansen@wisc.edu
##  http://www.ssc.wisc.edu/~bhansen/
##  This is a R procedure.  It computes estimates and confidence
##  intervals for threshold models with endogenous regressors.
##  It computes the estimator described in
##  "Instrumental Variable Estimation of a Threshold Model"
##  written by Mehmet Caner and Bruce E. Hansen
##  Be sure to have the Gauss graphics library active.
##  If you run this program, it is currently set up to generate some simulated data  and estimate the model on this data.  
##  For your own application, you can substitute your own data.
#########################################################################
# Functions #
##************************************************************
##  REGRESS
##  Computes a linear regression. Uses generalized inverse if X'X is singular
##  Format
##  beta <- regress(y,x)
##  Inputs
##  y         nxm   dependent variable(s)
##  x         nxk   independent variables (should include constant)
##  Output
##  beta    kxm Regression slope estimates
##************************************************************

regress <- function(y,x){
if (qr(x)$rank==ncol(x)) beta <- qr.solve(x,y)
if (qr(x)$rank<ncol(x)) beta <- (qr(t(x)%*%x)$qr)%*%(t(x)%*%y)
beta
}

##************************************************************
##  JOINT_THRESH
##  Estimates the threshold in a multivariate threshold model
##  Format
##  output <- joint_thresh(y,x,q)
##  yhat <- output$yhat
##  qhat <- output$qhat 
##  Inputs
##  y         nxm   dependent variable(s)
##  x         nxk   independent variables (should include constant)
##  q         nx1   threshold variable
##  Outputs
##  yhat    nxm Predicted values for y
##  qhat    1x1 Threshold Estimate  
##************************************************************

joint_thresh <- function(y,x,q){
n <- nrow(y)
k <- ncol(x)
e <- y-x%*%regress(y,x)
s0 <- det(t(e)%*%e)    
n1 <- round(.05*n)+k
n2 <- round(.95*n)-k
qs <- sort(q)
qs <- qs[n1:n2]
qs <- as.matrix(unique(qs))
qn <- nrow(qs)
sn <- matrix(0,qn,1)
for (r in 1:qn){
  d <- (q<=qs[r])
  xx <- x*(d%*%matrix(1,1,k))
  xx <- xx-x%*%regress(xx,x)
  ex <- e-xx%*%regress(e,xx)
  sn[r] <- det(t(ex)%*%ex)   
}
r <- which.min(sn)
smin <- sn[r]
qhat <- qs[r]
d <- (q<=qhat)
x1 <- x*(d%*%matrix(1,1,k))
x2 <- x*((1-d)%*%matrix(1,1,k))
beta1 <- regress(y,x1)
beta2 <- regress(y,x2)
yhat <- x1%*%beta1+x2%*%beta2
list(yhat=yhat,qhat=qhat)
}

##************************************************************
##  JOINT_THRESH_CI
##  Estimates the threshold in a multivariate threshold model 
##  and computes asymptotic confidence intervals
##  Format
##  output <- joint_thresh(y,x,q,_conf,_graph)
##  yhat <- output$yhat
##  Inputs
##  y         nxm   dependent variable(s)
##  x         nxk   independent variables (should include constant)
##  q         nx1   threshold variable
##  conf    1x1 level for confidence interval, e.g. conf=.9
##  graph   1x1 Set graph=1 to view graph of likelihood ratio
##                Set graph=0 to not display graph
##  Outputs
##  yhat    nxm Predicted values for y
##  qhat    1x1 Threshold Estimate 
##  qcf_0   1x2 Confidence interval for threshold, no heteroskedasticity correction
##  qcf_h1  1x2 Confidence interval for threshold, heteroskedasticity correction 
##                using quadratic variance estimate 
##                (only if m=1)
##  qcf_h2  1x2 Confidence interval for threshold, heteroskedasticity correction 
##                using variance estimated by Epanechnikov kernel with automatic bandwidth
##                (only if m=1) 
##************************************************************

joint_thresh_ci <- function(y,x,q,conf,graph){
n <- nrow(y)
k <- ncol(x)
e <- y-x%*%regress(y,x)
s0 <- det(t(e)%*%e)    
n1 <- round(.05*n)+k
n2 <- round(.95*n)-k
qs <- sort(q)
qs <- qs[n1:n2]
qs <- as.matrix(unique(qs))
qn <- nrow(qs)
sn <- matrix(0,qn,1)
for (r in 1:qn){
  d <- (q<=qs[r])
  xx <- x*(d%*%matrix(1,1,k))
  xx <- xx-x%*%regress(xx,x)
  ex <- e-xx%*%regress(e,xx)
  sn[r] <- det(t(ex)%*%ex)   
}
r <- which.min(sn)
smin <- sn[r]
qhat <- qs[r]
d <- (q<=qhat)
x1 <- x*(d%*%matrix(1,1,k))
x2 <- x*((1-d)%*%matrix(1,1,k))
beta1 <- regress(y,x1)
beta2 <- regress(y,x2)
yhat <- x1%*%beta1+x2%*%beta2
e <- y-yhat
lr <- n*(sn/smin-1)
sig2 <- smin/n

if (ncol(y)> 1){
  eta1 <- 1 
  eta2 <- 1
}else{
  r1 <- (x%*%(beta1-beta2))^2
  r2 <- r1*(e^2)
  qx <- cbind(q^0,q^1,q^2)
  qh <- cbind(qhat^0,qhat^1,qhat^2)  
  m1 <- qr.solve(qx,r1)  
  m2 <- qr.solve(qx,r2)  
  g1 <- qh%*%m1
  g2 <- qh%*%m2
  eta1 <- as.vector((g2/g1)/sig2)
  sigq <- sqrt(mean((q-mean(q))^2))
  hband <- 2.344*sigq/(n^(.2))
  u <- (qhat-q)/hband
  u2 <- u^2
  f <- mean((1-u2)*(u2<=1))*(.75/hband)
  df <- -mean(-u*(u2<=1))*(1.5/(hband^2))
  eps <- r1 - qx%*%m1
  sige <- (t(eps)%*%eps)/(n-3)
  hband <- as.vector(sige/(4*f*((m1[3]+(m1[2]+2*m1[3]*qhat)*df/f)^2)))
  u2 <- ((qhat-q)/hband)^2
  kh <- ((1-u2)*.75/hband)*(u2<=1)
  g1 <- mean(kh*r1)
  g2 <- mean(kh*r2)
  eta2 <- as.vector((g2/g1)/sig2)
}
c1 <- -2*log(1-sqrt(conf))
lr0 <- (lr >= c1)
lr1 <- (lr >= (c1*eta1))
lr2 <- (lr >= (c1*eta2))
if (max(lr0)==1){
  qcf_0 <- cbind(qs[which.min(lr0)],qs[qn+1-which.min(rev(lr0))])
}else{
  qcf_0 <- cbind(qs[1],qs[qn])
}
if (max(lr1)==1){
  qcf_h1 <- cbind(qs[which.min(lr1)],qs[qn+1-which.min(rev(lr1))])
}else{
  qcf_h1 <- cbind(qs[1],qs[qn])
}
if (max(lr2)==1){
  qcf_h2 <- cbind(qs[which.min(lr2)],qs[qn+1-which.min(rev(lr2))])
}else{
  qcf_h2 <- cbind(qs[1],qs[qn])
}

if (graph==1){
   x11()
   mtit <- "Confidence Interval Construction for Threshold"
   ytit <- "Likelihood Ratio Sequence in gamma" 
   xtit <- "Threshold Variable"
   clr <- matrix(1,qn,1)*c1
   if (ncol(y) == 1) clr <- cbind(clr,(clr*eta1),(clr*eta2))
   plot(qs,lr,lty=1,col=1,type="l",ann=0)
   lines(qs,clr[,1],lty=2,col=2)
   if (ncol(y) == 1){
   lines(qs,clr[,2],lty=3,col=3)
   lines(qs,clr[,3],lty=4,col=4)
   }
   title(main=mtit,ylab=ytit,xlab=xtit) 
   tit1 <- "LRn(gamma)"
   tit2 <- "90% Critical"
   tit3 <- "Hetero Corrected - 1"
   tit4 <- "Hetero Corrected - 2" 
   if (ncol(y) != 1) legend("topleft",c(tit1,tit2),lty=c(1,2),col=c(1,2))
   if (ncol(y) == 1) legend("topleft",c(tit1,tit2,tit3,tit4),lty=c(1,2,3,4),col=c(1,2,3,4))
}
list(yhat=yhat,qhat=qhat,qcf_0=qcf_0,qcf_h1=qcf_h1,qcf_h2=qcf_h2)
}

##************************************************************
##  GMM_LINEAR
##  Computes the GMM estimator of a linear model
##  Format
##  output <- gmm_linear(y,z,x)
##  beta <- output$beta
##  Inputs
##  y         nx1   dependent variable
##  z         nxk   rhs variables
##  x         nxl   instruments variables (should include constant and exogenous parts of z), l>=k
##  Outputs
##  beta    kx1 Regression slope estimates
##  se  kx1 standard errors
##  jstat   1x1 J Statistic
##************************************************************

gmm_linear <- function(y,z,x){
 pihat <- regress(z,x)
 xz <- t(x)%*%z 
 xy <- t(x)%*%y
 beta <- qr.solve((t(pihat)%*%xz),(t(pihat)%*%xy))
 e <- y-z%*%beta
 xe <- x*(e%*%matrix(1,1,ncol(x)))
 g <- solve(t(xe)%*%xe)
 v <- solve(t(xz)%*%g%*%xz)
 beta <- v%*%(t(xz)%*%g%*%xy)
 se <- as.matrix(sqrt(diag(v)))
 m <- t(x)%*%(y-z%*%beta)
 jstat <- t(m)%*%g%*%m
list(beta=beta,se=se,jstat=jstat)
}

##************************************************************
##  GMM_THRESH
##  Computes threshold model with endogenous variables
##  Format
##  qhat <- gmm_thresh(y,z1,z2,x,q,conf,conf1,conf2,reduced)
##  Inputs
##  y         nx1   dependent variable
##  z1  nxk1    endogenous rhs variables
##  z2  nxk2    exogenous rhs variables
##  x         nxm   instrumental variables not included in z2, m>=k1
##  q         nx1   threshold variable
##  conf    1x1 level for confidence interval for threshold, e.g. conf=.9
##  conf1   1x1 confidence level for threshold interval used as input to slope intervals, e.g. conf2=.8 
##  conf2   1x1 interval dummy - to determine method for threshold interval used as input to slope intervals
##                set to 0 for uncorrected (homoskedastic) interval
##                set to 1 for heteroskedastic correction by quadratic
##                set to 2 for heteroskedastic correction by nonparametric kernel
##  reduced 1x1 reduced form dummy
##                set to 0 if reduced form is linear (no threshold)
##                set to 1 if reduced form is a linear threshold model
##  Output
##  qhat    1x1 threshold estimate
##  Most of the output is written to the screen.
##  
************************************************************

gmm_thresh <- function(y,z1,z2,x,q,conf,conf1,conf2,reduced){
xx <- cbind(x,z2) 
if (reduced==0) z1hat <- t(xx)%*%regress(z1,xx)
if (reduced==1){ 
    out <-  joint_thresh(z1,xx,q)
    z1hat <- out$yhat
    rhohat <- out$qhat
}
zhat <- cbind(z1hat,z2)

out <- joint_thresh_ci(y,zhat,q,conf,1)
yhat <- out$yhat
qhat <- out$qhat
qcf0 <- out$qcf_0
qcf1 <- out$qcf_h1 
qcf2 <- out$qcf_h1 

z <- cbind(z1,z2)
da <- (q<=qhat)
db <- 1-da
ya <- y[da%*%matrix(1,1,ncol(y))>0]
ya <- matrix(ya,length(ya)/ncol(y),ncol(y))
xa <- xx[da%*%matrix(1,1,ncol(xx))>0]
xa <- matrix(xa,length(xa)/ncol(xx),ncol(xx))
za <- z[da%*%matrix(1,1,ncol(z))>0]
za <- matrix(za,length(za)/ncol(z),ncol(z))
yb <- y[db%*%matrix(1,1,ncol(y))>0]
yb <- matrix(yb,length(yb)/ncol(y),ncol(y))
xb <- xx[db%*%matrix(1,1,ncol(xx))>0]
xb <- matrix(xb,length(xb)/ncol(xx),ncol(xx))
zb <- z[db%*%matrix(1,1,ncol(z))>0]
zb <- matrix(zb,length(zb)/ncol(z),ncol(z))
out1 <- gmm_linear(ya,za,xa)
out2 <- gmm_linear(yb,zb,xb)
beta1 <- out1$beta
se1 <- out1$se 
jstat1 <- out1$jstat
beta2 <- out2$beta
se2 <- out2$se 
jstat2 <- out2$jstat

beta1l <- beta1-se1*1.96
beta1u <- beta1+se1*1.96
beta2l <- beta2-se2*1.96
beta2u <- beta2+se2*1.96

out <- joint_thresh_ci(y,zhat,q,conf1,0)
yhat <- out$yhat
qhat <- out$qhat
qcf0i <- out$qcf_0
qcf1i <- out$qcf_h1 
qcf2i <- out$qcf_h1 

if (conf2==0) qcf <- qcf0i 
if (conf2==1) qcf <- qcf1i 
if (conf2==2) qcf <- qcf2i 
qq <- unique(q)
qq <- as.matrix(sort(qq))
qq <- qq[qq<=qcf[2]]
qq <- as.matrix(qq[qq>=qcf[1]])

for (i in 1:nrow(qq)){
  qi <- qq[i]
  dai <- (q<=qi)
  dbi <- 1-dai
  ya <- y[dai%*%matrix(1,1,ncol(y))>0]
  ya <- matrix(ya,length(ya)/ncol(y),ncol(y))
  xa <- xx[dai%*%matrix(1,1,ncol(xx))>0] 
  xa <- matrix(xa,length(xa)/ncol(xx),ncol(xx))
  za <- z[dai%*%matrix(1,1,ncol(z))>0]
  za <- matrix(za,length(za)/ncol(z),ncol(z))
  yb <- y[dbi%*%matrix(1,1,ncol(y))>0]
  yb <- matrix(yb,length(yb)/ncol(y),ncol(y))
  xb <- xx[dbi%*%matrix(1,1,ncol(xx))>0]
  xb <- matrix(xb,length(xb)/ncol(xx),ncol(xx))
  zb <- z[dbi%*%matrix(1,1,ncol(z))>0]
  zb <- matrix(zb,length(zb)/ncol(z),ncol(z))
  out1 <- gmm_linear(ya,za,xa)
  out2 <- gmm_linear(yb,zb,xb)
  beta1i <- out1$beta
  se1i <- out1$se 
  jstat1i <- out1$jstat 
  beta2i <- out2$beta
  se2i <- out2$se 
  jstat2i <- out2$jstat
  beta1l <- apply(t(cbind((beta1i-se1i*1.96),beta1l)),2,min)
  beta1u <- apply(t(cbind((beta1i+se1i*1.96),beta1u)),2,max)
  beta2l <- apply(t(cbind((beta2i-se2i*1.96),beta2l)),2,min)
  beta2u <- apply(t(cbind((beta2i+se2i*1.96),beta2u)),2,max)
}

cat ("\n")
cat ("\n")
cat ("Threshold Estimate                        ", qhat, "\n")
cat ("Confidence Interval - Uncorrected         ", qcf0, "\n")
cat ("Confidence Interval - Het Corrected Quad  ", qcf1, "\n")
cat ("Confidence Interval - Het Corrected NP    ", qcf2, "\n")
cat ("\n")
cat ("\n")
cat ("Regime 1 : Threshold variable less than   ", qhat, "\n")
cat ("Number of observations                    ", sum(da),"\n")
cat ("\n")
cat ("Estimates     S.E.          Lower         Upper", "\n")
for (i in 1:nrow(beta1)) cat (beta1[i],"  ",se1[i],"  ",beta1l[i],"  ",beta1u[i],"\n")
cat ("\n")
cat ("Regime 2 : Threshold variable greater than", qhat, "\n")
cat ("Number of observations                    ", sum(db),"\n")
cat ("\n")
cat ("Estimates     S.E.          Lower         Upper", "\n")
for (i in 1:nrow(beta2)) cat (beta2[i],"  ",se2[i],"  ",beta2l[i],"  ",beta2u[i],"\n")
cat ("\n")
qhat
}

#************************************************************#
# Example using Simulated Data #
n <- 100
kx <- 4 
sig <- matrix(c(1,0.5,0.5,1),2,2)  
e <- matrix(rnorm(n*2),n,2)%*%chol(sig)
q <- matrix(rnorm(n),n,1)  
x <- matrix(rnorm(n*kx),n,kx)  
z2 <- cbind(matrix(1,n,1),q)  
z1 <- x%*%matrix(1,kx,1)*3+e[,2]
zz <- (z1+z2%*%matrix(1,2,1))*.1
y <- zz*(q<0)-zz*(q>=0)+e[,1]
qhat <- gmm_thresh(y,z1,z2,x,q,.9,.8,1,1)

相关文章

网友评论

      本文标题:R成精系列-R语言与门限自回归模型(TAR)

      本文链接:https://www.haomeiwen.com/subject/qsgtzftx.html