--- title: 'R codes for the paper *About predictions in spatial autoregressive models: Optimal and almost optimal strategies*' author: '' output: html_document: default pdf_document: default --- This document was generated directly from RStudio using Markdown tool. It presents the R codes used to obtain the simulation results included in the paper *``About predictions in spatial autoregressive models: Optimal and almost optimal strategies''*. The source of this page (with all R codes used) is available [here](http://www.thibault.laurent.free.fr/code/GLT/GLT.Rmd). To cite this work, please use : Goulard M., Laurent T., and Thomas-Agnan C. (2017). About predictions in spatial autoregressive models: Optimal and almost optimal strategies. *Spatial Economic Analysis*, **12**. ### Before starting ### Install the required packages: ```{r,eval=FALSE} install.packages(c("RColorBrewer", "rgdal", "spdep")) ``` Load the packages : ```{r, results="hide", message=FALSE} require("rgdal") require("spdep") require("RColorBrewer") ``` Set working directory: ```{r} path.dir <- "/media/laurent/DATAPART1/site web/code/GLT" ``` ```{r,echo=FALSE} setwd(path.dir) ``` ## 1. Preparation of the data ## Import the [Midi-Pyrénées communes boundaries](http://www.thibault.laurent.free.fr/code/GLT/contours.zip) into R : ```{r,echo=FALSE,results="hide"} URL <- "http://www.thibault.laurent.free.fr/code/GLT/" fil <- "contours.zip" if (!file.exists(fil)) download.file(paste(URL, fil, sep = ""), fil) unzip(fil) ``` ```{r,echo=TRUE,results="hide"} mapMAP <- readOGR(dsn = "contours", layer = "ADTCAN_region") ``` We convert the type of the identification units into numeric values: ```{r} mapMAP@data$CODE <- as.numeric(as.character(mapMAP@data$CODE)) ``` The number of observations equals to $n$: ```{r} n <- nrow(mapMAP) ``` We consider two spatial weight matrices: * $W_1$ based on the 10-nearest neighbours and row-normalized. $W_1$ is relatively sparse ($96.5\%$ of null values). ```{r,echo=TRUE} coords <- coordinates(mapMAP) W1.listw <- nb2listw(knn2nb(knearneigh(coords, 10)), style = "W") W1.mat <- listw2mat(W1.listw) ``` * $W_2$ based on the weights proportionnal to the inverse distance with a distance threshold equal to 125km and row-normalized. $W_2$ is relatively dense ($40\%$ of null values) ```{r,echo=TRUE} W.nb <- dnearneigh(coords, 0, 125000, longlat = FALSE) dlist <- nbdists(W.nb, coords) dlist.c <- lapply(dlist, function(x) 1/x) W2.listw <- nb2listw(W.nb, glist = dlist.c, style = "W") W2.mat <- listw2mat(W2.listw) ``` Vizualisation of the two matrices (not included in the article) : ```{r,echo=F, fig=T, fig.width=8, fig.height=5} pal <- c("white", brewer.pal(5, "Reds")) oopar <- par(mfrow = c(1, 2), oma = c(0, 0, 0, 0), mar = c(1, 1, 3, 1) + 0.1) z1 <- t(W1.mat) z2 <- t(W2.mat) brks <- c(0, 0, 0.001, 0.002, 0.003, 0.005, 1) nbr3 <- length(brks)-3 image(1:n, 1:n, z1[,ncol(z1):1], breaks = brks, col = pal, main = "Spatial weight matrix W1", axes = FALSE) box() nbr3 <- length(brks)-3 image(1:n, 1:n, z2[,ncol(z2):1], breaks = brks, col = pal, main = "Spatial weight matrix W2", axes = FALSE) box() par(oopar) ``` ```{r, echo=F, eval = F, fig=F, fig.width=15, fig.height=8} # we choose some units to illustrate the problem of prediction ind.example <- c(40, 46, 70, 44, 32, 67, 54, 58, 52, 54, 58, 60, 64) op = par(mar = c(0, 0, 0, 0), mgp = c(2, .4, 0), oma = c(0, 0, 0, 0), mfrow = c(1, 3)) plot(mapMAP[ind.example,]) points(coords[ind.example, ], pch = 16, col = "green", cex = 1.1) points(coords[ind.example, ], pch = 3, col = "royalblue") legend("topleft", legend = c("x,y known", "y to predict"), pch = c(3, 16), col = c("royalblue", "green"), cex = 1.6, bty = "n") plot(mapMAP[ind.example, ]) plot(mapMAP[c(58, 52, 54), ], col = "grey", add = TRUE) plot(mapMAP[c(67,58,52,54), ], density = 10, add = TRUE) points(coords[c(40, 46, 44, 32, 70, 60, 64),], pch = 3, col = "royalblue") points(coords[67,1], coords[67,2], pch=16, col="green", cex=1.1) points(coords[c(58,52,54,67),], pch=1, col="red") legend("topleft", legend=c("x,y known", "y unknown, x known", "y to predict"), pch=c(3,1,16), col=c("royalblue", "red", "green"), cex=1.6, bty="n") legend("bottomleft", legend=c("not used for fitting the model", "not used for prediction"), fill = c("white","grey"), bty="n", cex=1.6) legend("bottomleft", legend=c("", ""),density=c(10,0), bty="n", cex=1.6) # multiple out-of-sample prediction plot(mapMAP[ind.example, ]) plot(mapMAP[c(67,58,52,54), ], density=10, add=TRUE) points(coords[c(40,46,44,32,70,60,64),], pch=3, col="royalblue") points(coords[c(58,52,54,67),1], coords[c(58,52,54,67),2], pch=16, col="green", cex=1.1) points(coords[c(58,52,54,67),1], coords[c(58,52,54,67),2], pch=1, col="red", cex=1.1) legend("topleft", legend=c("x,y known", "y unknown, x known", "y to predict"), pch=c(3,1,16), col=c("royalblue", "red", "green"), cex=1.6, bty="n") legend("bottomleft", legend="not used for fitting the model", fill="black", density=10, cex=1.6, bty="n") par(op) ``` ## 2. Simulation framework for in-sample predictions## ### a. Simulation of a SAR process### We simulate three explanatory variables as follows. $X_1$ follows a gaussian distribution $\mathcal{N}(15,3)$, $X_2$ follows (up to a constant) a binomial distribution $\mathcal{B}(100,0.45)/100$ and $X_3$ follows a log-uniform distribution $log(\mathcal{U}_{[0,283]})$. ```{r} x1 <- rnorm(n, 15, 3) x2 <- rbinom(n, 100, 0.45)/100 x3 <- log(round(runif(n, 1, n),0)) x <- cbind(rep(1, n), x1, x2, x3) ``` We use the following spatial autoregressive LAG regression model to generate the dependent variable $$ \mathbf{Y}=(\mathbf{I}-\rho \mathbf{W})^{-1}(\beta_0\mathbf{1_n} + \mathbf{X_1} \beta_1+ \mathbf{X_2} \beta_2+ \mathbf{X_3} \beta_3 + \boldsymbol{\epsilon}) ~~\text{where $\boldsymbol{\epsilon} \sim \mathcal{N}(0,\sigma^2 \mathbf{I_n}),$} $$ where $\mathbf{1_n}= (1,\cdots,1)' \in \mathbb{R}^n$, with fixed $\beta=(\beta_0,~\beta_1,~\beta_2, ~\beta_3)'=(5,~0.25,~6,~1)'$, ```{r} beta0 <- c(5, 1/4, 6, 1) ``` #### The different choices of parameters ### We consider the following parameter values in the simulation process : * $\rho=-0.4,~-0.2,~0.05,~0.2,~0.35,~0.5,~0.65,~0.8,~0.9$, ```{r} rho <- c(-0.4, -0.2, 0.05, 0.2, 0.35, 0.5, 0.65, 0.8, 0.9) ``` * $\sigma=1,~3$, ```{r} sigma <- c(1, 3) ``` * $W=W_1$ or $W=W_2$. For example, for $\rho=0.35$, $\sigma=1$ and $W_1$, the simulation of $Y$ is as follows: ```{r} n <- nrow(x) A <- diag(1,n) - 0.35*W1.mat Y0 <- x%*%beta0 + rnorm(n, 0, 1) y <- solve(A, Y0) ``` We have 9 possible values for $\rho$, 2 possible values for $\sigma$ and 2 possible spatial weight matrices $W_1$ or $W_2$, hence we obtain 36 different possible combinations. We save these parameters in the file **param-in.RData** included in a user-choosen directory and which will be needed later for executing the simulation program: ```{r} list.insample.param <- list(rho = rho, W = list(W1.listw, W2.listw), sigma = sigma) if(!dir.exists(paste(path.dir, "parameters", sep = ""))) dir.create(paste(path.dir, "parameters", sep = "")) save(list.insample.param, beta0, x, file=paste(path.dir, "parameters/param-in.RData", sep = "")) ``` For each of the combinations, we draw 1000 samples, estimate a SAR model with the *lagsarlm* function (to obtain $\hat\rho$, $\hat\beta$ and $\hat\sigma$) and compute the following predictors : * $\mathbf{\check{Y}^{TS}} = \mathbf{X} \boldsymbol{\hat\beta} + \hat\rho \mathbf{W}\mathbf{Y}$ * $\mathbf{\check{Y}^{TC}} = (\mathbf{I}-\hat \rho \mathbf{W})^{-1} \mathbf{X} \boldsymbol{\hat\beta}$ * $\mathbf{\check{Y}^{BP}} = (\mathbf{I}-\rho \mathbf{W})^{-1}\mathbf{X} \boldsymbol{\hat\beta} -\text{Diag}(\mathbf{\hat{Q}})^{-1}\mathbf{\tilde{\hat{Q}}}(\mathbf{Y}-(\mathbf{I}- \rho \mathbf{W})^{-1}\mathbf{X} \boldsymbol{\hat \beta})$ ### b. Computing the predictors with **spdep** ### All the prediction formulas cited in our paper are implemented in the releases later than 0.5-92 of the **spdep** R package (Bivand and Piras, 2015), following the GSoC 2015 project by M. Gubri. ```{r} don <- data.frame(y, x1, x2, x3) sar.res <- spdep::lagsarlm(y~., data=don, W1.listw, method="eigen") y_TC = predict.sarlm(sar.res, listw = W1.listw, pred.type = "TC", all.data = T, zero.policy = F) y_TS = predict.sarlm(sar.res, listw = W1.listw, pred.type = "TS", all.data = T, zero.policy = F) y_BP = predict.sarlm(sar.res, listw = W1.listw, pred.type = "BP", all.data=T, zero.policy=F) ``` ### c. Results ### The in-sample prediction simulation codes are given [here](http://www.thibault.laurent.free.fr/code/GLT/function/in-sample.R) and can be executed using parallel processing. We summarize the results and obtain the following table: ```{r, echo=F} mse=function(x) {return(round(c(mean(x[,3]), mean(x[,1]), mean(x[,2])), 4)) } mse.sd=function(x) {return(round(c(sd(x[,3]), sd(x[,1]), sd(x[,2])), 4)) } mse.rate=function(x) {return(round(c(mean(x[,3])/mean(x[,1]),mean(x[,3])/mean(x[,2])),4)) } ``` * comparaison of the MSE depending on $W_1$ or $W_2$ with $\sigma=1$ (Table 2 in the article): ```{r, echo=F} # Results from In-sample.R file are loaded load(url(paste(URL, "result/res-in.RData", sep = ""))) l.mse <- lapply(result.in, mse) l.mse.sd <- lapply(result.in, mse.sd) l.mse.rate <- lapply(result.in, mse.rate) # table of results : table.sigma1 <- matrix(0, length(rho)*2, 6) for(k in 1:length(rho)) {table.sigma1[2*(k-1)+1, 1:3] <- l.mse[[k]] table.sigma1[2*(k-1)+1, 4:6] <- l.mse[[k+9]] table.sigma1[2*(k-1)+2, 1:3] <- paste("(", l.mse.sd[[k]], ")", sep = "") table.sigma1[2*(k-1)+2, 4:6] <- paste("(", l.mse.sd[[k+9]], ")", sep = "") } table.sigma1 <- cbind("", table.sigma1) table.sigma1[seq(1,length(rho)*2,2),1] <- rho colnames(table.sigma1) <- c("rho", "BP (W1)", "TS (W1)", "TC (W1)", "BP (W2)", "TS (W2)", "TC (W2)") ``` ```{r table1, results='asis', message=FALSE, echo=FALSE} library("knitr") #library("xtable") kable(table.sigma1) ``` * comparaison of the ratio of MSE (BP/TS and BP/TC) depending on $W_1$ or $W_2$ with $\sigma=1$ (Table 3 in the article): ```{r, echo=F} # table of results : table.sigma1.r <- matrix(0,length(rho), 4) for(k in 1:length(rho)) {table.sigma1.r[k, 1:2] <- l.mse.rate[[k]] table.sigma1.r[k, 3:4] <- l.mse.rate[[k+9]] } table.sigma1.r <- cbind("", table.sigma1.r) table.sigma1.r[,1] <- rho colnames(table.sigma1.r) <- c("rho", "BP_TS (W1)", "BP_TC (W1)", "BP_TS (W2)", "BP_TC (W2)") ``` ```{r table1.r, results='asis', message=FALSE, echo=FALSE} kable(table.sigma1.r) ``` * comparaison of the MSE depending on $W_1$ or $W_2$ with $\sigma=3$ (results not included in the article): ```{r, echo=F} # table of results : table.sigma2 <- matrix(0,length(rho)*2, 6) for(k in 1:length(rho)) {table.sigma2[2*(k-1)+1, 1:3] <- l.mse[[k+18]] table.sigma2[2*(k-1)+1, 4:6] <- l.mse[[k+18+9]] table.sigma2[2*(k-1)+2, 1:3] <- paste("(", l.mse.sd[[k+18]], ")",sep="") table.sigma2[2*(k-1)+2, 4:6] <- paste("(", l.mse.sd[[k+18+9]], ")",sep="") } table.sigma2 <- cbind("", table.sigma2) table.sigma2[seq(1,length(rho)*2,2),1] <- rho colnames(table.sigma2) <- c("rho", "BP (W1)", "TS (W1)", "TC (W1)", "BP (W2)", "TS (W2)", "TC (W2)") ``` ```{r table2, results='asis', message=FALSE, echo=FALSE} kable(table.sigma2) ``` * comparaison of the MSE ratio BP/TS and BP/TC depending on $W_1$ or $W_2$ with $\sigma=3$ (results not included in the article): ```{r, echo=F} # table of results : table.sigma2.r <- matrix(0,length(rho), 4) for(k in 1:length(rho)) {table.sigma2.r[k, 1:2] <- l.mse.rate[[k+18]] table.sigma2.r[k, 3:4] <- l.mse.rate[[k+18+9]] } table.sigma2.r <- cbind("", table.sigma2.r) table.sigma2.r[,1] <- rho colnames(table.sigma2.r) <- c("rho", "BP_TS (W1)", "BP_TC (W1)", "BP_TS (W2)", "BP_TC (W2)") ``` ```{r table2.r, results='asis', message=FALSE, echo=FALSE} kable(table.sigma2.r) ``` ## 3. Simulation framework for out-of-sample predictions## ### a. Simulation process### We choose at random a given number of sites ($27$ or $54$) which will be declared out-of-sample (in $O$). We predict the $\mathbf{Y}$ variable on the out-of-sample locations based on the sample $S$ constituted by the remaining sites. We consider several situations depending upon the number of out-of-sample units and upon the aggregation level of the out-of-sample units. The corresponding configurations are: #### 27 out-of-sample units (R codes) ```{r} set.seed(18) config1.27 <- sample(1:283, 27) config2.27 <- c(72, 180, 269, 155, 275, 143, 259) for(p in 1:4) {set.seed(p) config2.27 <- c(config2.27, sample(W1.listw$neighbours[[config2.27[p]]],5)) } config3.27 <- c(132, 59, 225) for(p in 1:3) {set.seed(p+1) config3.27 <- c(config3.27, sample(W1.listw$neighbours[[config3.27[p]]],8)) } ``` #### 54 out-of-sample units (R codes) ```{r} set.seed(15) config1.54 <- sample(1:283,54) config2.54 <- c(19, 76, 137, 257, 214, 52, 39, 163, 168) for(p in 1:9) {set.seed(p+1) config2.54 <- c(config2.54, sample(W1.listw$neighbours[[config2.54[p]]],5)) } config3.54 <- c(6, 145, 278, 229, 26, 158) for(p in 1:6) {set.seed(p) config3.54 <- c(config3.54, sample(W1.listw$neighbours[[config3.54[p]]],8)) } ``` #### Graphical representation of the 27 and 54 out-of-sample units (Figure 3 in the article) ```{r, fig.width=10, fig.height=5, echo=F} #postscript("Z:\\Thibault Pro\\research\\statistique spatiale\\pr?diction Michel Christine\\article spatial economic analysis\\codes\\Figures\\config.eps", width = 12.0, height = 8.0, horizontal = FALSE, onefile = FALSE, paper = "special", family = "ComputerModern", encoding = "TeXtext.enc") op <- par(mfrow=c(2,3), oma=c(0,0,0,0), mar=c(0,0,0,0)) plot(mapMAP) plot(mapMAP[config1.27,],col='red',add=TRUE) plot(mapMAP) plot(mapMAP[config2.27,],col='red',add=TRUE) plot(mapMAP) plot(mapMAP[config3.27,],col='red',add=TRUE) plot(mapMAP) plot(mapMAP[config1.54,],col='red',add=TRUE) plot(mapMAP) plot(mapMAP[config2.54,],col='red',add=TRUE) plot(mapMAP) plot(mapMAP[config3.54,],col='red',add=TRUE) par(op) #dev.off() ``` ### b. Out-of-sample predictors formulae ### * $\mathbf{\hat Y_O^{BP}} = \mathbf{\hat Y_O^{TC}} - \mathbf{\hat{Q}_{OO}}^{-1}\mathbf{\hat{Q}_{OS}} \times (\mathbf{Y_S}-\mathbf{\hat Y_S^{TC}})$ * $\mathbf{\hat Y_O^{TC}} = [(\mathbf{I} - \hat \rho \mathbf{W})^{-1}\mathbf{X} \boldsymbol{\hat\beta}]_O$ * $\hat Y_o^{TS^{1}} = \mathbf{X_o}\boldsymbol{\hat\beta} + \hat\rho \mathbf{W_{oS}}\mathbf{Y_S}$ * $\mathbf{\hat Y_O^{BP_{W}}}= \mathbf{\hat Y_{O}^{TC}}+\boldsymbol{\hat{\Sigma}_{OS}}\mathbf{W_{OS}'}(\mathbf{W_{OS}}\boldsymbol{\hat{\Sigma}_{SS}} \mathbf{W_{OS}'})^{-1}(\mathbf{W_{OS}}\mathbf{Y_S} - \mathbf{W_{OS}}\mathbf{\hat Y_{S}^{TC}})$ * $\mathbf{\hat Y_O^{BP_N}}= \mathbf{\hat Y_O^{TC}}-\mathbf{\hat{Q}_{OO}}^{-1}\mathbf{\hat{Q}_{OJ}} \mathbf{(Y_J - \hat Y^{TC}_J )_J}$ for $J$ set of indices of neighbors of $O$ * $\hat Y_o^{TC^1}$, row $o$ of $$ \{ \mathbf{I_{n_S+1}} - \hat \rho \begin{pmatrix} \mathbf{W_{SS}}& \mathbf{W_{So}} \\ \mathbf{W_{oS}}& \mathbf{W_{oo}} \end{pmatrix}\}^{-1} \begin{pmatrix} \mathbf{X_S} \\ \mathbf{X_o} \end{pmatrix} \boldsymbol{\hat \beta}$$ * $\hat Y_o^{BP^1} = \hat Y_o^{TC^1} - \hat{Q}_{oo}^{-1}\mathbf{\hat{Q}_{oS}}(\mathbf{Y_S} - \mathbf{\hat Y_S^{TC^1}})$ * $\hat Y_o^{BP_W^1} = \hat Y_{o}^{TC^1}+\mathbf{\hat{\Sigma}_{oS}}\mathbf{W_{oS}'}(\mathbf{W_{oS}}\boldsymbol{\hat{\Sigma}_{SS}} \mathbf{W_{oS}'})^{-1}(\mathbf{W_{oS}}\mathbf{Y_S} - \mathbf{W_{oS}}\mathbf{\hat Y_{S}^{TC^1}})$ * $BP_N^1$ & $\hat Y_o^{BP_N^1} = \hat Y_o^{TC}-\hat{Q}_{oo}^{-1}\mathbf{\hat{Q}_{oJ}} (\mathbf{Y_J} - \mathbf{\hat Y^{TC^1}_J} )_J$ for $J$ set of indices of neighbors of $o$ ### c. Computing the predictions with **spdep** ### We define the indices of the out-of-sample units and build the $S$ and $O$ sample : ```{r} outsamp <- config1.54 S.sample <- don[-outsamp, ] O.sample <- don[outsamp, ] ``` We estimate a SAR model on the $S$ sample. For that, we first have to define the $W_S$ matrix: ```{r} WS.listw <- subset.listw(W1.listw, !(1:n %in% outsamp)) sar.res <- lagsarlm(y~., data = S.sample, WS.listw, method = "eigen") ``` We predict the out-of-sample $O$ values of $Y$ by using *predict.sarlm()* function. Option **pred.type=** gives the predictors type : ```{r, eval=F} y_BP = as.numeric(predict.sarlm(sar.res, listw = W1.listw, newdata = O.sample, zero.policy = T, pred.type = "BP", power = TRUE)) y_BP1 = as.numeric(predict.sarlm(sar.res, listw = W1.listw, newdata = O.sample, zero.policy = T, pred.type = "BP1", power = TRUE)) y_TC = as.numeric(predict.sarlm(sar.res, listw = W1.listw, newdata = O.sample, zero.policy = T, pred.type = "TC", power = TRUE)) y_TS1 = as.numeric(predict.sarlm(sar.res, listw = W1.listw, newdata = O.sample, zero.policy = T, pred.type = "TS1", power = TRUE)) y_BPW = as.numeric(predict.sarlm(sar.res, listw = W1.listw, newdata = O.sample, zero.policy = T, pred.type = "BPW", power = TRUE)) y_BPN = as.numeric(predict.sarlm(sar.res, listw = W1.listw, newdata = O.sample, zero.policy = T, pred.type = "BPN", power = TRUE)) y_BPW1 = as.numeric(predict.sarlm(sar.res, listw = W1.listw, newdata = O.sample, zero.policy = T, pred.type = "BPW1", power = TRUE)) y_BPN1 = as.numeric(predict.sarlm(sar.res, listw = W1.listw, newdata = O.sample, zero.policy = T, pred.type = "BPN1", power = TRUE)) ``` ### d. The different choices of parameters in the simulations ### In the simulations, we fix $\rho=0.35$. The different parameters we will test are: * 2 possible values for $\sigma$ ($\sigma=1,~3$). * 2 possible spatial weight matrices $W_1$ and $W_2$. * 6 different configurations for out-of-sample units. * 2 different methods for estimating the parameters: one using ML, one using EM. Thus, we obtain 48 different possible combinations whose parameters are saved in the **param-out.RData** file which will be used to run the simulations: ```{r} list.param <- list(W = list(W1.listw, W2.listw), sigma = sigma, config = list(config1.27, config2.27, config3.27,config1.54, config2.54, config3.54), EM = c(FALSE, TRUE)) save(list.param, beta0, x, file = paste(path.dir, "parameters/param-out.RData", sep = "")) ``` ### e. EM or not EM: that is the question? ### Among the different configurations presented previously, we will first focus on the estimation of the parameters to check if we can notice a strong difference when estimating by EM (see R codes uses [here](http://www.thibault.laurent.free.fr/code/GLT/function/code-EM.r)). We do not consider all the cases, but the following ones : * 54 missing units: randomly distributed or strongly clustered * $\sigma=1,~3$ * $W=W_1,~W_2$ #### $\sigma=1$, $W_1$ (Table 5 in the article) ```{r, echo=F} load(url(paste(URL, "result/res-EM.RData", sep=""))) table.res.EM1 <- matrix(0, 12, 5) table.res.EM1[seq(1,12,2),2] <- round(c(apply(result.EM[[1]][[1]],2,mean), mean(result.EM[[1]][[3]]), mean(result.EM[[1]][[5]])),3) table.res.EM1[seq(2,12,2),2] <- paste("(",round(c(apply(result.EM[[1]][[1]],2,sd), sd(result.EM[[1]][[3]]), sd(result.EM[[1]][[5]])),3),")",sep="") table.res.EM1[seq(1,12,2),3] <- round(c(apply(result.EM[[1]][[2]],2,mean),mean(result.EM[[1]][[4]]), mean(result.EM[[1]][[6]])),3) table.res.EM1[seq(2,12,2),3] <- paste("(",round(c(apply(result.EM[[1]][[2]],2,sd), sd(result.EM[[1]][[4]]), sd(result.EM[[1]][[6]])),3),")",sep="") table.res.EM1[seq(1,12,2),4] <- round(c(apply(result.EM[[5]][[1]],2,mean), mean(result.EM[[5]][[3]]), mean(result.EM[[5]][[5]])),3) table.res.EM1[seq(2,12,2),4] <- paste("(",round(c(apply(result.EM[[5]][[1]],2,sd), sd(result.EM[[5]][[3]]), sd(result.EM[[5]][[5]])),3),")",sep="") table.res.EM1[seq(1,12,2),5] <- round(c(apply(result.EM[[5]][[2]],2,mean),mean(result.EM[[5]][[4]]), mean(result.EM[[5]][[6]])),3) table.res.EM1[seq(2,12,2),5] <- paste("(",round(c(apply(result.EM[[5]][[2]],2,sd), sd(result.EM[[5]][[4]]), sd(result.EM[[5]][[6]])),3),")",sep="") table.res.EM1[,1] <- c("beta0", "", "beta1", "", "beta2", "", "beta3", "", "rho", "", "sigma2", "") colnames(table.res.EM1) <- c("","ML (random)","EM (random)", "ML (clustered)","EM (clustered)") ``` ```{r table3.EM, results='asis', message=FALSE, echo=FALSE} kable(table.res.EM1) ``` #### $\sigma=1$, $W_2$ (Table 5 in the article) ```{r, echo=F} table.res.EM2 <- matrix(0, 12, 5) table.res.EM2[seq(1,12,2),2] <- round(c(apply(result.EM[[2]][[1]],2,mean), mean(result.EM[[2]][[3]]), mean(result.EM[[2]][[5]])),3) table.res.EM2[seq(2,12,2),2] <- paste("(",round(c(apply(result.EM[[2]][[1]],2,sd), sd(result.EM[[2]][[3]]), sd(result.EM[[2]][[5]])),3),")",sep="") table.res.EM2[seq(1,12,2),3] <- round(c(apply(result.EM[[2]][[2]],2,mean),mean(result.EM[[2]][[4]]), mean(result.EM[[2]][[6]])),3) table.res.EM2[seq(2,12,2),3] <- paste("(",round(c(apply(result.EM[[2]][[2]],2,sd), sd(result.EM[[2]][[4]]), sd(result.EM[[2]][[6]])),3),")",sep="") table.res.EM2[seq(1,12,2),4] <- round(c(apply(result.EM[[6]][[1]],2,mean), mean(result.EM[[6]][[3]]), mean(result.EM[[6]][[5]])),3) table.res.EM2[seq(2,12,2),4] <- paste("(",round(c(apply(result.EM[[6]][[1]],2,sd), sd(result.EM[[6]][[3]]), sd(result.EM[[6]][[5]])),3),")",sep="") table.res.EM2[seq(1,12,2),5] <- round(c(apply(result.EM[[6]][[2]],2,mean),mean(result.EM[[6]][[4]]), mean(result.EM[[6]][[6]])),3) table.res.EM2[seq(2,12,2),5] <- paste("(",round(c(apply(result.EM[[6]][[2]],2,sd), sd(result.EM[[6]][[4]]), sd(result.EM[[6]][[6]])),3),")",sep="") table.res.EM2[,1] <- c("beta0", "", "beta1", "", "beta2", "", "beta3", "", "rho", "", "sigma2", "") colnames(table.res.EM2) <- c("","ML (random)","EM (random)", "ML (clustered)","EM (clustered)") ``` ```{r table3.EM2, results='asis', message=FALSE, echo=FALSE} kable(table.res.EM2) ``` #### $\sigma=3$, $W_1$ (Not included in the article) ```{r, echo=F} table.res.EM3 <- matrix(0, 12, 5) table.res.EM3[seq(1,12,2),2] <- round(c(apply(result.EM[[3]][[1]],2,mean), mean(result.EM[[3]][[3]]), mean(result.EM[[3]][[5]])),3) table.res.EM3[seq(2,12,2),2] <- paste("(",round(c(apply(result.EM[[3]][[1]],2,sd), sd(result.EM[[3]][[3]]), sd(result.EM[[3]][[5]])),3),")",sep="") table.res.EM3[seq(1,12,2),3] <- round(c(apply(result.EM[[3]][[2]],2,mean),mean(result.EM[[3]][[4]]), mean(result.EM[[3]][[6]])),3) table.res.EM3[seq(2,12,2),3] <- paste("(",round(c(apply(result.EM[[3]][[2]],2,sd), sd(result.EM[[3]][[4]]), sd(result.EM[[3]][[6]])),3),")",sep="") table.res.EM3[seq(1,12,2),4] <- round(c(apply(result.EM[[7]][[1]],2,mean), mean(result.EM[[7]][[3]]), mean(result.EM[[7]][[5]])),3) table.res.EM3[seq(2,12,2),4] <- paste("(",round(c(apply(result.EM[[7]][[1]],2,sd), sd(result.EM[[7]][[3]]), sd(result.EM[[7]][[5]])),3),")",sep="") table.res.EM3[seq(1,12,2),5] <- round(c(apply(result.EM[[7]][[2]],2,mean),mean(result.EM[[7]][[4]]), mean(result.EM[[7]][[6]])),3) table.res.EM3[seq(2,12,2),5] <- paste("(",round(c(apply(result.EM[[7]][[2]],2,sd), sd(result.EM[[7]][[4]]), sd(result.EM[[7]][[6]])),3),")",sep="") table.res.EM3[,1] <- c("beta0", "", "beta1", "", "beta2", "", "beta3", "", "rho", "", "sigma2", "") colnames(table.res.EM3) <- c("","ML (random)","EM (random)", "ML (clustered)","EM (clustered)") ``` ```{r table3.EM3, results='asis', message=FALSE, echo=FALSE} kable(table.res.EM3) ``` #### $\sigma=3$, $W_2$ (Not included in the article) ```{r, echo=F} table.res.EM4 <- matrix(0, 12, 5) table.res.EM4[seq(1,12,2),2] <- round(c(apply(result.EM[[4]][[1]],2,mean), mean(result.EM[[4]][[3]]), mean(result.EM[[4]][[5]])),3) table.res.EM4[seq(2,12,2),2] <- paste("(",round(c(apply(result.EM[[4]][[1]],2,sd), sd(result.EM[[4]][[3]]), sd(result.EM[[4]][[5]])),3),")",sep="") table.res.EM4[seq(1,12,2),3] <- round(c(apply(result.EM[[4]][[2]],2,mean),mean(result.EM[[4]][[4]]), mean(result.EM[[4]][[6]])),3) table.res.EM4[seq(2,12,2),3] <- paste("(",round(c(apply(result.EM[[4]][[2]],2,sd), sd(result.EM[[4]][[4]]), sd(result.EM[[4]][[6]])),3),")",sep="") table.res.EM4[seq(1,12,2),4] <- round(c(apply(result.EM[[8]][[1]],2,mean), mean(result.EM[[8]][[3]]), mean(result.EM[[8]][[5]])),3) table.res.EM4[seq(2,12,2),4] <- paste("(",round(c(apply(result.EM[[8]][[1]],2,sd), sd(result.EM[[8]][[3]]), sd(result.EM[[8]][[5]])),3),")",sep="") table.res.EM4[seq(1,12,2),5] <- round(c(apply(result.EM[[8]][[2]],2,mean),mean(result.EM[[8]][[4]]), mean(result.EM[[8]][[6]])),3) table.res.EM4[seq(2,12,2),5] <- paste("(",round(c(apply(result.EM[[8]][[2]],2,sd), sd(result.EM[[8]][[4]]), sd(result.EM[[8]][[6]])),3),")",sep="") table.res.EM4[,1] <- c("beta0", "", "beta1", "", "beta2", "", "beta3", "", "rho", "", "sigma2", "") colnames(table.res.EM4) <- c("","ML (random)","EM (random)", "ML (clustered)","EM (clustered)") ``` ```{r table3.EM4, results='asis', message=FALSE, echo=FALSE} kable(table.res.EM4) ``` **Comments:** * the parameters estimates do not vary a lot between ML and EM algorithm. * estimations of $\beta_1$, $\beta_2$ and $\beta_3$ are good whatever the choice for $W_1$ and $W_2$, $\sigma=1,~3$. * estimations of $\sigma$ is good. * when $W_2$ is dense, estimation of $\rho$ is not good, especially when $\sigma=3$ ### f. Simulation algorithms and results when $\rho = 0.35$ ### For each of the 48 combinations, we draw 1000 samples, estimate a SAR model with the **lagsarlm** (with or without EM algorithm) and we compute the previous predictors. The out-of-sample prediction simulation codes are given [here](http://www.thibault.laurent.free.fr/code/GLT/function/out-of-sample.R) and can be executed using parallel processing. We summarize the results and obtain the following table: #### The 6 configurations with $\sigma=1$, $W_1$ and $EM=FALSE$ (Table 6 in the article) #### ```{r,echo=F} load(url(paste(URL, "result/res-out.RData", sep = ""))) nb.poss <- prod(unlist(lapply(list.param, length))) # the array with the different possibilities arr.poss <- array(1:nb.poss, unlist(lapply(list.param, length))) ``` ```{r,echo=F} table.res.out.W1 <- matrix(0, 26, 7) colnames(table.res.out.W1) <- c("predictors", "config 1 (27)", "config 2 (27)", "config 3 (27)", "config 1 (54)", "config 2 (54)","config 3 (54)") table.res.out.W1[,1] <- c("BP","","BPn","","BP/BPn","BPw","","BP/BPw","BP1","","BP/BP1","BPn1","","BP/BPn1","BPw1","","BP/BPw1","TS1","","BP/TS1","TC","","BP/TC","","TC1","BP/TC1") for(k in 1:6) { table.res.out.W1[c(1,3,6,9,12,15,18,21,24), k+1] <- round(result.out[[arr.poss[1,1,,1][k]]]$MSE[c(1,8,6,2,9,7,5,3,4)],3) #the MSE table.res.out.W1[c(2,4,7,10,13,16,19,22,25), k+1] <- paste("(", round(result.out[[arr.poss[1,1,,1][k]]]$sd[c(1,8,6,2,9,7,5,3,4)],3),")",sep="") # the s.d. table.res.out.W1[c(5,8,11,14,17,20,23,26), k+1] <- round(as.numeric(table.res.out.W1[1, k+1])/as.numeric(table.res.out.W1[c(3,6,9,12,15,18,21,24), k+1])*100,1) # the s.d. } ``` ```{r table3.r, results='asis', message=FALSE, echo=FALSE} kable(table.res.out.W1) ``` **Comments:** * standard deviations are comparable across predictors. * $BPw$ and $BPn$ seem nearly equivalent to $BP$ * $BP1$ is slightly less efficient than $BP$ and it seems less good when the predicted sites are clustered. It is always better than $TC$ and $TS$. * $TC$ is less efficient than $BP$, it seems better when the predicted sites are clustered. * $TS$ is less efficient than $BP$ but better than $TC$. It becomes less efficient when the predicted sites are clustered. * $TC1$ is close to $TC$, but a little bit less efficient when predicted sites are clustered. * $BPw1$ and $BPn1$ are slightly better than $TS$. #### The 6 configurations with $\sigma=1$, $W_2$ and $EM=FALSE$ (Table 7 in the article) #### It seems that when $W$ is denser, all the predictors are close to $BP$. ```{r,echo=F} table.res.out.W2 <- matrix(0, 26, 7) colnames(table.res.out.W2) <- c("predictors", "config 1 (27)", "config 2 (27)", "config 3 (27)", "config 1 (54)", "config 2 (54)","config 3 (54)") table.res.out.W2[,1] <- c("BP","","BPn","","BP/BPn","BPw","","BP/BPw","BP1","","BP/BP1","BPn1","","BP/BPn1","BPw1","","BP/BPw1","TS1","","BP/TS1","TC","","BP/TC","","TC1","BP/TC1") for(k in 1:6) { table.res.out.W2[c(1,3,6,9,12,15,18,21,24), k+1] <- round(result.out[[arr.poss[2,1,,1][k]]]$MSE[c(1,8,6,2,9,7,5,3,4)],3) #the MSE table.res.out.W2[c(2,4,7,10,13,16,19,22,25), k+1] <- paste("(", round(result.out[[arr.poss[2,1,,1][k]]]$sd[c(1,8,6,2,9,7,5,3,4)],3),")",sep="") # the s.d. table.res.out.W2[c(5,8,11,14,17,20,23,26), k+1] <- round(as.numeric(table.res.out.W2[1, k+1])/as.numeric(table.res.out.W2[c(3,6,9,12,15,18,21,24), k+1])*100,1) # the s.d. } ``` ```{r table3.rc, results='asis', message=FALSE, echo=FALSE} kable(table.res.out.W2) ``` #### The 6 configurations with $\sigma=1$, $W_1$ and $EM=TRUE$ (Not included in the article) #### It seems that the fact that $\beta$, $\rho$ and $\sigma$ are estimated with the EM algorithm do not affect the prediction. Indeed, the predictors are very close to that of the first situation. ```{r,echo=F} table.res.out.W1.EM <- matrix(0, 26, 7) colnames(table.res.out.W1.EM) <- c("predictors", "config 1 (27)", "config 2 (27)", "config 3 (27)", "config 1 (54)", "config 2 (54)","config 3 (54)") table.res.out.W1.EM[,1] <- c("BP","","BPn","","BP/BPn","BPw","","BP/BPw","BP1","","BP/BP1","BPn1","","BP/BPn1","BPw1","","BP/BPw1","TS1","","BP/TS1","TC","","BP/TC","","TC1","BP/TC1") for(k in 1:6) { table.res.out.W1.EM[c(1,3,6,9,12,15,18,21,24), k+1] <- round(result.out[[arr.poss[1,1,,2][k]]]$MSE[c(1,8,6,2,9,7,5,3,4)],3) #the MSE table.res.out.W1.EM[c(2,4,7,10,13,16,19,22,25), k+1] <- paste("(", round(result.out[[arr.poss[1,1,,2][k]]]$sd[c(1,8,6,2,9,7,5,3,4)],3),")",sep="") # the s.d. table.res.out.W1.EM[c(5,8,11,14,17,20,23,26), k+1] <- round(as.numeric(table.res.out.W1.EM[1, k+1])/as.numeric(table.res.out.W1.EM[c(3,6,9,12,15,18,21,24), k+1])*100,1) # the s.d. } ``` ```{r table3.rb, results='asis', message=FALSE, echo=FALSE} kable(table.res.out.W1.EM) ``` #### The 6 configurations with $\sigma=3$, $W_1$ and $EM=FALSE$ (Not included in the article) #### When $\sigma=3$, it does not change the rank either the ratio between $BP$ and other predictors. ```{r,echo=F} table.res.out.W1.sigma <- matrix(0, 26, 7) colnames(table.res.out.W1.sigma) <- c("predictors", "config 1 (27)", "config 2 (27)", "config 3 (27)", "config 1 (54)", "config 2 (54)","config 3 (54)") table.res.out.W1.sigma[,1] <- c("BP","","BPn","","BP/BPn","BPw","","BP/BPw","BP1","","BP/BP1","BPn1","","BP/BPn1","BPw1","","BP/BPw1","TS1","","BP/TS1","TC","","BP/TC","","TC1","BP/TC1") for(k in 1:6) { table.res.out.W1.sigma[c(1,3,6,9,12,15,18,21,24), k+1] <- round(result.out[[arr.poss[1,2,,1][k]]]$MSE[c(1,8,6,2,9,7,5,3,4)],3) #the MSE table.res.out.W1.sigma[c(2,4,7,10,13,16,19,22,25), k+1] <- paste("(", round(result.out[[arr.poss[1,2,,1][k]]]$sd[c(1,8,6,2,9,7,5,3,4)],3),")",sep="") # the s.d. table.res.out.W1.sigma[c(5,8,11,14,17,20,23,26), k+1] <- round(as.numeric(table.res.out.W1.sigma[1, k+1])/as.numeric(table.res.out.W1.sigma[c(3,6,9,12,15,18,21,24), k+1])*100,1) # the s.d. } ``` ```{r table3.rd, results='asis', message=FALSE, echo=FALSE} kable(table.res.out.W1.sigma) ``` #### Conclusion of this section #### * when $W$ is sparse, $BP_N$, $BP_W$, $BP_N^1$, $BP_W^1$ are very close to $BP$ and much better than $TC$, $TS$, $TC^1$, $TS^1$. * Neither the value of $\sigma$ nor the fact that coefficients are estimated with EM algorithm change the rank of the predictors * When $W$ is dense, all the predictors seem to be very similar. ### g. Simulation algorithms and results when $\rho = -0.2$ ### We only present the results for the 6 configurations with $\sigma=1$, $W_1$ and $EM=FALSE$ (the table is not included in the article). ```{r,echo=F} load(url(paste(URL, "result/res-out-neg.RData", sep = ""))) nb.poss <- prod(unlist(lapply(list.param, length))) # the array with the different possibilities arr.poss <- array(1:nb.poss, unlist(lapply(list.param, length))) ``` ```{r,echo=F} table.res.out.neg.W1 <- matrix(0, 26, 7) colnames(table.res.out.neg.W1) <- c("predictors", "config 1 (27)", "config 2 (27)", "config 3 (27)", "config 1 (54)", "config 2 (54)","config 3 (54)") table.res.out.neg.W1[,1] <- c("BP","","BPn","","BP/BPn","BPw","","BP/BPw","BP1","","BP/BP1","BPn1","","BP/BPn1","BPw1","","BP/BPw1","TS1","","BP/TS1","TC","","BP/TC","","TC1","BP/TC1") for(k in 1:6) { table.res.out.neg.W1[c(1,3,6,9,12,15,18,21,24), k+1] <- round(result.out.neg[[arr.poss[1,1,,1][k]]]$MSE[c(1,8,6,2,9,7,5,3,4)],3) #the MSE table.res.out.neg.W1[c(2,4,7,10,13,16,19,22,25), k+1] <- paste("(", round(result.out.neg[[arr.poss[1,1,,1][k]]]$sd[c(1,8,6,2,9,7,5,3,4)],3),")",sep="") # the s.d. table.res.out.neg.W1[c(5,8,11,14,17,20,23,26), k+1] <- round(as.numeric(table.res.out.neg.W1[1, k+1])/as.numeric(table.res.out.neg.W1[c(3,6,9,12,15,18,21,24), k+1])*100,1) # the s.d. } ``` ```{r table3.neg.r, results='asis', message=FALSE, echo=FALSE} kable(table.res.out.neg.W1) ``` ### h. Mispecification of the weight matrix ### In this section, we used the same simulation processes except that we choose one of the spatial weight matrix $W_1$ or $W_2$ for simulating $Y$ and we use the other spatial weight matrix for estimating and predicting. #### The 6 configurations when $W_2$ is used for simulating and $W_1$ is used for estimating and predicting ($\sigma=1$ and $EM=FALSE$) (Not included in the article) #### ```{r,echo=F} load(url(paste(URL, "result/res-out-inv.RData", sep=""))) nb.poss <- prod(unlist(lapply(list.param, length))) # the array with the different possibilities arr.poss <- array(1:nb.poss, unlist(lapply(list.param, length))) ``` ```{r,echo=F} table.res.inv.1 <- matrix(0, 26, 7) colnames(table.res.inv.1 ) <- c("predictors", "config 1 (27)", "config 2 (27)", "config 3 (27)", "config 1 (54)", "config 2 (54)","config 3 (54)") table.res.inv.1[,1] <- c("BP","","BPn","","BP/BPn","BPw","","BP/BPw","BP1","","BP/BP1","BPn1","","BP/BPn1","BPw1","","BP/BPw1","TS1","","BP/TS1","TC","","BP/TC","","TC1","BP/TC1") for(k in 1:6) { table.res.inv.1[c(1,3,6,9,12,15,18,21,24), k+1] <- round(result.inv[[arr.poss[1,1,,1][k]]]$MSE[c(1,8,6,2,9,7,5,3,4)],3) #the MSE table.res.inv.1[c(2,4,7,10,13,16,19,22,25), k+1] <- paste("(", round(result.inv[[arr.poss[1,1,,1][k]]]$sd[c(1,8,6,2,9,7,5,3,4)],3),")",sep="") # the s.d. table.res.inv.1[c(5,8,11,14,17,20,23,26), k+1] <- round(as.numeric(table.res.inv.1[1, k+1])/as.numeric(table.res.inv.1[c(3,6,9,12,15,18,21,24), k+1])*100,1) # the s.d. } ``` ```{r table3.r.inv, results='asis', message=FALSE, echo=FALSE} kable(table.res.inv.1) ``` **Comments:** * $BP$, $BPw$, $TS$ and $TC$ are nearly similar and fit perfectly well with $Y$. * $BP1$, $BPw1$, $BPn1$ are nearly similar * When the data have been simulated with $W$ dense, one could use $W$ sparse or dense for predicting and all the predictors will yield a good fit. #### The 6 configurations when $W_1$ is used for simulating, $W_2$ is used for estimating and predicting ($\sigma=1$ and $EM=FALSE$) (Not included in the article) #### ```{r,echo=F} table.res.inv.2 <- matrix(0, 26, 7) colnames(table.res.inv.2) <- c("predictors", "config 1 (27)", "config 2 (27)", "config 3 (27)", "config 1 (54)", "config 2 (54)","config 3 (54)") table.res.inv.2[,1] <- c("BP","","BPn","","BP/BPn","BPw","","BP/BPw","BP1","","BP/BP1","BPn1","","BP/BPn1","BPw1","","BP/BPw1","TS1","","BP/TS1","TC","","BP/TC","","TC1","BP/TC1") for(k in 1:6) { table.res.inv.2[c(1,3,6,9,12,15,18,21,24), k+1] <- round(result.inv[[arr.poss[2,1,,1][k]]]$MSE[c(1,8,6,2,9,7,5,3,4)],3) #the MSE table.res.inv.2[c(2,4,7,10,13,16,19,22,25), k+1] <- paste("(", round(result.inv[[arr.poss[2,1,,1][k]]]$sd[c(1,8,6,2,9,7,5,3,4)],3),")",sep="") # the s.d. table.res.inv.2[c(5,8,11,14,17,20,23,26), k+1] <- round(as.numeric(table.res.inv.2[1, k+1])/as.numeric(table.res.inv.2[c(3,6,9,12,15,18,21,24), k+1])*100,1) # the s.d. } ``` ```{r table3.rc.inv, results='asis', message=FALSE, echo=FALSE} kable(table.res.inv.2) ``` **Comments:** When we estimate and predict with $W_2$ (whereas the ``true'' spatial weight matrix is $W_1$), we can observe that the MSE are larger than the case we estimate and predict with $W_1$ (see table 1 in section **Simulation algorithms and results**) If we look at the ranking, we can observe that there are more differences between the predictors: * $BP \simeq BPn$ < $TS$ < $TC$ * $BP1 \simeq BPn1$ < $BPw1$ ### i. Influence of the number of missing values in the neighborhood of $O$ (Table 8 in the article) ### We define the structure of the missing values depending on the number of neighbours. ```{r} config.0.missing <- 220 set.seed(12345) samp.neighbourhood <- sample(W1.listw$neighbours[[config.0.missing]]) config.1.missing <- c(config.0.missing, samp.neighbourhood[1]) config.2.missing <- c(config.1.missing, samp.neighbourhood[2]) config.3.missing <- c(config.2.missing, samp.neighbourhood[3]) config.4.missing <- c(config.3.missing, samp.neighbourhood[4]) config.5.missing <- c(config.4.missing, samp.neighbourhood[5]) config.6.missing <- c(config.5.missing, samp.neighbourhood[6]) config.7.missing <- c(config.6.missing, samp.neighbourhood[7]) config.8.missing <- c(config.7.missing, samp.neighbourhood[8]) config.9.missing <- c(config.8.missing, samp.neighbourhood[9]) ``` We plot in yellow the unit we want to predict and in red the neighborhood which will be supposed missing. It means that when using $W_1$ (the sparsest), and when considering 9 neighbours as missing, we will use only one neighbour to predict the yellow unit. ```{r} plot(mapMAP) plot(mapMAP[config.9.missing,], col = "red", add = T) plot(mapMAP[220,], col = "yellow", add = T) ``` We look at the prediction depending on the 2 spatial weight matrices $W_1$ and $W_2$ ```{r} list.param <- list(W=list(W1.listw, W2.listw), sigma=1, config = list(config.0.missing, config.1.missing, config.2.missing, config.3.missing, config.4.missing, config.5.missing, config.6.missing, config.7.missing, config.8.missing, config.9.missing), EM = FALSE) save(list.param, beta0, x, file=paste(path.dir, "parameters/param-missing.RData", sep="")) ``` ```{r,echo=F} load(url(paste(URL, "result/res-missing.RData", sep = ""))) table.res.miss <- matrix(0, 20, 12) table.res.miss[,2] <- rep(0:9, 2) table.res.miss[,1] <- rep(c("$W_1$","$W_2$"), each=10) for(k in 1:10) { table.res.miss[k,3] <- result.miss[[2*(k-1)+1]][[1]][1] table.res.miss[10+k, 3] <- result.miss[[2*k]][[1]][1] table.res.miss[k,c(5:12)] <- round(result.miss[[2*(k-1)+1]][[1]][1]/result.miss[[2*(k-1)+1]][[1]][c(8, 6, 2, 9, 7, 5, 3, 4)],3) table.res.miss[10+k,c(5:12)] <- round(result.miss[[2*k]][[1]][1]/result.miss[[2*k]][[1]][c(8, 6, 2, 9, 7, 5, 3, 4)],3) table.res.miss[k, 4] <- paste("(", result.miss[[2*(k-1)+1]][[2]][1], ")", sep="") table.res.miss[10+k, 4] <- paste("(", result.miss[[2*k]][[2]][1], ")", sep="") } table.res.miss[,5:12] <- paste(as.numeric(table.res.miss[,5:12])*100) colnames(table.res.miss) <- c("W", "missing","BP", "sd", "BP/BPn","BP/BPw","BP/BP1","BP/BPn1","BP/BPw1","BP/TS1","BP/TC","BP/TC1") ``` ```{r table4.missing, results='asis', message=FALSE, echo=FALSE} kable(table.res.miss) ```