This document was generated directly from RStudio using Markdown tool. It presents the R codes used to obtain the simulation results included in the paper ``About predictions in spatial autoregressive models: Optimal and almost optimal strategies’’. The source of this page (with all R codes used) is available here. To cite this work, please use :

Goulard M., Laurent T., and Thomas-Agnan C. (2017). About predictions in spatial autoregressive models: Optimal and almost optimal strategies. Spatial Economic Analysis, 12.

Before starting

Install the required packages:

install.packages(c("RColorBrewer", "rgdal", "spdep")) 

Load the packages :

require("rgdal")
require("spdep")
require("RColorBrewer")  

Set working directory:

path.dir <- "/media/laurent/DATAPART1/site web/code/GLT"

1. Preparation of the data

Import the Midi-Pyrénées communes boundaries into R :

mapMAP <- readOGR(dsn = "contours", layer = "ADTCAN_region")

We convert the type of the identification units into numeric values:

mapMAP@data$CODE <- as.numeric(as.character(mapMAP@data$CODE))

The number of observations equals to \(n\):

n <- nrow(mapMAP)

We consider two spatial weight matrices:

coords <- coordinates(mapMAP)
W1.listw <- nb2listw(knn2nb(knearneigh(coords, 10)), 
 style = "W")
W1.mat <- listw2mat(W1.listw)
W.nb <-  dnearneigh(coords, 0, 125000, longlat = FALSE)
dlist <- nbdists(W.nb, coords)
dlist.c <- lapply(dlist, function(x) 1/x)
W2.listw <- nb2listw(W.nb, glist = dlist.c, style = "W")
W2.mat <- listw2mat(W2.listw)

Vizualisation of the two matrices (not included in the article) :

2. Simulation framework for in-sample predictions

a. Simulation of a SAR process

We simulate three explanatory variables as follows. \(X_1\) follows a gaussian distribution \(\mathcal{N}(15,3)\), \(X_2\) follows (up to a constant) a binomial distribution \(\mathcal{B}(100,0.45)/100\) and \(X_3\) follows a log-uniform distribution \(log(\mathcal{U}_{[0,283]})\).

x1 <- rnorm(n, 15, 3)
x2 <- rbinom(n, 100, 0.45)/100
x3 <- log(round(runif(n, 1, n),0))
x <- cbind(rep(1, n), x1, x2, x3)

We use the following spatial autoregressive LAG regression model to generate the dependent variable

\[ \mathbf{Y}=(\mathbf{I}-\rho \mathbf{W})^{-1}(\beta_0\mathbf{1_n} + \mathbf{X_1} \beta_1+ \mathbf{X_2} \beta_2+ \mathbf{X_3} \beta_3 + \boldsymbol{\epsilon}) ~~\text{where $\boldsymbol{\epsilon} \sim \mathcal{N}(0,\sigma^2 \mathbf{I_n}),$} \]

where \(\mathbf{1_n}= (1,\cdots,1)' \in \mathbb{R}^n\), with fixed \(\beta=(\beta_0,~\beta_1,~\beta_2, ~\beta_3)'=(5,~0.25,~6,~1)'\),

beta0 <- c(5, 1/4, 6, 1)

The different choices of parameters

We consider the following parameter values in the simulation process :

  • \(\rho=-0.4,~-0.2,~0.05,~0.2,~0.35,~0.5,~0.65,~0.8,~0.9\),
rho <- c(-0.4, -0.2, 0.05, 0.2, 0.35, 0.5, 0.65, 0.8, 0.9)
  • \(\sigma=1,~3\),
sigma <- c(1, 3)
  • \(W=W_1\) or \(W=W_2\).

For example, for \(\rho=0.35\), \(\sigma=1\) and \(W_1\), the simulation of \(Y\) is as follows:

n <- nrow(x)
A <- diag(1,n) - 0.35*W1.mat
Y0 <- x%*%beta0 + rnorm(n, 0, 1)
y <- solve(A, Y0)

We have 9 possible values for \(\rho\), 2 possible values for \(\sigma\) and 2 possible spatial weight matrices \(W_1\) or \(W_2\), hence we obtain 36 different possible combinations. We save these parameters in the file param-in.RData included in a user-choosen directory and which will be needed later for executing the simulation program:

list.insample.param <- list(rho = rho, 
 W = list(W1.listw, W2.listw), sigma = sigma)
if(!dir.exists(paste(path.dir, "parameters", sep = ""))) 
  dir.create(paste(path.dir, "parameters", sep = ""))
save(list.insample.param, beta0, x, 
  file=paste(path.dir, "parameters/param-in.RData", sep = ""))

For each of the combinations, we draw 1000 samples, estimate a SAR model with the lagsarlm function (to obtain \(\hat\rho\), \(\hat\beta\) and \(\hat\sigma\)) and compute the following predictors :

  • \(\mathbf{\check{Y}^{TS}} = \mathbf{X} \boldsymbol{\hat\beta} + \hat\rho \mathbf{W}\mathbf{Y}\)

  • \(\mathbf{\check{Y}^{TC}} = (\mathbf{I}-\hat \rho \mathbf{W})^{-1} \mathbf{X} \boldsymbol{\hat\beta}\)

  • \(\mathbf{\check{Y}^{BP}} = (\mathbf{I}-\rho \mathbf{W})^{-1}\mathbf{X} \boldsymbol{\hat\beta} -\text{Diag}(\mathbf{\hat{Q}})^{-1}\mathbf{\tilde{\hat{Q}}}(\mathbf{Y}-(\mathbf{I}- \rho \mathbf{W})^{-1}\mathbf{X} \boldsymbol{\hat \beta})\)

b. Computing the predictors with spdep

All the prediction formulas cited in our paper are implemented in the releases later than 0.5-92 of the spdep R package (Bivand and Piras, 2015), following the GSoC 2015 project by M. Gubri.

don <- data.frame(y, x1, x2, x3) 
sar.res <- spdep::lagsarlm(y~., data=don, W1.listw, 
 method="eigen") 
y_TC = predict.sarlm(sar.res, listw = W1.listw, 
 pred.type = "TC", all.data = T, zero.policy = F) 
y_TS =  predict.sarlm(sar.res, listw = W1.listw, 
 pred.type = "TS", all.data = T, zero.policy = F) 
y_BP = predict.sarlm(sar.res, listw = W1.listw, 
 pred.type = "BP", all.data=T, zero.policy=F) 

c. Results

The in-sample prediction simulation codes are given here and can be executed using parallel processing. We summarize the results and obtain the following table:

  • comparaison of the MSE depending on \(W_1\) or \(W_2\) with \(\sigma=1\) (Table 2 in the article):
rho BP (W1) TS (W1) TC (W1) BP (W2) TS (W2) TC (W2)
-0.4 0.9556 0.9761 1.0159 0.965 0.9758 0.9872
(0.0876) (0.0869) (0.0919) (0.0815) (0.08) (0.081)
-0.2 0.9711 0.9779 0.9917 0.9703 0.9783 0.9864
(0.0839) (0.0839) (0.0861) (0.0819) (0.081) (0.0817)
0.05 0.979 0.9801 0.9832 0.9738 0.9777 0.9824
(0.0806) (0.0804) (0.0804) (0.0838) (0.0832) (0.0834)
0.2 0.9795 0.9828 0.995 0.9754 0.9781 0.9818
(0.0842) (0.0844) (0.0868) (0.0833) (0.083) (0.0833)
0.35 0.975 0.9852 1.0257 0.9804 0.9823 0.9858
(0.082) (0.0824) (0.09) (0.0849) (0.0848) (0.0851)
0.5 0.9576 0.9801 1.0901 0.9818 0.9843 0.991
(0.0829) (0.0827) (0.1038) (0.083) (0.0828) (0.0839)
0.65 0.9474 0.9839 1.2386 0.9808 0.984 0.9965
(0.0845) (0.0849) (0.1459) (0.085) (0.0854) (0.0889)
0.8 0.9233 0.9805 1.6808 0.98 0.9849 1.0085
(0.0854) (0.0864) (0.3106) (0.0852) (0.0855) (0.0946)
0.9 0.9173 0.9897 2.8831 0.9801 0.9882 1.0363
(0.0821) (0.0834) (0.9097) (0.0828) (0.0835) (0.1133)
  • comparaison of the ratio of MSE (BP/TS and BP/TC) depending on \(W_1\) or \(W_2\) with \(\sigma=1\) (Table 3 in the article):
rho BP_TS (W1) BP_TC (W1) BP_TS (W2) BP_TC (W2)
-0.4 0.9789 0.9406 0.9889 0.9775
-0.2 0.9931 0.9792 0.9919 0.9837
0.05 0.9989 0.9958 0.996 0.9912
0.2 0.9967 0.9845 0.9972 0.9934
0.35 0.9896 0.9505 0.9981 0.9945
0.5 0.977 0.8784 0.9975 0.9908
0.65 0.963 0.7649 0.9968 0.9842
0.8 0.9417 0.5493 0.995 0.9717
0.9 0.9268 0.3182 0.9918 0.9457
  • comparaison of the MSE depending on \(W_1\) or \(W_2\) with \(\sigma=3\) (results not included in the article):
rho BP (W1) TS (W1) TC (W1) BP (W2) TS (W2) TC (W2)
-0.4 8.6065 8.8094 9.2024 8.6877 8.8092 8.9363
(0.7922) (0.7785) (0.8198) (0.7737) (0.7516) (0.76)
-0.2 8.763 8.8365 8.9911 8.7067 8.7907 8.882
(0.751) (0.7433) (0.7557) (0.7425) (0.7204) (0.7197)
0.05 8.7988 8.817 8.8643 8.7324 8.7852 8.8455
(0.7367) (0.735) (0.737) (0.7705) (0.7587) (0.7598)
0.2 8.8408 8.8834 9.0227 8.786 8.8219 8.8689
(0.7623) (0.7638) (0.787) (0.7799) (0.7732) (0.7726)
0.35 8.7581 8.858 9.242 8.7828 8.8112 8.8629
(0.7838) (0.7837) (0.8477) (0.7512) (0.7486) (0.7522)
0.5 8.6435 8.8419 9.7985 8.8552 8.8871 8.9589
(0.7511) (0.7499) (0.9134) (0.7649) (0.7628) (0.7753)
0.65 8.5424 8.8904 11.2388 8.8226 8.8616 8.9881
(0.7327) (0.7402) (1.308) (0.7667) (0.7693) (0.808)
0.8 8.3666 8.9007 15.4022 8.7824 8.8404 9.0922
(0.7707) (0.7947) (2.8512) (0.7438) (0.7471) (0.8504)
0.9 8.2405 8.919 26.7887 8.8314 8.9092 9.3311
(0.7613) (0.781) (8.8693) (0.7749) (0.7775) (0.9564)
  • comparaison of the MSE ratio BP/TS and BP/TC depending on \(W_1\) or \(W_2\) with \(\sigma=3\) (results not included in the article):
rho BP_TS (W1) BP_TC (W1) BP_TS (W2) BP_TC (W2)
-0.4 0.977 0.9353 0.9862 0.9722
-0.2 0.9917 0.9746 0.9904 0.9803
0.05 0.9979 0.9926 0.994 0.9872
0.2 0.9952 0.9798 0.9959 0.9906
0.35 0.9887 0.9476 0.9968 0.991
0.5 0.9776 0.8821 0.9964 0.9884
0.65 0.9609 0.7601 0.9956 0.9816
0.8 0.94 0.5432 0.9934 0.9659
0.9 0.9239 0.3076 0.9913 0.9464

3. Simulation framework for out-of-sample predictions

a. Simulation process

We choose at random a given number of sites (\(27\) or \(54\)) which will be declared out-of-sample (in \(O\)). We predict the \(\mathbf{Y}\) variable on the out-of-sample locations based on the sample \(S\) constituted by the remaining sites. We consider several situations depending upon the number of out-of-sample units and upon the aggregation level of the out-of-sample units. The corresponding configurations are:

27 out-of-sample units

(R codes)

set.seed(18) 
config1.27 <- sample(1:283, 27)

config2.27 <- c(72, 180, 269, 155, 275, 143, 259) 
for(p in 1:4)
 {set.seed(p)
  config2.27 <- c(config2.27, 
   sample(W1.listw$neighbours[[config2.27[p]]],5))
 }

config3.27 <- c(132, 59, 225)   
for(p in 1:3)
 {set.seed(p+1)
  config3.27 <- c(config3.27, 
   sample(W1.listw$neighbours[[config3.27[p]]],8))
 }

54 out-of-sample units

(R codes)

set.seed(15)
config1.54 <- sample(1:283,54)  

config2.54 <- c(19, 76, 137, 257, 214, 52, 39, 163, 168) 
for(p in 1:9)
 {set.seed(p+1)
  config2.54 <- c(config2.54, 
   sample(W1.listw$neighbours[[config2.54[p]]],5))
 }

config3.54 <- c(6, 145, 278, 229, 26, 158) 
for(p in 1:6)
 {set.seed(p)
  config3.54 <- c(config3.54, 
   sample(W1.listw$neighbours[[config3.54[p]]],8))
}

Graphical representation of the 27 and 54 out-of-sample units

(Figure 3 in the article)

b. Out-of-sample predictors formulae

  • \(\mathbf{\hat Y_O^{BP}} = \mathbf{\hat Y_O^{TC}} - \mathbf{\hat{Q}_{OO}}^{-1}\mathbf{\hat{Q}_{OS}} \times (\mathbf{Y_S}-\mathbf{\hat Y_S^{TC}})\)

  • \(\mathbf{\hat Y_O^{TC}} = [(\mathbf{I} - \hat \rho \mathbf{W})^{-1}\mathbf{X} \boldsymbol{\hat\beta}]_O\)

  • \(\hat Y_o^{TS^{1}} = \mathbf{X_o}\boldsymbol{\hat\beta} + \hat\rho \mathbf{W_{oS}}\mathbf{Y_S}\)

  • \(\mathbf{\hat Y_O^{BP_{W}}}= \mathbf{\hat Y_{O}^{TC}}+\boldsymbol{\hat{\Sigma}_{OS}}\mathbf{W_{OS}'}(\mathbf{W_{OS}}\boldsymbol{\hat{\Sigma}_{SS}} \mathbf{W_{OS}'})^{-1}(\mathbf{W_{OS}}\mathbf{Y_S} - \mathbf{W_{OS}}\mathbf{\hat Y_{S}^{TC}})\)

  • \(\mathbf{\hat Y_O^{BP_N}}= \mathbf{\hat Y_O^{TC}}-\mathbf{\hat{Q}_{OO}}^{-1}\mathbf{\hat{Q}_{OJ}} \mathbf{(Y_J - \hat Y^{TC}_J )_J}\) for \(J\) set of indices of neighbors of \(O\)

  • \(\hat Y_o^{TC^1}\), row \(o\) of
    \[ \{ \mathbf{I_{n_S+1}} - \hat \rho \begin{pmatrix} \mathbf{W_{SS}}& \mathbf{W_{So}} \\ \mathbf{W_{oS}}& \mathbf{W_{oo}} \end{pmatrix}\}^{-1} \begin{pmatrix} \mathbf{X_S} \\ \mathbf{X_o} \end{pmatrix} \boldsymbol{\hat \beta}\]

  • \(\hat Y_o^{BP^1} = \hat Y_o^{TC^1} - \hat{Q}_{oo}^{-1}\mathbf{\hat{Q}_{oS}}(\mathbf{Y_S} - \mathbf{\hat Y_S^{TC^1}})\)

  • \(\hat Y_o^{BP_W^1} = \hat Y_{o}^{TC^1}+\mathbf{\hat{\Sigma}_{oS}}\mathbf{W_{oS}'}(\mathbf{W_{oS}}\boldsymbol{\hat{\Sigma}_{SS}} \mathbf{W_{oS}'})^{-1}(\mathbf{W_{oS}}\mathbf{Y_S} - \mathbf{W_{oS}}\mathbf{\hat Y_{S}^{TC^1}})\)

  • \(BP_N^1\) & \(\hat Y_o^{BP_N^1} = \hat Y_o^{TC}-\hat{Q}_{oo}^{-1}\mathbf{\hat{Q}_{oJ}} (\mathbf{Y_J} - \mathbf{\hat Y^{TC^1}_J} )_J\) for \(J\) set of indices of neighbors of \(o\)

c. Computing the predictions with spdep

We define the indices of the out-of-sample units and build the \(S\) and \(O\) sample :

outsamp <- config1.54 
S.sample <- don[-outsamp, ]
O.sample <- don[outsamp, ]

We estimate a SAR model on the \(S\) sample. For that, we first have to define the \(W_S\) matrix:

WS.listw <- subset.listw(W1.listw, !(1:n %in% outsamp))
sar.res <- lagsarlm(y~., data = S.sample, WS.listw, 
 method = "eigen")

We predict the out-of-sample \(O\) values of \(Y\) by using predict.sarlm() function. Option pred.type= gives the predictors type :

y_BP = as.numeric(predict.sarlm(sar.res, listw = W1.listw, 
 newdata = O.sample, zero.policy = T, 
 pred.type = "BP", power = TRUE))
y_BP1 = as.numeric(predict.sarlm(sar.res, listw = W1.listw, 
 newdata = O.sample, zero.policy = T, 
 pred.type = "BP1", power = TRUE))
y_TC = as.numeric(predict.sarlm(sar.res, listw = W1.listw, 
 newdata = O.sample, zero.policy = T,
 pred.type = "TC", power = TRUE))
y_TS1 = as.numeric(predict.sarlm(sar.res, listw = W1.listw, 
 newdata = O.sample, zero.policy = T,
 pred.type = "TS1", power = TRUE))
y_BPW = as.numeric(predict.sarlm(sar.res, listw = W1.listw, 
 newdata = O.sample, zero.policy = T, 
 pred.type = "BPW", power = TRUE))
y_BPN = as.numeric(predict.sarlm(sar.res, listw = W1.listw, 
 newdata = O.sample, zero.policy = T, 
 pred.type = "BPN", power = TRUE))
y_BPW1 = as.numeric(predict.sarlm(sar.res, listw = W1.listw, 
 newdata = O.sample, zero.policy = T, 
 pred.type = "BPW1", power = TRUE))
y_BPN1 = as.numeric(predict.sarlm(sar.res, listw = W1.listw, 
 newdata = O.sample, zero.policy = T, 
 pred.type = "BPN1", power = TRUE))

d. The different choices of parameters in the simulations

In the simulations, we fix \(\rho=0.35\). The different parameters we will test are:

  • 2 possible values for \(\sigma\) (\(\sigma=1,~3\)).

  • 2 possible spatial weight matrices \(W_1\) and \(W_2\).

  • 6 different configurations for out-of-sample units.

  • 2 different methods for estimating the parameters: one using ML, one using EM.

Thus, we obtain 48 different possible combinations whose parameters are saved in the param-out.RData file which will be used to run the simulations:

list.param <- list(W = list(W1.listw, W2.listw), 
  sigma = sigma, config = list(config1.27, config2.27, 
  config3.27,config1.54, config2.54, config3.54),
  EM = c(FALSE, TRUE))
save(list.param, beta0, x, 
     file = paste(path.dir, "parameters/param-out.RData", 
     sep = ""))

e. EM or not EM: that is the question?

Among the different configurations presented previously, we will first focus on the estimation of the parameters to check if we can notice a strong difference when estimating by EM (see R codes uses here). We do not consider all the cases, but the following ones :

  • 54 missing units: randomly distributed or strongly clustered

  • \(\sigma=1,~3\)

  • \(W=W_1,~W_2\)

\(\sigma=1\), \(W_1\)

(Table 5 in the article)

ML (random) EM (random) ML (clustered) EM (clustered)
beta0 6.663 5.732 6.508 5.532
(2.167) (2.226) (2.319) (2.419)
beta1 0.25 0.251 0.25 0.251
(0.024) (0.023) (0.024) (0.024)
beta2 5.874 5.958 5.972 5.961
(1.481) (1.479) (1.474) (1.472)
beta3 1.004 1 1.008 0.999
(0.062) (0.062) (0.07) (0.069)
rho 0.284 0.32 0.287 0.328
(0.084) (0.085) (0.089) (0.094)
sigma2 0.991 1.202 0.991 1.104
(0.097) (3.238) (0.095) (0.33)

\(\sigma=1\), \(W_2\)

(Table 5 in the article)

ML (random) EM (random) ML (clustered) EM (clustered)
beta0 11.185 10.191 11.113 10.054
(8.253) (6.752) (7.679) (6.886)
beta1 0.251 0.251 0.25 0.251
(0.023) (0.023) (0.024) (0.024)
beta2 5.99 5.986 5.973 5.975
(1.476) (1.475) (1.479) (1.481)
beta3 1 0.999 0.999 0.999
(0.062) (0.062) (0.07) (0.069)
rho 0.097 0.138 0.1 0.144
(0.335) (0.273) (0.309) (0.278)
sigma2 0.982 1.791 0.98 1.575
(0.096) (6.449) (0.093) (3.923)

\(\sigma=3\), \(W_1\)

(Not included in the article)

ML (random) EM (random) ML (clustered) EM (clustered)
beta0 6.998 6.002 6.901 5.821
(3.69) (3.91) (3.783) (4.014)
beta1 0.253 0.253 0.252 0.253
(0.071) (0.07) (0.073) (0.073)
beta2 5.773 5.866 5.885 5.883
(4.435) (4.442) (4.422) (4.411)
beta3 1.003 0.998 1.005 0.998
(0.186) (0.186) (0.209) (0.207)
rho 0.271 0.31 0.272 0.317
(0.113) (0.125) (0.115) (0.129)
sigma2 8.898 9.951 8.88 9.244
(0.874) (8.96) (0.852) (0.998)

\(\sigma=3\), \(W_2\)

(Not included in the article)

ML (random) EM (random) ML (clustered) EM (clustered)
beta0 14.736 13.318 14.187 13.346
(11.906) (9.947) (11.155) (10.41)
beta1 0.253 0.253 0.252 0.252
(0.07) (0.07) (0.073) (0.073)
beta2 5.905 5.905 5.894 5.894
(4.412) (4.422) (4.431) (4.439)
beta3 0.998 0.997 0.996 0.996
(0.187) (0.187) (0.208) (0.208)
rho -0.047 0.011 -0.024 0.01
(0.472) (0.393) (0.439) (0.41)
sigma2 8.819 10.909 8.807 9.499
(0.871) (17.525) (0.836) (8.254)

Comments:

  • the parameters estimates do not vary a lot between ML and EM algorithm.

  • estimations of \(\beta_1\), \(\beta_2\) and \(\beta_3\) are good whatever the choice for \(W_1\) and \(W_2\), \(\sigma=1,~3\).

  • estimations of \(\sigma\) is good.

  • when \(W_2\) is dense, estimation of \(\rho\) is not good, especially when \(\sigma=3\)

f. Simulation algorithms and results when \(\rho = 0.35\)

For each of the 48 combinations, we draw 1000 samples, estimate a SAR model with the lagsarlm (with or without EM algorithm) and we compute the previous predictors. The out-of-sample prediction simulation codes are given here and can be executed using parallel processing. We summarize the results and obtain the following table:

The 6 configurations with \(\sigma=1\), \(W_1\) and \(EM=FALSE\) (Table 6 in the article)

predictors config 1 (27) config 2 (27) config 3 (27) config 1 (54) config 2 (54) config 3 (54)
BP 1.009 1.033 1.034 1.016 1.03 1.036
(0.276) (0.291) (0.277) (0.194) (0.196) (0.202)
BPn 1.009 1.034 1.034 1.017 1.031 1.036
(0.276) (0.292) (0.277) (0.194) (0.196) (0.203)
BP/BPn 100 99.9 100 99.9 99.9 100
BPw 1.011 1.036 1.035 1.017 1.031 1.037
(0.277) (0.292) (0.277) (0.194) (0.196) (0.203)
BP/BPw 99.8 99.7 99.9 99.9 99.9 99.9
BP1 1.011 1.042 1.061 1.019 1.04 1.053
(0.276) (0.293) (0.288) (0.194) (0.196) (0.209)
BP/BP1 99.8 99.1 97.5 99.7 99 98.4
BPn1 1.012 1.046 1.064 1.021 1.041 1.056
(0.277) (0.294) (0.289) (0.194) (0.197) (0.21)
BP/BPn1 99.7 98.8 97.2 99.5 98.9 98.1
BPw1 1.014 1.046 1.065 1.022 1.042 1.056
(0.277) (0.294) (0.289) (0.194) (0.197) (0.209)
BP/BPw1 99.5 98.8 97.1 99.4 98.8 98.1
TS1 1.023 1.048 1.067 1.029 1.046 1.055
(0.278) (0.292) (0.289) (0.196) (0.198) (0.209)
BP/TS1 98.6 98.6 96.9 98.7 98.5 98.2
TC 1.06 1.076 1.067 1.06 1.07 1.07
(0.286) (0.301) (0.284) (0.203) (0.204) (0.212)
BP/TC 95.2 96 96.9 95.8 96.3 96.8
1.064 1.085 1.104 1.067 1.084 1.088
TC1 (0.286) (0.3) (0.3) (0.204) (0.206) (0.217)
BP/TC1 94.8 95.2 93.7 95.2 95 95.2

Comments:

  • standard deviations are comparable across predictors.

  • \(BPw\) and \(BPn\) seem nearly equivalent to \(BP\)

  • \(BP1\) is slightly less efficient than \(BP\) and it seems less good when the predicted sites are clustered. It is always better than \(TC\) and \(TS\).

  • \(TC\) is less efficient than \(BP\), it seems better when the predicted sites are clustered.

  • \(TS\) is less efficient than \(BP\) but better than \(TC\). It becomes less efficient when the predicted sites are clustered.

  • \(TC1\) is close to \(TC\), but a little bit less efficient when predicted sites are clustered.

  • \(BPw1\) and \(BPn1\) are slightly better than \(TS\).

The 6 configurations with \(\sigma=1\), \(W_2\) and \(EM=FALSE\) (Table 7 in the article)

It seems that when \(W\) is denser, all the predictors are close to \(BP\).

predictors config 1 (27) config 2 (27) config 3 (27) config 1 (54) config 2 (54) config 3 (54)
BP 1.018 1.033 1.02 1.02 1.026 1.025
(0.275) (0.285) (0.269) (0.195) (0.195) (0.199)
BPn 1.018 1.033 1.02 1.02 1.026 1.025
(0.275) (0.285) (0.269) (0.195) (0.195) (0.199)
BP/BPn 100 100 100 100 100 100
BPw 1.018 1.033 1.02 1.02 1.026 1.025
(0.275) (0.286) (0.269) (0.195) (0.195) (0.199)
BP/BPw 100 100 100 100 100 100
BP1 1.019 1.033 1.02 1.022 1.027 1.025
(0.275) (0.286) (0.269) (0.195) (0.194) (0.199)
BP/BP1 99.9 100 100 99.8 99.9 100
BPn1 1.019 1.033 1.02 1.022 1.027 1.025
(0.275) (0.286) (0.269) (0.195) (0.194) (0.199)
BP/BPn1 99.9 100 100 99.8 99.9 100
BPw1 1.019 1.033 1.02 1.022 1.027 1.025
(0.275) (0.286) (0.269) (0.195) (0.194) (0.199)
BP/BPw1 99.9 100 100 99.8 99.9 100
TS1 1.018 1.031 1.019 1.018 1.024 1.024
(0.275) (0.285) (0.268) (0.195) (0.194) (0.198)
BP/TS1 100 100.2 100.1 100.2 100.2 100.1
TC 1.019 1.033 1.02 1.019 1.026 1.024
(0.275) (0.285) (0.268) (0.195) (0.194) (0.198)
BP/TC 99.9 100 100 100.1 100 100.1
1.019 1.032 1.02 1.019 1.025 1.024
TC1 (0.275) (0.284) (0.267) (0.195) (0.194) (0.198)
BP/TC1 99.9 100.1 100 100.1 100.1 100.1

The 6 configurations with \(\sigma=1\), \(W_1\) and \(EM=TRUE\) (Not included in the article)

It seems that the fact that \(\beta\), \(\rho\) and \(\sigma\) are estimated with the EM algorithm do not affect the prediction. Indeed, the predictors are very close to that of the first situation.

predictors config 1 (27) config 2 (27) config 3 (27) config 1 (54) config 2 (54) config 3 (54)
BP 1.008 1.032 1.033 1.015 1.029 1.035
(0.276) (0.291) (0.277) (0.194) (0.195) (0.201)
BPn 1.009 1.033 1.033 1.015 1.029 1.035
(0.276) (0.291) (0.277) (0.194) (0.195) (0.202)
BP/BPn 99.9 99.9 100 100 100 100
BPw 1.01 1.034 1.034 1.016 1.03 1.036
(0.277) (0.292) (0.277) (0.194) (0.196) (0.202)
BP/BPw 99.8 99.8 99.9 99.9 99.9 99.9
BP1 1.011 1.043 1.066 1.019 1.042 1.06
(0.277) (0.293) (0.29) (0.194) (0.197) (0.21)
BP/BP1 99.7 98.9 96.9 99.6 98.8 97.6
BPn1 1.012 1.047 1.068 1.02 1.044 1.063
(0.277) (0.295) (0.291) (0.194) (0.197) (0.212)
BP/BPn1 99.6 98.6 96.7 99.5 98.6 97.4
BPw1 1.014 1.048 1.07 1.021 1.044 1.063
(0.278) (0.294) (0.291) (0.194) (0.198) (0.211)
BP/BPw1 99.4 98.5 96.5 99.4 98.6 97.4
TS1 1.021 1.047 1.072 1.025 1.045 1.057
(0.278) (0.291) (0.291) (0.195) (0.197) (0.209)
BP/TS1 98.7 98.6 96.4 99 98.5 97.9
TC 1.06 1.073 1.065 1.058 1.067 1.069
(0.286) (0.3) (0.284) (0.203) (0.203) (0.211)
BP/TC 95.1 96.2 97 95.9 96.4 96.8
1.064 1.086 1.113 1.066 1.086 1.091
TC1 (0.287) (0.3) (0.305) (0.204) (0.206) (0.218)
BP/TC1 94.7 95 92.8 95.2 94.8 94.9

The 6 configurations with \(\sigma=3\), \(W_1\) and \(EM=FALSE\) (Not included in the article)

When \(\sigma=3\), it does not change the rank either the ratio between \(BP\) and other predictors.

predictors config 1 (27) config 2 (27) config 3 (27) config 1 (54) config 2 (54) config 3 (54)
BP 9.09 9.301 9.286 9.148 9.27 9.319
(2.482) (2.612) (2.486) (1.748) (1.761) (1.823)
BPn 9.092 9.308 9.286 9.152 9.273 9.323
(2.483) (2.614) (2.486) (1.748) (1.762) (1.825)
BP/BPn 100 99.9 100 100 100 100
BPw 9.104 9.322 9.293 9.157 9.279 9.331
(2.491) (2.617) (2.483) (1.747) (1.764) (1.827)
BP/BPw 99.8 99.8 99.9 99.9 99.9 99.9
BP1 9.093 9.324 9.352 9.152 9.29 9.382
(2.485) (2.619) (2.502) (1.743) (1.759) (1.852)
BP/BP1 100 99.8 99.3 100 99.8 99.3
BPn1 9.103 9.359 9.362 9.166 9.302 9.407
(2.489) (2.632) (2.504) (1.742) (1.761) (1.86)
BP/BPn1 99.9 99.4 99.2 99.8 99.7 99.1
BPw1 9.115 9.363 9.365 9.17 9.308 9.412
(2.493) (2.627) (2.5) (1.744) (1.767) (1.854)
BP/BPw1 99.7 99.3 99.2 99.8 99.6 99
TS1 9.187 9.363 9.355 9.22 9.322 9.382
(2.496) (2.609) (2.498) (1.758) (1.766) (1.842)
BP/TS1 98.9 99.3 99.3 99.2 99.4 99.3
TC 9.541 9.668 9.58 9.527 9.605 9.618
(2.573) (2.69) (2.55) (1.824) (1.829) (1.899)
BP/TC 95.3 96.2 96.9 96 96.5 96.9
9.547 9.668 9.607 9.533 9.616 9.635
TC1 (2.573) (2.677) (2.57) (1.826) (1.827) (1.905)
BP/TC1 95.2 96.2 96.7 96 96.4 96.7

Conclusion of this section

  • when \(W\) is sparse, \(BP_N\), \(BP_W\), \(BP_N^1\), \(BP_W^1\) are very close to \(BP\) and much better than \(TC\), \(TS\), \(TC^1\), \(TS^1\).

  • Neither the value of \(\sigma\) nor the fact that coefficients are estimated with EM algorithm change the rank of the predictors

  • When \(W\) is dense, all the predictors seem to be very similar.

g. Simulation algorithms and results when \(\rho = -0.2\)

We only present the results for the 6 configurations with \(\sigma=1\), \(W_1\) and \(EM=FALSE\) (the table is not included in the article).

predictors config 1 (27) config 2 (27) config 3 (27) config 1 (54) config 2 (54) config 3 (54)
BP 1.026 1.034 1.023 1.024 1.026 1.03
(0.281) (0.287) (0.271) (0.197) (0.195) (0.202)
BPn 1.026 1.035 1.023 1.024 1.026 1.03
(0.281) (0.287) (0.271) (0.197) (0.195) (0.202)
BP/BPn 100 99.9 100 100 100 100
BPw 1.027 1.035 1.023 1.025 1.026 1.03
(0.281) (0.287) (0.272) (0.198) (0.195) (0.201)
BP/BPw 99.9 99.9 100 99.9 100 100
BP1 1.028 1.042 1.039 1.029 1.034 1.047
(0.281) (0.287) (0.281) (0.199) (0.198) (0.206)
BP/BP1 99.8 99.2 98.5 99.5 99.2 98.4
BPn1 1.028 1.041 1.039 1.029 1.034 1.047
(0.281) (0.287) (0.281) (0.199) (0.198) (0.206)
BP/BPn1 99.8 99.3 98.5 99.5 99.2 98.4
BPw1 1.029 1.04 1.04 1.03 1.034 1.047
(0.28) (0.287) (0.281) (0.2) (0.198) (0.206)
BP/BPw1 99.7 99.4 98.4 99.4 99.2 98.4
TS1 1.028 1.037 1.035 1.028 1.031 1.042
(0.281) (0.287) (0.278) (0.199) (0.196) (0.204)
BP/TS1 99.8 99.7 98.8 99.6 99.5 98.8
TC 1.033 1.041 1.027 1.031 1.032 1.036
(0.283) (0.289) (0.272) (0.199) (0.196) (0.2)
BP/TC 99.3 99.3 99.6 99.3 99.4 99.4
1.035 1.044 1.036 1.034 1.036 1.045
TC1 (0.283) (0.29) (0.276) (0.2) (0.198) (0.203)
BP/TC1 99.1 99 98.7 99 99 98.6

h. Mispecification of the weight matrix

In this section, we used the same simulation processes except that we choose one of the spatial weight matrix \(W_1\) or \(W_2\) for simulating \(Y\) and we use the other spatial weight matrix for estimating and predicting.

The 6 configurations when \(W_2\) is used for simulating and \(W_1\) is used for estimating and predicting (\(\sigma=1\) and \(EM=FALSE\)) (Not included in the article)

predictors config 1 (27) config 2 (27) config 3 (27) config 1 (54) config 2 (54) config 3 (54)
BP 1.022 1.04 1.02 1.027 1.023 1.024
(0.279) (0.284) (0.271) (0.196) (0.196) (0.197)
BPn 1.022 1.039 1.019 1.027 1.023 1.024
(0.279) (0.284) (0.271) (0.196) (0.195) (0.197)
BP/BPn 100 100.1 100.1 100 100 100
BPw 1.022 1.04 1.019 1.027 1.023 1.024
(0.279) (0.284) (0.271) (0.196) (0.196) (0.197)
BP/BPw 100 100 100.1 100 100 100
BP1 1.023 1.043 1.025 1.029 1.028 1.03
(0.28) (0.285) (0.273) (0.196) (0.196) (0.199)
BP/BP1 99.9 99.7 99.5 99.8 99.5 99.4
BPn1 1.023 1.043 1.025 1.029 1.028 1.03
(0.28) (0.285) (0.272) (0.196) (0.196) (0.199)
BP/BPn1 99.9 99.7 99.5 99.8 99.5 99.4
BPw1 1.023 1.043 1.025 1.029 1.028 1.03
(0.279) (0.285) (0.272) (0.196) (0.196) (0.199)
BP/BPw1 99.9 99.7 99.5 99.8 99.5 99.4
TS1 1.021 1.04 1.023 1.026 1.024 1.027
(0.278) (0.284) (0.271) (0.196) (0.195) (0.198)
BP/TS1 100.1 100 99.7 100.1 99.9 99.7
TC 1.02 1.038 1.019 1.025 1.021 1.024
(0.277) (0.284) (0.27) (0.196) (0.195) (0.197)
BP/TC 100.2 100.2 100.1 100.2 100.2 100
1.021 1.039 1.022 1.026 1.023 1.026
TC1 (0.277) (0.285) (0.271) (0.196) (0.195) (0.198)
BP/TC1 100.1 100.1 99.8 100.1 100 99.8

Comments:

  • \(BP\), \(BPw\), \(TS\) and \(TC\) are nearly similar and fit perfectly well with \(Y\).

  • \(BP1\), \(BPw1\), \(BPn1\) are nearly similar

  • When the data have been simulated with \(W\) dense, one could use \(W\) sparse or dense for predicting and all the predictors will yield a good fit.

The 6 configurations when \(W_1\) is used for simulating, \(W_2\) is used for estimating and predicting (\(\sigma=1\) and \(EM=FALSE\)) (Not included in the article)

predictors config 1 (27) config 2 (27) config 3 (27) config 1 (54) config 2 (54) config 3 (54)
BP 1.045 1.071 1.063 1.052 1.055 1.064
(0.285) (0.299) (0.287) (0.204) (0.204) (0.209)
BPn 1.045 1.071 1.063 1.052 1.055 1.064
(0.285) (0.299) (0.287) (0.204) (0.204) (0.209)
BP/BPn 100 100 100 100 100 100
BPw 1.045 1.072 1.064 1.052 1.055 1.064
(0.285) (0.3) (0.287) (0.204) (0.204) (0.209)
BP/BPw 100 99.9 99.9 100 100 100
BP1 1.044 1.074 1.064 1.051 1.054 1.064
(0.285) (0.301) (0.288) (0.203) (0.204) (0.209)
BP/BP1 100.1 99.7 99.9 100.1 100.1 100
BPn1 1.043 1.074 1.063 1.05 1.054 1.064
(0.285) (0.3) (0.287) (0.203) (0.204) (0.209)
BP/BPn1 100.2 99.7 100 100.2 100.1 100
BPw1 1.048 1.079 1.066 1.054 1.057 1.067
(0.286) (0.302) (0.287) (0.204) (0.204) (0.209)
BP/BPw1 99.7 99.3 99.7 99.8 99.8 99.7
TS1 1.053 1.084 1.069 1.059 1.062 1.071
(0.286) (0.302) (0.288) (0.205) (0.205) (0.21)
BP/TS1 99.2 98.8 99.4 99.3 99.3 99.3
TC 1.081 1.094 1.079 1.078 1.076 1.081
(0.292) (0.302) (0.29) (0.21) (0.21) (0.212)
BP/TC 96.7 97.9 98.5 97.6 98 98.4
1.075 1.106 1.082 1.078 1.08 1.085
TC1 (0.291) (0.308) (0.291) (0.21) (0.21) (0.214)
BP/TC1 97.2 96.8 98.2 97.6 97.7 98.1

Comments:

When we estimate and predict with \(W_2\) (whereas the ``true’’ spatial weight matrix is \(W_1\)), we can observe that the MSE are larger than the case we estimate and predict with \(W_1\) (see table 1 in section Simulation algorithms and results)

If we look at the ranking, we can observe that there are more differences between the predictors:

  • \(BP \simeq BPn\) < \(TS\) < \(TC\)

  • \(BP1 \simeq BPn1\) < \(BPw1\)

i. Influence of the number of missing values in the neighborhood of \(O\) (Table 8 in the article)

We define the structure of the missing values depending on the number of neighbours.

  config.0.missing <- 220
  set.seed(12345)
  samp.neighbourhood <- sample(W1.listw$neighbours[[config.0.missing]])
  config.1.missing <- c(config.0.missing, samp.neighbourhood[1])
  config.2.missing <- c(config.1.missing, samp.neighbourhood[2])
  config.3.missing <- c(config.2.missing, samp.neighbourhood[3])
  config.4.missing <- c(config.3.missing, samp.neighbourhood[4])
  config.5.missing <- c(config.4.missing, samp.neighbourhood[5])
  config.6.missing <- c(config.5.missing, samp.neighbourhood[6])
  config.7.missing <- c(config.6.missing, samp.neighbourhood[7])
  config.8.missing <- c(config.7.missing, samp.neighbourhood[8])
  config.9.missing <- c(config.8.missing, samp.neighbourhood[9])

We plot in yellow the unit we want to predict and in red the neighborhood which will be supposed missing. It means that when using \(W_1\) (the sparsest), and when considering 9 neighbours as missing, we will use only one neighbour to predict the yellow unit.

plot(mapMAP)
plot(mapMAP[config.9.missing,], col = "red", add = T)
plot(mapMAP[220,], col = "yellow", add = T)

We look at the prediction depending on the 2 spatial weight matrices \(W_1\) and \(W_2\)

list.param <- list(W=list(W1.listw, W2.listw), sigma=1, 
  config = list(config.0.missing, config.1.missing, 
   config.2.missing, config.3.missing, config.4.missing, 
   config.5.missing, config.6.missing, config.7.missing,
   config.8.missing, config.9.missing),
  EM = FALSE)
save(list.param, beta0, x, 
     file=paste(path.dir, "parameters/param-missing.RData", 
      sep=""))
W missing BP sd BP/BPn BP/BPw BP/BP1 BP/BPn1 BP/BPw1 BP/TS1 BP/TC BP/TC1
\(W_1\) 0 0.992 (1.443) 100 99.7 100 100 99.7 98.7 94.4 94.4
\(W_1\) 1 0.991 (1.445) 100 99.4 99.8 99.7 99.3 98.3 94.4 94
\(W_1\) 2 0.991 (1.445) 100 99.7 99.3 99.3 99.1 98 94.4 93.4
\(W_1\) 3 1.01 (1.47) 99.9 99.7 97.1 97.1 97.1 96.5 95.8 92.2
\(W_1\) 4 1.013 (1.487) 99.9 99.9 93.8 93.8 93.6 93.2 96 89.3
\(W_1\) 5 1.023 (1.495) 99.9 99.9 92.9 92.8 92.8 93.3 97.1 90.6
\(W_1\) 6 1.029 (1.5) 99.9 99.8 92.4 92.2 92.2 93.5 97.6 92.5
\(W_1\) 7 1.026 (1.495) 100 100 91 91 90.9 92.2 97.3 91.4
\(W_1\) 8 1.037 (1.502) 100.1 100 82.6 82.4 82.4 84.2 98.3 86.3
\(W_1\) 9 1.037 (1.503) 100 100 82.4 82.4 82.4 84.7 98.3 90
\(W_2\) 0 0.999 (1.45) 100 100 100 100 100 100 99.7 99.7
\(W_2\) 1 0.999 (1.449) 100 100 100 100 100 100 99.8 99.7
\(W_2\) 2 0.999 (1.449) 100 100 100 100 100 100 99.8 99.7
\(W_2\) 3 1.002 (1.453) 100 100 99.9 99.9 99.9 100 99.9 99.8
\(W_2\) 4 1.002 (1.457) 100 100 99.9 99.9 99.9 99.9 99.8 99.7
\(W_2\) 5 1.002 (1.458) 100 100 99.9 99.9 100 100 99.9 99.8
\(W_2\) 6 1.002 (1.457) 100 100 100 100 100 100 99.9 99.9
\(W_2\) 7 1.001 (1.455) 100 100.1 100 100 100.1 100 99.7 99.7
\(W_2\) 8 1.002 (1.455) 100 100 100 100 100.1 100 99.8 99.8
\(W_2\) 9 1.001 (1.454) 100 100 100 100 100 100 99.7 99.7