This document was generated directly from RStudio using Markdown tool. It presents the R and Matlab codes used to obtain the figures included in the paper ``A random locational M-estimation problem based on the \(L_2\)-Wasserstein distance’’ written by A. Daouia and I. Van Keilegom (2015). The .pdf of this page is available here.

Before starting

Install the necessary packages:

install.packages(c("rgdal", "gpclib", "maptools", "rgeos", 
                   "OpenStreetMap", "GISTools")) 

Load the packages :

require("rgdal")
require("maptools")
require("rgeos")
require("OpenStreetMap") 
require("GISTools")

1. Representation of the map of Sahel region - North Africa

Import the world country boundaries into R :

world <- readOGR(dsn="world", layer="Pays_WGS84")

Perhaps you will need to run the following code the first time :

require("gpclib")
gpclibPermit()

We consider some border cities located in the Algerian and Moroccan territories and characterized by the following Longitute/Latitude coordinates (given in decimal degrees):

provinces<-SpatialPoints(rbind(c(-2.938114,35.292437), c(-6.855839,33.985752), 
 c(-5.353647,36.142444), c(-5.339725,35.885957), c(-1.914259,34.685276),c(-2.211851,35.085182), 
 c(-1.232229,32.085182), c(-2.103646,34.951092), c(-1.735685,34.852832),  c(-5.834646,35.757976), 
 c(-2.163441,34.311307), c(-2.302420,34.972376),  c(-1.329303,34.915277), c(-1.967375,32.534366), 
 c(-1.998600,33.048943), c(-0.583262,32.750140), c(-2.032403,34.000785), c(-1.329063,34.624857), 
 c(-0.331307,33.249668), c(-0.290108,33.549825), c(-1.259651,34.223006)), 
 proj4string=CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"))

The names of these cities are:

nom.prov <- c("Melilla", "Rabat", "Gibraltar", "Ceuta", "Oujda", "Saidia","Figuig",
 "Ahfir","Maghnia","Tangier", "Jerada", "Berkane", "Tlemcen", "Bouarfa", "Tendrara", 
 "Ain Sefra", "Ain-Beni-Mathar", "Sebdou", "Naama", "Mecheria", "El Aricha")

We only keep the countries that we want to represent in the Sahel region:

carte<-world[world@data$NOM%in%c("Algeria","Morocco","Western Sahara","Mauritania",
 "Mali","Niger","Tunisia","Libya", "Chad","Italy","Spain","France","Portugal","Senegal",
 "Gambia, The","Guinea","Guinea-Bissau","Burkina Faso","Nigeria","Malta", "Sudan", "Somalia",
 "Uganda","Cameroon", "Egypt","Greece", "Ivory Coast","Ghana", "Turkey","Albania", "Montenegro", 
 "Macedonia", "Bulgaria","Croatia","Bosnia and Herzegovina", "Andorra", "Benin"), ]

we precise the names of the selected countries:

nom<-carte@data$NOM
row.names(carte)<-as.character(nom)
nom[nom=="Western Sahara"]="Morocco"
carte<-unionSpatialPolygons(carte, nom)  

We change the Coordinate Reference System of the countries and keep the one given by OpenStreetMap:

proj.OSM<-"+proj=merc +a=6378137 +b=6378137 +lat_ts=0.0 +lon_0=0.0 +x_0=0.0 +y_0=0 
+k=1.0 +units=m +nadgrids=@null +no_defs"
carte<-spTransform(carte, CRS(proj.OSM)) 
provinces.OSM<-spTransform(provinces, CRS(proj.OSM))

We import the relief map from OpenStreetMap:

map = openmap(c(lat= 35.60583,   lon= -3.881389),
               c(lat= 31.96583,   lon= 0.4601385),
        minNumTiles=9,type="nps")

We are now in position to represent the map of interest:

op<-par(oma=c(0,0,0,0), omi=c(0,0,0,0), mar=c(0,0,0,0), mai=c(0,0,0,0))
plot(carte, xlim=c(-1900000,2200000),ylim=c(3200000,3600000))
masker = poly.outer(matrix(par()$usr,2,2,byrow=TRUE), carte,extend=10000000);
add.masking(masker,"lightblue")
plot(carte, col="lightyellow",lty=1,add=TRUE)
text(coordinates(carte)[,1], coordinates(carte)[,2], row.names(carte),  font=3, cex=0.9)
plot(map, add=TRUE)
xmin=map[[1]][[1]]$bbox$p1[1];xmax=map[[1]][[1]]$bbox$p2[1]
ymin=map[[1]][[1]]$bbox$p2[2];ymax=map[[1]][[1]]$bbox$p1[2]
polygon(c(xmin,xmax,xmax,xmin,xmin),c(ymin,ymin,ymax,ymax,ymin), lty=2)
plot(carte, add=TRUE, lty=2)
plot(provinces.OSM<-spTransform(provinces, CRS(proj.OSM))
[c(1,2,3,4,7,6)], col="red", pch=15, add=TRUE, cex=0.7)                           
text(coordinates(provinces.OSM<-spTransform(provinces, CRS(proj.OSM))
[c(1,2,3,4,7,6),]), nom.prov[c(1,2,3,4,7,6)], cex=0.7, pos=c(3, 2, 3, 1, 4, 4))

par(op)

2. Representation of the facilities (monitoring stations) and the clients (smugglers, traffickers and illegal immigrants)

a. Preparation of the borderline

First, we prepare the borderland between Morocco and Algeria:

maghreb<-world[world@data$NOM%in%c("Algeria","Morocco","Western Sahara"), ]
maghreb<-unionSpatialPolygons(maghreb, c("Algerie", "Maroc","Maroc"))  

We provide the locations of 6 existing monitoring stations:

villes<-SpatialPoints(rbind(c(-1.022099,32.506542), 
 c(-1.483524,33.051516), c(-1.604374,33.611439), c(-1.659306,34.104095), 
 c( -1.771780,34.742707), c(-1.703251,34.494356), c(-2.095231,34.957809)), 
 proj4string=CRS("+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"))
row.names(villes)<-c("Figuig3", "Figuig2", "Figuig1", "Jerada2", "Oujda", 
                     "Jerada", "Ahfir")

We are going to compute some distances between the points which delimit the frontier. For this reason, we must convert the Spatial data into a CRS given in meters:

proj.maroc<-"+proj=lcc +lat_1=33.3 +lat_0=33.3 +lon_0=-5.4 +k_0=0.999625769 
 +x_0=500000 +y_0=300000 +a=6378249.2 +b=6356515 +towgs84=31,146,47,0,0,0,0 
 +units=m +no_defs"
maghreb<-spTransform(maghreb,CRS(proj.maroc))
villes<-spTransform(villes,CRS(proj.maroc))

We define the borderline between Algeria and Morocco:

algerie<-maghreb[1,]
maroc<-maghreb[2,]
frontiere<-gIntersection(maroc, algerie) 

This frontier is represented by the following number of segments:

length(frontiere@lines[[1]]@Lines)
## [1] 182

Now, we have to integrate the monitoring center sites into the borderline. For this, we first identify the closest border segments to the stations:

ind.villes<-apply(sapply(frontiere@lines[[1]]@Lines, 
    function(x) gDistance(SpatialLines(list(Lines(x, ID="1"))), villes, byid=TRUE), 
    simplify = "matrix"), 1, which.min)

Then, we have to create a new Spatial object which defines the new borderline including the monitoring stations:

test<-vector("list", length=length(frontiere@lines[[1]]@Lines)+length(ind.villes))
ind.old=1
ind.new=1
ind.villes.bis=NULL
while(ind.old<=(length(frontiere@lines[[1]]@Lines)))
{if(!ind.old%in%ind.villes)
  {test[[ind.new]]<-frontiere@lines[[1]]@Lines[[ind.old]]
   ind.new = ind.new+1
   }else
   {
   test[[ind.new]]<-Line(cbind(c(frontiere@lines[[1]]@Lines[[ind.old]]@coords[1,1],
     coordinates(villes)[which(ind.villes==ind.old),1]),
     c(frontiere@lines[[1]]@Lines[[ind.old]]@coords[1,2], 
     coordinates(villes)[which(ind.villes==ind.old),2])))
   
   ind.villes.bis=c(ind.villes.bis,ind.new)  
   
   test[[ind.new+1]]<-Line(cbind(c(coordinates(villes)[which(ind.villes==ind.old),1],
    frontiere@lines[[1]]@Lines[[ind.old]]@coords[2,1]), 
     c(coordinates(villes)[which(ind.villes==ind.old),2], 
       frontiere@lines[[1]]@Lines[[ind.old]]@coords[2,2])))
   ind.new=ind.new+2
   }
   ind.old = ind.old+1
}  

Finally, we define the resulting frontier:

frontiere<-SpatialLines(list(Lines(test, ID="1")))
proj4string(frontiere) <- CRS(proj.maroc) 

b. Computing the cumulative distance of the data points lying on the frontier

We shall need compute the cumulative distance of the data points \(X_i\):

coord.front<-t(sapply(frontiere@lines[[1]]@Lines,function(x) x@coords))
dist.front<-sapply(frontiere@lines[[1]]@Lines,function(x) dist(x@coords))
don<-data.frame(coord.front, dist.front)
names(don)<-c("xA","xB","yA","yB","dist")

We sort the data points \(X_i\) on the borderline in order to facilitate the computation the cumulative distances:

don.int<-don[1:3,]
don<-rbind(don[4:189,],don.int)
row.names(don)<-1:189

for(k in 1:length(villes))
 {ind.k <- which(don$xB%in%coordinates(villes)[k,1])
  row.names(don)[ind.k]<-row.names(villes)[k]
 }

Now, we want to restrict ourselves to the part of the borderline which stretches some 450 km starting from the Mediterranean sea:

don<-don[cumsum(don$dist)<=450000,]

We compute now the cumulative distances of the \(X_i\)’s starting from the coastal city of Saidia until the city of Figuig.

don$dist.cum.A<-cumsum(c(0,don$dist[-length(don$dist)]))/sum(don$dist)
don$dist.cum.B<-cumsum(don$dist)/sum(don$dist)

c. Preparation of the data

Historical data before the placement of the new station

When converting the original borderline into a straight line with lower and upper endpoints \(a\) and \(b\), representing respectively the coastal city of Saidia and the eastern city of Figuig, the resulting client’s locations \(X_i\) and facility’s locations \(y_j\), along with their corresponding capacities \(z_j\) and allocation zones, are available in the file here that can be imported as follows:

vec.exist <- as.numeric(read.table("data/table_exist_loc.txt", dec=".",sep=","))

The endpoints of the transformed straight line:

a <- vec.exist[1]
b <- vec.exist[2]

The locations \(y_j\) and capacities \(z_j\) of the monitoring stations:

postes <- vec.exist[3:8]
taille.postes <- vec.exist[9:14] 

The location of incidents affected by guardhouse:

incident.p1 <- vec.exist[15:214]; (n.p1 <- length(incident.p1))
## [1] 200
incident.p2 <- vec.exist[215:414]; (n.p2 <- length(incident.p2))
## [1] 200
incident.p3 <- vec.exist[415:484]; (n.p3 <- length(incident.p3))
## [1] 70
incident.p4 <- vec.exist[485:514]; (n.p4 <-  length(incident.p4))
## [1] 30
incident.p5 <- vec.exist[515:549]; (n.p5 <-  length(incident.p5))
## [1] 35
incident.p6 <- vec.exist[550:614]; (n.p6 <- length(incident.p6))
## [1] 65
incidents <- c(incident.p1, incident.p2, incident.p3, incident.p4, 
               incident.p5, incident.p6)

The new situation after positioning the 7th station

The estimated optimal location and capacity of the new monitoring station were obtained by using Matlab codes. The resulting optimal allocation zones after positioning the seventh monitoring station were determined as well. The file containing all these data is available here and can be imported as follows:

vec.new <- as.numeric(read.table("data/table_new_loc.txt", dec=".",sep=","))

The new allocation zones:

incident.p1.new <- vec.new[15:148] ; n.p1.new <- length(incident.p1.new)
incident.p2.new <- vec.new[149:292]; n.p2.new <- length(incident.p2.new)
incident.p3.new <- vec.new[293:364]; n.p3.new <- length(incident.p3.new)
incident.p4.new <- vec.new[365:393]; n.p4.new <-  length(incident.p4.new)
incident.p5.new <- vec.new[394:422]; n.p5.new <-  length(incident.p5.new)
incident.p6.new <- vec.new[423:494]; n.p6.new <- length(incident.p6.new)

The location and capacity of the new monitoring station:

optim.poste <- vec.new[495]  
postes.new <- c(postes, optim.poste) 
taille.postes.new <- c(vec.new[9:14], vec.new[496]) 

The clients allocated to the new station:

incident.optim.poste <- vec.new[497:616]; n.optim.new <- length(incident.optim.poste) 
incidents.new <- c(incident.p1.new, incident.p2.new, incident.p3.new, 
 incident.p4.new, incident.p5.new, incident.p6.new, incident.optim.poste)

The \(95\%\) confidence interval of the true optimal location

LB <- vec.new[617] 
UB <- vec.new[618]

d. Representation of the data

Graphical parameters

col.postes <- c("mediumorchid1", "aquamarine2","chocolate2","indianred1",
                "darkgoldenrod2","deeppink1", "red")
cex.postes = as.vector(7 * sqrt(taille.postes)/max(sqrt(taille.postes)))                  
cex.postes.new = as.vector(7 * sqrt(taille.postes.new)/max(sqrt(taille.postes.new)))

1st graphic (historical data before the placement of the new station)

#pdf(file="droite_avant.pdf", width=12, height=4)
plot(incidents,rep(1,length(incidents)), xlim=c(a,b), xlab="", ylab="",axes=FALSE, pch=3, cex=2,
col=col.postes[c(rep(1,n.p1),rep(2,n.p2),rep(3,n.p3),rep(4,n.p4),rep(5,n.p5),rep(6,n.p6))])
points(postes,rep(1,length(postes)), pch=15)
points(postes,rep(1,length(postes)), col=col.postes, cex=cex.postes, lwd=2)
arrows(a,1,b,1)
text(postes, rep(1.2,6), as.expression(c(bquote(y[1]), bquote(y[2]), bquote(y[3]), 
 bquote(y[4]), bquote(y[5]), bquote(y[6]))),cex=1.6)

#dev.off()

2nd graphic (the new situation after positioning the 7th station)

#pdf(file="droite_apres.pdf", width=12, height=4)
plot(incidents.new, rep(1,length(incidents.new)), xlim=c(a,b), xlab="", ylab="",
     axes=FALSE, pch=3, cex=2, col=col.postes[c(rep(1,n.p1.new),rep(2,n.p2.new),
     rep(3,n.p3.new),rep(4,n.p4.new),rep(5,n.p5.new),rep(6,n.p6.new), rep(7,n.optim.new))])
points(postes.new, rep(1,length(postes.new)), pch=15)
points(postes.new,rep(1,length(postes.new)), col=col.postes, cex=cex.postes.new, lwd=2)
arrows(a,1,b,1)
text(optim.poste , 1.23, as.expression(bquote(hat(y)[list(7,n)])), col="red",cex=1.6)
text(postes, rep(1.2,6), as.expression(c(bquote(y[1]), bquote(y[2]), bquote(y[3]), 
  bquote(y[4]), bquote(y[5]), bquote(y[6]))),cex=1.6)
text(optim.poste,  1.28, "*", col="red")
arrows(LB,0.87,UB,0.87, lwd=2, angle=90, col="red", code=3, length=0.04)
text((LB+UB)/2, 0.8, "CI", col="red", cex=1.6)

#dev.off()

3. Representation of the monitoring stations and associated clients on a map

The idea is to convert the segment [ab] into the original borderline on the map by considering the fact that \(a\) corresponds to the coastal city of Saidia and \(b\) to the city of Figuig. First, we define the locations of the monitoring centers and their allocations zones in a vector and then we sort this vector:

x <- c(a, b, incidents,postes)
x.new <- c(a, b, incidents.new, postes.new, LB, UB)
x.ord<-sort(x)
x.ord.new<-sort(x.new)

We compute the distances and cumulative distances between consecutive endpoints of the segments \([ab]\):

x.ord <- diff(x.ord)
x.ord.new <- diff(x.ord.new)
x.cum<-cumsum(x.ord)/sum(x.ord)
x.cum.new<-cumsum(x.ord.new)/sum(x.ord.new)
x.final.cum<-x.cum[-c(length(x.cum))]
x.final.cum.new<-x.cum.new[-c(length(x.cum.new))]

We report the cumulative distances computed on the straight line on the borderline, then we determine the cartesian coordinates of the data points:

coord.pts <- data.frame(prop=x.final.cum, indice=apply(t(x.final.cum),2,
    function(x) which(x>don$dist.cum.A & x<=don$dist.cum.B)))
coord.pts$coef <- (coord.pts$prop-don$dist.cum.A[coord.pts$indice])/
  (don$dist.cum.B[coord.pts$indice]-don$dist.cum.A[coord.pts$indice])
coord.pts$x <- don$xA[coord.pts$indice] + coord.pts$coef*
   (don$xB[coord.pts$indice]-don$xA[coord.pts$indice])
coord.pts$y <- don$yA[coord.pts$indice] + coord.pts$coef*
   (don$yB[coord.pts$indice]-don$yA[coord.pts$indice])
coord.pts.new <- data.frame(prop=x.final.cum.new, indice = 
  apply(t(x.final.cum.new),2,function(x) which(x>don$dist.cum.A & x<=don$dist.cum.B)))
coord.pts.new$coef<-(coord.pts.new$prop-don$dist.cum.A[coord.pts.new$indice])/
  (don$dist.cum.B[coord.pts.new$indice]-don$dist.cum.A[coord.pts.new$indice])
coord.pts.new$x <- don$xA[coord.pts.new$indice] + coord.pts.new$coef*
  (don$xB[coord.pts.new$indice]-don$xA[coord.pts.new$indice])
coord.pts.new$y <- don$yA[coord.pts.new$indice] + coord.pts.new$coef*
  (don$yB[coord.pts.new$indice]-don$yA[coord.pts.new$indice])

We identify the monitoring stations and their associated clients:

order.data<-order(c(incidents,postes))
ind.optimaux<-which(order.data>length(incidents))
ind.incidents<-which(!order.data>length(incidents))

order.data.new<-order(c(incidents.new,LB,UB,postes.new))
ind.optimaux.new<-which(order.data.new>length(incidents.new)+2)
ind.incidents.new<-which(!order.data.new>length(incidents.new))
ind.IC.new<-which(order.data.new%in%c(length(incidents.new)+1,length(incidents.new)+2))

We create the following Spatial objects:

incidents.SP <- SpatialPoints(cbind(coord.pts$x[ind.incidents],
  coord.pts$y[ind.incidents]), proj4string=CRS(proj.maroc))
incidents.SP.new <- SpatialPoints(cbind(coord.pts.new$x[ind.incidents.new],
  coord.pts.new$y[ind.incidents.new]), proj4string=CRS(proj.maroc))
postes.SP <- SpatialPoints(cbind(coord.pts$x[ind.optimaux],coord.pts$y[ind.optimaux]),
  proj4string=CRS(proj.maroc))
postes.SP.new <- SpatialPoints(cbind(coord.pts.new$x[ind.optimaux.new],
 coord.pts.new$y[ind.optimaux.new]), proj4string=CRS(proj.maroc))
IC.SP.new <- SpatialPoints(cbind(coord.pts.new$x[ind.IC.new],coord.pts.new$y[ind.IC.new]),
proj4string=CRS(proj.maroc))

We convert the coordinates in the appropriate CRS :

incidents.SP <- spTransform(incidents.SP, CRS(proj.OSM))
incidents.SP.new <- spTransform(incidents.SP.new, CRS(proj.OSM))
postes.SP <- spTransform(postes.SP, CRS(proj.OSM))
postes.SP.new  <- spTransform(postes.SP.new , CRS(proj.OSM))
IC.SP.new  <- spTransform(IC.SP.new , CRS(proj.OSM))
frontiere.OSM <- spTransform(frontiere, CRS(proj.OSM))
provinces.OSM<-spTransform(provinces, CRS(proj.OSM))

1st graphic (historical data before the placement of the new station)

#pdf(file="carte_avant.pdf")
plot(map)
plot(incidents.SP,  add=TRUE, pch=3, cex=1.6, 
  col=col.postes[c(rep(1,n.p1),rep(2,n.p2),rep(3,n.p3),
                   rep(4,n.p4),rep(5,n.p5),rep(6,n.p6))])
plot(postes.SP,add=TRUE, pch=22, cex=0.6)  
points(postes.SP, col=col.postes, cex=cex.postes, lwd=3)
plot(frontiere.OSM, add=TRUE)
text(coordinates(postes.SP), as.expression(c(bquote(y[1]), bquote(y[2]), bquote(y[3]),
    bquote(y[4]), bquote(y[5]), bquote(y[6]))), font=3, pos=4, cex=1.2, offset=0.6)
plot(provinces, pch=1, add=TRUE, cex=0.8)                         
text(coordinates(provinces), nom.prov, cex=0.95, 
     pos=c(4, 1, 1, 1, 2, 3, 1, 1, 4, 1, 2, 2, 4, 2, 2, 4, 2, 4, 4, 4, 4), offset=0.4)

#dev.off()

2nd graphic (the new situation after positioning the 7th station)

#pdf(file="carte_apres.pdf")
plot(map)
plot(incidents.SP.new,  add=TRUE, pch=3, cex=1.6, 
     col=col.postes[c(rep(1,n.p1.new), rep(7,n.optim.new), rep(2,n.p2.new),
                      rep(3,n.p3.new),rep(4,n.p4.new),rep(5,n.p5.new),rep(6,n.p6.new))])
plot(postes.SP.new, add=TRUE, pch=22, cex=0.6)  
points(postes.SP.new, col=col.postes[c(1,7,2:6)], cex=cex.postes.new[c(1,7,2:6)], lwd=3)
plot(frontiere.OSM, add=TRUE) 
plot(IC.SP.new, add=TRUE, pch=5, cex=0.6)  
text(coordinates(postes.SP.new[c(1,3:7),]), as.expression(c(bquote(y[1]), bquote(y[2]),
  bquote(y[3]), bquote(y[4]), bquote(y[5]), bquote(y[6]))), 
  font=3, pos=4, cex=1.2, offset=0.6)
text(coordinates(postes.SP.new[2,])[1]-5000, 
     coordinates(postes.SP.new[2,])[2]+16000,  as.expression(bquote(hat(y)[list(7,n)])),
     font=3, pos=4, cex=1.2, col="red", offset=0.9)
text(coordinates(postes.SP.new[2,])[1]-3800,coordinates(postes.SP.new[2,])[2]+18000,
     "*", font=3, pos=4, cex=0.8, col="red", offset=1.2)
plot(provinces.OSM, pch=1, add=TRUE, cex=0.8)                         
text(coordinates(provinces.OSM), nom.prov, cex=0.95, 
     pos=c(4, 1, 1, 1, 2, 3, 1, 1, 4, 1, 2, 2, 4, 2, 2, 4, 2, 4, 4, 4, 4), offset=0.4)

#dev.off()

4. Animation

In this application, we actually consider a dynamic setting where the patrol agents’ intervention sites \(\{x_i\}\) vary over time, but the dataset at hand upon which we base the long term decision of positioning a new border monitoring station is just a picture of the situation over a given period of time \(i=1,\ldots,600\).
This kind of static location modeling is based on the implicit assumption that the time or effort needed for the illegal immigration, drug trafficking and contraband smuggling to react is long enough to harvest the main benefits of the new monitoring site. One way to check this assumption is by considering the evolution of the estimated optimal location and capacity \((\widehat{y}^*_{7,n},\widehat{z}^*_{7,n})\) starting from the subsample of the first \(300\) recorded observations \(\{x_1,\ldots,x_{300}\}\). For each new patrol agents’ intervention \(i=301,\ldots,600\), we add the corresponding observation \(x_i\) to the previous sample and then display the obtained results similarly to the second graphic above. This leads to an animation on the straight line that can be found at https://gremaq.univ-tlse1.fr/perso/laurent/code/DVK.html, and its conversion into the true borderline at https://gremaq.univ-tlse1.fr/perso/laurent/code/DVK2.html.