Computation matrix with R (session 3)

class: center, middle, inverse, title-slide

# Computation matrix with <strong>R</strong> (session 3)
## Master’s degree in International Economics and Business
### <a href="http://www.thibault.laurent.free.fr">Thibault Laurent</a>
### Toulouse School of Economics, CNRS
### Last update: 2020-09-17

---

# Matrix: Create

Matrix is a table in 2D which contains elements of the same type. It can be created with the *matrix()* function which consists in transforming a vector into a matrix. By default the matrix is filled by column (use **byrow** argument otherwise).

```r
a.vec <- c(25, 26, 30, 31, 26, 27, 29, 30, 22, 23)
a.matrix <- matrix(a.vec, nrow = 5, ncol = 2, byrow = T,
              dimnames = list(letters[1:5], c("y.2005", "y.2006")))
```
It has a **dim** attribute of length 2 (number of rows and columns) and an optional **dimnames** attribute, a list of length 2 with the names of rows and columns.

```r
attributes(a.matrix)
```

It can also be done by merging vectors by row (*rbind()*) or by column (*cbind()*):

```r
a.matrix.1 <- rbind(a.character, a.logical)
colnames(a.matrix.1) <- paste0("team_", 1:4)
rownames(a.matrix.1) <- c("country", "winner") 
a.matrix.2 <- cbind(a.integer, a.numeric)
```

---
# Matrix: Access and modify

Elements can be accessed as *var[row, column]*. Here *row* and *column* are vectors of **integer** (the indices), **character** (the names) or logical.

```r
a.matrix[c(2, 4), 2]
a.matrix[c("b", "d"), "y.2006"]
a.matrix[a.matrix[, 1] >= 30, ]
a.matrix.1[ , dimnames(a.matrix.1)[[2]] %in% c("team_1", "team_4")]
```

We can combine assignment operator with the above learned methods for accessing elements of a matrix to modify it.

```r
a.matrix[2, 2] <- round(a.matrix[2, 2], 0)
```

We can also use *cbind()* and *rbind()* for adding columns or rows

```r
a.matrix <- cbind(a.matrix, y.2007 = a.matrix[, 2] + 1)
a.matrix <- rbind(a.matrix, f = c(31, 32, 33))
```

---
# Matrix: Basic operations

We use the same operators **+**, **-**, **\***, **/**, **^**, **%%**, **%/%** than for vectors. There are some rules with respect to the dimension : a matrix of size `\((n,p)\)` can be associated with a scalar and a vector of size lower than `\(np\)`.

```r
1 + 2 * a.matrix - 0.5 * a.matrix ^ 2 
a.matrix + c(1, 2, 3)
```

which is equivalent to:

```r
a.matrix + matrix(c(1, 2, 3), nrow(a.matrix), 
                    ncol(a.matrix))
```

A matrix can be associated with another matrix iif the dimensions coincide:

```r
a.matrix[2:5, 1:2] + a.matrix.2 ^ 2
a.matrix[1:2, ] + a.matrix.2 ^ 2
```

---
# Matrix: function *apply()*

*apply()* is a common way for doing a *for loop* instruction. For example, instead of doing:

```r
my_res <- numeric(ncol(a.matrix))
for (k in 1:nrow(a.matrix)) {
  my_res <- my_res + a.matrix[k, ]
}
my_res
```

We can do:

```r
apply(a.matrix, MARGIN = 2, FUN = sum)
```

The second argument of the function indicates the dimension on which applying the function **FUN**. The last argument can be a existing or personnal function.

```r
apply(a.matrix, 1, function(x) c(min(x), max(x)))
```

---
# Matrix calculation (1)

```r
y <- c(178, 180, 165, 158, 183)
x <- cbind(x0 = 1, x1 = c(0, 0, 1, 1, 0), 
           x2 = c(43, 43, 40, 39, 45))
```

* *t()* give the transpose of a matrix. Note that a transpose of a vector belongs to the **matrix** class of object.

```r
t(x)
t(y)
```

* %*% permits to multiply two matrices if they are conformable

```r
t(x) %*% y
t(x) %*% x
```

* *crossprod(x, y)* is optimized to compute `\(x^Ty\)`

```r
crossprod(x, y)
crossprod(x)
```

---
# Matrix calculation (2)

* Note that a vector in **R** is considered as row-vector and column-vector (**R** is actually doing the verification)

```r
y %*% y
t(y) %*% y
```

* *solve(A, b)* permits to resolve the linear problem `\(Ax=b\)`. For example:

```r
A <- crossprod(x)
b <- crossprod(x, y)
solve(A, b)
```

It permits to find the inverse of a square matrix by resolving `\(Ax=I\)`:

```r
solve(A) %*% b
```

* To inverse a symmetric definite positive matrix, use instead the Cholesky or `\(QR\)` factorization (more stable)

```r
chol2inv(chol(A)) %*% crossprod(x, y)
qr.solve(A, b)
```

---
# Matrix calculation (3)

To deal with sparse matrix, we recommend to use **Matrix** package (or alternatively **spam**).

```r
library("Matrix")
mat <- matrix(rbinom(10000, 1, 0.05), 100, 100)
object.size(mat)
Mat <- as(mat, "Matrix")
object.size(Mat)
```

The same functions (*crossprod()*, *solve()*, etc.) and operators (%*%) can be applied and will be faster.  
---

# 👩‍🎓 Training

### Exercise 3

We consider the two vectors **weight** and **group**

```r
ctl <- c(4.17, 5.58, 5.18, 6.11, 4.50, 4.61, 5.17, 4.53, 5.33, 5.14)
trt <- c(4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69)
group <- gl(2, 10, 20, labels = c("Ctl", "Trt"))
weight <- c(ctl, trt)
```

* create a matrix **X** of dim `\(20 \times 2\)` which contains in the first column 1 if group="Ctl", 0 otherwise, and contains in the second column 1 if group="Trt" and 0 otherwise.

* What does the following command do?

```r
split(weight, group)
```

* Compute the mean of **weight** in the two groups "Ctl" and "Trt"

---
# 👩‍🎓 Training

* On the following figure, do you think there is a difference of weight between the two groups ?
![](my_course_3_files/figure-html/unnamed-chunk-26-1.png)

* create matrix `\(A=(X'X)\)` and `\(b = (X'y)\)` where `\(y\)` is the **weight**

* solve the equation `\(A\beta=b\)`. We call **hat_beta** the solution of the equation

* compute the adjusted values `\(\hat y=X\hat\beta\)`. We call **hat_y** this vector.

* compute the vector of residuals `\(\hat\epsilon=y -\hat{y}\)`. We call **hat_e** this vector

---
# 👩‍🎓 Training

* Compute the sum of squares of the residuals. We call **sse** this value:

* compute the residual standard error which is equal to `\(\sqrt{SSE/(n-2)}\)`

* compute **SST** which is equal to `\(\sum(y-\bar{y})^2\)`

* compute **SSR** which is equal to `\(\sum(\hat{y}-\bar{y})^2\)`

* verify that `\(SST=SSR+SSE\)`

* compute the `\(R^2\)` which is equal to SSR/SST