class: center, middle, inverse, title-slide # Computation matrix with
R
(session 3) ## Masterβs degree in International Economics and Business ###
Thibault Laurent
### Toulouse School of Economics, CNRS ### Last update: 2020-09-17 --- # Matrix: Create Matrix is a table in 2D which contains elements of the same type. It can be created with the *matrix()* function which consists in transforming a vector into a matrix. By default the matrix is filled by column (use **byrow** argument otherwise). ```r a.vec <- c(25, 26, 30, 31, 26, 27, 29, 30, 22, 23) a.matrix <- matrix(a.vec, nrow = 5, ncol = 2, byrow = T, dimnames = list(letters[1:5], c("y.2005", "y.2006"))) ``` It has a **dim** attribute of length 2 (number of rows and columns) and an optional **dimnames** attribute, a list of length 2 with the names of rows and columns. ```r attributes(a.matrix) ``` It can also be done by merging vectors by row (*rbind()*) or by column (*cbind()*): ```r a.matrix.1 <- rbind(a.character, a.logical) colnames(a.matrix.1) <- paste0("team_", 1:4) rownames(a.matrix.1) <- c("country", "winner") a.matrix.2 <- cbind(a.integer, a.numeric) ``` --- # Matrix: Access and modify Elements can be accessed as *var[row, column]*. Here *row* and *column* are vectors of **integer** (the indices), **character** (the names) or logical. ```r a.matrix[c(2, 4), 2] a.matrix[c("b", "d"), "y.2006"] a.matrix[a.matrix[, 1] >= 30, ] a.matrix.1[ , dimnames(a.matrix.1)[[2]] %in% c("team_1", "team_4")] ``` We can combine assignment operator with the above learned methods for accessing elements of a matrix to modify it. ```r a.matrix[2, 2] <- round(a.matrix[2, 2], 0) ``` We can also use *cbind()* and *rbind()* for adding columns or rows ```r a.matrix <- cbind(a.matrix, y.2007 = a.matrix[, 2] + 1) a.matrix <- rbind(a.matrix, f = c(31, 32, 33)) ``` --- # Matrix: Basic operations We use the same operators **+**, **-**, **\***, **/**, **^**, **%%**, **%/%** than for vectors. There are some rules with respect to the dimension : a matrix of size `\((n,p)\)` can be associated with a scalar and a vector of size lower than `\(np\)`. ```r 1 + 2 * a.matrix - 0.5 * a.matrix ^ 2 a.matrix + c(1, 2, 3) ``` which is equivalent to: ```r a.matrix + matrix(c(1, 2, 3), nrow(a.matrix), ncol(a.matrix)) ``` A matrix can be associated with another matrix iif the dimensions coincide: ```r a.matrix[2:5, 1:2] + a.matrix.2 ^ 2 a.matrix[1:2, ] + a.matrix.2 ^ 2 ``` --- # Matrix: function *apply()* *apply()* is a common way for doing a *for loop* instruction. For example, instead of doing: ```r my_res <- numeric(ncol(a.matrix)) for (k in 1:nrow(a.matrix)) { my_res <- my_res + a.matrix[k, ] } my_res ``` We can do: ```r apply(a.matrix, MARGIN = 2, FUN = sum) ``` The second argument of the function indicates the dimension on which applying the function **FUN**. The last argument can be a existing or personnal function. ```r apply(a.matrix, 1, function(x) c(min(x), max(x))) ``` --- # Matrix calculation (1) ```r y <- c(178, 180, 165, 158, 183) x <- cbind(x0 = 1, x1 = c(0, 0, 1, 1, 0), x2 = c(43, 43, 40, 39, 45)) ``` * *t()* give the transpose of a matrix. Note that a transpose of a vector belongs to the **matrix** class of object. ```r t(x) t(y) ``` * %*% permits to multiply two matrices if they are conformable ```r t(x) %*% y t(x) %*% x ``` * *crossprod(x, y)* is optimized to compute `\(x^Ty\)` ```r crossprod(x, y) crossprod(x) ``` --- # Matrix calculation (2) * Note that a vector in **R** is considered as row-vector and column-vector (**R** is actually doing the verification) ```r y %*% y t(y) %*% y ``` * *solve(A, b)* permits to resolve the linear problem `\(Ax=b\)`. For example: ```r A <- crossprod(x) b <- crossprod(x, y) solve(A, b) ``` It permits to find the inverse of a square matrix by resolving `\(Ax=I\)`: ```r solve(A) %*% b ``` * To inverse a symmetric definite positive matrix, use instead the Cholesky or `\(QR\)` factorization (more stable) ```r chol2inv(chol(A)) %*% crossprod(x, y) qr.solve(A, b) ``` --- # Matrix calculation (3) To deal with sparse matrix, we recommend to use **Matrix** package (or alternatively **spam**). ```r library("Matrix") mat <- matrix(rbinom(10000, 1, 0.05), 100, 100) object.size(mat) Mat <- as(mat, "Matrix") object.size(Mat) ``` The same functions (*crossprod()*, *solve()*, etc.) and operators (%*%) can be applied and will be faster. --- # π©βπ Training ### Exercise 3 We consider the two vectors **weight** and **group** ```r ctl <- c(4.17, 5.58, 5.18, 6.11, 4.50, 4.61, 5.17, 4.53, 5.33, 5.14) trt <- c(4.81, 4.17, 4.41, 3.59, 5.87, 3.83, 6.03, 4.89, 4.32, 4.69) group <- gl(2, 10, 20, labels = c("Ctl", "Trt")) weight <- c(ctl, trt) ``` * create a matrix **X** of dim `\(20 \times 2\)` which contains in the first column 1 if group="Ctl", 0 otherwise, and contains in the second column 1 if group="Trt" and 0 otherwise. * What does the following command do? ```r split(weight, group) ``` * Compute the mean of **weight** in the two groups "Ctl" and "Trt" --- # π©βπ Training * On the following figure, do you think there is a difference of weight between the two groups ? <!-- --> * create matrix `\(A=(X'X)\)` and `\(b = (X'y)\)` where `\(y\)` is the **weight** * solve the equation `\(A\beta=b\)`. We call **hat_beta** the solution of the equation * compute the adjusted values `\(\hat y=X\hat\beta\)`. We call **hat_y** this vector. * compute the vector of residuals `\(\hat\epsilon=y -\hat{y}\)`. We call **hat_e** this vector --- # π©βπ Training * Compute the sum of squares of the residuals. We call **sse** this value: * compute the residual standard error which is equal to `\(\sqrt{SSE/(n-2)}\)` * compute **SST** which is equal to `\(\sum(y-\bar{y})^2\)` * compute **SSR** which is equal to `\(\sum(\hat{y}-\bar{y})^2\)` * verify that `\(SST=SSR+SSE\)` * compute the `\(R^2\)` which is equal to SSR/SST