### Definition

$r(\textbf{x}) = E[Y|\textbf{X}=\textbf{x}] =\beta_1 + \beta_2x_2 + \dotsc +\beta_k x_k$

The data have the form

$(X_1, Y_1), \dots, (X_n, Y_n)$

where each covariate $X_i = (X_{i1}, \dots, X_{ik})$ is a vector of length $$k$$. Each observation $$i$$ is therefore described by $$k$$ covariate values together with a response $$Y_i$$.

#### Alternative definition

$Y_i = \sum_{j=1}^k \beta_j X_{ij} + \epsilon_i \qquad i=1, \dots, n$ where

$E \left( \epsilon_i \mid X_{i1}, \dots, X_{ik} \right) = 0$

To include an intercept, we set

$X_{i1} = 1 \qquad i=1, \dots, n$
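With the first covariate fixed at 1, the sum $\sum_{j=1}^k \beta_j x_j$ reduces to $\beta_1 + \beta_2 x_2 + \dotsc + \beta_k x_k$, so $$\beta_1$$ acts as the intercept. A minimal Python sketch of this, where the function name `regression_function` and all numeric values are hypothetical and chosen only for illustration:

```python
def regression_function(x, beta):
    """Evaluate r(x) = sum_j beta_j * x_j for a single observation.

    x and beta are equal-length sequences of length k; x[0] is expected
    to be 1 so that beta[0] plays the role of the intercept.
    """
    if len(x) != len(beta):
        raise ValueError("x and beta must have the same length k")
    return sum(b * xj for b, xj in zip(beta, x))


# Hypothetical coefficients (k = 3): intercept 2.0, slopes 0.5 and -1.0.
beta = [2.0, 0.5, -1.0]

# One observation with x_2 = 4.0 and x_3 = 3.0, prepended with the
# constant 1 that carries the intercept.
x = [1.0, 4.0, 3.0]

r_x = regression_function(x, beta)  # 2.0 + 0.5 * 4.0 - 1.0 * 3.0
print(r_x)
```

When every non-constant covariate is zero, the function returns $$\beta_1$$ alone, which is the usual reading of an intercept.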

#### Matrix notation

The multiple linear regression model can be expressed as

$Y = X \beta + \epsilon$

where

$Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}$

$X = \begin{bmatrix} 1 & X_{12} & \dots & X_{1k} \\ 1 & X_{22} & \dots & X_{2k} \\ \vdots & \vdots & \vdots & \vdots \\ 1 & X_{n2} &\dots & X_{nk} \\ \end{bmatrix}$

$\beta = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{bmatrix}$

$\epsilon = \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix}$

$$X$$ is called the design matrix.
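The matrix form can be checked numerically. The following is a sketch using NumPy, not a fitting procedure: it builds a design matrix with a leading column of ones, generates $$Y$$ from hypothetical coefficients and simulated noise, and confirms that $X\beta + \epsilon$ agrees with the row-by-row sum. The sizes, seed, coefficient values, and noise scale are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily

n, k = 5, 3  # hypothetical sizes: 5 observations, 3 coefficients

# Raw covariates X_{i2}, ..., X_{ik}: an (n, k-1) array.
covariates = rng.normal(size=(n, k - 1))

# Design matrix: a column of ones followed by the covariates, shape (n, k).
X = np.column_stack([np.ones(n), covariates])

beta = np.array([2.0, 0.5, -1.0])        # hypothetical coefficients, length k
epsilon = rng.normal(scale=0.1, size=n)  # simulated noise, length n

# Matrix form: Y = X beta + epsilon.
Y = X @ beta + epsilon

# Summation form, written out observation by observation, for comparison.
Y_sum = np.array([
    sum(beta[j] * X[i, j] for j in range(k)) + epsilon[i]
    for i in range(n)
])

print(X.shape, beta.shape, Y.shape)  # (5, 3) (3,) (5,)
print(np.allclose(Y, Y_sum))
```

Note the dimensions: $$X$$ is $n \times k$, $$\beta$$ has length $$k$$, and both $$Y$$ and $$\epsilon$$ have length $$n$$.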

## References

Wasserman, L. (2004). All of Statistics. Springer Science & Business Media.