Definition

We have data of the form

\[(X_1, Y_1), \dots, (X_n, Y_n)\] where the covariate is a vector of length \(k\) \[X_i = (X_{i1}, \dots, X_{ik})\] That is, each observation \(i\) is described by \(k\) covariate values.

The multiple linear regression model is defined as

\[Y_i = \sum_{j=1}^k \beta_j X_{ij} + \epsilon_i \qquad i=1, \dots, n \] where

\[E \left( \epsilon_i \mid X_{i1}, \dots, X_{ik} \right) = 0 \]

To include an intercept, we set

\[X_{i1} = 1 \qquad i=1, \dots, n\]
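As a small sketch (with hypothetical data, using NumPy), including an intercept amounts to prepending a column of ones to the matrix of measured covariates, so that the coefficient \(\beta_1\) plays the role of the intercept:

```python
import numpy as np

# Hypothetical covariates: n = 4 observations, k - 1 = 2 measured covariates
X_raw = np.array([[2.0, 5.0],
                  [1.0, 3.0],
                  [4.0, 6.0],
                  [3.0, 2.0]])

# Prepend a column of ones so that X_{i1} = 1 for every i
X = np.column_stack([np.ones(X_raw.shape[0]), X_raw])

print(X.shape)  # (4, 3): each row is (1, X_{i2}, X_{i3})
```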

Matrix notation

The multiple linear regression model can be expressed as

\[Y = X \beta + \epsilon\]

where

\[Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}\]

\[X = \begin{bmatrix} X_{11} & X_{12} & \dots & X_{1k} \\ X_{21} & X_{22} & \dots & X_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ X_{n1} & X_{n2} &\dots & X_{nk} \\ \end{bmatrix}\]

\[\beta = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{bmatrix}\]

\[\epsilon = \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix}\]
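To make the matrix form concrete, here is a minimal simulation sketch (with hypothetical values for \(n\), \(k\), \(\beta\), and the error scale): generate data from \(Y = X\beta + \epsilon\) and recover the coefficients by ordinary least squares.

```python
import numpy as np

rng = np.random.default_rng(0)

n, k = 200, 3                      # hypothetical sample size and number of covariates
beta = np.array([1.0, -2.0, 0.5])  # hypothetical true coefficients (beta_1 is the intercept)

# Design matrix X: first column all ones (intercept), remaining columns random covariates
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])

# Errors with E(eps | X) = 0
eps = rng.normal(scale=0.1, size=n)
Y = X @ beta + eps

# Least-squares estimate of beta from (X, Y)
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(beta_hat)  # close to the true (1.0, -2.0, 0.5)
```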

References

Wasserman, L. (2004). All of Statistics: A Concise Course in Statistical Inference. Springer Science & Business Media.