\[r(\mathbf{x}) = E[Y|\mathbf{X}=\mathbf{x}] =\beta_1 + \beta_2x_2 + \dots +\beta_k x_k\]
That is, we have data of the form
\[(X_1, Y_1), \dots, (X_n, Y_n)\] where each covariate is a vector of length \(k\): \[X_i = (X_{i1}, \dots, X_{ik})\] That is, each observation \(i\) is described by \(k\) covariate values.
\[Y_i = \sum_{j=1}^k \beta_j X_{ij} + \epsilon_i \qquad i=1, \dots, n \] where
\[E\left( \epsilon_i \mid X_{i1}, \dots, X_{ik}\right) = 0 \]
To include an intercept, we set
\[X_{i1} = 1 \qquad i=1, \dots, n\]
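As an illustration of this setup, the following sketch (all names and parameter values are hypothetical, not from the text) simulates data from the model \(Y_i = \sum_{j=1}^k \beta_j X_{ij} + \epsilon_i\), fixing the first covariate at 1 so that \(\beta_1\) acts as the intercept:

```python
import numpy as np

# Hypothetical example: simulate n observations from the linear model
# Y_i = sum_j beta_j X_ij + eps_i, with X_{i1} = 1 as the intercept column.
rng = np.random.default_rng(0)
n, k = 100, 3
beta = np.array([2.0, -1.0, 0.5])   # beta_1 is the intercept coefficient

X = rng.normal(size=(n, k))
X[:, 0] = 1.0                       # X_{i1} = 1 for i = 1, ..., n
eps = rng.normal(size=n)            # mean-zero noise: E(eps_i | X) = 0
Y = X @ beta + eps                  # one response per observation
```

Each row of `X` is one covariate vector \(X_i\), and `Y` collects the \(n\) responses.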
The multiple linear regression model can be expressed as
\[Y = X \beta + \epsilon\]
where
\[Y = \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{bmatrix}\]
\[X = \begin{bmatrix} 1 & X_{12} & \dots & X_{1k} \\ 1 & X_{22} & \dots & X_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & X_{n2} &\dots & X_{nk} \\ \end{bmatrix}\]
\[\beta = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_k \end{bmatrix}\]
\[\epsilon = \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix}\]
\(X\) is called the design matrix.
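The matrix form \(Y = X\beta + \epsilon\) can be sketched numerically as follows. This is a hypothetical illustration (the estimation step is not derived in this excerpt): it builds a design matrix whose first column is all ones and recovers \(\beta\) by ordinary least squares via `np.linalg.lstsq`, which minimizes \(\lVert Y - Xb\rVert^2\) over \(b\):

```python
import numpy as np

# Hypothetical sketch: construct the design matrix X (first column of ones
# for the intercept), generate Y = X beta + eps, and estimate beta by
# least squares. Parameter values are made up for illustration.
rng = np.random.default_rng(1)
n, k = 200, 3
beta = np.array([1.0, 2.0, -0.5])

# Design matrix: column of ones, then k-1 random covariate columns.
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
Y = X @ beta + rng.normal(scale=0.1, size=n)

# Least-squares solution to Y = X b.
beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
```

With small noise, `beta_hat` should be close to the true coefficient vector, illustrating that the matrix formulation carries all \(n\) equations at once.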
Wasserman, L. (2004). \textit{All of Statistics}. Springer Science \& Business Media.