## Univariate random variables

### Definition

#### Discrete

$E[g(X)] \equiv \langle g(X) \rangle \equiv \sum_{i=1}^{\infty} g(x_i) p_i$

#### Continuous

$E[g(X)] \equiv \langle g(X) \rangle \equiv \int_{-\infty}^{\infty}g(x) f(x) dx$
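The continuous definition can be checked numerically. A minimal sketch, assuming NumPy and SciPy are available, with $g(x)=x^2$ and $X \sim N(0,1)$, where the integral should equal $V(X) + E[X]^2 = 1$:

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# E[g(X)] = integral of g(x) f(x) dx, here g(x) = x^2 and X ~ N(0, 1),
# so the result should be Var(X) + E[X]^2 = 1.
g = lambda x: x**2
expectation, _ = quad(lambda x: g(x) * norm.pdf(x), -np.inf, np.inf)
print(expectation)  # ≈ 1.0
```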

### Properties

• $$E$$ is a linear operator: $E[af(X)+bg(X)+c]=aE[f(X)]+bE[g(X)]+c$
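Linearity can be illustrated on a Monte Carlo sample (a sketch; the distribution of $X$ and the constants $a$, $b$, $c$ are arbitrary choices, not from the text):

```python
import numpy as np

# Check E[a f(X) + b g(X) + c] = a E[f(X)] + b E[g(X)] + c
# on a sample, with X ~ Exponential(1), f = sin, g(x) = x^2.
rng = np.random.default_rng(0)
x = rng.exponential(1.0, size=1_000_000)
a, b, c = 2.0, -3.0, 5.0
lhs = np.mean(a * np.sin(x) + b * x**2 + c)
rhs = a * np.mean(np.sin(x)) + b * np.mean(x**2) + c
print(lhs, rhs)  # equal up to floating point
```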

### Mean

$$g(X) = X, \quad \mu \equiv E[X]$$

### Variance

$$g(X) = (X-E[X])^2, \quad V(X)=\sigma^2 = E[(X-E[X])^2]$$

#### Properties

• $$V(X)=E[X^2]-\mu^2$$

• Demonstration

$$V(X) = E[(X-\mu)^2] = E[X^2 - 2X\mu + \mu^2]= E[X^2]-2E[X]\mu + E[\mu^2] = \\ = E[X^2] -2\mu^2 + \mu^2 = E[X^2] - \mu^2$$

• $$V(aX+b)=a^2 V(X)$$

• Demonstration

$$V(aX+b) = E[ (aX+b - E[aX+b])^2 ] = E[ (aX+b - aE[X]-b)^2 ] = \\ = E[ (aX- aE[X])^2 ] = E[ a^2 (X-E[X])^2] = a^2 E[(X-E[X])^2]=a^2V(X)$$
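Both variance properties can be verified on a sample. A minimal sketch, assuming NumPy is available; the uniform distribution and the constants $a$, $b$ are arbitrary choices:

```python
import numpy as np

# Check V(X) = E[X^2] - mu^2 and V(aX + b) = a^2 V(X) on a sample,
# with X ~ Uniform(0, 1), so V(X) = 1/12.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=500_000)
a, b = 3.0, 7.0
var_direct = np.mean((x - x.mean()) ** 2)     # E[(X - mu)^2]
var_shortcut = np.mean(x**2) - x.mean() ** 2  # E[X^2] - mu^2
var_affine = np.var(a * x + b)                # V(aX + b)
print(var_direct, var_shortcut)       # both ≈ 1/12
print(var_affine, a**2 * var_direct)  # equal up to floating point
```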

### n-th central moment

$$g(X) = (X-E(X))^n$$

### Probability as an expectation value

If $$r(x)= I_A(x)$$, where $$I_A(x)=1$$ when $$x \in A$$ and $$I_A(x)=0$$ when $$x \notin A$$, then

$E[I_A(X)] = \int_{-\infty}^{\infty} I_A(x) f_X(x) \, dx = \int_A f_X(x) \, dx = P(X \in A)$
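The identity says the sample mean of an indicator estimates a probability. A Monte Carlo sketch, assuming NumPy and SciPy are available, with $X \sim N(0,1)$ and $A=[-1,1]$:

```python
import numpy as np
from scipy.stats import norm

# P(X in A) = E[I_A(X)]: the sample mean of the indicator of
# A = [-1, 1] estimates P(-1 <= X <= 1) ≈ 0.6827 for X ~ N(0, 1).
rng = np.random.default_rng(3)
x = rng.normal(size=1_000_000)
p_mc = np.mean((x >= -1) & (x <= 1))   # E[I_A(X)] estimate
p_exact = norm.cdf(1) - norm.cdf(-1)   # exact P(X in A)
print(p_mc, p_exact)
```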

### Examples

#### Discrete

##### Poisson random variable

$$E[X] =\lambda; \, Var[X] = \lambda$$

##### Binomial random variable

$$E[X] = n p; \, Var[X] = n p (1 - p)$$

#### Continuous

##### Normal random variable

$$E[X] =\mu; \, Var[X] = \sigma^{2}$$
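The three examples can be confirmed with SciPy's distribution objects (a sketch; the parameter values are arbitrary choices):

```python
from scipy.stats import poisson, binom, norm

# Means and variances of the three example distributions:
# Poisson(lam): lam, lam; Binomial(n, p): np, np(1-p); N(mu, sigma): mu, sigma^2.
lam, n, p, mu, sigma = 4.0, 10, 0.3, 1.5, 2.0
pois_mean, pois_var = poisson.stats(lam, moments="mv")
bin_mean, bin_var = binom.stats(n, p, moments="mv")
norm_mean, norm_var = norm.stats(loc=mu, scale=sigma, moments="mv")
print(pois_mean, pois_var)  # 4.0 4.0
print(bin_mean, bin_var)    # 3.0 2.1
print(norm_mean, norm_var)  # 1.5 4.0
```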

## Multivariate random variables

### Definition

#### Continuous

$E[g(X_1, \dotsc, X_n)] \equiv \langle g(X_1, \dotsc, X_n) \rangle \equiv \int_{-\infty}^{\infty}f(x_1, \dotsc, x_n) g(x_1, \dotsc, x_n) \, dx_1 \dotsc dx_n$

### Properties

• $$E[\sum_i a_i X_i] = \sum_i a_i E[X_i]$$

• If $$X_1,\dotsc,X_n$$ are identically distributed $$E[\sum_{i=1}^n g(X_i)]= n E[g(X_1)]$$

Demonstration

• $$\int_{-\infty}^{\infty} \dotsc \int_{-\infty}^{\infty} \left( \sum_{i=1}^n g(x_i) \right) f(x_1,\dotsc,x_n) \, dx_1 \dotsc dx_n = \\ =\int_{-\infty}^{\infty} g(x_1) \left[ \int_{-\infty}^{\infty} \dotsc \int_{-\infty}^{\infty} f(x_1,\dotsc,x_n) dx_2 \dotsc dx_n \right] \, dx_1 + \dotsc + \int_{-\infty}^{\infty} g(x_n) \left[ \int_{-\infty}^{\infty} \dotsc \int_{-\infty}^{\infty} f(x_1,\dotsc,x_n) dx_1 \dotsc dx_{n-1} \right] \, dx_n = \\ = \int_{-\infty}^{\infty} g(x_1) f_{X_1}(x_1) \, dx_1 + \dotsc + \int_{-\infty}^{\infty} g(x_n) f_{X_n}(x_n) \, dx_n = n \int_{-\infty}^{\infty} g(x_1) f_{X_1}(x_1) \, dx_1 = n E[g(X_1)]$$
• If $$X_1,\dotsc,X_n$$ are independent $$E[\prod_i X_i] = \prod_i E[X_i]$$

• If $$X_1,\dotsc,X_n$$ are independent $$V(\sum_i a_i X_i)= \sum_i a_i^2 V(X_i)$$
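The two independence properties can be checked on independently generated samples (a sketch; the distributions and coefficients are arbitrary choices):

```python
import numpy as np

# For independent X ~ Exp(1) and Y ~ N(0, 1):
# E[XY] = E[X] E[Y], and V(aX + bY) = a^2 V(X) + b^2 V(Y).
rng = np.random.default_rng(4)
x = rng.exponential(1.0, size=1_000_000)
y = rng.normal(size=1_000_000)
a, b = 2.0, 3.0
e_prod = np.mean(x * y)          # E[XY]
e_e = np.mean(x) * np.mean(y)    # E[X] E[Y]
v_sum = np.var(a * x + b * y)    # V(aX + bY)
v_parts = a**2 * np.var(x) + b**2 * np.var(y)
print(e_prod, e_e)      # both ≈ 1 * 0 = 0
print(v_sum, v_parts)   # both ≈ 4 * 1 + 9 * 1 = 13
```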

### Definition of covariance

$cov(X,Y)= E[(X-\mu_X)(Y-\mu_Y)]$

#### Properties

• $$cov(X,Y)=E[XY]-E[X]E[Y]$$

• If $$X$$ and $$Y$$ are independent, then $$cov(X,Y)=0$$

• $$V(X+Y)=V(X)+V(Y)+2cov(X,Y)$$

• $$V(X-Y)=V(X)+V(Y)-2cov(X,Y)$$

• $$V(\sum_i a_i X_i) = \sum_i a_i^2 V(X_i) + 2\sum_{i<j}a_ia_j \, cov(X_i,X_j)$$
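The shortcut formula and the variance-of-a-sum identity can be verified on a correlated pair (a sketch; $Y = 0.5X + \varepsilon$ is an arbitrary construction giving $cov(X,Y)=0.5$):

```python
import numpy as np

# Check cov(X, Y) = E[XY] - E[X]E[Y] and
# V(X + Y) = V(X) + V(Y) + 2 cov(X, Y) on a correlated sample.
rng = np.random.default_rng(5)
x = rng.normal(size=200_000)
y = 0.5 * x + rng.normal(size=200_000)  # cov(X, Y) = 0.5 by construction
cov_xy = np.mean(x * y) - x.mean() * y.mean()  # E[XY] - E[X]E[Y]
v_lhs = np.var(x + y)
v_rhs = np.var(x) + np.var(y) + 2 * cov_xy
print(cov_xy)        # ≈ 0.5
print(v_lhs, v_rhs)  # equal up to floating point
```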

### Definition of correlation

$\rho(X,Y) = \frac{cov(X,Y)}{\sigma_X \sigma_Y}$

#### Properties

• $$-1 \leq \rho \leq 1$$

• If $$Y = aX + b$$ then $$\rho=1$$ if $$a>0$$ and $$\rho=-1$$ if $$a<0$$
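The linear case gives correlation exactly $\pm 1$, which is easy to see numerically (a sketch; the slopes and intercepts are arbitrary choices, with $a \neq 0$):

```python
import numpy as np

# rho(X, aX + b) is +1 for a > 0 and -1 for a < 0.
rng = np.random.default_rng(6)
x = rng.normal(size=10_000)
rho_pos = np.corrcoef(x, 2.0 * x + 1.0)[0, 1]   # a = 2 > 0
rho_neg = np.corrcoef(x, -3.0 * x + 1.0)[0, 1]  # a = -3 < 0
print(rho_pos, rho_neg)  # ≈ 1.0, ≈ -1.0
```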

### Definition of variance-covariance matrix $$\Sigma$$

$\left( \begin{array}{cccc} V(X_1) & cov(X_1,X_2) & \dotsc & cov(X_1,X_n) \\ cov(X_2,X_1) & V(X_2) & \dotsc & cov(X_2,X_n)\\ \dotsc & \dotsc & \dotsc & \dotsc \\ cov(X_n,X_1) & cov(X_n,X_2) & \dotsc & V(X_n)\end{array} \right)$

#### Properties

If $$a$$ is a vector, $$A$$ is a matrix and $$X$$ is a random vector with mean $$\mu$$ and variance-covariance matrix $$\Sigma$$, then

• $$E[a^T X] = a^T \mu$$

• $$V[a^T X] = a^T \Sigma a$$

• $$E[AX]=A\mu$$

• $$V(AX)=A \Sigma A^T$$
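The quadratic-form property $V[a^T X] = a^T \Sigma a$ can be checked against a sample covariance matrix (a sketch; the 3-dimensional Gaussian and the vector $a$ are arbitrary choices):

```python
import numpy as np

# For a random vector X with covariance Sigma, V(a^T X) = a^T Sigma a.
# Here a^T Sigma a = 3.775 for the chosen Sigma and a.
rng = np.random.default_rng(7)
Sigma = np.array([[2.0, 0.5, 0.0],
                  [0.5, 1.0, 0.3],
                  [0.0, 0.3, 1.5]])
X = rng.multivariate_normal([0.0, 0.0, 0.0], Sigma, size=500_000)
a = np.array([1.0, -2.0, 0.5])
Sigma_hat = np.cov(X, rowvar=False)    # sample covariance matrix
v_lin = np.var(X @ a, ddof=1)          # V(a^T X), sample version
v_quad = a @ Sigma_hat @ a             # a^T Sigma_hat a
print(v_lin, v_quad)  # equal up to floating point, both ≈ 3.775
```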

### Conditional expectation

• Discrete $$E(X|Y=y) = \sum_x x \, f_{X|Y}(x|y)$$
• Continuous $$E(X|Y=y) = \int x \, f_{X|Y}(x|y) \, dx$$

#### Expected values of conditional expectations

Given the random variable $$E(Y|X)$$

• $$E[E[Y|X]]=E[Y]$$

• $$E[E[r(X,Y)|X]]=E[r(X,Y)]$$
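The tower property $E[E[Y|X]]=E[Y]$ can be checked with a hierarchical simulation (a sketch; the model $X \sim \mathrm{Uniform}(0,1)$, $Y|X=x \sim N(x,1)$ is an arbitrary choice, for which $E[Y|X]=X$ and $E[Y]=E[X]=1/2$):

```python
import numpy as np

# X ~ Uniform(0, 1), Y | X = x ~ N(x, 1), so E[Y|X] = X and
# E[E[Y|X]] = E[X] = 1/2 should match E[Y].
rng = np.random.default_rng(8)
x = rng.uniform(size=1_000_000)
y = rng.normal(loc=x, scale=1.0)  # one Y draw per X draw
mean_x = np.mean(x)  # estimates E[E[Y|X]] = E[X]
mean_y = np.mean(y)  # estimates E[Y]
print(mean_x, mean_y)  # both ≈ 0.5
```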

## References

Wasserman, L. (2013). All of statistics: A concise course in statistical inference. Springer Science & Business Media.