Definition of random variable

A random variable \(X\) is a function that maps outcomes in the sample space \(\Omega\) to real numbers \[ X: \Omega \rightarrow \mathbb{R}\]

For example, \(X(\text{correct})=x_c=1\) and \(X(\text{incorrect})=x_{inc}=0\).

Definition of cumulative distribution function F

\[F(x)=P(X \leq x)\]

The distribution of a random variable is also called a population.

Properties of F

\[ P(a < X \leq b) = F(b) - F(a)\]

\[ P(X > x) = 1 - F(x)\]

Definition of probability mass function f

\[f(x)=P(X=x)\]

F from f

\[F(x)=\sum_{x_i \leq x} f(x_i)\]
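In R the cumulative distribution function of a discrete variable is the cumulative sum of its mass function. A quick sketch using the \(Binomial(5, 0.5)\) distribution (defined later in these notes):

cumsum(dbinom(0:5, 5, 0.5))  # F built from f
## [1] 0.03125 0.18750 0.50000 0.81250 0.96875 1.00000
pbinom(0:5, 5, 0.5)          # R's built-in F agrees
## [1] 0.03125 0.18750 0.50000 0.81250 0.96875 1.00000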

Definition of continuous random variable and probability density function

\(X\) is continuous if there exists a function \(f\) such that

\[f(x) \geq 0 \quad \forall x\] \[\int_{-\infty}^{\infty}f(x) dx = 1\] \[P(a \leq X \leq b) = \int_{a}^{b} f(x) dx\]

\(f\) is called the probability density function.
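For example, in R we can check numerically that the standard normal density (defined later in these notes) integrates to 1:

# the standard normal density dnorm satisfies the second condition
integrate(dnorm, -Inf, Inf)
## 1 with absolute error < 9.4e-05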

We can associate a pdf with a discrete random variable using the Dirac delta \[f(x)=\sum_{i=1}^{n}\delta (x-x_i) p_i\]

F from f

Given the definitions of \(F\) and \(f\) for continuous random variables, it follows that

\[F(x) = \int_{-\infty}^{x} f(t) dt\]

and by the fundamental theorem of calculus

\[ F'(x) = f(x)\]
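A numerical sketch of this relation for the standard normal, using a finite-difference approximation to the derivative of pnorm (the step size h is an arbitrary choice):

h <- 1e-6
(pnorm(1 + h) - pnorm(1)) / h  # approximately F'(1)
dnorm(1)                       # f(1)
## [1] 0.2419707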

Definition of inverse cumulative distribution function or quantile function

\(F^{-1}(q)\) is the unique real \(x\) such that \(F(x)=q\) (when \(F\) is continuous and strictly increasing; in general one takes the smallest such \(x\)).
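In R the quantile functions are the q* functions; for example, qnorm is \(\Phi^{-1}\) and inverts pnorm:

# the quantile function inverts the cumulative distribution function
pnorm(qnorm(0.3))
## [1] 0.3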

Examples of discrete random variables

The following examples correspond to parameterized families of distributions. Once we fix the parameters, we have a specific distribution from the family.

Bernoulli

\(f_k = p^k(1-p)^{1-k} \quad k \in \{0, 1\}\)
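In R the Bernoulli mass function can be evaluated with dbinom and size = 1 (the value p = 0.7 is an arbitrary choice):

# Bernoulli(0.7) pmf at k = 0 and k = 1
dbinom(c(0, 1), size = 1, prob = 0.7)
## [1] 0.3 0.7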

Binomial

\(f_k = \binom{n}{k}p^k(1-p)^{n-k} \quad k = 0, 1, \dotsc, n\)

Properties

If \(X_1 \sim Binomial(n_1,p)\) and \(X_2 \sim Binomial(n_2,p)\) are independent, then \(X_1 + X_2 \sim Binomial(n_1 + n_2, p)\)
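A minimal simulation sketch of this property (n1, n2, p, and the number of draws are arbitrary choices):

# sum of two independent binomials with the same p
n1 <- 10; n2 <- 5; p <- 0.3
x <- rbinom(10000, n1, p) + rbinom(10000, n2, p)
mean(x == 4)           # empirical P(X1 + X2 = 4)
dbinom(4, n1 + n2, p)  # theoretical value under Binomial(n1 + n2, p)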

Poisson

\(f_k = \frac{\lambda^k}{k!}e^{-\lambda} \quad k = 0, 1, 2, \dotsc\)
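In R the mass function is dpois; we can check it against the formula above (lambda = 2 and k = 3 are arbitrary choices):

dpois(3, lambda = 2)
## [1] 0.180447
2^3 / factorial(3) * exp(-2)
## [1] 0.180447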

Examples of continuous random variables

Normal (\(X \sim N(\mu,\sigma^2)\))

\(f(x)=\frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}\)

When \(\mu = 0\) and \(\sigma = 1\) we say that \(X\) has a standard normal distribution. The random variable is usually called \(Z\), the probability density function \(\varphi\), and the cumulative distribution function \(\Phi\).

Properties

If \(X \sim N(\mu, \sigma^2)\), then \(Z=\frac{X-\mu}{\sigma} \sim N(0,1)\).

If \(Z \sim N(0, 1)\), then \(X=\mu + Z \sigma \sim N(\mu, \sigma^2)\).

If \(X_i \sim N(\mu_i, \sigma_i^2)\) are independent, then \(\sum_{i} X_i \sim N (\sum_{i} \mu_i, \sum_{i} \sigma_i^2)\)

\(P(a < X < b) = P(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}) = \Phi( \frac{b - \mu}{\sigma}) - \Phi(\frac{a - \mu}{\sigma})\)

Example 1

Find \(P(X>2)\) if \(X \sim N(4,5)\).

\(P(X>2) = 1 - F(2) = 1 - \Phi(\frac{2 - 4}{\sqrt{5}})\)

1 - pnorm((2-4)/sqrt(5))
## [1] 0.8144533

Given that pnorm also accepts \(\mu\) and \(\sigma\), we can do it more directly

1 - pnorm(2, 4, sqrt(5))
## [1] 0.8144533

or

pnorm(2, 4, sqrt(5), lower.tail = FALSE)
## [1] 0.8144533

Example 2

Find \(F^{-1}(0.3)\) if \(X \sim N(2,6)\).

Let \(Z=\Phi^{-1}(0.3)\); then, by the standardization property above,

\(F^{-1}(0.3) = \mu + Z \sigma\)

2 + qnorm(.3) * sqrt(6)
## [1] 0.7154863

Given that qnorm also accepts \(\mu\) and \(\sigma\), we can do it more directly

qnorm(.3, 2, sqrt(6)) 
## [1] 0.7154863

Example 3

For a normal distribution, the probability of falling within \(2\sigma\) of the mean is about 0.95. For example, for the standard normal distribution

1 - 2 * pnorm(-2)
## [1] 0.9544997

\(t\) distribution (\(X \sim t_\nu\))

\(f(x)=\frac{\Gamma(\frac{\nu + 1}{2})}{\sqrt{\nu \pi}\, \Gamma(\frac{\nu}{2})} \frac{1}{\left(1+\frac{x^2}{\nu}\right)^{\frac{\nu + 1}{2}}}\)

where \(\nu\) is called the degrees of freedom.

Properties

  • For \(\nu \rightarrow \infty\), the \(t\) distribution approaches the standard normal distribution (see the sketch after this list).

  • \(\frac{\overline{X}_n - \mu}{\widehat{se}}\) is distributed \(t\) (\(\overline{X}_n\) and \(\widehat{se}\) are random variables defined later)
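A minimal sketch of the first property, comparing the \(t\) density for a large \(\nu\) with the standard normal density over a grid (the grid and \(\nu = 1000\) are arbitrary choices):

# the t density approaches the standard normal density as nu grows
x <- seq(-4, 4, 0.1)
max(abs(dt(x, df = 1000) - dnorm(x)))  # close to 0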

Cauchy distribution

\(f(x)=\frac{1}{\pi (1 + x^2)}\)

Properties

It is the \(t\) distribution for \(\nu=1\).
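We can verify this in R (the evaluation point is an arbitrary choice):

# the Cauchy density equals the t density with 1 degree of freedom
dcauchy(2)
## [1] 0.06366198
dt(2, df = 1)
## [1] 0.06366198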

\(\chi^2\) distribution (\(X \sim \chi_p^2\))

\(f(x)=\frac{1}{\Gamma(p/2)2^{p/2}}x^{p/2-1}e^{-x/2} \quad x > 0\)

Properties

If \(Z_1,\dotsc,Z_p \sim N(0,1)\) are independent, then \(\sum_{i=1}^{p} Z_i^2 \sim \chi_p^2\).
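A minimal simulation sketch (p = 3 and the number of draws are arbitrary choices):

# sums of p squared standard normals compared with the chi-squared CDF
p <- 3
x <- colSums(matrix(rnorm(p * 10000), nrow = p)^2)
mean(x <= 2)       # empirical P(X <= 2)
pchisq(2, df = p)  # theoretical value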

\(F\) distribution (\(X \sim F\))

\(f(x)=\frac{\sqrt{\frac{ \left( \nu_1 x \right)^{\nu_1} \nu_2^{\nu_2}}{\left( \nu_1 \, x + \nu_2 \right)^{\nu_1+\nu_2}} }}{x \, B \left( \frac{\nu_1}{2} , \frac{\nu_2}{2} \right)}\)

Properties

If \(X_1\) is distributed \(\chi^2\) with \(\nu_1\) degrees of freedom and \(X_2\) is distributed \(\chi^2\) with \(\nu_2\) degrees of freedom and they are independent, then \(F =\frac{X_1 / \nu_1}{X_2 / \nu_2}\) is distributed \(F\), that is, with the \(f(x)\) above.
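A minimal simulation sketch (the degrees of freedom and the number of draws are arbitrary choices):

# ratio of independent chi-squared variables, each divided by its df
nu1 <- 3; nu2 <- 10
f <- (rchisq(10000, nu1) / nu1) / (rchisq(10000, nu2) / nu2)
mean(f <= 1)     # empirical P(F <= 1)
pf(1, nu1, nu2)  # theoretical value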

Bivariate distributions

Definition of joint probability mass function

If X and Y are discrete random variables

\[f(x,y) = P(X = x, Y = y)\]

Definition of joint probability density function

\[f(x,y) \geq 0 \quad \forall x, y\] \[\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x,y) dx dy = 1\] \[P((X,Y) \in A) = \int \int_{A} f(x,y) dx dy\]

Definition of joint cumulative distribution function

\[ F(x, y) = P(X \leq x, Y \leq y) \]

Definition of marginal mass functions

If X and Y are discrete random variables

\[ f_X(x) = P(X=x)=\sum_y P(X=x, Y=y)= \sum_y f(x,y)\] \[ f_Y(y) = P(Y=y)=\sum_x P(X=x, Y=y)= \sum_x f(x,y)\]
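A small numerical sketch in R (the joint table is made up for illustration):

# a made-up joint pmf: X indexes the rows, Y the columns
f <- matrix(c(0.1, 0.2, 0.3, 0.4), nrow = 2)
rowSums(f)  # marginal f_X
## [1] 0.4 0.6
colSums(f)  # marginal f_Y
## [1] 0.3 0.7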

Definition of marginal probability density functions

If X and Y are continuous random variables

\[ f_X(x) = \int_{-\infty}^{\infty} f(x,y) dy \] \[ f_Y(y) = \int_{-\infty}^{\infty} f(x,y) dx \]

Definition of independence

X and Y are independent random variables if, for all sets \(A\) and \(B\),

\[P(X \in A, Y \in B)=P(X \in A)P(Y \in B)\]

Definition of conditional probability mass functions

\[f_{X|Y}(x|y) = P(X=x|Y=y)=\frac{P(X=x,Y=y)}{P(Y=y)}=\frac{f_{X,Y}(x,y)}{f_Y(y)}\]
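Continuing the made-up joint table from above, conditioning on the first value of Y amounts to renormalizing that column by the marginal:

# conditional pmf of X given the first value of Y
f[, 1] / sum(f[, 1])
## [1] 0.3333333 0.6666667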

Definition of conditional probability density functions

\[ f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)}\]

Multivariate distributions and iid samples

Definition of random vector

\[ X=(X_1,X_2,\dotsc,X_n)\]

Definition of independence

\(X_1,\dotsc,X_n\) are independent if

\[P(X_1 \in A_1, \dotsc, X_n \in A_n)= \prod_{i=1}^{n} P(X_i \in A_i)\]

or equivalently

\[f(x_1,\dotsc,x_n)= \prod_{i=1}^{n} f_{X_i}(x_i)\]

Definition of iid samples

\(X_1,\dotsc,X_n\) are iid (independent and identically distributed) if

  • \(X_1,\dotsc,X_n\) are independent

  • \(X_1,\dotsc,X_n\) have the same marginal cumulative distribution function \(F\) (or \(f\))

We say that \(X_1,\dotsc,X_n \sim F\)

and call \(X_1,\dotsc,X_n\) a random sample of size n from \(F\). Much of statistics deals with random samples.
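In R the r* functions draw iid samples; for example, a random sample of size 10 from the standard normal distribution:

# 10 iid draws from N(0, 1)
x <- rnorm(10)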

Examples

Multinomial

Multivariate normal
