Definition of random variable

A random variable $$X$$ is a function that maps each outcome in the sample space $$\Omega$$ to a real number, $X: \Omega \rightarrow \mathbb{R}$

For example, $$X(\text{correct})=x_c=1$$ and $$X(\text{incorrect})=x_{inc}=0$$.

Definition of cumulative distribution function F

$F(x)=P(X \leq x)$

The distribution of a random variable is also called a population.

Properties of F

$P(a < X \leq b) = F(b) - F(a)$

$P(X > x) = 1 - F(x)$
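Both properties can be checked numerically. The following is a minimal sketch that assumes a binomial random variable, with R's pbinom playing the role of $$F$$ and dbinom giving the point probabilities; the parameters and bounds are arbitrary values chosen only for illustration.

```r
# Illustrative parameters (assumptions, not taken from the text)
n <- 10; p <- 0.3
a <- 2; b <- 5

# P(a < X <= b) by summing point probabilities for k = a + 1, ..., b
direct <- sum(dbinom((a + 1):b, n, p))
# The same probability from the cumulative distribution function
from_cdf <- pbinom(b, n, p) - pbinom(a, n, p)
all.equal(direct, from_cdf)

# P(X > a) by summing point probabilities for k = a + 1, ..., n
upper_direct <- sum(dbinom((a + 1):n, n, p))
# The same probability as 1 - F(a)
upper_from_cdf <- 1 - pbinom(a, n, p)
all.equal(upper_direct, upper_from_cdf)
```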

Definition of probability mass function f

$f(x)=P(X=x)$

F from f

$F(x)=\sum_{x_i \leq x} f(x_i)$
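As a small illustration, the cumulative distribution function of a discrete random variable is a running sum of its mass function. The sketch below assumes a fair six-sided die, which is a hypothetical example and not part of the text.

```r
# Fair six-sided die: an illustrative discrete random variable
x <- 1:6
pmf <- rep(1 / 6, 6)   # f(x_i) for each value x_i
cdf <- cumsum(pmf)     # F(x_i) = sum of f(x_j) over all x_j <= x_i
data.frame(x = x, f = pmf, F = cdf)
```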

Definition of continuous random variable and probability density function

$$X$$ is continuous if there exists a function $$f$$ such that, for all $$a \leq b$$,

$f(x) \geq 0 \quad \forall x$ $\int_{-\infty}^{\infty}f(x) dx = 1$ $P(a \leq X \leq b) = \int_{a}^{b} f(x) dx$

$$f$$ is called the probability density function.
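The three conditions can be checked numerically for a known density. This sketch assumes the standard normal density (dnorm in R) as the example and uses R's integrate for the integrals; the interval from -1 to 1 is an arbitrary choice.

```r
# Non-negativity on a grid of points (a spot check, not a proof)
all(dnorm(seq(-5, 5, by = 0.1)) >= 0)

# Total area under the density, which should be close to 1
total <- integrate(dnorm, -Inf, Inf)$value

# P(a <= X <= b) as the area under f between a and b
a <- -1; b <- 1
area <- integrate(dnorm, a, b)$value
c(total = total, area = area)
```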

We can also associate a pdf with a discrete random variable that takes the values $$x_1,\dotsc,x_n$$ with probabilities $$p_i = P(X=x_i)$$ by using the Dirac delta function $f(x)=\sum_{i=1}^{n}\delta (x-x_i) p_i$

F from f

Given the definitions of $$F$$ and $$f$$ for continuous random variables, it follows that

$F(x) = \int_{-\infty}^{x} f(t) dt$

and by the fundamental theorem of calculus

$F'(x) = f(x)$
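The relation between $$F$$ and $$f$$ can be illustrated with a finite-difference approximation of the derivative. The sketch assumes the standard normal distribution (pnorm and dnorm in R); the point x and the step h are arbitrary.

```r
x <- 0.7     # point at which to compare (arbitrary)
h <- 1e-5    # small step for the central difference

# Numerical derivative of the cumulative distribution function
slope <- (pnorm(x + h) - pnorm(x - h)) / (2 * h)
# Density at the same point
dens <- dnorm(x)
c(slope = slope, density = dens)
```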

Definition of inverse cumulative distribution function or quantile function

If $$F$$ is continuous and strictly increasing, then for $$0 < q < 1$$, $$F^{-1}(q)$$ is the unique real $$x$$ such that $$F(x)=q$$.
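In R the quantile functions start with q and the cumulative distribution functions with p, so the two undo each other. A short sketch, assuming the standard normal distribution as the example:

```r
q <- 0.3
x <- qnorm(q)            # the x such that F(x) = q
back <- pnorm(x)         # applying F recovers q
round_trip <- qnorm(pnorm(1.3))   # applying F and then its inverse recovers 1.3
c(x = x, back = back, round_trip = round_trip)
```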

Examples of discrete random variables

The following examples correspond to parameterized families of random variables. Once we fix the parameters, we have a specific random variable of the family.

Bernoulli

$$f_k = p^k(1-p)^{1-k}$$

Binomial

$$f_k = \binom{n}{k}p^k(1-p)^{n-k}$$

Properties

If $$X_1$$ is distributed binomially $$(X_1 \sim Binomial(n_1,p))$$, $$X_2 \sim Binomial(n_2,p)$$, and $$X_1$$ and $$X_2$$ are independent, then $$X_1 + X_2 \sim Binomial(n_1 + n_2, p)$$
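This property can be verified exactly for small cases, because the mass function of a sum of two independent discrete variables is the convolution of their mass functions. The values of n1, n2 and p below are arbitrary assumptions for the sketch.

```r
n1 <- 3; n2 <- 4; p <- 0.4
f1 <- dbinom(0:n1, n1, p)   # mass function of X1
f2 <- dbinom(0:n2, n2, p)   # mass function of X2

# P(X1 + X2 = s) = sum over k of P(X1 = k) * P(X2 = s - k)
f_sum <- sapply(0:(n1 + n2), function(s) {
  k <- max(0, s - n2):min(n1, s)   # feasible values of X1
  sum(f1[k + 1] * f2[s - k + 1])   # +1 because R vectors are 1-indexed
})

all.equal(f_sum, dbinom(0:(n1 + n2), n1 + n2, p))
```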

Poisson

$$f_k = \frac{\lambda^k}{k!}e^{-\lambda}$$
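The three mass functions above can be compared with R's built-in dbinom and dpois (a Bernoulli variable is a binomial with n = 1). The parameter values are arbitrary choices for this sketch.

```r
p <- 0.3; n <- 8; lambda <- 2.5

# Bernoulli: k is 0 or 1
k01 <- 0:1
bern <- p^k01 * (1 - p)^(1 - k01)
all.equal(bern, dbinom(k01, 1, p))

# Binomial: k = 0, ..., n
k <- 0:n
binom <- choose(n, k) * p^k * (1 - p)^(n - k)
all.equal(binom, dbinom(k, n, p))

# Poisson: evaluated here only for k = 0, ..., n
pois <- lambda^k / factorial(k) * exp(-lambda)
all.equal(pois, dpois(k, lambda))
```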

Examples of continuous random variables

Normal ($$X \sim N(\mu,\sigma^2)$$)

$$f(x)=\frac{1}{\sqrt{2\pi\sigma^2}}e^{-\frac{(x-\mu)^2}{2\sigma^2}}$$

When $$\mu = 0$$ and $$\sigma = 1$$ we say that $$X$$ has a standard normal distribution. The random variable is usually called $$Z$$, the probability density function $$\varphi$$ and the cumulative distribution function $$\Phi$$.

Properties

If $$X \sim N(\mu, \sigma^2)$$, then $$Z=\frac{X-\mu}{\sigma} \sim N(0,1)$$.

If $$Z \sim N(0, 1)$$, then $$X=\mu + Z \sigma \sim N(\mu, \sigma^2)$$.

If $$X_i \sim N(\mu_i, \sigma_i^2)$$ are independent, then $$\sum_{i} X_i \sim N (\sum_{i} \mu_i, \sum_{i} \sigma_i^2)$$

$$P(a < X < b) = P(\frac{a - \mu}{\sigma} < Z < \frac{b - \mu}{\sigma}) = \Phi( \frac{b - \mu}{\sigma}) - \Phi(\frac{a - \mu}{\sigma})$$

Example 1

Find $$P(X>2)$$ if $$X \sim N(4,5)$$, that is, $$\mu = 4$$ and $$\sigma^2 = 5$$.

$$P(X>2) = 1 - F(2) = 1 - \Phi(\frac{2 - 4}{\sqrt{5}})$$

1 - pnorm((2-4)/sqrt(5))
## [1] 0.8144533

Because pnorm also accepts the mean $$\mu$$ and the standard deviation $$\sigma$$ as arguments, we can compute it more directly

1 - pnorm(2, 4, sqrt(5))
## [1] 0.8144533

or

pnorm(2, 4, sqrt(5), lower.tail = FALSE)
## [1] 0.8144533

Example 2

Find $$F^{-1}(0.3)$$ if $$X \sim N(2,6)$$, that is, $$\mu = 2$$ and $$\sigma^2 = 6$$.

$$Z=\Phi^{-1}(0.3)$$

$$F^{-1}(0.3) = X = \mu + Z \sigma$$

2 + qnorm(.3) * sqrt(6)
## [1] 0.7154863

Because qnorm also accepts the mean $$\mu$$ and the standard deviation $$\sigma$$ as arguments, we can compute it more directly

qnorm(.3, 2, sqrt(6)) 
## [1] 0.7154863

Example 3

About 95% of the probability of a normal distribution lies within two standard deviations of the mean, that is, between $$\mu - 2\sigma$$ and $$\mu + 2\sigma$$. For example, for the standard normal distribution

1 - 2 * pnorm(-2)
## [1] 0.9544997

$$t$$ distribution ($$X \sim t_\nu$$)

$$f(x)=\frac{\Gamma(\frac{\nu + 1}{2})}{\sqrt{\nu\pi}\,\Gamma(\frac{\nu}{2})} \frac{1}{\left(1+\frac{x^2}{\nu}\right)^{\frac{\nu + 1}{2}}}$$

where $$\nu$$ is called the degrees of freedom
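As a check, the $$t$$ density can be written out by hand and compared with R's dt. The sketch uses the standard form of the density, with normalizing constant $$\Gamma(\frac{\nu+1}{2}) / (\sqrt{\nu\pi}\,\Gamma(\frac{\nu}{2}))$$ and kernel $$(1 + x^2/\nu)^{-(\nu+1)/2}$$; the degrees of freedom and the grid of x values are arbitrary.

```r
nu <- 5                        # degrees of freedom (arbitrary)
x <- seq(-3, 3, by = 0.5)      # points at which to evaluate the density

# Density of the t distribution written out explicitly
t_density <- gamma((nu + 1) / 2) / (sqrt(nu * pi) * gamma(nu / 2)) *
  (1 + x^2 / nu)^(-(nu + 1) / 2)

all.equal(t_density, dt(x, df = nu))
```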

Properties

As $$\nu \rightarrow \infty$$, the $$t$$ distribution converges to the standard normal distribution.

Cauchy distribution

$$f(x)=\frac{1}{\pi (1 + x^2)}$$

Properties

It is the $$t$$ distribution for $$\nu=1$$.
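The two special cases of the $$t$$ family, $$\nu = 1$$ (Cauchy) and large $$\nu$$ (approximately standard normal), can be seen numerically. In this sketch the grid of x values and the choice of 10000 degrees of freedom as "large" are assumptions.

```r
x <- seq(-3, 3, by = 0.5)

# nu = 1: the t density coincides with the Cauchy density
cauchy_by_hand <- 1 / (pi * (1 + x^2))
all.equal(dt(x, df = 1), cauchy_by_hand)
all.equal(dt(x, df = 1), dcauchy(x))

# Large nu: the t density is very close to the standard normal density
max_gap <- max(abs(dt(x, df = 1e4) - dnorm(x)))
max_gap
```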

$$\chi^2$$ distribution ($$X \sim \chi_p^2$$)

$$f(x)=\frac{1}{\Gamma(p/2)2^{p/2}}x^{p/2-1}e^{-x/2}$$

Properties

If $$Z_i \sim N(0,1)$$ are independent, then $$\sum_{i=1}^{p} Z_i^2 \sim \chi_p^2$$.
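For $$p = 1$$ the property can be checked without simulation, because $$P(Z^2 \leq x) = P(-\sqrt{x} \leq Z \leq \sqrt{x}) = 2\Phi(\sqrt{x}) - 1$$. The sketch below also compares the density formula with R's dchisq; the x values and the choice p = 4 are arbitrary.

```r
x <- c(0.5, 1, 2, 4)

# p = 1: distribution of Z^2 obtained from the standard normal cdf
cdf_from_normal <- 2 * pnorm(sqrt(x)) - 1
all.equal(pchisq(x, df = 1), cdf_from_normal)

# Density formula written out explicitly for p = 4
p <- 4
dens <- x^(p / 2 - 1) * exp(-x / 2) / (gamma(p / 2) * 2^(p / 2))
all.equal(dens, dchisq(x, df = p))
```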

$$F$$ distribution ($$X \sim F_{\nu_1,\nu_2}$$)

$$f(x)=\frac{\sqrt{\frac{ \left( \nu_1 x \right)^{\nu_1} \nu_2^{\nu_2}}{\left( \nu_1 \, x + \nu_2 \right)^{\nu_1+\nu_2}} }}{x \, B \left( \frac{\nu_1}{2} , \frac{\nu_2}{2} \right)}$$

Properties

If $$X_1$$ is distributed $$\chi^2$$ with $$\nu_1$$ degrees of freedom, $$X_2$$ is distributed $$\chi^2$$ with $$\nu_2$$ degrees of freedom, and they are independent, then $$F =\frac{X_1 / \nu_1}{X_2 / \nu_2}$$ follows an $$F$$ distribution with $$\nu_1$$ and $$\nu_2$$ degrees of freedom, that is, with the $$f(x)$$ above.
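The density of the $$F$$ distribution can be compared with R's df, where $$B$$ is the beta function (beta in R). The degrees of freedom and the x values in this sketch are arbitrary.

```r
nu1 <- 3; nu2 <- 7        # degrees of freedom (arbitrary)
x <- c(0.5, 1, 2)         # points at which to evaluate the density

# Density of the F distribution written out explicitly
f_dens <- sqrt((nu1 * x)^nu1 * nu2^nu2 / (nu1 * x + nu2)^(nu1 + nu2)) /
  (x * beta(nu1 / 2, nu2 / 2))

all.equal(f_dens, df(x, nu1, nu2))
```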

Bivariate distributions

Definition of joint probability mass function

If X and Y are discrete random variables

$f(x,y) = P(X = x, Y = y)$

Definition of joint probability density function

$f(x,y) \geq 0 \quad \forall x, y$ $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x,y) dx dy = 1$ $P((X,Y) \in A) = \int \int_{A} f(x,y) dx dy$

Definition of joint cumulative distribution function

$F(x, y) = P(X \leq x, Y \leq y)$

Definition of marginal mass functions

If X and Y are discrete random variables

$f_X(x) = P(X=x)=\sum_y P(X=x, Y=y)= \sum_y f(x,y)$ $f_Y(y) = P(Y=y)=\sum_x P(X=x, Y=y)= \sum_x f(x,y)$
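When a joint mass function is stored as a table, the marginals are the row and column sums. The joint table below is a made-up example for two binary variables, with rows indexing $$x$$ and columns indexing $$y$$.

```r
# Hypothetical joint mass function f(x, y); entries sum to 1
joint <- matrix(c(0.10, 0.20,
                  0.30, 0.40),
                nrow = 2, byrow = TRUE,
                dimnames = list(x = c("0", "1"), y = c("0", "1")))

f_X <- rowSums(joint)   # sum over y for each x
f_Y <- colSums(joint)   # sum over x for each y
f_X
f_Y
```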

Definition of marginal probability density functions

If X and Y are continuous random variables

$f_X(x) = \int_{-\infty}^{\infty} f(x,y) dy$ $f_Y(y) = \int_{-\infty}^{\infty} f(x,y) dx$

Definition of independence

X and Y are independent random variables if

$P(X \in A, Y \in B)=P(X \in A)P(Y \in B)$
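For discrete random variables, independence is equivalent to the joint mass function being the product of the marginals, which is easy to test on a table. Both tables in this sketch are made-up examples: the first is not a product of its marginals, the second is built as one.

```r
# Hypothetical joint table in which X and Y are dependent
joint <- matrix(c(0.10, 0.20,
                  0.30, 0.40), nrow = 2, byrow = TRUE)
product <- outer(rowSums(joint), colSums(joint))   # f_X(x) * f_Y(y)
dependent_case <- isTRUE(all.equal(joint, product))

# A joint table constructed as a product of marginals, hence independent
indep <- outer(c(0.3, 0.7), c(0.4, 0.6))
independent_case <- isTRUE(all.equal(indep,
                                     outer(rowSums(indep), colSums(indep))))
c(dependent_case, independent_case)
```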

Definition of conditional probability mass functions

$f_{X|Y}(x|y) = P(X=x|Y=y)=\frac{P(X=x,Y=y)}{P(Y=y)}=\frac{f_{X,Y}(x,y)}{f_Y(y)}$
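On a joint table, conditioning on $$Y=y$$ means taking the column for $$y$$ and dividing it by its total. The table is a made-up example with rows indexing $$x$$ and columns indexing $$y$$.

```r
# Hypothetical joint mass function f(x, y)
joint <- matrix(c(0.10, 0.20,
                  0.30, 0.40),
                nrow = 2, byrow = TRUE,
                dimnames = list(x = c("0", "1"), y = c("0", "1")))

f_Y <- colSums(joint)                        # marginal of Y
f_X_given_y1 <- joint[, "1"] / f_Y["1"]      # f(x | y = 1) for x = 0, 1
f_X_given_y1
sum(f_X_given_y1)                            # a conditional pmf sums to 1
```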

Definition of conditional probability density functions

$f_{X|Y}(x|y) = \frac{f_{X,Y}(x,y)}{f_Y(y)}$

Multivariate distributions and iid samples

Definition of random vector

$X=(X_1,X_2,\dotsc,X_n)$

Definition of independence

$$X_1,\dotsc,X_n$$ are independent if

$P(X_1 \in A_1, \dotsc, X_n \in A_n)= \prod_{i=1}^{n} P(X_i \in A_i)$

or equivalently

$f(x_1,\dotsc,x_n)= \prod_{i=1}^{n} f_{X_i}(x_i)$

Definition of iid samples

$$X_1,\dotsc,X_n$$ are iid (independent and identically distributed) if

• $$X_1,\dotsc,X_n$$ are independent

• $$X_1,\dotsc,X_n$$ have the same marginal cumulative distribution function $$F$$ (or, equivalently, the same density or mass function $$f$$)

We say that $$X_1,\dotsc,X_n \sim F$$

and call $$X_1,\dotsc,X_n$$ a random sample of size $$n$$ from $$F$$. Much of statistics deals with random samples.
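In R, the functions that start with r (rnorm, rbinom, rpois and so on) draw iid samples from the corresponding distribution. A minimal sketch, assuming a normal distribution with arbitrary parameters and a fixed seed for reproducibility:

```r
set.seed(1)                          # fixed seed so the draw is reproducible
n <- 1000                            # sample size (arbitrary)
x <- rnorm(n, mean = 2, sd = 3)      # iid sample of size n from N(2, 3^2)

length(x)
mean(x)    # should be near the mean 2
sd(x)      # should be near the standard deviation 3
```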
