## Definition of hypotheses

A hypothesis is a statement about a population parameter.

$H_0: \theta \in \Theta_0 \text{ and } H_1: \theta \in \Theta_1$

Often $$\Theta_1 = \Theta_0^C$$.

### Types of hypotheses

#### Simple hypothesis

$H_0: \theta = \theta_0$

#### Composite hypothesis

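A hypothesis that is not simple, that is, one under which $$\theta$$ can take more than one value (for example $$H_0: \theta \leq \theta_0$$).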

### Types of hypothesis test

#### Simple

$H_0: \theta = \theta_0 \text{ and } H_1: \theta = \theta_1$

#### Composite

##### One-sided

$H_0: \theta \leq \theta_0 \text{ and } H_1: \theta > \theta_0$

or

$H_0: \theta \geq \theta_0 \text{ and } H_1: \theta < \theta_0$

##### Two-sided

$H_0: \theta = \theta_0 \text{ and } H_1: \theta \neq \theta_0$

More generally, when $$\theta$$ is a vector, the null hypothesis may fix only some of its components: under $$H_1$$, $$\theta = (\theta_1,\dotsc,\theta_m,\theta_{m+1},\dotsc,\theta_{m+p})$$ and under $$H_0$$, $$\theta = (\theta_1,\dotsc,\theta_m,\theta_{m+1}=\theta_{0,1},\dotsc,\theta_{m+p}=\theta_{0,p})$$.

## Action space

$$\mathcal{A} = \{ a_0, a_1\}$$, where $$a_0$$ represents choosing $$H_0$$ and $$a_1$$ represents choosing $$H_1$$.

## Decision rule

In the context of hypothesis testing, a decision rule is called a hypothesis test. It is defined by the sample values $$x$$ for which we accept or reject $$H_0$$. The set of sample values for which we reject $$H_0$$ is called the rejection region $$R$$.

$$d(x) = \left\{ \begin{array}{ll} a_0 & \mbox{if } x \not\in R \\ a_1 & \mbox{if } x \in R \\ \end{array} \right.$$

The rejection region is often specified by a criterion on a test statistic $$W(x)$$ (a function of the sample):

$R=\{x: W(x)<c\}$

A common method to find a test statistic is the likelihood ratio. To completely specify $$R$$, we need $$c$$, which can be obtained by fixing the probabilities of making mistakes.
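As a reminder, the standard form of the likelihood ratio statistic (as in Casella and Berger) can be written with the density notation $$f(x; \theta)$$ used later in these notes. Here $$\Theta = \Theta_0 \cup \Theta_1$$ is assumed to be the full parameter space:

$$\lambda(x) = \frac{\sup_{\theta \in \Theta_0} f(x; \theta)}{\sup_{\theta \in \Theta} f(x; \theta)}, \qquad R = \{x : \lambda(x) \leq c\}, \qquad 0 \leq c \leq 1$$

Small values of $$\lambda(x)$$ indicate that the data are poorly explained under $$H_0$$, which matches the form of the rejection region given above with $$W(x) = \lambda(x)$$.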

### Example

$$d(x) = \left\{ \begin{array}{ll} a_0 & \mbox{if } \overline{x} \leq 4.3 \\ a_1 & \mbox{if } \overline{x} > 4.3 \\ \end{array} \right.$$
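The rule in this example can be sketched as a small R function. The function name `decide` and the sample values are hypothetical and only illustrate how the sample mean is compared with the threshold 4.3:

```r
# Hypothetical decision rule: choose a1 (reject H0) when the sample mean exceeds the threshold
decide <- function(x, threshold = 4.3) {
  if (mean(x) > threshold) "a1" else "a0"
}

decide(c(4.1, 4.0, 4.5))  # sample mean is 4.2, so the rule chooses a0
decide(c(4.6, 4.4, 4.5))  # sample mean is 4.5, so the rule chooses a1
```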

## Evaluating tests (probabilities of making mistakes)

Type I error: $$H_0$$ is true $$(\theta \in \Theta_0)$$ and $$a=a_1$$.

Type II error: $$H_1$$ is true $$(\theta \in \Theta_1)$$ and $$a=a_0$$.

There are 4 interesting probabilities:

$$P(a_1 | H_0) = P(\text{Type I error})$$

$$P(a_1 | H_1)$$

$$P(a_0 | H_0)$$

$$P(a_0 | H_1) = P(\text{Type II error})$$

### Power function of a test (power function of a rejection region)

$\beta(\theta) = P_\theta (X \in R)$

A good test should have $$\beta(\theta)$$ near 0 when $$\theta \in \Theta_0$$ and near 1 when $$\theta \in \Theta_1$$.

#### Example 1

$$X \sim B(5,p)$$

$$H_0: p \leq 1/2$$

$$H_1: p > 1/2$$

$$R=\{k = 5 \}$$

$$\beta(p) = P_p(k \in R) = P_p(k = 5) = \binom{5}{5} p^5 (1-p)^0 = p^5$$

```r
library(ggplot2)

# Power function of the test with R = {k = 5}
p <- seq(0, 1, .01)
beta <- p^5
dat <- data.frame(p, beta)
ggplot(dat, aes(x = p, y = beta)) + geom_line()
```

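With this rejection region the probability of a Type I error is small, but the probability of a Type II error, $$1 - p^5$$, is large for most $$p > 1/2$$ (about 0.67 at $$p = 0.8$$).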

$$R=\{k \geq 3 \}$$

$$\beta(p) = P_p(k \in R) = P_p(k = 3 \text{ or } k = 4 \text{ or } k = 5) = \binom{5}{3} p^3 (1-p)^2 + \binom{5}{4} p^4 (1-p)^1 + \binom{5}{5} p^5(1-p)^0$$

```r
# Power function of the test with R = {k >= 3}
p <- seq(0, 1, .01)
beta <- 10*p^3*(1-p)^2 + 5*p^4*(1-p) + p^5
dat <- data.frame(p, beta)
ggplot(dat, aes(x = p, y = beta)) + geom_line()
```

Compared with $$R=\{k = 5\}$$, the probability of a Type II error decreases, but the probability of a Type I error increases.
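Both power functions in this example are sums of binomial probabilities over the rejection region. A minimal sketch of that idea, assuming a helper named `power_binom` (not part of the original notes) and using base R's `dbinom`:

```r
# Power function of a test on X ~ B(n, p) that rejects H0 when X falls in `region`
power_binom <- function(p, region, n = 5) {
  sapply(p, function(prob) sum(dbinom(region, size = n, prob = prob)))
}

p <- c(0.3, 0.5, 0.7)
power_binom(p, region = 5)     # R = {k = 5}, agrees with p^5
power_binom(p, region = 3:5)   # R = {k >= 3}
```

Passing a different vector as `region` gives the power function of any other rejection region on the same sample space.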

### Size $$\alpha$$ of a test

$\sup_{\theta \in \Theta_0} \beta(\theta) = \alpha$

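In Example 1 both power functions are increasing in $$p$$, so the supremum over $$p \leq 1/2$$ is attained at $$p = 1/2$$. The sizes of the two tests ($$R=\{k = 5\}$$ and $$R=\{k \geq 3\}$$) are, respectively: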

```r
p <- .5
p^5                                  # size of the test with R = {k = 5}
## [1] 0.03125
10*p^3*(1-p)^2 + 5*p^4*(1-p) + p^5   # size of the test with R = {k >= 3}
## [1] 0.5
```

### Level $$\alpha$$ of a test

$\sup_{\theta \in \Theta_0} \beta(\theta) \leq \alpha$
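To connect size and level with the choice of the cutoff, the sketch below looks for the smallest cutoff such that the test $$R=\{k \geq \text{cutoff}\}$$ in the binomial setting of Example 1 has level $$\alpha = 0.05$$. The helper name `size_binom` and the value of `alpha` are assumptions made for illustration. The code relies on the fact that, for this family of rejection regions, the power function is increasing in $$p$$, so the supremum over $$p \leq 1/2$$ is attained at $$p = 1/2$$.

```r
# Size of the test R = {k >= cutoff} for X ~ B(n, p) and H0: p <= p0,
# evaluated at p0 because the power function is increasing in p
size_binom <- function(cutoff, n = 5, p0 = 0.5) {
  pbinom(cutoff - 1, size = n, prob = p0, lower.tail = FALSE)  # P(X >= cutoff) at p = p0
}

alpha <- 0.05
cutoffs <- 0:5
sizes <- size_binom(cutoffs)

# Smallest cutoff whose size does not exceed alpha (assumes at least one such cutoff exists)
c_alpha <- min(cutoffs[sizes <= alpha])
c_alpha
```

With $$n = 5$$ only the cutoff 5 meets the bound, since the next candidate, $$R=\{k \geq 4\}$$, has size $$6/32 = 0.1875$$.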

### Null hypothesis testing

$$H_0$$ and $$H_1$$ are treated asymmetrically: $$\alpha$$ is fixed in advance (often at .01 or .05) so that the probability of a Type I error is at most $$\alpha$$.

### UMP

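Let $$\mathcal{C}$$ be a class of tests of $$H_0: \theta \in \Theta_0$$ against $$H_1: \theta \in \Theta_1$$. A test in $$\mathcal{C}$$ with power function $$\beta(\theta)$$ is a uniformly most powerful (UMP) test of class $$\mathcal{C}$$ if $$\beta(\theta) \geq \beta'(\theta)$$ for every $$\theta \in \Theta_1$$ and every $$\beta'(\theta)$$ that is the power function of a test in $$\mathcal{C}$$.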

#### UMP level $$\alpha$$

$$\mathcal{C} = \text{ all level } \alpha \text{ tests}$$.

### p-value

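A p-value $$p(X)$$ is a test statistic that satisfies $0 \leq p(x) \leq 1$ for every sample point $$x$$. Small values of $$p(X)$$ give evidence that $$H_1$$ is true.

It is valid if, for every $$\theta \in \Theta_0$$ and every $$0 \leq \alpha \leq 1$$, $P_{\theta} \left( p \left( X \right) \leq \alpha\right) \leq \alpha$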
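As an illustration, one candidate p-value for the binomial setting of Example 1 ($$X \sim B(5,p)$$, $$H_0: p \leq 1/2$$) is the probability, computed at $$p = 1/2$$, of observing a count at least as large as the one observed. The function names below are hypothetical, and the grid check is only a numerical sanity check of the validity condition, not a proof.

```r
# Candidate p-value: p(x) = P(X >= x) computed at the boundary value p0 = 1/2
p_value <- function(x, n = 5, p0 = 0.5) {
  pbinom(x - 1, size = n, prob = p0, lower.tail = FALSE)
}

# Check P_theta(p(X) <= alpha) <= alpha for one pair (theta, alpha);
# a small tolerance guards against floating-point rounding at equality
check_valid <- function(theta, alpha, n = 5) {
  x <- 0:n
  prob_small_p <- sum(dbinom(x[p_value(x, n) <= alpha], size = n, prob = theta))
  prob_small_p <= alpha + 1e-12
}

# Grid over theta in Theta_0 = [0, 1/2] and alpha in [0, 1]
grid <- expand.grid(theta = seq(0, 0.5, 0.05), alpha = seq(0, 1, 0.05))
all(mapply(check_valid, grid$theta, grid$alpha))
```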

### Loss and risk functions

#### 0-1

$$L(\theta,a) = \left\{ \begin{array}{ll} 1 & \mbox{if } \theta \in \Theta_0 \text{ and } a=a_1\\ 1 & \mbox{if } \theta \in \Theta_1 \text{ and } a=a_0 \\ 0 & \text{otherwise} \\ \end{array} \right.$$

$$R(\theta,d) = \left\{ \begin{array}{ll} P_{\theta\ \in \Theta_0}\left(d(X)=a_1\right) =\beta(\theta) & \mbox{if } \theta \in \Theta_0 \\ P_{\theta\ \in \Theta_1}\left(d(X)=a_0\right) =1 -\beta(\theta) & \mbox{if } \theta \in \Theta_1 \\ \end{array} \right.$$

Demonstration

If $$\theta \in \Theta_0$$

$$\begin{aligned} R(\theta,d) &= \int_R L(\theta,d(x)) f(x; \theta) \, dx + \int_{R^C} L(\theta,d(x)) f(x; \theta) \, dx \\ &= \int_R L(\theta,a_1) f(x; \theta) \, dx + \int_{R^C} L(\theta,a_0) f(x; \theta) \, dx \\ &= \int_R 1 \, f(x; \theta) \, dx + \int_{R^C} 0 \, f(x; \theta) \, dx \\ &= \int_R f(x; \theta) \, dx = P_{\theta\ \in \Theta_0}(d(X)=a_1) \end{aligned}$$

If $$\theta \in \Theta_1$$

$$\begin{aligned} R(\theta,d) &= \int_R L(\theta,d(x)) f(x; \theta) \, dx + \int_{R^C} L(\theta,d(x)) f(x; \theta) \, dx \\ &= \int_R L(\theta,a_1) f(x; \theta) \, dx + \int_{R^C} L(\theta,a_0) f(x; \theta) \, dx \\ &= \int_R 0 \, f(x; \theta) \, dx + \int_{R^C} 1 \, f(x; \theta) \, dx \\ &= \int_{R^C} f(x; \theta) \, dx = P_{\theta\ \in \Theta_1}(d(X)=a_0) \end{aligned}$$

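##### Example

```r
# Risk under 0-1 loss for Example 1 with R = {k = 5} and H0: p <= 1/2
p <- seq(0, 1, .01)
risk <- ifelse(p <= .5, p^5, 1 - p^5)  # Type I error probability on Theta_0, Type II on Theta_1
dat <- data.frame(p, risk)
ggplot(dat, aes(x = p, y = risk)) + geom_line()
```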

#### Generalised 0-1

$$L(\theta,a) = \left\{ \begin{array}{ll} c_I & \mbox{if } \theta \in \Theta_0 \text{ and } a=a_1\\ c_{II} & \mbox{if } \theta \in \Theta_1 \text{ and } a=a_0 \\ 0 & \text{otherwise} \\ \end{array} \right.$$

$$R(\theta,d) = \left\{ \begin{array}{ll} c_I P_{\theta\ \in \Theta_0}\left(d(X)=a_1\right) =c_I \beta(\theta) & \mbox{if } \theta \in \Theta_0 \\ c_{II} P_{\theta\ \in \Theta_1}\left(d(X)=a_0\right) =c_{II} \left( 1 -\beta(\theta) \right) & \mbox{if } \theta \in \Theta_1 \\ \end{array} \right.$$
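To see how the two costs reweight the error probabilities, the sketch below evaluates this risk for the test $$R=\{k \geq 3\}$$ of Example 1. The function name `risk_gen01` and the cost values $$c_I = 2$$ and $$c_{II} = 1$$ are assumptions chosen only for illustration.

```r
# Risk under generalised 0-1 loss for X ~ B(5, p), H0: p <= 1/2, R = {k >= 3}
# c_I and c_II are hypothetical costs of Type I and Type II errors
risk_gen01 <- function(p, c_I = 2, c_II = 1) {
  beta <- pbinom(2, size = 5, prob = p, lower.tail = FALSE)  # power: P_p(X >= 3)
  ifelse(p <= 0.5, c_I * beta, c_II * (1 - beta))
}

risk_gen01(c(0.25, 0.5, 0.75, 1))
```

At $$p = 1/2$$ the risk equals $$c_I \cdot 0.5$$, the size of the test scaled by the cost of a Type I error.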

#### More general

$$L(\theta,a) = \left\{ \begin{array}{ll} -V_{00} & \mbox{if } \theta \in \Theta_0 \text{ and } a=a_0\\ V_{01} & \mbox{if } \theta \in \Theta_0 \text{ and } a=a_1\\ -V_{11} & \mbox{if } \theta \in \Theta_1 \text{ and } a=a_1\\ V_{10} & \mbox{if } \theta \in \Theta_1 \text{ and } a=a_0\\ \end{array} \right.$$

$$R(\theta,d) = \left\{ \begin{array}{ll} V_{01} P_{\theta\ \in \Theta_0}\left(d(X)=a_1\right) - V_{00} P_{\theta\ \in \Theta_0}\left(d(X)=a_0\right)& \mbox{if } \theta \in \Theta_0 \\ V_{10} P_{\theta\ \in \Theta_1}\left(d(X)=a_0\right) - V_{11} P_{\theta\ \in \Theta_1}\left(d(X)=a_1\right) & \mbox{if } \theta \in \Theta_1 \\ \end{array} \right.$$

## References

Casella, G., & Berger, R. L. (2002). Statistical inference (Vol. 2). Duxbury Pacific Grove, CA.

Wasserman, L. (2013). All of statistics: A concise course in statistical inference. Springer Science & Business Media.

Young, G. A., & Smith, R. L. (2005). Essentials of statistical inference (Vol. 16). Cambridge University Press.