## Action space

In point estimation, the action space typically coincides with the parameter space $$\Theta$$.

## Decision rule

In the context of point estimation, the decision rule is called an estimator (also termed a statistic) of $$\theta$$, usually denoted $$\widehat{\theta}$$. It is any function of the data (a sample of the random variable): $$\widehat{\theta}=d(x)$$.

### Nuisance parameter

Parameters that must appear in the model but that we do not want to estimate, e.g. $$\sigma^2$$ when only $$\mu$$ is of interest.

### Examples

#### Sample mean

$\overline{X}_n=\frac{\sum_{i=1}^{n} X_i}{n}$

#### Sample variance

$S_n^2=\sum_{i=1}^{n}\frac{(X_i-\overline{X}_n)^2}{n-1}$
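As a quick sanity check, a minimal NumPy sketch (with a made-up illustrative sample) computing both statistics directly from their definitions:

```python
import numpy as np

# Illustrative sample (hypothetical data, not from the text).
x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
n = len(x)

sample_mean = x.sum() / n                              # X̄_n
sample_var = ((x - sample_mean) ** 2).sum() / (n - 1)  # S_n² (n-1 in the denominator)

# NumPy computes the same quantities via mean() and var(ddof=1).
print(sample_mean, sample_var)
```

Note the `ddof=1` argument: NumPy's `var` divides by $$n$$ by default, so the $$n-1$$ correction must be requested explicitly.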

## Sampling distribution of a statistic $$\widehat{\theta}$$

The distribution of $$\widehat{\theta}$$ is called the sampling distribution.

### Examples

#### Sample mean for a normal distributed random variable

If $$X_1,\dotsc,X_n$$ are normally distributed, then $$\overline{X}_n$$ is distributed $$N(\mu,\frac{\sigma^2}{n})$$

• Demonstration ….
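A simulation sketch (illustrative values for $$\mu$$, $$\sigma$$, and $$n$$) that checks this sampling distribution empirically:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, reps = 3.0, 2.0, 25, 200_000

# Draw many samples of size n and compute X̄_n for each one.
means = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)

# Empirically X̄_n behaves like N(mu, sigma²/n):
# its mean is close to mu and its variance close to sigma²/n.
print(means.mean(), means.var())
```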

## Bias of $$\widehat{\theta}$$

$bias(\widehat{\theta}) = E[\widehat{\theta}] - \theta$

### Examples

#### Sample mean

$$bias(\overline{X}_n) = E[\overline{X}_n] - \mu = 0$$

• Demonstration

$$E[\overline{X}_n] = \frac{1}{n} \sum_{i} E[X_i] = \frac{1}{n} n E[X_i]= E[X_i]= \mu$$

#### Sample variance

$$bias(S_n^2) = E[S^2_n] - \sigma^2 = 0$$

• Demonstration …
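A simulation sketch (illustrative parameter values) contrasting the corrected estimator with the uncorrected one, which averages to $$\frac{n-1}{n}\sigma^2$$:

```python
import numpy as np

rng = np.random.default_rng(1)
mu, sigma2, n, reps = 0.0, 4.0, 5, 400_000

samples = rng.normal(mu, np.sqrt(sigma2), size=(reps, n))
s2 = samples.var(axis=1, ddof=1)       # S_n² with the n-1 correction
s2_unc = samples.var(axis=1, ddof=0)   # uncorrected version (divides by n)

# E[S_n²] ≈ σ², while the uncorrected estimator averages to (n-1)/n · σ².
print(s2.mean(), s2_unc.mean())
```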

## Standard error $$se$$ of $$\widehat{\theta}$$

$se(\widehat{\theta})=\sqrt{V(\widehat{\theta})}$

### Examples

#### Sample mean

$$se(\overline{X}_n)=\sqrt{V(\overline{X}_n)} = \frac{\sigma}{\sqrt{n}}$$

• Demonstration

$$V[\overline{X}_n]=V(\frac{1}{n}\sum_i X_i) = \frac{1}{n^2} \sum_i V(X_i) = \frac{1}{n^2} n \sigma^2 = \frac{\sigma^2}{n}$$
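A short simulation (illustrative $$\sigma$$ and sample sizes) showing the empirical standard deviation of $$\overline{X}_n$$ tracking $$\frac{\sigma}{\sqrt{n}}$$ as $$n$$ grows:

```python
import numpy as np

rng = np.random.default_rng(2)
sigma, reps = 3.0, 50_000

# Empirical sd of X̄_n for growing n, to compare against σ/√n.
ses = {}
for n in (10, 40, 160):
    means = rng.normal(0.0, sigma, size=(reps, n)).mean(axis=1)
    ses[n] = means.std()

print(ses)
```

Quadrupling $$n$$ halves the standard error, as the $$\frac{1}{\sqrt{n}}$$ rate predicts.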

## Consistency

$$\widehat{\theta}_n$$ is consistent if it converges in probability to $$\theta$$.

### Examples

#### Sample mean

$$\overline{X}_n$$ is consistent because it converges in probability to $$\mu$$ (law of large numbers).

• Demonstration:
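The law of large numbers can be illustrated with a running mean (illustrative distribution and mean; any distribution with finite mean $$\mu$$ works):

```python
import numpy as np

rng = np.random.default_rng(3)
mu = 1.5
x = rng.exponential(mu, size=1_000_000)  # exponential draws with mean μ

# Running means X̄_n settle near μ as n grows (law of large numbers).
running = np.cumsum(x) / np.arange(1, len(x) + 1)
print(running[99], running[9_999], running[-1])
```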

#### Sample variance

$$S_n^2$$ is consistent because it converges in probability to $$\sigma^2$$

• Demonstration:

#### Sample standard deviation

$$S_n$$ is consistent because it converges in probability to $$\sigma$$.

#### Sample uncorrected variance

The uncorrected sample variance (with $$n$$ instead of $$n-1$$ in the denominator) is also consistent: it converges in probability to $$\sigma^2$$, even though it is biased for finite $$n$$.

## Asymptotic normality

$$\widehat{\theta}_n$$ is asymptotically normal if $$\frac{\widehat{\theta}_n - \theta}{se}$$ converges in distribution to $$N(0,1)$$.

### Example

$$\overline{X}_n$$ is asymptotically normal: by the central limit theorem, $$\frac{\overline{X}_n - \mu}{se}$$ with $$se=\frac{\sigma}{\sqrt{n}}$$ converges in distribution to $$N(0,1)$$.
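A simulation sketch of the central limit theorem (illustrative choice of a strongly skewed base distribution): standardized means of exponential draws look standard normal even though each $$X_i$$ is far from normal.

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 200, 50_000

# Exponential with scale 1: mean μ = 1 and sd σ = 1, but heavily skewed.
samples = rng.exponential(1.0, size=(reps, n))
z = (samples.mean(axis=1) - 1.0) / (1.0 / np.sqrt(n))  # (X̄_n - μ)/se

# The standardized means look N(0,1): P(Z ≤ 1) ≈ Φ(1) ≈ 0.841.
print((z <= 0).mean(), (z <= 1).mean())
```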

## Loss and risk functions

### Squared error

$$L(\theta,\widehat{\theta})=(\theta - \widehat{\theta})^2$$

The corresponding risk function is called the mean squared error (MSE).

$$MSE = R(\theta,\widehat{\theta})=E[L(\theta,\widehat{\theta})]= \int (\theta - \widehat{\theta}(x))^2 f(x;\theta) \, dx$$

#### Properties

• $$MSE = bias(\widehat{\theta})^2 + var(\widehat{\theta})$$

Demonstration: …
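The decomposition can be verified numerically. A sketch with a deliberately biased estimator (the hypothetical shrinkage estimator $$0.8\,\overline{X}_n$$, chosen here just so both terms are nonzero):

```python
import numpy as np

rng = np.random.default_rng(5)
mu, n, reps = 2.0, 10, 300_000

# A deliberately biased estimator of μ: shrink the sample mean toward 0.
samples = rng.normal(mu, 1.0, size=(reps, n))
est = 0.8 * samples.mean(axis=1)

mse = ((est - mu) ** 2).mean()
bias = est.mean() - mu
var = est.var()

# MSE ≈ bias² + variance, matching the decomposition above.
print(mse, bias**2 + var)
```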

#### Examples

##### Example 1

$$X \sim N(\mu,1)$$

$$\widehat{\mu}_1 = 4$$

$$MSE_1 = bias(\widehat{\mu}_1)^2 + var(\widehat{\mu}_1)= \left( E[\widehat{\mu}_1] - \mu \right)^2+ var(\widehat{\mu}_1) = \left(4-\mu \right)^2 + 0 = \left(4-\mu \right)^2$$

$$\widehat{\mu}_2 = X$$

$$MSE_2 = bias(\widehat{\mu}_2)^2 + var(\widehat{\mu}_2)= \left( E[\widehat{\mu}_2] - \mu \right)^2+ var(\widehat{\mu}_2) = \left(E[X]-\mu \right)^2 + var(X) = \left(\mu-\mu \right)^2 + 1 = 1$$

If the parameter happens to be close to 4 (specifically $$|\mu - 4| < 1$$), the risk of the first estimator is lower than that of the second. Otherwise, the second is better.
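The comparison above can be sketched directly (hypothetical function names; the risks are the closed forms just derived):

```python
def mse1(mu):
    """Risk of the constant estimator μ̂₁ = 4: (4 - μ)²."""
    return (4.0 - mu) ** 2

def mse2(mu):
    """Risk of μ̂₂ = X, a single N(μ, 1) observation: always 1."""
    return 1.0

# μ̂₁ wins exactly when (4 - μ)² < 1, i.e. μ ∈ (3, 5).
for mu in (2.0, 3.5, 4.0, 6.0):
    print(mu, mse1(mu) < mse2(mu))
```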

### Absolute error

$$L(\theta,\widehat{\theta})=|\theta - \widehat{\theta}|$$

### Kullback-Leibler

$$L(\theta,\widehat{\theta})=\int \log \left( \frac{f(x;\theta)}{f(x;\widehat{\theta})} \right) f(x;\theta) dx$$
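As a concrete instance (illustrative parameter values): for unit-variance normals $$f(x;\theta)=N(\theta,1)$$, this loss has the standard closed form $$(\theta-\widehat{\theta})^2/2$$, which a numerical integration confirms:

```python
import numpy as np

theta, theta_hat = 0.7, 0.2  # illustrative parameter values

def f(x, t):
    """Density of N(t, 1)."""
    return np.exp(-(x - t) ** 2 / 2) / np.sqrt(2 * np.pi)

# Closed form for unit-variance normals: KL = (θ - θ̂)² / 2.
closed = (theta - theta_hat) ** 2 / 2

# Numerical check of ∫ log(f(x;θ)/f(x;θ̂)) f(x;θ) dx via a Riemann sum.
x = np.linspace(theta - 10.0, theta + 10.0, 200_001)
dx = x[1] - x[0]
numeric = (np.log(f(x, theta) / f(x, theta_hat)) * f(x, theta)).sum() * dx

print(closed, numeric)
```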