Definitions
- Population: the entire group of individuals or objects being studied.
- Sample: a subset of the population.
- Parameter: a numerical description of a population characteristic (e.g., mean height).
- Sample statistic: a numerical description of a sample characteristic.
- Branches of statistics:
  1) Descriptive statistics: summarize and describe the collected data.
  2) Inferential statistics: use descriptive statistics computed from a sample to estimate population parameters, i.e., infer the population parameters from the sample data.
- Mean:
  1) population mean: $\mu = \frac{\sum_i x_i}{N}$, where $N$ is the population size
  2) sample mean: $\bar{x} = \frac{\sum_i x_i}{n}$, where $n$ is the sample size
- Median: the middle value of an ordered list
  - If the number of observations is odd, the median is the middle value.
  - If the number of observations is even, the median is the average of the two middle values.
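
The odd/even rule for the median can be sketched in a few lines of Python. The data values and the helper name `median` are made up for illustration; the standard library's `statistics.median` applies the same rule.

```python
import statistics


def median(values):
    """Middle value for an odd count, average of the two middle values for an even count."""
    ordered = sorted(values)          # the median is defined on the ordered list
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


odd_sample = [3, 1, 4, 1, 5]          # hypothetical data, n = 5
even_sample = [3, 1, 4, 1, 5, 9]      # hypothetical data, n = 6

sample_mean = sum(odd_sample) / len(odd_sample)   # x-bar = sum(x_i) / n

print(sample_mean)            # 2.8
print(median(odd_sample))     # 3
print(median(even_sample))    # 3.5
```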
- Variance / Standard Deviation (the standard deviation is the square root of the variance):
  - population variance: $\sigma^2 = \frac{\sum_i (x_i - \mu)^2}{N}$
  - sample variance: $s^2 = \frac{\sum_i (x_i - \bar{x})^2}{n-1}$
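
As a small illustration of the two denominators, the sketch below computes both variances on a made-up data set (the values are hypothetical). The standard library functions `statistics.pvariance` and `statistics.variance` should give the same results.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]       # hypothetical values

mean = sum(data) / len(data)
squared_devs = sum((x - mean) ** 2 for x in data)

# Treating the data as a whole population: divide by N
population_variance = squared_devs / len(data)
# Treating the data as a sample: divide by n - 1
sample_variance = squared_devs / (len(data) - 1)

print(population_variance)            # 4.0
print(population_variance ** 0.5)     # 2.0 (population standard deviation)
print(round(sample_variance, 3))      # 4.571
```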
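- Coefficient of variation: $CV = \frac{s}{\bar{x}}$ (often expressed as a percentage). The CV is used to compare the relative spread of two data sets, especially when they have different units or means; the data set with the larger CV is more spread out relative to its mean.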
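
A rough sketch of such a comparison, assuming the usual definition of the CV as the sample standard deviation divided by the sample mean. Both data sets and the helper name `coefficient_of_variation` are invented for illustration.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the sample mean (unitless)."""
    return statistics.stdev(values) / statistics.mean(values)


heights_cm = [160, 165, 170, 175, 180]   # hypothetical data
weights_kg = [50, 60, 70, 80, 90]        # hypothetical data

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)

# The data set with the larger CV is more spread out relative to its mean.
print(round(cv_heights, 3))
print(round(cv_weights, 3))
print("weights" if cv_weights > cv_heights else "heights")
```

Because the CV is a ratio, the two data sets can be compared even though one is in centimetres and the other in kilograms.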
- Standard deviation of grouped data:
We cannot use $s = \sqrt{\frac{\sum_i (x_i - \bar{x})^2}{n-1}}$ directly because we do not know the individual values $x_i$, and therefore cannot compute $\bar{x}$ exactly. For grouped data, the standard deviation is given by:
$s = \sqrt{\frac{\sum_c f_c (x_c - \bar{x})^2}{n-1}}$
where $f_c$ is the count/frequency of class $c$, $x_c$ is the midpoint of the class, $n=\sum_c f_c$ is the sample size, and $\bar{x} = \frac{\sum_c f_c x_c}{\sum_c f_c}$ is the sample mean for a frequency distribution.
Grades | Frequency | Midpoint |
---|---|---|
94-100 | 5 | 97 |
87-93 | 8 | 90 |
80-86 | 12 | 83 |
73-79 | 7 | 76 |
66-72 | 4 | 69 |
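
The grouped-data calculation for the table above can be sketched as follows. It assumes the usual grouped formula $s = \sqrt{\frac{\sum_c f_c (x_c - \bar{x})^2}{n-1}}$, with each class represented by its midpoint; the variable names are illustrative.

```python
import math

# (frequency f_c, midpoint x_c) for each class, taken from the table above
classes = [(5, 97), (8, 90), (12, 83), (7, 76), (4, 69)]

n = sum(f for f, _ in classes)                           # sample size: sum of frequencies
grouped_mean = sum(f * x for f, x in classes) / n        # frequency-weighted mean of midpoints
grouped_variance = sum(f * (x - grouped_mean) ** 2 for f, x in classes) / (n - 1)
grouped_std = math.sqrt(grouped_variance)

print(n)                          # 36
print(round(grouped_mean, 2))     # 83.58
print(round(grouped_std, 2))      # 8.43
```

Since every observation in a class is replaced by the class midpoint, the result is an approximation of the standard deviation of the underlying grades.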