It is rarely practical or possible to measure every member of a population, so a random sample is drawn from the population of interest.
On average, the sample mean should approximate the population mean. If it does not, the difference could be due to chance (sampling error) or to a true difference between groups.
Z tests allow determination of whether the difference is real. T tests can be used instead if n is less than 30; the t distribution has heavier tails because smaller samples carry more sampling variability.
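As a minimal sketch of this distinction, the following Python snippet (hypothetical blood pressure values; requires numpy, scipy, and statsmodels) runs both tests on the same small sample, where the t test is the more appropriate choice:

```python
import numpy as np
from scipy import stats
from statsmodels.stats.weightstats import ztest

rng = np.random.default_rng(0)
sample = rng.normal(loc=128, scale=15, size=20)  # small sample: n = 20 < 30

# t test: appropriate here, since n is small and sigma is unknown
t_stat, t_p = stats.ttest_1samp(sample, popmean=120)

# z test: assumes a normal sampling distribution; less appropriate at n = 20
z_stat, z_p = ztest(sample, value=120)

print(f"t = {t_stat:.2f}, p = {t_p:.3f}")
print(f"z = {z_stat:.2f}, p = {z_p:.3f}")
```

With small n, the two tests can give slightly different p values; they converge as the sample grows.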
Data can be measured at various levels (e.g., nominal, ordinal, interval, ratio). Samples are always estimates of the whole population, and this introduces sampling error.
Observational studies are not true experiments; the investigator simply watches what happens. They can identify associations, but causation must be shown through experimentation.
A trial should be big enough to have a high chance of detecting a worthwhile effect if it exists, and thus to be reasonably sure the effect does not exist if it is not found.
A clinically significant difference in outcomes is not the same as a statistically significant difference. For example, a decrease in blood pressure of 10 mmHg could be statistically shown to be due to a given treatment but have limited impact on a patient's risk of cardiovascular disease.
A statistical nomogram can be used to determine the number of subjects required to demonstrate an effect, if one exists, at a given power.
The power of a study is its ability to demonstrate an association if one exists; it represents the capacity to avoid a type II (β) error, and so power = 1 − β.
Power is determined by:
- the sample size
- the magnitude of the effect being sought (effect size)
- the variability of the data
- the chosen significance level (α)
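As a rough illustration of how these determinants interact, the following sketch (using statsmodels; the effect size, sample size, and α are all illustrative) computes the power of a two-sample t test:

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Solve for power given sample size, effect size, and significance level
power = analysis.solve_power(effect_size=0.5, nobs1=30, alpha=0.05,
                             ratio=1.0, alternative='two-sided')
print(f"Power with n = 30 per group, d = 0.5, alpha = 0.05: {power:.2f}")
```

Increasing the sample size or the effect size, or relaxing α, raises the resulting power.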
Underpowered studies are very common, usually because of difficulties recruiting patients. This often leads to a type II (β) error: erroneously concluding that an intervention has no effect. Small, "underpowered" studies are less likely to find a real difference to be significant.
β, and accordingly power, should be fixed at the time of study design to determine the optimal sample size; a sample size calculator can be of use when doing this.
A useful exercise is to select 'compare proportions for two samples' and to alter p1 and p2, or the power, and observe the effect on the required sample size, as in the sketch below.
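A minimal version of this exercise can be scripted with statsmodels (the values of p1, p2, α, and power below are illustrative, not recommendations):

```python
from statsmodels.stats.proportion import proportion_effectsize
from statsmodels.stats.power import NormalIndPower

p1, p2 = 0.30, 0.20                     # expected event proportions in the two groups
effect = proportion_effectsize(p1, p2)  # Cohen's h

analysis = NormalIndPower()
n_per_group = analysis.solve_power(effect_size=effect, alpha=0.05,
                                   power=0.80, alternative='two-sided')
print(f"Required sample size per group: {n_per_group:.0f}")
```

Shrinking the gap between p1 and p2, or raising the desired power, increases the required sample size.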
Types of Data
Data is collected from an experimental population and compared with the control population to test the hypothesis. Statistics allow us to estimate how unlikely it is that the observed data came from the same distribution as the control population.
Dependent variables can be either continuous or dichotomous, and the statistical test used depends on the type of data of the independent variable; a sketch follows below.
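As a minimal sketch (synthetic data; requires numpy and scipy) of how the outcome's data type drives the choice of test, a continuous outcome might be compared with a t test and a dichotomous outcome with a chi-square test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Continuous outcome (e.g., blood pressure): compare group means with a t test
treated = rng.normal(125, 15, size=40)
control = rng.normal(130, 15, size=40)
t_stat, t_p = stats.ttest_ind(treated, control)
print(f"t test: t = {t_stat:.2f}, p = {t_p:.3f}")

# Dichotomous outcome (e.g., event yes/no): compare proportions with chi-square
table = np.array([[12, 28],    # treated: events, non-events
                  [20, 20]])   # control: events, non-events
chi2, chi_p, dof, _ = stats.chi2_contingency(table)
print(f"chi-square: chi2 = {chi2:.2f}, p = {chi_p:.3f}")
```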
As participants can enter or leave studies at different time points, uneven observation periods are common in survival analysis. Person-time data can be used, but it assumes that 1 person observed for 10 years is equivalent to 10 people observed for 1 year, which is not likely true.
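The following sketch (made-up follow-up times and event counts) shows why this is a simplification: two very different cohorts yield the same person-time denominator and the same rate.

```python
# Two cohorts with identical person-time but very different follow-up patterns
follow_up_a = [10.0]       # one person followed for 10 years
follow_up_b = [1.0] * 10   # ten people followed for 1 year each

events_a, events_b = 1, 1  # illustrative event counts

rate_a = events_a / sum(follow_up_a)  # events per person-year
rate_b = events_b / sum(follow_up_b)
print(rate_a == rate_b)  # True: both rates are 0.1 per person-year,
                         # even though the underlying experience differs
```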
More specialized tests exist for such time-to-event data.
To determine whether differences between samples and populations are real, null and alternative hypotheses are formed.
Typically, the null hypothesis is mean_E − mean_C = 0; that is, there is no difference between the experimental and control groups.
If a result is unlikely to have occurred by chance, the null hypothesis is rejected and the result is considered statistically significant.
A test statistic yields a p value, which is compared against a significance threshold (α) chosen in advance. A p value of 0.05 means there is a 5% chance of observing a difference at least this large if the null hypothesis were true.
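As a minimal sketch of turning a test statistic into a p value (the statistic and degrees of freedom below are illustrative), one can use the t distribution's survival function:

```python
from scipy import stats

# Illustrative numbers: a t statistic of 2.1 with 38 degrees of freedom
t_stat, df = 2.1, 38
p_value = 2 * stats.t.sf(abs(t_stat), df)  # two-sided p value
print(f"p = {p_value:.3f}")
```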
Type I (α) error: finding a difference (rejecting the null hypothesis) when none exists; its probability is capped by the chosen significance level (see the simulation sketch below).
Type II (β) error: failing to find a difference (accepting the null hypothesis) when one exists. Power is the capacity to find a true difference, and is therefore 1 − β.
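A small simulation (synthetic data; requires numpy and scipy) illustrates the type I error rate: when the null hypothesis is true, roughly α of experiments still reject it by chance.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
alpha, n_experiments = 0.05, 2000

false_positives = 0
for _ in range(n_experiments):
    # Both groups drawn from the SAME distribution: the null is true
    a = rng.normal(0, 1, size=30)
    b = rng.normal(0, 1, size=30)
    if stats.ttest_ind(a, b).pvalue < alpha:
        false_positives += 1

print(false_positives / n_experiments)  # close to 0.05
```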