Inferential Statistics:
Testing Hypotheses and Determining Significance Using Variance
What is inferential statistics?
- Inferential statistics is used to “draw conclusions about the characteristics of a population from the characteristics of a sample” and to determine how reliable those conclusions are.
- Once a conclusion is drawn, the sample size and the distribution of the sample can be used to estimate the probability of the event in question occurring.
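As a rough illustration of drawing a conclusion about a population from a sample and stating how reliable it is, the sketch below estimates a population mean with an approximate 95% confidence interval. The function name `mean_confidence_interval` and the sample values are made up for this example, and the interval uses the normal approximation rather than a t-distribution, so it will be slightly too narrow for small samples.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the population mean.

    Assumption: uses the normal (z) approximation. For small samples a
    t-based interval would be somewhat wider.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    m = mean(sample)
    se = stdev(sample) / sqrt(n)  # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    return m - z * se, m + z * se


# Hypothetical sample of 10 measurements (illustrative values only)
sample = [4.8, 5.1, 5.0, 4.7, 5.3, 5.2, 4.9, 5.0, 5.1, 4.9]
low, high = mean_confidence_interval(sample)
print(f"sample mean = {mean(sample):.2f}, 95% CI ≈ ({low:.2f}, {high:.2f})")
```

A larger sample or a less spread-out one shrinks the standard error, which is why sample size and distribution both affect how reliable the conclusion is.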
What it measures:
- Central tendency, variability, distribution, and relationships between characteristics within a data sample.
Further uses:
- To make predictions.
- To make generalizations about large groups.
- To relate variables in a data set to one another.
Conducting an Inferential Statistics Test:
1. To begin, determine the type of data in the data set:
2. Formulate the null and alternative hypotheses:
3. Identify potential errors and decide which one is preferable to commit.
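Every prediction leaves room for error. Evidence might appear to favor the alternative hypothesis (the new theory) even though the effect does not actually exist; this is a Type I error (a false positive). The reverse, failing to find evidence for an effect that does exist, is a Type II error (a false negative).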
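To make the idea of "evidence that does not actually exist" concrete, here is a minimal simulation sketch. It repeatedly draws data for which the null hypothesis is true and counts how often a two-sided z-test still rejects it. The function names, the choice of a z-test with known standard deviation, and the parameters (`alpha`, `n`, `trials`, `seed`) are all assumptions made for the illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value for H0: population mean == mu0, with known sigma."""
    z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def type_i_error_rate(alpha=0.05, n=30, trials=2000, seed=0):
    """Fraction of trials that reject H0 even though H0 is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # Data are generated with H0 true: mean 0, standard deviation 1.
        sample = [rng.gauss(0, 1) for _ in range(n)]
        if z_test_p_value(sample, mu0=0, sigma=1) < alpha:
            rejections += 1  # a false positive
    return rejections / trials


# The result should land near alpha, since alpha is the false positive rate
# we agreed to tolerate.
print(f"observed false positive rate: {type_i_error_rate():.3f}")
```

Lowering `alpha` makes false positives rarer but, for a fixed sample size, makes it more likely that a real effect is missed, which is the trade-off behind preferring one error type over the other.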
Type I Error preferred when:
- When testing for a disease, a doctor and patient might prefer a false positive test result, since follow-up testing or treatment would typically do less harm than missing a rare disease altogether.
Type II Error preferred when:
- will add later…
4. Quantify our uncertainty using significance tests:
Tests for statistical significance “tell us what the probability is that the relationship we think we have found is due only to random chance.”
Potential significance tests and measures to use:
- T-tests
- P-values
- Hypothesis tests
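One way to see what "due only to random chance" means is a permutation test, sketched below. Note that this is a swapped-in technique, not a t-test: it was chosen because it needs only the standard library and makes the logic of a p-value visible. The function name, group data, and parameters are hypothetical.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Two-sided p-value for a difference in means via random relabeling.

    If group labels are irrelevant (the null hypothesis), shuffling them
    should produce differences as large as the observed one fairly often.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign observations to groups
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical scores for two groups (illustrative values only)
control = [72, 75, 68, 71, 74, 70, 73, 69]
treated = [78, 82, 77, 80, 79, 83, 76, 81]
print(f"p ≈ {permutation_p_value(control, treated):.4f}")
```

A small p-value here says that random relabeling almost never produces a gap as large as the one observed, so chance alone is an unlikely explanation. In practice a t-test from a statistics library would usually be used for the same comparison.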
5. If a relationship does exist, determine the effect:
Linear regression (simple or multiple):
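For the simple (one predictor) case, an ordinary least-squares fit can be sketched as follows. The function name and the data are invented for illustration, and the sketch assumes that neither variable is constant (otherwise the divisions below fail).

```python
from statistics import mean


def simple_linear_regression(x, y):
    """Least-squares slope, intercept, and R^2 for y ≈ intercept + slope * x."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)
    slope = sxy / sxx  # change in y per unit change in x
    intercept = y_bar - slope * x_bar
    r_squared = (sxy ** 2) / (sxx * syy)  # share of variance in y explained
    return slope, intercept, r_squared


# Hypothetical data: hours studied vs. exam score
hours = [1, 2, 3, 4, 5, 6]
score = [52, 55, 61, 64, 70, 73]
slope, intercept, r2 = simple_linear_regression(hours, score)
print(f"score ≈ {intercept:.2f} + {slope:.2f} * hours (R^2 = {r2:.3f})")
```

The slope describes the size of the effect in the original units, while R^2 indicates how much of the variation the relationship accounts for.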
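Cohen’s d is an effect size used to indicate the standardized difference between two means. It can be used, for example, to accompany the reporting of t-test results.
- Measures the size of the effect
- Cohen’s d: $d = \dfrac{M_1 - M_2}{s_{\text{pooled}}}$
- $M_1$ = mean of group 1
- $M_2$ = mean of group 2
- $s_{\text{pooled}}$ = pooled standard deviation for the two groups. For groups of equal size, the formula is: $s_{\text{pooled}} = \sqrt{(s_1^2 + s_2^2) / 2}$, where $s_1$ and $s_2$ are the standard deviations of groups 1 and 2.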
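The formula above translates directly into a few lines of code. This is a minimal sketch: the function name `cohens_d` and the sample data are hypothetical, and it uses the simple pooled standard deviation given above, which treats both groups as equally sized.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group1, group2):
    """Cohen's d with s_pooled = sqrt((s1^2 + s2^2) / 2)."""
    s1_sq = variance(group1)  # sample variance (n - 1 denominator)
    s2_sq = variance(group2)
    s_pooled = sqrt((s1_sq + s2_sq) / 2)
    return (mean(group1) - mean(group2)) / s_pooled


# Hypothetical scores for two groups (illustrative values only)
treated = [78, 82, 77, 80, 79, 83, 76, 81]
control = [72, 75, 68, 71, 74, 70, 73, 69]
print(f"d = {cohens_d(treated, control):.2f}")
```

The sign of d follows the order of the arguments: a positive value means group 1 has the higher mean.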
To Be Continued…