---
cssclasses:
- wideTable
- mytable
---
Statistics FAQ: ANOVA, Bayes, etc.
Sample Mean = Unbiased Estimator Of Population Mean;
Not biased, but has inherent error;
The inherent error itself is random (usually Gaussian, per the Central Limit Theorem).
Variance does not directly represent this error;
the standard error of the mean is sqrt(Variance/N).
Variance of Variance represents the error/uncertainty in the variance estimate itself.
Sample Variance = Biased Estimator of Population Variance
= Multiply by N/(N-1) to get an unbiased estimate (Bessel's correction);
= Now it is not biased, but still has inherent error due to sampling.
Variance Of Variance =
Obtained using multiple sample collections;
Represents the error/uncertainty of the estimated variance (and hence of the mean);
Useful when the population is very large.
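The estimator relationships above can be sketched in plain Python (the data values are made up for illustration):

```python
# Biased vs. unbiased sample variance, and the standard error of the mean.
data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative sample
n = len(data)

mean = sum(data) / n                                  # sample mean
biased_var = sum((x - mean) ** 2 for x in data) / n   # divides by N -> biased
unbiased_var = biased_var * n / (n - 1)               # Bessel's correction N/(N-1)

# The sample mean's own error (its std. deviation across repeated samples):
standard_error = (unbiased_var / n) ** 0.5

print(mean, biased_var, unbiased_var, standard_error)
```

The biased estimate (4.0 here) is always slightly smaller than the corrected one; the gap shrinks as N grows.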
F = (Inter-Group Variability) / (Intra-Group Variability)
Inter-Group Variability = Variance of the group means wrt the overall mean
(weighted by group sizes; a.k.a. Mean Square Between).
Intra-Group Variability = Pooled variance within the groups
(Overall Mean not involved; a.k.a. Mean Square Within).
df1 = Degrees Of Freedom (Numerator) = (Number of Groups) - 1.
df2 = Degrees Of Freedom (Denominator) = (Total Sample Size) - (Number of Groups).
Higher F === means ===> Different groups.
0 < F < Infinity
Under the Null Hypothesis, F stays near 1 (group means vary no more than
expected from within-group noise);
F >> 1 when the samples are taken from different areas of the population
(group means vary more than the intra-group variability).
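A minimal one-way ANOVA sketch showing how F, df1, and df2 are computed (group data made up for illustration):

```python
# One-way ANOVA F-statistic computed by hand (illustrative data).
groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
k = len(groups)                       # number of groups
N = sum(len(g) for g in groups)       # total sample size

overall_mean = sum(sum(g) for g in groups) / N
group_means = [sum(g) / len(g) for g in groups]

# Between-group sum of squares: spread of group means around the overall mean.
ss_between = sum(len(g) * (m - overall_mean) ** 2
                 for g, m in zip(groups, group_means))
# Within-group sum of squares: spread inside each group (overall mean not used).
ss_within = sum(sum((x - m) ** 2 for x in g)
                for g, m in zip(groups, group_means))

df1 = k - 1          # numerator degrees of freedom
df2 = N - k          # denominator degrees of freedom
F = (ss_between / df1) / (ss_within / df2)
print(F, df1, df2)   # F = 3.0, df1 = 2, df2 = 6
```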
F probability function is ProbabilityDensity (Y Axis) vs F-Value (X Axis)
for a given df1 and df2;
F pdf is right skewed (long right tail), unlike a symmetric normal distribution;
most of its mass lies near 1, with a long tail where F >> 1.
F > Critical-F-Value ==> Reject Null Hypothesis (the no-effect assumption),
i.e. Groups are significantly different.
Critical-F-Value := Use Lookup table based on Significance Level e.g. 5%
(and df1 and df2).
The lookup table was constructed assuming the samples are Gaussian (normally distributed).
Significance level or alpha := Typically 5%
P-value := Probability of observing a result at least as extreme as the given
observation, assuming the Null Hypothesis is true;
If P-value < Significance level, the Null Hypothesis should be rejected.
e.g. A given F-Value may correspond to a P-Value of 0.03;
So either you calculate the Critical-F-Value (the F whose P-value is 0.05),
or find the P-value of your observation, to reject/accept the Null Hypothesis.
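Instead of printed lookup tables, both the critical F and the p-value can be pulled from the F distribution; this sketch assumes scipy is available, and the degrees of freedom and observed F are illustrative:

```python
# Critical-F and p-value via the F distribution (requires scipy).
from scipy import stats

df1, df2 = 2, 12          # illustrative degrees of freedom
alpha = 0.05

# Critical value: the F whose right-tail probability equals alpha.
f_crit = stats.f.ppf(1 - alpha, df1, df2)   # ~3.89, matching printed tables

F_observed = 5.1                            # illustrative observed statistic
p_value = stats.f.sf(F_observed, df1, df2)  # right-tail prob. beyond F_observed

# The two decision rules are equivalent:
assert (F_observed > f_crit) == (p_value < alpha)
print(f_crit, p_value)
```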
Chi-Square = Sum( square(observed - expected) / expected )
Degrees Of Freedom = (rows-1) * (cols-1)
Example: Category1: Male/Female; Category2: Tea/Coffee
Rows=2; Cols=2; Degrees of Freedom = 1x1 = 1
The table contains the numeric counts of people who prefer Coffee/Tea.
Assuming independence between the variables, you compute, for each cell:
Expected count = (Row Total x Column Total) / Grand Total
Then sum square(observed - expected)/expected over all cells.
Chi-square critical value at α = 0.05 (df=1) from tables = 3.84
Your chi-square > 3.84 means there is a relationship between the categories.
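The Tea/Coffee example can be worked end-to-end in plain Python (the counts below are made up for illustration):

```python
# Chi-square test of independence for the Male/Female x Tea/Coffee example.
table = [[30, 20],   # Male:   Tea, Coffee  (illustrative counts)
         [20, 30]]   # Female: Tea, Coffee

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        # Expected count under independence: row total * column total / grand total.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (observed - expected) ** 2 / expected

df = (len(table) - 1) * (len(table[0]) - 1)   # (rows-1)*(cols-1) = 1
print(chi_square, df)   # 4.0 > 3.84 -> reject independence at alpha = 0.05
```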
t = (mean1 - mean2) / sqrt( (variance1/n1) + (variance2/n2) )
df = degrees-of-freedom = n1 + n2 - 2 (pooled case; Welch's method adjusts
the df downward when the variances differ).
Higher |t| value, more likely the groups are different.
Similar to the F-Statistic, there is a lookup table for the T-critical-value
given the df and significance level (alpha - e.g. 5%);
T pdf (Probability density function) looks like a normal bell curve with thicker
tails and a smaller peak.
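The t formula above (unpooled variances, i.e. the Welch form) in plain Python, with illustrative data:

```python
# Two-sample t-statistic with unpooled variances (illustrative data).
a = [1, 2, 3, 4, 5]
b = [2, 4, 6, 8, 10]

def mean(xs):
    return sum(xs) / len(xs)

def unbiased_var(xs):
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

n1, n2 = len(a), len(b)
t = (mean(a) - mean(b)) / ((unbiased_var(a) / n1 + unbiased_var(b) / n2) ** 0.5)
print(t)   # negative here because mean(a) < mean(b); only |t| matters
```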
Note:
All "Bayes-related" concepts (Bayes Theorem, Bayesian Networks, and Bayesian Inference)
are built on the same foundational idea:
updating our beliefs (probabilities) based on new evidence (i.e. conditional probability).
For more see notes: Notes on Bayes
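The belief-update idea is one application of Bayes' theorem; the disease-test numbers below are the classic textbook illustration, not from these notes:

```python
# Bayes theorem: posterior = prior * likelihood / evidence (illustrative numbers).
prior = 0.01            # P(disease) -- belief before any evidence
sensitivity = 0.99      # P(positive test | disease)
false_positive = 0.05   # P(positive test | no disease)

# Total probability of the new evidence (a positive test):
evidence = prior * sensitivity + (1 - prior) * false_positive

# Updated belief after seeing the evidence:
posterior = prior * sensitivity / evidence
print(posterior)   # ~0.167: even a good test leaves the updated belief low
```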