| Level of Measurement | 1 Variable | 2 Variables |
|---|---|---|
| Interval or Ratio (scores) | Descriptive: Central Tendency, Dispersion | Correlational: Pearson's r; Inferential (parametric): t-test, ANOVA |
| Nominal or Ordinal (frequencies) | Descriptive: Frequency Distribution, Histogram, Curve | Inferential (non-parametric): Chi Square (χ²) |

Understanding the levels of measurement is critical to selecting the correct statistical tool for a given design. Remember that the t-test and ANOVA will yield both a one-tailed and a two-tailed probability level, and we can use the one-tailed probability only if we accurately predicted the outcome of the study in advance.
The t-test also comes in two "flavors." The independent t-test is used when there is no connection between the two groups, as for example when subjects are randomly assigned to experimental or control groups.
In situations where the groups are related in some way, a special type of test is required, called the t-test for related means. Examples are a pre-test/post-test design, where the subjects are the same in each group, and a study where subjects are paired based on a pre-test of some characteristic like math ability. The math is altered slightly to allow for the relationship between the subjects. It is reported and interpreted just like the independent t-test.
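How the math is "altered slightly" can be illustrated with a minimal sketch. It assumes the usual form of the t-test for related means, in which the test is run on the difference between each subject's two scores. The function name `related_means_t` and the pre-test/post-test scores are hypothetical and are not taken from the study discussed here.

```python
import math
from statistics import mean, stdev


def related_means_t(before, after):
    """Return (t, df) for a t-test for related means on paired scores."""
    if len(before) != len(after):
        raise ValueError("paired samples must have the same length")
    # Each subject (or matched pair) contributes one difference score.
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    # Standard error of the mean difference, based on the SD of the differences.
    se = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / se, n - 1


# Hypothetical pre-test and post-test scores for five subjects.
pre = [10, 12, 9, 14, 11]
post = [12, 15, 10, 15, 14]
t_value, df = related_means_t(pre, post)
print(f"t({df}) = {t_value:.2f}")
```

Because the test works on one difference score per subject, the degrees of freedom are the number of pairs minus one rather than the total number of scores minus two.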
Chi Square (χ²), being non-parametric and dealing with frequencies, actually allows us to compare apples with oranges. It is a very valuable statistical tool, especially when used with a contingency table (also called a crossbreak or cross tabulation). This table can show where the variance lies. Compare the example below where television viewing frequency data have been divided into 3 levels (low, medium and high) with the results of a t-test which made use of the actual viewing frequency values for each subject.

| VIEWING FREQUENCY | LOW | MED | HIGH |
|---|---|---|---|
| MALE | 8 | 3 | 16 |
| FEMALE | 9 | 8 | 6 |

Note the correct format for reporting χ²: χ²(2, N = 50) = 6.60, p = .04. The 2 indicates the degrees of freedom, N (50 in this case) equals the total number of subjects, 6.60 is the actual value of χ² and .04 is the probability value, representing the probability of obtaining results like these by pure chance.
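The reported values can be checked against the contingency table with a short sketch. It assumes the standard Pearson Chi Square, which compares each observed frequency with the frequency expected from the row and column totals. The function name `chi_square` is hypothetical. The probability line relies on a shortcut that holds only when there are exactly 2 degrees of freedom, as there are here.

```python
import math


def chi_square(table):
    """Return (chi2, df, n) for a contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Frequency expected if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2, df, n


# Rows are MALE and FEMALE; columns are LOW, MED and HIGH viewing frequency.
observed = [[8, 3, 16], [9, 8, 6]]
chi2, df, n = chi_square(observed)

# Assumption: with exactly 2 degrees of freedom the tail probability
# of the chi-square distribution reduces to exp(-chi2 / 2).
p = math.exp(-chi2 / 2)
print(f"chi2({df}, N = {n}) = {chi2:.2f}, p = {p:.2f}")
```

For tables with other degrees of freedom the probability would need a statistics package or a printed table of critical values.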

| | MEAN | SD | N |
|---|---|---|---|
| MALE | 3.97222 | 2.92891 | 27 |
| FEMALE | 2.39130 | 1.41386 | 23 |

If we predicted the outcome, we report the one-tailed probability: t(48) = 2.36, p = .01. Or if we made no prediction about the outcome: t(48) = 2.36, p = .02.
Note the correct format for reporting t: The 48 indicates the degrees of freedom, 2.36 is the actual value of t and .02 is the probability value, representing the probability of obtaining results like these by pure chance.
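The t value and degrees of freedom can be reproduced from the means, standard deviations and group sizes alone. The sketch below assumes the equal-variance (pooled) form of the independent t-test, which is an assumption about how the original figures were produced. The function name `independent_t` is hypothetical. It stops short of the probability value, which requires the t distribution from a statistics package or table.

```python
import math


def independent_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for an independent t-test from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance: each group's variance weighted by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    # Standard error of the difference between the two means.
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Summary statistics for the MALE and FEMALE groups reported above.
t_value, df = independent_t(3.97222, 2.92891, 27, 2.39130, 1.41386, 23)
print(f"t({df}) = {t_value:.2f}")
```

Under that assumption the sketch gives 48 degrees of freedom and a t of about 2.36.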
This was a very small sample. With a larger sample the Chi Square probability could easily equal the results from the t-test and give us a better picture of the variance pattern.
Note that the actual values for t and χ² are virtually meaningless to anyone but a statistician. While we need to follow the standard format for reporting each statistic, it is the probability value alone which will be meaningful to most readers.
For years there have been statistics that have attempted to measure the size or magnitude of a relationship to help expand our understanding of our data beyond just significance. These statistics have, however, been difficult to interpret and have not been widely accepted. In 1988 Jacob Cohen published Statistical Power Analysis for the Behavioral Sciences which, along with examining the whole issue of magnitude in depth, proposed a new statistic which has come to be called Cohen's d.
This statistic is now considered THE appropriate measure of effect size (magnitude) to accompany the independent t test. While not every statistician agrees, Cohen provided ranges for d corresponding to small, medium, and large effect sizes. They are:

| d | Effect Size |
|---|---|
| .00 - .20 | Small |
| .21 - .50 | Medium |
| > .50 | Large |

Computing d can be a bit intimidating, but a statistician named Lee Becker has developed a calculator and kindly placed it on the Web for all to use. You simply place the means and standard deviations for the 2 groups in the appropriate boxes and click on "Compute." The calculator spits out Cohen's d and r, which is more difficult to interpret.
The calculation for the independent t test above is shown in the graphic below.
The relationship between gender and viewing frequency could therefore be called "highly significant" and "large" when discussing the findings.
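For readers who want to see the arithmetic behind such a calculator, here is a minimal sketch. It assumes d is the difference between the means divided by a pooled standard deviation formed from the simple average of the two variances, and that r is derived from d as d / sqrt(d² + 4). Both formulas are assumptions about the calculator, and other definitions of the pooled standard deviation weight by sample size. The function names are hypothetical, and the labels use the ranges listed in this text.

```python
import math


def cohens_d(mean1, sd1, mean2, sd2):
    """Cohen's d using the root mean square of the two standard deviations."""
    pooled_sd = math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (mean1 - mean2) / pooled_sd


def effect_size_r(d):
    """Convert d to the effect-size correlation r (assumed formula)."""
    return d / math.sqrt(d ** 2 + 4)


def label_d(d):
    """Label d using the ranges given in this text."""
    size = abs(d)
    if size <= 0.20:
        return "small"
    if size <= 0.50:
        return "medium"
    return "large"


# Means and standard deviations for the MALE and FEMALE groups above.
d = cohens_d(3.97222, 2.92891, 2.39130, 1.41386)
r = effect_size_r(d)
print(f"d = {d:.2f}, r = {r:.2f}, effect size: {label_d(d)}")
```

With these inputs the sketch gives a d of roughly .69, which falls in the "large" range used here.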
Note that Cohen's d is NOT an accurate measure of magnitude when applied to a t test for related means.
There are a variety of statistics associated with each measure of significance which are coming to be accepted as appropriate corresponding measures of effect size (magnitude). At present Cohen's d has the most widely accepted scale for interpreting the results. It is also popular because the independent t test is so widely used in many types of research.
There are reasonably well accepted values for interpreting η². They are:

| η² | Effect Size |
|---|---|
| .01 - .05 | Small |
| .06 - .13 | Medium |
| > .13 | Large |

The table below displays the descriptive values of data comparing viewing frequency with type of program.
The SPSS printout for the ANOVA (F test) described above is shown in the following table:
So in your results you would report that viewing frequency and program type were not found to be significantly related. In most cases you would stop there, not including any statistics. If there HAD been a significant relationship, you would have then described it as "small."
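As a rough illustration of where η² comes from, the sketch below computes it as the between-groups sum of squares divided by the total sum of squares, which is the usual definition for a one-way ANOVA. The function name `eta_squared` and the three groups of scores are hypothetical. They are not the viewing frequency and program type data described above.

```python
from statistics import mean


def eta_squared(groups):
    """Return eta squared (SS between / SS total) for a list of groups."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of every individual score around the grand mean.
    ss_total = sum((score - grand_mean) ** 2 for score in all_scores)
    return ss_between / ss_total


# Hypothetical scores for three groups, for illustration only.
groups = [[2, 3, 4], [3, 4, 5], [4, 5, 6]]
eta2 = eta_squared(groups)
print(f"eta squared = {eta2:.2f}")
```

The result is the proportion of the total variation in scores that is accounted for by group membership, which is why it can be read against the ranges listed above.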
While harder to interpret than Cohen's d, Cramer's V is considered the best statistic to measure the effect size (magnitude) when dealing with nominal data using Chi Square as your measurement for significance. SPSS will generate Cramer's V as one of the options under crosstabulation, so calculation is not an issue.
A very "crude" interpretation of V that is generally accepted is as follows:

| V | Relationship |
|---|---|
| < .06 | Negligible |
| .06 - .10 | Weak |
| .11 - .15 | Moderate |
| .16 - .25 | Strong |
| > .25 | Very Strong |

Cramer's V for the Crosstabulation example above equals .36. The relationship would be written as follows in the body of the paper:
The relationship between gender and viewing frequency could therefore be called "significant" and "very strongly related."
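Although SPSS produces Cramer's V directly, the value of .36 can be checked by hand. The sketch below assumes the standard formula, the square root of Chi Square divided by N times the smaller of (rows - 1) and (columns - 1). The function name `cramers_v` is hypothetical.

```python
import math


def cramers_v(chi2, n, n_rows, n_cols):
    """Return Cramer's V for a contingency table's Chi Square value."""
    smaller_dimension = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi2 / (n * smaller_dimension))


# Chi Square of 6.60 with N = 50 from the 2 x 3 gender by viewing table.
v = cramers_v(6.60, 50, 2, 3)
print(f"V = {v:.2f}")
```

For a table with only two rows the divisor is simply N, so V here is the square root of 6.60 / 50.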
Note that while the t test yielded a stronger significance level (p = .01, "highly significant") than Chi Square (p = .04, "significant"), both Cohen's d and Cramer's V indicate a high magnitude for the relationship between gender and viewing frequency.