Chapter 3 t-tests

In general, t-tests determine whether the average of some variable differs significantly across two groups. For example, a t-test may be used to answer questions such as ‘Are women more intelligent than men?’, ‘Do younger individuals tend to be more extraverted compared to older individuals?’, or ‘Do those with college degrees earn more money than those without college degrees?’. For each of these examples we have two different groups (i.e., women vs. men, young vs. old, those with undergraduate degrees vs. those without), and we are comparing them on some variable of interest (i.e., intelligence, extraversion, or income).

3.1 Null and Alternative Hypotheses for t-Tests

Let’s consider the third example in more detail (‘Do those with college degrees earn more money than those without college degrees?’). Let’s begin by forming a null and alternative hypothesis to guide our analysis of the data.

Null Hypothesis: Those with college degrees earn the same amount of money as those without college degrees. This can also be written as \(\mu_{college} = \mu_{no\ college}\).

Alternative Hypothesis: Those with college degrees do not earn the same amount of money as those without college degrees. This can also be written as \(\mu_{college} \neq \mu_{no\ college}\).

In an ideal world we would directly compare the means across the groups. For example, if we wanted to compare salaries across college graduates versus non college graduates, then we would collect salary information for all college graduates and non college graduates, compute the mean salary for each group, calculate the difference between the group means, and use that difference to decide whether college graduates’ and non college graduates’ salaries differ.

Of course, direct comparison is rarely possible. Researchers typically do not have enough time or money to collect data from every member of the population (e.g., collecting salary information across college graduates and non college graduates). Therefore, we need an indirect method for comparing mean differences between groups. Typically, a researcher will collect a sample of individuals from the population, and hope to make comparisons about the populations of interest using these samples.

3.2 From the Population to the Sample

Within the null and alternative hypotheses, \(\mu_{college}\) refers to the average amount of money made across all individuals with an undergraduate degree, and \(\mu_{no\ college}\) reflects the average amount of money made across all individuals without an undergraduate degree. Therefore, if we had measurements from all individuals within the population, we would be able to evaluate the null and alternative hypotheses directly (i.e., we would just compare \(\mu_{college}\) to \(\mu_{no\ college}\)).

Instead, we only have estimates of \(\mu_{college}\) and \(\mu_{no\ college}\), and these estimates are calculated directly from our sample data. In the case of a t-test, we use the sample mean to estimate \(\mu\). More specifically (and following the example based on salary and college education status), we calculate \(\bar{X}\) as an estimate of \(\mu\) (i.e., we use \(\bar{X}_{college}\) and \(\bar{X}_{no\ college}\) to estimate \(\mu_{college}\) and \(\mu_{no\ college}\), respectively).
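To make the distinction between \(\mu\) and \(\bar{X}\) concrete, the sketch below simulates two hypothetical populations of salaries and draws one sample from each. The population sizes, means, and standard deviations are invented for illustration and do not come from real salary data.

```r
### Hypothetical populations (all values below are invented for illustration)
set.seed(123)
pop_college    = rnorm(100000, mean = 60000, sd = 15000)
pop_no_college = rnorm(100000, mean = 40000, sd = 15000)

### Population means (mu): normally unknown to the researcher
mu_college    = mean(pop_college)
mu_no_college = mean(pop_no_college)

### Draw a sample of 50 individuals from each group
sample_college    = sample(pop_college, size = 50)
sample_no_college = sample(pop_no_college, size = 50)

### Sample means (X-bar): our estimates of mu
xbar_college    = mean(sample_college)
xbar_no_college = mean(sample_no_college)

### Compare the estimated difference with the population difference
xbar_college - xbar_no_college
mu_college - mu_no_college
```

The two printed differences will usually be close but not identical, which is exactly why a single sample difference cannot be taken at face value.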

One may believe that a simple difference between the two sample means \(\bar{X}_{college}\) and \(\bar{X}_{no\ college}\) would provide an answer to our original question. However, that difference refers only to a subset (i.e., a sample) of the population, and we cannot be certain that such a difference would generalize. Let’s emphasize this notion using an experience from a first year graduate student named Julie.

Julie decides to collect a sample from the population (N = 100, with 50 individuals with college degrees and 50 individuals without college degrees), and finds that the difference between the respective sample means equals $20. Then, Julie decides that there isn’t a large difference in salary across college graduates and non college graduates.

But Julie’s classmate, Jill, asks Julie if the finding would replicate. More specifically, Jill wonders whether the $20 difference would occur within another sample (indicating that the small difference generalizes across samples, and therefore describes the population), or if the difference would be substantially different within a second sample (indicating that the difference of $20 only characterizes the first sample, and not the population).

Julie, who hasn’t studied t-tests yet, decides to collect another sample in order to answer Jill’s question. But, prior to collecting the second sample, Julie imagines Jill asking whether the results would replicate a third or fourth time. Therefore, Julie decides to collect 10 samples to answer Jill’s question more thoroughly.

Now, before we discuss what Julie found, let’s imagine two different scenarios. Within ‘Scenario 1’, there really is no difference between the salaries of college graduates and non college graduates (i.e., the null hypothesis is true), and Julie’s first sample actually does provide good insight into what occurs across the population. Within ‘Scenario 2’, there really is a difference in salaries across the two groups (i.e., the alternative hypothesis is true), and the small difference of $20 observed within Julie’s first sample was just a fluke. We play out Scenarios 1 and 2 within the table below.

| Sample Number | Scenario 1: \(\bar{X}_{college} - \bar{X}_{no\ college}\) | Scenario 2: \(\bar{X}_{college} - \bar{X}_{no\ college}\) |
|---|---|---|
| 1 | 20 | 20 |
| 2 | -15 | 19,000 |
| 3 | 10 | 30,000 |
| 4 | 2 | 23,500 |
| 5 | -16 | 24,000 |
| 6 | 4 | 28,000 |
| 7 | -3 | 27,000 |
| 8 | -11 | 28,225 |
| 9 | 10 | 25,500 |
| 10 | 5 | 26,000 |

Within Scenario 1, Julie tells Jill that she is more confident that the results will generalize to the population, given that all 10 samples produced a relatively trivial mean difference in annual salaries. And, within Scenario 2, Julie is perhaps more convinced that there is a meaningful difference in annual salaries. Needless to say, if Julie hadn’t collected the extra samples, it would be difficult to determine whether the sample mean difference (i.e., \(\bar{X}_{college} - \bar{X}_{no\ college}\)) would generalize across samples. However, the main limitation of Julie’s approach is the need to collect multiple samples from the population.
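Julie’s repeated sampling can be mimicked with a small simulation. The sketch below assumes normally distributed salaries with invented means and an invented standard deviation. The helper function `one_difference` and its arguments are hypothetical names introduced only for this illustration.

```r
set.seed(42)

### Draw one sample of n individuals per group and return the
### difference between the two sample means
one_difference = function(mu_college, mu_no_college, sigma = 15000, n = 50) {
  college    = rnorm(n, mean = mu_college,    sd = sigma)
  no_college = rnorm(n, mean = mu_no_college, sd = sigma)
  mean(college) - mean(no_college)
}

### Scenario 1: the null hypothesis is true (equal population means)
scenario_1 = replicate(10, one_difference(40000, 40000))

### Scenario 2: the alternative hypothesis is true (unequal population means)
scenario_2 = replicate(10, one_difference(65000, 40000))

round(scenario_1)
round(scenario_2)
```

Under these assumptions the Scenario 1 differences scatter around 0, while the Scenario 2 differences scatter around the assumed population difference. How widely they scatter depends on the assumed standard deviation and sample size, so the simulated values will not match the table above.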

3.3 Applying the Concepts of the z-distribution

As we continue with the example from Julie, we note that it would be informative to compute how probable the observed sample mean differences would be if the null hypothesis were true. Stated differently, assuming that \(\mu_{college} = \mu_{no\ college}\), what is the probability of observing sample mean differences at least as extreme as those in Scenario 1 (or 2)? If the probability is very small, say less than .05, then we may use it as justification for rejecting the null hypothesis in favor of the alternative.

But how can we go about computing that information?

First, remember that z-distributions enable us to calculate probabilities of specific observations. If we assume that the sample mean differences \(\bar{X}_{college} - \bar{X}_{no\ college}\) follow a normal distribution, then we can use the z-distribution to estimate the probability of observing such differences if the null hypothesis is true.
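As a reminder of how the z-distribution yields probabilities, the sketch below uses R’s `pnorm()` function, which returns the area under the standard normal curve below a given z value. The z value of 1.96 is chosen purely for illustration and is not taken from Julie’s data.

```r
### An illustrative z value (not derived from the salary example)
z = 1.96

### Probability of observing a value at or below z
p_below = pnorm(z)

### Probability of observing a value above z
p_above = pnorm(z, lower.tail = FALSE)

### Probability of observing a value at least this far from 0 in either direction
p_two_sided = 2 * pnorm(abs(z), lower.tail = FALSE)

p_below
p_above
p_two_sided
```

The two-sided probability for z = 1.96 is approximately .05, which is where the conventional cutoff mentioned above comes from.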

Let us use R to calculate the means and standard deviations for Scenarios 1 and 2.

### Make a data set that contains both scenarios. Normally we would read this data set into R, but
### since it is so small, we will just enter it manually here.

data = data.frame(sample     = c(1,2,3,4,5,6,7,8,9,10),
                  scenario_1 = c(20, -15, 10, 2, -16, 4, -3, -11, 10, 5),
                  scenario_2 = c(20, 19000, 30000, 23500, 24000, 28000, 27000, 28225, 25500, 26000))
### Calculating the mean and standard deviation for Scenario 1
mean(data$scenario_1)
## [1] 0.6
sd(data$scenario_1)
## [1] 11.79642
### Calculating the mean and standard deviation for Scenario 2
mean(data$scenario_2)
## [1] 23124.5
sd(data$scenario_2)
## [1] 8677.724

Therefore, the mean and standard deviation of Scenario 1 equal 0.6 and 11.80, respectively (with some rounding). And the mean and standard deviation of Scenario 2 equal 23124.5 and 8677.72, respectively.
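One possible way to connect these summaries to the z-distribution is to ask how many standard deviations a difference of 0 lies from the mean of each scenario. This is only a descriptive sketch and not necessarily the calculation this chapter goes on to use. It assumes the ten differences in each scenario are roughly normally distributed, which is questionable for Scenario 2 because of the first sample.

```r
### Re-enter the sample mean differences from the table
scenario_1 = c(20, -15, 10, 2, -16, 4, -3, -11, 10, 5)
scenario_2 = c(20, 19000, 30000, 23500, 24000, 28000, 27000, 28225, 25500, 26000)

### z-score of a difference of 0 within each scenario
z_1 = (0 - mean(scenario_1)) / sd(scenario_1)
z_2 = (0 - mean(scenario_2)) / sd(scenario_2)
z_1
z_2

### Proportion of a normal distribution (with the observed mean and
### standard deviation) that falls at or below a difference of 0
pnorm(z_1)
pnorm(z_2)
```

In Scenario 1, a difference of 0 sits very close to the center of the observed differences, so nearly half of the assumed distribution falls below it. In Scenario 2, a difference of 0 lies more than two and a half standard deviations below the mean, so less than 1% of the assumed distribution falls below it.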

3.4 What is a t-test?

Often researchers want to compare responses from two different groups. Perhaps the simplest comparison a scientist can make is the difference between two group means. A t-test is a statistical analysis that tests for mean differences between two groups.

3.5 What are the different types of t-tests?

There are two types of t-tests: independent samples t-tests and dependent samples t-tests. Independent samples t-tests are reserved for

3.6 Independent samples t-tests

3.6.1 An example in R

3.7 Dependent samples t-tests

3.7.1 An example in R