# Ch. 9 Inferences Based on Two Samples

Sections covered: 9.1, 9.2, 9.3, 9.4

## 9.1 $$z$$ Tests and CIs for a Difference Between Two Population Means

(Cases 1 & 2)

Skip: $$\beta$$ and the Choice of Sample Size (pp. 366-367)

### R

Because the examples in this section give only summary statistics rather than the original data, R's built-in tests can't be applied directly. You can, however, write a reusable function such as:

options(scipen = 999) # get rid of scientific notation

# this function only has to be run once per session, and then you can reuse it.

diffmeans <- function(xbar, ybar, delta0,
                      sigma1, sigma2,
                      m, n, type = "twosided") {

  # z statistic for H0: mu1 - mu2 = delta0, with known sigmas
  Z <- ((xbar - ybar) - delta0) / sqrt((sigma1^2 / m) + (sigma2^2 / n))

  if (type == "twosided") {
    pvalue <- pnorm(-abs(Z)) * 2
  } else if (type == "lowertail") {
    pvalue <- pnorm(Z)
  } else {
    # any other value of type gives the upper-tailed test
    pvalue <- 1 - pnorm(Z)
  }

  print(c("The p-value is", round(pvalue, 4)))

}

# Example 9.1, p. 365

diffmeans(xbar = 29.8, ybar = 34.7,
          delta0 = 0, sigma1 = 4, sigma2 = 5,
          m = 20, n = 25, type = "twosided")
## [1] "The p-value is" "0.0003"

# Example 9.4, p. 368

# As long as you use the correct order of parameters, you don't have to write out the names:

diffmeans(2258, 2637, -200, 1519, 1138, 663, 413, "lowertail")
## [1] "The p-value is" "0.0139"
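
The section also covers confidence intervals, which `diffmeans()` does not report. As a minimal sketch, assuming known population standard deviations as in the test above, the interval is $$(\bar{x} - \bar{y}) \pm z_{\alpha/2}\sqrt{\sigma_1^2/m + \sigma_2^2/n}$$. The function name `diffmeans_ci` and its `conf_level` argument are hypothetical and not taken from the textbook.

```r
# Hypothetical helper: z confidence interval for mu1 - mu2 with known sigmas.
diffmeans_ci <- function(xbar, ybar, sigma1, sigma2, m, n, conf_level = 0.95) {
  se <- sqrt(sigma1^2 / m + sigma2^2 / n)     # standard error of xbar - ybar
  zcrit <- qnorm(1 - (1 - conf_level) / 2)    # critical value z_{alpha/2}
  est <- xbar - ybar                          # point estimate of mu1 - mu2
  c(lower = est - zcrit * se, upper = est + zcrit * se)
}

# Summary statistics from Example 9.1
ci <- diffmeans_ci(xbar = 29.8, ybar = 34.7,
                   sigma1 = 4, sigma2 = 5, m = 20, n = 25)
print(round(ci, 2))
```

For the Example 9.1 values, the 95% interval lies entirely below 0, which agrees with the small two-sided p-value reported by `diffmeans()`.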

## 9.2 The Two-Sample $$t$$ Test and CI

(Case 3)

Use: https://www.statology.org/welchs-t-test-calculator/ (choose “Enter raw data” or “Enter summary data” as appropriate) to calculate the degrees of freedom.
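
The degrees of freedom can also be computed directly in R. The sketch below assumes the usual Welch-Satterthwaite formula, $$\nu = \frac{(s_1^2/m + s_2^2/n)^2}{\frac{(s_1^2/m)^2}{m-1} + \frac{(s_2^2/n)^2}{n-1}}$$. The function name `welch_df` is hypothetical, and the simulated samples `a` and `b` are for illustration only.

```r
# Hypothetical helper: Welch-Satterthwaite degrees of freedom from summary data.
# s1, s2 are the sample standard deviations; m, n are the sample sizes.
welch_df <- function(s1, s2, m, n) {
  v1 <- s1^2 / m
  v2 <- s2^2 / n
  (v1 + v2)^2 / (v1^2 / (m - 1) + v2^2 / (n - 1))
}

# Compare with the df that t.test() reports on simulated data
set.seed(1)
a <- rnorm(10)
set.seed(2)
b <- rnorm(12, sd = 2)
df_manual <- welch_df(sd(a), sd(b), length(a), length(b))
df_ttest <- unname(t.test(a, b)$parameter)
print(all.equal(df_manual, df_ttest))
```

Note that `t.test()` keeps the non-integer value; if your textbook procedure rounds the degrees of freedom down, apply `floor()` to the result.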

Skip: Pooled $$t$$ Procedures (pp. 377-378)

Skip: Type II Error Probabilities (pp. 378-379)

Given two random samples, use t.test() with different parameters to carry out a two-sample hypothesis test.

For demonstration purposes, x and y here are samples of 10 numbers each, drawn from the standard normal distribution.

set.seed(1)
x <- rnorm(10)
set.seed(2)
y <- rnorm(10)
t.test(x, y, var.equal = TRUE, alternative = "less") 
##
##  Two Sample t-test
##
## data:  x and y
## t = -0.19865, df = 18, p-value = 0.4224
## alternative hypothesis: true difference in means is less than 0
## 95 percent confidence interval:
##      -Inf 0.610222
## sample estimates:
## mean of x mean of y
## 0.1322028 0.2111516

Since the p-value (0.4224) is greater than $$\alpha = .05$$, we fail to reject $$H_0$$.

In t.test(), alternative = "less" specifies $$H_a: \delta < 0$$, where $$\delta$$ is the true difference in means. Also try alternative = "greater" or alternative = "two.sided" for a different alternative hypothesis.
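
The call above sets var.equal = TRUE, which requests the pooled procedure. Leaving that argument out uses the default var.equal = FALSE, which gives the Welch (unpooled) two-sample $$t$$ test with estimated degrees of freedom. A minimal sketch, regenerating x and y so that it runs on its own; the variable names `welch` and `alpha` are illustrative:

```r
set.seed(1)
x <- rnorm(10)
set.seed(2)
y <- rnorm(10)

# Welch two-sample t test: var.equal defaults to FALSE
welch <- t.test(x, y, alternative = "less")
print(welch$parameter)   # estimated degrees of freedom (generally not an integer)
print(welch$p.value)

# Decision at the 5% significance level
alpha <- 0.05
if (welch$p.value < alpha) {
  print("Reject H0")
} else {
  print("Fail to reject H0")
}
```

Because the two samples have the same size, the t statistic matches the pooled one; only the degrees of freedom, and therefore the p-value, can differ.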

## 9.3 Analysis of Paired Data

Skip: Paired Data and Two-Sample $$t$$ Procedures (pp. 386-387)

Skip: Paired Versus Unpaired Experiments (pp. 387-388)

There are two ways to carry out a paired test. Here we continue to use the x and y from Section 9.2 above.

[Method 1] Compute x - y and run a one-sample test on the differences.

t.test(x-y, alternative = "less")
##
##  One Sample t-test
##
## data:  x - y
## t = -0.17952, df = 9, p-value = 0.4308
## alternative hypothesis: true mean is less than 0
## 95 percent confidence interval:
##       -Inf 0.7272152
## sample estimates:
##   mean of x
## -0.07894886

[Method 2] Pass the additional argument paired = TRUE. By default, paired is FALSE in t.test(); here we set it to TRUE and supply x and y as two separate inputs.

t.test(x, y, alternative = "less", paired = TRUE)
##
##  Paired t-test
##
## data:  x and y
## t = -0.17952, df = 9, p-value = 0.4308
## alternative hypothesis: true mean difference is less than 0
## 95 percent confidence interval:
##       -Inf 0.7272152
## sample estimates:
## mean difference
##     -0.07894886

The two methods give identical results: the same t statistic, degrees of freedom, p-value, and confidence bound.
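
The agreement can also be checked programmatically, and the statistic itself reproduced by hand as $$t = \bar{d} / (s_D / \sqrt{n})$$, where $$\bar{d}$$ and $$s_D$$ are the mean and standard deviation of the differences. This is a minimal sketch; the names `d`, `t_manual`, `m1`, and `m2` are illustrative.

```r
set.seed(1)
x <- rnorm(10)
set.seed(2)
y <- rnorm(10)

d <- x - y                                        # paired differences
t_manual <- mean(d) / (sd(d) / sqrt(length(d)))   # paired t statistic by hand

m1 <- t.test(d, alternative = "less")                     # Method 1
m2 <- t.test(x, y, alternative = "less", paired = TRUE)   # Method 2

# Each comparison should confirm that the quantities agree
print(all.equal(unname(m1$statistic), unname(m2$statistic)))
print(all.equal(t_manual, unname(m1$statistic)))
print(all.equal(m1$p.value, m2$p.value))
```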

## 9.4 Inferences Concerning a Difference Between Population Proportions

Skip: Type II Error Probabilities and Sample Sizes (pp. 394-395)

### R

A/B Testing question from class:

clicks <- c(25, 20)
people <- c(100, 100)
prop.test(clicks, people, correct = FALSE)
##
##  2-sample test for equality of proportions without continuity correction
##
## data:  clicks out of people
## X-squared = 0.71685, df = 1, p-value = 0.3972
## alternative hypothesis: two.sided
## 95 percent confidence interval:
##  -0.06553817  0.16553817
## sample estimates:
## prop 1 prop 2
##   0.25   0.20
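
With a p-value of 0.3972, there is no significant difference between the two click rates at the 5% level. prop.test() reports a chi-squared statistic; without the continuity correction, it equals the square of the pooled two-proportion $$z$$ statistic. The sketch below reproduces that by hand under this assumption; the names `p_hat`, `p_pool`, `se`, `z`, and `p_value` are illustrative.

```r
clicks <- c(25, 20)
people <- c(100, 100)

p_hat <- clicks / people                              # sample proportions
p_pool <- sum(clicks) / sum(people)                   # pooled proportion under H0
se <- sqrt(p_pool * (1 - p_pool) * sum(1 / people))   # standard error under H0
z <- (p_hat[1] - p_hat[2]) / se                       # two-proportion z statistic
p_value <- 2 * pnorm(-abs(z))                         # two-sided p-value

print(round(c(z = z, z_squared = z^2, p_value = p_value), 4))
```

The squared statistic and the p-value should match the X-squared and p-value lines in the prop.test() output above.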