22  统计检验的功效

22.1 三大检验方法

统计检验的一般方法。

22.1.1 Wald 检验

22.1.2 Wilks 检验

也叫似然比检验

22.1.3 Rao 检验

也叫得分检验

22.2 t 检验的功效

检验的功效常用于样本量的计算

power.t.test() 计算单样本或两样本的 t 检验的功效,或者根据功效计算参数,如样本量

代码
library(ggplot2)
n <- 30 # 样本量(只是一个例子)
x <- seq(from = 0, to = 12, by = 0.01)
dat <- data.frame(xx = x / sqrt(n), yy = 2 * (1 - pt(x, n - 1)))
ggplot(data = dat, aes(x = xx, y = yy)) +
  geom_line(linewidth = 1) +
  geom_vline(xintercept = c(0.01, 0.2, 0.5, 0.8, 1.2, 2), linetype = 2) +
  theme_classic(base_size = 13) +
  labs(x = "$d = \\frac{t}{\\sqrt{n}}$", 
       y = "$2(1 - \\mathrm{pt}(x, n - 1))$")

图 22.1: t 检验的功效
power.t.test(
  n = 100, delta = 2.2,
  sd = 1, sig.level = 0.05,
  type = "two.sample",
  alternative = "two.sided"
)
#> 
#>      Two-sample t test power calculation 
#> 
#>               n = 100
#>           delta = 2.2
#>              sd = 1
#>       sig.level = 0.05
#>           power = 1
#>     alternative = two.sided
#> 
#> NOTE: n is number in *each* group
表格 22.1: 函数 power.t.test() 的参数及其含义
参数 含义
n 每个组的样本量
delta 两个组的均值之差
sd 标准差,默认值 1
sig.level 显著性水平,默认是 0.05 (犯第 I 类错误的概率)
power 检验的功效(1 - 犯第 II 类错误的概率)
type t 检验的类型 "two.sample" 两样本、"one.sample" 单样本或 "paired" 配对样本
alternative 单边或双边检验,取值为 "two.sided""one.sided"

参数 ndeltapowersdsig.level 必须有一个值为 NULL,为 NULL 的参数是由其它参数决定的。

# 前面 t 检验的等价功效计算
library(pwr)
pwr.t.test(
  d = 2.2 / 6.4,
  n = 100,
  sig.level = 0.05,
  type = "two.sample",
  alternative = "two.sided"
)
#> 
#>      Two-sample t test power calculation 
#> 
#>               n = 100
#>               d = 0.34375
#>       sig.level = 0.05
#>           power = 0.6768572
#>     alternative = two.sided
#> 
#> NOTE: n is number in *each* group

sleep 数据集为例,计算功效

# 分组计算均值
aggregate(data = sleep, extra ~ group, FUN = mean)
#>   group extra
#> 1     1  0.75
#> 2     2  2.33
# 分组计算标准差
aggregate(data = sleep, extra ~ group, FUN = sd)
#>   group    extra
#> 1     1 1.789010
#> 2     2 2.002249
# 代入计算功效
power.t.test(
  delta = 2.33 - 0.75,            # 两组均值之差
  sd = (2.002249 + 1.789010) / 2, # 标准差
  sig.level = 0.05,         # 显著性水平
  type = "two.sample",      # 两样本
  power = 0.95,             # 功效水平
  alternative = "two.sided" # 双边检验
)
#> 
#>      Two-sample t test power calculation 
#> 
#>               n = 38.39795
#>           delta = 1.58
#>              sd = 1.89563
#>       sig.level = 0.05
#>           power = 0.95
#>     alternative = two.sided
#> 
#> NOTE: n is number in *each* group

经检验,上面取两组的平均方差代替共同方差和下面精确计算的结果差不多。各组至少需要 39 个样本。MKpower 包精确计算 Welch t 检验的功效

library(MKpower)
power.welch.t.test(
  delta = 2.33 - 0.75,
  sd1 = 2.002249,
  sd2 = 1.789010,
  sig.level = 0.05,
  power = 0.95,
  alternative = "two.sided"
)

我国著名统计学家许宝騄先生对此功效计算方法做出过巨大贡献。

22.3 比例检验的功效

# power.prop.test()

power.prop.test() 计算两样本比例检验的功效

功效可以用来计算实验所需要的样本量,检验统计量的功效越大/高,检验方法越好,实验所需要的样本量越少

# p1 >= p2 的检验 单边和双边检验
power.prop.test(
  p1 = .65, p2 = 0.6, sig.level = .05,
  power = 0.90, alternative = "one.sided"
)
#> 
#>      Two-sample comparison of proportions power calculation 
#> 
#>               n = 1603.846
#>              p1 = 0.65
#>              p2 = 0.6
#>       sig.level = 0.05
#>           power = 0.9
#>     alternative = one.sided
#> 
#> NOTE: n is number in *each* group
power.prop.test(
  p1 = .65, p2 = 0.6, sig.level = .05,
  power = 0.90, alternative = "two.sided"
)
#> 
#>      Two-sample comparison of proportions power calculation 
#> 
#>               n = 1968.064
#>              p1 = 0.65
#>              p2 = 0.6
#>       sig.level = 0.05
#>           power = 0.9
#>     alternative = two.sided
#> 
#> NOTE: n is number in *each* group

pwrpwr.2p.test() 函数提供了类似 power.prop.test() 函数的功能

library(pwr)
# 明确 p1 > p2 的检验
# 单边检验拆分更加明细,分为大于和小于
pwr.2p.test(
  h = ES.h(p1 = 0.65, p2 = 0.6),
  sig.level = 0.05, power = 0.9, alternative = "greater"
)
#> 
#>      Difference of proportion power calculation for binomial distribution (arcsine transformation) 
#> 
#>               h = 0.1033347
#>               n = 1604.007
#>       sig.level = 0.05
#>           power = 0.9
#>     alternative = greater
#> 
#> NOTE: same sample sizes

已知两样本的样本量不等,检验 H_0: \(p_1 = p_2\) H_1: \(p_1 \neq p_2\) 的功效

pwr.2p2n.test(
  h = 0.30, n1 = 80, n2 = 245,
  sig.level = 0.05, alternative = "greater"
)
#> 
#>      difference of proportion power calculation for binomial distribution (arcsine transformation) 
#> 
#>               h = 0.3
#>              n1 = 80
#>              n2 = 245
#>       sig.level = 0.05
#>           power = 0.7532924
#>     alternative = greater
#> 
#> NOTE: different sample sizes

h 表示两个样本的差异,计算得到的功效是 0.75

22.4 方差分析的功效

power.anova.test() 计算平衡的单因素方差分析检验的功效

power.anova.test(
  groups = 4,       #  4 个组  
  between.var = 1,  # 组间方差为 1
  within.var = 3,   # 组内方差为 3
  power = 0.95      # 1 - 犯第二类错误的概率
)
#> 
#>      Balanced one-way analysis of variance power calculation 
#> 
#>          groups = 4
#>               n = 18.18245
#>     between.var = 1
#>      within.var = 3
#>       sig.level = 0.05
#>           power = 0.95
#> 
#> NOTE: n is number in each group
library(pwr)
# f 是如何和上面的组间/组内方差等价指定的
pwr.anova.test(
  k = 4,            # 组数
  f = 0.5,          # 效应大小
  sig.level = 0.05, # 显著性水平
  power = 0.95      # 检验的效
)
#> 
#>      Balanced one-way analysis of variance power calculation 
#> 
#>               k = 4
#>               n = 18.18244
#>               f = 0.5
#>       sig.level = 0.05
#>           power = 0.95
#> 
#> NOTE: n is number in each group