\[ Y_1,\ldots,Y_n \sim NID\left(\mu,\sigma^2\right) \qquad \overline{Y}=\sum_{i=1}^n Y_i \qquad S^2=\frac{1}{n-1}\sum_{i=1}^n\left(Y_i-\overline{Y}\right)^2\]
\[ \mbox{Fixed Constants: } a_1,\ldots,a_n \qquad U=\sum_{i=1}^n a_iY_i \quad \Rightarrow \quad E\{U\} = E\left\{\sum_{i=1}^n a_iY_i\right\}=\sum_{i=1}^n a_i E\{Y_i\} = \sum_{i=1}^n a_i \mu\]
\[ V\{U\} = V\left\{\sum_{i=1}^n a_iY_i\right\}=\sum_{i=1}^n a_i^2 V\{Y_i\}=\sum_{i=1}^n a_i^2\sigma^2 \qquad U \sim N\left(\mu_U=\sum_{i=1}^n a_i \mu,\sigma_U^2=\sum_{i=1}^n a_i^2\sigma^2\right)\]
Consider the sample mean \(\overline{Y}=\sum_{i=1}^n a_iY_i\) with \(a_1=\cdots=a_n=1/n\):
\[ E\left\{\overline{Y}\right\}=\sum_{i=1}^n\frac{1}{n}\mu=n\frac{1}{n}\mu=\mu \qquad V\left\{\overline{Y}\right\}=\sum_{i=1}^n\left(\frac{1}{n}\right)^2\sigma^2=n\left(\frac{1}{n}\right)^2\sigma^2 =\frac{\sigma^2}{n} \qquad \overline{Y} \sim N\left(\mu,\frac{\sigma^2}{n}\right)\]
\[ \overline{Y} \sim N\left(\mu,\frac{\sigma^2}{n}\right) \quad \Rightarrow \quad Z=\frac{\overline{Y}-\mu}{\sqrt{\frac{\sigma^2}{n}}} \sim N(0,1)\]
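This sampling distribution can be checked with a short simulation; the sketch below is self-contained, with arbitrary illustrative values of \(\mu\), \(\sigma\), and \(n\).

```r
## Simulate the sampling distribution of the mean of n iid N(mu, sigma^2) draws
set.seed(1)
mu <- 10; sigma <- 4; n <- 16
R <- 100000                                     ## number of simulated samples
ybar <- replicate(R, mean(rnorm(n, mu, sigma)))
mean(ybar)                                      ## approximately mu = 10
sd(ybar)                                        ## approximately sigma/sqrt(n) = 1
```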
Under this model, the sample variance, scaled by multiplying by \(n-1\) and dividing by \(\sigma^2\), follows the chi-square distribution with \(n-1\) degrees of freedom and is independent of \(\overline{Y}\).
\[Z=\frac{\overline{Y}-\mu}{\sqrt{\frac{\sigma^2}{n}}} \sim N(0,1) \qquad W=\frac{(n-1)S^2}{\sigma^2} \sim \chi_{n-1}^2 \qquad \overline{Y} \perp S^2 \qquad Z \perp W \]
If \(Z\sim N(0,1)\) and \(W \sim \chi_{\nu}^2\), then the following quantity is distributed as Student’s-\(t\) with \(\nu\) degrees of freedom.
\[ T=\frac{Z}{\sqrt{W/\nu}} \sim t_{\nu} \quad \Rightarrow \quad \frac{\frac{\overline{Y}-\mu}{\sqrt{\sigma^2/n}}}{\sqrt{\frac{(n-1)S^2/\sigma^2}{(n-1)}}} =\frac{\overline{Y}-\mu}{\sqrt{S^2/n}} \sim t_{n-1}\]
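The construction can be illustrated by simulation: build \(T\) from independent standard normal and chi-square draws and compare its quantiles with `qt()`. The value of \(\nu\) and the number of draws are arbitrary choices.

```r
## Build T = Z/sqrt(W/nu) from independent Z ~ N(0,1) and W ~ chi-square(nu)
set.seed(2)
nu <- 24
R <- 500000
z <- rnorm(R)
w <- rchisq(R, nu)
t.sim <- z / sqrt(w / nu)
quantile(t.sim, c(.025, .975))   ## compare with qt(c(.025, .975), nu)
```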
This allows making inference on the population mean \(\mu\) without needing to know the population variance \(\sigma^2\). Finally, we consider ratios of independent sample variances.
\[ W_1 \sim \chi_{\nu_1}^2 \qquad W_2 \sim \chi_{\nu_2}^2 \qquad W_1 \perp W_2 \quad \Rightarrow \quad F=\frac{W_1/\nu_1}{W_2/\nu_2} \sim F_{\nu_1,\nu_2}\]
Expected values and variances for the chi-square, Student’s-\(t\) and \(F\) distributions are given below.
\[ W \sim \chi_{\nu}^2 \quad \Rightarrow \quad E\{W\} = \nu \qquad V\{W\} = 2\nu \]
\[ T \sim t_{\nu} \quad \Rightarrow \quad E\{T\}=0 \qquad V\{T\}=\frac{\nu}{\nu-2} \quad \nu > 2\]
\[ F \sim F_{\nu_1,\nu_2} \quad \Rightarrow \quad E\{F\}= \frac{\nu_2}{\nu_2-2} \quad \nu_2>2 \qquad V\{F\}=2\left(\frac{\nu_2}{\nu_2-2}\right)^2\frac{\nu_1+\nu_2-2}{\nu_1\left(\nu_2-4\right)}\quad \nu_2>4 \]
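These moments can be verified numerically with `integrate()`; a sketch with \(\nu=24\) (an arbitrary choice matching the simulations below):

```r
## Numerically check E{W} = nu and V{W} = 2*nu for W ~ chi-square(nu),
## and V{T} = nu/(nu-2) for T ~ t(nu)
nu <- 24
EW  <- integrate(function(x) x   * dchisq(x, nu), 0, Inf)$value
EW2 <- integrate(function(x) x^2 * dchisq(x, nu), 0, Inf)$value
ET2 <- integrate(function(x) x^2 * dt(x, nu), -Inf, Inf)$value
c(E.W = EW, V.W = EW2 - EW^2, V.T = ET2)   ## should equal nu, 2*nu, nu/(nu-2)
```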
We now consider WNBA team game points: the predicted points based on the Vegas point spread and over/under lines, the actual points, and their differences (actual - predicted).
## Read in data into wnba dataframe
wnba <- read.csv(
"https://www.stat.ufl.edu/~winner/data/wnba_20102024_ATS_OU.csv")
names(wnba)
## [1] "team" "date" "year" "tmGmYr" "opGmYr"
## [6] "home" "teamPts" "oppPts" "OT" "tmSprd"
## [11] "OU" "tmSpDev" "oppTeam" "tmWin" "tmATSWin"
## [16] "tmOver" "tmWinSeas" "tmLosSeas" "tmWinStrk" "tmLosStrk"
## [21] "tmATSWinSeas" "tmATSLosSeas" "tmATSWinStrk" "tmATSLosStrk" "tmOVRSeas"
## [26] "tmUDRSeas" "tmOVRStrk" "tmUDRStrk" "opWin" "opATSWin"
## [31] "opOver" "opWinSeas" "opLosSeas" "opWinStrk" "opLosStrk"
## [36] "opATSWinSeas" "opATSLosSeas" "opATSWinStrk" "opATSLosStrk" "opOVRSeas"
## [41] "opUDRSeas" "opOVRStrk" "opUDRStrk" "day" "month"
## [46] "year.1" "dmy" "code"
wnba$tmSprd <- -wnba$tmSprd ## Reverse team spread so that favorite team has positive spread
wnba$tmPred <- (wnba$OU + wnba$tmSprd)/2 ## Team's Predicted score = (OU + spread)/2
wnba$tmPtDiff <- wnba$teamPts - wnba$tmPred ## Team's Difference = Actual - Predicted
## Print stacked histograms of predicted, actual, and difference
par(mfrow=c(3,1))
hist(wnba$tmPred, breaks="Scott", col="green",main="Predicted Points")
hist(wnba$teamPts, breaks="Scott", col="yellow",main="Actual Points")
hist(wnba$tmPtDiff, breaks="Scott", col="lightblue",main="Act-Pred")
par(mfrow=c(1,1))
## R's shapiro.test() only accepts sample sizes up to 5000, and the population has N = 6118 games
## We take 100 random samples of 5000 games, compute the Shapiro-Wilk P-value for each, and average them
shapiro.p <- matrix(rep(0,300), ncol=3)
set.seed(12345)
N <- nrow(wnba)
n <- 5000
for (i1 in 1:100) {
sample.5k <- sample(N,n,replace=FALSE)
shapiro.p[i1,1] <- shapiro.test(wnba$tmPred[sample.5k])$p.value
shapiro.p[i1,2] <- shapiro.test(wnba$teamPts[sample.5k])$p.value
shapiro.p[i1,3] <- shapiro.test(wnba$tmPtDiff[sample.5k])$p.value
}
## Compute population standard deviations and adjust denominator in variance to be N, not N-1
sigma.prd <- sd(wnba$tmPred)*sqrt((N-1)/N)
sigma.act <- sd(wnba$teamPts)*sqrt((N-1)/N)
sigma.dif <- sd(wnba$tmPtDiff)*sqrt((N-1)/N)
## Print out the summaries for each variable
summary.out <- cbind(rbind(nrow(wnba), mean(wnba$tmPred),
sigma.prd, cor(wnba$tmPred,wnba$teamPts),
sigma.prd/sqrt(nrow(wnba)),
mean(shapiro.p[,1])),
rbind(nrow(wnba), mean(wnba$teamPts),
sigma.act, cor(wnba$tmPred,wnba$teamPts),
sigma.act/sqrt(nrow(wnba)),
mean(shapiro.p[,2])),
rbind(nrow(wnba), mean(wnba$tmPtDiff),
sigma.dif,NA,
sigma.dif/sqrt(nrow(wnba)),
mean(shapiro.p[,3]))
)
colnames(summary.out) <- c("Predicted", "Actual", "Diff (A-P)")
rownames(summary.out) <- c("N", "mean", "SD", "Corr", "Std Err",
"Shapiro-Wilk P-value")
round(summary.out,4)
## Predicted Actual Diff (A-P)
## N 6118.0000 6118.0000 6118.0000
## mean 79.7896 79.8710 0.0814
## SD 5.7513 11.4824 10.0805
## Corr 0.4793 0.4793 NA
## Std Err 0.0735 0.1468 0.1289
## Shapiro-Wilk P-value 0.0000 0.0001 0.1259
## Histogram of differences with superimposed normal density
hist(wnba$tmPtDiff, breaks="Scott", col="yellow",
xlab="Actual-Predicted Points", freq=FALSE,
main="WNBA Team Point Differential")
lines(seq(min(wnba$tmPtDiff), max(wnba$tmPtDiff), length=1000),
dnorm(seq(min(wnba$tmPtDiff), max(wnba$tmPtDiff), length=1000),
mean(wnba$tmPtDiff), sigma.dif), col="purple", lwd=2.5)
Note that all three distributions are unimodal, but the predicted and actual points are slightly skewed, so they have small P-values for the Shapiro-Wilk test with these large “samples.”
Let \(P\) represent the predicted score, \(A\) represent the actual score, and \(D\) represent the difference. Then \(D=A-P\). This leads to the following mean and variance for \(D\), based on those for \(A\) and \(P\) and their covariance \(\left(\sigma_{AP}\right)\).
\[ D=1(A)+(-1)P \quad \Rightarrow \quad \mu_D=1\mu_A + (-1)\mu_P = 79.8710-79.7896=0.0814\] \[\sigma_D^2 = 1^2\sigma_A^2 + (-1)^2\sigma_P^2 + 2(1)(-1)\sigma_{AP} \quad \mbox{where:} \quad \sigma_{AP}=\rho_{AP}\sigma_A\sigma_P \] \[\sigma_D^2=1^2(11.4824)^2+(-1)^2(5.7513)^2+2(1)(-1)(0.4793)(11.4824)(5.7513)=10.0805^2=101.6165\]
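The arithmetic can be checked directly from the rounded summary statistics (so agreement is only to rounding error):

```r
## Verify sigma_D^2 = sigma_A^2 + sigma_P^2 - 2*rho_AP*sigma_A*sigma_P
sigma.A <- 11.4824                 ## SD of actual points (from summary table)
sigma.P <- 5.7513                  ## SD of predicted points
rho.AP  <- 0.4793                  ## correlation between actual and predicted
sigma2.D <- sigma.A^2 + sigma.P^2 - 2 * rho.AP * sigma.A * sigma.P
c(sigma2.D, sqrt(sigma2.D))        ## approximately 101.62 and 10.08
```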
N.samples <- 100000 ## Take 100000 random samples (better resolution for plots)
n <- 25 ## Each sample will have n=25 team games, sampled without replacement
N <- length(wnba$tmPtDiff) ## Population size (needed for taking random samples)
set.seed(32611) ## Set seed for reproducibility
results.mat <- matrix(rep(0,N.samples*3),ncol=3) ## Results Matrix - save ybar, s^2, SW-p-value
mu.Y.dif <- mean(wnba$tmPtDiff) ## Population mean for diffs: mu_D
sigma2.Y.dif <- sigma.dif^2 ## Population variance for diffs: sigma_D^2
### Take 100000 random samples of n=25, save sample mean, variance, Shapiro-Wilk P-value
for (i1 in 1:N.samples) {
samp.y <- sample(N, n, replace=FALSE) ## n=25 integers between 1 and N
y <- wnba$tmPtDiff[samp.y] ## Point differences for the 25 games
results.mat[i1,1] <- mean(y) ## Sample Mean
results.mat[i1,2] <- var(y) ## Sample Variance
results.mat[i1,3] <- shapiro.test(y)$p.value ## Shapiro-Wilk P-value
}
ybar.dif <- results.mat[,1] ## 100000 ybar values
s2.dif <- results.mat[,2] ## 100000 s^2 values
X2.y.dif <- (n-1)*s2.dif/sigma2.Y.dif ## 100000 (n-1)s^2/sigma^2 (Chi-square) values
SE.ybar.dif <- sqrt(s2.dif/n) ## 100000 SE{ybar}=sqrt{s^2/n} values
## Table of results
meanvar.out <- rbind(
cbind(mu.Y.dif, sigma2.Y.dif, sqrt(sigma2.Y.dif/n),n-1, n-1, 2*(n-1)),
cbind(mean(ybar.dif),mean(s2.dif),sd(ybar.dif),
n-1,mean(X2.y.dif), var(X2.y.dif))
)
colnames(meanvar.out) <- c("Mean(Y)", "Var(Y)","SE{Ybar}",
"df", "Mean(X2)","Var(X2)")
rownames (meanvar.out) <- c("Theoretical","Empirical")
round(meanvar.out, 3)
## Mean(Y) Var(Y) SE{Ybar} df Mean(X2) Var(X2)
## Theoretical 0.081 101.617 2.016 24 24.000 48.000
## Empirical 0.089 101.590 1.998 24 23.994 48.873
## Useful for setting X-limits on histograms
summary(ybar.dif)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -7.91000 -1.26000 0.09000 0.08915 1.44000 9.02000
summary(X2.y.dif)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 4.387 18.976 23.333 23.994 28.299 64.855
The theoretical and empirical distributions of the sample means, sample variances, and scaled variances are summarized here from the table above. \[ \mbox{Theoretical Mean: }\quad \mu_{\overline{Y}_D}=\mu_D=0.081 \qquad \mbox{Empirical: } \quad \overline{\overline{y}_D}=0.089\]
\[ \mbox{Theoretical Variance: }\quad \sigma_D^2=101.617 \qquad \mbox{Empirical: } \quad \overline{s_d^2}=101.590\]
\[\mbox{Theoretical Std Error of } \overline{Y}_D \quad SE\left\{\overline{Y}_D\right\}= \sqrt{\frac{\sigma_D^2}{n}}=\sqrt{\frac{101.617}{25}}=2.016 \qquad \mbox{Empirical: } \sqrt{\frac{1}{99999}\sum_{j=1}^{100000} \left(\overline{y}_{Dj}-\overline{\overline{y}_D}\right)^2}=1.998\]
\[ \mbox{Theoretical Mean of } \quad X_D^2=\frac{(n-1)S_D^2}{\sigma_D^2}=n-1=24 \qquad \mbox{Empirical: } \frac{1}{100000}\sum_{j=1}^{100000} X_{Dj}^2=\overline{X_{D}^2}=23.994\]
\[ \mbox{Theoretical Variance of } \quad X_D^2=\frac{(n-1)S_D^2}{\sigma_D^2}=2(n-1)=48 \qquad \mbox{Empirical: } \frac{1}{99999}\sum_{j=1}^{100000} \left(X_{Dj}^2-\overline{X_{D}^2}\right)^2=48.873\]
####### Sample Mean ~ N(mu,sigma^2/n)
mean.grid <- seq(-12,12,.01)
hist(ybar.dif, seq(-12,12,0.2), col="yellow", border="green",
freq=F,xlab="Sample Mean Team Point Diff",
main=expression(paste("Sampling Dist of ", bar(Y))))
lines(mean.grid,dnorm(mean.grid,mu.Y.dif,sqrt(sigma2.Y.dif/n)), col="purple",
lwd=2.5)
###### Scaled Variance = (n-1)*S^2/sigma^2 ~ X2(n-1)
X2.grid <- seq(0,70,.01)
hist(X2.y.dif, seq(0,70,0.5), col="green", border="yellow",
freq=F,xlab="Scaled Variance Team Point Diff (X2)",
main=expression(paste("Sampling Dist of ", (n-1)(S/sigma)^2)))
lines(X2.grid,dchisq(X2.grid,n-1), col="purple",
lwd=2.5)
###### Student's t-value = (Ybar-mu)/sqrt(S^2/n) ~ t(n-1)
t.val.dif <- (ybar.dif-mu.Y.dif)/sqrt(s2.dif/n)
t.grid <- seq(-8,8,0.01)
hist(t.val.dif, seq(-8,8,0.2), col="pink", border="red",
freq=F, xlab="t-value for Difference",
main="Sampling Distribution of t-value")
lines(t.grid,dt(t.grid,n-1),col="blue", lwd=2.5)
###### Shapiro-Wilk Statistic P-value
shapiro.p <- results.mat[,3]
mean(shapiro.p < 0.05)
## [1] 0.05288
hist(shapiro.p, breaks=seq(0,1,0.01),col="turquoise",border="blue",
xlab="P-value", main="P-values for Shapiro-Wilk Test")
abline(v=0.05, col="red", lwd=2.5)
abline(h=1000, col="purple", lwd=2.5)
Histograms are consistent with their theoretical distributions. Further,
the P-values for the Shapiro-Wilk test are below the nominal 0.05 level
for approximately 5% of samples, consistent with the differences being
(approximately) normally distributed.
We consider the .05, .25, .50, .75, and .95 quantiles for the variables. For the probabilities, we consider: 1) falling more than 2 standard deviations below the mean, 2) falling within 1 standard deviation of the mean, and 3) falling more than 2 standard deviations above the mean.
### Obtaining theoretical and empirical quantiles and output results table
quantiles <- c(.05,.25,.50,.75,.95)
q.ybar.th <- qnorm(quantiles, mu.Y.dif, sqrt(sigma2.Y.dif/n)) ## Theoretical normal quantiles
q.X2.th <- qchisq(quantiles, n-1) ## Theoretical chi-square quantiles
q.t.th <- qt(quantiles, n-1) ## Theoretical Student's t quantiles
q.ybar.em <- quantile(ybar.dif, quantiles) ## Empirical normal quantiles
q.X2.em <- quantile(X2.y.dif, quantiles) ## Empirical chi-square quantiles
q.t.em <- quantile(t.val.dif, quantiles) ## Empirical Student's t quantiles
quant.dif.out <- rbind(q.ybar.th,q.ybar.em, q.X2.th, q.X2.em, q.t.th, q.t.em)
colnames(quant.dif.out) <- c("5%", "25%", "50%", "75%", "95%")
rownames(quant.dif.out) <- c("Mean Theoretical", "Mean Empirical", "Scale Var Theoretical",
"Scale Var Empirical", "t Theoretical", "t Empirical")
round(quant.dif.out, 3)
## 5% 25% 50% 75% 95%
## Mean Theoretical -3.235 -1.278 0.081 1.441 3.398
## Mean Empirical -3.220 -1.260 0.090 1.440 3.370
## Scale Var Theoretical 13.848 19.037 23.337 28.241 36.415
## Scale Var Empirical 13.745 18.976 23.333 28.299 36.521
## t Theoretical -1.711 -0.685 0.000 0.685 1.711
## t Empirical -1.716 -0.678 0.004 0.678 1.683
### Obtaining theoretical and empirical probabilities and output results table
ybar.mu <- mu.Y.dif
ybar.sigma <- sqrt(sigma2.Y.dif/n)
X2.mu <- n-1
X2.sigma <- sqrt(2*(n-1))
t.mu <- 0
t.sigma <- sqrt((n-1)/(n-3))
p.ybar.th <- cbind(pnorm(ybar.mu-2*ybar.sigma,ybar.mu,ybar.sigma),
pnorm(ybar.mu+ybar.sigma,ybar.mu,ybar.sigma) - pnorm(ybar.mu-ybar.sigma,ybar.mu,ybar.sigma),
1-pnorm(ybar.mu+2*ybar.sigma,ybar.mu,ybar.sigma))
p.X2.th <- cbind(pchisq(X2.mu-2*X2.sigma,n-1),
pchisq(X2.mu+X2.sigma,n-1) - pchisq(X2.mu-X2.sigma,n-1),
1-pchisq(X2.mu+2*X2.sigma,n-1))
p.t.th <- cbind(pt(t.mu-2*t.sigma,n-1),
pt(t.mu+t.sigma,n-1) - pt(t.mu-t.sigma,n-1),
1-pt(t.mu+2*t.sigma,n-1))
p.ybar.em <- cbind(mean(ybar.dif <= ybar.mu-2*ybar.sigma),
mean(ybar.dif <= ybar.mu+ybar.sigma & ybar.dif >= ybar.mu-ybar.sigma),
mean(ybar.dif >= ybar.mu+2*ybar.sigma))
p.X2.em <- cbind(mean(X2.y.dif <= X2.mu-2*X2.sigma),
mean(X2.y.dif <= X2.mu+X2.sigma & X2.y.dif >= X2.mu-X2.sigma),
mean(X2.y.dif >= X2.mu+2*X2.sigma))
p.t.em <- cbind(mean(t.val.dif <= t.mu-2*t.sigma),
mean(t.val.dif <= t.mu+t.sigma & t.val.dif >= t.mu-t.sigma),
mean(t.val.dif >= t.mu+2*t.sigma))
prob.dif.out <- rbind(p.ybar.th,p.ybar.em, p.X2.th, p.X2.em, p.t.th, p.t.em)
colnames(prob.dif.out) <- c("<m-2s", "[m-s,m+s]", ">m+2s")
rownames(prob.dif.out) <- c("Mean Theoretical", "Mean Empirical", "Scale Var Theoretical",
"Scale Var Empirical", "t Theoretical", "t Empirical")
round(prob.dif.out, 3)
## <m-2s [m-s,m+s] >m+2s
## Mean Theoretical 0.023 0.683 0.023
## Mean Empirical 0.021 0.687 0.023
## Scale Var Theoretical 0.006 0.690 0.036
## Scale Var Empirical 0.007 0.684 0.037
## t Theoretical 0.024 0.693 0.024
## t Empirical 0.024 0.697 0.023
There is very strong agreement between the theoretical and empirical quantiles for the sample means, scaled variances, and t-values for game point differences.
### Predicted Points (tmPred)
N.samples <- 100000
n <- 25
N <- length(wnba$tmPred)
set.seed(12345)
results.mat <- matrix(rep(0,N.samples*3),ncol=3)
mu.Y.prd <- mean(wnba$tmPred)
sigma2.Y.prd <- sigma.prd^2
for (i1 in 1:N.samples) {
samp.y <- sample(N, n, replace=FALSE)
y <- wnba$tmPred[samp.y]
results.mat[i1,1] <- mean(y)
results.mat[i1,2] <- var(y)
results.mat[i1,3] <- shapiro.test(y)$p.value
}
ybar.prd <- results.mat[,1]
s2.prd <- results.mat[,2]
X2.y.prd <- (n-1)*s2.prd/sigma2.Y.prd
SE.ybar.prd <- sqrt(s2.prd/n)
meanvar.out <- rbind(
cbind(mu.Y.prd, sigma2.Y.prd, sqrt(sigma2.Y.prd/n),n-1, n-1, 2*(n-1)),
cbind(mean(ybar.prd),mean(s2.prd),sd(ybar.prd),
n-1,mean(X2.y.prd), var(X2.y.prd))
)
colnames(meanvar.out) <- c("Mean(Y)", "Var(Y)","SE{Ybar}",
"df", "Mean(X2)","Var(X2)")
rownames (meanvar.out) <- c("Theoretical","Empirical")
round(meanvar.out, 3)
## Mean(Y) Var(Y) SE{Ybar} df Mean(X2) Var(X2)
## Theoretical 79.79 33.077 1.150 24 24.000 48.000
## Empirical 79.78 33.076 1.142 24 23.999 44.521
summary(ybar.prd)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 75.24 79.01 79.78 79.78 80.55 84.80
summary(X2.y.prd)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 5.045 19.239 23.423 23.999 28.101 66.359
####### Sample Mean ~ N(mu,sigma^2/n)
mean.grid <- seq(72,88,.01)
hist(ybar.prd, seq(72,88,0.1), col="yellow", border="green",
freq=F,xlab="Sample Mean Team Point Pred",
main=expression(paste("Sampling Dist of ", bar(Y))))
lines(mean.grid,dnorm(mean.grid,mu.Y.prd,sqrt(sigma2.Y.prd/n)), col="purple",
lwd=2.5)
###### Scaled Variance = (n-1)*S^2/sigma^2 ~ X2(n-1)
X2.grid <- seq(0,70,.01)
hist(X2.y.prd, seq(0,70,0.5), col="green", border="yellow",
freq=F,xlab="Scaled Variance Team Point Pred (X2)",
main=expression(paste("Sampling Dist of ", (n-1)(S/sigma)^2)))
lines(X2.grid,dchisq(X2.grid,n-1), col="purple",
lwd=2.5)
###### Student's t-value = (Ybar-mu)/sqrt(S^2/n) ~ t(n-1)
t.val.prd <- (ybar.prd-mu.Y.prd)/sqrt(s2.prd/n)
t.grid <- seq(-8,8,0.01)
hist(t.val.prd, seq(-8,8,0.2), col="pink", border="red",
freq=F, xlab="t-value",main="Sampling Distribution of t-value")
lines(t.grid,dt(t.grid,n-1),col="blue", lwd=2.5)
###### Shapiro-Wilk Statistic P-value
shapiro.p <- results.mat[,3]
mean(shapiro.p < 0.05)
## [1] 0.04697
hist(shapiro.p, breaks=seq(0,1,0.01),col="turquoise",border="blue",
xlab="P-value", main="P-values for Shapiro-Wilk Test")
abline(v=0.05, col="red", lwd=2.5)
abline(h=1000, col="purple", lwd=2.5)
While the team game predictions were found to not be normal in Part 1, they still show good properties with respect to the mean, scaled variance, and \(t\)-value. The behavior of the mean is not surprising, due to the Central Limit Theorem and the fact that the distribution is not “far” from normal. The only real difference between the theoretical and empirical results is that the empirical variance of the scaled variance (44.5) is smaller than the theoretical value (48.0).
### Actual Points (teamPts)
N.samples <- 100000
n <- 25
N <- length(wnba$teamPts)
set.seed(98765)
results.mat <- matrix(rep(0,N.samples*3),ncol=3)
mu.Y.act <- mean(wnba$teamPts)
sigma2.Y.act <- sigma.act^2
for (i1 in 1:N.samples) {
samp.y <- sample(N, n, replace=FALSE)
y <- wnba$teamPts[samp.y]
results.mat[i1,1] <- mean(y)
results.mat[i1,2] <- var(y)
results.mat[i1,3] <- shapiro.test(y)$p.value
}
ybar.act <- results.mat[,1]
s2.act <- results.mat[,2]
X2.y.act <- (n-1)*s2.act/sigma2.Y.act
SE.ybar.act <- sqrt(s2.act/n)
meanvar.out <- rbind(
cbind(mu.Y.act, sigma2.Y.act, sqrt(sigma2.Y.act/n),n-1, n-1, 2*(n-1)),
cbind(mean(ybar.act),mean(s2.act),sd(ybar.act),
n-1,mean(X2.y.act), var(X2.y.act))
)
colnames(meanvar.out) <- c("Mean(Y)", "Var(Y)","SE{Ybar}",
"df", "Mean(X2)","Var(X2)")
rownames (meanvar.out) <- c("Theoretical","Empirical")
round(meanvar.out, 3)
## Mean(Y) Var(Y) SE{Ybar} df Mean(X2) Var(X2)
## Theoretical 79.871 131.846 2.296 24 24.000 48.000
## Empirical 79.877 131.950 2.285 24 24.019 46.996
summary(ybar.act)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 69.52 78.32 79.88 79.88 81.40 89.72
summary(X2.y.act)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 4.743 19.124 23.366 24.019 28.223 65.730
####### Sample Mean ~ N(mu,sigma^2/n)
mean.grid <- seq(67,93,.01)
hist(ybar.act, seq(67,93,0.2),
col="yellow", border="green",
freq=F,xlab="Sample Mean Team Point Act",
main=expression(paste("Sampling Dist of ", bar(Y))))
lines(mean.grid,dnorm(mean.grid,mu.Y.act,sqrt(sigma2.Y.act/n)), col="purple",
lwd=2.5)
###### Scaled Variance = (n-1)*S^2/sigma^2 ~ X2(n-1)
X2.grid <- seq(0,70,.01)
hist(X2.y.act, seq(0,70,0.5), col="green", border="yellow",
freq=F,xlab="Scaled Variance Team Point Act (X2)",
main=expression(paste("Sampling Dist of ", (n-1)(S/sigma)^2)))
lines(X2.grid,dchisq(X2.grid,n-1), col="purple",
lwd=2.5)
###### Student's t-value = (Ybar-mu)/sqrt(S^2/n) ~ t(n-1)
t.val.act <- (ybar.act-mu.Y.act)/sqrt(s2.act/n)
t.grid <- seq(-8,8,0.01)
hist(t.val.act, seq(-8,8,0.2), col="pink", border="red",
freq=F, xlab="t-value",main="Sampling Distribution of t-value")
lines(t.grid,dt(t.grid,n-1),col="blue", lwd=2.5)
###### Shapiro-Wilk Statistic P-value
shapiro.p <- results.mat[,3]
mean(shapiro.p < 0.05)
## [1] 0.05284
hist(shapiro.p, breaks=seq(0,1,0.01),col="turquoise",border="blue",
xlab="P-value", main="P-values for Shapiro-Wilk Test")
abline(v=0.05, col="red", lwd=2.5)
abline(h=1000, col="purple", lwd=2.5)
While the team game actual scores were found to not be normal in Part 1, they still show good properties with respect to the mean, scaled variance, and \(t\)-value. The behavior of the mean is not surprising, due to the Central Limit Theorem and the fact that the distribution is not “far” from normal. Note that the empirical variance of the scaled variance (47.0) is only slightly smaller than the theoretical value (48.0).
Note that to assure that the samples are (approximately) independent, the sampled game predictions and actual scores were obtained with different seeds. With \(n=25\) and \(N=6118\), the independent samples assumption is reasonable.
F.ratio <- (s2.act/sigma2.Y.act)/(s2.prd/sigma2.Y.prd)
df1 <- df2 <- n-1
F.mu <- df2/(df2-2)
F.sigma2 <- 2*F.mu^2*(df1+df2-2)/(df1*(df2-4))
F.out <- rbind(
cbind(F.mu, F.sigma2, qf(.05,df1,df2),qf(.25,df1,df2),qf(.5,df1,df2),qf(.75,df1,df2),qf(.95,df1,df2)),
cbind(mean(F.ratio),var(F.ratio),quantile(F.ratio,0.05),quantile(F.ratio,0.25),
quantile(F.ratio,0.5),quantile(F.ratio,0.75),quantile(F.ratio,0.95)))
colnames(F.out) <- c("Mean", "Var", "5%", "25%", "Median", "75%", "95%")
rownames(F.out) <- c("Theoretical", "Empirical")
round(F.out,3)
## Mean Var 5% 25% Median 75% 95%
## Theoretical 1.091 0.228 0.504 0.757 1.000 1.321 1.984
## Empirical 1.085 0.215 0.511 0.762 0.999 1.311 1.945
summary(F.ratio)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.1276 0.7622 0.9991 1.0853 1.3106 6.8261
F.grid <- seq(0,8,.01)
hist(F.ratio, seq(0,8,0.05), freq=F, col="orange", main="F Ratio")
lines(F.grid,df(F.grid,n-1,n-1),col="blue", lwd=2.5)
Based on the theoretical and empirical distributions, and despite the small right skews in the predicted and actual points, the \(F\)-distribution is still a good model for the ratio of scaled variances.
We showed previously that \(T=\sqrt{n}\left(\overline{Y}-\mu\right)/S \sim t_{n-1}\). This leads to the following result, where \(t_a\) denotes the critical value with upper tail probability \(a\), that is, \(P\left(T\geq t_a\right)=a\).
\[ T=\frac{\overline{Y}-\mu}{\sqrt{S^2/n}}\sim t_{n-1} \quad \Rightarrow \quad P\left(t_{1-\alpha/2,n-1} \leq \frac{\overline{Y}-\mu}{\sqrt{S^2/n}} \leq t_{\alpha/2,n-1}\right) = 1-\alpha\]
With basic algebra, the goal is to isolate \(\mu\) in the center of the probability statement.
\[ \quad \Rightarrow \quad P\left(\overline{Y}-t_{\alpha/2,n-1}\sqrt{S^2/n} \leq \mu \leq \overline{Y}-t_{1-\alpha/2,n-1}\sqrt{S^2/n}\right)=1-\alpha\]
That is, in repeated sampling the true value \(\mu\) falls within this (random) interval for \((1-\alpha)100\%\) of such intervals.
Once we collect our sample, we construct the interval with the observed sample mean \(\overline{y}\) and observed sample variance \(s^2\).
When testing \(H_0:\mu=\mu_0\), for a specific value \(\mu_0\), we construct a test statistic \(t_{obs}\) with \(\mu_0\) in place of \(\mu\). When the null hypothesis is true, the resulting test statistic follows the (central) Student’s \(t\)-distribution; otherwise it is non-central \(t\).
We reject the null hypothesis if \(|t_{obs}| \geq t_{\alpha/2,n-1}\) for a 2-tailed test. The \(P\)-value is given below.
\[H_0:\mu=\mu_0 \quad H_A:\mu \neq \mu_0 \qquad \mbox{Test Statistic:} \quad t_{obs}=\frac{\overline{Y}-\mu_0}{\sqrt{S^2/n}}\]
\[ \mbox{Rejection Region:} \quad |t_{obs}|\geq t_{\alpha/2,n-1} \qquad P=2P\left(t_{n-1}\geq |t_{obs}|\right)\]
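For a single sample, the interval and test statistic can be computed directly, and they agree with R's built-in `t.test()`; the sketch below uses simulated data with hypothetical parameter values near those of the point differences.

```r
## One-sample t confidence interval and test of H0: mu = 0, by hand and via t.test()
set.seed(3)
n <- 25
y <- rnorm(n, mean = 0.08, sd = 10)   ## hypothetical mu and sigma
ybar <- mean(y); s2 <- var(y)
t.obs <- (ybar - 0) / sqrt(s2 / n)
CI    <- ybar + qt(c(.025, .975), n - 1) * sqrt(s2 / n)
p.val <- 2 * (1 - pt(abs(t.obs), n - 1))
tt <- t.test(y, mu = 0)               ## same statistic, P-value, and 95% CI
```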
For the team point difference samples generated above, setting \(\alpha=0.05\) leads to 95% Confidence Intervals and tests with a 5% Type I error rate; with \(n=25\), the critical value is \(t_{.025,25-1}=2.064\).
Here we construct 95% CI’s based on each sample and test \(H_0:\mu=0\) versus \(H_A:\mu\neq 0\).
## Construct 95% Confidence Interval for population mean difference
mu.LB <- ybar.dif + qt(.025,n-1)*sqrt(s2.dif/n)
mu.UB <- ybar.dif + qt(.975,n-1)*sqrt(s2.dif/n)
## Proportion of CI's Containing Population mean difference
mu.cover <- mean(mu.LB <= mu.Y.dif & mu.UB >= mu.Y.dif)
mu.uncover.lo <- mean(mu.UB <= mu.Y.dif)
mu.uncover.hi <- mean(mu.LB >= mu.Y.dif)
## Testing H0:mu=0 (On average, actual=predicted, Note: mu.Y.dif=0.0814)
t_obs.0 <- (ybar.dif-0) / sqrt(s2.dif/n) ## t-statistic
p.t.0 <- 2*(1-pt(abs(t_obs.0),n-1)) ## P-value
t.prob.rej <- mean(abs(t_obs.0) >= qt(.975,n-1)) ##Pr{Reject H0}
## Output Results
t.0.out <- cbind(mu.Y.dif, sqrt(sigma2.Y.dif), mu.cover,
mu.uncover.lo, mu.uncover.hi, t.prob.rej, mean(p.t.0))
colnames(t.0.out) <- c("mu", "sigma", "95%CI CovRate",
"P(mu>UB)","P(mu<LB)","P(Rej H0)", "mean(Pval)")
rownames(t.0.out) <- c("H0:mu=0")
round(t.0.out,4)
## mu sigma 95%CI CovRate P(mu>UB) P(mu<LB) P(Rej H0) mean(Pval)
## H0:mu=0 0.0814 10.0805 0.9508 0.0253 0.0239 0.0491 0.5021
## Histogram of P-values (Under H0, will be U(0,1))
hist(p.t.0, breaks=seq(0,1,.01), col="aquamarine", freq=F,
xlab="P-value",
main=expression(paste("Histogram of P-values", H[0]:mu,"=0")))
abline(v=0, col="deeppink2", lwd=2.5)
abline(h=1, col="sienna3", lwd=2.5)
The null value \(\mu_0=0\) is very close to \(\mu=0.0814\), especially considering the standard deviation is \(\sigma=10.0805\). The rejection rate for the \(t\)-test is 0.0491, very close to the nominal \(\alpha=0.05\). For all intents and purposes, \(H_0:\mu=0\) is true. The histogram of \(P\)-values is very well approximated by a Uniform(0,1) distribution, as expected when \(H_0\) is true.
The coverage rate for the 95% Confidence Intervals is 95.08%, and the intervals “miss low” 2.53% and “miss high” 2.39%. These are very similar. Recall that a different seed would have produced different samples, but similar results.
We saw previously that the scaled variance \((n-1)S^2/\sigma^2 \sim \chi_{n-1}^2\). By a similar argument with the \(t\)-value, we obtain the following result.
\[ \frac{(n-1)S^2}{\sigma^2} \sim \chi_{n-1}^2 \quad \Rightarrow \quad P\left(\chi_{1-\alpha/2,n-1}^2 \leq \frac{(n-1)S^2}{\sigma^2} \leq \chi_{\alpha/2,n-1}^2\right)=1-\alpha\]
This leads to the following probability statement and \((1-\alpha)100\%\) Confidence Interval for \(\sigma^2\).
\[ P\left(\frac{(n-1)S^2}{\chi_{\alpha/2,n-1}^2} \leq \sigma^2 \leq \frac{(n-1)S^2}{\chi_{1-\alpha/2,n-1}^2}\right)=1-\alpha \] \[\quad \Rightarrow \quad (1-\alpha)100\% \mbox{ CI: } \quad \left[\frac{(n-1)s^2}{\chi_{\alpha/2,n-1}^2}, \frac{(n-1)s^2}{\chi_{1-\alpha/2,n-1}^2}\right]\] Here \(s^2\) is the observed sample variance.
If, for some reason, there was a null value \(\sigma_0^2\) of interest to test, we could conduct the following test.
\[ H_0:\sigma^2=\sigma_0^2 \quad H_A:\sigma^2 \neq \sigma_0^2 \qquad \mbox{Test Statistic: } \quad X_{obs}^2=\frac{(n-1)s^2}{\sigma_0^2}\]
\[ \mbox{Rejection Region: } \quad \left\{X_{obs}^2 \leq \chi_{1-\alpha/2,n-1}^2\right\} \cup \left\{X_{obs}^2 \geq \chi_{\alpha/2,n-1}^2\right\} \] \[ \mbox{P-value: } \quad P=2\min\left[ P\left(\chi_{n-1}^2 \leq X_{obs}^2\right), P\left(\chi_{n-1}^2 \geq X_{obs}^2\right) \right]\]
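A minimal sketch of the interval and test on a single simulated sample, with a hypothetical null value \(\sigma_0^2=100\):

```r
## Chi-square CI for sigma^2 and test of H0: sigma^2 = 100 on one simulated sample
set.seed(4)
n <- 25
y  <- rnorm(n, 0, 10)                               ## true sigma^2 = 100
s2 <- var(y)
CI <- (n - 1) * s2 / qchisq(c(.975, .025), n - 1)   ## (lower, upper) limits
X2.obs <- (n - 1) * s2 / 100                        ## hypothetical null value 100
p.val  <- 2 * min(pchisq(X2.obs, n - 1), 1 - pchisq(X2.obs, n - 1))
```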
We obtain the 95% Confidence Interval coverage rates based on the team game differences below. As there is no clear null value \(\sigma_0^2\) of interest, we will not pursue testing the variance. Below, we will test for equality of the variances of the actual and predicted scores.
## Construct 95% Confidence Interval for population variance of differences
sigma2.LB <- ((n-1)*s2.dif) / qchisq(.975,n-1)
sigma2.UB <- ((n-1)*s2.dif) / qchisq(.025,n-1)
sigma2.cover <- mean(sigma2.LB <= sigma2.Y.dif & sigma2.UB >= sigma2.Y.dif)
sigma2.uncover.lo <- mean(sigma2.UB <= sigma2.Y.dif)
sigma2.uncover.hi <- mean(sigma2.LB >= sigma2.Y.dif)
sigma2.out <- cbind(sigma2.Y.dif, n-1, qchisq(.025,n-1), qchisq(.975,n-1),
sigma2.cover, sigma2.uncover.lo, sigma2.uncover.hi)
colnames(sigma2.out) <- c("sigma^2", "df","X2(.975)","X2(.025)",
"CovRate","sigma2>UB","sigma2<LB")
rownames(sigma2.out) <- c("95% CI sigma^2")
round(sigma2.out,4)
## sigma^2 df X2(.975) X2(.025) CovRate sigma2>UB sigma2<LB
## 95% CI sigma^2 101.6167 24 12.4012 39.3641 0.9482 0.0263 0.0255
The coverage rate is 94.82%, with the intervals missing low (2.63%) and missing high (2.55%) at very similar rates.
Here we consider the variance ratio of the actual to predicted scores. The population variances are \(\sigma_A^2=(11.4824)^2=131.85\) and \(\sigma_P^2=(5.7513)^2=33.08\), respectively. Thus, the population variance ratio is \(131.85/33.08 \approx 4.0\). Note again that the samples were obtained with different seeds, so they can be treated as (approximately) independent. We have the following results from above.
\[ W_A=\frac{(n_A-1)S_A^2}{\sigma_A^2}\sim \chi_{n_A-1}^2 \quad W_P=\frac{(n_P-1)S_P^2}{\sigma_P^2}\sim \chi_{n_P-1}^2 \quad df_i=n_i-1 \]
\[ F=\frac{W_A/(n_A-1)}{W_P/(n_P-1)} \sim F_{n_A-1,n_P-1} \quad \Rightarrow \quad F=\frac{S_A^2/\sigma_A^2}{S_P^2/\sigma_P^2}= \frac{S_A^2/S_P^2}{\sigma_A^2/\sigma_P^2} \sim F_{n_A-1,n_P-1} \]
Setting up the probability statement as before, we isolate \(\sigma_A^2/\sigma_P^2\) that leads to a Confidence Interval and Test procedure.
\[ F=\frac{S_A^2/S_P^2}{\sigma_A^2/\sigma_P^2} \sim F_{n_A-1,n_P-1} \] \[\Rightarrow \quad P\left(F_{1-\alpha/2,n_A-1,n_P-1} \leq \frac{S_A^2/S_P^2}{\sigma_A^2/\sigma_P^2} \leq F_{\alpha/2,n_A-1,n_P-1}\right)=1-\alpha\]
\[\Rightarrow \quad P\left(\frac{S_A^2/S_P^2}{F_{\alpha/2,n_A-1,n_P-1}} \leq \frac{\sigma_A^2}{\sigma_P^2} \leq \frac{S_A^2/S_P^2}{F_{1-\alpha/2,n_A-1,n_P-1}} \right) = 1-\alpha \]
For a \((1-\alpha)100\%\) Confidence Interval, we plug in the observed \(s_A^2\) and \(s_P^2\) as before. For testing the equality of variances we conduct the test as given below.
\[ (1-\alpha)100\% \mbox{ CI for } \quad \frac{\sigma_A^2}{\sigma_P^2}: \qquad \left[\frac{s_A^2/s_P^2}{F_{\alpha/2,n_A-1,n_P-1}}, \frac{s_A^2/s_P^2}{F_{1-\alpha/2,n_A-1,n_P-1}} \right]\]
Note that when testing \(H_0:\sigma_A^2=\sigma_P^2\), under the null hypothesis, \(S_A^2/S_P^2 \sim F_{n_A-1,n_P-1}\). So the test statistic is simply \(s_A^2/s_P^2\).
\[H_0:\frac{\sigma_A^2}{\sigma_P^2}=1 \quad H_A: \frac{\sigma_A^2}{\sigma_P^2} \neq 1 \qquad \mbox{Test Statistic:} \quad F_{obs}=\frac{s_A^2}{s_P^2} \]
\[\mbox{Rejection Region: } \quad \left\{F_{obs}\leq F_{1-\alpha/2,n_A-1,n_P-1}\right\} \cup \left\{F_{obs}\geq F_{\alpha/2,n_A-1,n_P-1}\right\} \]
\[ P=2\min\left[P\left(F_{n_A-1,n_P-1}\leq F_{obs}\right), P\left(F_{n_A-1,n_P-1}\geq F_{obs}\right)\right]\]
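This is exactly the test implemented by R's `var.test()`; a sketch on simulated samples with hypothetical standard deviations near the actual and predicted values:

```r
## Two-sample F test for equal variances, done by hand and via var.test()
set.seed(5)
nA <- nP <- 25
yA <- rnorm(nA, 80, 11.5)        ## hypothetical "actual"-like sample
yP <- rnorm(nP, 80, 5.75)        ## hypothetical "predicted"-like sample
F.obs <- var(yA) / var(yP)
p.val <- 2 * min(pf(F.obs, nA - 1, nP - 1), 1 - pf(F.obs, nA - 1, nP - 1))
CI    <- F.obs / qf(c(.975, .025), nA - 1, nP - 1)  ## 95% CI for variance ratio
vt <- var.test(yA, yP)           ## same statistic, P-value, and CI
```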
Finally, we construct the 95% Confidence Intervals and conduct the tests of equal variances. Given that the true variance ratio is 4, that test should reject the null hypothesis often.
vr.LB <- (s2.act/s2.prd)/qf(.975,n-1,n-1)
vr.UB <- (s2.act/s2.prd)/qf(.025,n-1,n-1)
vr.cover <- mean(vr.LB <= (sigma2.Y.act/sigma2.Y.prd) &
vr.UB >= (sigma2.Y.act/sigma2.Y.prd))
vr.uncover.lo <- mean(vr.UB <= (sigma2.Y.act/sigma2.Y.prd))
vr.uncover.hi <- mean(vr.LB >= (sigma2.Y.act/sigma2.Y.prd))
F.obs <- s2.act/s2.prd
vr.lt.1 <- mean(F.obs <= qf(.025,n-1,n-1))
vr.eq.1 <- mean(F.obs >= qf(.025,n-1,n-1) & (F.obs <= qf(.975,n-1,n-1)))
vr.gt.1 <- mean(F.obs >= qf(.975,n-1,n-1))
vr.out <- cbind(sigma2.Y.act/sigma2.Y.prd, vr.cover, vr.uncover.lo,
vr.uncover.hi,vr.gt.1,vr.eq.1,vr.lt.1)
colnames(vr.out) <- c("Var Ratio", "CovRate","CI<VR","CI>VR",
"VR>1","VR'='1","VR<1")
rownames(vr.out) <- c("Variance Ratio")
round(vr.out,4)
## Var Ratio CovRate CI<VR CI>VR VR>1 VR'='1 VR<1
## Variance Ratio 3.986 0.9551 0.0223 0.0226 0.9167 0.0833 0
The coverage rate for the 95% Confidence Intervals is 95.51%, with the intervals missing low (2.23%) and missing high (2.26%) at virtually the same rate.
For the tests, we concluded \(\sigma_A^2 > \sigma_P^2\) in 91.67% of the samples, failed to reject \(\sigma_A^2 = \sigma_P^2\) in 8.33% of the samples, and never concluded \(\sigma_A^2 < \sigma_P^2\). This is not surprising as the true ratio is 4.