Functions of Normal Random Variables

\[ Y_1,\ldots,Y_n \sim NID\left(\mu,\sigma^2\right) \qquad \overline{Y}=\frac{1}{n}\sum_{i=1}^n Y_i \qquad S^2=\frac{1}{n-1}\sum_{i=1}^n\left(Y_i-\overline{Y}\right)^2\]

\[ \mbox{Fixed Constants: } a_1,\ldots,a_n \qquad U=\sum_{i=1}^n a_iY_i \quad \Rightarrow \quad E\{U\} = E\left\{\sum_{i=1}^n a_iY_i\right\}=\sum_{i=1}^n a_i E\{Y_i\} = \sum_{i=1}^n a_i \mu\]

\[ V\{U\} = V\left\{\sum_{i=1}^n a_iY_i\right\}=\sum_{i=1}^n a_i^2 V\{Y_i\}=\sum_{i=1}^n a_i^2\sigma^2 \qquad U \sim N\left(\mu_U=\sum_{i=1}^n a_i \mu,\sigma_U^2=\sum_{i=1}^n a_i^2\sigma^2\right)\]
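As a quick check (a simulation sketch, not part of the analysis that follows; the constants and sample size are arbitrary), the empirical mean and variance of a simulated linear combination match the formulas above:

```r
## Simulated check of E{U} and V{U} for U = a1*Y1 + ... + an*Yn
set.seed(1)
mu <- 5; sigma <- 2
a <- c(0.5, -1, 2)                        ## fixed constants a_1, a_2, a_3
U <- replicate(200000, sum(a * rnorm(3, mu, sigma)))
mu.U <- sum(a) * mu                       ## theoretical mean: sum(a_i)*mu = 7.5
sigma2.U <- sum(a^2) * sigma^2            ## theoretical variance: sum(a_i^2)*sigma^2 = 21
c(mean(U), mu.U)                          ## empirical mean vs 7.5
c(var(U), sigma2.U)                       ## empirical variance vs 21
```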

Consider the sample mean \(\overline{Y}=\sum_{i=1}^n a_iY_i\) with \(a_1=\cdots=a_n=1/n\):

\[ E\left\{\overline{Y}\right\}=\sum_{i=1}^n\frac{1}{n}\mu=n\frac{1}{n}\mu=\mu \qquad V\left\{\overline{Y}\right\}=\sum_{i=1}^n\left(\frac{1}{n}\right)^2\sigma^2=n\left(\frac{1}{n}\right)^2\sigma^2 =\frac{\sigma^2}{n} \qquad \overline{Y} \sim N\left(\mu,\frac{\sigma^2}{n}\right)\]

\[ \overline{Y} \sim N\left(\mu,\frac{\sigma^2}{n}\right) \quad \Rightarrow \quad Z=\frac{\overline{Y}-\mu}{\sqrt{\frac{\sigma^2}{n}}} \sim N(0,1)\]
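For example (arbitrary illustrative values, not from the data below), a probability for \(\overline{Y}\) can be computed either from the normal distribution of \(\overline{Y}\) directly or from the standardized \(Z\):

```r
## Example: P(Ybar > 81) when mu = 80, sigma = 10, n = 25 (arbitrary values)
mu <- 80; sigma <- 10; n <- 25
z <- (81 - mu) / sqrt(sigma^2 / n)                 ## standardized value: z = 0.5
p.direct <- 1 - pnorm(81, mu, sqrt(sigma^2 / n))   ## normal probability directly
p.z <- 1 - pnorm(z)                                ## same probability via Z ~ N(0,1)
c(z, p.direct, p.z)
```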

Under this model, the sample variance, scaled by multiplying by \(n-1\) and dividing by \(\sigma^2\), follows the chi-square distribution with \(n-1\) degrees of freedom and is independent of \(\overline{Y}\).

\[Z=\frac{\overline{Y}-\mu}{\sqrt{\frac{\sigma^2}{n}}} \sim N(0,1) \qquad W=\frac{(n-1)S^2}{\sigma^2} \sim \chi_{n-1}^2 \qquad \overline{Y} \perp S^2 \qquad Z \perp W \]
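A brief simulation sketch (arbitrary settings) illustrates both results: the scaled variance has mean \(n-1\), and the sample mean and variance are uncorrelated for normal data:

```r
## Simulate many samples of n = 10 from N(0, 9); check W = (n-1)S^2/sigma^2
set.seed(2)
n <- 10; sigma <- 3
sim <- replicate(100000, {y <- rnorm(n, 0, sigma); c(mean(y), var(y))})
ybar <- sim[1, ]; s2 <- sim[2, ]
W <- (n - 1) * s2 / sigma^2
c(mean(W), n - 1)                 ## chi-square mean: both near 9
cor(ybar, s2)                     ## near 0: mean and variance independent
```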

If \(Z\sim N(0,1)\) and \(W \sim \chi_{\nu}^2\), then the following quantity is distributed as Student’s-\(t\) with \(\nu\) degrees of freedom.

\[ T=\frac{Z}{\sqrt{W/\nu}} \sim t_{\nu} \quad \Rightarrow \quad \frac{\frac{\overline{Y}-\mu}{\sqrt{\sigma^2/n}}}{\sqrt{\frac{(n-1)S^2/\sigma^2}{(n-1)}}} =\frac{\overline{Y}-\mu}{\sqrt{S^2/n}} \sim t_{n-1}\]
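A simulation sketch (with arbitrary \(\nu\)) confirms that this ratio construction produces \(t\)-distributed values:

```r
## T = Z / sqrt(W/nu) with independent Z ~ N(0,1), W ~ chi-square(nu)
set.seed(3)
nu <- 5
T.sim <- rnorm(200000) / sqrt(rchisq(200000, nu) / nu)
probs <- c(.05, .50, .95)
rbind(quantile(T.sim, probs), qt(probs, nu))   ## empirical vs theoretical quantiles
```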

This allows making inference on the population mean \(\mu\) without needing to know the population variance \(\sigma^2\). Finally, we consider ratios of independent sample variances.

\[ W_1 \sim \chi_{\nu_1}^2 \qquad W_2 \sim \chi_{\nu_2}^2 \qquad W_1 \perp W_2 \quad \Rightarrow \quad F=\frac{W_1/\nu_1}{W_2/\nu_2} \sim F_{\nu_1,\nu_2}\]

Expected values and variances for the chi-square, Student’s-\(t\) and \(F\) distributions are given below.

\[ W \sim \chi_{\nu}^2 \quad \Rightarrow \quad E\{W\} = \nu \qquad V\{W\} = 2\nu \]

\[ T \sim t_{\nu} \quad \Rightarrow \quad E\{T\}=0 \quad \nu > 1 \qquad V\{T\}=\frac{\nu}{\nu-2} \quad \nu > 2\]

\[ F \sim F_{\nu_1,\nu_2} \quad \Rightarrow \quad E\{F\}= \frac{\nu_2}{\nu_2-2} \quad \nu_2>2 \qquad V\{F\}=2\left(\frac{\nu_2}{\nu_2-2}\right)^2\frac{\nu_1+\nu_2-2}{\nu_1\left(\nu_2-4\right)}\quad \nu_2>4 \]
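These formulas can be checked against simulated draws (degrees of freedom below are arbitrary):

```r
## Simulated moments vs the formulas above (arbitrary degrees of freedom)
set.seed(4)
nu <- 10; nu1 <- 8; nu2 <- 12
w <- rchisq(500000, nu); t.draw <- rt(500000, nu); f <- rf(500000, nu1, nu2)
c(mean(w), nu); c(var(w), 2 * nu)                    ## chi-square: 10, 20
c(mean(t.draw), 0); c(var(t.draw), nu / (nu - 2))    ## t: 0, 1.25
c(mean(f), nu2 / (nu2 - 2))                          ## F mean: 1.2
c(var(f), 2 * (nu2 / (nu2 - 2))^2 * (nu1 + nu2 - 2) / (nu1 * (nu2 - 4)))  ## F var: 0.81
```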

We now consider WNBA team game point predictions based on the Vegas point spread and over/under lines, the actual points and their differences (actual - predicted).

## Read in data into wnba dataframe 
wnba <- read.csv(
   "https://www.stat.ufl.edu/~winner/data/wnba_20102024_ATS_OU.csv")
names(wnba)
##  [1] "team"         "date"         "year"         "tmGmYr"       "opGmYr"      
##  [6] "home"         "teamPts"      "oppPts"       "OT"           "tmSprd"      
## [11] "OU"           "tmSpDev"      "oppTeam"      "tmWin"        "tmATSWin"    
## [16] "tmOver"       "tmWinSeas"    "tmLosSeas"    "tmWinStrk"    "tmLosStrk"   
## [21] "tmATSWinSeas" "tmATSLosSeas" "tmATSWinStrk" "tmATSLosStrk" "tmOVRSeas"   
## [26] "tmUDRSeas"    "tmOVRStrk"    "tmUDRStrk"    "opWin"        "opATSWin"    
## [31] "opOver"       "opWinSeas"    "opLosSeas"    "opWinStrk"    "opLosStrk"   
## [36] "opATSWinSeas" "opATSLosSeas" "opATSWinStrk" "opATSLosStrk" "opOVRSeas"   
## [41] "opUDRSeas"    "opOVRStrk"    "opUDRStrk"    "day"          "month"       
## [46] "year.1"       "dmy"          "code"
wnba$tmSprd <- -wnba$tmSprd                    ## Reverse team spread so that favorite team has positive spread

wnba$tmPred <- (wnba$OU + wnba$tmSprd)/2       ## Team's Predicted score = (OU + spread)/2
wnba$tmPtDiff <- wnba$teamPts - wnba$tmPred    ## Team's Difference = Actual - Predicted

## Print side-by-side histograms of predicted, actual, and difference
par(mfrow=c(3,1))
hist(wnba$tmPred, breaks="Scott", col="green",main="Predicted Points")
hist(wnba$teamPts, breaks="Scott", col="yellow",main="Actual Points")
hist(wnba$tmPtDiff, breaks="Scott", col="lightblue",main="Act-Pred")

par(mfrow=c(1,1))

## As the population size is 6118 > 5000, P-values for the Shapiro-Wilk test for normality are unreliable
## We will take 100 random samples of 5000 games, compute the P-value for each, and average them
shapiro.p <- matrix(rep(0,300), ncol=3)

set.seed(12345)
N <- nrow(wnba)
n <- 5000
for (i1 in 1:100) {
  sample.5k <- sample(N,n,replace=FALSE)
  shapiro.p[i1,1] <- shapiro.test(wnba$tmPred[sample.5k])$p.value
  shapiro.p[i1,2] <- shapiro.test(wnba$teamPts[sample.5k])$p.value
  shapiro.p[i1,3] <- shapiro.test(wnba$tmPtDiff[sample.5k])$p.value
}

## Compute population standard deviations, adjusting the variance denominator from N-1 to N
sigma.prd <- sd(wnba$tmPred)*sqrt((N-1)/N)
sigma.act <- sd(wnba$teamPts)*sqrt((N-1)/N)
sigma.dif <- sd(wnba$tmPtDiff)*sqrt((N-1)/N)

## Print out the summaries for each variable
summary.out <- cbind(rbind(nrow(wnba), mean(wnba$tmPred), 
                 sigma.prd, cor(wnba$tmPred,wnba$teamPts),
                 sigma.prd/sqrt(nrow(wnba)),
                 mean(shapiro.p[,1])),
                 rbind(nrow(wnba), mean(wnba$teamPts),
                 sigma.act, cor(wnba$tmPred,wnba$teamPts),
                 sigma.act/sqrt(nrow(wnba)),
                 mean(shapiro.p[,2])),
                 rbind(nrow(wnba), mean(wnba$tmPtDiff),
                 sigma.dif,NA,
                 sigma.dif/sqrt(nrow(wnba)),
                 mean(shapiro.p[,3]))
)
colnames(summary.out) <- c("Predicted", "Actual", "Diff (A-P)")
rownames(summary.out) <- c("N", "mean", "SD", "Corr", "Std Err",
                       "Shapiro-Wilk P-value")
round(summary.out,4)
##                      Predicted    Actual Diff (A-P)
## N                    6118.0000 6118.0000  6118.0000
## mean                   79.7896   79.8710     0.0814
## SD                      5.7513   11.4824    10.0805
## Corr                    0.4793    0.4793         NA
## Std Err                 0.0735    0.1468     0.1289
## Shapiro-Wilk P-value    0.0000    0.0001     0.1259
## Histogram of differences with superimposed normal density
hist(wnba$tmPtDiff, breaks="Scott", col="yellow", 
     xlab="Actual-Predicted Points", freq=FALSE, 
     main="WNBA Team Point Differential")
lines(seq(min(wnba$tmPtDiff), max(wnba$tmPtDiff), length=1000),
      dnorm(seq(min(wnba$tmPtDiff), max(wnba$tmPtDiff), length=1000),
            mean(wnba$tmPtDiff), sigma.dif), col="purple", lwd=2.5)

Note that all three distributions are unimodal, but the predicted and actual points are slightly skewed, so they have small P-values for the Shapiro-Wilk test with these large “samples.”

Let \(P\) represent the predicted score, \(A\) represent the actual score, and \(D\) represent the difference. Then \(D=A-P\). This leads to the following mean and variance for \(D\), based on those for \(A\) and \(P\) and their covariance \(\left(\sigma_{AP}\right)\).

\[ D=1(A)+(-1)P \quad \Rightarrow \quad \mu_D=1\mu_A + (-1)\mu_P = 79.8710-79.7896=0.0814\] \[\sigma_D^2 = 1^2\sigma_A^2 + (-1)^2\sigma_P^2 + 2(1)(-1)\sigma_{AP} \quad \mbox{where:} \quad \sigma_{AP}=\rho_{AP}\sigma_A\sigma_P \] \[\sigma_D^2=1^2(11.4824)^2+(-1)^2(5.7513)^2+2(1)(-1)(0.4793)(11.4824)(5.7513)=10.0805^2=101.6165\]
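The arithmetic can be verified directly from the rounded summary values printed above:

```r
## Verify mu_D and sigma_D^2 from the rounded summary values
mu.A <- 79.8710; mu.P <- 79.7896
sigma.A <- 11.4824; sigma.P <- 5.7513; rho.AP <- 0.4793
mu.D <- mu.A - mu.P
sigma2.D <- sigma.A^2 + sigma.P^2 - 2 * rho.AP * sigma.A * sigma.P
c(mu.D, sigma2.D, sqrt(sigma2.D))   ## approx 0.0814, 101.62, 10.08
```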

Sampling from the differences with \(n=25\) and observing the properties of the mean, scaled variance, \(t\)-value, and Shapiro-Wilk \(P\)-value.

N.samples <- 100000          ## Take 100000 random samples (better resolution for plots)
n <- 25                      ## Each sample will have n=25 team games, sampled without replacement
N <- length(wnba$tmPtDiff)   ## Population size (needed for taking random samples)
set.seed(32611)              ## Set seed for reproducibility
results.mat <- matrix(rep(0,N.samples*3),ncol=3)    ## Results Matrix - save ybar, s^2, SW-p-value
mu.Y.dif <- mean(wnba$tmPtDiff)                     ## Population mean for diffs: mu_D
sigma2.Y.dif <- sigma.dif^2                         ## Population variance for diffs: sigma_D^2

### Take 100000 random samples of n=25, save sample mean, variance, Shapiro-Wilk P-value
for (i1 in 1:N.samples) {
  samp.y <- sample(N, n, replace=FALSE)           ## n=25 integers between 1 and N
  y <- wnba$tmPtDiff[samp.y]                      ## Point differences for the 25 games
  results.mat[i1,1] <- mean(y)                    ## Sample Mean
  results.mat[i1,2] <- var(y)                     ## Sample Variance
  results.mat[i1,3] <- shapiro.test(y)$p.value    ## Shapiro-Wilk P-value
}

ybar.dif <- results.mat[,1]                       ## 100000 ybar values
s2.dif <- results.mat[,2]                         ## 100000 s^2 values
X2.y.dif <- (n-1)*s2.dif/sigma2.Y.dif             ## 100000 (n-1)s^2/sigma^2 (Chi-square) values
SE.ybar.dif <- sqrt(s2.dif/n)                     ## 100000 SE{ybar}=sqrt{s^2/n} values

## Table of results
meanvar.out <- rbind(
         cbind(mu.Y.dif, sigma2.Y.dif, sqrt(sigma2.Y.dif/n),n-1, n-1, 2*(n-1)),
         cbind(mean(ybar.dif),mean(s2.dif),sd(ybar.dif),
               n-1,mean(X2.y.dif), var(X2.y.dif))
)
colnames(meanvar.out) <- c("Mean(Y)", "Var(Y)","SE{Ybar}", 
           "df", "Mean(X2)","Var(X2)")
rownames (meanvar.out) <- c("Theoretical","Empirical")
round(meanvar.out, 3)
##             Mean(Y)  Var(Y) SE{Ybar} df Mean(X2) Var(X2)
## Theoretical   0.081 101.617    2.016 24   24.000  48.000
## Empirical     0.089 101.590    1.998 24   23.994  48.873
## Useful for setting X-limits on histograms
summary(ybar.dif)    
##     Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
## -7.91000 -1.26000  0.09000  0.08915  1.44000  9.02000
summary(X2.y.dif)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   4.387  18.976  23.333  23.994  28.299  64.855

The theoretical and empirical results for the sample means, sample variances, and scaled variances from the table above are summarized here. \[ \mbox{Theoretical Mean: }\quad \mu_{\overline{Y}_D}=\mu_D=0.081 \qquad \mbox{Empirical: } \quad \overline{\overline{y}_D}=0.089\]

\[ \mbox{Theoretical Variance: }\quad \sigma_D^2=101.617 \qquad \mbox{Empirical: } \quad \overline{s_d^2}=101.590\]

\[\mbox{Theoretical Std Error of } \overline{Y}_D: \quad SE\left\{\overline{Y}_D\right\}= \sqrt{\frac{\sigma_D^2}{n}}=\sqrt{\frac{101.617}{25}}=2.016 \qquad \mbox{Empirical: } \sqrt{\frac{1}{99999}\sum_{j=1}^{100000} \left(\overline{y}_{Dj}-\overline{\overline{y}_D}\right)^2}=1.998\]

\[ \mbox{Theoretical Mean of } \quad X_D^2=\frac{(n-1)S_D^2}{\sigma_D^2}=n-1=24 \qquad \mbox{Empirical: } \frac{1}{100000}\sum_{j=1}^{100000} X_{Dj}^2=\overline{X_{D}^2}=23.994\]

\[ \mbox{Theoretical Variance of } \quad X_D^2=\frac{(n-1)S_D^2}{\sigma_D^2}=2(n-1)=48 \qquad \mbox{Empirical: } \frac{1}{99999}\sum_{j=1}^{100000} \left(X_{Dj}^2-\overline{X_{D}^2}\right)^2=48.873\]

Histograms of Sample means, scaled variances, and \(t\)-values for the differences

####### Sample Mean ~ N(mu,sigma^2/n)
mean.grid <- seq(-12,12,.01)
hist(ybar.dif, seq(-12,12,0.2), col="yellow", border="green",
  freq=F,xlab="Sample Mean Team Point Diff",
  main=expression(paste("Sampling Dist of ", bar(Y))))
lines(mean.grid,dnorm(mean.grid,mu.Y.dif,sqrt(sigma2.Y.dif/n)), col="purple",
    lwd=2.5)

###### Scaled Variance = (n-1)*S^2/sigma^2 ~ X2(n-1)
X2.grid <- seq(0,70,.01)
hist(X2.y.dif, seq(0,70,0.5), col="green", border="yellow",
  freq=F,xlab="Scaled Variance Team Point Diff (X2)",
  main=expression(paste("Sampling Dist of ", (n-1)(S/sigma)^2)))
lines(X2.grid,dchisq(X2.grid,n-1), col="purple",
    lwd=2.5)

###### Student's t-value = (Ybar-mu)/sqrt(S^2/n) ~ t(n-1)
t.val.dif <- (ybar.dif-mu.Y.dif)/sqrt(s2.dif/n)
t.grid <- seq(-8,8,0.01)

hist(t.val.dif, seq(-8,8,0.2), col="pink", border="red",
  freq=F, xlab="t-value for Difference",
  main="Sampling Distribution of t-value")
lines(t.grid,dt(t.grid,n-1),col="blue", lwd=2.5)

###### Shapiro-Wilk Statistic P-value
shapiro.p <- results.mat[,3]
mean(shapiro.p < 0.05)
## [1] 0.05288
hist(shapiro.p, breaks=seq(0,1,0.01),col="turquoise",border="blue",
    xlab="P-value", main="P-values for Shapiro-Wilk Test")
abline(v=0.05, col="red", lwd=2.5)
abline(h=1000, col="purple", lwd=2.5)

Histograms are consistent with their theoretical distributions. Further, the P-values for the Shapiro-Wilk test are below the nominal 0.05 level for approximately 5% of samples, consistent with the differences being (approximately) normally distributed.

Comparing theoretical and empirical quantiles and probabilities for the means, scaled variances, and t-values.

We consider the .05, .25, .50, .75, and .95 quantiles for the variables. For the probabilities, we consider: 1) the probability below 2 standard deviations below the mean, 2) the probability within 1 standard deviation of the mean, and 3) the probability above 2 standard deviations above the mean.

### Obtaining theoretical and empirical quantiles and output results table
quantiles <- c(.05,.25,.50,.75,.95)
q.ybar.th <- qnorm(quantiles, mu.Y.dif, sqrt(sigma2.Y.dif/n))    ## Theoretical normal quantiles
q.X2.th <- qchisq(quantiles, n-1)                                ## Theoretical chi-square quantiles
q.t.th <- qt(quantiles, n-1)                                     ## Theoretical Student's t quantiles
q.ybar.em <- quantile(ybar.dif, quantiles)                       ## Empirical normal quantiles
q.X2.em <- quantile(X2.y.dif, quantiles)                         ## Empirical chi-square quantiles
q.t.em <- quantile(t.val.dif, quantiles)                         ## Empirical Student's t quantiles

quant.dif.out <- rbind(q.ybar.th,q.ybar.em, q.X2.th, q.X2.em, q.t.th, q.t.em)
colnames(quant.dif.out) <- c("5%", "25%", "50%", "75%", "95%")
rownames(quant.dif.out) <- c("Mean Theoretical", "Mean Empirical", "Scale Var Theoretical",
                             "Scale Var Empirical", "t Theoretical", "t Empirical")
round(quant.dif.out, 3)
##                           5%    25%    50%    75%    95%
## Mean Theoretical      -3.235 -1.278  0.081  1.441  3.398
## Mean Empirical        -3.220 -1.260  0.090  1.440  3.370
## Scale Var Theoretical 13.848 19.037 23.337 28.241 36.415
## Scale Var Empirical   13.745 18.976 23.333 28.299 36.521
## t Theoretical         -1.711 -0.685  0.000  0.685  1.711
## t Empirical           -1.716 -0.678  0.004  0.678  1.683
### Obtaining theoretical and empirical probabilities and output results table
ybar.mu <- mu.Y.dif
ybar.sigma <- sqrt(sigma2.Y.dif/n)
X2.mu <- n-1
X2.sigma <- sqrt(2*(n-1))
t.mu <- 0
t.sigma <- sqrt((n-1)/(n-3))
p.ybar.th <- cbind(pnorm(ybar.mu-2*ybar.sigma,ybar.mu,ybar.sigma),
                   pnorm(ybar.mu+ybar.sigma,ybar.mu,ybar.sigma) - pnorm(ybar.mu-ybar.sigma,ybar.mu,ybar.sigma),
                   1-pnorm(ybar.mu+2*ybar.sigma,ybar.mu,ybar.sigma))
p.X2.th <- cbind(pchisq(X2.mu-2*X2.sigma,n-1),
                   pchisq(X2.mu+X2.sigma,n-1) - pchisq(X2.mu-X2.sigma,n-1),
                   1-pchisq(X2.mu+2*X2.sigma,n-1))
p.t.th <- cbind(pt(t.mu-2*t.sigma,n-1),
                   pt(t.mu+t.sigma,n-1) - pt(t.mu-t.sigma,n-1),
                   1-pt(t.mu+2*t.sigma,n-1))
p.ybar.em <- cbind(mean(ybar.dif <= ybar.mu-2*ybar.sigma),
                   mean(ybar.dif <= ybar.mu+ybar.sigma & ybar.dif >= ybar.mu-ybar.sigma),
                   mean(ybar.dif >= ybar.mu+2*ybar.sigma))
p.X2.em <- cbind(mean(X2.y.dif <= X2.mu-2*X2.sigma),
                   mean(X2.y.dif <= X2.mu+X2.sigma & X2.y.dif >= X2.mu-X2.sigma),
                   mean(X2.y.dif >= X2.mu+2*X2.sigma))
p.t.em <- cbind(mean(t.val.dif <= t.mu-2*t.sigma),
                   mean(t.val.dif <= t.mu+t.sigma & t.val.dif >= t.mu-t.sigma),
                   mean(t.val.dif >= t.mu+2*t.sigma))

prob.dif.out <- rbind(p.ybar.th,p.ybar.em, p.X2.th, p.X2.em, p.t.th, p.t.em)
colnames(prob.dif.out) <- c("<m-2s", "[m-s,m+s]", ">m+2s")
rownames(prob.dif.out) <- c("Mean Theoretical", "Mean Empirical", "Scale Var Theoretical",
                             "Scale Var Empirical", "t Theoretical", "t Empirical")
round(prob.dif.out, 3)
##                       <m-2s [m-s,m+s] >m+2s
## Mean Theoretical      0.023     0.683 0.023
## Mean Empirical        0.021     0.687 0.023
## Scale Var Theoretical 0.006     0.690 0.036
## Scale Var Empirical   0.007     0.684 0.037
## t Theoretical         0.024     0.693 0.024
## t Empirical           0.024     0.697 0.023

There is very strong agreement between the theoretical and empirical quantiles and probabilities for the sample means, scaled variances, and \(t\)-values for game point differences.

Analysis of Game Point predictions

### Predicted Points (tmPred)
N.samples <- 100000
n <- 25
N <- length(wnba$tmPred)
set.seed(12345)
results.mat <- matrix(rep(0,N.samples*3),ncol=3)
mu.Y.prd <- mean(wnba$tmPred)
sigma2.Y.prd <- sigma.prd^2         

for (i1 in 1:N.samples) {
  samp.y <- sample(N, n, replace=FALSE)
  y <- wnba$tmPred[samp.y]
  results.mat[i1,1] <- mean(y)
  results.mat[i1,2] <- var(y)
  results.mat[i1,3] <- shapiro.test(y)$p.value
}

ybar.prd <- results.mat[,1]
s2.prd <- results.mat[,2]
X2.y.prd <- (n-1)*s2.prd/sigma2.Y.prd
SE.ybar.prd <- sqrt(s2.prd/n)

meanvar.out <- rbind(
         cbind(mu.Y.prd, sigma2.Y.prd, sqrt(sigma2.Y.prd/n),n-1, n-1, 2*(n-1)),
         cbind(mean(ybar.prd),mean(s2.prd),sd(ybar.prd),
               n-1,mean(X2.y.prd), var(X2.y.prd))
)
colnames(meanvar.out) <- c("Mean(Y)", "Var(Y)","SE{Ybar}", 
           "df", "Mean(X2)","Var(X2)")
rownames (meanvar.out) <- c("Theoretical","Empirical")
round(meanvar.out, 3)
##             Mean(Y) Var(Y) SE{Ybar} df Mean(X2) Var(X2)
## Theoretical   79.79 33.077    1.150 24   24.000  48.000
## Empirical     79.78 33.076    1.142 24   23.999  44.521
summary(ybar.prd)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   75.24   79.01   79.78   79.78   80.55   84.80
summary(X2.y.prd)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   5.045  19.239  23.423  23.999  28.101  66.359
####### Sample Mean ~ N(mu,sigma^2/n)
mean.grid <- seq(72,88,.01)
hist(ybar.prd, seq(72,88,0.1), col="yellow", border="green",
  freq=F,xlab="Sample Mean Team Point Pred",
  main=expression(paste("Sampling Dist of ", bar(Y))))
lines(mean.grid,dnorm(mean.grid,mu.Y.prd,sqrt(sigma2.Y.prd/n)), col="purple",
    lwd=2.5)

###### Scaled Variance = (n-1)*S^2/sigma^2 ~ X2(n-1)
X2.grid <- seq(0,70,.01)
hist(X2.y.prd, seq(0,70,0.5), col="green", border="yellow",
  freq=F,xlab="Scaled Variance Team Point Pred (X2)",
  main=expression(paste("Sampling Dist of ", (n-1)(S/sigma)^2)))
lines(X2.grid,dchisq(X2.grid,n-1), col="purple",
    lwd=2.5)

###### Student's t-value = (Ybar-mu)/sqrt(S^2/n) ~ t(n-1)
t.val.prd <- (ybar.prd-mu.Y.prd)/sqrt(s2.prd/n)
t.grid <- seq(-8,8,0.01)

hist(t.val.prd, seq(-8,8,0.2), col="pink", border="red",
  freq=F, xlab="t-value",main="Sampling Distribution of t-value")
lines(t.grid,dt(t.grid,n-1),col="blue", lwd=2.5)

###### Shapiro-Wilk Statistic P-value
shapiro.p <- results.mat[,3]
mean(shapiro.p < 0.05)
## [1] 0.04697
hist(shapiro.p, breaks=seq(0,1,0.01),col="turquoise",border="blue",
    xlab="P-value", main="P-values for Shapiro-Wilk Test")
abline(v=0.05, col="red", lwd=2.5)
abline(h=1000, col="purple", lwd=2.5)

While the team game predictions were found not to be normal in Part 1, they still show good properties with respect to the mean, scaled variance, and \(t\)-value. The behavior of the mean is not surprising, due to the Central Limit Theorem and the fact that the distribution is not “far” from normal. The only real difference between the theoretical and empirical results is that the empirical variance of the scaled variance (44.5) is smaller than the theoretical value (48.0).

Analysis of game point actual scores

### Actual Points (teamPts)

N.samples <- 100000
n <- 25
N <- length(wnba$teamPts)
set.seed(98765)
results.mat <- matrix(rep(0,N.samples*3),ncol=3)
mu.Y.act <- mean(wnba$teamPts)
sigma2.Y.act <- sigma.act^2                 
for (i1 in 1:N.samples) {
  samp.y <- sample(N, n, replace=FALSE)
  y <- wnba$teamPts[samp.y]
  results.mat[i1,1] <- mean(y)
  results.mat[i1,2] <- var(y)
  results.mat[i1,3] <- shapiro.test(y)$p.value
}

ybar.act <- results.mat[,1]
s2.act <- results.mat[,2]
X2.y.act <- (n-1)*s2.act/sigma2.Y.act
SE.ybar.act <- sqrt(s2.act/n)

meanvar.out <- rbind(
         cbind(mu.Y.act, sigma2.Y.act, sqrt(sigma2.Y.act/n),n-1, n-1, 2*(n-1)),
         cbind(mean(ybar.act),mean(s2.act),sd(ybar.act),
               n-1,mean(X2.y.act), var(X2.y.act))
)
colnames(meanvar.out) <- c("Mean(Y)", "Var(Y)","SE{Ybar}", 
           "df", "Mean(X2)","Var(X2)")
rownames (meanvar.out) <- c("Theoretical","Empirical")
round(meanvar.out, 3)
##             Mean(Y)  Var(Y) SE{Ybar} df Mean(X2) Var(X2)
## Theoretical  79.871 131.846    2.296 24   24.000  48.000
## Empirical    79.877 131.950    2.285 24   24.019  46.996
summary(ybar.act)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   69.52   78.32   79.88   79.88   81.40   89.72
summary(X2.y.act)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##   4.743  19.124  23.366  24.019  28.223  65.730
####### Sample Mean ~ N(mu,sigma^2/n)
mean.grid <- seq(67,93,.01)
hist(ybar.act, seq(67,93,0.2), 
  col="yellow", border="green",
  freq=F,xlab="Sample Mean Team Point Act",
  main=expression(paste("Sampling Dist of ", bar(Y))))
lines(mean.grid,dnorm(mean.grid,mu.Y.act,sqrt(sigma2.Y.act/n)), col="purple",
    lwd=2.5)

###### Scaled Variance = (n-1)*S^2/sigma^2 ~ X2(n-1)
X2.grid <- seq(0,70,.01)
hist(X2.y.act, seq(0,70,0.5), col="green", border="yellow",
  freq=F,xlab="Scaled Variance Team Point Act (X2)",
  main=expression(paste("Sampling Dist of ", (n-1)(S/sigma)^2)))
lines(X2.grid,dchisq(X2.grid,n-1), col="purple",
    lwd=2.5)

###### Student's t-value = (Ybar-mu)/sqrt(S^2/n) ~ t(n-1)
t.val.act <- (ybar.act-mu.Y.act)/sqrt(s2.act/n)
t.grid <- seq(-8,8,0.01)

hist(t.val.act, seq(-8,8,0.2), col="pink", border="red",
  freq=F, xlab="t-value",main="Sampling Distribution of t-value")
lines(t.grid,dt(t.grid,n-1),col="blue", lwd=2.5)

###### Shapiro-Wilk Statistic P-value
shapiro.p <- results.mat[,3]
mean(shapiro.p < 0.05)
## [1] 0.05284
hist(shapiro.p, breaks=seq(0,1,0.01),col="turquoise",border="blue",
    xlab="P-value", main="P-values for Shapiro-Wilk Test")
abline(v=0.05, col="red", lwd=2.5)
abline(h=1000, col="purple", lwd=2.5)

While the team game actual scores were found not to be normal in Part 1, they still show good properties with respect to the mean, scaled variance, and \(t\)-value. The behavior of the mean is not surprising, due to the Central Limit Theorem and the fact that the distribution is not “far” from normal. Note that the empirical variance of the scaled variance (47.0) is only slightly smaller than the theoretical value (48.0).

Distribution of the Scaled Variance Ratio

Note that, to ensure that the samples are (approximately) independent, the sampled game predictions and actual scores were obtained with different seeds. With \(n=25\) and \(N=6118\), the independent-samples assumption is quite reasonable.

F.ratio <- (s2.act/sigma2.Y.act)/(s2.prd/sigma2.Y.prd)
df1 <- df2 <- n-1
F.mu <- df2/(df2-2)
F.sigma2 <- 2*F.mu^2*(df1+df2-2)/(df1*(df2-4))

F.out <- rbind(
  cbind(F.mu, F.sigma2, qf(.05,df1,df2),qf(.25,df1,df2),qf(.5,df1,df2),qf(.75,df1,df2),qf(.95,df1,df2)),
        cbind(mean(F.ratio),var(F.ratio),quantile(F.ratio,0.05),quantile(F.ratio,0.25),
              quantile(F.ratio,0.5),quantile(F.ratio,0.75),quantile(F.ratio,0.95)))
colnames(F.out) <- c("Mean", "Var", "5%", "25%", "Median", "75%", "95%")
rownames(F.out) <- c("Theoretical", "Empirical")
round(F.out,3)
##              Mean   Var    5%   25% Median   75%   95%
## Theoretical 1.091 0.228 0.504 0.757  1.000 1.321 1.984
## Empirical   1.085 0.215 0.511 0.762  0.999 1.311 1.945
summary(F.ratio)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##  0.1276  0.7622  0.9991  1.0853  1.3106  6.8261
F.grid <- seq(0,8,.01)
hist(F.ratio, seq(0,8,0.05), freq=F, col="orange", main="F Ratio")
lines(F.grid,df(F.grid,n-1,n-1),col="blue", lwd=2.5)

Based on the theoretical and empirical distributions, and despite the small right skews in the predicted and actual points, the \(F\)-distribution is still a good model for the scaled variance ratios.

Inference Concerning the Population Mean, Variance, and Variance Ratio

Inference Concerning the Population Mean

We showed previously that \(T=\sqrt{n}\left(\overline{Y}-\mu\right)/S \sim t_{n-1}\). This leads to the following result, where \(t_a\) is the value satisfying \(P\left(T\geq t_a\right)=a\) (that is, \(a\) is the upper-tail probability).

\[ T=\frac{\overline{Y}-\mu}{\sqrt{S^2/n}}\sim t_{n-1} \quad \Rightarrow \quad P\left(t_{1-\alpha/2,n-1} \leq \frac{\overline{Y}-\mu}{\sqrt{S^2/n}} \leq t_{\alpha/2,n-1}\right) = 1-\alpha\]

With basic algebra, the goal is to isolate \(\mu\) in the center of the probability statement.

\[ \quad \Rightarrow \quad P\left(\overline{Y}-t_{\alpha/2,n-1}\sqrt{S^2/n} \leq \mu \leq \overline{Y}-t_{1-\alpha/2,n-1}\sqrt{S^2/n}\right)=1-\alpha\]

That is, in repeated sampling, the true value \(\mu\) should fall within this (random) interval for \((1-\alpha)100\%\) of such intervals.

Once we collect our sample, we construct the interval with the observed sample mean \(\overline{y}\) and observed sample variance \(s^2\).

When testing \(H_0:\mu=\mu_0\), for a specific value \(\mu_0\), we construct a test statistic \(t_{obs}\) with \(\mu_0\) in place of \(\mu\). When the null hypothesis is true, the resulting test statistic follows the (central) Student’s \(t\)-distribution; otherwise it is non-central \(t\).

We reject the null hypothesis if \(|t_{obs}| \geq t_{\alpha/2,n-1}\) for a 2-tailed test. The \(P\)-value is given below.

\[H_0:\mu=\mu_0 \quad H_A:\mu \neq \mu_0 \qquad \mbox{Test Statistic:} \quad t_{obs}=\frac{\overline{Y}-\mu_0}{\sqrt{S^2/n}}\]

\[ \mbox{Rejection Region:} \quad |t_{obs}|\geq t_{\alpha/2,n-1} \qquad P=2P\left(t_{n-1}\geq |t_{obs}|\right)\]
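For a single observed sample, base R's t.test() reproduces this statistic and P-value exactly; a sketch with simulated data:

```r
## One-sample t test "by hand" vs base R's t.test(), H0: mu = 0
set.seed(5)
y <- rnorm(25, mean = 0.5, sd = 10)              ## one simulated sample, n = 25
t.obs <- (mean(y) - 0) / sqrt(var(y) / length(y))
p.val <- 2 * (1 - pt(abs(t.obs), length(y) - 1))
tt <- t.test(y, mu = 0)
c(t.obs, unname(tt$statistic))                   ## identical t statistics
c(p.val, tt$p.value)                             ## identical P-values
```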

For the team point difference samples generated above, if we set \(\alpha=0.05\), which leads to 95% Confidence Intervals and tests with a 5% Type I error rate, we have \(n=25\) and \(t_{.025,25-1}=2.064\).

Here we construct 95% CI’s based on each sample and test \(H_0:\mu=0\) versus \(H_A:\mu\neq 0\).

## Construct 95% Confidence Interval for population mean difference
mu.LB <- ybar.dif + qt(.025,n-1)*sqrt(s2.dif/n)
mu.UB <- ybar.dif + qt(.975,n-1)*sqrt(s2.dif/n)

## Proportion of CI's Containing Population mean difference
mu.cover <- mean(mu.LB <= mu.Y.dif & mu.UB >= mu.Y.dif)
mu.uncover.lo <- mean(mu.UB <= mu.Y.dif)
mu.uncover.hi <- mean(mu.LB >= mu.Y.dif)

## Testing H0:mu=0  (On average, actual=predicted, Note: mu.Y.dif=0.0814)
t_obs.0 <- (ybar.dif-0) / sqrt(s2.dif/n)   ## t-statistic
p.t.0 <- 2*(1-pt(abs(t_obs.0),n-1))          ## P-value
t.prob.rej <- mean(abs(t_obs.0) >= qt(.975,n-1))   ##Pr{Reject H0}

## Output Results
t.0.out <- cbind(mu.Y.dif, sqrt(sigma2.Y.dif), mu.cover,
                 mu.uncover.lo, mu.uncover.hi, t.prob.rej, mean(p.t.0))
colnames(t.0.out) <- c("mu", "sigma", "95%CI CovRate",
                       "P(mu>UB)","P(mu<LB)","P(Rej H0)", "mean(Pval)")
rownames(t.0.out) <- c("H0:mu=0")
round(t.0.out,4)
##             mu   sigma 95%CI CovRate P(mu>UB) P(mu<LB) P(Rej H0) mean(Pval)
## H0:mu=0 0.0814 10.0805        0.9508   0.0253   0.0239    0.0491     0.5021
## Histogram of P-values (Under H0, will be U(0,1))

hist(p.t.0, breaks=seq(0,1,.01), col="aquamarine", freq=F,
     xlab="P-value", 
     main=expression(paste("Histogram of P-values", H[0]:mu,"=0")))
abline(v=0, col="deeppink2", lwd=2.5)
abline(h=1, col="sienna3", lwd=2.5)

The null value \(\mu_0=0\) is very close to \(\mu=0.0814\), especially considering the standard deviation is \(\sigma=10.0805\). The rejection rate for the \(t\)-test is 0.0491, very close to the nominal \(\alpha=0.05\). For all intents and purposes, \(H_0:\mu=0\) is true. The histogram of \(P\)-values is very well approximated by a Uniform(0,1) distribution, as expected when \(H_0\) is true.

The coverage rate for the 95% Confidence Intervals is 95.08%, and the intervals “miss low” 2.53% and “miss high” 2.39%. These are very similar. Recall that a different seed would have produced different samples, but similar results.

Inference Concerning the Population Variance

We saw previously that the scaled variance \((n-1)S^2/\sigma^2 \sim \chi_{n-1}^2\). By an argument similar to the one for the \(t\)-value, we obtain the following result.

\[ \frac{(n-1)S^2}{\sigma^2} \sim \chi_{n-1}^2 \quad \Rightarrow \quad P\left(\chi_{1-\alpha/2,n-1}^2 \leq \frac{(n-1)S^2}{\sigma^2} \leq \chi_{\alpha/2,n-1}^2\right)=1-\alpha\]

This leads to the following probability statement and \((1-\alpha)100\%\) Confidence Interval for \(\sigma^2\).

\[ P\left(\frac{(n-1)S^2}{\chi_{\alpha/2,n-1}^2} \leq \sigma^2 \leq \frac{(n-1)S^2}{\chi_{1-\alpha/2,n-1}^2}\right)=1-\alpha \] \[\quad \Rightarrow \quad (1-\alpha)100\% \mbox{ CI: } \quad \left[\frac{(n-1)s^2}{\chi_{\alpha/2,n-1}^2}, \frac{(n-1)s^2}{\chi_{1-\alpha/2,n-1}^2}\right]\] Here \(s^2\) is the observed sample variance.

If, for some reason, there was a null value \(\sigma_0^2\) of interest to test, we could conduct the following test.

\[ H_0:\sigma^2=\sigma_0^2 \quad H_A:\sigma^2 \neq \sigma_0^2 \qquad \mbox{Test Statistic: } \quad X_{obs}^2=\frac{(n-1)s^2}{\sigma_0^2}\]

\[ \mbox{Rejection Region: } \quad \left\{X_{obs}^2 \leq \chi_{1-\alpha/2,n-1}^2\right\} \cup \left\{X_{obs}^2 \geq \chi_{\alpha/2,n-1}^2\right\} \] \[ \mbox{P-value: } \quad P=2\min\left[ P\left(\chi_{n-1}^2 \leq X_{obs}^2\right), P\left(\chi_{n-1}^2 \geq X_{obs}^2\right) \right]\]
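A small helper function (hypothetical; base R has no built-in one-sample variance test) implements this interval and test; the 2-times-minimum P-value used here is one common convention for the two-sided test:

```r
## Hypothetical helper for the chi-square variance CI and two-sided test
var.inference <- function(y, sigma2.0, alpha = 0.05) {
  n <- length(y); s2 <- var(y); df <- n - 1
  ci <- c(df * s2 / qchisq(1 - alpha/2, df),     ## lower bound
          df * s2 / qchisq(alpha/2, df))         ## upper bound
  X2.obs <- df * s2 / sigma2.0
  p.val <- min(1, 2 * min(pchisq(X2.obs, df), 1 - pchisq(X2.obs, df)))
  list(ci = ci, X2.obs = X2.obs, p.value = p.val)
}
set.seed(6)
y <- rnorm(25, 0, 10)
var.inference(y, sigma2.0 = 100)                 ## true sigma^2 = 100
```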

We obtain the 95% Confidence Interval coverage rates based on the team game differences below. As there is no clear null value \(\sigma_0^2\) of interest, we will not pursue testing the variance. Below, we test for equality of the variances of the actual and predicted scores.

## Construct 95% Confidence Interval for population variance of differences
sigma2.LB <- ((n-1)*s2.dif) / qchisq(.975,n-1)
sigma2.UB <- ((n-1)*s2.dif) / qchisq(.025,n-1)
sigma2.cover <- mean(sigma2.LB <= sigma2.Y.dif & sigma2.UB >= sigma2.Y.dif)
sigma2.uncover.lo <- mean(sigma2.UB <= sigma2.Y.dif)
sigma2.uncover.hi <- mean(sigma2.LB >= sigma2.Y.dif)

sigma2.out <- cbind(sigma2.Y.dif, n-1, qchisq(.025,n-1), qchisq(.975,n-1),
                    sigma2.cover, sigma2.uncover.lo, sigma2.uncover.hi)
colnames(sigma2.out) <- c("sigma^2", "df","X2(.975)","X2(.025)",
                          "CovRate","sigma2>UB","sigma2<LB")
rownames(sigma2.out) <- c("95% CI sigma^2")
round(sigma2.out,4)
##                 sigma^2 df X2(.975) X2(.025) CovRate sigma2>UB sigma2<LB
## 95% CI sigma^2 101.6167 24  12.4012  39.3641  0.9482    0.0263    0.0255

The coverage rate is 94.82%, with the intervals missing low (2.63%) and missing high (2.55%) at very similar rates.

Inference Concerning the Population Variance Ratio

Here we consider the variance ratio of the actual to predicted scores. The population variances are \(\sigma_A^2=(11.4824)^2=131.85\) and \(\sigma_P^2=(5.7513)^2=33.08\), respectively. Thus, the population variance ratio is \(131.85/33.08 \approx 4.0\). Note again that the samples are obtained with different seeds, so they can be treated as (approximately) independent. We have the following results from above.

\[ W_A=\frac{(n_A-1)S_A^2}{\sigma_A^2}\sim \chi_{n_A-1}^2 \quad W_P=\frac{(n_P-1)S_P^2}{\sigma_P^2}\sim \chi_{n_P-1}^2 \quad df_i=n_i-1 \]

\[ F=\frac{W_A/(n_A-1)}{W_P/(n_P-1)} \sim F_{n_A-1,n_P-1} \quad \Rightarrow \quad F=\frac{S_A^2/\sigma_A^2}{S_P^2/\sigma_P^2}= \frac{S_A^2/S_P^2}{\sigma_A^2/\sigma_P^2} \sim F_{n_A-1,n_P-1} \]

Setting up the probability statement as before, we isolate \(\sigma_A^2/\sigma_P^2\), which leads to a Confidence Interval and test procedure.

\[ F=\frac{S_A^2/S_P^2}{\sigma_A^2/\sigma_P^2} \sim F_{n_A-1,n_P-1} \] \[\Rightarrow \quad P\left(F_{1-\alpha/2,n_A-1,n_P-1} \leq \frac{S_A^2/S_P^2}{\sigma_A^2/\sigma_P^2} \leq F_{\alpha/2,n_A-1,n_P-1}\right)=1-\alpha\]

\[\Rightarrow \quad P\left(\frac{S_A^2/S_P^2}{F_{\alpha/2,n_A-1,n_P-1}} \leq \frac{\sigma_A^2}{\sigma_P^2} \leq \frac{S_A^2/S_P^2}{F_{1-\alpha/2,n_A-1,n_P-1}} \right) = 1-\alpha \]

For a \((1-\alpha)100\%\) Confidence Interval, we plug in the observed \(s_A^2\) and \(s_P^2\) as before. For testing the equality of variances we conduct the test as given below.

\[ (1-\alpha)100\% \mbox{ CI for } \quad \frac{\sigma_A^2}{\sigma_P^2}: \qquad \left[\frac{s_A^2/s_P^2}{F_{\alpha/2,n_A-1,n_P-1}}, \frac{s_A^2/s_P^2}{F_{1-\alpha/2,n_A-1,n_P-1}} \right]\]

Note that when testing \(H_0:\sigma_A^2=\sigma_P^2\), under the null hypothesis, \(S_A^2/S_P^2 \sim F_{n_A-1,n_P-1}\). So the test statistic is simply \(s_A^2/s_P^2\).

\[H_0:\frac{\sigma_A^2}{\sigma_P^2}=1 \quad H_A: \frac{\sigma_A^2}{\sigma_P^2} \neq 1 \qquad \mbox{Test Statistic:} \quad F_{obs}=\frac{s_A^2}{s_P^2} \]

\[\mbox{Rejection Region: } \quad \left\{F_{obs}\leq F_{1-\alpha/2,n_A-1,n_P-1}\right\} \cup \left\{F_{obs}\geq F_{\alpha/2,n_A-1,n_P-1}\right\} \]

\[ P=2\min\left[P\left(F_{n_A-1,n_P-1}\leq F_{obs}\right), P\left(F_{n_A-1,n_P-1}\geq F_{obs}\right)\right]\]
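For two observed samples, base R's var.test() carries out this F test and Confidence Interval directly; a sketch with simulated data:

```r
## Variance-ratio F test "by hand" vs base R's var.test()
set.seed(7)
yA <- rnorm(25, 80, 11.5)                        ## simulated "actual"-type sample
yP <- rnorm(25, 80, 5.75)                        ## simulated "predicted"-type sample
F.obs <- var(yA) / var(yP)
ci <- c(F.obs / qf(.975, 24, 24), F.obs / qf(.025, 24, 24))
vt <- var.test(yA, yP)
c(F.obs, unname(vt$statistic))                   ## identical F statistics
rbind(ci, vt$conf.int)                           ## identical 95% CIs
```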

Finally, we construct the 95% Confidence Intervals and conduct the tests of equal variances. Given that the true variance ratio is 4, that test should reject the null hypothesis often.

vr.LB <- (s2.act/s2.prd)/qf(.975,n-1,n-1)
vr.UB <- (s2.act/s2.prd)/qf(.025,n-1,n-1)
vr.cover <- mean(vr.LB <= (sigma2.Y.act/sigma2.Y.prd) & 
                 vr.UB >= (sigma2.Y.act/sigma2.Y.prd))
vr.uncover.lo <- mean(vr.UB <= (sigma2.Y.act/sigma2.Y.prd))
vr.uncover.hi <- mean(vr.LB >= (sigma2.Y.act/sigma2.Y.prd))

F.obs <- s2.act/s2.prd 
vr.lt.1 <- mean(F.obs <= qf(.025,n-1,n-1))
vr.eq.1 <- mean(F.obs >= qf(.025,n-1,n-1) & (F.obs <= qf(.975,n-1,n-1)))
vr.gt.1 <- mean(F.obs >= qf(.975,n-1,n-1))

vr.out <- cbind(sigma2.Y.act/sigma2.Y.prd, vr.cover, vr.uncover.lo,
                vr.uncover.hi,vr.gt.1,vr.eq.1,vr.lt.1) 
colnames(vr.out) <- c("Var Ratio", "CovRate","CI<VR","CI>VR",
                      "VR>1","VR'='1","VR<1")   
rownames(vr.out) <- c("Variance Ratio")
round(vr.out,4)
##                Var Ratio CovRate  CI<VR  CI>VR   VR>1 VR'='1 VR<1
## Variance Ratio     3.986  0.9551 0.0223 0.0226 0.9167 0.0833    0

The coverage rate for the 95% Confidence Intervals is 95.51%, with the intervals missing low (2.23%) and missing high (2.26%) at virtually the same rate.

For the tests, we concluded \(\sigma_A^2 > \sigma_P^2\) in 91.67% of the samples, failed to reject \(\sigma_A^2 = \sigma_P^2\) in 8.33% of the samples, and never concluded \(\sigma_A^2 < \sigma_P^2\). This is not surprising as the true ratio is 4.