Define the following terms/concepts
a.) Non-parametric statistics - A branch of statistics which does not assume that the sample data comes from a specific probability distribution.
b.) Parametric statistics - A branch of statistics which assumes the sample data comes from a specific probability distribution that depends on a set of fixed parameters.
c.) Describe the advantages and disadvantages of Non-parametric significance tests
Advantages
They may be the only alternative when sample sizes are very small (unless the population distribution is known exactly, but this is almost never the case)
They make few assumptions about the population distribution of the data
They are advantageous when the data represent crude measurements such as subjective ratings/rankings (e.g., Likert responses)
They often have simpler computations and interpretations than parametric tests
Disadvantages
d.) Explanatory Variable - In bivariate statistical procedures, the explanatory variable is the independent variable that usually defines the groups that will be compared across the response variable
e.) Response Variable - The dependent variable that measures the outcome of interest from each group
\(\bf (1.)\) For parts (a) - (d) given the confidence interval for the parameter under the null hypothesis, use the point estimate to determine whether the null hypothesis \(H_0\) should be Rejected or Not Rejected and denote significance level \(\alpha\)
a.) \(H_0: \mu_0 = -2\)
\(H_A: \mu \neq \mu_0\)
\(95\%\) CI for \(\mu_0 \in [-3.0, -2.4]\)
\(\bar{x} = -2.7\)
\[ \text{Decision} = \color{red}{\text{reject} \ H_0} \\ \alpha = \color{red}{0.05} \]
The hypothesized value \(\mu_0 = -2\) lies outside the interval \([-3.0, -2.4]\), so \(H_0\) is rejected.
b.) \(H_0: p_0 = 0.9\)
\(H_A: p \neq p_0\)
\(99\%\) CI for \(p_0 \in [0.968, 0.972]\)
\(\hat{p} = 0.97\)
\[ \text{Decision} = \color{red}{\text{reject} \ H_0} \\ \alpha = \color{red}{0.01} \]
The hypothesized value \(p_0 = 0.9\) lies outside the interval \([0.968, 0.972]\), so \(H_0\) is rejected.
c.) \(H_0: \mu_0 = 0\)
\(H_A: \mu \neq \mu_0\)
\(90\%\) CI for \(\mu_0 \in [-3.63, 7.49]\)
\(\bar{x} = 1.93\)
\[ \text{Decision} = \color{red}{\text{fail to reject} \ H_0} \\ \alpha = \color{red}{0.1} \]
d.) \(H_0: \mu_0 = 50\)
\(H_A: \mu \neq \mu_0\)
\(85\%\) CI for \(\mu_0 \in [93.6, 139]\)
\(\bar{x} = 116.3\)
\[ \text{Decision} = \color{red}{\text{reject} \ H_0} \\ \alpha = \color{red}{0.15} \]
The hypothesized value \(\mu_0 = 50\) lies outside the interval \([93.6, 139]\), so \(H_0\) is rejected.
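The decision rule for a two-sided test can be sketched in a few lines of Python. This is only an illustration, assuming each interval is the usual two-sided confidence interval centered on the point estimate, so that \(H_0\) is rejected exactly when the hypothesized value falls outside it and \(\alpha = 1 - \text{confidence level}\). The function name `ci_decision` is hypothetical and not part of the assignment.

```python
def ci_decision(null_value, ci_low, ci_high, confidence):
    """Two-sided decision from a confidence interval (assumed centered on the estimate)."""
    lo, hi = sorted((ci_low, ci_high))   # tolerate endpoints given in either order
    alpha = round(1 - confidence, 10)    # significance level implied by the interval
    inside = lo <= null_value <= hi
    decision = "fail to reject H0" if inside else "reject H0"
    return decision, alpha


# (null value, lower endpoint, upper endpoint, confidence level) for parts (a) - (d)
cases = {
    "a": (-2, -3.0, -2.4, 0.95),
    "b": (0.9, 0.968, 0.972, 0.99),
    "c": (0, -3.63, 7.49, 0.90),
    "d": (50, 93.6, 139, 0.85),
}

for label, args in cases.items():
    print(label, ci_decision(*args))
```

Only part (c) has its hypothesized value inside the interval, so it is the only part where this rule does not reject \(H_0\).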
\(\bf (2.)\) A dermatologist is conducting an experiment to test whether a new topical skin care cream reduces the appearance of facial acne. From a sample of \(20\) patients with acne, the researcher gives each patient a score from \(1-5\) representing the severity of their acne (with \(5\) the most severe) before the topical cream is applied. Patients are then asked to use the topical cream for \(10\) days. Following the \(10\) day treatment, patients are given a new acne score between \(1-5\). The data for this experiment are given in the table below:
Patient Number | Score Before Treatment | Score After Treatment | Difference | Sign |
---|---|---|---|---|
1 | 3 | 2 | \(1\) | \(+\) |
2 | 2 | 1 | \(1\) | \(+\) |
3 | 4 | 1 | \(3\) | \(+\) |
4 | 4 | 1 | \(3\) | \(+\) |
5 | 4 | 1 | \(3\) | \(+\) |
6 | 5 | 2 | \(3\) | \(+\) |
7 | 1 | 1 | \(0\) | |
8 | 5 | 4 | \(1\) | \(+\) |
9 | 5 | 1 | \(4\) | \(+\) |
10 | 2 | 1 | \(1\) | \(+\) |
11 | 2 | 2 | \(0\) | |
12 | 5 | 1 | \(4\) | \(+\) |
13 | 3 | 3 | \(0\) | |
14 | 3 | 1 | \(2\) | \(+\) |
15 | 1 | 1 | \(0\) | |
16 | 2 | 1 | \(1\) | \(+\) |
17 | 4 | 1 | \(3\) | \(+\) |
18 | 5 | 1 | \(4\) | \(+\) |
19 | 2 | 3 | \(-1\) | \(-\) |
20 | 1 | 1 | \(0\) | |
The dermatologist is particularly interested in whether the new product can improve the appearance of acne. They plan to test the null hypothesis \(H_0: p_0 = 0.5\) against the right-tailed alternative hypothesis \(H_A: p>0.5\) using a sign test to determine if there is enough evidence to conclude that the topical cream has a positive effect on reducing facial acne at the \(\alpha = 0.05\) level.
a.) Fill in the table above by computing the difference in acne scores before and after applying the skin care cream. Record the sign of each difference and report the total number of positive signs \(s\)
b.) What does a “positive” sign indicate in this experiment? A positive sign indicates that the skin cream reduced the appearance of acne (the score after treatment is lower than the score before)
c.) Compute the \(p\)-value for the sign test. The number of positive signs is \(s = 14\). Dropping the five patients whose difference is \(0\) leaves \(15\) observations, so the \(p\)-value is the probability of observing \(14\) or more positive signs in \(15\) observations, given by \[p\text{-value} = P(S\geq s | H_0) = \sum_{k = 14}^{15}\frac{15!}{k!(15 - k)!} \cdot 0.5^k \cdot (1-0.5)^{15-k} \] \[ \approx 0.0005 \]
d.) Use the \(p\)-value from part (c) to determine if there is enough evidence to conclude that the topical cream product has a positive effect on reducing the appearance of acne. Interpret your decision in context.
At the \(\alpha = 0.05\) significance level, we reject the null hypothesis and determine that there is sufficient evidence to support that the skin cream reduces the appearance of acne
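The sign test in this problem can be checked with a short Python sketch. It assumes the usual convention that tied pairs (differences of \(0\)) are dropped before counting, and it uses only the standard library. The function name `sign_test_right_tailed` is hypothetical.

```python
from math import comb

# Scores copied from the table (patients 1 - 20)
before = [3, 2, 4, 4, 4, 5, 1, 5, 5, 2, 2, 5, 3, 3, 1, 2, 4, 5, 2, 1]
after = [2, 1, 1, 1, 1, 2, 1, 4, 1, 1, 2, 1, 3, 1, 1, 1, 1, 1, 3, 1]


def sign_test_right_tailed(before, after):
    """Right-tailed sign test of H0: p = 0.5, with ties (zero differences) dropped."""
    diffs = [b - a for b, a in zip(before, after)]
    nonzero = [d for d in diffs if d != 0]   # ties carry no sign information
    n = len(nonzero)                         # number of informative pairs
    s = sum(d > 0 for d in nonzero)          # number of positive signs
    # P(S >= s) when S ~ Binomial(n, 0.5)
    p_value = sum(comb(n, k) for k in range(s, n + 1)) / 2 ** n
    return s, n, p_value


s, n, p_value = sign_test_right_tailed(before, after)
print(f"s = {s}, n = {n}, p-value = {p_value:.4f}")
```

Running it on the table gives \(s = 14\) positive signs out of \(15\) non-tied pairs and an exact tail probability of \(16/2^{15}\).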
\(\bf (3.)\) In a boreal forest, researchers conducted an experiment to study how herbivores respond to variations in food plant quality and quantity at the stand level. They fertilized young forest stands and observed herbivore use over the subsequent year. The data collected included the number of animal tracks in the fertilized and control plots (Ball, Danell, and Sunesson 2000).
Observation | Number of Tracks In Fertilized plots | Number of Tracks In Control Plots | Difference | Sign |
---|---|---|---|---|
1 | 15 | 10 | \(5\) | \(+\) |
2 | 12 | 9 | \(3\) | \(+\) |
3 | 18 | 11 | \(7\) | \(+\) |
4 | 14 | 8 | \(6\) | \(+\) |
5 | 16 | 12 | \(4\) | \(+\) |
6 | 13 | 10 | \(3\) | \(+\) |
7 | 17 | 11 | \(6\) | \(+\) |
8 | 14 | 9 | \(5\) | \(+\) |
9 | 19 | 13 | \(6\) | \(+\) |
10 | 15 | 10 | \(5\) | \(+\) |
11 | 11 | 8 | \(3\) | \(+\) |
12 | 16 | 12 | \(4\) | \(+\) |
The researchers are interested in whether or not fertilizing stands increased herbivore activity. Conduct a sign test at the \(\alpha = 0.05\) significance level to determine if fertilized plots have significantly more herbivore tracks than control plots
a.) Fill in the table above by computing the difference in animal tracks between the fertilized and control plots. Record the sign of each difference and report the total number of positive signs \(s\)
b.) What does a “positive” sign indicate in this experiment? A positive sign indicates that the fertilized stands have more animal tracks
c.) Compute the \(p\)-value for the sign test. The number of positive signs is \(s = 12\). The \(p\)-value is the probability of observing \(12\) or more positive signs in \(12\) observations, given by \[p\text{-value} = P(S\geq s | H_0) = \frac{12!}{12!(12 - 12)!} \cdot 0.5^{12}\cdot(1-0.5)^{12-12} \] \[ = 0.5^{12} \] \[ \approx 0.0002 \]
d.) Use the \(p\)-value from part (c) to determine if there is enough evidence to conclude that fertilized stands have more herbivore activity. At the \(\alpha = 0.05\) significance level we reject the null hypothesis and conclude that fertilized stands have significantly more animal activity than non-fertilized stands
\(\bf (4.)\) The United States Supreme Court serves as the judicial branch of the U.S. government, tasked with ensuring that laws conform to the U.S. Constitution. Although traditionally operating as a relatively discreet arbiter within the American government, recent years have seen an unusual spotlight on the Supreme Court in partisan politics. This heightened attention is largely attributed by political analysts to several landmark Supreme Court decisions and ethical controversies that have surfaced. Consequently, it is believed that this increased scrutiny has adversely affected public approval of the Court across the political spectrum. The table below provides a summary of current and historical polling results for assessing SCOTUS job approval. The current approval rating is summarized from \(192\) political polls conducted among the public since late 2022 (Ryan and Bycoffe 2024). The historic approval sentiment is characterized from \(39\) Gallup polls conducted since early 2001.
Period | Parameter | Sample Size | Mean SCOTUS Approval Rating (%) |
---|---|---|---|
Current Approval (2022 - 2024) | \(p_1\) | 192 | 41.16 |
Past Approval (Gallup: 2001 - 2022) | \(p_2\) | 39 | 50.87 |
Conduct a two-sample test at the \(\alpha = 0.05\) significance level to determine whether the current approval rating is significantly lower than the historical average - be sure to address all five steps of the hypothesis test.
\[\hat{p} = \frac{0.4116(192)+0.5087(39)}{192+39} = 0.428\]
\(n\hat{p} = 0.428(231) = 99\) and \(n(1-\hat{p}) = 0.572(231) = 132\) are both greater than the required \(15\) and thus the sample size is sufficient to meet the asymptotic requirement of the test
The null hypothesis is \(H_0: p_1 - p_2 = 0\) and the alternative hypothesis is \(H_A: p_1 - p_2 < 0\)
The pooled standard error is
\[SE_{D_{pooled}} = \sqrt{0.428(1-0.428)\left(\frac{1}{192}+\frac{1}{39}\right)} = 0.0869\]
the test statistic is
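\[ Z_{obs} = \frac{(0.4116 - 0.5087)-0}{0.0869} = -1.117\]
\[p\text{-value} = P(Z \leq -1.117| H_0) = 0.132\] Note that your \(p\)-value may be slightly different depending on how you rounded
Because the \(p\)-value is greater than \(\alpha = 0.05\), we fail to reject the null hypothesis: there is not sufficient evidence to conclude that the current approval rating is lower than the historical average.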
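As a check on the arithmetic, the pooled two-proportion \(z\)-test can be sketched in Python using only the standard library. This is a sketch, not a definitive implementation: the function names `normal_cdf` and `two_proportion_z_test` and the `alternative` argument are hypothetical, and the normal CDF is computed from `math.erf`.

```python
from math import erf, sqrt


def normal_cdf(z):
    """Standard normal CDF written in terms of the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def two_proportion_z_test(p1_hat, n1, p2_hat, n2, alternative="two-sided"):
    """Pooled z-test of H0: p1 - p2 = 0 from sample proportions and sample sizes."""
    pooled = (p1_hat * n1 + p2_hat * n2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    if alternative == "less":          # H_A: p1 - p2 < 0
        p_value = normal_cdf(z)
    elif alternative == "greater":     # H_A: p1 - p2 > 0
        p_value = 1 - normal_cdf(z)
    elif alternative == "two-sided":   # H_A: p1 - p2 != 0
        p_value = 2 * (1 - normal_cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return pooled, se, z, p_value


# Current (p1) versus historical (p2) approval, left-tailed alternative
pooled, se, z, p_value = two_proportion_z_test(0.4116, 192, 0.5087, 39, alternative="less")
print(f"pooled = {pooled:.3f}, SE = {se:.4f}, z = {z:.3f}, p-value = {p_value:.3f}")
```

The code carries unrounded intermediate values, so its output may differ from the hand computation in the last reported digit.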
\(\bf (5.)\) A study published in the Journal of Paediatrics and Child Health was interested in the number of reports of child abuse in Sergipe, Brazil, before and after COVID-19 started. The researchers hypothesized that children in abusive homes might be in more danger because they weren’t going to school and their families were under more stress (Martins-Filho et al. 2020). A summary of the study is reported below:
Period | Parameter | Total Registries | Registries for Children under 12 Years of Age | Relative Proportion |
---|---|---|---|---|
Before COVID-19 (2019) | \(p_1\) | 70 | 18 | 0.26 |
After COVID-19 (2020) | \(p_2\) | 53 | 16 | 0.30 |
Conduct a two-sample test at the \(\alpha = 0.01\) significance level to determine whether the rate of reported child abuse for children under the age of \(12\) was significantly different after the start of the COVID-19 pandemic - be sure to address all five steps of the hypothesis test
\[\hat{p} = \frac{18+16}{70+53} = 0.276\]
\(n\hat{p} = 0.276(123) = 34\) and \(n(1-\hat{p}) = 0.724(123) = 89\) are both greater than the required \(15\) and thus the sample size is sufficient to meet the asymptotic requirement of the test
The null hypothesis is \(H_0: p_1 - p_2 = 0\) and the alternative hypothesis is \(H_A: p_1 - p_2 \neq 0\)
The pooled standard error is
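\[SE_{D_{pooled}} = \sqrt{0.276(1-0.276)\left(\frac{1}{70}+\frac{1}{53}\right)} = 0.0814\]
the test statistic is
\[ Z_{obs} = \frac{(0.26 - 0.30)-0}{0.0814} = -0.491\]
\[ p\text{-value} = P(|Z|\geq|-0.491||H_0)\] \[ = 2\times \left[1 - P(Z < 0.491|H_0)\right] = 0.623\]
Because the \(p\)-value is greater than \(\alpha = 0.01\), we fail to reject the null hypothesis: there is not sufficient evidence that the proportion of reports involving children under \(12\) changed after the start of the COVID-19 pandemic.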
Note that your \(p\)-value may be slightly different depending on how you rounded
\(\bf (6.)\) A study published in the journal Physical Culture and Sport. Studies and Research explored what motivates athletes to keep playing sports, comparing team sports like football with individual ones like taekwondo (Moradi, Bahrami, and Dana 2020). The study used stratified random sampling to survey \(265\) athletes from four team disciplines (football, volleyball, basketball, handball) and two individual disciplines (kung fu and taekwondo). The study evaluated motivational factors affecting sport participation based on eight fields: achievement/status, teamwork, fitness, energy release, situational factors, skill development, friendship, and fun. Each field was evaluated with a survey that quantified sentiment using three-level Likert scale responses (very important=3, somewhat important=2, and not important=1). The results of the study are summarized in the table below:
Motivational Factor | Sport Type | Sample Mean | Sample Stdev. | Sample Size |
---|---|---|---|---|
Participation (total) | Team | 75.67 | 9.38 | 203 |
Participation (total) | Individual | 80.50 | 8.40 | 62 |
Achievement | Team | 15.03 | 2.71 | 203 |
Achievement | Individual | 15.72 | 3.03 | 62 |
Teamwork | Team | 8.11 | 1.21 | 203 |
Teamwork | Individual | 7.68 | 1.23 | 62 |
Energy Release | Team | 11.88 | 2.05 | 203 |
Energy Release | Individual | 12.85 | 2.07 | 62 |
Fitness | Team | 7.79 | 1.27 | 203 |
Fitness | Individual | 8.40 | 0.94 | 62 |
Situational Factors | Team | 7.48 | 1.44 | 203 |
Situational Factors | Individual | 8.33 | 0.98 | 62 |
Skill Development | Team | 7.94 | 1.28 | 203 |
Skill Development | Individual | 8.42 | 1.07 | 62 |
Friendship | Team | 9.57 | 1.78 | 203 |
Friendship | Individual | 10.50 | 1.51 | 62 |
Fun | Team | 7.74 | 1.20 | 203 |
Fun | Individual | 8.18 | 1.04 | 62 |
Conduct a pooled (assume the sample variances are equal) two independent samples \(t\)-test at the \(\alpha = 0.05\) significance level to determine whether fitness as a motivating factor is significantly different between team and individual sports.
The null hypothesis is \(H_0: \mu_1 - \mu_2 = 0\) where \(\mu_1\) is the mean score for fitness for team sports and \(\mu_2\) is the mean score for fitness for individual sports. The alternative hypothesis is \(H_A: \mu_1 - \mu_2 \neq 0\)
The pooled estimate of the sample standard deviation is
\[s_{pooled} = \sqrt{\frac{(203 - 1)1.27^2 + (62-1)0.94^2}{203+62 - 2}} = 1.201 \]
The estimated standard error, assuming the samples are independent and have the same variance, is given by
\[ SE(\hat{\mu}_d) = 1.201\cdot\sqrt{\frac{1}{203}+\frac{1}{62}} = 0.174\]
The test statistic is
\[ t_{obs} = \frac{(7.79 - 8.40) - 0}{0.174} = -3.505\]
where \(t_{obs}\) is approximately \(t\)-distributed with \(203+62 - 2 = 263\) degrees of freedom
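\[ p\text{-value} = P(|t|\geq |t_{obs}||H_0) \] \[= 2\left[P(t \geq 3.505 | H_0)\right] = 0.00054\]
Because the \(p\)-value is less than \(\alpha = 0.05\), we reject the null hypothesis and conclude that the mean fitness score differs significantly between team and individual sports.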
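The pooled \(t\)-test can be reproduced from the summary statistics with the Python sketch below. The standard library has no \(t\)-distribution, so the tail probability is approximated here by integrating the \(t\) density with Simpson's rule; that numerical shortcut is an assumption of this sketch rather than part of the assignment, and the names `t_pdf`, `t_upper_tail`, and `pooled_t_test` are hypothetical.

```python
from math import exp, lgamma, pi, sqrt


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * (log_df_pi(df))
    return exp(log_c) * (1 + x * x / df) ** (-(df + 1) / 2)


def log_df_pi(df):
    """log(df * pi), kept separate so the normalizing constant stays readable."""
    from math import log
    return log(df * pi)


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T >= t) for t >= 0 using Simpson's rule on [0, t]."""
    if t < 0:
        raise ValueError("t must be non-negative")
    h = t / steps                        # steps is even, as Simpson's rule requires
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3           # half the mass lies above zero by symmetry


def pooled_t_test(m1, s1, n1, m2, s2, n2):
    """Two-sided pooled (equal-variance) t-test from means, std. devs., and sizes."""
    df = n1 + n2 - 2
    s_pooled = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / df)
    se = s_pooled * sqrt(1 / n1 + 1 / n2)
    t_obs = (m1 - m2) / se
    p_value = 2 * t_upper_tail(abs(t_obs), df)
    return s_pooled, se, t_obs, df, p_value


# Fitness: team (group 1) versus individual (group 2)
s_pooled, se, t_obs, df, p_value = pooled_t_test(7.79, 1.27, 203, 8.40, 0.94, 62)
print(f"s_pooled = {s_pooled:.3f}, SE = {se:.3f}, t = {t_obs:.3f}, df = {df}, p = {p_value:.5f}")
```

The result should land close to the hand computation (a \(t\) statistic near \(-3.5\) and a two-sided \(p\)-value on the order of \(5 \times 10^{-4}\)); small differences come from carrying unrounded intermediate values.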
\(\bf (7.)\) A 1993 study published in the Canadian Journal of Zoology investigated the characteristics of maturation and growth in two lowland populations (Servotte and Thevenon) of European Common Frogs Rana temporaria (Augert and Joly 1993) in Southeast France from 1986 - 1989. The body lengths of male and female frogs from 1 - 4 years in age in both populations are summarized below.
Age (Years) | Population Location | Male Body Length (mm) | Stdev | Sample size | Female Body Length (mm) | Stdev | Sample Size |
---|---|---|---|---|---|---|---|
1 | Servotte | 32.5 | 5.3 | 45 | 33.4 | 5.4 | 76 |
2 | Servotte | 55.8 | 4.7 | 11 | 59.7 | 6.2 | 15 |
3 | Servotte | 61.2 | 5.3 | 35 | 65.6 | 5.2 | 27 |
4 | Servotte | 64.7 | 7.3 | 12 | 69.6 | 0.5 | 15 |
1 | Thevenon | 32.9 | 3.9 | 49 | 32.8 | 4.3 | 76 |
2 | Thevenon | 51.8 | 3.2 | 26 | 53.3 | 6.0 | 29 |
3 | Thevenon | 59.9 | 5.4 | 27 | 59.9 | 6.4 | 21 |
4 | Thevenon | 61.0 | 4.1 | 12 | 63.0 | 4.3 | 13 |
Conduct a two independent samples \(t\)-test at the \(\alpha = 0.05\) significance level to evaluate whether there is a statistically significant difference in the mean body lengths of four-year-old male frogs between the Servotte and Thévenon populations. Assume the sample variances are NOT equal.
The null hypothesis is \(H_0: \mu_1 - \mu_2 = 0\) where \(\mu_1\) is the mean male body length for frogs in the Servotte population and \(\mu_2\) is the mean male body length for frogs in the Thevenon population. The alternative hypothesis is \(H_A: \mu_1 - \mu_2 \neq 0\)
The estimated standard error assuming the samples are independent and do not have the same variance is given by
\[ SE(\hat{\mu}_d) = \sqrt{\frac{7.3^2}{12}+\frac{4.1^2}{12}} = 2.417\]
The test statistic is
\[ t_{obs} = \frac{(64.7 - 61.0)}{2.417} = 1.531\]
where \(t_{obs}\) is approximately \(t\)-distributed with \(11\) degrees of freedom
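\[ p\text{-value} = P(|t|\geq |t_{obs}||H_0) \] \[= 2\left[P(t \geq 1.531 | H_0)\right] = 2(0.077) = 0.154\]
Because the \(p\)-value is greater than \(\alpha = 0.05\), we fail to reject the null hypothesis: there is not sufficient evidence of a difference in mean body length between four-year-old male frogs in the two populations.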
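The unequal-variance standard error and test statistic can be checked with the sketch below. The solution above uses the conservative choice of degrees of freedom, \(\min(n_1, n_2) - 1 = 11\). For comparison only, the code also computes the Welch-Satterthwaite approximation, which is a common alternative that this answer key does not use. The function name `unpooled_t_statistic` is hypothetical.

```python
from math import sqrt


def unpooled_t_statistic(m1, s1, n1, m2, s2, n2):
    """Standard error, t statistic, and two df choices for an unequal-variance t-test."""
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2          # per-sample variance of the mean
    se = sqrt(v1 + v2)
    t_obs = (m1 - m2) / se
    df_conservative = min(n1, n2) - 1            # choice used in the solution above
    # Welch-Satterthwaite approximation (shown for comparison, not used above)
    df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return se, t_obs, df_conservative, df_welch


# Four-year-old males: Servotte (group 1) versus Thevenon (group 2)
se, t_obs, df_conservative, df_welch = unpooled_t_statistic(64.7, 7.3, 12, 61.0, 4.1, 12)
print(f"SE = {se:.3f}, t = {t_obs:.3f}, "
      f"conservative df = {df_conservative}, Welch df = {df_welch:.1f}")
```

The conservative degrees of freedom give a slightly larger \(p\)-value than the Welch-Satterthwaite value would, so the test is less likely to reject \(H_0\).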