Define the following terms/concepts
a.) Describe the central limit theorem and its significance to the sampling distribution for a sample proportion or mean
b.) Margin of error
c.) Describe the “basic form” of a confidence interval
d.) Confidence level
e.) Critical value
f.) When are \(t\)-distribution and standard normal distribution approximately the same shape?
g.) How are the degrees of freedom of a \(t\)-distribution related to its shape?
h.) Significance level
i.) Rejection region
\(\bf (1)\) Use the appropriate table to find the probability of each \(z\) or \(t\) score.
a.) \(P(Z > 1.3)\)
b.) \(P(Z \leq 0.33)\)
c.) \(P(Z \leq -2.1)\)
d.) \(P(t > 1.97)\) with 5 degrees of freedom
e.) \(P(t \leq 2.9)\) with 2 degrees of freedom
f.) \(P(t \geq -0.98)\) with 8 degrees of freedom
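A table lookup for the \(z\) parts can be cross-checked with software. The sketch below assumes Python 3.8 or later and uses only the standard library's `statistics.NormalDist`; the variable names are illustrative. The standard library has no \(t\) distribution, so parts (d) through (f) still need a \(t\) table or another tool.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

p_a = 1 - Z.cdf(1.3)  # P(Z > 1.3): upper tail, so subtract the CDF from 1
p_b = Z.cdf(0.33)     # P(Z <= 0.33): lower tail, the CDF itself
p_c = Z.cdf(-2.1)     # P(Z <= -2.1): lower tail, the CDF itself

for label, p in [("a", p_a), ("b", p_b), ("c", p_c)]:
    print(f"({label}) {p:.4f}")
```

Printed tables usually round to four decimal places, so small differences in the last digit are expected.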
\(\bf (2)\) Fill in the table below with the corresponding \(\alpha\) level and critical value for each confidence interval for a population proportion.
Confidence Level | \(\alpha\) | critical value: \(z_{1-\alpha/2}\) |
---|---|---|
\(90\%\) | | |
\(88\%\) | | |
\(97\%\) | | |
\(96\%\) | | |
\(78\%\) | | |
\(99\%\) | | |
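The relationship between a confidence level, \(\alpha\), and \(z_{1-\alpha/2}\) can be sketched in a few lines. This assumes Python 3.9 or later with the standard library's `statistics.NormalDist`; the helper name `z_critical` is hypothetical and is meant only for checking table lookups.

```python
from statistics import NormalDist


def z_critical(confidence: float) -> tuple[float, float]:
    """Return (alpha, z_{1 - alpha/2}) for a two-sided confidence level."""
    alpha = 1 - confidence
    # The critical value leaves alpha/2 in the upper tail of the standard normal.
    return alpha, NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.88, 0.97, 0.96, 0.78, 0.99):
    alpha, z = z_critical(level)
    print(f"{level:.0%}  alpha = {alpha:.2f}  z = {z:.3f}")
```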
\(\bf (3)\) Fill in the table below with the corresponding \(\alpha\) level and critical value for each confidence interval for a population mean.
Confidence Level | Sample size | \(\alpha\) | critical value: \(t_{n-1, 1-\alpha/2}\) |
---|---|---|---|
\(92\%\) | \(n = 5\) | | |
\(86\%\) | \(n = 8\) | | |
\(98\%\) | \(n = 11\) | | |
\(99.9\%\) | \(n = 14\) | | |
\(82\%\) | \(n = 40\) | | |
\(95\%\) | \(n = 13\) | | |
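Several of these confidence levels do not appear in a typical printed \(t\) table, so a numerical check can help. Python's standard library has no \(t\) distribution, so the sketch below approximates it: it integrates the \(t\) density with Simpson's rule and inverts the result by bisection. All function names are hypothetical, and this is an approximation for checking answers rather than a reference implementation.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_cdf(x: float, df: int) -> float:
    """Approximate P(T <= x) using Simpson's rule on [0, |x|] plus symmetry."""
    a = abs(x)
    steps = 2 * max(50, int(a * 50))  # even number of steps, width about 0.01 or less
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area under the density between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area


def t_quantile(p: float, df: int) -> float:
    """Approximate the value q with P(T <= q) = p by bisection."""
    lo, hi = -1000.0, 1000.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def t_critical(confidence: float, n: int) -> float:
    """Critical value t_{n-1, 1-alpha/2} for a two-sided interval."""
    alpha = 1 - confidence
    return t_quantile(1 - alpha / 2, n - 1)


for level, n in [(0.92, 5), (0.86, 8), (0.98, 11), (0.999, 14), (0.82, 40), (0.95, 13)]:
    print(f"{level:.1%}  n = {n}  alpha = {1 - level:.3f}  t = {t_critical(level, n):.3f}")
```

The same `t_cdf` helper can also be used to check the \(t\) probabilities in problem (1).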
\(\bf (4)\) A survey conducted by the state of Georgia polled college students as part of a larger effort to characterize student behavior (Agresti and Franklin 2007). One of the questions in the survey asked students to report their political party affiliation. The table below summarizes the count and proportion of students who reported being Democrat, Republican, or Independent.
Political Party Affiliation | Count | \(\hat{p}\) |
---|---|---|
Democrat | 8 | 0.14 |
Republican | 36 | 0.61 |
Independent | 15 | 0.25 |
Using the table above, construct a \(90\%\) confidence interval for the proportion of Independent college student voters in the state of Georgia.
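A hand computation for this interval can be checked with a short script. The sketch below assumes the large-sample interval \(\hat{p} \pm z_{1-\alpha/2}\sqrt{\hat{p}(1-\hat{p})/n}\), takes \(n = 8 + 36 + 15 = 59\) from the table, and uses only the Python standard library; the function name `proportion_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes: int, n: int, confidence: float) -> tuple[float, float]:
    """Large-sample (z-based) confidence interval for a population proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{1 - alpha/2}
    margin = z * sqrt(p_hat * (1 - p_hat) / n)          # margin of error
    return p_hat - margin, p_hat + margin


# 15 Independents out of 59 respondents, 90% confidence
low, high = proportion_ci(15, 59, 0.90)
print(f"({low:.3f}, {high:.3f})")
```

The script uses the unrounded \(\hat{p} = 15/59\), so its endpoints can differ in the third decimal place from a calculation that starts from the rounded table value \(0.25\).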
\(\bf (5)\) As part of an effort to improve health services, a health club in the United Kingdom asked club members to complete an online survey (de Vries 2023). One of the survey questions asked customers to rate their satisfaction level on a Likert scale. The results of this survey question are summarized below.
Response | Count | \(\hat{p}\) |
---|---|---|
Completely dissatisfied | 0 | 0.00 |
Very dissatisfied | 0 | 0.00 |
Dissatisfied | 3 | 0.01 |
Neutral | 20 | 0.09 |
Satisfied | 64 | 0.30 |
Very satisfied | 109 | 0.51 |
Completely satisfied | 19 | 0.09 |
Using the table above, construct a \(95\%\) confidence interval for the proportion of customers that are “very satisfied” or more.
\(\bf (6)\) The following table summarizes variables from a random sample of \(25\) observations from a dataset on California housing taken from the 1990 census. The full dataset is featured in Aurélien Géron’s book ‘Hands-On Machine Learning with Scikit-Learn and TensorFlow’ (Géron 2022). Use the table to answer parts (a) and (b).
Variable | Sample Mean | Sample Standard Deviation |
---|---|---|
Median Age of House Owner | 24.28 | 13.56 |
Total Rooms (per block) | 3263.72 | 2105.99 |
Total Bedrooms (per block) | 640.52 | 380.55 |
Population of District | 1861.68 | 1193.30 |
Number of Households (per block) | 584.24 | 330.38 |
Median Family Income (in tens of thousands of USD) | 3.74 | 1.54 |
Median House value (USD) | 179320.04 | 109450.56 |
a.) Construct and interpret a \(95\%\) confidence interval for the average median family income for households in California in 1990
b.) Construct and interpret a \(99\%\) confidence interval for the average number of bedrooms per block in California in 1990
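As a starting point for part (a), the setup can be written out as follows, assuming the one-sample \(t\) interval implied by the critical-value notation \(t_{n-1, 1-\alpha/2}\) in problem (3):

\[
\bar{x} \pm t_{n-1,\,1-\alpha/2}\,\frac{s}{\sqrt{n}}
\quad\Longrightarrow\quad
3.74 \pm t_{24,\,0.975}\,\frac{1.54}{\sqrt{25}} = 3.74 \pm t_{24,\,0.975}\,(0.308)
\]

The critical value \(t_{24,\,0.975}\) still has to be read from a \(t\) table. Part (b) follows the same pattern with \(\alpha = 0.01\) and the bedroom row of the table.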
\(\bf (7)\) According to a 2023 study published by the Saudi Heart Association, a survey of college students from Taibah University in Medina, Saudi Arabia, found that \(24\%\) of students reported using e-cigarettes (Alzahrani et al. 2023). Assuming a comparable rate of e-cigarette use in the U.S., what sample size is required to estimate the proportion of college students who use e-cigarettes at the \(95\%\) confidence level and within a margin of error of \(1.5\%\)?
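The sample-size calculation can be sketched in code, assuming the usual formula \(n = z_{1-\alpha/2}^2\,p(1-p)/E^2\) rounded up to the next whole number, where \(E\) is the margin of error. The function name `sample_size_for_proportion` is hypothetical, and only the Python standard library is used.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(p_guess: float, margin: float, confidence: float) -> int:
    """Smallest n whose z-based margin of error is at most `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{1 - alpha/2}
    # Round up: a fractional observation is impossible, and rounding down
    # would leave the margin of error slightly too large.
    return ceil(z ** 2 * p_guess * (1 - p_guess) / margin ** 2)


n_needed = sample_size_for_proportion(0.24, 0.015, 0.95)
print(n_needed)
```

The margin of error is entered as a proportion (\(0.015\)), not as a percentage.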
\(\bf (8)\) Recent concerns over the rise in global infertility have drawn significant attention to the harmful effects of microplastics, PFAS, and highly processed foods (Zhang et al. 2022; Agarwal et al. 2015). A study published in 2002 found that approximately \(15\%\) of couples worldwide experienced infertility (Sharlip et al. 2002). Assuming a similar rate of infertility in the U.S., compute the sample size needed to estimate the infertility rate among U.S. couples at the \(99\%\) confidence level and within a margin of error of \(4\%\).
\(\bf (9)\) Describe the assumptions associated with estimating \(p\) with \(\hat{p}\)
\(\bf (10)\) Describe the assumptions associated with estimating \(\mu\) with \(\bar{x}\)
\(\bf (11)\) A pharmaceutical company produces a generic version of the pain reliever ibuprofen, marketing a tablet with a 200 milligram dose. Concerned about the accuracy of the dosing process, the manufacturer suspects that the machine filling the tablets may be malfunctioning, leading to a smaller dose in each tablet. The manufacturer wishes to conduct a significance test to determine if the dose is significantly lower than 200 milligrams. State the null and alternative hypotheses for this test.
\(\bf (12)\) A financial institution invests in a pharmaceutical company that produces a widely prescribed antidepressant medication and whose stock has a market value of $100 per share. Worried about potential discrepancies in the market valuation, the institution suspects that recent market trends may have led to an underestimation of the stock’s value. The institution plans to conduct a significance test to determine if the stock’s value has significantly increased. State the null and alternative hypotheses for this test.
\(\bf (13)\) A semiconductor manufacturing company produces microchips with a target defect rate of 0.1% per batch. Concerned about the quality control process, the company suspects that recent changes in manufacturing procedures may have resulted in a change in the defect rate. The company intends to conduct a significance test to determine if the defect rate is significantly different from the target rate of 0.1%. State the null and alternative hypotheses for this test.
\(\bf (14)\) For each set of hypotheses, significance level, and \(p\)-value, state whether the test rejects or fails to reject the null hypothesis.
a.) \(H_0: \mu = 0\); \(H_A: \mu \neq 0\); \(\alpha = 0.01\); \(p\)-value \(= 0.0098\)
b.) \(H_0: p = 0.5\); \(H_A: p > 0.5\); \(\alpha = 0.05\); \(p\)-value \(= 0.086\)
c.) \(H_0: \mu = 100\); \(H_A: \mu < 100\); \(\alpha = 0.001\); \(p\)-value \(= 0.0015\)
d.) \(H_0: p = 0.9\); \(H_A: p \neq 0.9\); \(\alpha = 0.1\); \(p\)-value \(= 0.053\)
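The decision rule behind these parts fits in a single comparison. The sketch below assumes the convention that the test rejects when the \(p\)-value is less than or equal to \(\alpha\) (some texts use a strict inequality); the function name `decide` is hypothetical.

```python
def decide(p_value: float, alpha: float) -> str:
    """Compare a p-value with the significance level and report the decision."""
    # Note that the form of the alternative hypothesis does not enter here:
    # it is already accounted for in how the p-value was computed.
    return "reject H0" if p_value <= alpha else "fail to reject H0"


print(decide(0.0098, 0.01))  # part (a)
```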
\(\bf (15)\) For each set of hypotheses, significance level, sample size, and test statistic, give the critical value and state whether the test rejects or fails to reject the null hypothesis.
a.) \(H_0: \mu = 15\); \(H_A: \mu > 15\); \(n = 33\); \(\alpha = 0.05\); \(t_{obs} = 2.13\)
b.) \(H_0: \mu = 0\); \(H_A: \mu \neq 0\); \(n = 14\); \(\alpha = 0.01\) ; \(t_{obs} = -4.1\)
c.) \(H_0: p = 0.5\); \(H_A: p < 0.5\); \(n = 120\); \(\alpha = 0.1\); \(Z_{obs} = -1.5\)
d.) \(H_0: p = 0.05\); \(H_A: p \neq 0.05\); \(n = 60\); \(\alpha = 0.03\); \(Z_{obs} = 1.95\)
\(\bf (16)\) A toy manufacturer claims that \(50\%\) of their toy robots are defect-free. However, a quality control inspector suspects that the actual proportion of defect-free robots is different from what the manufacturer claims. To investigate, the inspector randomly selects \(100\) toy robots from a production batch and examines them for defects. After the inspection, the inspector finds that \(40\) out of the \(100\) toy robots are defect-free. To test their suspicion, they set up a two-tailed hypothesis test with a significance level of \(\alpha = 0.1\) and hypotheses \(H_0: p_0 = 0.5\) (manufacturer’s claim) and \(H_A: p \neq p_0\).
\(\bf (17)\) A pharmaceutical company has developed a new drug designed to cure a specific bacterial infection. They claim that their drug is effective, with a cure rate of \(20\%\). However, a group of independent researchers believes that the cure rate is less than what the pharmaceutical company claims. To investigate, the researchers conduct a clinical trial on \(300\) patients suffering from the bacterial infection. After the trial, they find that \(70\) out of the \(300\) patients were cured using the new drug. To test their suspicion, they set up a hypothesis test with a significance level of \(\alpha = 0.05\) and hypotheses \(H_0: p_0 = 0.2\), \(H_A: p < p_0\).
a.) Compute the critical value for this test
b.) Compute the test statistic \(Z_{obs}\)
c.) Compute the \(p\)-value
d.) Is there enough evidence to conclude that the cure rate of the new drug is less than the rate claimed by the pharmaceutical company?
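Parts (a) through (c) can be checked numerically. The sketch below assumes the standard one-proportion \(z\) test with \(Z_{obs} = (\hat{p} - p_0)/\sqrt{p_0(1-p_0)/n}\) and uses only the Python standard library; the function name `one_proportion_z_test` and its `alternative` argument are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z_test(successes: int, n: int, p0: float, alternative: str) -> tuple[float, float]:
    """Return (Z_obs, p-value) for a one-sample test of a proportion."""
    p_hat = successes / n
    # The standard error uses the null value p0, not p_hat.
    z_obs = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    std_normal = NormalDist()
    if alternative == "less":
        p_value = std_normal.cdf(z_obs)
    elif alternative == "greater":
        p_value = 1 - std_normal.cdf(z_obs)
    elif alternative == "two-sided":
        p_value = 2 * (1 - std_normal.cdf(abs(z_obs)))
    else:
        raise ValueError("alternative must be 'less', 'greater', or 'two-sided'")
    return z_obs, p_value


# Lower-tailed test: the critical value cuts off alpha = 0.05 in the left tail.
z_crit = NormalDist().inv_cdf(0.05)
z_obs, p_value = one_proportion_z_test(70, 300, 0.2, "less")
print(f"critical value = {z_crit:.3f}, Z_obs = {z_obs:.3f}, p-value = {p_value:.4f}")
```

For a lower-tailed test the null hypothesis is rejected only when \(Z_{obs}\) falls below the critical value, so the sign of \(Z_{obs}\) matters when interpreting the output.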
\(\bf (18)\) A soft drink company claims that their new “Zero Sugar” soda contains zero grams of added sugar per 12-ounce can. To test this claim, a random sample of \(25\) cans of this soda is selected. The sample mean is found to be \(1.3\) grams of added sugar per can, with a standard deviation of \(0.5\) grams. To determine if there is enough evidence to reject the company’s claim and conclude that the soda does not contain zero grams of added sugar per can, a group of researchers are interested in testing the following hypotheses: \(H_0: \mu_0 = 0\), \(H_A: \mu \neq \mu_0\) at the \(\alpha = 0.01\) significance level.
\(\bf (19)\) A group of botanists is studying the growth rate of Venus fly traps in a controlled greenhouse environment. Based on historical data, the typical growth rate of Venus fly traps is believed to be \(100\) millimeters per year. The botanists suspect that the current growth rate in their greenhouse is higher than this historical value. To investigate, they take a random sample of \(40\) Venus fly traps and measure their growth rates over a year. After the study, they find that the sample mean growth rate for the \(40\) Venus fly traps is \(105\) millimeters with a standard deviation of \(12\) millimeters. To test their suspicion, they set up a hypothesis test with a significance level of \(\alpha = 0.05\).
a.) State the null and alternative hypotheses
b.) Compute the critical value for this test
c.) Compute the test statistic \(t_{obs}\)
d.) Compute the \(p\)-value
e.) Is there enough evidence to conclude that the current growth rate of Venus fly traps in the greenhouse is higher than the historical value of \(100\) millimeters per year?
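For part (c), the arithmetic can be laid out as follows, assuming the usual one-sample \(t\) statistic with \(n - 1 = 39\) degrees of freedom:

\[
t_{obs} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}} = \frac{105 - 100}{12/\sqrt{40}} \approx \frac{5}{1.897} \approx 2.64
\]

The critical value in part (b) and the \(p\)-value in part (d) then come from the upper tail of the \(t\)-distribution with \(39\) degrees of freedom.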