\(\bf (1)\) Rocket League is a popular online video game and e-sport that emerged in 2015. The game enjoys a healthy following of around 93 million players per month. It features players from around the world who compete in sports like soccer (football), basketball and hockey while controlling RC-like vehicles. The most popular game mode is doubles soccer, in which two teams of two players try to outscore each other by hitting a soccer ball into the opposing goal. In the course of each match, the player-controlled vehicles often collide in what is commonly referred to as a “bump”. The following histogram shows the distribution of the average number of “bumps” per match for a given player of Rocket League.
\[ z = \frac{423 - 355}{69} \approx 1 \] Therefore, we want to know what proportion of observations have a value \(\geq \bar{x}+1s\), which by the empirical rule is \(13.5\%+2.5\% = 16\%\).
approximately \(95\%\)
at least \(355+2s = 493\) bumps
\[ z = \frac{620 - 355}{69} = 3.84 \ \text{standard deviations}\]
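The arithmetic in problem (1) can be double-checked with a short script. This is only a sketch: it assumes the mean of 355 bumps and standard deviation of 69 bumps used in the formulas above, and the names `z_score` and `UPPER_TAIL` are illustrative rather than part of the original solution.

```python
# Summary statistics taken from the worked answers above (assumed, not recomputed).
MEAN_BUMPS = 355
SD_BUMPS = 69

# Approximate empirical-rule share of observations lying MORE than k standard
# deviations above the mean (only sensible for roughly bell-shaped data).
UPPER_TAIL = {1: 0.135 + 0.025, 2: 0.025}


def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


z_423 = z_score(423, MEAN_BUMPS, SD_BUMPS)
z_620 = z_score(620, MEAN_BUMPS, SD_BUMPS)
two_sd_cutoff = MEAN_BUMPS + 2 * SD_BUMPS

# 423 bumps is about one standard deviation above the mean, so look up k = 1.
share_above_423 = UPPER_TAIL[round(z_423)]

print(f"z(423) = {z_423:.2f}, share above is roughly {share_above_423:.0%}")
print(f"z(620) = {z_620:.2f}")
print(f"mean + 2s = {two_sd_cutoff}")
```

Rounding the first z-score to the nearest whole number before the lookup mirrors the solution's step of treating 423 as "about one standard deviation" above the mean.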
Figure 1 Pinniped whisker morphology - Ginter et al. 2012 | A juvenile Harp Seal |
---|---|
\(\bf (2)\) Pinnipeds (seals and sea lions) possess the largest vibrissae (whiskers) among mammals, and their vibrissal hair shafts demonstrate a diversity of shapes (see Figure 1). In a study conducted by Ginter et al. (2012), researchers measured 9 characteristics of Pinniped whiskers in individuals from 9 species of seals and sea lions. Their goal was to better characterize whisker morphology and evolution. The following data are the recorded whisker lengths of 20 Harp Seals (given in Table 1). Note that the data have been sorted by increasing “Total Length (cm)” for your convenience. Use the table to answer the following questions:
Observation | Whisker Total Length (cm) | Number of Beads |
---|---|---|
1 | 4.18 | 2 |
2 | 4.23 | 2 |
3 | 4.31 | 2 |
4 | 4.31 | 2 |
5 | 4.33 | 2 |
6 | 4.34 | 2 |
7 | 4.35 | 2 |
8 | 4.38 | 2 |
9 | 4.40 | 2 |
10 | 4.41 | 2 |
11 | 4.43 | 3 |
12 | 4.44 | 2 |
13 | 4.46 | 2 |
14 | 4.47 | 3 |
15 | 4.48 | 3 |
16 | 4.51 | 2 |
17 | 4.59 | 3 |
18 | 4.60 | 2 |
19 | 4.62 | 2 |
20 | 5.08 | 3 |
## [1] "comma separated values = 4.18,4.23,4.31,4.31,4.33,4.34,4.35,4.38,4.4,4.41,4.43,4.44,4.46,4.47,4.48,4.51,4.59,4.6,4.62,5.08"
\[\begin{aligned} Q1 &= \frac{4.33+4.34}{2} = 4.335 \\ \text{median} &= \frac{4.41+4.43}{2} = 4.42 \\ Q3 &= \frac{4.48+4.51}{2} = 4.495 \\ IQR &= Q3 - Q1 = 4.495 - 4.335 = 0.16 \\ \bar{x} &= \frac{1}{20}\sum_{i = 1}^{20} x_i = \frac{4.18 + 4.23 + \cdots + 5.08}{20} = 4.446 \\ s &= \sqrt{\frac{1}{19}\sum_{i = 1}^{20}(x_i - \bar{x})^2} = \sqrt{\frac{(4.18 - 4.446)^2 + (4.23 - 4.446)^2 + \cdots + (5.08 - 4.446)^2}{19}} = 0.189 \end{aligned}\]
The sample mean is \(\bar{x} = 4.446\) and the sample standard deviation is \(s = 0.189\), so the longest whisker has a \(z\)-score of \[ z = \frac{5.08 - \bar{x}}{s} = \frac{5.080 - 4.446}{0.189} \approx 3.354 \]
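The summary statistics for problem (2) can be recomputed from the comma-separated values listed above. The sketch below assumes the quartile convention that the worked solution appears to use (the median of the lower half and the median of the upper half of the sorted data); other conventions, including the default in many software packages, can give slightly different quartiles. The helper name `quartiles_by_halves` is illustrative.

```python
import statistics

# Whisker total lengths (cm) for the 20 Harp Seals, already sorted (Table 1).
lengths = [4.18, 4.23, 4.31, 4.31, 4.33, 4.34, 4.35, 4.38, 4.40, 4.41,
           4.43, 4.44, 4.46, 4.47, 4.48, 4.51, 4.59, 4.60, 4.62, 5.08]


def quartiles_by_halves(sorted_values):
    """Return (Q1, median, Q3), taking Q1 and Q3 as the medians of the lower
    and upper halves of the data (the overall median is excluded when n is odd)."""
    n = len(sorted_values)
    half = n // 2
    lower = sorted_values[:half]
    upper = sorted_values[n - half:]
    return (statistics.median(lower),
            statistics.median(sorted_values),
            statistics.median(upper))


q1, median, q3 = quartiles_by_halves(lengths)
iqr = q3 - q1
mean = statistics.mean(lengths)
s = statistics.stdev(lengths)      # sample standard deviation (divides by n - 1)
z_longest = (max(lengths) - mean) / s

print(f"Q1 = {q1:.3f}, median = {median:.3f}, Q3 = {q3:.3f}, IQR = {iqr:.3f}")
print(f"mean = {mean:.3f}, s = {s:.3f}, z(longest) = {z_longest:.3f}")
```

`statistics.stdev` divides by \(n - 1\), matching the denominator of 19 in the formula for \(s\).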
\(\bf (3)\) Use the following pair of boxplots constructed from the Pinniped whisker data in \(\textbf{Table 1}\). The boxplots show the distribution of total whisker length (cm) for whiskers with two beads vs whiskers with three beads. Answer the following questions:
## [1] "-----------------2 bead whisker lengths-----------------"
## [1] "comma separated values = 4.3,4.6,4.5,4.6,4.2,4.3,4.3,4.2,4.4,4.3,4.5,4.3,4.4,4.4,4.4"
## [1] "-----------------3 bead whisker lengths-----------------"
## [1] "comma separated values = 4.6,5.1,4.5,4.4,4.5"
The distribution for 2 beads is approximately symmetric (its boxplot is symmetric, which is consistent with a roughly normal shape), but the distribution for 3 beads is skewed
The mean lengths for whiskers with two beads and three beads are \(\bar{x}_{2beads} = 4.39\) and \(\bar{x}_{3beads} = 4.61\), respectively. The standard deviations in length for whiskers with 2 and 3 beads are \(s_{2beads} = 0.121\) and \(s_{3beads} = 0.270\), respectively. Seal 1 has a \(z\)-score of \[z_1 = \frac{4.32 - \bar{x}_{3beads}}{s_{3beads}} = \frac{4.32 - 4.61}{0.27} = -1.07\] and Seal 2 has a \(z\)-score of \[z_2 = \frac{4.34 - \bar{x}_{2beads}}{s_{2beads}} = \frac{4.34 - 4.39}{0.12} = -0.42\]
Since the data are approximately symmetric (roughly normal), we can use the two-standard-deviations rule, which applies to values with \(x < \bar{x}-2s\) or \(x > \bar{x}+2s\). The whisker would have to be longer than \[ \bar{x}_3 + 2s_3 = 4.608 + 2\times 0.270 = 5.148 \ \text{cm} \] or shorter than \[ \bar{x}_3 - 2s_3 = 4.608 - 2\times 0.270 = 4.068 \ \text{cm}\]
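The group means, standard deviations and two-standard-deviation cutoffs in problem (3) can be reproduced from Table 1. This sketch uses the lengths exactly as printed in the table, so results may differ from the quoted values in the third decimal place if the original analysis used unrounded measurements. The function name `summarize` is illustrative.

```python
import statistics

# (whisker total length in cm, number of beads) pairs copied from Table 1.
table_1 = [
    (4.18, 2), (4.23, 2), (4.31, 2), (4.31, 2), (4.33, 2),
    (4.34, 2), (4.35, 2), (4.38, 2), (4.40, 2), (4.41, 2),
    (4.43, 3), (4.44, 2), (4.46, 2), (4.47, 3), (4.48, 3),
    (4.51, 2), (4.59, 3), (4.60, 2), (4.62, 2), (5.08, 3),
]


def summarize(beads):
    """Mean, sample SD and mean +/- 2 SD cutoffs for whiskers with `beads` beads."""
    values = [length for length, b in table_1 if b == beads]
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return {"n": len(values), "mean": mean, "sd": sd,
            "low": mean - 2 * sd, "high": mean + 2 * sd}


two = summarize(2)
three = summarize(3)

for label, group in (("2 beads", two), ("3 beads", three)):
    print(f"{label}: n = {group['n']}, mean = {group['mean']:.3f}, "
          f"sd = {group['sd']:.3f}, "
          f"cutoffs = ({group['low']:.3f}, {group['high']:.3f})")

# z-scores for the two seals discussed above (4.32 cm with 3 beads, 4.34 cm with 2 beads).
z_seal_1 = (4.32 - three["mean"]) / three["sd"]
z_seal_2 = (4.34 - two["mean"]) / two["sd"]
print(f"z_1 = {z_seal_1:.2f}, z_2 = {z_seal_2:.2f}")
```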
\(\bf (4)\) Use the following plot of the cumulative distribution of a quantitative variable \(x\) to answer questions (a)-(c)
## [1] "Comma separated values = -4.7,-2.5,-2.3,-1.3,-1.3,-1.2,0,0,1,1.8,2.1,2.3,2.6,2.8,2.8,2.9,3.5,3.6,3.7,3.7,4.3,4.8,4.9,5.2,5.3,5.5,6,6.5,6.6,7.1,7.1,7.2,7.4,7.7,7.8,8.1,8.1,8.1,8.5,8.7,9,9,10,10.2,10.2,11,11,11.1,11.1,11.5"
approximately \(-3\) and \(13\). Using the plot above, the answers are approximately \(-2.5\) and \(11.1\). The percentiles differ because the distribution of \(x\) is skewed to the left. The percentiles given by the empirical rule are only approximate for distributions that deviate from normality
if the mean is \(\bar{x} = 5\) and \(s = 4\), then both \(-3\) and \(13\) are two standard deviations from the mean, i.e. \(-3 = \bar{x}-2s\) and \(13 = \bar{x}+2s\). Under the empirical rule, approximately \(95\%\) of the observations will fall in this interval
\[ z_1 = \frac{11.5 - 5}{4} \approx 1.63 \] No, because \(1.63 < 2\) \[ z_2 = \frac{-4.7 - 5}{4} \approx -2.43 \] Yes, because \(-2.43 < -2\)
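The answers to problem (4) can be checked against the listed values. The sketch below assumes the mean of 5 and standard deviation of 4 used in the worked answers, and it assumes that the percentiles in question are the 2.5th and 97.5th, read off with the nearest-rank method (one of several percentile conventions). The helper name `nearest_rank_percentile` is illustrative.

```python
import math

# The 50 sorted observations of x listed above.
x = [-4.7, -2.5, -2.3, -1.3, -1.3, -1.2, 0, 0, 1, 1.8,
     2.1, 2.3, 2.6, 2.8, 2.8, 2.9, 3.5, 3.6, 3.7, 3.7,
     4.3, 4.8, 4.9, 5.2, 5.3, 5.5, 6, 6.5, 6.6, 7.1,
     7.1, 7.2, 7.4, 7.7, 7.8, 8.1, 8.1, 8.1, 8.5, 8.7,
     9, 9, 10, 10.2, 10.2, 11, 11, 11.1, 11.1, 11.5]

MEAN, SD = 5, 4  # values assumed from the worked answers, not recomputed


def nearest_rank_percentile(sorted_values, p):
    """Smallest value with at least a fraction p of the data at or below it."""
    rank = math.ceil(p * len(sorted_values))
    return sorted_values[max(rank, 1) - 1]


# Empirical-rule interval versus the percentiles read from the data.
rule_low, rule_high = MEAN - 2 * SD, MEAN + 2 * SD
data_low = nearest_rank_percentile(x, 0.025)
data_high = nearest_rank_percentile(x, 0.975)
share_inside = sum(rule_low <= v <= rule_high for v in x) / len(x)

print(f"empirical rule: ({rule_low}, {rule_high})")
print(f"from the data:  ({data_low}, {data_high})")
print(f"share of observations inside the rule interval: {share_inside:.0%}")

# Flag the smallest and largest observations if they lie more than 2 SD from the mean.
for value in (max(x), min(x)):
    z = (value - MEAN) / SD
    print(f"x = {value}: z = {z:.3f}, beyond 2 SD: {abs(z) > 2}")
```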
\(\bf (5)\) Define the following terms
Explanatory variable - The independent variable, which we manipulate (in an experiment) or observe (in an observational study) to see how the response variable changes
Response variable - The outcome or dependent variable - the variable on which we make comparisons for different values of the explanatory variable
Experimental study - A study in which the researcher assigns subjects/observations to one or more treatments and observes the outcome of the response variable
Observational study - A nonexperimental study in which the researcher observes values of the response and explanatory variables for each subject/observation
Survey - A type of nonexperimental study in which subjects/observations are selected from a population for the purpose of making inferences about the population
\(\bf (6)\) Name the sampling method used in each of the following situations:
convenience sampling
cluster sampling
stratified random sampling
systematic sampling
simple random sampling
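To make the differences between the five sampling methods concrete, the sketch below draws each kind of sample from a hypothetical population of 100 units split into four sections. The population, the field names `id` and `section`, and the function names are all illustrative assumptions; they are not taken from the original questions.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100 units, each belonging to one of four sections.
population = [{"id": i, "section": "ABCD"[i % 4]} for i in range(100)]


def convenience_sample(units, k):
    """Take whichever units are easiest to reach (here, simply the first k)."""
    return units[:k]


def simple_random_sample(units, k):
    """Every subset of size k is equally likely to be chosen."""
    return rng.sample(units, k)


def systematic_sample(units, step):
    """Pick a random starting point, then take every `step`-th unit."""
    start = rng.randrange(step)
    return units[start::step]


def stratified_sample(units, key, k_per_stratum):
    """Split into strata by `key`, then draw a random sample within EVERY stratum."""
    strata = {}
    for unit in units:
        strata.setdefault(unit[key], []).append(unit)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, k_per_stratum))
    return sample


def cluster_sample(units, key, n_clusters):
    """Randomly choose whole clusters and keep ALL units in the chosen clusters."""
    clusters = sorted({unit[key] for unit in units})
    chosen = set(rng.sample(clusters, n_clusters))
    return [unit for unit in units if unit[key] in chosen]


conv = convenience_sample(population, 10)
srs = simple_random_sample(population, 10)
syst = systematic_sample(population, 10)
strat = stratified_sample(population, "section", 3)
clus = cluster_sample(population, "section", 2)

print(len(conv), len(srs), len(syst), len(strat), len(clus))
```

The key contrast is between the last two functions: stratified sampling takes some units from every group, while cluster sampling takes every unit from some groups.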