Examtest with the 9th edition of Statistics for Business and Economics by Newbold

How to describe data graphically? - ExamTests 1
How to describe data numerically? - ExamTests 2
How to use probability calculation? - ExamTests 3
How to use probability models for discrete random variables? - ExamTests 4
How to use probability models for continuous random variables? - ExamTests 5
How to obtain a proper sample from a population? - ExamTests 6
How to obtain estimates for a single population? - ExamTests 7
How to estimate parameters for two populations? - ExamTests 8
How to develop hypothesis testing procedures for a single population? - ExamTests 9
What test procedures are there for testing the difference between two populations? - ExamTests 10
How to conduct a simple regression? - ExamTests 11
How to conduct a multiple regression? - ExamTests 12
What other topics are important in regression analysis? - ExamTests 13
How to analyze categorical data? - ExamTests 14
How to conduct an analysis of variance? - ExamTests 15
How to analyze data sets with measurements over time? - ExamTests 16
What other sampling procedures are available? - ExamTests 17

How to describe data graphically? - ExamTests 1

Questions

Question 1

Indicate whether each of the following variables is categorical or numeric. If the variable is categorical, specify the measurement level. If the variable is numeric, specify the measurement level and indicate whether the variable is discrete or continuous:

The number of shares of a stock purchased by a broker.
The nationality of a student.
The grade point average of a student.
The temperature in degrees Celsius.

Question 2

Upon visiting a newly opened H&M store, customers were given a brief survey. Is the answer to each of the following questions categorical or numerical? If categorical, give the level of measurement. If numerical, is it discrete or continuous?

Is this your first visit to this H&M store?
On a scale from 1 (very dissatisfied) to 5 (very satisfied), how satisfied are you with today's purchase(s)?
What was the cost of your purchase(s)?

Question 3

Tourists visiting Croatia are asked to fill in a survey. The survey consists of various questions about how they experienced their holiday. Describe for each question the type of data obtained.

Question	Type of data
Which of the following areas did you visit? Coast. Islands. Mountains. The capital (Zagreb).
Did you rent a sailing boat? Yes. No.
What was the average amount of money you spent on food per day?
What would you recommend as the optimal number of days for tourists to spend in Croatia?
How often would you recommend visiting Croatia? Every year. Once every five years. Once in a lifetime. Never.

Question 4a

An administrator examines the travel expenses of faculty members who attended various professional meetings. He found that 36% of the travel expenses was spent on transportation, 17% on accommodation, 13% on food, 9% on conference fees, 10% on registration costs, and the remainder on miscellaneous costs.

Construct a pie chart for these data.

Question 4b

Construct a bar chart for these data.

Question 5

A company has defined seven codes for possible defects for one of its products. Construct a Pareto diagram for the following frequencies:

Defect code	A	B	C	D	E	F	G
Frequency	10	70	15	90	8	4	3

Question 6

Construct a time-series plot for the following data of customers shopping at a new mall during a particular week.

Day	Number of customers
Monday	516
Tuesday	534
Wednesday	451
Thursday	487
Friday	558
Saturday	641
Sunday	830

Question 7

Determine an appropriate interval width for a random sample of 370 observations with scores that fall between 40 and 200.

Question 8a

Construct a stem-and-leaf display for the following data.

17	16	15	17	17
20	30	25	25	14
12	18	31	26	26
12	15	16	16	28

Question 8b

Construct a histogram for these data.

Question 8c

Is the distribution of these data symmetric, right-skewed, or left-skewed?

Question 9

Prepare a scatter plot of the following data:

(3, 10).
(2, 8).
(3, 12).
(4, 15).
(6, 20).
(5, 15).
(4, 12).

Question 10a

The following table shows the age of faculty members who have obtained a PhD degree from the largest university in the Netherlands.

Age	Percent
26 - 28	18.00
29 - 32	23.50
33 - 40	30.51
41 - 55	12.99
56+	15.00

What percent of faculty members who obtained a PhD are 41 years or older?

Question 10b

What percent of faculty members who obtained a PhD are under the age of 33 years?

Question 10c

Construct a relative cumulative frequency distribution of the data.

Question 10d

Suppose, we have 200 observations. What are the cumulative frequencies for the data described?

Question 10e

Interpret the cumulative frequencies.

Question 11

The following data are presented:

Age	30 -40	40 -50	50 - 60	60 - 70
Number	12	13	22	34

Describe possible errors in this table.

Question 12

Suppose, the amount of money a person spends on movie tickets each month (in euros) is:
6.0, 5.3, 4.0, 5.7, 10.0, 8.4, 2.5, 10.0, 9.5, 0.0, 5.0, 10.0
What graph would you use to visually display these data?

Question 13

In Germany, it was found that 32% of shoppers with incomes less than 50,000 shop online. Of the remaining 68%, half of the individuals never shop, and the other half shops by going to the actual store. Use a pie chart to plot this data.

Question 14a

Four types of checking accounts are offered by a bank. Suppose, a random sample of 300 customers were surveyed and asked some questions. It was found that 60% of the respondents preferred "Easy Checking", 12% preferred "Intelligent Checking", 18% preferred "Super Checking", and the remainder preferred "Ultimate Checking". Of the participants who selected Easy Checking, 100 were females. Of those who selected Intelligent Checking, a third was female. Of those who selected Super checking, half was female. Finally, of those who selected Ultimate Checking, 80% was female. Describe the data with a cross table.

Question 14b

How many females are there in total, and how many males?

Question 14c

What type of graph is appropriate for these data?

Histogram.
Scatter plot.
Time-series plot.
Bar chart.

Question 15

What type of graph is most appropriate for two numerical variables?

Answer indication

Question 1

The number of shares of a stock purchased by a broker: Numerical; ratio; discrete.
The nationality of a student: Categorical; nominal
The grade point average of a student: Numerical; ratio; continuous.
The temperature in degrees Celsius: Numerical; interval; continuous.

Question 2

Categorical; nominal.
Categorical; ordinal.
Numerical; continuous.

Question 3

Question	Type of data
Which of the following areas did you visit? Coast. Islands. Mountains. The capital (Zagreb).	Categorical (nominal; each area is binary coded as yes/no). The number of areas that one visited can also be treated as numerical (discrete).
Did you rent a sailing boat? Yes. No.	Categorical; nominal; binary coded.
What was the average amount of money you spent on food per day?	Numerical; ratio; continuous.
What would you recommend as the optimal number of days for tourists to spend in Croatia?	Numerical; ratio; discrete.
How often would you recommend visiting Croatia? Every year. Once every five years. Once in a lifetime. Never.	Categorical; ordinal.

Question 4a

No answer indication available.

Question 4b

No answer indication available.

Question 5

No answer indication available.

Question 6

Note that the time points on the horizontal axis consist of numbers. These could of course also be replaced by the days (Monday - Sunday).

Question 7

According to the quick guide, a sample size of 370 can be approximated by eight to ten classes.
Using the formula for interval width yields:
w = (200 - 40) / 8 = 20; or
w = (200 - 40) / 10 = 16
Thus, an appropriate interval width lies somewhere between 16 and 20.
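The calculation above can be sketched in a few lines of Python. The helper name `interval_width` is hypothetical, and the class counts of 8 and 10 are taken from the quick guide mentioned in the answer.

```python
# Hypothetical helper: class width from the data range and a chosen number of classes.
def interval_width(low, high, classes):
    return (high - low) / classes

# Scores range from 40 to 200; the quick guide suggests 8 to 10 classes for n = 370.
widths = [interval_width(40, 200, k) for k in (8, 10)]
print(widths)
```

Any round width between the two results is a reasonable choice in practice.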

Question 8a

1 | 2, 2, 4, 5, 5, 6, 6, 6, 7, 7, 7, 8.
2 | 0, 5, 5, 6, 6, 8.
3 | 0, 1.

Question 8b

No answer indication available.

Question 8c

Right skewed (positively skewed); the tail is at the right side of the distribution.

Question 9

No answer indication available.

Question 10a

12.99 + 15.00 = 27.99%

Question 10b

18.00 + 23.50 = 41.50%

Question 10c

Age	Cumulative percent
26 - 28	18.00
29 - 32	41.50
33 - 40	72.01
41 - 55	85.00
56+	100.00

Question 10d

The cumulative frequencies for 200 observations are: 36, 83, 144, 170, 200.

Question 10e

For sample size n = 200, there are 36 individuals that obtained a PhD between the age of 26 and 28. There are 83 individuals that obtained a PhD before the age of 33. There are 144 individuals that obtained a PhD before the age of 41, and so forth.
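The step from class percentages to cumulative percentages and cumulative frequencies can be sketched as follows. This is a minimal illustration that assumes the percentages from the table in question 10a and rounds each cumulative frequency to a whole number of observations.

```python
from itertools import accumulate

# Class percentages from the table in question 10a.
percents = [18.00, 23.50, 30.51, 12.99, 15.00]
n = 200  # number of observations assumed in question 10d

# Running totals of the percentages (rounded to remove floating-point noise).
cumulative_percent = [round(p, 2) for p in accumulate(percents)]

# Convert each cumulative percentage to a count out of n observations.
cumulative_freq = [round(p / 100 * n) for p in cumulative_percent]

print(cumulative_percent)
print(cumulative_freq)
```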

Question 11

A possible error lies in the boundaries of the frequency classes. First, there is no class below 30 or above 70, which (possibly) excludes some observations. Second, the classes overlap, so it is unclear from this frequency distribution to which class boundary values such as 40 and 50 belong.

Question 12

A time-series plot would be appropriate here. Data are given for t number of time points, with t = 12.

Question 13

No answer indication available.

Question 14a

Type of checking account	Female	Male	Total
Easy Checking	100	80	180
Intelligent Checking	12	24	36
Super checking	27	27	54
Ultimate Checking	24	6	30
Total	163	137	300

Question 14b

There are 163 females and 137 males in the sample of 300 participants.

Question 14c

D, a bar chart. The other graphs are appropriate in the event of numerical variables. Here, we have frequencies for two categorical variables. This is best displayed by a bar chart (or pie chart).

Question 15

A scatter plot.

How to describe data numerically? - ExamTests 2

Questions

Question 1

A random sample of five numbers was drawn:

18 71 80 80 84

Compute the mean, median, and mode.

Question 2

The number of cars crossing the border between Israel and Jordan is recorded. Over a 6-day period, the following number of cars for each day is found:

16 21 12 19 1 2

Compute the mean, median, and mode.

Question 3a

The records of the University of Groningen over a 12-year period show the following percentage increase in the number of students enrolled:

4.1 3.2 3.5 4.5 5.1 3.8

2.1 2.2 3.1 5.1 1.5 1.0

Compute the mean increase in the number of students enrolled.

Question 3b

Compute the median increase in the number of students enrolled.

Question 3c

Find the mode.

Question 4a

The finances over the past decade are reviewed. The records are shown per year.

2.51 3.74 4.15 5.33 6.18

6.65 7.18 6.92 6.95 7.54

Calculate the mean.

Question 4b

Calculate the median.

Question 5a

During the past years, many countries faced depopulation. We collected the number of elementary schools that were closed for ten countries:

10 6 13 5 11 5 6 3 7 9

Find the mean, median, and mode of the number of schools closed.

Question 5b

Find the five-number summary.

Question 6

A textile manufacturer obtains a sample of 50 bolts of cloth and carefully inspects each bolt. Based on this inspection, the manufacturer records the number of imperfections. The following frequency table is obtained:

Number of imperfections	0	1	2	3
Number of bolts	33	12	4	1

Calculate the mean, median, and mode for these sample data.

Question 7

Compute the variance and standard deviation of the following sample data:

6 8 10 12 14 9 11 7 13 11

Question 8

Compute the variance and standard deviation of the following sample data:

5 -3 0 2 -1 7 4

Question 9

Consider two different investments, stock A and stock B. The mean closing price for stock A is 4.00 and the mean closing price for stock B is 80.00. The mean rate of return is the same for both stock A and stock B. We might think that stock B is more volatile than stock A. Now, suppose the standard deviations were found to be considerably different, with S_A = 2.00 and S_B = 8.00. Compute the coefficient of variation for these sample data and compare these competing investment opportunities.

Question 10

Calculate the coefficient of variation for the following data:

13 15 12 14 11

Question 11a

A set of data is mounded (bell-shaped) with a mean of 300 and a variance of 144.

Approximately what proportion of observations is greater than 288?

Question 11b

Approximately what proportion of observations is less than 324?

Question 11c

Approximately what proportion of observations is greater than 336?

Question 12a

The number of cars that pass through a tunnel during a period of 35 are as follows:

60 70 74 56 84 54 50

47 80 71 50 95 121 90

75 84 70 61 110 64 80

85 85 43 76 60 91 90

60 87 110 85 44 94 69

What is the mean number of cars?

Question 12b

What is the standard deviation?

Question 12c

What is the coefficient of variation?

Question 12d

Construct a stem-and-leaf display of the number of cars that pass through the tunnel. Next, find the interquartile range.

Question 12e

Provide the five-number summary for the sample data.

Question 13a

The daily exchange rate from EUR to USD for six business days is:

1.14 1.14 1.13 1.13 1.12 1.11

Over the same period, the daily exchange rate from EUR to JPY is:

110 110 109 109 108 109

Compare the means of these two distributions.

Question 13b

Compare the standard deviations of these two distributions.

Question 14a

A company produces light bulbs with a mean lifetime of 1,200 hours and a standard deviation of 50 hours. Find the z-score for a light bulb that lasts only 1,120 hours.

Question 14b

Consider the z-score computed in question 14a. What percentage of light bulbs lasts longer than 1,120 hours?

Question 14c

Consider again the mean and standard deviation from question 14a. Find the z-score corresponding to a light bulb that lasts 1,300 hours.

Question 14d

What percentage of light bulbs lasts longer than 1,300 hours?

Question 15a

Suppose that a student completed courses for 15 ECTS in total during his first semester of college. He received one A, one B, one C, and one D. Now, suppose that a value of 4 is assigned to an A, a value of 3 is assigned to a B, a value of 2 is assigned to a C, and a value of 1 is assigned to a D. Calculate the student's semester GPA.

Question 15b

Now, however, each course is not worth the same number of credit hours. The A was earned in a 3-credit English course, the B was earned in a 3-credit biology course, the C was earned in a 4-credit biology course, and the D was earned in a 5-credit Spanish course. Using these weights, calculate the student's weighted semester GPA.

Question 16a

Consider the following data:

x_i	w_i
4.7	8
3.8	7
5.7	4
2.6	3
5.5	2

What is the arithmetic mean of the x_i values?

Question 16b

What is the weighted mean of the x_i values?

Question 16c

What is the sample variance?

Question 16d

What is the sample standard deviation?

Question 17a

Consider the following data:

(15,45) (6,18) (11,33) (12,36) (16,48) (14,42)

(5,15) (17,51) (4,12) (19,57) (7,21)

Compute the covariance.

Question 17b

Compute the correlation coefficient.

Question 17c

Draw a scatter plot to display the relationship between the two variables.

Question 18a

Consider the following data:

Quiz score (x)	4	3.4	3	5	1.1
Exam score (y)	100	66	78	80	30

Compute the covariance.

Question 18b

Compute the correlation coefficient.

Answer indication

Question 1

Mean = (18+71+80+80+84)/5 = 333/5 = 66.6; median = 80; mode = 80.

Question 2

Mean = (16+21+12+19+1+2)/6 = 11.8; median = (12+16)/2 = 14; there is no mode.

Question 3a

Mean = (4.1 + 3.2 + 3.5 + 4.5 + 5.1 + 3.8 + 2.1 + 2.2 + 3.1 + 5.1 + 1.5 + 1.0) / 12 = 3.3.

Question 3b

Median = (3.2 + 3.5) / 2 = 3.35.

Question 3c

Mode = 5.1.

Question 4a

Mean = (2.51 + 3.74 + 4.15 + 5.33 + 6.18 + 6.65 + 7.18 + 6.92 + 6.95 + 7.54) / 10 = 5.7.

Question 4b

Median = 6.4.

Question 5a

Mean = 7.5; median = 6.5; the data are bimodal, with modes 5 and 6 (each occurs twice).

Question 5b

For the five number summary, order the data in ascending order, that is:

3 5 5 6 6 7 9 10 11 13

Q1 is the value located in the 0.25(10+1)th ordered position, that is the 2.75th position.
The second value is 5, the third value is also 5.
Q1 = 5 + 0.75(5 - 5)
Q1 = 5 + 0
Q1 = 5

Q3 is the value located in the 0.75(10+1)th ordered position, that is the 8.25th position.
The eighth value is 10, the ninth value is 11.
Q3 = 10 + 0.25(11 - 10)
Q3 = 10 + 0.25
Q3 = 10.25

Thus, the five number summary is: 3 (minimum); 5 (Q1); 6.5 (median); 10.25 (Q3); 13 (maximum).
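The position rule used here (take the value at the p(n+1)th ordered position and interpolate between neighbours) can be sketched in Python. The function name `quantile_n_plus_1` is hypothetical, and the sketch assumes the position falls between 1 and n, which holds for the quartiles of this sample. Other software may use a different quartile convention and return slightly different values.

```python
def quantile_n_plus_1(sorted_data, p):
    """Value at ordered position p*(n+1), with linear interpolation.

    Assumes 1 <= p*(n+1) <= n, so both neighbours exist.
    """
    pos = p * (len(sorted_data) + 1)
    lower = int(pos)              # 1-based index of the lower neighbour
    frac = pos - lower            # fractional part of the position
    lo = sorted_data[lower - 1]
    hi = sorted_data[min(lower, len(sorted_data) - 1)]
    return lo + frac * (hi - lo)

data = sorted([10, 6, 13, 5, 11, 5, 6, 3, 7, 9])
summary = (
    data[0],
    quantile_n_plus_1(data, 0.25),
    quantile_n_plus_1(data, 0.50),
    quantile_n_plus_1(data, 0.75),
    data[-1],
)
print(summary)
```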

Question 6

Mean = (0*33 + 1*12 + 2*4 + 3*1) / 50 = 23/50 = 0.46.
Median = 0
Mode = 0

Question 7

To calculate the sample variance and standard deviation, follow these steps:

Step 1: Calculate the sample mean. The sample mean here is equal to 10.1.
Step 2: Find the difference between each of the values and the sample mean of 10.1.
Step 3: Square each difference.

The squared deviations from the mean for all observations are: 16.81 4.41 0.01 3.61 15.21 1.21 0.81 9.61 8.41 and 0.81. The sum of these squared deviations equals 60.9. Next, s² = 60.9 / (n - 1) = 60.9/9 = 6.77. Thus, the variance equals 6.77. The standard deviation is the square root of the variance. That is: s = √6.77 = 2.6.

Question 8

Again, apply the same steps as in question 7. The sample mean is equal to 2. The squared deviations from the mean for the observations are: 9, 25, 4, 0, 9, 25, 4. The sum of these squared deviations is equal to 76. The variance is s² = 76/6 = 12.67. The standard deviation is the square root of the variance, that is: s = √12.67 = 3.56.
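The steps from questions 7 and 8 can be checked with Python's standard `statistics` module, whose `variance` and `stdev` functions use the sample (n - 1) denominator. The variable names `q7` and `q8` are only labels for the two samples.

```python
import statistics

q7 = [6, 8, 10, 12, 14, 9, 11, 7, 13, 11]
q8 = [5, -3, 0, 2, -1, 7, 4]

for sample in (q7, q8):
    mean = statistics.mean(sample)
    var = statistics.variance(sample)   # sum of squared deviations / (n - 1)
    sd = statistics.stdev(sample)       # square root of the sample variance
    print(mean, round(var, 2), round(sd, 2))
```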

Question 9

CV_A = 2.00 / 4.00 x 100% = 50%.
CV_B = 8.00 / 80.00 x 100% = 10%.
The market value of stock A fluctuates more from period to period than does the market value of stock B. The coefficient of variation (CV) indicates that for stock A, the sample standard deviation is 50% of the mean, and for stock B the sample standard deviation is only 10% of the mean.

Question 10

Use the formula:
\[ CV = \frac{s}{\bar{x}} \times 100\% \hspace{5mm} \text{if} \hspace{5mm} \bar{x} > 0 \]
CV = (1.58 / 13) x 100% = 12.15%
Thus, the sample standard deviation is 12.15% of the mean.
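As a quick check, the coefficient of variation can be computed directly from the raw data. This sketch uses the unrounded sample standard deviation, so it gives about 12.16% rather than the 12.15% obtained above by first rounding s to 1.58.

```python
import statistics

data = [13, 15, 12, 14, 11]

mean = statistics.mean(data)
sd = statistics.stdev(data)      # sample standard deviation (n - 1 denominator)
cv = sd / mean * 100             # coefficient of variation in percent

print(round(sd, 2), round(cv, 2))
```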

Question 11a

Use the formula:
\[z = \frac{x_{i} - \mu}{\sigma} \]
The standard deviation, σ, is equal to the square root of the variance, σ², that is: √144 = 12
z = (288 - 300) / 12 = -12/12 = -1
According to the empirical rule, approximately 68% of the observations fall within 1 standard deviation above and below the mean. The remaining 32% is thus split evenly between the left and the right of this interval. This means that 0.5*32 = 16% of the observations fall below z = -1. Consequently, 100 - 16 = 84% of the observations are greater than 288.

Question 11b

z = (324 - 300) / 12 = 24/12 = 2
According to the empirical rule, approximately 95% of the observations fall within 2 standard deviations above and below the mean. The remaining 5% is split evenly between the higher and the lower end of the distribution. Thus, 97.5% of observations are less than 324.

Question 11c

z = (336 - 300) / 12 = 36/12 = 3. Approximately all observations are lower than 336. Thus, to answer the question, almost no (0.15%) observations are greater than 336.

Question 12a

Mean = 75.

Question 12b

Standard deviation = 19.26.

Question 12c

CV = (19.26/75) x 100% = 25.68%.

Question 12d

4 | 3 4 7
5 | 0 0 4 6
6 | 0 0 0 1 4 9
7 | 0 0 1 4 5 6
8 | 0 0 4 4 5 5 5 7
9 | 0 0 1 4 5
10 |
11 | 0 0
12 | 1
The interquartile range, IQR = 26.

Question 12e

Minimum = 43; Q1 = 60; Median = 75; Q3 = 86; Maximum = 121.

Question 13a

The means are 1.13 and 109.17.

Question 13b

The standard deviations are 0.01 and 0.75
CV_A = (0.01/1.13) x 100% = 1.04%
CV_B = (0.75/109.17) x 100% = 0.69%
The coefficients of variation tell us that the sample standard deviation for EUR to USD is 1.04% of the mean, whereas the sample standard deviation for EUR to JPY is 0.69% of the mean. Thus, the exchange rate for EUR to USD fluctuates more from day to day than does that of EUR to JPY.

Question 14a

z = (1,120 - 1,200) / 50 = -1.6.

Question 14b

94.52% (you can find the probability corresponding to this z-score in the table of the standard normal distribution).

Question 14c

z = (1,300 - 1,200) / 50 = 2.

Question 14d

According to the empirical rule, approximately 2.5% of observations are more than two standard deviations above the mean.
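The z-scores and tail probabilities from question 14 can also be obtained without a table. The sketch below assumes that bulb lifetimes are normally distributed, which is the assumption behind the table lookup, and uses `statistics.NormalDist` from the Python standard library. Note that the exact normal tail above z = 2 is about 2.28%, slightly less than the 2.5% given by the empirical rule.

```python
from statistics import NormalDist

mean, sd = 1200, 50   # lifetime parameters from question 14a

def z_score(x):
    # Number of standard deviations that x lies from the mean.
    return (x - mean) / sd

z_low = z_score(1120)
z_high = z_score(1300)

standard_normal = NormalDist()  # mean 0, standard deviation 1
p_longer_1120 = 1 - standard_normal.cdf(z_low)
p_longer_1300 = 1 - standard_normal.cdf(z_high)

print(z_low, round(p_longer_1120, 4))
print(z_high, round(p_longer_1300, 4))
```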

Question 15a

\[ \bar{x} = \frac{4+3+2+1}{4} = 2.5\]

Question 15b

Use the formula for the weighted mean, that is:
\[\bar{x} = \frac{\Sigma w_{i}x_{i}}{\Sigma w_{i}} \]
\[\bar{x} = \frac{4*3 + 3*3 + 2*4 + 1*5}{15} = \frac{34}{15} = 2.267 \]

Question 16a

\[\bar{x} = \frac{4.7+3.8+5.7+2.6+5.5}{5} = \frac{22.3}{5} = 4.46\]

Question 16b

\[\bar{x} = \frac{4.7*8 + 3.8*7 + 5.7*4 + 2.6*3 + 5.5*2}{24} = \frac{105.8}{24} = 4.41 \]
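The weighted means in questions 15b and 16b follow the same pattern: multiply each value by its weight, sum the products, and divide by the sum of the weights. A minimal sketch, with a hypothetical helper named `weighted_mean`:

```python
def weighted_mean(values, weights):
    # Sum of weight * value, divided by the total weight.
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Question 15b: grade points weighted by course credits.
gpa = weighted_mean([4, 3, 2, 1], [3, 3, 4, 5])

# Question 16b: x_i values weighted by w_i.
q16 = weighted_mean([4.7, 3.8, 5.7, 2.6, 5.5], [8, 7, 4, 3, 2])

print(round(gpa, 3), round(q16, 2))
```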

Question 16c

The variance is 1.643.

Question 16d

The standard deviation is √1.643 = 1.282.

Question 17a

The covariance = 82.42.

Question 17b

The correlation coefficient between x and y, that is r = 1.0 (perfect positive linear relationship).

Question 17c

No answer indication available.

Question 18a

Cov(x,y) = 30.8.

Question 18b

r = 0.83.
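The covariance and correlation in questions 17 and 18 can be reproduced with a short pure-Python sketch. The helper names `sample_cov` and `correlation` are hypothetical, and both use the sample (n - 1) denominator.

```python
from math import sqrt

def sample_cov(xs, ys):
    # Sample covariance: sum of cross-products of deviations / (n - 1).
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

def correlation(xs, ys):
    # Covariance divided by the product of the two sample standard deviations.
    return sample_cov(xs, ys) / sqrt(sample_cov(xs, xs) * sample_cov(ys, ys))

# Question 17: every y equals 3 times x.
x17 = [15, 6, 11, 12, 16, 14, 5, 17, 4, 19, 7]
y17 = [3 * x for x in x17]

# Question 18: quiz and exam scores.
quiz = [4, 3.4, 3, 5, 1.1]
exam = [100, 66, 78, 80, 30]

print(round(sample_cov(x17, y17), 2), round(correlation(x17, y17), 2))
print(round(sample_cov(quiz, exam), 1), round(correlation(quiz, exam), 2))
```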

How to use probability calculation? - ExamTests 3

Questions

Question 1a

The sample space S = [E₁, E₂, E₃, E₄, E₅, E₆]. Given A = [E₁, E₂, E₃] and B = [E₃, E₄, E₅].

What is A intersection B?

Question 1b

What is the union of A and B?

Question 1c

Is the union of A and B collectively exhaustive?

Question 2a

Use the following sample space S: S = [E₁, E₂, E₃, E₄, E₅, E₆, E₇, E₈, E₉, E₁₀].

Given A = [E₁, E₂, E₃, E₄], what is Ā?

Question 2b

Given Ā = [E₁, E₄, E₅, E₇] and B̄ (complement B) = [E₂, E₃, E₅, E₈]. What is A intersection B̄ (complement B)?

Question 2c

What is A intersection B?

Question 2d

What is the union of A and B?

Question 2e

Is the union of A and B collectively exhaustive?

Question 3

Suppose, two letters are to be selected from A, B, C, D, and E. Further, these two letters have to be arranged in order. How many permutations are possible?

Question 4

Suppose, there are 8 candidates that applied for a particular job. Yet, there are only 4 positions available. Of these 8 candidates, 5 are men and 3 are women. If every combination of candidates is equally likely to occur, what is then the probability that no women will be hired?

Question 5a

Suppose, there are 10 Apple iPads, 5 Samsung tablets, and 5 Huawei tablets on offer in a store. A person enters the store and wants to buy 3 tablets. These tablets are selected purely by chance. What is the total number of outcomes in the sample space?

Question 5b

What is the probability that this person selects 2 Apple iPads and 1 Samsung tablet?

Question 6a

A sample space consists of 5 A's and 7 B's. Now, suppose we want to randomly draw two letters from this sample space. What is the total number of possible combinations?

Question 6b

What is the probability that a randomly selected set of 2 will include 1 A and 1 B?

Question 7

In a family of 6 family members, there are three males and three females. What is the probability that a random sample of two family members consists of two males?

Question 8a

Suppose there are 12 employees who could be assigned to an editorial task. Of these 12 employees, 7 are women and 5 are men. Two of the men are brothers. The manager of the company has to assign the editorial task randomly to one employee. Let A be the event "chosen employee is a man". Let B be the event "chosen employee is one of the brothers". What is the probability of event A?

Question 8b

What is the probability of event B?

Question 8c

What is the probability of the intersection of A and B?

Question 9a

Suppose, P(A) = 0.75, P(B) = 0.80, and P(A ∩ B) = 0.65. What is P (A ∪ B)?

Question 9b

What is the conditional probability of event B, given that event A has occurred?

Question 9c

What is the joint probability of both event A and event B?

Question 10

Suppose, within the Netherlands, 54% of all master's degrees are earned by women. Of all master's degrees that are obtained, 20% is obtained in psychology. In addition, 8% of all master's degrees are obtained by women in psychology. Are the events "the diploma holder is a woman" and the event "the diploma is in psychology" statistically independent?

Question 11

Suppose, the odds in favor of winning are 3 to 2. What is then the probability of winning?

Question 12a

Suppose, we are interested in examining the effect of alcohol on highway crashes. Obviously, it is unethical to provide one group of drivers with alcohol and compare their crash involvement to that of a sober group. We know, however, that 10.3% of the nighttime drivers have been drinking, and that 32.4% of the single-vehicle-accident drivers had been drinking. In this example, single-vehicle accidents are chosen to ensure that any driving error could be assigned to the driver only.
Based on these data, what is the sample space?

Question 12b

What is the conditional probability that the driver had been drinking, given that he was not involved in a crash?

Question 12c

Do these numbers provide sufficient evidence to conclude that alcohol increases the probability of crashes?

Question 13

For questions 13-17, the sample space is defined by events A₁, A₂, B₁, and B₂.
Given that P(A₁) = 0.15, P(B₁) = 0.20, and P(B₁|A₁) = 0.60. What is P(A₁|B₁)?

Question 14

Given that P(A₁ ∩ B₁) = 0.09 and P(B₁) = 0.18. What is P(A₁|B₁)?

Question 15

Given that P(A₂ ∩ B₂) = 0.81 and P(B₂) = 0.82. What is P(A₂|B₂)?

Question 16

Given that P(A₁) = 0.10 and P(B₁|A₁) = 0.90. What is P(A₁ ∩ B₁)?

Question 17

Given that P(A₁) = 0.10, P(B₁|A₁) = 0.90, and P(B₂|A₁) = 0.10. What is P(A₂)?

Answer indication

Question 1a

A ∩ B = [E₃].

Question 1b

A ∪ B = [E₁, E₂, E₃, E₄, E₅].

Question 1c

No, A and B are not collectively exhaustive, because E₆ is not covered in the union.
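The set operations from question 1 map directly onto Python's built-in `set` type. In this sketch the basic outcomes E₁ to E₆ are represented by the integers 1 to 6, which is only a convenient encoding.

```python
# Basic outcomes E1..E6 encoded as the integers 1..6.
S = set(range(1, 7))
A = {1, 2, 3}
B = {3, 4, 5}

intersection = A & B         # outcomes in both A and B
union = A | B                # outcomes in A or B (or both)
exhaustive = union == S      # collectively exhaustive only if the union covers S
missing = S - union          # outcomes not covered by the union

print(intersection, union, exhaustive, missing)
```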

Question 2a

Ā = [E₅, E₆, E₇, E₈, E₉, E₁₀]

Question 2b

A ∩ complement B = [E₂, E₃, E₈]. Since Ā = [E₁, E₄, E₅, E₇], we have A = [E₂, E₃, E₆, E₈, E₉, E₁₀]; the outcomes of A that also lie in the complement of B are E₂, E₃, and E₈.

Question 2c

A ∩ B = [E₆, E₉, E₁₀]. Here B = [E₁, E₄, E₆, E₇, E₉, E₁₀], because the complement of B is [E₂, E₃, E₅, E₈].

Question 2d

A ∪ B = [E₁, E₂, E₃, E₄, E₆, E₇, E₈, E₉, E₁₀]

Question 2e

No, E₅ is not covered in the union of A and B.

Question 3

There are five outcomes, that is n = 5, and two outcomes have to be selected, that is x = 2.
Using the formula for the number of permutations yields:
\[ P^{5}_{2} = \frac{5!}{(5-2)!} = \frac{5!}{3!} = \frac{120}{6} = 20 \]
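The count can be verified by listing every ordered pair. This sketch uses `itertools.permutations` from the Python standard library and compares the length of the list with the factorial formula n! / (n - x)!.

```python
from itertools import permutations
from math import factorial

letters = "ABCDE"

# Every ordered selection of 2 distinct letters, e.g. ('A', 'B') and ('B', 'A').
ordered_pairs = list(permutations(letters, 2))

# The same count from the formula n! / (n - x)! with n = 5 and x = 2.
count_formula = factorial(5) // factorial(5 - 2)

print(len(ordered_pairs), count_formula)
```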

Question 4

First, calculate the total number of possible combinations of four candidates selected from the eight possible candidates. That is:
\[ C^{8}_{4} = \frac{8!}{4!4!} = 70 \]
Then, if no woman is to be hired, this implies that the four successful candidates must come from the available five men. That means that the number of combinations is as follows:
\[ C^{5}_{4} = \frac{5!}{4!1!} = 5 \]
To conclude, if each of the 70 possible combinations is equally likely to be chosen, the probability that one of the 5 all-male combinations would be selected is 5/70 = 1/14 = 0.07 (that is, 7%).
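The same reasoning can be checked with `math.comb` (available from Python 3.8), which returns the number of combinations of k items chosen from n.

```python
from math import comb

total = comb(8, 4)      # ways to choose 4 candidates from all 8
all_male = comb(5, 4)   # ways to choose all 4 from the 5 men

p_no_women = all_male / total
print(total, all_male, round(p_no_women, 2))
```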

Question 5a

\[N = C^{20}_{3} = \frac{20!}{3!(20-3)!} = 1,140 \]
Thus, there are 1,140 outcomes in the sample space.

Question 5b

The number of ways that we can select 2 Apple iPads from the available 10 is as follows:
\[ C^{10}_{2} = \frac{10!}{2!(10-2)!} = 45 \]
Similarly, the number of ways that we can select 1 Samsung tablet from the available 5 is 5.
\[ C^{5}_{1} = \frac{5!}{1!(5-1)!} = \frac{5!}{1!4!} = 5 \]
Therefore, the number of outcomes that satisfy event A is as follows:
\[ N_{A} = C^{10}_{2} \times C^{5}_{1} = 45 \times 5 = 225 \]
Hence, the probability of A [i.e., 2 Apple iPads and 1 Samsung tablet] is:
\[ P_{A} = \frac{N_{A}}{N} = \frac{225}{1140} = 0.197 \]

Question 6a

The total number of possible combinations of 2 letters selected from 12 is as follows:
\[ C^{12}_{2} = \frac{12!}{2!10!} = 66 \]

Question 6b

The number of ways that we can select 1 A from the 5 available A's is as follows:
\[ C^{5}_{1} = \frac{5!}{1!(5-1)!} = \frac{5!}{1!4!} = 5 \]
Similarly, the number of ways that we can select 1 B from the 7 available B's is as follows:
\[ C^{7}_{1} = \frac{7!}{1!(7-1)!} = \frac{7!}{1!6!} = 7 \]
Therefore, the number of ways that we can select one A and one B, that is the number of outcomes that satisfy event A, is as follows:
\[ N_{A} = C^{5}_{1} \times C^{7}_{1} = 5 \times 7 = 35 \]
Finally, the probability of event A (that is, one A and one B) is as follows:
\[ P_{A} = \frac{N_{A}}{N} = \frac{35}{66} = 0.53 \]

Question 7

First, calculate the total number of possible samples of two family members selected from the six family members:
\[ N = C^{6}_{2} = \frac{6!}{2!4!} = \frac{720}{48} = 15 \]
Now, the number of combinations for two males is:
\[ C^{3}_{2} = \frac{3!}{2!1!} = \frac{6}{2} = 3 \]
Therefore, the probability of selecting two males is 3/15 = 0.20 (that is: 20%).
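For a sample space this small, the probability can also be found by brute-force enumeration. The sketch below lists every unordered sample of two family members and counts those that contain two males. The labels M1 to M3 and F1 to F3 are hypothetical names for the six family members. Enumeration gives 15 possible samples, 3 of which consist of two males.

```python
from itertools import combinations

# Hypothetical labels: three males and three females.
family = ["M1", "M2", "M3", "F1", "F2", "F3"]

# Every unordered sample of two distinct family members.
samples = list(combinations(family, 2))

# Keep the samples in which both members are male.
two_males = [s for s in samples if all(member.startswith("M") for member in s)]

p_two_males = len(two_males) / len(samples)
print(len(samples), len(two_males), p_two_males)
```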

Question 8a

\[P_{A} = \frac{N_{A}}{N} = \frac{5}{12} = 0.42 \]

Question 8b

\[P_{B} = \frac{N_{B}}{N} = \frac{2}{12} = 0.17 \]

Question 8c

P(A ∩ B) = P(B) = 2/12 = 0.17, because both brothers are men (B is a subset of A).

Question 9a

Use the addition rule of probabilities.
\[ P (A ∪ B) = P(A) + P(B) - P(A ∩ B) \]
This gives:
\[ P (A ∪ B) = 0.75 + 0.80 - 0.65 = 0.90 \]

Question 9b

\[ P(B|A) = \frac{P(A ∩ B)}{P(A)} = \frac{0.65}{0.75} = 0.8667 \]

Question 9c

To answer this question, use the multiplication rule of probabilities. That is:
\[ P(A ∩ B) = P(A|B) P(B) = (0.8125)(0.80) = 0.65 \]

Question 10

\[ P(A) = 0.54, P(B) = 0.20, P(A ∩ B) = 0.08 \]
Since
\[ P(A)P(B) = (0.54)(0.20) = 0.108 \neq 0.08 = P(A ∩ B) \]
these events are not independent.
The dependence can be found from the conditional probability:
\[ P(A|B) = \frac{P(A ∩ B)}{P(B)} = \frac{0.08}{0.20} = 0.40 \neq 0.54 = P(A) \]
That means that, in the Netherlands, only 40% of psychology degrees go to women, whereas women constitute 54% of all degree recipients.
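The independence check amounts to comparing P(A ∩ B) with the product P(A)P(B). A minimal sketch with the numbers from this question follows. The tolerance of 1e-9 is an arbitrary choice to guard against floating-point rounding.

```python
p_woman = 0.54    # P(A): degree earned by a woman
p_psych = 0.20    # P(B): degree obtained in psychology
p_both = 0.08     # P(A ∩ B): degree earned by a woman in psychology

product = p_woman * p_psych

# Independent only if the joint probability equals the product of the marginals.
independent = abs(product - p_both) < 1e-9

# Conditional probability of a woman, given a psychology degree.
p_woman_given_psych = p_both / p_psych

print(round(product, 3), independent, round(p_woman_given_psych, 2))
```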

Question 11

\[ \frac{3}{2} = \frac{P(A)}{1-P(A)} \]
\[ 3(1-P(A)) = 2P(A) \]
\[ 5P(A) = 3 \]
\[ P(A) = \frac{3}{5} = 0.6 \]
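In general, odds of a to b in favor of an event correspond to a probability of a / (a + b). A small sketch with a hypothetical helper `odds_to_probability`, using exact fractions:

```python
from fractions import Fraction

def odds_to_probability(in_favor, against):
    # Odds of in_favor to against correspond to in_favor / (in_favor + against).
    return Fraction(in_favor, in_favor + against)

p_win = odds_to_probability(3, 2)
print(p_win, float(p_win))
```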

Question 12a

A₁: the driver had been drinking.
A₂: the driver had not been drinking.
C₁: the driver was involved in a single-vehicle crash.
C₂: the driver was not involved in a single-vehicle crash.

Question 12b

P(A₁|C₂) = 0.103. For comparison, the conditional probability that the driver had been drinking, given that he was involved in a single-vehicle crash, is P(A₁|C₁) = 0.324.

Question 12c

To answer this question, use the overinvolvement ratio. That is:
\[ \frac{P(A_{1}|C_{1})}{P(A_{1}|C_{2})} = \frac{0.324}{0.103} = 3.15 \]
Based on this ratio of 3.15, we can conclude that there is evidence that alcohol increases the probability of car crashes.

Question 13

Using Bayes' theorem, we find that P(A₁|B₁) = (0.60*0.15)/(0.20) = 0.45.
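Bayes' theorem as used here, P(A₁|B₁) = P(B₁|A₁)P(A₁) / P(B₁), can be written as a one-line function. The name `bayes` is hypothetical.

```python
def bayes(p_b_given_a, p_a, p_b):
    # P(A|B) = P(B|A) * P(A) / P(B)
    return p_b_given_a * p_a / p_b

posterior = bayes(p_b_given_a=0.60, p_a=0.15, p_b=0.20)
print(round(posterior, 2))
```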

Question 14

\[ P(A_{1}|B_{1}) = \frac{P(A_{1} ∩ B_{1})}{P(B_{1})} = \frac{0.09}{0.18} = 0.50 \]

Question 15

\[ P(A_{2}|B_{2}) = \frac{P(A_{2} ∩ B_{2})}{P(B_{2})} = \frac{0.81}{0.82} = 0.988 \]

Question 16

P(A₁ ∩ B₁) = 0.90 * 0.10 = 0.09

Question 17

Use both:
P(A₁ ∩ B₁) = 0.90 * 0.10 = 0.09
and:
P(A₁ ∩ B₂) = 0.10 * 0.10 = 0.01
to find that:
P(A₁) = 0.09 + 0.01 = 0.10
A₂ is the complement of A₁, thus P(A₂) = 1 - P(A₁) = 1 - 0.10 = 0.90

How to use probability models for discrete random variables? - ExamTests 4

Questions

Question 1

A researcher is studying the number of owl eggs found in Denmark. Is the number of eggs a discrete or continuous random variable?

Question 2

The weight of students is recorded as part of a national health study. Is the weight of students a discrete or continuous random variable?

Question 3

Indicate for each of the following if a discrete or continuous random variable provides the best definition:

The number of sunny days in the Netherlands.
The level of pressure in the tires of a car.
The amount of oil exported by Saudi Arabia in 2019.

Question 4

Give the probability distribution function of the face values of a single die when a fair die is rolled.

Question 5

What is the probability of a value of 5 or higher, when rolling a single fair die once?

Question 6a

Use the following probability distribution:

x	0	1	2	3	4	5	6
P(x)	0.03	0.15	0.11	0.19	0.22	0.26	0.04

P(3 < x < 6) = ?

Question 6b

P(x > 3) = ?

Question 6c

P(2 < x < 5) = ?

Question 6d

P(x < 4) = ?

Question 6e

What is the mean of this probability distribution?

Question 7

Suppose the probability distribution of the number of errors (X) on pages from a business textbook is as follows: P(0) = 0.81; P(1) = 0.17; P(2) = 0.02.
What is the mean number of errors per page?

Question 8a

Someone is interested in the total costs of a project on which he intends to bid. He estimates that the materials will cost €25,000,- and that the labor will cost €900,- per day. Suppose the project takes X days to complete. Provide the linear function for the total costs, denoted by C, of the project.

Question 8b

Now, assume that the following probability distribution is provided for the completion time of the project.

Completion time (x)	10	11	12	13	14
P(x)	0.1	0.3	0.3	0.2	0.1

What is the mean for completion time X?

Question 8c

What is the variance for completion time X?

Question 8d

What is the mean for the total costs, C?

Question 8e

What is the variance for the total costs, C?

Question 9a

Suppose that a real estate agent has five contacts and believes that for each contact the probability of making a sale is 0.40. What is the probability that the real estate agent makes at most 1 sale?

Question 9b

What is the probability that the real estate agent makes between 2 and 4 sales (inclusive)?

Question 10a

It is predicted that 3.5% of all small corporations will file for bankruptcy in 2020. For a random sample of 100 small corporations, estimate the probability that at least 3 will file for bankruptcy in 2020, assuming that this prediction is correct. To do so, use the Poisson distribution.

Question 10b

Now, do the same using the (actual) binomial distribution. Is the Poisson distribution a close estimate of the actual binomial distribution?

Question 11a

Consider the following joint probability distribution for two random variables X and Y. Find the marginal probabilities.

		Y return
X return	0%	5%	10%	15%
0%	0.0625	0.0625	0.0625	0.0625
5%	0.0625	0.0625	0.0625	0.0625
10%	0.0625	0.0625	0.0625	0.0625
15%	0.0625	0.0625	0.0625	0.0625

Question 11b

Are X and Y independent?

Question 11c

Find the mean of X.

Question 11d

Find the mean of Y.

Question 11e

What is the variance of X?

Question 11f

What is the standard deviation of X?

Question 12

Consider the following probability distribution

		X
Y		0	1
	0	0.25	0.35
	1	0.10	0.30

Compute the marginal probability distributions for X and Y.

Question 13a

Consider the following information for Questions 13a-13c. An investor has €1000,- to invest and two investment opportunities, each requiring a minimum of €500,-. The profit per €100,- from the first investment (X) can be represented by the following probability distribution: P(X = -5) = 0.4 and P(X = 20) = 0.6. The profit per €100,- from the second investment (Y) is represented by the following probability distribution: P(Y = 0) = 0.6 and P(Y = 25) = 0.4. Random variables X and Y are independent. The investor has the following possible strategies:

€1000,- in the first investment.
€1000,- in the second investment.
€500,- in each investment.

Find the mean and variance for the first strategy.

Question 13b

Find the mean and variance for the second strategy.

Question 13c

Find the mean and variance for the third strategy.

Answer indication

Question 1

It is a discrete random variable, because it can take on only a countable number of values (0, 1, 2, ...).

Question 2

The weight of students is a continuous random variable.

Question 3

The number of sunny days in the Netherlands: discrete.
The level of pressure in the tires of a car: continuous.
The amount of oil exported by Saudi Arabia in 2019: continuous.

Question 4

Probability distribution of a single fair die
x	P(x)
1	0.16667
2	0.16667
3	0.16667
4	0.16667
5	0.16667
6	0.16667

Question 5

P(X ≥ 5) = P(5) + P(6) = 0.1667 + 0.1667 = 0.3333

Question 6a

P(3 < x < 6) = P(4) + P(5) = 0.22 + 0.26 = 0.48

Question 6b

P(x > 3) = P(4) + P(5) + P(6) = 0.22 + 0.26 + 0.04 = 0.52

Question 6c

P(2 < x < 5) = P(3) + P(4) = 0.19 + 0.22 = 0.41

Question 6d

P(x < 4) = 0.03 + 0.15 + 0.11 + 0.19 = 0.48

Question 6e

\[ \mu_{X} = 0(0.03) + (1)(0.15) + (2)(0.11) + (3)(0.19) + (4)(0.22) + (5)(0.26) + (6)(0.04) = 3.36 \]

Question 7

\[ \mu_{x} = E[X] = \sum_{x} xP(x) = (0)(0.81) + (1)(0.17) + (2)(0.02) = 0.21 \]
Thus, the mean number of errors per page is 0.21.
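
The expected-value calculations in Questions 6e and 7 can be checked with a short script. The following is a minimal sketch in Python; the helper name `mean_of_distribution` is our own choice for illustration and does not come from the textbook.

```python
def mean_of_distribution(pmf):
    """Return E[X] = sum of x * P(x) for a discrete distribution {x: P(x)}."""
    total = sum(pmf.values())
    # A valid probability distribution must sum to 1 (allowing for rounding).
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"probabilities sum to {total}, not 1")
    return sum(x * p for x, p in pmf.items())


# Distributions taken from Question 6 and Question 7.
q6 = {0: 0.03, 1: 0.15, 2: 0.11, 3: 0.19, 4: 0.22, 5: 0.26, 6: 0.04}
q7 = {0: 0.81, 1: 0.17, 2: 0.02}

print(round(mean_of_distribution(q6), 2))  # 3.36
print(round(mean_of_distribution(q7), 2))  # 0.21
```

The check on the sum of the probabilities is a cheap safeguard against a mistyped table.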

Question 8a

C = 25,000 + 900X.

Question 8b

\[ \mu_{X} = E[X] = \sum_{x}xP(x) = (10)(0.1) + (11)(0.3) + (12)(0.3) + (13)(0.2) + (14)(0.1) = 11.9 \]
So, the mean for completion time X is 11.9 days.

Question 8c

\[ \sigma^{2}_{X} = E[(X - \mu_{X})^{2}] = \sum_{x}(x - \mu_{X})^{2}P(x) \]
\[ (10 - 11.9)^{2}(0.1) + (11 - 11.9)^{2}(0.3) + ... + (14 - 11.9)^{2}(0.1) = 1.29 \]
So, the variance for completion time X is 1.29 (in days squared).

Question 8d

\[ \mu_{C} = E[25,000 + 900X] = (25,000 + 900\mu_{X}) = 25,000 + (900)(11.9) = €35,710,- \]

Question 8e

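\[ \sigma^{2}_{C} = Var(25,000 + 900X) = (900)^{2}\sigma^{2}_{X} = (810,000)(1.29) = 1,044,900 \]
The variance is expressed in euros squared; the corresponding standard deviation is √1,044,900 ≈ €1,022.20.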
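
The rules E[a + bX] = a + bE[X] and Var(a + bX) = b²Var(X) used in Questions 8b-8e can be sketched as follows. This is only an illustration: the variable names are ours, and the probabilities are the ones that appear in the answer to Question 8b.

```python
# Completion-time distribution as used in the answer to Question 8b.
pmf = {10: 0.1, 11: 0.3, 12: 0.3, 13: 0.2, 14: 0.1}

mean_x = sum(x * p for x, p in pmf.items())
var_x = sum((x - mean_x) ** 2 * p for x, p in pmf.items())

# Total costs C = a + bX: fixed material costs plus labor costs per day.
a, b = 25_000, 900
mean_c = a + b * mean_x      # E[C] = a + b * E[X]
var_c = b ** 2 * var_x       # Var(C) = b^2 * Var(X); the constant a drops out
sd_c = var_c ** 0.5

print(round(mean_x, 2), round(var_x, 2))
print(round(mean_c, 2), round(var_c, 2), round(sd_c, 2))
```

Note that the fixed costs shift the mean but leave the variance untouched.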

Question 9a

\[ P(0) = \frac{5!}{0!5!}(0.4)^{0}(0.6)^{5} = (0.6)^{5} = 0.078 \]
\[ P(1) = \frac{5!}{1!4!}(0.4)^{1}(0.6)^{4} = 5(0.4)(0.6)^{4} = 0.259 \]
P(X ≤ 1) = P(X = 0) + P(X = 1) = 0.078 + 0.259 = 0.337

Question 9b

\[ P(2) = \frac{5!}{2!3!}(0.4)^{2}(0.6)^{3} = 10(0.4)^{2}(0.6)^{3} = 0.346 \]
\[ P(3) = \frac{5!}{3!2!}(0.4)^{3}(0.6)^{2} = 10(0.4)^{3}(0.6)^{2} = 0.230 \]
\[ P(4) = \frac{5!}{4!1!}(0.4)^{4}(0.6)^{1} = 5(0.4)^{4}(0.6)^{1} = 0.077 \]
P(2 ≤ X ≤ 4) = P(2) + P(3) + P(4) = 0.346 + 0.230 + 0.077 = 0.653
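
The binomial probabilities of Question 9 can be reproduced with the standard library. The sketch below assumes nothing beyond the binomial formula used above; the function name `binomial_pmf` is hypothetical.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n, p = 5, 0.40
at_most_one = sum(binomial_pmf(k, n, p) for k in range(0, 2))  # P(X <= 1)
two_to_four = sum(binomial_pmf(k, n, p) for k in range(2, 5))  # P(2 <= X <= 4)

print(round(at_most_one, 3))  # 0.337
print(round(two_to_four, 3))  # 0.653
```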

Question 10a

The distribution of X is binomial with n = 100 and P = 0.035, so that the mean of the distribution is equal to nP = 3.5. Next, using the Poisson distribution to approximate the probability of at least 3 bankruptcies, we find:
\[ P(X \geq 3) = 1 - P(X \leq 2) \]
\[ P(0) = \frac{e^{-3.5}(3.5)^{0}}{0!} = e^{-3.5} = 0.030197 \]
\[ P(1) = \frac{e^{-3.5}(3.5)^{1}}{1!} = (3.5)(0.030197) = 0.1056895 \]
\[ P(2) = \frac{e^{-3.5}(3.5)^{2}}{2!} = (6.125)(0.030197) = 0.1849566 \]
Hence,
\[ P(X \leq 2) = P(0) + P(1) + P(2) = 0.3208431 \]
\[ P(X \geq 3) = 1 - 0.3208431 = 0.6791569 \]

Question 10b

Using the binomial distribution, we compute the probability of X ≥ 3 as: P(X ≥ 3) = 0.684093.
Thus, the Poisson probability is a close estimate of the actual binomial probability.
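
To see how close the Poisson approximation is, both probabilities can be computed side by side. This is a minimal sketch using only the standard library, with n = 100 and P = 0.035 (3.5%) as stated in the question.

```python
from math import comb, exp, factorial

n, p = 100, 0.035
lam = n * p  # Poisson mean, equal to the binomial mean nP = 3.5

# P(X <= 2) under each model; P(X >= 3) is the complement.
poisson_le_2 = sum(exp(-lam) * lam ** k / factorial(k) for k in range(3))
binomial_le_2 = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(3))

poisson_ge_3 = 1 - poisson_le_2
binomial_ge_3 = 1 - binomial_le_2

print(round(poisson_ge_3, 4))
print(round(binomial_ge_3, 4))
```

The two results differ only in the third decimal place, which is what one would expect for a large n combined with a small P.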

Question 11a

\[ P(X = 0) = \sum_{y}P(0,y) = 0.0625 + 0.0625 + 0.0625 + 0.0625 = 0.25\]
Note that for every combination of values for X and Y, P(x,y) = 0.0625. Therefore, all the marginal probabilities of X are 25%. The same holds for the marginal probabilities of Y. Note that the sum of the marginal probabilities for a random variable is 1.

Question 11b

To test independence, we need to check if P(x,y) = P(x)P(y) for all possible pairs of values x and y.
P(x,y) = 0.0625 for all possible values of x and y.
P(x) = 0.25 and P(y) = 0.25 for all possible values of x and y.
P(x,y) = 0.0625 = (0.25)(0.25) = P(x)P(y)
Thus, X and Y are independent.

Question 11c

\[ \mu_{X} = E[X] = \sum_{x}xP(x) = 0(0.25) + 0.05(0.25) + 0.10(0.25) + 0.15(0.25) = 0.075 \]

Question 11d

The mean of Y is equal to the mean of X, that is 0.075.

Question 11e

\[ \sigma^{2}_{X} = \sum_{x}(x-\mu_{X})^{2}P(x) = (0.25)[(0 - 0.075)^{2} + (0.05 - 0.075)^{2} + (0.10 - 0.075)^{2} + (0.15 - 0.075)^{2}] = 0.003125 \]

Question 11f

The standard deviation of X is the square root of the variance, that is 0.0559016, or 5.59%.
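
The steps of Question 11 (marginal probabilities, the independence check, and the mean, variance and standard deviation of X) can be sketched in a few lines. The list-of-lists layout of the joint table is an assumption made for this illustration.

```python
returns = [0.00, 0.05, 0.10, 0.15]
# joint[i][j] = P(X = returns[i], Y = returns[j]); every cell equals 0.0625.
joint = [[0.0625] * 4 for _ in range(4)]

p_x = [sum(row) for row in joint]        # marginal of X: row sums
p_y = [sum(col) for col in zip(*joint)]  # marginal of Y: column sums

# X and Y are independent if P(x, y) = P(x)P(y) holds in every cell.
independent = all(
    abs(joint[i][j] - p_x[i] * p_y[j]) < 1e-12
    for i in range(4)
    for j in range(4)
)

mean_x = sum(x * p for x, p in zip(returns, p_x))
var_x = sum((x - mean_x) ** 2 * p for x, p in zip(returns, p_x))
sd_x = var_x ** 0.5

print(p_x, p_y, independent)
print(mean_x, var_x, sd_x)
```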

Question 12

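\[ P(X = 0) = \sum_{y}P(0,y) = 0.25 + 0.10 = 0.35 \]
\[ P(X = 1) = \sum_{y}P(1,y) = 0.35 + 0.30 = 0.65 \]
\[ P(Y = 0) = \sum_{x}P(x,0) = 0.25 + 0.35 = 0.60 \]
\[ P(Y = 1) = \sum_{x}P(x,1) = 0.10 + 0.30 = 0.40 \]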

Question 13a

\[ \mu_{X} = E[X] = \sum_{x}xP(x) = (-5)(0.4) + (20)(0.6) = €10,- \]
\[ \sigma^{2}_{x} = E[(X - \mu_{X})^{2}] = \sum_{x}(x - \mu)^{2} P(x) = (-5 - 10)^{2}(0.4) + (20 - 10)^{2}(0.6) = 150 \]
The first strategy has a mean profit of E[10X] = €100,- and a variance of Var(10X) = 100Var(X) = 15,000.

Question 13b

\[ \mu_{Y} = E[Y] = \sum_{y}yP(y) = (0)(0.6) + (25)(0.4) = €10,- \]
\[ \sigma^{2}_{Y} = E[(Y - \mu_{Y})^{2}] = \sum_{y}(y - \mu_{Y})^{2} P(y) = (0 - 10)^{2}(0.6) + (25 - 10)^{2}(0.4) = 150 \]
The second strategy has a mean profit of E[10Y] = €100,- and a variance of Var(10Y) = 100Var(Y) = 15,000.

Question 13c

\[ E[5X + 5Y] = E[5X] + E[5Y] = 5E[X] + 5E[Y] = €100,- \]
\[ Var(5X + 5Y) = Var(5X) + Var(5Y) = 25Var(X) + 25Var(Y) = 7,500 \]
The variances can simply be added because X and Y are independent. The variance of the third strategy is smaller than that of the first and second strategies, reflecting the decrease in risk that follows from diversification in an investment portfolio. Most investors would prefer the third strategy, because it yields the same expected return as the other two strategies, but with a lower risk.
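
The comparison of the three strategies can be sketched as follows. The helper `mean_and_variance` and the dictionary of strategies are illustrative names of our own; the variance formula without a covariance term relies on the stated independence of X and Y.

```python
def mean_and_variance(pmf):
    """Mean and variance of a discrete distribution given as {x: P(x)}."""
    mean = sum(x * p for x, p in pmf.items())
    variance = sum((x - mean) ** 2 * p for x, p in pmf.items())
    return mean, variance


mean_x, var_x = mean_and_variance({-5: 0.4, 20: 0.6})
mean_y, var_y = mean_and_variance({0: 0.6, 25: 0.4})

# Number of €100 units placed in investment X and in investment Y.
strategies = {"first": (10, 0), "second": (0, 10), "third": (5, 5)}

results = {}
for name, (a, b) in strategies.items():
    mean = a * mean_x + b * mean_y
    variance = a ** 2 * var_x + b ** 2 * var_y  # no covariance term: X, Y independent
    results[name] = (mean, variance)
    print(name, round(mean, 2), round(variance, 2))
```

All three strategies have the same expected profit, so the variance is the only criterion that separates them.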

How to use probability models for continuous random variables? - ExamTests 5

Questions

Question 1

Consider a uniform random variable X with a range of 0 to 2, so that the probability density function is f(x) = 0.5 and the cumulative distribution function is F(x) = 0.5x. What is the probability that X is between 1.4 and 1.8?

Question 2

Consider a uniform random variable X with a range of 0 to 2, so that the probability density function is f(x) = 0.5 and the cumulative distribution function is F(x) = 0.5x. What is the probability that X is between 0.5 and 1.6?

Question 3

Consider a uniform random variable X with a range of 0 to 2, so that the probability density function is f(x) = 0.5 and the cumulative distribution function is F(x) = 0.5x. What is the probability that X is less than 0.8?

Question 4

Consider a uniform random variable X with a range of 0 to 2, so that the probability density function is f(x) = 0.5 and the cumulative distribution function is F(x) = 0.5x. What is the probability that X is greater than 1.3?

Question 5

A homeowner estimates the heating bill based on the range of likely temperatures in January. He obtains the following linear equation: Y = 290 - 5T, in which T refers to the average temperature for the month in degrees Fahrenheit. If the average temperature in January has mean 24 and standard deviation 4, what are the mean and standard deviation of this homeowner's January heating bill?

Question 6

The profit for a production process is equal to 6000 dollars minus three times the number of units produced. The mean and variance for the number of units produced are 1000 and 900 respectively. Find the mean and variance of the profit.

Question 7

The profit of a particular production process is equal to €2000,- minus two times the number of units produced. The mean and variance for the number of units produced are 500 and 900 respectively. What are the mean and variance of the profit?

Question 8

The profit of a particular production process is equal to €1000,- minus two times the number of units produced. The mean and variance for the number of units produced are 50 and 90 respectively. What are the mean and variance of the profit?

Question 9

Consider for questions 9-15 the standard normal distribution.
P(Z < 1.16) = ?

Question 10

P(Z > 1.73) = ?

Question 11

P(Z > -2.29) = ?

Question 12

P(Z > -1.35) = ?

Question 13

P(1.16 < Z < 1.73) = ?

Question 14

P(-2.29 < Z < 1.26) = ?

Question 15

P(-2.29 < Z < -1.35) = ?

Question 16

The probability is 0.70 that Z is less than what number?

Question 17

The probability is 0.25 that Z is less than what number?

Question 18

The probability is 0.2 that Z is greater than what number?

Question 19

The probability is 0.6 that Z is greater than what number?

Question 20

Let a continuous random variable X be normally distributed with X ~ N(30, 81), that is, with mean 30 and variance 81. What is the probability that X is greater than 40?

Question 21

The anticipated consumer demand at a restaurant can be modeled by a normal random variable with mean 1,500 pounds and standard deviation 110 pounds. What is the probability that the demand will exceed 1,300 pounds?

Question 22

The scores on an achievement test are known to be normally distributed with a mean of 420 and a standard deviation of 80. What is the minimum test score needed in order to be in the top 10% of all people taking the test?

Question 23

Given a random sample of size n = 900 from a binomial probability distribution with P = 0.30. Can the normal distribution be used to approximate probabilities belonging to this distribution? If so, why?

Question 24

Given a random sample of size n = 900 from a binomial probability distribution with P = 0.30. What is the probability that the number of successes is greater than 305?

Question 25

Service times for customers at a library information desk can be modeled by an exponential distribution with a mean service of 5 minutes. What is the probability that a customer service time will take longer than 10 minutes?

Question 26

A company in the Netherlands with 2000 employees has a mean number of lost-time accidents per week equal to λ = 0.4, and the number of accidents follows a Poisson distribution. What is the probability that the time between accidents is less than 2 weeks?

Question 27a

An investor has asked you for assistance in establishing a portfolio containing two stocks. The investor has €1000,- which can be allocated in any proportion to two alternative stocks. The returns per euro from these two investments are denoted by random variables X and Y. Both of these variables are normally distributed. Investment X has a mean of 25 and a variance of 81. The second investment has a mean of 40 and a variance of 121. These two stock prices have a negative correlation, ρ_xy = -0.40. Define the linear equation of the value of the portfolio, denoted by W.

Question 27b

What is the mean value for the stock portfolio?

Question 27c

What is the standard deviation for the stock portfolio?

Question 27d

What is the probability that the portfolio value exceeds 2,000?

Answer indication

Question 1

P(1.4 < X < 1.8) = F(1.8) - F(1.4) = (0.5)(1.8) - (0.5)(1.4) = 0.9 - 0.7 = 0.2.

Question 2

P(0.5 < X < 1.6) = F(1.6) - F(0.5) = (0.5)(1.6) - (0.5)(0.5) = 0.8 - 0.25 = 0.55.

Question 3

P(X < 0.8) = F(0.8) = (0.5)(0.8) = 0.40.

Question 4

P(X > 1.3) = P(1.3 < X < 2.0) = F(2.0) - F(1.3) = (0.5)(2.0) - (0.5)(1.3) = 1.0 - 0.65 = 0.35.

Question 5

\[ \mu_{Y} = 290 - 5\mu_{T} = 290 - (5)(24) = 170 \]
\[ \sigma_{Y} = |-5| \sigma_{T} = (5)(4) = 20 \]

Question 6

\[ Y = 6000 - 3U \]
\[ \mu_{Y} = 6000 - 3\mu_{U} = 6000 - (3)(1000) = 3000 \]
\[ \sigma^{2}_{Y} = (-3)^{2}\sigma^{2}_{U} = (9)(900) = 8100 \]
Thus, the mean and variance of the profit are 3,000 dollars and 8,100 (dollars squared) respectively.

Question 7

\[ Y = 2000 - 2U \]
\[ \mu_{Y} = 2000 - 2\mu_{U} = 2000 - (2)(500) = 1000 \]
\[ \sigma^{2}_{Y} = (-2)^{2}\sigma^{2}_{U} = (4)(900) = 3600 \]
Thus, the mean and variance of the profit are €1000,- and 3,600 (euros squared) respectively.

Question 8

\[ Y = 1000 - 2U \]
\[ \mu_{Y} = 1000 - 2\mu_{U} = 1000 - (2)(50) = 900 \]
\[ \sigma^{2}_{Y} = (-2)^{2}\sigma^{2}_{U} = (4)(90) = 360 \]
Thus, the mean and variance of the profit are €900,- and 360 (euros squared) respectively.

Question 9

P(Z < 1.16) = 0.8770

Question 10

P(Z > 1.73) = 1 - 0.9582 = 0.0418

Question 11

P(Z > -2.29) = P(Z < 2.29) = 0.9890

Question 12

P(Z > -1.35) = P(Z < 1.35) = 0.9115

Question 13

P(1.16 < Z < 1.73) = 0.9582 - 0.8770 = 0.0812
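
Instead of reading the Standard Normal Distribution Table, the cumulative probabilities of Questions 9-13 can be computed with `statistics.NormalDist` from the Python standard library (Python 3.8 or later). This is a sketch for checking table look-ups; the variable names are ours.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal: mean 0, standard deviation 1

q9 = Z.cdf(1.16)                  # P(Z < 1.16)
q10 = 1 - Z.cdf(1.73)             # P(Z > 1.73)
q11 = 1 - Z.cdf(-2.29)            # P(Z > -2.29), equal to P(Z < 2.29) by symmetry
q12 = 1 - Z.cdf(-1.35)            # P(Z > -1.35), equal to P(Z < 1.35) by symmetry
q13 = Z.cdf(1.73) - Z.cdf(1.16)   # P(1.16 < Z < 1.73)

for value in (q9, q10, q11, q12, q13):
    print(round(value, 4))
```

An interval probability is always the cumulative probability at the upper bound minus the cumulative probability at the lower bound.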

Question 14

P(-2.29 < Z < 1.26) = 0.8962 - (1 - 0.9890) = 0.8962 - 0.0110 = 0.8852

Question 15

P(-2.29 < Z < -1.35) = (1 - 0.9115) - (1 - 0.9890) = 0.0885 - 0.0110 = 0.0775

Question 16

z = 0.525
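
Finding the z value that belongs to a given probability is the inverse of a table look-up. A minimal sketch with `statistics.NormalDist.inv_cdf` is shown below; the helper `z_for_upper_tail` is a hypothetical name introduced only for this illustration.

```python
from statistics import NormalDist

Z = NormalDist()

# "The probability is 0.70 that Z is less than what number?"
z_lower = Z.inv_cdf(0.70)


def z_for_upper_tail(probability):
    """Return z such that P(Z > z) equals the given probability."""
    return Z.inv_cdf(1 - probability)


# P(Z > z) = 0.30 describes the same point as P(Z < z) = 0.70.
z_upper = z_for_upper_tail(0.30)

print(round(z_lower, 3), round(z_upper, 3))
```

For "greater than" questions the probability must first be converted to a lower-tail probability, which is what the helper does.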

Question 17

z = -0.675

Question 18

z = 0.842

Question 19

z = -0.253

Question 20

\[ Z = \frac{X - \mu}{\sigma} = \frac{40 - 30}{\sqrt{81}} = \frac{10}{9} = 1.11 \]
P(Z > 1.11) = 1 - 0.8665 = 0.1335

Question 21

\[ Z = \frac{(1300 - 1,500)}{110} = -1.82 \]
P(Z > -1.82) = 0.9656

Question 22

Top 10% corresponds to z = 1.28 (the value for which F(z) ≈ 0.90 in the Standard Normal Distribution Table).
\[ 1.28 = \frac{X - 420}{80} \]
\[ 1.28*80 = X - 420 \]
\[ 102.4 + 420 = X\]
Thus, X = 522.4. One needs to score at least 523 to be in the top 10% of all people taking this test.

Question 23

nP(1 - P) = 900*0.30(1 - 0.30) = 189 > 5, thus the binomial distribution can be approximated by the normal distribution with mean nP and variance nP(1 - P).

Question 24

\[ \mu = nP = 270 \]
\[ \sigma^{2} = 189 \]
\[ \sigma = \sqrt{189} = 13.75 \]
\[ z = \frac{305 - 270}{13.75} = 2.55 \]
P(Z > 2.55) = 1 - 0.9946 = 0.0054
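
The normal approximation of Questions 23 and 24 can be sketched as follows. The script uses the unrounded z value, so the result may differ slightly in the last decimal from the table-based answer.

```python
from math import sqrt
from statistics import NormalDist

n, p = 900, 0.30

# Rule of thumb used in the answer to Question 23: nP(1 - P) > 5.
rule_value = n * p * (1 - p)
approximation_ok = rule_value > 5

mu = n * p
sigma = sqrt(rule_value)

# P(X > 305) under the approximating normal distribution.
prob_more_than_305 = 1 - NormalDist(mu, sigma).cdf(305)

print(approximation_ok, round(mu, 1), round(sigma, 2))
print(round(prob_more_than_305, 4))
```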

Question 25

\[ P(T > 10) = 1 - P(T < 10) = 1 - F(10) = 1 - (1 - e^{-(0.20)(10)}) = e^{-2.0} = 0.1353 \]
Thus, the probability that a service time exceeds 10 minutes is 0.1353.

Question 26

\[ P(T < 2) = F(2) = 1 - e^{-(0.4)(2)} = 1 - e^{-0.8} = 1 - 0.4493 = 0.5507 \]
Thus, the probability of less than 2 weeks between accidents is about 55%.
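
Both exponential calculations (Questions 25 and 26) use the cumulative distribution function F(t) = 1 - e^(-λt). A minimal sketch, with a hypothetical helper name:

```python
from math import exp


def exponential_cdf(t, rate):
    """P(T <= t) for an exponential random variable with the given rate (lambda)."""
    return 1 - exp(-rate * t) if t > 0 else 0.0


# Question 25: mean service time of 5 minutes, so lambda = 1/5 = 0.20 per minute.
prob_longer_than_10 = 1 - exponential_cdf(10, 1 / 5)

# Question 26: lambda = 0.4 accidents per week.
prob_within_2_weeks = exponential_cdf(2, 0.4)

print(round(prob_longer_than_10, 4))  # 0.1353
print(round(prob_within_2_weeks, 4))  # 0.5507
```

Note that the rate λ is the reciprocal of the mean time between events, which is why Question 25 uses 0.20 rather than 5.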

Question 27a

W = 20X + 30Y

Question 27b

μ_W = 20μ_X + 30μ_Y = 20*25 + 30*40 = 1,700

Question 27c

\[ \sigma^{2}_{W} = 20^{2} \sigma^{2}_{X} + 30^{2} \sigma^{2}_{Y} + 2*20*30 \rho_{XY} \sigma_{X} \sigma_{Y} \]
\[ \sigma^{2}_{W} = 20^{2}*81 + 30^{2}*121 + 2*20*30*(-0.40)*9*11 = 93,780 \]
\[ \sigma_{W} = \sqrt{\sigma^{2}_{W}} = \sqrt{93,780} = 306.24 \]

Question 27d

\[ Z = \frac{2000 - 1700}{306.24} = 0.980 \]
P(Z > 0.980) = 0.1635
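
The portfolio calculations of Question 27 can be sketched as below. The coefficients 20 and 30 are taken from the answer to Question 27a; all variable names are ours, and the final probability is computed from the unrounded z value.

```python
from math import sqrt
from statistics import NormalDist

a, b = 20, 30            # coefficients of W = 20X + 30Y
mu_x, var_x = 25, 81
mu_y, var_y = 40, 121
rho = -0.40              # correlation between X and Y

mu_w = a * mu_x + b * mu_y

# Cov(X, Y) = rho * sigma_X * sigma_Y
cov_xy = rho * sqrt(var_x) * sqrt(var_y)
var_w = a ** 2 * var_x + b ** 2 * var_y + 2 * a * b * cov_xy
sd_w = sqrt(var_w)

prob_above_2000 = 1 - NormalDist(mu_w, sd_w).cdf(2000)

print(mu_w, round(var_w), round(sd_w, 2))
print(round(prob_above_2000, 4))
```

Because the correlation is negative, the covariance term reduces the variance of the portfolio compared with the case of uncorrelated stocks.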

How to obtain a proper sample from a population? - ExamTests 6

Questions

Question 1a

Suppose that we know that the annual percentage salary increase is normally distributed with a mean of 12.2% and a standard deviation of 3.6%. A random sample of 9 observations is obtained from this population and the sample mean is computed. What is the standard error of the sample mean?

Question 1b

What is the probability that the sample mean exceeds 14.4%?

Question 2a

Given a population with a mean of 105 and a variance of 16, the central limit theorem applies when the sample size is n ≥ 25. A random sample of size 25 is obtained. What are the mean and variance of the sampling distribution for the sample means?

Question 2b

What is the probability that x̅ > 106?

Question 2c

What is the probability that 104 < x̅ < 106?

Question 2d

What is the probability that x̅ < 105.5?

Question 3a

Given a population with a mean of 150 and a variance of 1600, the central limit theorem applies when the sample size is n ≥ 25. A random sample of size 36 is obtained. What are the mean and variance of the sampling distribution for the sample means?

Question 3b

What is the probability that x̅ > 155?

Question 3c

What is the probability that 145 < x̅ < 165?

Question 3d

What is the probability that x̅ > 165?

Question 4a

The lifetimes of light bulbs produced by a company have a mean of 1,200 hours and a standard deviation of 400 hours. The population is normally distributed. Suppose that you buy nine light bulbs, which can be regarded as a proper random sample from the population. What is the mean of the sample mean lifetime?

Question 4b

What is the variance of the sample mean?

Question 4c

What is the standard error of the sample mean?

Question 4d

What is the probability that, on average, those nine light bulbs have lifetimes of less than 1050 hours?

Question 5a

To get some feeling for possible magnitudes of the finite population correction factor, calculate it for samples of n = 20 observations from populations of N = 20, 100, and 10,000 members.

Question 5b

Explain why the result found in the previous question is precisely what one should expect on intuitive grounds.

Question 6a

A random sample of 270 students was taken from a large population of students taking a statistics exam. If, in fact, 20% of the students fail the test, what is the probability that the sample proportion of students failing the test will be between 16 and 24%?

Question 6b

Now, compute the same probability for 16 to 24%, but this time use a sample of 400 students.

Question 7

It has been estimated that 43% of the students drink alcohol. Find the probability that more than half of a random sample of 80 students drink alcohol.

Question 8

Suppose that 50% of all adult Americans eat McDonald's once a week. What is the probability that more than 58% of a random sample of 250 adult Americans eat McDonald's once a week?

Question 9

Suppose that 50% of all adult Americans eat McDonald's once a week. What is the probability that more than 55% of a random sample of 250 adult Americans eat McDonald's once a week?

Question 10

Given is n = 6. Determine an upper limit for the sample variance such that the probability of exceeding this limit, given a population standard deviation of 3.6, is less than 0.05. Use the chi-square distribution to solve this problem.

Question 11a

There are six employees with the following years of experience:

2, 4, 6, 6, 7, 8

Two of these employees are to be chosen at random.

What is the mean number of years of experience for these six employees?

Question 11b

How many possible samples of two employees are there?

Question 11c

List all possible samples

Question 11d

Find the sampling distribution of the sample means.

Question 12

What is the central limit theorem?

Question 13a

Suppose a population distribution is left-skewed with mean 100 and variance 15. From this population, we draw a random sample of n = 100. What is the expected mean of this sample?

Question 13b

What is the variance of the sampling distribution of the sample mean?

Question 13c

What shape is expected for the sampling distribution?

Answer indication

Question 1a

μ = 12.2; σ = 3.6; n = 9.
\[ \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} = \frac{3.6}{\sqrt{9}} = 1.2 \]

Question 1b

\[ P(\bar{x} > 14.4) = P( \frac{\bar{X} - \mu}{\sigma_{\bar{x}}} > \frac{14.4 - 12.2}{1.2} ) = P(z > 1.83) = 0.0336 \]
To conclude, the probability that the sample mean will exceed 14.4% is only 0.0336.
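
The standard error and the tail probability of Question 1 can be checked with a short script. This sketch works with the unrounded z value (about 1.833 rather than 1.83), so the probability can differ from the table-based answer in the fourth decimal.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 12.2, 3.6, 9

standard_error = sigma / sqrt(n)  # sigma / sqrt(n)

# The sample mean is normally distributed with mean mu and this standard error.
prob_exceeds = 1 - NormalDist(mu, standard_error).cdf(14.4)

print(round(standard_error, 2))
print(round(prob_exceeds, 4))
```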

Question 2a

The central limit theorem applies, thus the sampling distribution of the sample mean has mean 105 and variance σ²/n = 16/25 = 0.64. The standard error is therefore √0.64 = 0.8.

Question 2b

\[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{106 - 105}{0.8} = 1.25\]
P(Z > 1.25) = 1 - 0.8944 = 0.1056

Question 2c

\[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{104 - 105}{0.8} = -1.25\]
P(104 < x̅ < 106) = P(-1.25 < Z < 1.25) = 0.8944 - (1 - 0.8944) = 0.7888

Question 2d

\[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{105.5 - 105}{0.8} = 0.625\]
P(Z < 0.625) = (0.7324 + 0.7357)/2 ≈ 0.7340

Question 3a

The central limit theorem applies, thus the mean of the sampling distribution is 150 and the variance is σ²/n = 1600/36 = 44.44. The standard error is therefore 40/√36 = 6.67.

Question 3b

\[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{155 - 150}{6.67} = 0.75\]
P(Z > 0.75) = 1 - 0.7734 = 0.2266

Question 3c

\[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{145 - 150}{6.67} = -0.75\]
\[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{165 - 150}{6.67} = 2.25\]
P(145 < x̅ < 165) = P(-0.75 < Z < 2.25) = 0.9878 - (1 - 0.7734) = 0.9878 - 0.2266 = 0.7612

Question 3d

P(x̅ > 165) = P(Z > 2.25) = 1 - 0.9878 = 0.0122

Question 4a

The population is normally distributed. Therefore, the sampling distribution of the sample means is normal. Hence, the mean of the sampling distribution is 1,200.

Question 4b

The variance is σ²/n = (400)²/9 = 160,000/9 = 17,777.78

Question 4c

The standard error is σ/√n = 400/√9 = 133.33.

Question 4d

\[ Z = \frac{\bar{X} - \mu_{X}}{\sigma_{\bar{X}}} = \frac{1050 - 1200}{133.33} = -1.125\]
P(x̅ < 1050) = P(Z < -1.125) = 1 - (0.8686 + 0.8708)/2 = 1 - 0.8697 = 0.1303

Question 5a

The finite population correction factor is calculated as follows: (N - n)/(N - 1).
The population correction factor for sample size n = 20 for a population with 20 members is: (20 - 20)/(20 - 1) = 0.
The population correction factor for sample size n = 20 for a population with 100 members is: (100 - 20)/(100 - 1) = 0.8081.
The population correction factor for sample size n = 20 for a population with 10,000 members is: (10,000 - 20)/(10,000 - 1) = 0.9981.
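
The finite population correction factor (N - n)/(N - 1) is easy to tabulate for several population sizes. The function name below is a hypothetical choice for this sketch.

```python
def finite_population_correction(population_size, sample_size):
    """Return (N - n) / (N - 1) for a sample of n drawn from N members."""
    if population_size < 2 or not 1 <= sample_size <= population_size:
        raise ValueError("need N >= 2 and 1 <= n <= N")
    return (population_size - sample_size) / (population_size - 1)


factors = {N: finite_population_correction(N, 20) for N in (20, 100, 10_000)}

for N, factor in factors.items():
    print(N, round(factor, 4))
```

The factor multiplies the variance of the sample mean, so a value of 0 means no sampling variability at all and a value close to 1 means the correction hardly matters.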

Question 5b

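When the sample contains the whole population (N = n = 20), the sample mean equals the population mean, so there is no sampling variability and the correction factor is 0. As the population becomes large relative to the sample, the correction factor approaches 1 and has almost no effect: it is then the absolute sample size, not the fraction of the population in the sample, that determines the precision of the results from a random sample.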

Question 6a

P = 0.20 and n = 270.
\[ \sigma_{\hat{p}} = \sqrt{ \frac{P(1 - P)}{n} } = \sqrt{\frac{0.20(1 - 0.20)}{270} } = 0.024 \]
The required probability is:
\[ P(0.16 < \hat{p} < 0.24) = P( \frac{0.16 - 0.20}{0.024} < Z < \frac{0.24 - 0.20}{0.024} ) \]
P(-1.67 < Z < 1.67) = 0.9525 - (1 - 0.9525) = 0.9050
Thus, we see that the probability is 0.9050 that the sample proportion is within the interval [0.16; 0.24] given P = 0.20 and sample size n = 270. This interval can be called a 90.50% acceptance interval. Note that, if the sample proportion was actually outside this interval, we may suspect that the population proportion P is not 0.20.

Question 6b

P = 0.20; n = 400.
\[ \sigma_{\hat{p}} = \sqrt{ \frac{P(1 - P)}{n} } = \sqrt{\frac{0.20(1 - 0.20)}{400} } = 0.0200 \]
The required probability is:
\[ P(0.16 < \hat{p} < 0.24) = P( \frac{0.16 - 0.20}{0.0200} < Z < \frac{0.24 - 0.20}{0.0200} ) \]
P(-2.00 < Z < 2.00) = 0.9772 - (1 - 0.9772) = 0.9544
This interval can thus be called a 95.44% acceptance interval (given P = 0.20 and sample size n = 400).
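
The effect of the sample size on the acceptance interval in Question 6 can be sketched as follows. The function `prob_proportion_between` is an illustrative name. Because the script does not round the standard error to 0.024, the result for n = 270 comes out slightly below the table-based 0.9050.

```python
from math import sqrt
from statistics import NormalDist


def prob_proportion_between(p, n, low, high):
    """P(low < sample proportion < high) using the normal approximation."""
    standard_error = sqrt(p * (1 - p) / n)
    sampling_dist = NormalDist(p, standard_error)
    return sampling_dist.cdf(high) - sampling_dist.cdf(low)


prob_n270 = prob_proportion_between(0.20, 270, 0.16, 0.24)
prob_n400 = prob_proportion_between(0.20, 400, 0.16, 0.24)

print(round(prob_n270, 4))
print(round(prob_n400, 4))
```

The larger sample gives a smaller standard error, so a larger share of the sample proportions falls inside the same interval.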

Question 7

P = 0.43; n = 80.
\[ \sigma_{\hat{p}} = \sqrt{ \frac{P(1 - P)}{n} } = \sqrt{\frac{0.43(1 - 0.43)}{80} } = 0.055 \]
\[ P(\hat{p} > 0.50) = P(Z > \frac{0.50 - 0.43}{0.055}) \]
P (Z > 1.27) = 0.1020

Question 8

P = 0.50; n = 250
\[ \sigma_{\hat{p}} = \sqrt{ \frac{P(1 - P)}{n} } = \sqrt{\frac{0.50(1 - 0.50)}{250} } = 0.0316 \]
\[ P(\hat{p} > 0.58) = P(Z > \frac{0.58 - 0.50}{0.0316}) = P(Z > 2.53) \]
P(Z > 2.53) = 1 - 0.9943 = 0.0057

Question 9

\[ P(\hat{p} > 0.55) = P(Z > \frac{0.55 - 0.50}{0.0316}) = P(Z > 1.58) \]
P(Z > 1.58) = 1 - 0.9429 = 0.0571

Question 10

n = 6; σ² = (3.6)² = 12.96.
Using the chi-square distribution, we can state that:
\[ P(s^{2} > K) = P ( \frac{(n - 1) s^{2}}{12.96} > 11.070) = 0.05 \]
where K is the desired upper limit and χ²₅ = 11.070 is the upper 0.05 critical value of the chi-square distribution with 5 degrees of freedom. The required upper limit for s² is obtained by solving:
\[ \frac{(n - 1)K}{12.96} = 11.070 \]
\[ K = \frac{(11.070)(12.96)}{(6 - 1)} = 28.69 \]
Thus, if the sample variance s² from a random sample of size n = 6 exceeds 28.69, there is strong evidence to suspect that the population variance exceeds 12.96.

Question 11a

\[ \mu = \frac{2 + 4 + 6 + 6 + 7 + 8}{6} = 5.5 \]

Question 11b

Two of these employees are to be chosen randomly. We are sampling without replacement, thus, the first observation has a probability of 1/6 of being selected, while the second observation has a probability of 1/5 of being selected. Since the order of selection does not matter, (6)(5)/2 = 15 possible random samples of two employees could be selected. Note that some samples (such as 2,6) occur twice because there are two employees with six years of experience in the population.

Question 11c

2 4
2 6 (2x)
2 7
2 8
4 6 (2x)
4 7
4 8
6 6
6 7 (2x)
6 8 (2x)
7 8

Question 11d

Sample mean	Probability of sample mean
3.0	1/15
4.0	2/15
4.5	1/15
5.0	3/15
5.5	1/15
6.0	2/15
6.5	2/15
7.0	2/15
7.5	1/15
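
The sampling distribution in Question 11 can be obtained by enumerating all samples. The sketch below uses `itertools.combinations`, which treats the two employees with six years of experience as distinct, just as in the list above. Note that `Fraction` reduces 3/15 to 1/5 when printed.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations

experience = [2, 4, 6, 6, 7, 8]

# All samples of two employees, drawn without replacement.
samples = list(combinations(experience, 2))

# Count how often each sample mean occurs.
counts = Counter(Fraction(a + b, 2) for a, b in samples)
distribution = {
    float(mean): Fraction(count, len(samples))
    for mean, count in sorted(counts.items())
}

expected_sample_mean = sum(mean * prob for mean, prob in distribution.items())

print(len(samples))
for mean, prob in distribution.items():
    print(mean, prob)
print(expected_sample_mean)
```

The expected value of the sample mean equals the population mean of 5.5, which illustrates that the sample mean is an unbiased estimator.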

Question 12

The central limit theorem shows that, if the sample size is large enough, the mean of a random sample drawn from a population with any probability distribution, will be approximately normally distributed with mean μ and variance σ²/n.

Question 13a

100

Question 13b

σ²/n = 15/100 = 0.15

Question 13c

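According to the central limit theorem, we expect that, as n becomes large, the sampling distribution of the sample mean approaches a normal distribution, even though the population itself is left-skewed. With n = 100, the sampling distribution is therefore approximately normal.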

How to obtain estimates for a single population? - ExamTests 7

Questions

Question 1

Let x₁, x₂, ..., x_n be a random sample from a normally distributed population with mean μ and variance σ². Assuming that a population is normally distributed with a very large population size compared to the sample size, should the sample mean or the sample median be used to estimate the population mean?

Question 2

Give one advantage of the median over the mean for estimating a population mean.

Question 3

Give one disadvantage of the median in comparison to the mean for estimating a population mean.

Question 4

Which two properties should an estimator possess?

Question 5a

Suppose that shopping times for customers at a local mall follow a normal distribution. The population standard deviation is equal to 20 minutes. A random sample of 64 shoppers in the local grocery store has a mean time of 75 minutes. What is the standard error?

Question 5b

What is the margin of error?

Question 5c

What is the 95% confidence interval for the population mean μ?

Question 5d

Give an interpretation of this confidence interval.

Question 6

How can the margin of error be reduced?

Question 7

What distribution is used when the population variance is known?

Question 8

What distribution is used when the population variance is unknown?

Question 9

Find the standard error for n = 17 and s = 16.

Question 10

Find the upper critical value of student's t distribution with v = 23 degrees of freedom for α = 0.05.

Question 11a

From a random sample of 344 employees, it was found that 261 were in favor of a modified bonus plan. What is the sample proportion?

Question 11b

What is the reliability factor for a 90% confidence interval?

Question 11c

What is the margin of error for a 90% confidence interval?

Question 11d

Provide the 90% confidence interval.

Question 11e

Interpret the 90% confidence level.

Question 12

What is the number that is exceeded with probability 0.10 by a chi-square random variable with 4 degrees of freedom?

Question 13

What is the number that is exceeded with probability 0.05 by a chi-square random variable with 18 degrees of freedom?

Question 14

The following information is provided: n = 25, s² = 100. What are the critical values for a 95% confidence interval with α = 0.05?

Question 15

Use the information provided in the previous question. Find the 95% confidence interval for the population variance.

Question 16a

Suppose there are 1,395 secondary schools in the Netherlands. From a simple random sample of 400 of these schools, it was found that the sample mean enrollment during the past year in biology courses was 320.8 students, and the sample standard deviation was found to be 149.7 students. What is the point estimate for the population total, Nμ?

Question 16b

Find the corresponding 99% confidence interval for this population total.

Question 17a

From a simple random sample of 400 of the 1,395 schools in our population, it is found that biology was a two-semester course in 141 of the sampled schools. Estimate the proportion of all schools for which the biology course is two semesters long.

Question 17b

Provide the confidence interval for the proportion of all schools for which the biology course is two semesters long.

Question 18

Suppose we have: ME = 0.50; σ = 1.8; and z_a/2 = z_0.005= 2.576. What is the required sample size for a 99% confidence interval?

Question 19

It is given that ME = 0.06 and z_a/2 = z_0.025 = 1.96. What is the required sample size?

Question 20

Suppose that an opinion survey is conducted about the presidential election. The survey was said to have a 3% margin of error. The implication is that a 95% confidence interval for the population proportion holding a particular opinion is the sample proportion plus or minus 3%. How many citizens of voting age need to be sampled to obtain this 3% margin of error?

Question 21

Suppose that a simple random sample of the 1,395 Dutch secondary schools is taken. Whatever the true proportion, a 95% confidence interval must extend no further than 0.04 on each side of the sample proportion. How many sample observations should be taken?

Answer indication

Question 1

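The sample mean should be used. Assuming that a population is normally distributed with a very large population size compared to the sample size, both the sample mean and the sample median are unbiased estimators of the population mean, but the sample mean has the smaller variance (σ²/n, against approximately 1.57σ²/n for the median) and is therefore the more efficient estimator.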

Question 2

The median gives less weight to extreme observations and, thus, is less sensitive to outliers.

Question 3

The relative efficiency of the median is lower than that of the mean.

Question 4

Unbiasedness and being the most efficient.

Question 5a

Standard error = σ/√n = 20/√64 = 2.5

Question 5b

The margin of error = z_α/2 * (σ/√n) = 1.96*2.5 = 4.9

Question 5c

The 95% confidence interval runs from 75 - 4.9 to 75 + 4.9, that is: [70.1; 79.9].

Question 5d

In the long run, 95% of the intervals found in this manner contain the true value of the population mean.

Question 6

Decrease the population standard deviation, or increase the sample size, or decrease the confidence level.

Question 7

The standard normal distribution (z distribution).

Question 8

Student's t distribution.

Question 9

Standard error = s/√n = 16/√17 = 3.88

Question 10

Use Table 8 (Appendix) to find that the upper critical value is 1.714.

Question 11a

\[ \hat{p} = 261/344 = 0.759 \]

Question 11b

\[ z_{\alpha/2} = z_{0.05} = 1.645\]

Question 11c

\[ 1.645 \sqrt{ \frac{(0.759)(0.241)}{344} } = 0.038 \]

Question 11d

0.759 +/- 0.038 = [0.721; 0.797]

Question 11e

Imagine taking a very large number of independent random samples of size n = 344 from this population, and, calculating a 90% confidence interval for each sample result. Then, the confidence level of the interval implies that in the long run 90% of the intervals found in this manner contain the true value of the population proportion.

Question 12

7.779

Question 13

28.869

Question 14

\[ \chi^{2}_{n-1,1-\alpha/2} = \chi^{2}_{24,0.975} = 12.401 \]
\[ \chi^{2}_{n-1,\alpha/2} = \chi^{2}_{24,0.025} = 39.364 \]

Question 15

\[ LCL = \frac{(n - 1) s^{2}}{\chi^{2}_{n - 1,\alpha/2} } = \frac{(24)(100)}{39.364} = 60.97 \]
\[ UCL = \frac{(n - 1) s^{2}}{\chi^{2}_{n - 1,1 - \alpha/2} } = \frac{(24)(100)}{12.401} = 193.53 \]
Hence, the 95% confidence interval is: [60.97; 193.53]

Question 16a

Nx̄ = (1,395)(320.8) = 447,516. Thus, we estimate a total of 447,516 students to be enrolled in biology courses.

Question 16b

\[ N\hat{\sigma}_{\bar{x}} = \frac{Ns}{\sqrt{n}} \sqrt{ \frac{N - n}{N - 1} } = \frac{(1,395)(149.7)}{\sqrt{400}} \sqrt{ \frac{995}{1,394} } = 8,821.6 \]
Because the sample size is large, we can use the central limit theorem with z_α/2 = 2.58 for a 99% confidence interval. Hence:
\[ N\bar{x} \pm z_{\alpha/2} N \hat{\sigma}_{\bar{x}} \]
\[ 447,516 \pm 2.58(8,821.6) \]
\[ 447,516 \pm 22,760 \]
Thus, the 99% confidence interval runs from 424,756 to 470,276 students.
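The calculation for Question 16 can be checked with a short script. This is only a sketch: the function name `total_interval` is made up for illustration, and the critical value 2.58 is passed in as the rounded table value used in the answer rather than computed.

```python
import math

def total_interval(N, n, xbar, s, z):
    """Point estimate and confidence interval for the population total N*mu.

    Assumes simple random sampling without replacement, so the finite
    population correction (N - n) / (N - 1) is applied to the standard error.
    """
    point = N * xbar
    se_total = N * s / math.sqrt(n) * math.sqrt((N - n) / (N - 1))
    margin = z * se_total
    return point, se_total, point - margin, point + margin

# Values from Question 16; z = 2.58 is the rounded value used in the answer.
point, se_total, lower, upper = total_interval(N=1395, n=400, xbar=320.8, s=149.7, z=2.58)
print(f"point estimate: {point:,.0f}")
print(f"standard error of the total: {se_total:,.1f}")
print(f"99% interval: {lower:,.0f} to {upper:,.0f}")
```

Leaving out the finite population correction would give a standard error of about 10,442 instead of 8,821.6, so the correction matters when the sample is a large share of the population.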

Question 17a

N = 1,395; n = 400.
\[ \hat{p} = \frac{141}{400} = 0.3525 \]
The point estimate of the population proportion P is simply equal to this sample proportion, that is: 0.3525.

Question 17b

\[ \hat{\sigma}^{2}_{\hat{p}} = \frac{\hat{p} (1 - \hat{p})}{n - 1} \left( \frac{N - n}{N - 1} \right) = \frac{(0.3525)(0.6475)}{399} \left( \frac{995}{1,394} \right) = 0.0004083 \]
so
\[ \hat{\sigma}_{\hat{p}} = \sqrt{0.0004083} = 0.0202 \]
For a 90% confidence interval: z_α/2 = 1.645.
\[ ME = z_{\alpha/2} \hat{\sigma}_{\hat{p}} = 1.645(0.0202) \approx 0.0332 \]
Thus, the 90% confidence interval is 0.3525 +/- 0.0332. That is, from 31.93% to 38.57%.
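As a rough check of Question 17, the same steps can be scripted. The helper name `proportion_interval` is hypothetical; the sketch assumes the variance estimator with n - 1 in the denominator and the finite population correction, as in the formula above, and takes z = 1.645 as given.

```python
import math

def proportion_interval(N, n, successes, z):
    """Confidence interval for a population proportion in a finite population."""
    p_hat = successes / n
    # Estimated variance of p_hat, including the finite population correction.
    variance = p_hat * (1 - p_hat) / (n - 1) * (N - n) / (N - 1)
    margin = z * math.sqrt(variance)
    return p_hat, margin, p_hat - margin, p_hat + margin

p_hat, margin, lower, upper = proportion_interval(N=1395, n=400, successes=141, z=1.645)
print(f"p_hat = {p_hat:.4f}, ME = {margin:.4f}")
print(f"90% interval: {lower:.2%} to {upper:.2%}")
```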

Question 18

\[ n = \frac{z^{2}_{\alpha/2} \sigma^{2}}{ME^{2}} = \frac{ (2.576)^{2} (1.8)^{2} }{(0.5)^{2}} \approx 86 \]

Question 19

\[ n = \frac{0.25 (z_{\alpha/2})^{2}}{(ME)^{2}} = \frac{0.25(1.96)^{2}}{(0.06)^{2}} = 266.78 \Rightarrow n = 267 \]

Question 20

\[ n = \frac{0.25 (z_{\alpha/2})^{2}}{(ME)^{2}} = \frac{(0.25)(1.96)^{2}}{(0.03)^{2}} = 1067.11 \Rightarrow n = 1068 \]

Question 21

\[ 1.96 \sigma_{\hat{p}} = 0.04 \]
\[ \sigma_{\hat{p}} = 0.020408 \]
\[ n_{max} = \frac{0.25N}{(N - 1) \sigma^{2}_{\hat{p}} + 0.25 } = \frac{(0.25)(1,395)}{(1,394)(0.020408)^{2} + 0.25} = 419.88 \Rightarrow n = 420 \]
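The sample-size formulas used in Questions 18 to 21 can be collected in three small functions. The function names are invented for this sketch, and each result is rounded up because a fractional observation cannot be sampled.

```python
import math

def n_for_mean(z, sigma, me):
    """Sample size for estimating a mean with known sigma."""
    return math.ceil((z * sigma / me) ** 2)

def n_for_proportion(z, me):
    """Worst-case sample size for a proportion (uses p(1 - p) <= 0.25)."""
    return math.ceil(0.25 * z ** 2 / me ** 2)

def n_for_proportion_finite(z, me, N):
    """Worst-case sample size for a proportion in a finite population of size N."""
    sigma_p = me / z  # required standard error of the sample proportion
    return math.ceil(0.25 * N / ((N - 1) * sigma_p ** 2 + 0.25))

print(n_for_mean(2.576, 1.8, 0.50))               # Question 18
print(n_for_proportion(1.96, 0.06))               # Question 19
print(n_for_proportion(1.96, 0.03))               # Question 20
print(n_for_proportion_finite(1.96, 0.04, 1395))  # Question 21
```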

How to estimate parameters for two populations? - ExamTests 8

Questions

Question 1a

The following information is provided for a dependent random sample from two normally distributed populations:
\[ n = 11 \hspace{3mm} \bar{d} = 28.5 \hspace{3mm} s_{d} = 3.3 \]
Find the 98% confidence interval for the difference between the means of the two populations.

Question 1b

What is the margin of error for a 98% confidence interval for the difference between the means of the two populations?

Question 1c

What do you conclude based on the confidence interval found in question 1a?

Question 2a

Consider the following data:

Before	6	12	8	10	6
After	8	14	9	13	7

What type of dependent sample is depicted here?

Question 2b

What is the sample mean of the differences?

Question 2c

It is given that, for a sample of n = 140 pairs, the mean difference is equal to 7.7 with standard deviation s_d = 43.68901. Compute the 95% confidence interval using the normal approximation.

Question 3a

An educational study is conducted to examine the effectiveness of a mathematics reading program for elementary-school-age children. Each child was given a pretest and a posttest. Higher scores indicate improvement in mathematics. From a very large population, a random sample was drawn. The data obtained from this sample are provided in the table below. What is the mean difference score?

Child	Pretest score	Posttest score
1	40	48
2	36	42
3	32	-
4	38	36
5	-	43
6	33	38
7	35	45

A dash marks a missing score.

Question 3b

What is the standard deviation of the difference scores?

Question 3c

Find the t value corresponding to a 95% confidence interval.

Question 3d

Compute a 95% confidence interval.

Question 3e

Can we conclude, based on this 95% confidence interval, that there is a significant improvement in mathematics?

Question 3f

Compute a 95% confidence interval using the normal approximation.

Question 3g

What do we conclude based on this interval?

Question 4a

A study regarding students' GPA was conducted. From a very large university, independent random samples of 120 students majoring in economics and 90 students majoring in finance were selected. The mean GPA for the random sample of economics majors was found to be 3.08. The mean GPA for the random sample of finance majors was found to be 2.88. From similar past studies, the population standard deviation is 0.42 for the economics majors and 0.64 for the finance majors. Denote the population mean for economics by μ_x and the population mean for finance by μ_y. With which scenario are we dealing here?

A. Population variances known.
B. Population variances unknown, but assumed to be equal.
C. Population variances unknown, and not assumed to be equal.

Question 4b

Compute the 95% confidence interval for the difference score for the information provided in the previous question.

Question 4c

What do we conclude based on this 95% confidence interval (from question 4b)?

Question 5a

Consider the following data:

X	100	125	135	128	140	142	128	137	156	142
Y	95	87	100	75	110	105	85	95

Suppose these are independent samples with unknown variances, but the variances are assumed to be equal. Give n_x, n_y, x̄, ȳ, and the sample variances s²_x and s²_y.

Question 5b

Compute the pooled variance.

Question 5c

What are the degrees of freedom?

Question 5d

Find the t value corresponding to a 95% confidence interval.

Question 5e

Compute a 95% confidence interval.

Question 6a

Assuming equal population variances, determine the number of degrees of freedom for:
n_x = 16; s²_x = 30
n_y = 9; s²_y = 36

Question 6b

Compute the pooled sample variance for the information provided in the previous question.

Question 7a

Assuming equal population variances, determine the number of degrees of freedom for:
n_x = 12; s²_x = 30
n_y = 14; s²_y = 36

Question 7b

Compute the pooled sample variance for the information provided in the previous question.

Question 8a

Assuming equal population variances, determine the number of degrees of freedom for:
n_x = 20; s²_x = 16
n_y = 8; s²_y = 25

Question 8b

Compute the pooled sample variance for the information provided in the previous question.

Question 9

The following information is provided:
\[ n_{x} = 120; \hat{p}_{x} = 0.892 \]
\[ n_{y} = 141; \hat{p}_{y} = 0.518 \]
Compute a 95% confidence interval for the population difference (P_x - P_y).

Question 10

Calculate the margin of error for a 95% confidence interval with:
\[ n_{x} = 300; \hat{p}_{x} = 0.62 \]
\[ n_{y} = 350; \hat{p}_{y} = 0.72 \]

Question 11

Calculate the margin of error for a 95% confidence interval with:
\[ n_{x} = 100; \hat{p}_{x} = 0.44 \]
\[ n_{y} = 150; \hat{p}_{y} = 0.55 \]

Answer indication

Question 1a

\[ \bar{d} \pm t_{n-1,\alpha/2} \frac{s_{d}}{\sqrt{n}} = 28.5 \pm 2.764 \frac{3.3}{\sqrt{11}} = 28.5 \pm 2.7502 \]
The 98% confidence interval is: [25.75; 31.25].

Question 1b

ME = 2.7502

Question 1c

Based on these sample data we conclude that there is sufficient evidence to suggest that there is a significant difference between the two populations.
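The interval in Question 1 can be reproduced with a few lines of code. This is a sketch under stated assumptions: `paired_interval` is an illustrative name, and the critical value t_10,0.01 = 2.764 is taken from the table rather than computed.

```python
import math

def paired_interval(d_bar, s_d, n, t_crit):
    """Confidence interval for the mean difference of n matched pairs."""
    margin = t_crit * s_d / math.sqrt(n)
    return margin, d_bar - margin, d_bar + margin

# Question 1: n = 11 pairs, so the table value has n - 1 = 10 degrees of freedom.
margin, lower, upper = paired_interval(d_bar=28.5, s_d=3.3, n=11, t_crit=2.764)
print(f"ME = {margin:.4f}")
print(f"98% interval: [{lower:.2f}; {upper:.2f}]")
```

Because the whole interval lies above zero, it supports the conclusion given for Question 1c.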

Question 2a

Repeated measurements

Question 2b

\[ \bar{d} = \frac{2 + 2 + 1 + 3 + 1}{5} = 1.8 \]

Question 2c

Using the normal approximation we have t_n-1,a/2 = t_139,0.025 ≅ 1.96.
\[ \bar{d} \pm t_{n-1,\alpha/2} \frac{s_{d}}{\sqrt{n}} \]
\[ 7.7 \pm 1.96 \frac{43.68901}{\sqrt{140}} \]
\[ 7.7 \pm 7.2 \]
This results in the following 95% confidence interval: [0.5; 14.9]

Question 3a

\[ \bar{d} = \frac{8 + 6 -2 + 5 + 10}{5} = 5.4 \]

Question 3b

s_d ≅ 4.56

Question 3c

t_4,0.025 = 2.776

Question 3d

\[ \bar{d} \pm t_{n-1,\alpha/2} \frac{s_{d}}{\sqrt{n}} \]
\[ 5.4 \pm 2.776 \frac{4.56}{\sqrt{5}} \]
\[ 5.4 \pm 5.6620 \]
The 95% confidence interval is: [-0.26; 11.06]

Question 3e

No, because zero lies within the confidence interval. Thus, there is insufficient evidence to conclude that there is a significant difference.

Question 3f

Using the normal approximation, we replace t by z, that is: z = 1.96.
\[ 5.4 \pm 1.96 \frac{4.56}{\sqrt{5}} \]
\[ 5.4 \pm 3.9976 \]
The 95% confidence interval is: [1.40; 9.40]

Question 3g

Based on the 95% confidence interval computed by the normal approximation, we would conclude that there is a significant improvement in the mathematics scores. Note, however, that the sample is very small here (only n = 5 complete pairs). Therefore, the normal approximation is not a valid procedure. It is, however, important to see the difference the choice of distribution can make to the statistical inferences.

Question 4a

A. population variances known.

Question 4b

\[ (\bar{x} - \bar{y}) \pm z_{\alpha/2} \sqrt{\frac{\sigma^{2}_{x}}{n_{x}} + \frac{\sigma^{2}_{y}}{n_{y}}} \]
\[ (3.08 - 2.88) \pm 1.96 \sqrt{ \frac{(0.42)^{2}}{120} + \frac{(0.64)^{2}}{90} } = 0.20 \pm 0.1521 \]
Thus, the 95% interval extends from 0.0479 to 0.3521.

Question 4c

The confidence interval does not contain zero, thus we conclude that there is a significant difference in the mean GPA of students majoring in economics and students majoring in finance. More precisely, the mean GPA of students majoring in economics is higher than the mean GPA of students majoring in finance.

Question 5a

n_x = 10; x̄ = 133.30; s²_x = 218.0111
n_y = 8; ȳ = 94.00; s²_y = 129.4286

Question 5b

\[ s^{2}_{p} = \frac{ (n_{x} - 1)s^{2}_{x} + (n_{y} - 1)s^{2}_{y} }{n_{x} + n_{y} - 2} = \frac{(10 - 1)(218.0111) + (8 - 1)(129.4286) }{10 + 8 - 2} = 179.2563 \]

Question 5c

The degrees of freedom are given by: n_x + n_y - 2 = 10 + 8 - 2 = 16

Question 5d

t_16,0.025 = 2.120

Question 5e

\[ (\bar{x} - \bar{y}) \pm t_{n_{x} + n_{y} - 2, \alpha/2} \sqrt{\frac{s^{2}_{p}}{n_{x}} + \frac{s^{2}_{p}}{n_{y}}} \]
\[ 39.3 \pm (2.12) \sqrt{ \frac{179.2563}{10} + \frac{179.2563}{8} } \]
\[ 39.3 \pm 13.46 \]
Thus, the 95% confidence interval is: [25.84; 52.76]
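The steps of Question 5 (sample statistics, pooled variance, interval) can be sketched from the raw data. The function name `pooled_interval` is an assumption made for this example, and the critical value t_16,0.025 = 2.120 is supplied from the table.

```python
import math
from statistics import mean, variance

x = [100, 125, 135, 128, 140, 142, 128, 137, 156, 142]
y = [95, 87, 100, 75, 110, 105, 85, 95]

def pooled_interval(x, y, t_crit):
    """Interval for mu_x - mu_y with unknown but equal population variances."""
    n_x, n_y = len(x), len(y)
    # statistics.variance uses the n - 1 denominator (sample variance).
    s2_p = ((n_x - 1) * variance(x) + (n_y - 1) * variance(y)) / (n_x + n_y - 2)
    diff = mean(x) - mean(y)
    margin = t_crit * math.sqrt(s2_p / n_x + s2_p / n_y)
    return s2_p, diff, diff - margin, diff + margin

s2_p, diff, lower, upper = pooled_interval(x, y, t_crit=2.120)
print(f"pooled variance = {s2_p:.4f}")
print(f"difference in means = {diff:.1f}")
print(f"95% interval: [{lower:.2f}; {upper:.2f}]")
```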

Question 6a

df = n_x + n_y - 2 = 16 + 9 - 2 = 23

Question 6b

\[ s^{2}_{p} = \frac{ (n_{x} - 1)s^{2}_{x} + (n_{y} - 1)s^{2}_{y} }{n_{x} + n_{y} - 2} \]
\[ s^{2}_{p} = \frac{ (16-1)30 + (9 - 1)36}{16 + 9 - 2} = \frac{738}{23} = 32.09 \]

Question 7a

df = n_x + n_y - 2 = 12 + 14 - 2 = 24

Question 7b

\[ s^{2}_{p} = \frac{ (12-1)30 + (14 - 1)36}{12 + 14 - 2} = \frac{798}{24} = 33.25 \]

Question 8a

df = n_x + n_y - 2 = 20 + 8 - 2 = 26

Question 8b

\[ s^{2}_{p} = \frac{ (20-1)16 + (8 - 1)25}{20 + 8 - 2} = \frac{479}{26} = 18.42 \]

Question 9

\[ (\hat{p}_{x} - \hat{p}_{y}) \pm z_{\alpha/2} \sqrt{ \frac{ \hat{p}_{x} (1 - \hat{p}_{x} ) }{n_{x}} + \frac{ \hat{p}_{y} (1 - \hat{p}_{y} ) }{n_{y}} } \]
\[ (0.892 - 0.518) \pm 1.96 \sqrt{ \frac{(0.892)(0.108)}{120} + \frac{(0.518)(0.482)}{141} } = 0.374 \pm 0.0994 \]
From this, it follows that the 95% confidence interval runs from 0.275 to 0.473.
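The margin of error for a difference between two proportions is easy to get wrong by hand, so a small check may help. This is an illustrative sketch: `two_proportion_interval` is a made-up name, and z = 1.96 is passed in as the table value.

```python
import math

def two_proportion_interval(p_x, n_x, p_y, n_y, z):
    """Large-sample confidence interval for P_x - P_y (independent samples)."""
    se = math.sqrt(p_x * (1 - p_x) / n_x + p_y * (1 - p_y) / n_y)
    margin = z * se
    diff = p_x - p_y
    return diff, margin, diff - margin, diff + margin

# Data from Question 9.
diff, margin, lower, upper = two_proportion_interval(0.892, 120, 0.518, 141, z=1.96)
print(f"difference = {diff:.3f}, ME = {margin:.4f}")
print(f"95% interval: {lower:.4f} to {upper:.4f}")
```

The same function returns the margin of error asked for in Questions 10 and 11 when called with those sample sizes and proportions.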

Question 10

\[ ME = z_{\alpha/2} \sqrt{ \frac{ \hat{p}_{x} (1 - \hat{p}_{x} ) }{n_{x}} + \frac{ \hat{p}_{y} (1 - \hat{p}_{y} ) }{n_{y}} } \]
\[ 1.96 \sqrt{ \frac{(0.62)(0.38)}{300} + \frac{(0.72)(0.28)}{350} } \]
ME = 0.0723

Question 11

\[ ME = z_{\alpha/2} \sqrt{ \frac{ \hat{p}_{x} (1 - \hat{p}_{x} ) }{n_{x}} + \frac{ \hat{p}_{y} (1 - \hat{p}_{y} ) }{n_{y}} } \]
\[ 1.96 \sqrt{ \frac{(0.44)(0.56)}{100} + \frac{(0.55)(0.45)}{150} } \]
ME = 0.1257

How to develop hypothesis testing procedures for a single population? - ExamTests 9

Questions

Question 1a

Kees wants to use the results of a random sample market survey to seek strong evidence that his brand of cereal has more than 20% of the total market. Formulate the null hypothesis and alternative hypothesis using P as the population proportion.

Question 1b

Is the alternative hypothesis you formulated a one-sided or two-sided composite alternative hypothesis?

Question 2

A car factory has proposed a process to monitor the diameter of pistons on a regular schedule. They want to test whether the diameter is equal to 3800. Formulate the null hypothesis and alternative hypothesis.

Question 3

What is a type I error?

Question 4

What is a type II error?

Question 5a

A random sample is obtained from a population with variance σ² = 625. The sample mean is computed. Test the null hypothesis H₀: μ = 100 versus the alternative hypothesis H₁: μ > 100 with α = 0.05. Compute the critical value x̅_c and state your decision rule regarding a sample size of n = 25.

Question 5b

Do the same for n = 16.

Question 5c

Do the same for n = 44.

Question 5d

Do the same for n = 32.

Question 6a

A random sample of n = 25 is obtained from a population with known variance. The sample mean is computed. Test the null hypothesis: H₀: μ = 120 versus the alternative hypothesis H₁: μ > 120 with α = 0.10. Compute the critical value x̅_c and state your decision rule regarding the population variance σ² = 196.

Question 6b

Do the same for σ² = 625.

Question 6c

Do the same for σ² = 900.

Question 6d

Do the same for σ² = 500.

Question 7

Test the hypotheses H₀: μ = 100 and H₁: μ > 100, using a random sample of n = 31, a probability of type I error equal to 0.05 and the following sample statistics: x̅ = 108; s = 20.

Question 8

Test the hypotheses H₀: μ = 100 and H₁: μ > 100, using a random sample of n = 31, a probability of type I error equal to 0.05 and the following sample statistics: x̅ = 104; s = 10.

Question 9

Test the hypotheses H₀: μ = 100 and H₁: μ > 100, using a random sample of n = 31, a probability of type I error equal to 0.05 and the following sample statistics: x̅ = 96; s = 10.

Question 10

Mention four conditions that will raise the power function.

Question 11

Suppose, we find the probability of a type II error involved in failing to reject the null hypothesis when the true proportion is 0.56 to be β = 0.31 using a significance level of α = 0.05. What is the power?

Question 12

Suppose, we find the probability of a type II error involved in failing to reject the null hypothesis when the true proportion is 0.66 to be β = 0.25 using a significance level of α = 0.10. What is the power?

Question 13a

A random sample of 20 products is obtained, and the weight of each product is measured. The sample variance is computed to be 6.62. The hypothesis is tested that the variance of the product weights does not exceed the standard of 4. Formulate the null hypothesis and alternative hypothesis.

Question 13b

What are the degrees of freedom?

Question 13c

What is the critical value?

Question 13d

What is the test statistic?

Question 13e

Based on these sample data, can we reject the null hypothesis?

Question 14a

Suppose we are testing the following hypotheses:
H₀: μ ≤ 100
H₁: μ > 100
using a random sample of n = 49 and a probability of type I error equal to 0.05.
Suppose the population variance is unknown. What distribution should you use?

Question 14b

Test the hypotheses using the following sample statistics: x̅ = 108; s = 20

Question 14c

Test the hypotheses using the following sample statistics: x̅ = 104; s = 10

Question 14d

Test the hypotheses using the following sample statistics: x̅ = 96; s = 10

Question 14e

Test the hypotheses using the following sample statistics: x̅ = 95; s = 8

Answer indication

Question 1a

H₀: P = 0.20
H₁: P > 0.20

Question 1b

A one-sided composite alternative hypothesis.

Question 2

H₀: μ = 3800
H₁: μ ≠ 3800

Question 3

A type I error refers to rejecting the null hypothesis when the null hypothesis is true.

Question 4

A type II error refers to failing to reject the null hypothesis when the null hypothesis is false.

Question 5a

For a one-sided hypothesis test with significance level α = 0.05, the value of z_α = 1.645 from the standard normal table. The variance is 625, thus the standard deviation is √625 = 25.
\[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 100 + 1.645 \times (25 / \sqrt{25}) = 108.23 \]
The decision rule is: reject H₀ if x̅ > 108.23

Question 5b

\[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 100 + 1.645 \times (25 / \sqrt{16}) = 110.28 \]
The decision rule is: reject H₀ if x̅ > 110.28

Question 5c

\[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 100 + 1.645 \times (25 / \sqrt{44}) = 106.20 \]
The decision rule is: reject H₀ if x̅ > 106.20

Question 5d

\[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 100 + 1.645 \times (25 / \sqrt{32}) = 107.27 \]
The decision rule is: reject H₀ if x̅ > 107.27

Question 6a

For a one-sided hypothesis test with significance level α = 0.10, the value of z_α = 1.282 from the standard normal table. The variance is 196, thus the standard deviation is √196 = 14.
\[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 120 + 1.282 \times (14 / \sqrt{25}) = 123.59 \]
The decision rule is: reject H₀ if x̅ > 123.59

Question 6b

\[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 120 + 1.282 \times (\sqrt{625} / \sqrt{25}) = 126.41 \]
The decision rule is: reject H₀ if x̅ > 126.41

Question 6c

\[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 120 + 1.282 \times (\sqrt{900} / \sqrt{25}) = 127.69 \]
The decision rule is: reject H₀ if x̅ > 127.69

Question 6d

\[ \bar{x}_{c} = \mu_{0} + z_{\alpha} \sigma/\sqrt{n} = 120 + 1.282 \times (\sqrt{500} / \sqrt{25}) = 125.73 \]
The decision rule is: reject H₀ if x̅ > 125.73
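The critical value for an upper-tail test with known variance can be computed directly. The sketch below uses `NormalDist` from Python's standard library to obtain z_α instead of reading it from the table; the function name `upper_critical_mean` is illustrative only.

```python
import math
from statistics import NormalDist

def upper_critical_mean(mu0, sigma2, n, alpha):
    """Critical sample mean for H0: mu = mu0 against H1: mu > mu0, sigma known."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)  # upper-tail standard normal quantile
    return mu0 + z_alpha * math.sqrt(sigma2) / math.sqrt(n)

# Question 6 setup: mu0 = 120, n = 25, alpha = 0.10, with several variances.
for sigma2 in (196, 900, 500):
    x_c = upper_critical_mean(120, sigma2, 25, 0.10)
    print(f"sigma^2 = {sigma2}: reject H0 if the sample mean exceeds {x_c:.2f}")
```

A larger variance pushes the critical value further from μ₀, while a larger sample size pulls it closer.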

Question 7

t_30,0.05 = 1.697
\[ t = \frac{\bar{x} - \mu_{0}}{s / \sqrt{n}} = \frac{108 - 100}{20 / \sqrt{31}} = 2.23 \]
Thus, t > t_30,0.05. Based on this result, we reject the null hypothesis in favor of the alternative hypothesis.

Question 8

t_30,0.05 = 1.697
\[ t = \frac{\bar{x} - \mu_{0}}{s / \sqrt{n}} = \frac{104 - 100}{10 / \sqrt{31}} = 2.23 \]
Thus, t > t_30,0.05. Based on this result, we reject the null hypothesis in favor of the alternative hypothesis.
The t value is the same as in the previous question, because both the numerator and the denominator are half of their previous values, which leaves the ratio unchanged.

Question 9

t_30,0.05 = 1.697
\[ t = \frac{\bar{x} - \mu_{0}}{s / \sqrt{n}} = \frac{96 - 100}{10 / \sqrt{31}} = -2.23 \]
Thus, t < t_30,0.05. Because we are testing a one-sided alternative hypothesis with H₁: μ > μ₀, we cannot reject the null hypothesis (be aware that the sample mean is lower than the hypothesized mean, rather than higher).
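Questions 7 to 9 apply the same upper-tail t test to three samples, which a short loop makes explicit. The function names are hypothetical, and the critical value t_30,0.05 = 1.697 is hard-coded from the table because the standard library has no t distribution.

```python
import math

def one_sample_t(xbar, mu0, s, n):
    """t statistic for a test about a single mean with unknown variance."""
    return (xbar - mu0) / (s / math.sqrt(n))

def reject_upper_tail(t_stat, t_crit):
    """Decision rule for H1: mu > mu0 (reject only for large positive t)."""
    return t_stat > t_crit

T_CRIT = 1.697  # t_30,0.05 from the table, as quoted in the answers

for xbar, s in [(108, 20), (104, 10), (96, 10)]:
    t = one_sample_t(xbar, mu0=100, s=s, n=31)
    print(f"xbar = {xbar}, s = {s}: t = {t:.2f}, reject H0: {reject_upper_tail(t, T_CRIT)}")
```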

Question 10

(1) the true mean is farther from the hypothesized mean μ₀; (2) the significance level is higher; (3) the population variance is lower; (4) the sample size is larger.

Question 11

Power = 1 - β = 1 - 0.31 = 0.69

Question 12

Power = 1 - β = 1 - 0.25 = 0.75

Question 13a

H₀: σ² ≤ σ²₀ = 4
H₁: σ² > 4

Question 13b

df = n - 1 = 20 - 1 = 19

Question 13c

For this test with a significance level of α = 0.05 and 19 degrees of freedom, the critical value of the chi-square variable is 30.144 (see Appendix Table 7 of the book).

Question 13d

\[ \frac{(n - 1)s^{2}}{\sigma^{2}_{0}} = \frac{(20 - 1)(6.62)}{4} = 31.445 \]

Question 13e

31.445 > 30.144. Therefore, we can reject the null hypothesis and conclude that the variability of the weight of the products exceeds the standard.

Question 14a

Student's t distribution

Question 14b

The critical t value is: t_c = t_48,0.05 = 1.677
\[ t = \frac{108 - 100}{20 / \sqrt{49}} = 2.8 \]
t > t_c, therefore we can reject the null hypothesis.

Question 14c

\[ t = \frac{104 - 100}{10 / \sqrt{49}} = 2.8 \]
t > t_c, therefore we can reject the null hypothesis.

Question 14d

\[ t = \frac{96 - 100}{10 / \sqrt{49}} = -2.8 \]
t < t_c, yet we are testing t > t_c. Therefore we cannot reject the null hypothesis ("wrong side").

Question 14e

\[ t = \frac{95 - 100}{8 / \sqrt{49}} = -4.38 \]
t < t_c, yet we are testing t > t_c. Therefore we cannot reject the null hypothesis ("wrong side").

What test procedures are there for testing the difference between two populations? - ExamTests 10

Questions

Question 1a

A researcher wants to determine whether two different production processes have different mean numbers of products produced per hour. The mean of production process 1 is defined as μ₁ and the mean of production process 2 is defined as μ₂. The null and alternative hypotheses are as follows: H₀: μ₁ – μ₂ = 0 and H₁: μ₁ – μ₂ > 0. From the populations, a random sample is drawn of 25 matched pairs. The sample means are respectively 50 and 60 for populations 1 and 2. Give the decision rule using a probability of type I error α = 0.05.

Question 1b

Can you reject the null hypothesis if the sample standard deviation of the difference is 20?

Question 1c

Can you reject the null hypothesis using a probability of type I error α = 0.05 if the sample standard deviation of the difference is 30?

Question 1d

Can you reject the null hypothesis using a probability of type I error α = 0.05 if the sample standard deviation of the difference is 15?

Question 1e

Can you reject the null hypothesis using a probability of type I error α = 0.05 if the sample standard deviation of the difference is 40?

Question 2a

A researcher wants to determine whether two different production processes have different mean numbers of products produced per hour. The mean of production process 1 is defined as μ₁ and the mean of production process 2 is defined as μ₂. The null and alternative hypotheses are as follows: H₀: μ₁ – μ₂ = 0 and H₁: μ₁ – μ₂ < 0. From the populations, a random sample is drawn of 25 matched pairs. The standard deviation of the difference between the sample means is found to be 25. Give the decision rule using a probability of type I error α = 0.05.

Question 2b

Can you reject the null hypothesis using a probability of type I error α = 0.05 if the sample means are respectively 56 and 50 for populations 1 and 2?

Question 2c

Can you reject the null hypothesis using a probability of type I error α = 0.05 if the sample means are respectively 59 and 50 for populations 1 and 2?

Question 2d

Can you reject the null hypothesis using a probability of type I error α = 0.05 if the sample means are respectively 56 and 48 for populations 1 and 2?

Question 2e

Can you reject the null hypothesis using a probability of type I error α = 0.05 if the sample means are respectively 54 and 50 for populations 1 and 2?

Question 3a

A researcher wants to conduct a hypothesis test for the difference in means between two populations with independent samples. The following information is provided:
n_x = 25; x̄ = 115; σ²_x = 625
n_y = 25; ȳ = 100; σ²_y = 400
Compute the test statistic.

Question 3b

The researcher decides to test at a significance level of α = 0.05. Determine the critical z value.

Question 3c

Compare the critical z value to the test statistic. Can the researcher reject the null hypothesis?

Question 4

How large should the sample size be in order to obtain a good approximation if we replace the population variances with the sample variances?

Question 5a

Use the following information:
n_x = 25; x̄ = 1078; s_x = 633
n_y = 25; ȳ = 908.2; s_y = 469.8
We are interested in testing the difference in population means between X and Y. The alternative hypothesis states that the mean of population X is larger than the mean of population Y. For this hypothesis test, we are using a significance level of α = 0.05. Note that the population variances are unknown and that the sample standard deviations are given.
Formulate the null hypothesis and alternative hypothesis.

Question 5b

Compute the pooled variance estimate.

Question 5c

What are the degrees of freedom?

Question 5d

What is the critical value of t?

Question 5e

Compute the test statistic.

Question 5f

Provide the decision rule for this hypothesis test.

Question 6

Can the null hypothesis be rejected?

Question 7

How large should the sample size be in order to be able to use the standard normal distribution for testing the equality of two population proportions?

Question 8a

Consider the following information:
n_x = 270; p̂_x = 0.185
n_y = 203; p̂_y = 0.399
Compute the estimate of the common proportion, P₀, under the null hypothesis.

Question 8b

Compute the test statistic.

Question 8c

Suppose we are testing with the alternative hypothesis: H₁: P_x < P_y. For this test, we are using a significance level of α = 0.05. What is the critical value?

Question 8d

Formulate the decision rule.

Question 8e

Can we reject the null hypothesis?

Question 9a

Consider the following information:
n_x = 17; s²_x = 123.35
n_y = 11; s²_y = 8.02
What are the degrees of freedom for the F distribution?

Question 9b

Given a significance level of α = 0.02, what is the critical value of F?

Question 9c

Compute the test statistic. Can the null hypothesis be rejected?

Answer indication

Question 1a

t_n-1,a = t_24,0.05 = 1.711
The general decision rule here is: reject H₀ if t > t_24,0.05 = 1.711.

Question 1b

\[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{10}{20 / \sqrt{25}} = 2.5 \]
t > t_24,0.05 and, thus, we can reject the null hypothesis.

Question 1c

\[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{10}{30 / \sqrt{25}} = 1.67 \]
t < t_24,0.05 and, thus, we cannot reject the null hypothesis.

Question 1d

\[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{10}{15 / \sqrt{25}} = 3.33 \]
t > t_24,0.05 and, thus, we can reject the null hypothesis.

Question 1e

\[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{10}{40 / \sqrt{25}} = 1.25 \]
t < t_24,0.05 and, thus, we cannot reject the null hypothesis.

Question 2a

-t_n-1,a = -t_24,0.05 = -1.711
The general decision rule here is: reject H₀ if t < -t_24,0.05 = -1.711.

Question 2b

\[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{-6}{25 / \sqrt{25}} = -1.2 \]
t > -t_24,0.05 and, thus, we cannot reject the null hypothesis.

Question 2c

\[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{-9}{25 / \sqrt{25}} = -1.8 \]
t < -t_24,0.05 and, thus, we can reject the null hypothesis.

Question 2d

\[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{-8}{25 / \sqrt{25}} = -1.6 \]
t > -t_24,0.05 and, thus, we cannot reject the null hypothesis.

Question 2e

\[ t = \frac{\bar{d}}{s_{d} / \sqrt{n} } = \frac{-4}{25 / \sqrt{25}} = -0.8 \]
t > -t_24,0.05 and, thus, we cannot reject the null hypothesis.

Question 3a

\[ z = \frac{115 - 100}{\sqrt{\frac{625}{25} + \frac{400}{25}}} = 2.34 \]

Question 3b

z_0.05 = 1.645

Question 3c

z > z_0.05 thus the null hypothesis can be rejected.

Question 4

The sample size should be larger than 100.

Question 5a

H₀: μ_x – μ_y = 0
H₁: μ_x – μ_y > 0

Question 5b

\[ s^{2}_{p} = \frac{ (25 - 1)(633)^{2} + (25 - 1)(469.8)^{2} }{25 + 25 - 2} = 310,700 \]

Question 5c

df = 25 + 25 – 2 = 48

Question 5d

t_48,0.05 = 1.677

Question 5e

\[ t = \frac{1078 - 908.2}{ \sqrt{ \frac{310,700}{25} + \frac{310,700}{25}}} = 1.08 \]

Question 5f

Reject H₀ if t > t_48,0.05 = 1.677

Question 6

No, the test statistic is smaller than the critical value. Thus, there is not sufficient evidence to reject the null hypothesis.

Question 7

nP₀(1 – P₀) > 5

Question 8a

\[ \hat{p}_{0} = \frac{n_{x} \hat{p}_{x} + n_{y} \hat{p}_{y}}{n_{x} + n_{y}} = \frac{(270)(0.185) + (203)(0.399)}{270 + 203} = 0.277 \]

Question 8b

\[ z = \frac{0.185 - 0.399}{ \sqrt{ \frac{ (0.277)(1 - 0.277) }{270} + \frac{ (0.277)(1 - 0.277) }{203} } } = -5.15 \]

Question 8c

–z_0.05 = -1.645

Question 8d

Reject H₀ if z < –z_0.05 = -1.645

Question 8e

Yes, we can reject the null hypothesis that there is no difference in proportions between these two populations, because -5.15 < -1.645.
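The pooled estimate and test statistic from Question 8 can be verified as follows. This is a minimal sketch with an invented function name, and the critical value -1.645 is taken from the answer to Question 8c rather than computed.

```python
import math

def two_proportion_z(p_x, n_x, p_y, n_y):
    """Pooled proportion and z statistic for H0: P_x = P_y (large samples)."""
    p0 = (n_x * p_x + n_y * p_y) / (n_x + n_y)
    se = math.sqrt(p0 * (1 - p0) / n_x + p0 * (1 - p0) / n_y)
    return p0, (p_x - p_y) / se

p0, z = two_proportion_z(0.185, 270, 0.399, 203)
print(f"pooled proportion = {p0:.3f}, z = {z:.2f}")
print("reject H0 at alpha = 0.05 (lower tail):", z < -1.645)
```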

Question 9a

df_numerator = (n_x - 1) = 17 – 1 = 16 and df_denominator = (n_y - 1) = 11 – 1 = 10.

Question 9b

From Appendix Table 9 (in the book) it follows that: F_16,10,0.01 = 4.520

Question 9c

\[ F = \frac{s^{2}_{x}}{s^{2}_{y}} = \frac{123.35}{8.02} = 15.380 \]
The test statistic (F = 15.380) clearly exceeds the critical value (4.520). Hence, the null hypothesis can be rejected in favor of the alternative hypothesis.


How to conduct a simple regression? - ExamTests 11

Questions

Question 1a

Suppose we are interested in the relationship between the number of workers (denoted by X) and the number of tables produced per hour (Y). A sample of 10 workers is provided. The following descriptive statistics are obtained:
\[Cov(x,y) = 106.93 \hspace{5mm} s^{2}_{x} = 42.01 \hspace{5mm} \bar{y} = 41.2 \hspace{5mm} \bar{x} = 21.3 \]
Compute the slope of the sample regression.

Question 1b

Compute the y-intercept for the sample regression.

Question 1c

What is the equation of the regression line?

Question 1d

If management decides to employ 25 workers, how many tables would we expect to be produced?

Question 2

The following regression equation is given: Y = 559 + 0.3815X.
What is the expected value of Y for X = 55,000?

Question 3a

Use the following regression equation:
Y = 100 + 21X
Interpret the slope of the regression line.

Question 3b

What is the change in Y when X changes by +5?

Question 3c

What is the change in Y when X changes by -7?

Question 3d

What is the predicted value of Y when X = 14?

Question 3e

What is the predicted value of Y when X = 27?

Question 3f

Does this equation prove that a change in X causes a change in Y?

Question 4a

Given the regression equation:
Y = 107 + 10X
What is the change in Y when X changes by +2?

Question 4b

What is the change in Y when X changes by -4?

Question 4c

What is the predicted value of Y when X = 15?

Question 4d

What is the predicted value of Y when X = 22?

Question 5

Compute the coefficients for a least squares regression equation and write the equation, given the following sample statistics: x̅ = 10; ȳ = 50; s_x = 80; s_y = 75; r_xy = 0.4; n = 60.

Question 6

Compute the coefficients for a least squares regression equation and write the equation, given the following sample statistics: x̅ = 60; ȳ = 50; s_x = 80; s_y = 65; r_xy = 0.7; n = 60.

Question 7

Compute the coefficients for a least squares regression equation and write the equation, given the following sample statistics: x̅ = 90; ȳ = 100; s_x = 60; s_y = 70; r_xy = 0.4; n = 60.

Question 8

The following information is provided: SSE = 17.89 and SST = 68.22. What is the percent explained variability?

Question 9

What absolute value of the Student's t statistic indicates a relationship between two variables when we use a two-tailed test with α= 0.05 and n > 60?

Question 10a

Given the simple regression model
\[ Y = \beta_{0} + \beta_{1}X \]
and the regression results that follow, test the null hypothesis that the slope coefficient is zero versus the alternative hypothesis that the slope coefficient differs from zero using probability of type I error rate equal to 0.005 and determine the two-sided 99% confidence interval. The following sample statistics are provided: n = 22; b₁ = 0.3815; s_b1 = 0.0253.

Question 10b

Consider your answer on the previous question. Based on this result, what do you conclude about the slope coefficient?

Question 11

Which four factors result in narrower prediction intervals?

Question 12a

Suppose we want to test H₀: ρ = 0 against H₁: ρ > 0 using the sample information: n = 49 and r = 0.43.
What is the test statistic?

Question 12b

What is the critical value if we are testing at a 0.5% significance level?

Question 12c

What do we conclude about the population correlation?

Question 13

Suppose we have the following information: n = 25. Using the rule of thumb for testing the hypothesis that the population correlation is zero, what should be the absolute value of the sample correlation that has to be exceeded in order to reject this null hypothesis?

Question 14

Suppose we have the following information: n = 64. Using the rule of thumb for testing the hypothesis that the population correlation is zero, what should be the absolute value of the sample correlation that has to be exceeded in order to reject this null hypothesis?

Question 15

Which two factors can influence the estimated regression equation?

Question 16

Points with a high leverage will have a .... standard error of the residual.

Answer indication

Question 1a

\[ b_{1} = \frac{Cov(x,y)}{s^{2}_{x}} = r \frac{s_{y}}{s_{x}} = \frac{106.93}{42.01} = 2.545 \]

Question 1b

\[ b_{0} = \bar{y} - b_{1}\bar{x} = 41.2 - 2.545(21.3) = -13.02 \]

Question 1c

\[ \hat{y} = b_{0} + b_{1}x = -13.02 + 2.545x \]

Question 1d

\[ \hat{y} = -13.02 + 2.545(25) = 50.605 \]

Question 2

Y = 559 + 0.3815*55,000 = 21,542

Question 3a

For every one-unit increase in X, the predicted value of Y increases by 21.

Question 3b

If X changes by +5, Y changes by (21)(5) = 105

Question 3c

If X changes by -7, Y changes by (21)(-7) = -147

Question 3d

Y = 100 + (21)(14) = 394

Question 3e

Y = 100 + (21)(27) = 667

Question 3f

No, regression results summarize the information contained in the data. They do not prove causation.

Question 4a

If X changes by +2, Y changes by (10)(2) = 20

Question 4b

If X changes by -4, Y changes by (10)(-4) = -40

Question 4c

Y = 107 + (10)(15) = 257

Question 4d

Y = 107 + (10)(22) = 327

Question 5

\[ b_{1} = r\frac{s_{Y}}{s_{X}} = 0.4 \frac{75}{80} = 0.375 \]
\[ b_{0} = \bar{y} - b_{1}\bar{x} = 50 - 0.375(10) = 46.25 \]
\[ \hat{y}_{i} = 46.25 + 0.375x_{i} \]

Question 6

\[ b_{1} = r\frac{s_{Y}}{s_{X}} = 0.7 \frac{65}{80} = 0.569 \]
\[ b_{0} = \bar{y} - b_{1}\bar{x} = 50 - 0.56875(60) = 15.875 \]
\[ \hat{y}_{i} = 15.875 + 0.569x_{i} \]

Question 7

\[ b_{1} = r\frac{s_{Y}}{s_{X}} = 0.4 \frac{70}{60} = 0.467 \]
\[ b_{0} = \bar{y} - b_{1}\bar{x} = 100 - 0.467(90) = 58 \]
\[ \hat{y}_{i} = 58 + 0.467x_{i} \]
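
Questions 5 to 7 repeat the same two steps, so the arithmetic can be checked with a short script. This is only a minimal sketch in Python; the function name `ols_from_summary` is made up for illustration and does not come from the book.

```python
def ols_from_summary(x_bar, y_bar, s_x, s_y, r_xy):
    """Return (b0, b1) of the least squares line from sample summary statistics."""
    b1 = r_xy * s_y / s_x      # slope: b1 = r * s_y / s_x
    b0 = y_bar - b1 * x_bar    # intercept: b0 = y_bar - b1 * x_bar
    return b0, b1

# Inputs of Question 5
b0, b1 = ols_from_summary(x_bar=10, y_bar=50, s_x=80, s_y=75, r_xy=0.4)
print(f"y_hat = {b0:.3f} + {b1:.3f} x")
```

Note that the sample size n does not enter these two formulas; it only matters later, for inference on the coefficients.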

Question 8

\[ R^{2} = 1 - \frac{SSE}{SST} = 1 - \frac{17.89}{68.22} = 0.738 \]
Thus, 73.8% of the variability is explained by the regression model.

Question 9

According to the rule of thumb, the absolute value of the Student's t statistic should be greater than 2.0 to indicate that there is a relationship.

Question 10a

For a 99% confidence interval we have 1 - α = 0.99, so α/2 = 0.005, and n - 2 = 22 - 2 = 20 degrees of freedom. Hence, from Appendix Table 8 (see book) it follows that:
\[ t_{n-2,\alpha/2} = t_{20,0.005} = 2.845 \]
Therefore, the 99% confidence interval is:
\[ 0.3815 - (2.845)(0.0253) < \beta_{1} < 0.3815 + (2.845)(0.0253) \]
\[ 0.3095 < \beta_{1} < 0.4535 \]
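
The interval above is just b₁ plus or minus the critical t value times the standard error. A minimal sketch of that calculation follows; the helper name `slope_confidence_interval` is hypothetical, and the critical value is passed in by hand from the t table rather than computed.

```python
def slope_confidence_interval(b1, s_b1, t_crit):
    """Two-sided confidence interval b1 +/- t_crit * s_b1 for the slope."""
    margin = t_crit * s_b1
    return b1 - margin, b1 + margin

# Values of Question 10a; 2.845 is the tabulated t value for 20 df and alpha/2 = 0.005
lower, upper = slope_confidence_interval(b1=0.3815, s_b1=0.0253, t_crit=2.845)
print(f"{lower:.4f} < beta_1 < {upper:.4f}")
```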

Question 10b

The confidence interval does not contain zero, therefore we can reject the null hypothesis and conclude that the slope coefficient is not equal to zero.

Question 11

A larger sample size (n).
A smaller value of s²_e.
A large dispersion of the observations of the independent variable.
Smaller values of the quantity (x_n+1 - x̅)².

Question 12a

\[ t = \frac{0.43 \sqrt{(49 - 2)}}{\sqrt{1 - (0.43)^{2}}} = 3.265 \]

Question 12b

Since there are (n - 2) = 47 degrees of freedom, it follows from Appendix Table 8 that t_47,0.005 = 2.704

Question 12c

t_47,0.005 = 2.704 < t. Therefore, we can reject the null hypothesis. There is strong evidence of a positive linear relationship between the two variables. Note, however, that we cannot conclude from this result that one variable caused the other, but only that they are related.

Question 13

\[ |r| > \frac{2}{\sqrt{n}} = \frac{2}{\sqrt{25}} = 0.4 \]

Question 14

\[ |r| > \frac{2}{\sqrt{n}} = \frac{2}{\sqrt{64}} = 0.25 \]
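
The correlation test of Question 12a and the rule of thumb of Questions 13 and 14 can be sketched in a few lines of Python. The function names are illustrative assumptions; the value r = 0.43 is the one used in the answer to Question 12a.

```python
import math

def correlation_t_statistic(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

def rule_of_thumb_threshold(n):
    """|r| must exceed 2 / sqrt(n) to reject H0: rho = 0 (rule of thumb)."""
    return 2 / math.sqrt(n)

print(round(correlation_t_statistic(r=0.43, n=49), 3))
print(rule_of_thumb_threshold(25), rule_of_thumb_threshold(64))
```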

Question 15

Points with a high leverage and outliers.

Question 16

Smaller.

How to conduct a multiple regression? - ExamTests 12

Questions

Question 1a

\[ \hat{y} = 12 + 5x_{1} + 6x_{2} + 2x_{3} \]
Compute the expected value of y when x₁ = 11, x₂ = 24, and x₃ = 27.

Question 1b

Compute the expected value of y when x₁ = 31, x₂ = 20, and x₃ = 17.

Question 1c

Compute the expected value of y when x₁ = 32, x₂ = 29, and x₃ = 13.

Question 1d

Compute the expected value of y when x₁ = 30, x₂ = 26, and x₃ = 29.

Question 2a

\[ \hat{y} = 10 + 5x_{1} + 4x_{2} + 2x_{3} \]
Compute the expected value of y when x₁ = 20, x₂ = 11, and x₃ = 10.

Question 2b

Compute the expected value of y when x₁ = 15, x₂ = 14, and x₃ = 20.

Question 2c

Compute the expected value of y when x₁ = 35, x₂ = 19, and x₃ = 25.

Question 2d

Compute the expected value of y when x₁ = 10, x₂ = 17, and x₃ = 30.

Question 3a

\[ \hat{y} = 10 - 2x_{1} - 14x_{2} + 6x_{3} \]
What is the change in y when x₁ increases by 4?

Question 3b

What is the change in y when x₃ increases by 1?

Question 3c

What is the change in y when x₂ increases by 2?

Question 4

What is the fifth assumption of a multiple linear regression model?

Question 5

Compute the coefficient b₁ for the regression model
\[ \hat{y}_{i} = b_{0} + b_{1}x_{1i} + b_{2}x_{2i} \]
given the following summary statistics:
r_x1y = 0.80, r_x2y = 0.30, r_x1x2 = 0.90, s_x1 = 500, s_x2 = 400, s_y = 100

Question 6

Compute the coefficient b₂ for the regression model (using the regression model and summary statistics of question 5).

Question 7

The following data are available: n = 25; K = 2; SSE = 0.0625; SST = 0.4640.
Compute the adjusted coefficient of determination.

Question 8

When is the adjusted coefficient of determination preferred over the standard coefficient of determination?

Question 9

How is the coefficient of multiple correlation related to the multiple coefficient of determination?

Question 10a

b₁ = 0.2372; s_b1 = 0.0556; b₂ = -0.000249; s_b2 = 0.00003205.
What is the critical t statistic for a two-tailed hypothesis test with a 99% confidence interval?

Question 10b

Provide the 99% confidence interval for β₁.

Question 10c

Provide the 99% confidence interval for β₂.

Question 11a

A researcher is testing the influence of four independent variables on a certain dependent variable using multiple regression (n = 88). He finds that, for the complete model with four predictor variables, SSE = 1,149.14. For a multiple regression model with only two of the four predictor variables, SSE = 1,426.93. The variance estimator is s²_e = 13.52. Compute the F statistic.

Question 11b

How many degrees of freedom does the F statistic have?

Question 11c

What is the critical value for F with a significance level of 0.01?

Question 11d

What is a dummy variable?

Question 12

Formulate the null hypothesis and the alternative hypothesis for testing the slope coefficient in the event of dummy variables.

Question 13

What is the model constant when the dummy variable equals 1 in the following equation, where x₁ is a continuous variable and x₂ is a dummy variable?
\[ \hat{y} = 9 + 6x_{1} + 9x_{2} \]

Question 14

What is the model constant when the dummy variable equals 1 in the following equation, where x₁ is a continuous variable and x₂ is a dummy variable?
\[ \hat{y} = 7 + 4x_{1} + 2x_{2} \]

Question 15

What is the model constant when the dummy variable equals 1 in the following equation, where x₁ is a continuous variable and x₂ is a dummy variable?
\[ \hat{y} = 4 + 4x_{1} + 8x_{2} + 9x_{1}x_{2} \]

Question 16

Consider the following equation: y_i = 2x^1.4
Compute the value of y_i when x_i = 1

Question 17

Consider the following equation: y_i = 2x^1.4
Compute the value of y_i when x_i = 2

Answer indication

Question 1a

\[ \hat{y} = 12 + (5)(11) + (6)(24) + (2)(27) = 265 \]

Question 1b

\[ \hat{y} = 12 + (5)(31) + (6)(20) + (2)(17) = 321 \]

Question 1c

\[ \hat{y} = 12 + (5)(32) + (6)(29) + (2)(13) = 372 \]

Question 1d

\[ \hat{y} = 12 + (5)(30) + (6)(26) + (2)(29) = 376 \]

Question 2a

\[ \hat{y} = 10 + (5)(20) + (4)(11) + (2)(10) = 174 \]

Question 2b

\[ \hat{y} = 10 + (5)(15) + (4)(14) + (2)(20) = 181 \]

Question 2c

\[ \hat{y} = 10 + (5)(35) + (4)(19) + (2)(25) = 311 \]

Question 2d

\[ \hat{y} = 10 + (5)(10) + (4)(17) + (2)(30) = 188 \]

Question 3a

The change in y when x₁ increases by 4 is equal to (-2)(4) = -8.

Question 3b

The change in y when x₃ increases by 1 is equal to (6)(1) = 6.

Question 3c

The change in y when x₂ increases by 2 is equal to (-14)(2) = -28.

Question 4

There is no direct linear relationship between the independent variables.

Question 5

\[ b_{1} = \frac{ s_{y} (r_{x1y} - r_{x1x2}r_{x2y} ) }{s_{x1} (1 - r^{2}_{x1x2})} = \frac{100 (0.80 - 0.90 \cdot 0.30)}{500 (1 - 0.90^{2})} = 0.56 \]

Question 6

\[ b_{2} = \frac{s_{y} (r_{x2y} - r_{x1x2} r_{x1y} ) }{s_{x2} (1 - r^{2}_{x1x2})} = \frac{100 (0.30 - 0.90 \cdot 0.80)}{400 (1 - 0.90^{2})} = -0.55 \]
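
The two slope formulas of Questions 5 and 6 share the same denominator term, which a short script makes explicit. This is a sketch under the assumption of exactly two predictors; the function name `two_predictor_slopes` is hypothetical.

```python
def two_predictor_slopes(r_x1y, r_x2y, r_x1x2, s_x1, s_x2, s_y):
    """Slopes b1 and b2 of a two-predictor regression from correlations and standard deviations."""
    shared = 1 - r_x1x2 ** 2   # common factor (1 - r_x1x2^2)
    b1 = s_y * (r_x1y - r_x1x2 * r_x2y) / (s_x1 * shared)
    b2 = s_y * (r_x2y - r_x1x2 * r_x1y) / (s_x2 * shared)
    return b1, b2

b1, b2 = two_predictor_slopes(r_x1y=0.80, r_x2y=0.30, r_x1x2=0.90,
                              s_x1=500, s_x2=400, s_y=100)
print(round(b1, 2), round(b2, 2))
```

The sign flip in b₂ relative to the simple correlation r_x2y illustrates how strongly correlated predictors can change the direction of an estimated effect.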

Question 7

\[ \bar{R}^{2} = 1 - \frac{0.0625/22}{0.4640/24} = 0.853 \]
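
As a cross-check of the calculation above, the adjusted coefficient of determination can be written as a one-line function. The name `adjusted_r_squared` is an assumption for illustration; `k` is the number of independent variables (K in the question).

```python
def adjusted_r_squared(sse, sst, n, k):
    """Adjusted R^2 = 1 - [SSE / (n - K - 1)] / [SST / (n - 1)]."""
    return 1 - (sse / (n - k - 1)) / (sst / (n - 1))

# Data of Question 7
print(round(adjusted_r_squared(sse=0.0625, sst=0.4640, n=25, k=2), 3))
```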

Question 8

The adjusted coefficient of determination corrects for the fact that irrelevant independent variables will still result in a (small) reduction in the error sum of squares (SSE). Consequently, it offers a better comparison between multiple regression models with different numbers of independent variables.

Question 9

The coefficient of multiple correlation is equal to the square root of the multiple coefficient of determination.

Question 10a

t_n-K-1,α/2 = t_22,0.005 = 2.819

Question 10b

0.2372 - (2.819)(0.0556) < β₁ < 0.2372 + (2.819)(0.0556)
0.080 < β₁ < 0.394

Question 10c

-0.000249 - (2.819)(0.0000320) < β₂ < -0.000249 + (2.819)(0.0000320)
-0.000339 < β₂ < -0.000159

Question 11a

\[ F = \frac{(1426.93 - 1149.14)/2}{13.52} = 10.27 \]
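
The F statistic above compares the error sum of squares of the restricted model with that of the full model. A minimal sketch, with the hypothetical name `partial_f_statistic` and `r` denoting the number of coefficients tested simultaneously:

```python
def partial_f_statistic(sse_restricted, sse_full, r, s2_e):
    """F = [(SSE(r) - SSE) / r] / s_e^2 for testing r coefficients jointly."""
    return ((sse_restricted - sse_full) / r) / s2_e

# Data of Question 11a: two of the four predictors are dropped, so r = 2
print(round(partial_f_statistic(1426.93, 1149.14, r=2, s2_e=13.52), 2))
```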

Question 11b

The F statistic has 2 degrees of freedom (i.e., for the two variables tested simultaneously) for the numerator and 85 degrees of freedom for the denominator.

Question 11c

F* = 4.9 (see Appendix Table 9)

Question 11d

A dummy variable is a variable with two possible outcomes: 0 and 1.

Question 12

\[ H_{0}: \beta_{3} = 0 | \beta_{1} \neq 0, \beta_{2} \neq 0 \]
\[ H_{1}: \beta_{3} \neq 0 | \beta_{1} \neq 0, \beta_{2} \neq 0 \]

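Question 13

When x₂ = 1, the model constant is 9 + 9 = 18.

Question 14

When x₂ = 1, the model constant is 7 + 2 = 9.

Question 15

When x₂ = 1, the model constant is 4 + 8 = 12 (and the slope for x₁ becomes 4 + 9 = 13).
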
Question 16

y_i = 2(1)^1.4 = 2

Question 17

5.28

What other topics are important in regression analysis? - ExamTests 13

Questions

Question 1

What are the four stages of model building?

Question 2

If a model cannot be verified, what should you do?

Question 3

In an experimental design, the experimental outcome (Y) is measured at specific combinations of levels for ... and ... variables.

Question 4

If a blocking variable has 4 levels, how many dummy variables should be created?

Question 5

What is a treatment variable?

Question 6

What is a blocking variable?

Question 7

What is a lagged value?

Question 8

What is multicollinearity?

Question 9

Suppose that all the coefficient Student's t statistics are small, indicating no individual effect, and yet the overall F statistic indicates a strong effect for the total regression model. What is this an indication of?

Question 10

How to correct for multicollinearity?

Question 11

What is the danger of correcting multicollinearity by removing one or more of the highly correlated independent variables?

Question 12

What are the four assumptions made in a simple linear regression analysis?

Question 13

What is the fifth assumption that is added for multiple regression analysis?

Question 14

What is heteroscedasticity?

Question 15

Describe one procedure to check for heteroscedasticity.

Question 16a

From the regression of the squared residuals on the predicted values, we obtain the following estimated model (for n = 25):
\[ e^{2} = 0.00621 - 0.00550 \hat{y} \hspace{2mm} with \hspace{2mm} R^{2} = 0.066 \]
Compute the test statistic.

Question 16b

What is the critical value if we are testing with a 10% significance level?

Question 16c

Can we reject the null hypothesis that the regression model has uniform variance?

Question 17

What is the meaning of ρ for (auto)correlated errors?

Question 18

What does it imply if ρ = 0?

Question 19

What does it imply if ρ = 0.3?

Question 20

What does it imply if ρ = 0.9?

Question 21a

What is the most commonly used test to check possible autocorrelation of error terms?

Question 21b

Formulate the null hypothesis of this test.

Question 22

Provide the decision rules for testing the null hypothesis against the alternative hypothesis: H₁: ρ > 0.

Question 23

Provide the decision rules for testing the null hypothesis against the alternative hypothesis: H₁: ρ < 0.

Question 24

Suppose we found d = 0.2015, indicating positive autocorrelation. Estimate the serial correlation.

Question 25

Suppose we found d = 0.5213, indicating positive autocorrelation. Estimate the serial correlation.

Question 26a

In determining whether the errors in a regression model are positively correlated for the model
\[ y_{t} = \beta_{0} + \beta_{1}x_{1t} + \epsilon_{t} \]
we determine
\[ \sum^{30}_{t = 1}e^{2}_{t} = 7587.9154 \]
and
\[ \sum^{30}_{t = 2} (e_{t} - e_{t - 1})^{2} = 8195.2065 \]
Formulate the null and alternative hypothesis for the mentioned analysis.

Question 26b

Calculate the Durbin-Watson statistic.

Answer indication

Question 1

Model building consists of four stages: (1) model specification; (2) coefficient estimation; (3) model verification, and; (4) interpretation and inference.

Question 2

Go back to the first stage; model specification.

Question 3

In an experimental design, the experimental outcome (Y) is measured at specific combinations of levels for treatment and blocking variables.

Question 4

Three. A categorical variable with K levels is represented by K - 1 dummy variables, so 4 - 1 = 3.

Question 5

A treatment variable is a variable whose effect we are interested in estimating with minimum variance. For instance, we may desire to know which of five different production machines provides the highest productivity per hour. In this example, the treatment variable is the production machine, represented by a five-level categorical variable.

Question 6

A blocking variable is a variable that is part of the environment. Therefore, the level of such a variable cannot be preselected.

Question 7

When time series are analyzed (i.e., when measurements are taken over time) lagged values of the dependent variable are an important issue. Often in time series data, the dependent variable in time period t is related to the value taken by this dependent variable in an earlier time period, that is y_t-1. The lagged value then is the value of the dependent variable in this previous time period.

Question 8

Multicollinearity refers to a state of very high intercorrelations among the independent variables.

Question 9

Multicollinearity

Question 10

1. Remove one or more of the highly correlated independent variables.
2. Change the model specification, including possibly a new independent variable that is a function of several correlated independent variables.
3. Obtain additional data that do not have the same strong correlations between the independent variables.

Question 11

This might lead to a bias in coefficient estimation.

Question 12

1. The Y's are linear functions of X, plus a random error term.
2. The x values are fixed numbers that are independent of the error terms.
3. The error terms are assumed to be random variables with a mean of zero and a variance of σ².
4. The random error terms are not correlated with one another.

Question 13

There is no direct linear relationship between the X_j independent variables.

Question 14

Heteroscedasticity refers to the situation in which the error terms do not have uniform variance.

Question 15

One possibility to check for heteroscedasticity is by examining a scatter plot of the residuals versus the independent variable. If the magnitude of the error terms tends to increase (or decrease) for increasing values of the independent variable, this indicates that the error variances are not constant.

Question 16a

\[ nR^{2} = (25)(0.066) = 1.65 \]

Question 16b

From Appendix Table 7, it can be found that for a 10% significance level, the critical value is: X²_1,0.10 = 2.706

Question 16c

The test statistic does not exceed the critical value, therefore the null hypothesis cannot be rejected.

Question 17

This ρ is the correlation coefficient (range -1 to +1) between the error in time t and the error in the previous time point, that is t - 1.

Question 18

If ρ = 0, this means that there is no autocorrelation in the errors.

Question 19

There is a relatively weak autocorrelation.

Question 20

There is a quite strong autocorrelation.

Question 21a

Durbin-Watson test.

Question 21b

H₀: ρ = 0.

Question 22

Reject H₀ if d < d_L. Accept H₀ if d > d_U. Test inconclusive if d_L < d < d_U.

Question 23

Reject H₀ if d > 4 - d_L. Accept H₀ if d < 4 - d_U. Test inconclusive if 4 - d_L > d > 4 - d_U.

Question 24

\[ r = 1 - \frac{d}{2} = 1 - \frac{0.2015}{2} = 0.90 \]

Question 25

\[ r = 1 - \frac{d}{2} = 1 - \frac{0.5213}{2} = 0.74 \]

Question 26a

H₀: ρ = 0 and H₁: ρ > 0.

Question 26b

\[ d = \frac{ \sum^{n}_{t = 2} (e_{t} - e_{t-1})^{2} }{\sum^{n}_{t=1} e^{2}_{t}} = \frac{8195.2065}{7587.9154} = 1.08 \]
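
When the residuals themselves are available, the Durbin-Watson statistic and the serial correlation estimate r = 1 - d/2 used in Questions 24 and 25 can be computed directly. The sketch below is illustrative only: the function names are assumptions, and the residual list is made-up data, not taken from the book.

```python
def durbin_watson(residuals):
    """d = sum_{t=2..n} (e_t - e_{t-1})^2 / sum_{t=1..n} e_t^2."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

def estimated_serial_correlation(d):
    """Approximate first-order autocorrelation of the errors: r = 1 - d / 2."""
    return 1 - d / 2

# Hypothetical residuals, only to show the call
example_residuals = [0.5, 0.8, 0.3, -0.2, -0.6, -0.1, 0.4]
d = durbin_watson(example_residuals)
print(round(d, 3), round(estimated_serial_correlation(d), 3))

# With the two sums given in Question 26, d is simply their ratio
print(round(8195.2065 / 7587.9154, 2))
```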

How to analyze categorical data? - ExamTests 14

Questions

Question 1a

Consider the following data:

Category	A	B	C	D	Total
Observed number of objects	43	53	60	44	200
Probability (under H₀)	1/4	1/4	1/4	1/4	1
Expected number of objects (under H₀)	50	50	50	50	200

Compute the chi-square test statistic.

Question 1b

What are the degrees of freedom for the critical test statistic?

Question 1c

Provide the range of the test statistic with probability .10 and .90 using Table 7a and 7b.

Question 1d

Can we reject the null hypothesis that there is no preference for any of the four categories?

Question 2a

Consider the following data:

Category	A	B	C	D	Total
Observed number of objects	50	93	45	12	200
Probability (under H₀)	0.30	0.50	0.15	0.05	1
Expected number of objects (under H₀)					200

Compute the expected values based on the null hypothesis that is specified in the table.

Question 2b

Compute the chi-square test statistic.

Question 2c

How many degrees of freedom are there?

Question 2d

From Appendix Table 7 with K - 1 degrees of freedom, it is found that the test statistic falls between .... and ....

Question 2e

Can the null hypothesis be rejected?

Question 3a

Consider the following data:

Category	A	B	C	D	Total
Observed number of objects	287	49	30	34	400
Probability (under H₀)	0.80	0.10	0.06	0.04	1
Expected number of objects (under H₀)					400

Compute the expected values based on the null hypothesis that is specified in the table.

Question 3b

Compute the chi-square test statistic.

Question 3c

How many degrees of freedom are there?

Question 3d

Find the critical value using a significance level of 0.001.

Question 3e

Can the null hypothesis be rejected?

Question 4a

It is tested whether the population distribution is Poisson. Consider the following data:

Number of occurrences	0	1	2	3+
Observed frequency	156	63	29	14
Expected frequency under H₀	135.4	89.4	29.5	7.7

Compute the test statistic.

Question 4b

How many degrees of freedom are there?

Question 4c

Find the corresponding critical value using a 0.001 significance level.

Question 4d

Can the null hypothesis that the population distribution is Poisson be rejected?

Question 5

Suppose we are interested in whether people prefer pineapple on their pizza. We sample 7 participants under the null hypothesis H₀: P = 0.5. What is the probability of obtaining no more than 2 people with a preference for pineapple on their pizza?

Question 6

Suppose the test statistic for the sign test in the previous question is S = 2. Can we reject the null hypothesis?

Question 7a

A random sample of 100 students was asked to compare two new ice cream flavors: grilled BBQ and bubblegum surprise. After testing both flavors, 56 students preferred grilled BBQ, 40 students preferred bubblegum surprise, and 4 expressed no preference. Use the normal approximation to determine the mean and standard deviation for the number of students preferring bubblegum surprise.

Question 7b

Compute the test statistic using the normal approximation and continuity correction.

Question 7c

Find the approximate p-value.

Question 7d

Can we reject the null hypothesis?

Question 7e

What will be the test statistic if the continuity correction is not used?

Question 8

Given a random sample of n = 31 matched pairs, compute the mean and standard deviation for the Wilcoxon statistic under the null hypothesis.

Question 9

Now, suppose we find that the observed value of the statistic is T = 189. If we test the null hypothesis against a lower-tail alternative hypothesis with significance level 0.05, what can we conclude about the null hypothesis?

Question 10

Two independent samples are considered with n₁ = 10, n₂ = 12 and R₁ = 93.5.
Compute the mean and variance for the Mann-Whitney statistic.

Question 11

Compute the Mann-Whitney U statistic.

Question 12

What can we conclude about the null hypothesis if we are testing with a significance level of 0.05?

Answer indication

Question 1a

X² = 3.88

Question 1b

df = K - 1 = 4 - 1 = 3.

Question 1c

Lower critical value (Appendix Table 7b) X²_3,0.90 = 0.584
Upper critical value (Appendix Table 7a) X²_3,0.10 = 6.251

Question 1d

It is found that the test statistic of 3.88 falls between 0.584 and 6.251; from this it follows that 0.10 < p-value < 0.90. The null hypothesis can therefore not be rejected. However, this does not mean that we can conclude that all four categories are equally preferred. It only means that there is not enough evidence to support a preference.

Question 2a

E_A = nP_A = 200(0.30) = 60
E_B = nP_B = 200(0.50) = 100
E_C = nP_C = 200(0.15) = 30
E_D = nP_D = 200(0.05) = 10

Question 2b

X² = 10.06

Question 2c

df = K - 1 = 4 - 1 = 3.

Question 2d

From Appendix Table 7 with K - 1 degrees of freedom, it is found that the test statistic falls between 9.348 and 11.345.

Question 2e

0.01 < p-value < 0.025. Hence, the null hypothesis can be rejected at the 2.5% significance level, though not at the 1% level.

Question 3a

E_A = nP_A = 400(0.80) = 320
E_B = nP_B = 400(0.10) = 40
E_C = nP_C = 400(0.06) = 24
E_D = nP_D = 400(0.04) = 16

Question 3b

X² = 27.178

Question 3c

df = K - 1 = 4 - 1 = 3.

Question 3d

From Appendix Table 7 with K - 1 degrees of freedom and significance level 0.001, it is found that X²_3,0.001 = 16.266

Question 3e

The test statistic is much larger than the critical value. Hence, the null hypothesis can be rejected.
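
The expected counts and chi-square statistics of Questions 1 to 3 all follow the same recipe: E_i = nP_i, then sum (O_i - E_i)² / E_i. A minimal sketch in Python, where the function name `chi_square_gof` is an assumption made for illustration:

```python
def chi_square_gof(observed, probabilities):
    """Return (expected counts, chi-square statistic) for a goodness-of-fit test."""
    n = sum(observed)
    expected = [n * p for p in probabilities]          # E_i = n * P_i
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return expected, statistic

# Question 3: observed counts and probabilities under H0
expected, statistic = chi_square_gof([287, 49, 30, 34], [0.80, 0.10, 0.06, 0.04])
print([round(e, 1) for e in expected], round(statistic, 3))
```

The degrees of freedom (K - 1 here) and the critical value still have to be looked up separately in the chi-square table.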

Question 4a

X² = 16.08

Question 4b

df = K - m - 1 = 4 - 1 - 1 = 2

Question 4c

X²_2,0.001 = 13.816

Question 4d

The test statistic exceeds the critical value, thus the null hypothesis that the population distribution is Poisson can be rejected at the 0.1% significance level.

Question 5

p-value = P(X ≤ 2) = 0.227 (see Appendix Table 3)
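
Instead of reading the table, the cumulative binomial probability for the sign test can be summed directly. This is a small sketch; the function name `binomial_cdf` is hypothetical.

```python
from math import comb

def binomial_cdf(k, n, p=0.5):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

# Question 5: n = 7 participants, at most 2 with a preference
print(round(binomial_cdf(2, 7), 3))
```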

Question 6

No, with a p-value this large, the null hypothesis cannot be rejected.

Question 7a

Let P be the population proportion that prefers bubblegum surprise, given S = 40.
\[ \mu = np = 0.5n = 0.5(96) = 48 \]
\[ \sigma = 0.5 \sqrt{96} = 4.899 \]

Question 7b

Since 40 < 48, S* = 40.5
\[ z = \frac{S* - \mu}{\sigma} = \frac{40.5 - 48}{4.899} = -1.53 \]

Question 7c

From the standard normal distribution, it follows that the approximate p-value = 2(0.0630) = 0.126

Question 7d

The null hypothesis can be rejected at all significance levels greater than 12.6%.

Question 7e

If no continuity correction factor is used, the value for the test statistic becomes Z = -1.633, yielding a slightly smaller p-value of 0.1024.
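
The normal approximation of Questions 7a to 7e, with and without the continuity correction, can be sketched as follows. The function name `sign_test_normal` is an assumption; `n` is the number of respondents who expressed a preference (96 in the answer above), and the two-sided p-value is obtained from the standard library's `math.erfc`.

```python
import math

def sign_test_normal(s, n, continuity=True):
    """Return (z, two-sided p-value) for the sign test under H0: P = 0.5."""
    mu = 0.5 * n
    sigma = 0.5 * math.sqrt(n)
    s_adj = s
    if continuity:
        # Move S half a unit toward the mean
        s_adj = s + 0.5 if s < mu else s - 0.5
    z = (s_adj - mu) / sigma
    p_value = math.erfc(abs(z) / math.sqrt(2))   # equals 2 * (1 - Phi(|z|))
    return z, p_value

print(sign_test_normal(s=40, n=96))
print(sign_test_normal(s=40, n=96, continuity=False))
```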

Question 8

\[ \mu_{T} = \frac{n(n + 1)}{4} = \frac{(31)(32)}{4} = 248 \]
\[ Var(T) = \sigma^{2}_{T} = \frac{n(n + 1)(2n + 1)}{24} = \frac{ (31)(32)(63) }{24} = 2604 \]
\[ \sigma_{T} = \sqrt{2604} = 51.03 \]

Question 9

\[ Z = \frac{T - \mu_{T}}{\sigma_{T}} = \frac{189 - 248}{51.03} = \frac{-59}{51.03} = -1.16 \]
For α = 0.05, z_α = -1.645
The test statistic is not below the critical value, hence there is not enough evidence to reject the null hypothesis.
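
The mean, variance, and z value of the Wilcoxon statistic in Questions 8 and 9 can be reproduced with a few lines. A minimal sketch, with the hypothetical name `wilcoxon_normal_approx`:

```python
import math

def wilcoxon_normal_approx(t_stat, n):
    """Return (mean, variance, z) of the Wilcoxon signed rank statistic under H0."""
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24
    z = (t_stat - mean) / math.sqrt(variance)
    return mean, variance, z

mean, variance, z = wilcoxon_normal_approx(t_stat=189, n=31)
print(mean, variance, round(z, 2))
```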

Question 10

\[ E(U) = \mu_{U} = \frac{n_{1}n_{2}}{2} = \frac{ (10)(12) }{2} = 60 \]
\[ Var(U) = \sigma^{2}_{U} = \frac{ n_{1}n_{2} (n_{1} + n_{2} + 1) }{12} = \frac{ (10)(12)(23) }{12} = 230 \]

Question 11

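\[ U = n_{1}n_{2} + \frac{n_{1}(n_{1} + 1)}{2} - R_{1} = (10)(12) + \frac{(10)(11)}{2} - 93.5 = 81.5 \]
\[ Z = \frac{U - \mu_{U}}{\sigma_{U}} = \frac{81.5 - 60}{ \sqrt{230} } = 1.42 \]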

Question 12

The corresponding p-value = 0.1556. With a 0.05 significance level, this test result is not sufficient to conclude that the null hypothesis can be rejected.
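
Questions 10 to 12 can be checked together. The sketch below assumes the definition U = n₁n₂ + n₁(n₁ + 1)/2 - R₁, which reproduces the value 81.5 used in the answer to Question 11; the function name is hypothetical.

```python
import math

def mann_whitney_normal_approx(n1, n2, r1):
    """Return (U, mean, variance, z, two-sided p-value) for the Mann-Whitney test."""
    u = n1 * n2 + n1 * (n1 + 1) / 2 - r1
    mean = n1 * n2 / 2
    variance = n1 * n2 * (n1 + n2 + 1) / 12
    z = (u - mean) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p-value
    return u, mean, variance, z, p_value

print(mann_whitney_normal_approx(n1=10, n2=12, r1=93.5))
```

The p-value printed here uses the unrounded z, so it can differ in the third decimal from a table lookup at z = 1.42.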

How to conduct an analysis of variance? - ExamTests 15

Questions

Question 1

What is the null hypothesis of a one-way analysis of variance?

Question 2

Suppose we found the following data: SSW = 12.18, n = 20, K = 3. Compute an estimate of the within-groups mean square.

Question 3

Suppose we found the following data: SSG = 21.55, n = 20, K = 3. Compute an estimate of the between-groups mean square.

Question 4

Compute the F ratio for the MSW and MSG calculated in the previous two questions.

Question 5

What are the degrees of freedom corresponding to the information provided in questions 2 and 3?

Question 6

What is the critical F value if we are testing with a 1% significance level?

Question 7

What can we conclude about the population means based on this F ratio?

Question 8a

Consider the following analysis of variance table:

Source of variation	Sum of Squares	Degrees of freedom	Mean Squares	F ratio
Between groups	1728	4
Within groups	624	..
Total	2352	17

How many degrees of freedom does the within-groups sum of squares have?

Question 8b

Compute the mean squares for between groups.

Question 8c

Compute the mean squares for within groups.

Question 8d

Compute the F ratio.

Question 8e

Find the critical F value corresponding to a significance level of 0.05.

Question 8f

What can be concluded about the null hypothesis?

Question 9a

Consider the following analysis of variance table:

Source of variation	Sum of Squares	Degrees of freedom	Mean Squares	F ratio
Between groups	879	..
Within groups	798	16
Total	1677	19

How many degrees of freedom does the between-groups sum of squares have?

Question 9b

Compute the mean squares for between groups.

Question 9c

Compute the mean squares for within groups.

Question 9d

Compute the F ratio.

Question 9e

Find the critical F value corresponding to a significance level of 0.05.

Question 9f

What can be concluded about the null hypothesis?

Question 10a

Consider for questions 10a to 10i a two-way analysis of variance with one observation per cell and randomized blocks, with the following results:

Source of variation	Sum of squares	Degrees of freedom	Mean squares	F ratio
Between groups	3636	33	MSG = SSG / (K - 1)
Between blocks	7575	66	MSB = SSB / (H - 1)
Error	9999	1818	MSE = SSE / ((K - 1) (H - 1))
Total	210210	2727

Compute the mean squares for the between groups.

Question 10b

Compute the mean squares for the between blocks.

Question 10c

Compute the mean squares for the error.

Question 10d

Compute the F ratio MSG / MSE.

Question 10e

Find the critical value for the hypothesis test that the between group means are equal using a 5% significance level.

Question 10f

What do we conclude about the null hypothesis that the between group means are equal?

Question 10g

Compute the F ratio MSB / MSE.

Question 10h

Find the critical value for the hypothesis test that the between block means are equal using a 5% significance level.

Question 10i

What do we conclude about the null hypothesis that the between block means are equal?

Question 11a

Consider the following data:

Source of variation	Sum of squares	Degrees of freedom	Mean squares	F ratio
Between groups	62.04	1	62.04
Between blocks	0.06	1	0.06
Interaction	1.85	...	1.85
Error	23.31	63	0.37
Total	87.26	66

Compute the degrees of freedom for the interaction term.

Question 11b

Compute the F ratio for the interaction term.

Answer indication

Question 1

All population means are equal, that is: H₀: μ₁ = μ₂ = ... = μ_k for K populations.

Question 2

MSW = (12.18) / (20 - 3) = 0.72

Question 3

MSG = (21.55) / (3 - 1) = 10.78

Question 4

F = MSG / MSW = 10.78 / 0.72 = 15.039

Question 5

df = (K - 1) = 3 - 1 = 2 for the numerator
df = (n - K) = 20 - 3 = 17 for the denominator

Question 6

F_2,17,0.01 = 6.112 (Appendix Table 9)

Question 7

The test value (15.039) exceeds the critical value (6.112), therefore we can reject the null hypothesis that the population mean is the same for all three groups.

Question 8a

It follows from the degrees of freedom of the between-groups sum of squares that there are K - 1 = 4, thus K = 5. Further, from the degrees of freedom of the total sum of squares it follows that n - 1 = 17, thus n = 18.
As a result, we obtain: df = n - K = 18 - 5 = 13.

Question 8b

MSG = SSG / (K - 1) = 1728 / 4 = 432

Question 8c

MSW = SSW / (n - K) = 624 / 13 = 48

Question 8d

F = MSG / MSW = 432 / 48 = 9

Question 8e

F_4,13,0.05= 3.179

Question 8f

F > F_4,13,0.05, therefore we can reject the null hypothesis that the population means are equal.

Question 9a

n - 1 = 19 --> n = 20
n - k = 16 --> 20 - k = 16 --> k = 4
df = k - 1 = 4 - 1 = 3
Thus, there are 3 degrees of freedom.

Question 9b

MSG = SSG / (K - 1) = 879 / 3 = 293

Question 9c

MSW = SSW / (n - K) = 798 / 16 = 49.875

Question 9d

F = MSG / MSW = 293 / 49.875 = 5.875

Question 9e

F_3,16,0.05= 3.239

Question 9f

F > F_3,16,0.05 (5.875 > 3.239), therefore we can reject the null hypothesis that the population means are equal.
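
The one-way analysis of variance tables in Questions 2 to 4, 8, and 9 all reduce to the same three divisions. A minimal sketch, with the hypothetical function name `one_way_anova_f`; critical values still come from the F table.

```python
def one_way_anova_f(ssg, ssw, n, k):
    """Return (MSG, MSW, F) for a one-way ANOVA with n observations in k groups."""
    msg = ssg / (k - 1)    # between-groups mean square
    msw = ssw / (n - k)    # within-groups mean square
    return msg, msw, msg / msw

# Question 8: K - 1 = 4 and n - 1 = 17, so K = 5 and n = 18
print(one_way_anova_f(ssg=1728, ssw=624, n=18, k=5))
# Question 9: n - 1 = 19 and n - K = 16, so n = 20 and K = 4
print(one_way_anova_f(ssg=879, ssw=798, n=20, k=4))
```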

Question 10a

MSG = SSG / (K - 1) = 3636 / 33 = 110.18

Question 10b

MSB = SSB / (H - 1) = 7575 / 66 = 114.77

Question 10c

MSE = SSE / ((K - 1) (H - 1)) = 9999 / 1818 = 5.5

Question 10d

F = MSG / MSE = 110.18 / 5.5 = 20.03

Question 10e

F_33,1818,0.05 = 1.676

Question 10f

The test statistic exceeds the critical value, therefore we can reject the null hypothesis that the between-groups means are equal.

Question 10g

F = MSB / MSE = 114.77 / 5.5 = 20.87

Question 10h

F_66,1818,0.05 = 1.676

Question 10i

The test statistic exceeds the critical value, therefore we can reject the null hypothesis that the between-blocks means are equal.

Question 11a

df = 1

Question 11b

F = MSI / MSE = 1.85 / 0.37 = 5.

How to analyze data sets with measurements over time? - ExamTests 16

Questions

Question 1

What is meant by a time series?

Question 2

What are the four components of a time series?

Question 3

Let the estimates of level and trend in year 5 be as follows:
\[ \hat{x}_{5} = 347 \]
\[ T_{5} = 13 \]
What is the forecast for the next year using the Holt-Winters method?

Question 4

What is the forecast for year 7 using the Holt-Winters method for nonseasonal series?

Question 5

What is the forecast for year 8 using the Holt-Winters method for nonseasonal series?

Question 6

What is the forecast for year 9 using the Holt-Winters method for nonseasonal series?

Question 7

Suppose we have 32 observations and a seasonal factor s = 4 indicating quarterly data. Write down the equation for the forecast of the next observation beyond the end of the series. Use for this the method developed by Holt-Winters for seasonal series.

Question 8

What is the null hypothesis in an autoregressive model?

Question 9

Provide the general equation that represents a series according to the autoregressive model.

Question 10

What algorithm is used to obtain the parameters for the autoregressive model?

Answer indication

Question 1

A time series is a set of measurements, ordered over time, on a particular quantity of interest. In a time series, the sequence of observations is important.

Question 2

T_t: Trend component.
S_t: Seasonality component.
C_t: Cyclical component.
I_t: Irregular component.

Question 3

\[ \hat{x}_{6} = 347 + 13 = 360 \]

Question 4

\[ \hat{x}_{7} = 347 + (2)(13) = 373 \]

Question 5

\[ \hat{x}_{8} = 347 + (3)(13) = 386 \]

Question 6

\[ \hat{x}_{9} = 347 + (4)(13) = 399 \]
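The four forecasts in Questions 3 to 6 all apply the same rule: the forecast h periods ahead is the latest level estimate plus h times the latest trend estimate. A minimal sketch of that rule follows; the function name `holt_forecast` is made up for illustration.

```python
def holt_forecast(level, trend, h):
    """Nonseasonal Holt-Winters forecast h periods ahead: level + h * trend."""
    return level + h * trend

# Level and trend estimates for year 5, taken from Question 3.
level_5, trend_5 = 347, 13

# Forecasts for years 6 through 9 (h = 1, ..., 4).
forecasts = {5 + h: holt_forecast(level_5, trend_5, h) for h in range(1, 5)}
print(forecasts)
```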

Question 7

\[ \hat{x}_{n+h} = ( \hat{x}_{n} + hT_{n} ) F_{n+h-s} \]
With n = 32, h = 1 and s = 4:
\[ \hat{x}_{33} = ( \hat{x}_{32} + T_{32} ) F_{29} \]
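The seasonal forecast multiplies the trend-adjusted level by the seasonal factor estimated one full season earlier. The sketch below makes the index n + h - s explicit. The level, trend and seasonal factors are hypothetical numbers chosen only for illustration, since the question supplies none, and the function name is likewise made up.

```python
def seasonal_forecast(level, trend, seasonal_factors, n, h, s):
    """Seasonal Holt-Winters forecast: (level + h * trend) * F_{n+h-s}.

    seasonal_factors maps a period index to its estimated seasonal factor.
    This form is used here for 1 <= h <= s only.
    """
    if not 1 <= h <= s:
        raise ValueError("this sketch covers 1 <= h <= s only")
    return (level + h * trend) * seasonal_factors[n + h - s]

# Hypothetical estimates at the end of a quarterly series with n = 32.
factors = {29: 1.10, 30: 0.95, 31: 0.90, 32: 1.05}
forecast_33 = seasonal_forecast(level=200.0, trend=4.0,
                                seasonal_factors=factors, n=32, h=1, s=4)
print(round(forecast_33, 1))
```

For the next observation (h = 1) the lookup uses period 32 + 1 - 4 = 29, the same quarter one year earlier.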

Question 8

H₀: φ_p = 0

Question 9

\[ x_{t} = \gamma + \phi_{1}x_{t - 1} + \phi_{2}x_{t - 2} + ... + \phi_{p}x_{t - p} + \epsilon_{t} \]

Question 10

The least squares algorithm.
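As an illustration of least squares estimation for the simplest case, a first-order model x_t = γ + φ₁x_{t-1} + ε_t, the sketch below regresses each observation on its predecessor. The series is hypothetical and noise-free, generated with γ = 2 and φ₁ = 0.5 so that the estimates can be checked. The function name `fit_ar1` is an assumption, not something from the book.

```python
def fit_ar1(series):
    """Least squares estimates (gamma, phi) for x_t = gamma + phi * x_{t-1} + e_t."""
    x_prev = series[:-1]   # regressor: lagged values
    x_curr = series[1:]    # response: current values
    m = len(x_curr)
    mean_prev = sum(x_prev) / m
    mean_curr = sum(x_curr) / m
    sxy = sum((a - mean_prev) * (b - mean_curr) for a, b in zip(x_prev, x_curr))
    sxx = sum((a - mean_prev) ** 2 for a in x_prev)
    phi = sxy / sxx
    gamma = mean_curr - phi * mean_prev
    return gamma, phi

# Hypothetical noise-free series built from gamma = 2, phi = 0.5.
series = [10.0]
for _ in range(7):
    series.append(2 + 0.5 * series[-1])

gamma_hat, phi_hat = fit_ar1(series)
print(round(gamma_hat, 6), round(phi_hat, 6))
```

A model of order p would regress x_t on p lagged values at once, which is an ordinary multiple regression.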

What other sampling procedures are available? - ExamTests 17

Questions

Question 1a

Suppose we conducted a stratified sampling procedure. Use the following information:
N₁ = 75; N₂ = 30; N₃ = 20 (N = 125).
n₁ = 15; n₂ = 8; n₃ = 2 (n = 25).
x̄₁ = 21.2; s₁ = 12.8.
x̄₂ = 13.3; s₂ = 11.4.
x̄₃ = 26.1; s₃ = 9.2.
Compute the point estimate of the population mean.

Question 1b

Compute the point estimate of the variance for the first stratum.

Question 1c

Compute the point estimate of the variance for the second stratum.

Question 1d

Compute the point estimate of the variance for the third stratum.

Question 1e

Compute the point estimate of the variance for the population mean.

Question 1f

Compute the point estimate of the standard deviation for the population mean.

Question 1g

Compute a 95% confidence interval for the population mean.

Question 2a

Suppose we conducted a stratified sampling procedure. Use the following information:
N₁ = 364; N₂ = 1031.
n₁ = 40; n₂ = 60.
p̂₁ = 7/40 = 0.175
p̂₂ = 13/60 = 0.217
Compute the point estimate of the population proportion.

Question 2b

Compute the point estimate of the variance of the proportion for the first stratum.

Question 2c

Compute the point estimate of the variance of the proportion for the second stratum.

Question 2d

Compute the point estimate of the variance of the proportion for the population.

Question 2e

Compute the point estimate of the standard deviation of the proportion for the population.

Question 2f

Compute the 90% confidence interval for the population proportion from these stratified samples.

Question 3a

Suppose we have a total of N = 125 which is divided into three strata with N₁ = 75, N₂ = 30, and N₃ = 20. Now, suppose we want to select a sample of size n = 25.
Compute the sample size for the first stratum using proportional allocation.

Question 3b

Compute the sample size for the second stratum using proportional allocation.

Question 3c

Compute the sample size for the third stratum using proportional allocation.

Question 4a

Suppose we have a total of N = 225 which is divided into three strata with N₁ = 100, N₂ = 75, and N₃ = 50. Now, suppose we want to select a sample of size n = 50.
Compute the sample size for the first stratum using proportional allocation.

Question 4b

Compute the sample size for the second stratum using proportional allocation.

Question 4c

Compute the sample size for the third stratum using proportional allocation.

Question 5a

Suppose we have a total of N = 500 which is divided into three strata with N₁ = 250, N₂ = 100, and N₃ = 150. Now, suppose we want to select a sample of size n = 50.
Compute the sample size for the first stratum using proportional allocation.

Question 5b

Compute the sample size for the second stratum using proportional allocation.

Question 5c

Compute the sample size for the third stratum using proportional allocation.

Question 6a

Suppose we have a total of N = 500 which is divided into three strata with N₁ = 250, N₂ = 100, and N₃ = 150. Now, suppose we want to select a sample of size n = 100.
Compute the sample size for the first stratum using proportional allocation.

Question 6b

Compute the sample size for the second stratum using proportional allocation.

Question 6c

Compute the sample size for the third stratum using proportional allocation.

Question 7

What is the difference between proportional allocation and optimal allocation in terms of sample effort?

Question 8

What is the difference between proportional allocation and optimal allocation when determining the stratum sample sizes for estimating a population proportion?

Question 9

What is the difference between stratified sampling and cluster sampling?

Question 10

Mention one advantage and one disadvantage of cluster sampling.

Question 11

Mention one advantage and one disadvantage of two-phase sampling.

Answer indication

Question 1a

\[ \bar{x}_{st} = \frac{1}{N} \sum^{K}_{j = 1} N_{j}\bar{x}_{j} = \frac{ (75)(21.2) + (30)(13.3) + (20)(26.1) }{125} = 20.09 \]

Question 1b

\[ \hat{\sigma}^{2}_{\bar{x}_{1}} = \frac{ s^{2}_{1} }{n_{1}} \times \frac{ N_{1} - n_{1} }{N_{1} - 1} = \frac{(12.8)^{2}}{15} \times \frac{60}{74} = 8.856 \]

Question 1c

\[ \hat{\sigma}^{2}_{\bar{x}_{2}} = \frac{ s^{2}_{2} }{n_{2}} \times \frac{ N_{2} - n_{2} }{N_{2} - 1} = \frac{(11.4)^{2}}{8} \times \frac{22}{29} = 12.324 \]

Question 1d

\[ \hat{\sigma}^{2}_{\bar{x}_{3}} = \frac{ s^{2}_{3} }{n_{3}} \times \frac{ N_{3} - n_{3} }{N_{3} - 1} = \frac{(9.2)^{2}}{2} \times \frac{18}{19} = 40.093 \]

Question 1e

\[ \hat{\sigma}^{2}_{\bar{x}_{st}} = \frac{1}{N^{2}} \sum^{K}_{j = 1} N^{2}_{j} \hat{\sigma}^{2}_{\bar{x}_{j}} = \frac{ (75)^{2}(8.856) + (30)^{2}(12.324) + (20)^{2}(40.093) }{125^{2}} = 4.924 \]

Question 1f

\[ \hat{\sigma}_{\bar{x}_{st}} = \sqrt{4.924} = 2.22 \]

Question 1g

20.09 +/- (1.96)(2.22) = [15.74; 24.44]
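The steps of Question 1 can be sketched in Python as a check on the arithmetic. The stratum figures are the ones the answers above actually use (N₃ = 20 and n₃ = 2, so N = 125), and the variable names are illustrative.

```python
from math import sqrt

# Stratum sizes, sample sizes, sample means and sample standard deviations.
N = [75, 30, 20]
n = [15, 8, 2]
xbar = [21.2, 13.3, 26.1]
s = [12.8, 11.4, 9.2]

N_total = sum(N)

# 1a: stratified estimate of the population mean.
mean_st = sum(Nj * xj for Nj, xj in zip(N, xbar)) / N_total

# 1b-1d: variance of each stratum mean, with finite population correction.
var_strata = [(sj ** 2 / nj) * (Nj - nj) / (Nj - 1)
              for Nj, nj, sj in zip(N, n, s)]

# 1e-1f: variance and standard deviation of the stratified mean.
var_st = sum(Nj ** 2 * vj for Nj, vj in zip(N, var_strata)) / N_total ** 2
se_st = sqrt(var_st)

# 1g: 95% confidence interval, using z = 1.96.
ci = (mean_st - 1.96 * se_st, mean_st + 1.96 * se_st)
print(round(mean_st, 2), [round(v, 3) for v in var_strata])
print(round(var_st, 3), round(se_st, 2), [round(b, 2) for b in ci])
```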

Question 2a

\[ \hat{p}_{st} = \frac{1}{N} \sum^{K}_{j = 1} N_{j} \hat{p}_{j} = \frac{ (364)(0.175) + (1031)(0.217) }{1395} = 0.206 \]

Question 2b

\[ \hat{\sigma}^{2}_{\hat{p}_{1}} = \frac{ \hat{p}_{1} (1 - \hat{p}_{1}) }{n_{1} - 1} \times \frac{ N_{1} - n_{1} }{N_{1} - 1} = \frac{ (0.175)(0.825) }{39} \times \frac{324}{363} = 0.003304 \]

Question 2c

\[ \hat{\sigma}^{2}_{\hat{p}_{2}} = \frac{ \hat{p}_{2} (1 - \hat{p}_{2}) }{n_{2} - 1} \times \frac{ N_{2} - n_{2} }{N_{2} - 1} = \frac{ (0.217)(0.783) }{59} \times \frac{971}{1030} = 0.002715 \]

Question 2d

\[ \hat{\sigma}^{2}_{\hat{p}_{st}} = \frac{1}{N^{2}} \sum^{K}_{j = 1} N^{2}_{j} \hat{\sigma}^{2}_{\hat{p}_{j}} = \frac{ (364)^{2}(0.003304) + (1031)^{2}(0.002715) }{ (1395)^{2} } = 0.001708 \]

Question 2e

\[ \hat{\sigma}_{\hat{p}_{st}} = \sqrt{0.001708} = 0.0413 \]

Question 2f

(0.206) +/- (1.645)(0.0413) = [0.138; 0.274]
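The proportion calculations in Question 2 follow the same pattern as those for the mean and can be sketched as below. The rounded sample proportions 0.175 and 0.217 are used, as in the answers. The variable names are illustrative.

```python
from math import sqrt

# Stratum sizes, sample sizes and (rounded) sample proportions from Question 2.
N = [364, 1031]
n = [40, 60]
p_hat = [0.175, 0.217]

N_total = sum(N)

# 2a: stratified estimate of the population proportion.
p_st = sum(Nj * pj for Nj, pj in zip(N, p_hat)) / N_total

# 2b-2c: variance of each stratum proportion, with finite population correction.
var_strata = [pj * (1 - pj) / (nj - 1) * (Nj - nj) / (Nj - 1)
              for Nj, nj, pj in zip(N, n, p_hat)]

# 2d-2e: variance and standard deviation of the stratified proportion.
var_st = sum(Nj ** 2 * vj for Nj, vj in zip(N, var_strata)) / N_total ** 2
se_st = sqrt(var_st)

# 2f: 90% confidence interval, using z = 1.645.
ci = (p_st - 1.645 * se_st, p_st + 1.645 * se_st)
print(round(p_st, 3), round(var_st, 6), round(se_st, 4),
      [round(b, 3) for b in ci])
```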

Question 3a

\[ n_{1} = \frac{75}{125} \times 25 = 15 \]

Question 3b

\[ n_{2} = \frac{30}{125} \times 25 = 6 \]

Question 3c

\[ n_{3} = \frac{20}{125} \times 25 = 4 \]

Question 4a

\[ n_{1} = \frac{100}{225} \times 50 = 22.2 \approx 22 \]

Question 4b

\[ n_{2} = \frac{75}{225} \times 50 = 16.7 \approx 17 \]

Question 4c

\[ n_{3} = \frac{50}{225} \times 50 = 11.1 \approx 11 \]

Question 5a

\[ n_{1} = \frac{250}{500} \times 50 = 25 \]

Question 5b

\[ n_{2} = \frac{100}{500} \times 50 = 10 \]

Question 5c

\[ n_{3} = \frac{150}{500} \times 50 = 15 \]

Question 6a

\[ n_{1} = \frac{250}{500} \times 100 = 50 \]

Question 6b

\[ n_{2} = \frac{100}{500} \times 100 = 20 \]

Question 6c

\[ n_{3} = \frac{150}{500} \times 100 = 30 \]
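Proportional allocation in Questions 3 to 6 always applies the same rule, n_j = (N_j / N) × n, rounded to a whole number where necessary. A small sketch follows; the function name `proportional_allocation` is made up for illustration.

```python
def proportional_allocation(stratum_sizes, n):
    """Stratum sample sizes n_j = n * N_j / N, rounded to the nearest integer."""
    N_total = sum(stratum_sizes)
    return [round(n * Nj / N_total) for Nj in stratum_sizes]

q3 = proportional_allocation([75, 30, 20], 25)
q4 = proportional_allocation([100, 75, 50], 50)
q5 = proportional_allocation([250, 100, 150], 50)
q6 = proportional_allocation([250, 100, 150], 100)
print(q3, q4, q5, q6)
```

Rounding does not always make the stratum sizes add up to n exactly. When they do not, one stratum has to be adjusted by hand.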

Question 7

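Proportional allocation assigns sample effort to each stratum in proportion to its size N_j. Optimal allocation also takes variability into account and allocates relatively more sample effort to strata in which the population variance is highest.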
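The contrast can be made concrete with a commonly used form of optimal allocation for a mean, n_j = n × N_jσ_j / Σ N_iσ_i. The sketch below assumes this is the form intended here. It reuses the stratum sizes of Question 3 and, purely as a stand-in for the unknown population standard deviations, the sample standard deviations from Question 1.

```python
def proportional(stratum_sizes, n):
    """n_j proportional to stratum size N_j (not rounded)."""
    total = sum(stratum_sizes)
    return [n * Nj / total for Nj in stratum_sizes]

def optimal_mean(stratum_sizes, sigmas, n):
    """n_j proportional to N_j * sigma_j (not rounded)."""
    weights = [Nj * sj for Nj, sj in zip(stratum_sizes, sigmas)]
    total = sum(weights)
    return [n * w / total for w in weights]

N = [75, 30, 20]
sigma = [12.8, 11.4, 9.2]   # assumed stand-ins for the population values

alloc_prop = proportional(N, 25)
alloc_opt = optimal_mean(N, sigma, 25)
print([round(a, 2) for a in alloc_prop])
print([round(a, 2) for a in alloc_opt])
```

The first stratum has the largest standard deviation, so it receives a larger share under optimal allocation than under proportional allocation.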

Question 8

With proportional allocation, the stratum sample sizes depend only on the stratum sizes. Optimal allocation allocates more sample observations to strata in which the true population proportions are closest to 0.50.
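For proportions, a commonly used form of optimal allocation weights each stratum by N_j√(P_j(1 − P_j)), a quantity that is largest when P_j = 0.50. The sketch below assumes that form. It borrows the stratum sizes and sample proportions of Question 2 as hypothetical stand-ins for the true proportions, with n = 100.

```python
from math import sqrt

def optimal_proportion(stratum_sizes, proportions, n):
    """n_j proportional to N_j * sqrt(P_j * (1 - P_j)) (not rounded)."""
    weights = [Nj * sqrt(Pj * (1 - Pj))
               for Nj, Pj in zip(stratum_sizes, proportions)]
    total = sum(weights)
    return [n * w / total for w in weights]

N = [364, 1031]
P = [0.175, 0.217]   # assumed stand-ins for the true stratum proportions

alloc_opt = optimal_proportion(N, P, 100)
alloc_prop = [100 * Nj / sum(N) for Nj in N]
print([round(a, 2) for a in alloc_prop])
print([round(a, 2) for a in alloc_opt])
```

The second stratum's proportion is closer to 0.50, so its share rises slightly relative to proportional allocation.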

Question 9

In stratified random sampling, a sample is taken from every stratum of the population in an attempt to ensure that important segments of the population are given corresponding weight. In cluster sampling, a random sample of clusters is taken, such that some clusters will have no members in the sample.

Question 10

Advantage: convenience. Disadvantage: the additional imprecision in the sample estimates.

Question 11

Advantage: it enables the researcher, at a low cost, to try out the survey. Disadvantage: time consuming.
