## Exam # 3

DS 533 Fall 2003


Name: _____KEY______________

1. Barbara Lynch is the product manager for a line of ski wear produced by HealthCo Industries. She would like to provide a quarterly forecast of sales for the Northern United States. One of her colleagues suggested that unemployment (NRUR) and income (INC) in the regions in which the clothes are marketed might be causally connected to sales. To see whether a multiple regression model would work well, she fit the model to a 10-year quarterly sales history using Excel. The partial summary output is given below.

SUMMARY OUTPUT

| Regression Statistics | |
|---|---|
| Multiple R | 0.930280099 |
| R Square | 0.865421062 |
| Adjusted R Square | 0.858146525 |
| Standard Error | 25369.96937 |
| Observations | 40 |

ANOVA

| | df | SS | MS | F | Significance F |
|---|---|---|---|---|---|
| Regression | 2 | 1.53141E+11 | 76570587062 | 118.9657895 | |
| Residual | 37 | 23814507796 | 643635345.8 | | |
| Total | 39 | 1.76956E+11 | | | |

| | Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
|---|---|---|---|---|---|---|
| Intercept | 89861.01147 | 24296.0548 | 3.69858449 | 0.000700378 | 40632.57637 | 139089.4466 |
| Inc | 121.9236041 | 8.404331996 | 14.50723319 | 7.58631E-17 | 104.8948265 | 138.9523816 |
| NRUR | -1824.933699 | 3162.803674 | -0.576998729 | 0.567434682 | -8233.376415 | 4583.509017 |

a) Give the prediction equation based on this analysis. Do the signs on the coefficients make sense? Explain why.

ŷ = 89861.01 + 121.92 INC - 1824.93 NRUR

Yes. The positive coefficient on income indicates that sales of ski wear rise with income, and the negative coefficient on unemployment indicates that people are less likely to buy ski wear when unemployment is higher.

b) Perform an overall test of fit of the model (State the Null and alternative Hypothesis, the test statistic, and your decision criterion, and your conclusion).

Ho: β1 = β2 = 0

Ha: at least one of β1, β2 is not equal to zero.

Test statistic: F = 118.97

Decision criterion: reject Ho if F > F(α, 2, 37).

Conclusion: reject Ho; at least one of the independent variables is a significant predictor.

c) What percentage of the variation in sales is explained by this model?

About 87% (R Square = 0.8654).
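As a check on parts (b) and (c), the F statistic and R Square can be recomputed from the mean squares in the ANOVA table. The sketch below assumes only the table values; the variable names are illustrative and not part of the Excel output.

```python
# Values copied from the ANOVA table in the Excel summary output.
n = 40                          # observations
k = 2                           # independent variables (INC, NRUR)
ms_regression = 76570587062.0   # MS for Regression
ms_residual = 643635345.8       # MS for Residual

# Recover the sums of squares from the mean squares and their df.
ss_regression = ms_regression * k
ss_residual = ms_residual * (n - k - 1)
ss_total = ss_regression + ss_residual

f_stat = ms_regression / ms_residual
r_squared = ss_regression / ss_total
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

print(f"F = {f_stat:.2f}")
print(f"R Square = {r_squared:.4f}")
print(f"Adjusted R Square = {adj_r_squared:.4f}")
```

The recomputed values should agree with the Regression Statistics block up to rounding.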

d) Do you think we need all the variables in the model? If not, which one do you think should be dropped, and why? Make sure you state your null and alternative hypotheses, the test statistic, and your decision criterion or the p-value of your test.

No. Given income in the model, unemployment is not a significant predictor, so we can drop it.

Ho: β2 = 0

Ha: β2 ≠ 0

Test statistic: t = -0.577

Do not reject Ho, since the p-value = 0.567 > 0.05.
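The t statistics in the Excel output are each coefficient divided by its standard error, which can be verified directly. In this sketch the critical value 2.026 for t(.025, 37) is an assumed table value, and the dictionary layout is only for illustration.

```python
# (coefficient, standard error) pairs copied from the Excel output.
coefficients = {
    "Intercept": (89861.01147, 24296.0548),
    "INC": (121.9236041, 8.404331996),
    "NRUR": (-1824.933699, 3162.803674),
}

# Assumed two-tailed 5% critical value for 37 degrees of freedom
# (read from a t table, not computed here).
T_CRITICAL = 2.026

t_stats = {name: coef / se for name, (coef, se) in coefficients.items()}

for name, t in t_stats.items():
    verdict = "significant" if abs(t) > T_CRITICAL else "not significant"
    print(f"{name}: t = {t:.3f} ({verdict} at the 5% level)")
```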

e) Use the model in part (a) to make a sales forecast (SF1) for 1998Q1 through 1998Q4, given the following values for unemployment (NRUR) and income (INC):

| Period | NRUR | INC | SF1 |
|---|---|---|---|
| 1998Q1 | 7.6 | 1928 | 311060.22 |
| 1998Q2 | 7.7 | 1972 | 316242.36 |
| 1998Q3 | 7.5 | 2017 | 322093.91 |
| 1998Q4 | 7.4 | 2062 | 327762.96 |

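With the rounded coefficients from part (a):

1998Q1: ŷ = 89861.01 + 121.92(1928) - 1824.93(7.6) = 311053.30

1998Q2: ŷ = 89861.01 + 121.92(1972) - 1824.93(7.7) = 316235.29

1998Q3: ŷ = 89861.01 + 121.92(2017) - 1824.93(7.5) = 322086.68

1998Q4: ŷ = 89861.01 + 121.92(2062) - 1824.93(7.4) = 327755.57

The SF1 values in the table differ slightly because they use the unrounded coefficients from the Excel output.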
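The four forecasts can be reproduced in a few lines. This is a minimal sketch that assumes the unrounded coefficients from the Excel output; the function name `forecast_sales` is hypothetical.

```python
# Unrounded coefficients from the Excel summary output.
B0 = 89861.01147        # intercept
B_INC = 121.9236041     # income coefficient
B_NRUR = -1824.933699   # unemployment coefficient

# (period, NRUR, INC) as given in part (e).
periods = [
    ("1998Q1", 7.6, 1928),
    ("1998Q2", 7.7, 1972),
    ("1998Q3", 7.5, 2017),
    ("1998Q4", 7.4, 2062),
]


def forecast_sales(nrur, inc):
    """Evaluate the fitted multiple regression equation."""
    return B0 + B_INC * inc + B_NRUR * nrur


forecasts = {period: forecast_sales(nrur, inc) for period, nrur, inc in periods}

for period, value in forecasts.items():
    print(f"{period}: {value:.2f}")
```

Using the rounded coefficients from part (a) instead shifts each forecast by a few units.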

2. Nelson Industries manufactures a part for a type of aircraft engine that is becoming obsolete. The sales history for the last 10 years is as follows:

| Sales | Year |
|---|---|
| 945 | 1 |
| 875 | 2 |
| 760 | 3 |
| 690 | 4 |
| 545 | 5 |
| 420 | 6 |
| 305 | 7 |
| 285 | 8 |
| 250 | 9 |
| 210 | 10 |

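Use the following information to answer these questions (summary quantities as used in the answers below): x̄ = 5.5, ȳ = 528.5, Σ(x - x̄)² = 82.5, Σ(x - x̄)(y - ȳ) = -7322.5, Σ(y - ȳ)² = 673502.5.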

a) Find the prediction equation for a linear trend of sales.

b1 = -7322.5 / 82.5 = -88.76

b0 = 528.5 - (-88.76)(5.5) = 1016.67

ŷ = 1016.67 - 88.76x
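The slope and intercept can be recomputed from the raw sales history. The sketch below uses the standard least squares formulas with only the Python standard library; the variable names are illustrative.

```python
# Sales history for years 1 through 10, from the problem statement.
sales = [945, 875, 760, 690, 545, 420, 305, 285, 250, 210]
years = list(range(1, 11))
n = len(years)

x_bar = sum(years) / n
y_bar = sum(sales) / n

# Sums of squares and cross products about the means.
s_xx = sum((x - x_bar) ** 2 for x in years)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, sales))

b1 = s_xy / s_xx          # slope: change in sales per year
b0 = y_bar - b1 * x_bar   # intercept

print(f"S_xy = {s_xy:.1f}, S_xx = {s_xx:.1f}")
print(f"y-hat = {b0:.2f} + ({b1:.2f})x")
```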

b) Calculate the residuals.

| Sales (Y) | Year (X) | Y-hat | Residual | Residual² |
|---|---|---|---|---|
| 945 | 1 | 928 | 17 | 289 |
| 875 | 2 | 839.33 | 35.67 | 1272.349 |
| 760 | 3 | 750.66 | 9.34 | 87.2356 |
| 690 | 4 | 661.99 | 28.01 | 784.5601 |
| 545 | 5 | 573.32 | -28.32 | 802.0224 |
| 420 | 6 | 484.65 | -64.65 | 4179.623 |
| 305 | 7 | 395.98 | -90.98 | 8277.36 |
| 285 | 8 | 307.31 | -22.31 | 497.7361 |
| 250 | 9 | 218.64 | 31.36 | 983.4496 |
| 210 | 10 | 129.97 | 80.03 | 6404.801 |

c) Calculate the regression standard error estimate (Sy.x).

Sy.x = √(23578.14 / 8) = 54.29
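Parts (b) and (c) can be checked together by computing the residuals from the unrounded fit and then the standard error with n - 2 degrees of freedom. This is a sketch under that assumption; because the key's residual table uses rounded fitted values, its sum of squared residuals (23578.14) differs slightly from the one computed here, although Sy.x agrees to two decimals.

```python
import math

sales = [945, 875, 760, 690, 545, 420, 305, 285, 250, 210]
years = list(range(1, 11))
n = len(years)

# Least squares fit (same formulas as in part a).
x_bar = sum(years) / n
y_bar = sum(sales) / n
s_xx = sum((x - x_bar) ** 2 for x in years)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, sales))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Residual = actual value minus fitted value.
fitted = [b0 + b1 * x for x in years]
residuals = [y - y_hat for y, y_hat in zip(sales, fitted)]

sse = sum(e ** 2 for e in residuals)   # sum of squared residuals
s_yx = math.sqrt(sse / (n - 2))        # regression standard error

for x, y, y_hat, e in zip(years, sales, fitted, residuals):
    print(f"year {x:2d}: Y = {y}, Y-hat = {y_hat:.2f}, residual = {e:.2f}")
print(f"SSE = {sse:.2f}, Sy.x = {s_yx:.2f}")
```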

d) Calculate and interpret R².

R² = (-7322.5 / (√82.5 · √673502.5))² ≈ 0.965

About 97% of the variability in sales of the aircraft engine part can be explained by the linear time trend.
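R² in simple regression is the squared correlation between X and Y, so it can be recomputed from the same sums of squares. A minimal sketch, with illustrative variable names:

```python
import math

sales = [945, 875, 760, 690, 545, 420, 305, 285, 250, 210]
years = list(range(1, 11))
n = len(years)

x_bar = sum(years) / n
y_bar = sum(sales) / n
s_xx = sum((x - x_bar) ** 2 for x in years)
s_yy = sum((y - y_bar) ** 2 for y in sales)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, sales))

r = s_xy / (math.sqrt(s_xx) * math.sqrt(s_yy))   # correlation coefficient
r_squared = r ** 2

print(f"S_yy = {s_yy:.1f}")
print(f"r = {r:.4f}, R^2 = {r_squared:.4f}")
```

The correlation itself is negative because sales decline over time; squaring it removes the sign.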

e) Give a 95% confidence interval estimate of the rate of decline in sales of the aircraft engine part. Explain what it means.

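b1 ± t(.025, 8) s(b1)

s(b1) = Sy.x / √82.5 = 54.29 / √82.5 = 5.98

-88.76 ± 2.306(5.98) = -88.76 ± 13.78, or (-102.54, -74.98)

We can be 95% confident that sales of the part decline by between 74.98 and 102.54 units per year.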
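The interval for the slope can be reproduced from the data. In this sketch the critical value 2.306 for t(.025, 8) is taken from the answer above rather than computed, and the variable names are illustrative.

```python
import math

sales = [945, 875, 760, 690, 545, 420, 305, 285, 250, 210]
years = list(range(1, 11))
n = len(years)

x_bar = sum(years) / n
y_bar = sum(sales) / n
s_xx = sum((x - x_bar) ** 2 for x in years)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, sales))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Regression standard error with n - 2 degrees of freedom.
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(years, sales))
s_yx = math.sqrt(sse / (n - 2))

T_CRITICAL = 2.306                # t(.025, 8), table value used in the key
s_b1 = s_yx / math.sqrt(s_xx)     # standard error of the slope
margin = T_CRITICAL * s_b1
lower, upper = b1 - margin, b1 + margin

print(f"s(b1) = {s_b1:.2f}")
print(f"95% CI for the slope: ({lower:.2f}, {upper:.2f})")
```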

f) Forecast the sales for the next year (year = 11).

ŷ = 1016.67 - 88.76(11) = 40.31

g) Give a 90% prediction interval for next year's sales.

Sf = Sy.x √(1 + 1/n + (x0 - x̄)² / Σ(x - x̄)²) = 54.29 √(1 + 1/10 + (11 - 5.5)² / 82.5) = 65.75

ŷ ± t(.05, n-2) Sf

40.31 ± 1.86(65.75) = 40.31 ± 122.295, or (-81.99, 162.61)
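The year-11 forecast and its prediction interval can be reproduced as follows. This sketch assumes the usual simple-regression forecast standard error, Sf = Sy.x √(1 + 1/n + (x0 - x̄)²/Sxx), which matches the 65.75 in the answer; the critical value 1.86 for t(.05, 8) is taken from the answer rather than computed.

```python
import math

sales = [945, 875, 760, 690, 545, 420, 305, 285, 250, 210]
years = list(range(1, 11))
n = len(years)

x_bar = sum(years) / n
y_bar = sum(sales) / n
s_xx = sum((x - x_bar) ** 2 for x in years)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(years, sales))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(years, sales))
s_yx = math.sqrt(sse / (n - 2))

x_new = 11                        # next year
y_hat = b0 + b1 * x_new           # point forecast

# Standard error of the forecast for a single new observation.
s_f = s_yx * math.sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / s_xx)

T_CRITICAL = 1.86                 # t(.05, 8), table value used in the key
lower = y_hat - T_CRITICAL * s_f
upper = y_hat + T_CRITICAL * s_f

print(f"forecast = {y_hat:.2f}, Sf = {s_f:.2f}")
print(f"90% prediction interval: ({lower:.2f}, {upper:.2f})")
```

The lower limit is negative, which sales cannot be; this reflects how wide the interval is when extrapolating a linear trend one year past the data.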

Multiple Choice Questions

Select the best answer.

1. The least squares procedure minimizes the sum of

A) the residuals.

B) squared maximum error.

C) absolute errors.

D) squared residuals. **

E) None of the above.

2. A residual is

A) the difference between the mean of Y conditional on X and the unconditional mean.

B) the difference between the mean of Y and its actual value.

C) the difference between the regression prediction of Y and its actual value. **

D) the difference between the sum of squared errors before and after X is used to predict Y.

E) None of the above.

3. The following regression equation was estimated: Y = -2.0 + 4.6X. This indicates that

A) there has been an error since "b" cannot be a negative number.

B) there is a negative relationship between the two variables.

C) Y equals 44 when X is 10. **

D) the correlation coefficient for Y and X will be negative.

E) None of the above.

4. The regression slope term (β) in the simple regression model is

A) correctly interpreted as the change in X given a unit change in Y.

B) usually known to the investigator.

C) the change in the mean of Y given a unit change in X. **

D) undetermined using the ordinary least squares method.

E) None of the above.

5. Visual inspection of the data will help the forecaster identify

A) trend.

B) seasonality.

C) linearity.

D) nonlinearity.

E) All the above. **

6. Testing the null hypothesis that the slope coefficient is zero uses what sampling distribution?

A) Normal.

B) Chi-square.

C) t distribution with n-1 degrees of freedom.

D) Standard Normal.

E) None of the above. **

7. A multiple regression model using 200 data points (with three independent variables) has how many degrees of freedom for testing the statistical significance of individual slope coefficients?

A) 199.

B) 198.

C) 197.

D) 196. **

8. The F-test in multiple regression

A) is used to test for the presence of autocorrelation.

B) tests for the presence of first-order autocorrelation.

C) tests the significance of the Durbin-Watson statistic.

D) tests a null involving all regression slope coefficients simultaneously. **

E) is used to test the significance of individual coefficients.

9. The F-statistic reported in standard multiple regression computer packages tests which hypothesis?

A) H0: β1 ≠ β2 ≠ β3 ≠ ... ≠ βK ≠ 0.

B) H0: β1 + β2 + β3 + ... + βK = 0.

C) H0: β1 = β2 = β3 = ... = βK = 0. **

D) H0: The set of independent variables has a significant linear influence on the dependent variable.

10. If R² is .95 in a simple regression model, it can be said that:

A) X and Y have a correlation of .95.

B) the relationship between X and Y is positive.

C) 5 percent of Y's variability is caused by variability in X.

D) 95% of Y's variability can be explained by X's variability. **

Formulas

Multiple Regressions

Simple Regression

df = Degrees of freedom
