EXAMPLE 13. VARIABLE SELECTION IN MULTIPLE REGRESSION: HALD DATA
23:42 Sunday, July 22, 2001

The REG Procedure
Model: MODEL1
Dependent Variable: Y

R-Square Selection Method

Number in
  Model      R-Square        C(p)    Variables in Model

      1        0.6745    138.7308    X4
      1        0.6663    142.4864    X2
-------------------------------------------------------
      2        0.9787      2.6782    X1 X2
      2        0.9725      5.4959    X1 X4
-------------------------------------------------------
      3        0.9823      3.0182    X1 X2 X4
      3        0.9823      3.0413    X1 X2 X3
-------------------------------------------------------
      4        0.9824      5.0000    X1 X2 X3 X4


Forward Selection: Step 1

Variable X4 Entered: R-Square = 0.6745 and C(p) = 138.7308

                          Analysis of Variance

                               Sum of          Mean
Source             DF         Squares        Square    F Value    Pr > F

Model               1      1831.89616    1831.89616      22.80    0.0006
Error              11       883.86692      80.35154
Corrected Total    12      2715.76308

              Parameter     Standard
Variable       Estimate        Error    Type II SS    F Value    Pr > F

Intercept     117.56793      5.26221         40108     499.16    <.0001
X4             -0.73816      0.15460    1831.89616      22.80    0.0006

Bounds on condition number: 1, 1
------------------------------------------------------------------------

Forward Selection: Step 2

Variable X1 Entered: R-Square = 0.9725 and C(p) = 5.4959

                          Analysis of Variance

                               Sum of          Mean
Source             DF         Squares        Square    F Value    Pr > F

Model               2      2641.00096    1320.50048     176.63    <.0001
Error              10        74.76211       7.47621
Corrected Total    12      2715.76308

              Parameter     Standard
Variable       Estimate        Error    Type II SS    F Value    Pr > F

Intercept     103.09738      2.12398         17615    2356.10    <.0001
X1              1.43996      0.13842     809.10480     108.22    <.0001
X4             -0.61395      0.04864    1190.92464     159.30    <.0001

Bounds on condition number: 1.0641, 4.2564
------------------------------------------------------------------------

Forward Selection: Step 3

Variable X2 Entered: R-Square = 0.9823 and C(p) = 3.0182

                          Analysis of Variance

                               Sum of          Mean
Source             DF         Squares        Square    F Value    Pr > F

Model               3      2667.79035     889.26345     166.83    <.0001
Error               9        47.97273       5.33030
Corrected Total    12      2715.76308

              Parameter     Standard
Variable       Estimate        Error    Type II SS    F Value    Pr > F

Intercept      71.64831     14.14239     136.81003      25.67    0.0007
X1              1.45194      0.11700     820.90740     154.01    <.0001
X2              0.41611      0.18561      26.78938       5.03    0.0517
X4             -0.23654      0.17329       9.93175       1.86    0.2054

Bounds on condition number: 18.94, 116.36
------------------------------------------------------------------------

No other variable met the 0.5000 significance level for entry into the model.

                      Summary of Forward Selection

        Variable    Number      Partial       Model
Step    Entered     Vars In    R-Square    R-Square        C(p)    F Value    Pr > F

   1    X4                1      0.6745      0.6745     138.731      22.80    0.0006
   2    X1                2      0.2979      0.9725      5.4959     108.22    <.0001
   3    X2                3      0.0099      0.9823      3.0182       5.03    0.0517

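The Partial R-Square, Model R-Square and F Value columns of the forward-selection summary can be cross-checked against the step tables. The sketch below assumes the usual relationships for a variable that has just entered: its partial R-square is its Type II SS divided by the corrected total SS, and its F value is its Type II SS divided by the error mean square of that step. The function and variable names are hypothetical; the numbers are copied from the listing.

```python
# Cross-check of the forward-selection summary (a sketch, not SAS code).
# All numeric inputs are copied from the printed step tables.

TOTAL_SS = 2715.76308  # corrected total sum of squares

# (variable entered, its Type II SS at entry, error mean square at that step)
steps = [
    ("X4", 1831.89616, 80.35154),
    ("X1", 809.10480, 7.47621),
    ("X2", 26.78938, 5.33030),
]


def partial_r_square(type2_ss, total_ss=TOTAL_SS):
    """Share of the total SS explained by the variable that just entered."""
    return type2_ss / total_ss


def partial_f(type2_ss, mse):
    """F statistic for the entering variable (1 numerator degree of freedom)."""
    return type2_ss / mse


model_r2 = 0.0
for name, ss, mse in steps:
    pr2 = partial_r_square(ss)
    model_r2 += pr2  # model R-square accumulates the partial contributions
    print(f"{name}: partial R2 = {pr2:.4f}, model R2 = {model_r2:.4f}, "
          f"F = {partial_f(ss, mse):.2f}")
```

Small differences in the last printed digit are possible because the inputs are themselves rounded in the listing.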

Backward Elimination: Step 0

All Variables Entered: R-Square = 0.9824 and C(p) = 5.0000

                          Analysis of Variance

                               Sum of          Mean
Source             DF         Squares        Square    F Value    Pr > F

Model               4      2667.89944     666.97486     111.48    <.0001
Error               8        47.86364       5.98295
Corrected Total    12      2715.76308

              Parameter     Standard
Variable       Estimate        Error    Type II SS    F Value    Pr > F

Intercept      62.40537     70.07096       4.74552       0.79    0.3991
X1              1.55110      0.74477      25.95091       4.34    0.0708
X2              0.51017      0.72379       2.97248       0.50    0.5009
X3              0.10191      0.75471       0.10909       0.02    0.8959
X4             -0.14406      0.70905       0.24697       0.04    0.8441

Bounds on condition number: 282.51, 2489.2
------------------------------------------------------------------------

Backward Elimination: Step 1

Variable X3 Removed: R-Square = 0.9823 and C(p) = 3.0182

                          Analysis of Variance

                               Sum of          Mean
Source             DF         Squares        Square    F Value    Pr > F

Model               3      2667.79035     889.26345     166.83    <.0001
Error               9        47.97273       5.33030
Corrected Total    12      2715.76308

              Parameter     Standard
Variable       Estimate        Error    Type II SS    F Value    Pr > F

Intercept      71.64831     14.14239     136.81003      25.67    0.0007
X1              1.45194      0.11700     820.90740     154.01    <.0001
X2              0.41611      0.18561      26.78938       5.03    0.0517
X4             -0.23654      0.17329       9.93175       1.86    0.2054

Bounds on condition number: 18.94, 116.36
------------------------------------------------------------------------

Backward Elimination: Step 2

Variable X4 Removed: R-Square = 0.9787 and C(p) = 2.6782

                          Analysis of Variance

                               Sum of          Mean
Source             DF         Squares        Square    F Value    Pr > F

Model               2      2657.85859    1328.92930     229.50    <.0001
Error              10        57.90448       5.79045
Corrected Total    12      2715.76308

              Parameter     Standard
Variable       Estimate        Error    Type II SS    F Value    Pr > F

Intercept      52.57735      2.28617    3062.60416     528.91    <.0001
X1              1.46831      0.12130     848.43186     146.52    <.0001
X2              0.66225      0.04585    1207.78227     208.58    <.0001

Bounds on condition number: 1.0551, 4.2205
------------------------------------------------------------------------

All variables left in the model are significant at the 0.1000 level.

                     Summary of Backward Elimination

        Variable    Number      Partial       Model
Step    Removed     Vars In    R-Square    R-Square        C(p)    F Value    Pr > F

   1    X3                3      0.0000      0.9823      3.0182       0.02    0.8959
   2    X4                2      0.0037      0.9787      2.6782       1.86    0.2054

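The removal sequence in the backward-elimination summary is consistent with a simple rule: at each step, drop the predictor with the largest Pr > F, provided it exceeds the 0.1000 stay level quoted in the listing. The sketch below illustrates that rule only; it is not the procedure's internal algorithm, and the helper name is hypothetical. The p-values are copied from the step tables, with 0.0001 used as a stand-in upper bound wherever the listing prints "<.0001".

```python
# Illustration of the backward-elimination decision rule (a sketch under the
# assumptions stated above, using the p-values printed in the listing).

SLSTAY = 0.10  # stay level reported in the listing


def variable_to_remove(p_values, slstay=SLSTAY):
    """Return the predictor with the largest p-value if it exceeds slstay,
    otherwise None (meaning elimination stops)."""
    name, p = max(p_values.items(), key=lambda item: item[1])
    return name if p > slstay else None


# Pr > F per predictor at each step; 0.0001 stands in for "<.0001".
step0 = {"X1": 0.0708, "X2": 0.5009, "X3": 0.8959, "X4": 0.8441}
step1 = {"X1": 0.0001, "X2": 0.0517, "X4": 0.2054}
step2 = {"X1": 0.0001, "X2": 0.0001}

for label, p_values in [("Step 0", step0), ("Step 1", step1), ("Step 2", step2)]:
    print(label, "->", variable_to_remove(p_values))
```

With one degree of freedom per predictor, choosing the largest p-value is equivalent to choosing the smallest F value, which is how the step tables would normally be read.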

Stepwise Selection: Step 1

Variable X4 Entered: R-Square = 0.6745 and C(p) = 138.7308

                          Analysis of Variance

                               Sum of          Mean
Source             DF         Squares        Square    F Value    Pr > F

Model               1      1831.89616    1831.89616      22.80    0.0006
Error              11       883.86692      80.35154
Corrected Total    12      2715.76308

              Parameter     Standard
Variable       Estimate        Error    Type II SS    F Value    Pr > F

Intercept     117.56793      5.26221         40108     499.16    <.0001
X4             -0.73816      0.15460    1831.89616      22.80    0.0006

Bounds on condition number: 1, 1
------------------------------------------------------------------------

Stepwise Selection: Step 2

Variable X1 Entered: R-Square = 0.9725 and C(p) = 5.4959

                          Analysis of Variance

                               Sum of          Mean
Source             DF         Squares        Square    F Value    Pr > F

Model               2      2641.00096    1320.50048     176.63    <.0001
Error              10        74.76211       7.47621
Corrected Total    12      2715.76308

              Parameter     Standard
Variable       Estimate        Error    Type II SS    F Value    Pr > F

Intercept     103.09738      2.12398         17615    2356.10    <.0001
X1              1.43996      0.13842     809.10480     108.22    <.0001
X4             -0.61395      0.04864    1190.92464     159.30    <.0001

Bounds on condition number: 1.0641, 4.2564
------------------------------------------------------------------------

Stepwise Selection: Step 3

Variable X2 Entered: R-Square = 0.9823 and C(p) = 3.0182

                          Analysis of Variance

                               Sum of          Mean
Source             DF         Squares        Square    F Value    Pr > F

Model               3      2667.79035     889.26345     166.83    <.0001
Error               9        47.97273       5.33030
Corrected Total    12      2715.76308

              Parameter     Standard
Variable       Estimate        Error    Type II SS    F Value    Pr > F

Intercept      71.64831     14.14239     136.81003      25.67    0.0007
X1              1.45194      0.11700     820.90740     154.01    <.0001
X2              0.41611      0.18561      26.78938       5.03    0.0517
X4             -0.23654      0.17329       9.93175       1.86    0.2054

Bounds on condition number: 18.94, 116.36
------------------------------------------------------------------------

Stepwise Selection: Step 4

Variable X4 Removed: R-Square = 0.9787 and C(p) = 2.6782

                          Analysis of Variance

                               Sum of          Mean
Source             DF         Squares        Square    F Value    Pr > F

Model               2      2657.85859    1328.92930     229.50    <.0001
Error              10        57.90448       5.79045
Corrected Total    12      2715.76308

              Parameter     Standard
Variable       Estimate        Error    Type II SS    F Value    Pr > F

Intercept      52.57735      2.28617    3062.60416     528.91    <.0001
X1              1.46831      0.12130     848.43186     146.52    <.0001
X2              0.66225      0.04585    1207.78227     208.58    <.0001

Bounds on condition number: 1.0551, 4.2205
------------------------------------------------------------------------

All variables left in the model are significant at the 0.1500 level.

No other variable met the 0.1500 significance level for entry into the model.

                            Summary of Stepwise Selection

        Variable    Variable    Number      Partial       Model
Step    Entered     Removed     Vars In    R-Square    R-Square        C(p)    F Value    Pr > F

   1    X4                            1      0.6745      0.6745     138.731      22.80    0.0006
   2    X1                            2      0.2979      0.9725      5.4959     108.22    <.0001
   3    X2                            3      0.0099      0.9823      3.0182       5.03    0.0517
   4                X4                2      0.0037      0.9787      2.6782       1.86    0.2054


Maximum R-Square Improvement: Step 1

Variable X4 Entered: R-Square = 0.6745 and C(p) = 138.7308

                          Analysis of Variance

                               Sum of          Mean
Source             DF         Squares        Square    F Value    Pr > F

Model               1      1831.89616    1831.89616      22.80    0.0006
Error              11       883.86692      80.35154
Corrected Total    12      2715.76308

              Parameter     Standard
Variable       Estimate        Error    Type II SS    F Value    Pr > F

Intercept     117.56793      5.26221         40108     499.16    <.0001
X4             -0.73816      0.15460    1831.89616      22.80    0.0006

Bounds on condition number: 1, 1
------------------------------------------------------------------------

The above model is the best 1-variable model found.

Maximum R-Square Improvement: Step 2

Variable X1 Entered: R-Square = 0.9725 and C(p) = 5.4959

                          Analysis of Variance

                               Sum of          Mean
Source             DF         Squares        Square    F Value    Pr > F

Model               2      2641.00096    1320.50048     176.63    <.0001
Error              10        74.76211       7.47621
Corrected Total    12      2715.76308

              Parameter     Standard
Variable       Estimate        Error    Type II SS    F Value    Pr > F

Intercept     103.09738      2.12398         17615    2356.10    <.0001
X1              1.43996      0.13842     809.10480     108.22    <.0001
X4             -0.61395      0.04864    1190.92464     159.30    <.0001

Bounds on condition number: 1.0641, 4.2564
------------------------------------------------------------------------

Maximum R-Square Improvement: Step 3

Variable X4 Removed: R-Square = 0.9787 and C(p) = 2.6782
Variable X2 Entered

                          Analysis of Variance

                               Sum of          Mean
Source             DF         Squares        Square    F Value    Pr > F

Model               2      2657.85859    1328.92930     229.50    <.0001
Error              10        57.90448       5.79045
Corrected Total    12      2715.76308

              Parameter     Standard
Variable       Estimate        Error    Type II SS    F Value    Pr > F

Intercept      52.57735      2.28617    3062.60416     528.91    <.0001
X1              1.46831      0.12130     848.43186     146.52    <.0001
X2              0.66225      0.04585    1207.78227     208.58    <.0001

Bounds on condition number: 1.0551, 4.2205
------------------------------------------------------------------------

The above model is the best 2-variable model found.

Maximum R-Square Improvement: Step 4

Variable X4 Entered: R-Square = 0.9823 and C(p) = 3.0182

                          Analysis of Variance

                               Sum of          Mean
Source             DF         Squares        Square    F Value    Pr > F

Model               3      2667.79035     889.26345     166.83    <.0001
Error               9        47.97273       5.33030
Corrected Total    12      2715.76308

              Parameter     Standard
Variable       Estimate        Error    Type II SS    F Value    Pr > F

Intercept      71.64831     14.14239     136.81003      25.67    0.0007
X1              1.45194      0.11700     820.90740     154.01    <.0001
X2              0.41611      0.18561      26.78938       5.03    0.0517
X4             -0.23654      0.17329       9.93175       1.86    0.2054

Bounds on condition number: 18.94, 116.36
------------------------------------------------------------------------

The above model is the best 3-variable model found.

Maximum R-Square Improvement: Step 5

Variable X3 Entered: R-Square = 0.9824 and C(p) = 5.0000

                          Analysis of Variance

                               Sum of          Mean
Source             DF         Squares        Square    F Value    Pr > F

Model               4      2667.89944     666.97486     111.48    <.0001
Error               8        47.86364       5.98295
Corrected Total    12      2715.76308

              Parameter     Standard
Variable       Estimate        Error    Type II SS    F Value    Pr > F

Intercept      62.40537     70.07096       4.74552       0.79    0.3991
X1              1.55110      0.74477      25.95091       4.34    0.0708
X2              0.51017      0.72379       2.97248       0.50    0.5009
X3              0.10191      0.75471       0.10909       0.02    0.8959
X4             -0.14406      0.70905       0.24697       0.04    0.8441

Bounds on condition number: 282.51, 2489.2
------------------------------------------------------------------------

The above model is the best 4-variable model found.

No further improvement in R-Square is possible.
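
The C(p) values printed throughout the listing can be checked from the error sums of squares. The sketch below assumes the standard definition of Mallows' C(p), namely SSE_p / MSE_full - (n - 2p), where p counts the intercept plus the predictors in the submodel, MSE_full is the error mean square of the four-variable model, and n = 13 (corrected total DF of 12, plus 1). The function name and dictionary layout are hypothetical; the numbers are copied from the ANOVA tables.

```python
# Check of the printed C(p) values (a sketch assuming the standard
# Mallows' C(p) formula; inputs copied from the ANOVA tables).

N_OBS = 13          # corrected total DF (12) + 1
MSE_FULL = 5.98295  # error mean square of the model with X1 X2 X3 X4


def mallows_cp(sse, n_params, n_obs=N_OBS, mse_full=MSE_FULL):
    """Mallows' C(p) for a submodel with n_params parameters (incl. intercept)."""
    return sse / mse_full - (n_obs - 2 * n_params)


# submodel -> (error sum of squares, number of parameters incl. intercept)
models = {
    "X4": (883.86692, 2),
    "X1 X4": (74.76211, 3),
    "X1 X2": (57.90448, 3),
    "X1 X2 X4": (47.97273, 4),
    "X1 X2 X3 X4": (47.86364, 5),
}

for name, (sse, n_params) in models.items():
    print(f"{name}: C(p) = {mallows_cp(sse, n_params):.4f}")
```

The full model always yields C(p) equal to its own parameter count (5 here), which matches the 5.0000 printed for X1 X2 X3 X4. The X1 X2 X3 submodel is omitted because its error sum of squares does not appear in the listing.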