# Econometrics Practice Exercises

## Classical Linear Regression Model (CLRM)

---

# Practice Exercise 1: Hypothesis Testing with t-statistics

## Problem Statement

You are analyzing the relationship between years of education and hourly wages using a simple linear regression model. A researcher collected data from 45 randomly selected workers and estimated the following regression equation:

$$\text{Wage}_i = \beta_0 + \beta_1 \text{Education}_i + u_i$$

### Estimated Results:

| Coefficient | Estimate | Standard Error |
|-------------|----------|----------------|
| $\hat{\beta}_0$ (Intercept) | 3.25 | 1.84 |
| $\hat{\beta}_1$ (Education) | 1.78 | 0.42 |
| $R^2$ | 0.62 | - |
| Sample size ($n$) | 45 | - |

### Additional Information:

- Wage is measured in dollars per hour
- Education is measured in years of schooling completed
- The classical assumptions of the CLRM hold (homoskedasticity, no autocorrelation, normality of errors)

---

## Questions

### Part A: Two-Tailed Test for Slope Coefficient

**Test whether education has a statistically significant effect on wages at the 5% significance level.**

1. State the null and alternative hypotheses.
2. Calculate the t-statistic.
3. Determine the critical value(s).
4. State your conclusion in statistical terms.
5. Interpret your conclusion in the context of the wage-education relationship.

**Calculation Space:**

```
H₀: ________________________________________________

H₁: ________________________________________________

t-statistic formula: t = (β̂₁ - β₁₀) / SE(β̂₁)

t = ________________________________________________

t = ________________________________________________

t = ________________________________________________

degrees of freedom = _______________________________

critical values (α = 0.05, two-tailed): ____________

Decision: __________________________________________

Interpretation: ____________________________________
____________________________________________________
```

### Part B: One-Tailed Test for Slope Coefficient

**Test whether each additional year of education increases wages by more than $1.50 per hour at the 1% significance level.**

1. State the null and alternative hypotheses.
2. Calculate the t-statistic.
3. Determine the critical value.
4. State your conclusion.

**Calculation Space:**

```
H₀: ________________________________________________

H₁: ________________________________________________

t = ________________________________________________

t = ________________________________________________

critical value (α = 0.01, one-tailed): _____________

Decision: __________________________________________

Interpretation: ____________________________________
____________________________________________________
```

### Part C: Test for Intercept

**Test whether the intercept is significantly different from zero at the 10% significance level.**

1. State the hypotheses.
2. Calculate the t-statistic.
3. Make your decision and interpret.

**Calculation Space:**

```
H₀: ________________________________________________

H₁: ________________________________________________

t = ________________________________________________

t = ________________________________________________

degrees of freedom = _______________________________

critical values (α = 0.10, two-tailed): ____________

Decision: __________________________________________

Interpretation: ____________________________________
____________________________________________________
```

### Part D: Economic Interpretation

Explain what the coefficient $\hat{\beta}_1 = 1.78$ means in practical terms. If someone completes an additional 4 years of college education, what would this model predict as their wage increase, assuming all else equal?

---

# Practice Exercise 2: Confidence Intervals and Joint Hypothesis Testing

## Problem Statement

A regional transportation authority wants to understand factors affecting monthly public transit ridership across 35 cities. They estimate the following multiple regression model:

$$\text{Ridership}_i = \beta_0 + \beta_1 \text{Fare}_i + \beta_2 \text{Income}_i + \beta_3 \text{PopDensity}_i + u_i$$

Where:

- **Ridership**: Monthly ridership per 1,000 residents (number of trips)
- **Fare**: Average one-way fare in dollars
- **Income**: Median household income in thousands of dollars
- **PopDensity**: Population density (thousands of people per square km)

### Estimated Results:

| Variable | Coefficient | Standard Error |
|----------|-------------|----------------|
| Intercept ($\hat{\beta}_0$) | 48.6 | 12.3 |
| Fare ($\hat{\beta}_1$) | -4.20 | 1.15 |
| Income ($\hat{\beta}_2$) | 0.85 | 0.32 |
| PopDensity ($\hat{\beta}_3$) | 3.40 | 1.08 |
| Sample size ($n$) | 35 | - |
| $R^2$ | 0.71 | - |
| Adjusted $R^2$ | 0.68 | - |

---

## Questions

### Part A: 95% Confidence Interval for Fare Coefficient

**Construct and interpret a 95% confidence interval for $\beta_1$ (the effect of fare on ridership).**

**Calculation Space:**

```
Confidence interval formula: β̂₁ ± t(α/2, df) × SE(β̂₁)

degrees of freedom = n - k - 1 = ____________________
                               = ____________________

t-critical for 95% CI: ____________________________

Margin of error = __________________________________
                = __________________________________

Lower bound = ______________________________________

Upper bound = ______________________________________

95% CI for β₁: [ _______ , _______ ]
```

**Interpretation:** What does this confidence interval tell us about the relationship between fares and ridership?

### Part B: Hypothesis Test Using Confidence Interval

**Using the confidence interval from Part A, test H₀: β₁ = -2.5 vs. H₁: β₁ ≠ -2.5 at the 5% significance level.**

**Decision Rule:** Does -2.5 fall inside or outside the confidence interval?

**Conclusion:** _______________________________________________

### Part C: 90% Confidence Interval for Income Coefficient

**Construct a 90% confidence interval for $\beta_2$ and interpret its meaning.**

**Calculation Space:**

```
t-critical for 90% CI: ____________________________

Margin of error = __________________________________

90% CI for β₂: [ _______ , _______ ]
```

**Interpretation:** What does this tell us about the relationship between income and transit ridership?

### Part D: Testing a Specific Hypothesis

**Test whether population density has a positive effect on ridership at the 5% significance level.**

1. State the hypotheses.
2. Calculate the t-statistic.
3. Determine the p-value range using the t-distribution table.
4. Make your decision and interpret.

**Calculation Space:**

```
H₀: ________________________________________________

H₁: ________________________________________________

t-statistic = _______________________________________
            = _______________________________________
            = _______________________________________

One-tailed critical value (α = 0.05): ______________

t-statistic > critical value? _______________________

p-value is: (circle one)
    p < 0.01    0.01 < p < 0.025    0.025 < p < 0.05    0.05 < p < 0.10    p > 0.10

Decision: __________________________________________

Interpretation: ____________________________________
____________________________________________________
```

### Part E: Joint Interpretation

Suppose a city is considering two policies:

1. **Policy X:** Reduce fare by $0.50
2. **Policy Y:** Increase population density by 0.2 thousand people per square km (through zoning changes)

Based on your regression results, calculate the **expected change in ridership per 1,000 residents** for each policy. Which policy would be predicted to have a larger impact on ridership?

**Calculation Space:**

```
Policy X (Fare reduction):
Expected ΔRidership = ______________________________
                    = ______________________________

Policy Y (Density increase):
Expected ΔRidership = ______________________________
                    = ______________________________

Larger predicted impact: ____________________________
```

---

# ANSWER KEY

---

## Exercise 1 Answers

### Part A: Two-Tailed Test for Slope

1. **Hypotheses:**
   - H₀: β₁ = 0 (Education has no effect on wages)
   - H₁: β₁ ≠ 0 (Education has an effect on wages)

2. **t-statistic:**
   $$t = \frac{1.78 - 0}{0.42} = 4.238$$

3. **Critical values:**
   - df = 45 - 2 = 43
   - t-critical (two-tailed, α=0.05) = ±2.017

4. **Decision:** Reject H₀ because |4.238| > 2.017

5. **Conclusion:** There is statistically significant evidence at the 5% level that education affects wages. The p-value is approximately 0.0001 (much less than 0.05).

---

### Part B: One-Tailed Test

1. **Hypotheses:**
   - H₀: β₁ ≤ 1.50
   - H₁: β₁ > 1.50

2. **t-statistic:**
   $$t = \frac{1.78 - 1.50}{0.42} = \frac{0.28}{0.42} = 0.667$$

3. **Critical value:**
   - t-critical (one-tailed, α=0.01, df=43) = 2.416

4. **Decision:** Fail to reject H₀ because 0.667 < 2.416

5. **Conclusion:** At the 1% significance level, we do NOT have sufficient evidence to conclude that each year of education increases wages by more than $1.50.
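
The arithmetic in Parts A and B above can be checked with a short script. This is a minimal sketch under stated assumptions: the helper name `t_test` is hypothetical, and the critical values are copied from the answer key (as read from a t-table) rather than computed.

```python
def t_test(estimate, hypothesized, std_error, critical, two_tailed=True):
    """Return (t_statistic, reject_h0) for a single-coefficient t-test.

    `critical` is the positive critical value looked up in a t-table.
    For a one-tailed test this sketch assumes an upper-tail alternative.
    """
    t_stat = (estimate - hypothesized) / std_error
    if two_tailed:
        reject = abs(t_stat) > critical
    else:
        reject = t_stat > critical
    return t_stat, reject


# Part A: H0: beta1 = 0, two-tailed, alpha = 0.05, df = 43
t_a, reject_a = t_test(1.78, 0.0, 0.42, critical=2.017)

# Part B: H0: beta1 <= 1.50, upper-tailed, alpha = 0.01, df = 43
t_b, reject_b = t_test(1.78, 1.50, 0.42, critical=2.416, two_tailed=False)

print(f"Part A: t = {t_a:.3f}, reject H0: {reject_a}")
print(f"Part B: t = {t_b:.3f}, reject H0: {reject_b}")
```

The same helper should also cover the intercept test in Part C by passing the intercept estimate, its standard error, and the 10% two-tailed critical value.
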
---

### Part C: Test for Intercept

1. **Hypotheses:**
   - H₀: β₀ = 0
   - H₁: β₀ ≠ 0

2. **t-statistic:**
   $$t = \frac{3.25 - 0}{1.84} = 1.766$$

3. **Critical values:**
   - t-critical (two-tailed, α=0.10, df=43) = ±1.681

4. **Decision:** Reject H₀ because |1.766| > 1.681

5. **Conclusion:** The intercept is statistically significant at the 10% level, suggesting that the predicted wage at zero years of education differs significantly from zero. (Note: the intercept is the predicted wage at zero years of education; if few or no workers in the sample have zero schooling, this extrapolation may not be economically meaningful.)

---

### Part D: Economic Interpretation

- **β̂₁ = 1.78** means: Each additional year of education is associated with an increase of **$1.78 per hour** in wages, holding all else constant.

- **For 4 years of college:**
  - Predicted wage increase = 4 × $1.78 = **$7.12 per hour**
  - If working 2,000 hours/year, this translates to approximately **$14,240 additional annual income**

---

## Exercise 2 Answers

### Part A: 95% Confidence Interval for Fare Coefficient

- **df** = 35 - 3 - 1 = **31** (k = 3 regressors)
- **t-critical** (two-tailed, α=0.05, df=31) = **2.040**
- **Margin of error** = 2.040 × 1.15 = **2.346**
- **Lower bound** = -4.20 - 2.346 = **-6.546**
- **Upper bound** = -4.20 + 2.346 = **-1.854**

**95% CI for β₁: [ -6.55 , -1.85 ]**

**Interpretation:** We are 95% confident that a $1 increase in fare is associated with a decrease in ridership of between 1.85 and 6.55 trips per 1,000 residents per month. Since the entire interval is negative, there is strong evidence of an inverse relationship.

---

### Part B: Hypothesis Test Using Confidence Interval

**H₀: β₁ = -2.5 vs. H₁: β₁ ≠ -2.5**

- **Decision:** Since -2.5 falls **WITHIN** the 95% CI [-6.55, -1.85], we **fail to reject H₀**
- **Conclusion:** At the 5% significance level, we do not have sufficient evidence to reject the claim that the true effect of fare on ridership is -2.5 trips per dollar increase.
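
The interval in Part A and the interval-based decision in Part B can be reproduced as follows. This is a minimal sketch: `confidence_interval` is a hypothetical helper, and the critical value 2.040 is taken from the answer key rather than computed.

```python
def confidence_interval(estimate, std_error, t_critical):
    """Return (lower, upper) bounds: estimate ± t_critical × std_error."""
    margin = t_critical * std_error
    return estimate - margin, estimate + margin


# Part A: 95% CI for the fare coefficient, df = 35 - 3 - 1 = 31
lower, upper = confidence_interval(-4.20, 1.15, t_critical=2.040)

# Part B: fail to reject H0: beta1 = -2.5 if -2.5 lies inside the interval
hypothesized = -2.5
reject_h0 = not (lower <= hypothesized <= upper)

print(f"95% CI: [{lower:.3f}, {upper:.3f}]")
print(f"Reject H0 (beta1 = -2.5): {reject_h0}")
```

Swapping in the income estimate, its standard error, and the 90% critical value should give the interval for the income coefficient in the same way.
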
---

### Part C: 90% Confidence Interval for Income Coefficient

- **t-critical** (two-tailed, α=0.10, df=31) = **1.696**
- **Margin of error** = 1.696 × 0.32 = **0.543**
- **Lower bound** = 0.85 - 0.543 = **0.307**
- **Upper bound** = 0.85 + 0.543 = **1.393**

**90% CI for β₂: [ 0.31 , 1.39 ]**

**Interpretation:** We are 90% confident that a $1,000 increase in median household income is associated with an increase in transit ridership of between 0.31 and 1.39 trips per 1,000 residents per month, holding fare and population density constant. The positive relationship suggests higher-income cities use transit more (perhaps due to downtown employment).

---

### Part D: Testing Population Density Effect

1. **Hypotheses:**
   - H₀: β₃ ≤ 0 (Population density has no positive effect)
   - H₁: β₃ > 0 (Population density has a positive effect)

2. **t-statistic:**
   $$t = \frac{3.40 - 0}{1.08} = 3.148$$

3. **Critical value:**
   - t-critical (one-tailed, α=0.05, df=31) = **1.696**

4. **Decision:** **Reject H₀** because 3.148 > 1.696

5. **p-value range:** **p < 0.01** (actually p ≈ 0.002)

6. **Conclusion:** There is strong statistical evidence that higher population density increases transit ridership. Cities with greater density have significantly more transit usage per capita.

---

### Part E: Policy Comparison

**Policy X (Fare reduction of $0.50):**

$$\Delta \text{Ridership} = (-4.20) \times (-0.50) = +2.10 \text{ trips per 1,000 residents}$$

**Policy Y (Density increase of 0.2):**

$$\Delta \text{Ridership} = 3.40 \times 0.2 = +0.68 \text{ trips per 1,000 residents}$$

**Larger predicted impact: Policy X (fare reduction)**

The predicted ridership gain from the fare reduction is about **three times as large** as the gain from the density increase (2.10 / 0.68 ≈ 3.1), based on these coefficient estimates.

---

## Common Mistakes to Avoid

1. **Degrees of freedom:** Remember df = n - k - 1 for multiple regression (where k = number of slope coefficients). For simple regression, df = n - 2.

2. **One-tailed vs two-tailed:** Always check whether the alternative hypothesis uses ≠ (two-tailed) or < / > (one-tailed). This affects your critical value.

3. **Sign interpretation:** When interpreting coefficients, always explain both the magnitude AND the direction (positive/negative).

4. **Confidence interval for hypothesis testing:** If the hypothesized value falls within the 100(1-α)% confidence interval, you fail to reject H₀ in a two-tailed test at significance level α.

5. **Practical vs statistical significance:** A coefficient can be statistically significant (large t-statistic) but economically small, or vice versa. Always consider both!

---

*End of Practice Exercises*