Econometrics Practice Exercises
Classical Linear Regression Model (CLRM)
Practice Exercise 1: Hypothesis Testing with t-statistics
Problem Statement
You are analyzing the relationship between years of education and hourly wages using a simple linear regression model. A researcher collected data from 45 randomly selected workers and estimated the following regression equation:
$$\text{Wage}_i = \beta_0 + \beta_1 \text{Education}_i + u_i$$
Estimated Results:
| Coefficient | Estimate | Standard Error |
|---|---|---|
| $\hat{\beta}_0$ (Intercept) | 3.25 | 1.84 |
| $\hat{\beta}_1$ (Education) | 1.78 | 0.42 |
| $R^2$ | 0.62 | - |
| Sample size ($n$) | 45 | - |
Additional Information:
- Wage is measured in dollars per hour
- Education is measured in years of schooling completed
- The classical assumptions of the CLRM hold (homoskedasticity, no autocorrelation, normality of errors)
Questions
Part A: Two-Tailed Test for Slope Coefficient
Test whether education has a statistically significant effect on wages at the 5% significance level.
- State the null and alternative hypotheses.
- Calculate the t-statistic.
- Determine the critical value(s).
- State your conclusion in statistical terms.
- Interpret your conclusion in the context of the wage-education relationship.
Calculation Space:
H₀: ________________________________________________
H₁: ________________________________________________
t-statistic formula: t = (β̂₁ - β₁₀) / SE(β̂₁)
t = ________________________________________________
t = ________________________________________________
t = ________________________________________________
degrees of freedom = _______________________________
critical values (α = 0.05, two-tailed): ____________
Decision: __________________________________________
Interpretation: ____________________________________
____________________________________________________
Part B: One-Tailed Test for Slope Coefficient
Test whether each additional year of education increases wages by more than $1.50 per hour at the 1% significance level.
- State the null and alternative hypotheses.
- Calculate the t-statistic.
- Determine the critical value.
- State your conclusion.
Calculation Space:
H₀: ________________________________________________
H₁: ________________________________________________
t = ________________________________________________
t = ________________________________________________
critical value (α = 0.01, one-tailed): _____________
Decision: __________________________________________
Interpretation: ____________________________________
____________________________________________________
Part C: Test for Intercept
Test whether the intercept is significantly different from zero at the 10% significance level.
- State the hypotheses.
- Calculate the t-statistic.
- Make your decision and interpret.
Calculation Space:
H₀: ________________________________________________
H₁: ________________________________________________
t = ________________________________________________
t = ________________________________________________
degrees of freedom = _______________________________
critical values (α = 0.10, two-tailed): ____________
Decision: __________________________________________
Interpretation: ____________________________________
____________________________________________________
Part D: Economic Interpretation
Explain what the coefficient $\hat{\beta}_1 = 1.78$ means in practical terms. If someone completes an additional 4 years of college education, what wage increase would this model predict, assuming all else equal?
Practice Exercise 2: Confidence Intervals and Joint Hypothesis Testing
Problem Statement
A regional transportation authority wants to understand factors affecting monthly public transit ridership across 35 cities. They estimate the following multiple regression model:
$$\text{Ridership}_i = \beta_0 + \beta_1 \text{Fare}_i + \beta_2 \text{Income}_i + \beta_3 \text{PopDensity}_i + u_i$$
Where:
- Ridership: Monthly ridership per 1,000 residents (number of trips)
- Fare: Average one-way fare in dollars
- Income: Median household income in thousands of dollars
- PopDensity: Population density (thousands of people per square km)
Estimated Results:
| Variable | Coefficient | Standard Error |
|---|---|---|
| Intercept ($\hat{\beta}_0$) | 48.6 | 12.3 |
| Fare ($\hat{\beta}_1$) | -4.20 | 1.15 |
| Income ($\hat{\beta}_2$) | 0.85 | 0.32 |
| PopDensity ($\hat{\beta}_3$) | 3.40 | 1.08 |
| Sample size ($n$) | 35 | - |
| $R^2$ | 0.71 | - |
| Adjusted $R^2$ | 0.68 | - |
Questions
Part A: 95% Confidence Interval for Fare Coefficient
Construct and interpret a 95% confidence interval for $\beta_1$ (the effect of fare on ridership).
Calculation Space:
Confidence interval formula: β̂₁ ± t(α/2, df) × SE(β̂₁)
degrees of freedom = n - k - 1 = ____________________
= ____________________
t-critical for 95% CI: ____________________________
Margin of error = __________________________________
= __________________________________
Lower bound = ______________________________________
Upper bound = ______________________________________
95% CI for β₁: [ _______ , _______ ]
Interpretation: What does this confidence interval tell us about the relationship between fares and ridership?
Part B: Hypothesis Test Using Confidence Interval
Using the confidence interval from Part A, test H₀: β₁ = -2.5 vs. H₁: β₁ ≠ -2.5 at the 5% significance level.
Decision Rule: Does -2.5 fall inside or outside the confidence interval?
Conclusion: _______________________________________________
Part C: 90% Confidence Interval for Income Coefficient
Construct a 90% confidence interval for $\beta_2$ and interpret its meaning.
Calculation Space:
t-critical for 90% CI: ____________________________
Margin of error = __________________________________
90% CI for β₂: [ _______ , _______ ]
Interpretation: What does this tell us about the relationship between income and transit ridership?
Part D: Testing a Specific Hypothesis
Test whether population density has a positive effect on ridership at the 5% significance level.
- State the hypotheses.
- Calculate the t-statistic.
- Determine the p-value range using the t-distribution table.
- Make your decision and interpret.
Calculation Space:
H₀: ________________________________________________
H₁: ________________________________________________
t-statistic = _______________________________________
= _______________________________________
= _______________________________________
One-tailed critical value (α = 0.05): ______________
t-statistic > critical value? _______________________
p-value is: (circle one)
p < 0.01 0.01 < p < 0.025 0.025 < p < 0.05
0.05 < p < 0.10 p > 0.10
Decision: __________________________________________
Interpretation: ____________________________________
____________________________________________________
Part E: Joint Interpretation
Suppose a city is considering two policies:
- Policy X: Reduce fare by $0.50
- Policy Y: Increase population density by 0.2 thousand people per square km (through zoning changes)
Based on your regression results, calculate the expected change in ridership per 1,000 residents for each policy. Which policy would be predicted to have a larger impact on ridership?
Calculation Space:
Policy X (Fare reduction):
Expected ΔRidership = ______________________________
= ______________________________
Policy Y (Density increase):
Expected ΔRidership = ______________________________
= ______________________________
Larger predicted impact: ____________________________
ANSWER KEY
Exercise 1 Answers
Part A: Two-Tailed Test for Slope
- Hypotheses:
  - H₀: β₁ = 0 (Education has no effect on wages)
  - H₁: β₁ ≠ 0 (Education has an effect on wages)
- t-statistic:
  $$t = \frac{1.78 - 0}{0.42} = 4.238$$
- Critical values:
  - df = 45 - 2 = 43
  - t-critical (two-tailed, α=0.05) = ±2.017
- Decision: Reject H₀ because |4.238| > 2.017
- Conclusion: There is statistically significant evidence at the 5% level that education affects wages. The p-value is approximately 0.0001 (much less than 0.05).
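The arithmetic in this answer can be checked with a few lines of Python. This is a minimal sketch using only the standard library. The critical value 2.017 is copied from the t-table lookup above rather than computed, and the helper name `t_statistic` is illustrative, not part of the exercise.

```python
def t_statistic(estimate, hypothesized, std_error):
    """t = (estimate - hypothesized value) / standard error."""
    return (estimate - hypothesized) / std_error

# Values from the Exercise 1 results table
beta1_hat, se_beta1, n = 1.78, 0.42, 45
df = n - 2                  # simple regression: intercept plus one slope
t_crit = 2.017              # two-tailed, alpha = 0.05, df = 43 (from a t-table)

t = t_statistic(beta1_hat, 0.0, se_beta1)
reject = abs(t) > t_crit    # two-tailed rule: compare |t| with the critical value

print(f"df = {df}, t = {t:.3f}, reject H0: {reject}")
# df = 43, t = 4.238, reject H0: True
```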
Part B: One-Tailed Test
- Hypotheses:
  - H₀: β₁ ≤ 1.50
  - H₁: β₁ > 1.50
- t-statistic:
  $$t = \frac{1.78 - 1.50}{0.42} = \frac{0.28}{0.42} = 0.667$$
- Critical value:
  - t-critical (one-tailed, α=0.01, df=43) = 2.416
- Decision: Fail to reject H₀ because 0.667 < 2.416
- Conclusion: At the 1% significance level, we do NOT have sufficient evidence to conclude that each year of education increases wages by more than $1.50.
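A short sketch of the same one-tailed test in Python may help show how it differs from a two-tailed test: the null value is 1.50 rather than 0, and the decision rule compares t itself (not its absolute value) with a single critical value. The critical value 2.416 is taken from the t-table lookup above, and the variable names are illustrative.

```python
beta1_hat, se_beta1 = 1.78, 0.42
beta1_null = 1.50           # boundary value of H0: beta1 <= 1.50
t_crit = 2.416              # one-tailed, alpha = 0.01, df = 43 (from a t-table)

t = (beta1_hat - beta1_null) / se_beta1
reject = t > t_crit         # right-tailed rule: no absolute value

print(f"t = {t:.3f}, reject H0: {reject}")
# t = 0.667, reject H0: False
```

The point estimate 1.78 is above 1.50, but it sits less than one standard error above it, which is why the test cannot reject H₀.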
Part C: Test for Intercept
- Hypotheses:
  - H₀: β₀ = 0
  - H₁: β₀ ≠ 0
- t-statistic:
  $$t = \frac{3.25 - 0}{1.84} = 1.766$$
- Critical values:
  - t-critical (two-tailed, α=0.10, df=43) = ±1.681
- Decision: Reject H₀ because |1.766| > 1.681
- Conclusion: The intercept is statistically significant at the 10% level, so the predicted wage at zero years of education differs significantly from zero. (Note: the intercept is the predicted hourly wage for a worker with zero years of education, so this result may have little economic meaning if few or no workers in the sample are near that level.)
Part D: Economic Interpretation
- β̂₁ = 1.78 means: Each additional year of education is associated with an increase of $1.78 per hour in wages, holding all else constant.
- For 4 years of college:
  - Predicted wage increase = 4 × $1.78 = $7.12 per hour
  - If working 2,000 hours/year, this translates to approximately $14,240 additional annual income
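The prediction above is a linear scaling of the slope estimate, which the following sketch reproduces. The 2,000 hours per year figure is the assumption stated in the answer, and the variable names are illustrative.

```python
beta1_hat = 1.78            # dollars per hour per additional year of education
extra_years = 4
hours_per_year = 2000       # assumption from the answer above

hourly_gain = beta1_hat * extra_years
annual_gain = hourly_gain * hours_per_year

print(f"${hourly_gain:.2f} per hour, ${annual_gain:,.0f} per year")
# $7.12 per hour, $14,240 per year
```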
Exercise 2 Answers
Part A: 95% Confidence Interval for Fare Coefficient
- df = 35 - 3 - 1 = 31 (k = 3 regressors)
- t-critical (two-tailed, α=0.05, df=31) = 2.040
- Margin of error = 2.040 × 1.15 = 2.346
- Lower bound = -4.20 - 2.346 = -6.546
- Upper bound = -4.20 + 2.346 = -1.854

95% CI for β₁: [ -6.55 , -1.85 ]
Interpretation: We are 95% confident that a $1 increase in fare is associated with a decrease in ridership of between 1.85 and 6.55 trips per 1,000 residents per month, holding income and population density constant. Since the entire interval is negative, there is strong evidence of an inverse relationship.
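The interval construction can be sketched as a small reusable function. This is an illustration only: the function name `confidence_interval` is hypothetical, and the critical value 2.040 is copied from the t-table lookup above rather than computed.

```python
def confidence_interval(estimate, std_error, t_crit):
    """Return (lower, upper) = estimate -/+ t_crit * std_error."""
    margin = t_crit * std_error
    return estimate - margin, estimate + margin

n, k = 35, 3                # 35 cities, 3 regressors
df = n - k - 1              # 31
t_crit = 2.040              # two-tailed, alpha = 0.05, df = 31 (from a t-table)

lower, upper = confidence_interval(-4.20, 1.15, t_crit)
print(f"df = {df}, 95% CI for beta1: [{lower:.2f}, {upper:.2f}]")
# df = 31, 95% CI for beta1: [-6.55, -1.85]
```

Calling the same function with `t_crit = 1.696` and the income estimates (0.85, 0.32) gives a 90% interval for β₂.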
Part B: Hypothesis Test Using Confidence Interval
H₀: β₁ = -2.5 vs. H₁: β₁ ≠ -2.5
- Decision: Since -2.5 falls WITHIN the 95% CI [-6.55, -1.85], we fail to reject H₀
- Conclusion: At the 5% significance level, we do not have sufficient evidence to reject the claim that the true effect of fare on ridership is -2.5 trips per dollar increase.
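The confidence-interval shortcut gives the same decision as a two-tailed t-test at the same α. The sketch below checks both routes for H₀: β₁ = -2.5, using the critical value 2.040 from the t-table; the variable names are illustrative.

```python
beta1_hat, se_beta1 = -4.20, 1.15
beta1_null = -2.5
t_crit = 2.040              # two-tailed, alpha = 0.05, df = 31 (from a t-table)

# Route 1: is the hypothesized value inside the 95% CI?
margin = t_crit * se_beta1
inside_ci = beta1_hat - margin <= beta1_null <= beta1_hat + margin

# Route 2: is |t| below the critical value?
t = (beta1_hat - beta1_null) / se_beta1
fail_to_reject = abs(t) < t_crit

print(f"t = {t:.3f}, inside CI: {inside_ci}, fail to reject: {fail_to_reject}")
# t = -1.478, inside CI: True, fail to reject: True
```

Both conditions are algebraically the same inequality, |β̂₁ - β₁₀| < t-critical × SE(β̂₁), so they always agree for a two-tailed test.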
Part C: 90% Confidence Interval for Income Coefficient
- t-critical (two-tailed, α=0.10, df=31) = 1.696
- Margin of error = 1.696 × 0.32 = 0.543
- Lower bound = 0.85 - 0.543 = 0.307
- Upper bound = 0.85 + 0.543 = 1.393

90% CI for β₂: [ 0.31 , 1.39 ]
Interpretation: We are 90% confident that a $1,000 increase in median household income is associated with an increase in transit ridership of between 0.31 and 1.39 trips per 1,000 residents per month, holding fare and population density constant. The positive relationship suggests higher-income cities use transit more (perhaps due to downtown employment).
Part D: Testing Population Density Effect
- Hypotheses:
  - H₀: β₃ ≤ 0 (Population density has no positive effect)
  - H₁: β₃ > 0 (Population density has a positive effect)
- t-statistic:
  $$t = \frac{3.40 - 0}{1.08} = 3.148$$
- Critical value:
  - t-critical (one-tailed, α=0.05, df=31) = 1.696
- Decision: Reject H₀ because 3.148 > 1.696
- p-value range: p < 0.01 (actually p ≈ 0.002)
- Conclusion: There is strong statistical evidence that higher population density increases transit ridership. Cities with greater density have significantly more transit usage per capita.
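A t-table only brackets the p-value, but it can also be approximated directly. The sketch below integrates the Student's t density numerically with Simpson's rule, using only the standard library. The helper names `t_pdf` and `upper_tail_p` are hypothetical, and the step count is an arbitrary choice; a statistics package would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail_p(t, df, steps=2000):
    """P(T > t) for t >= 0: 0.5 minus the Simpson's-rule integral of the density on [0, t]."""
    h = t / steps           # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

t = (3.40 - 0) / 1.08
p = upper_tail_p(t, df=31)
print(f"t = {t:.3f}, one-tailed p = {p:.4f}")
```

The result should fall in the p < 0.01 bracket selected above, consistent with p ≈ 0.002.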
Part E: Policy Comparison
Policy X (Fare reduction of $0.50):
$$\Delta \text{Ridership} = (-4.20) \times (-0.50) = +2.10 \text{ trips per 1,000 residents}$$
Policy Y (Density increase of 0.2):
$$\Delta \text{Ridership} = 3.40 \times 0.2 = +0.68 \text{ trips per 1,000 residents}$$
Larger predicted impact: Policy X (fare reduction)
The fare reduction is predicted to increase ridership by about three times as much as the density increase (2.10 / 0.68 ≈ 3.1), based on these coefficient estimates.
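The comparison is coefficient times change in the regressor for each policy, which can be sketched as follows. The variable names are illustrative, and the calculation uses point estimates only, ignoring their standard errors.

```python
beta_fare, beta_density = -4.20, 3.40   # coefficient estimates from the results table

delta_x = beta_fare * (-0.50)           # Policy X: fare falls by $0.50
delta_y = beta_density * 0.2            # Policy Y: density rises by 0.2

larger = "Policy X" if delta_x > delta_y else "Policy Y"
print(f"X: {delta_x:+.2f}, Y: {delta_y:+.2f}, ratio: {delta_x / delta_y:.1f}, larger: {larger}")
# X: +2.10, Y: +0.68, ratio: 3.1, larger: Policy X
```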
Common Mistakes to Avoid
- Degrees of freedom: Remember df = n - k - 1 for multiple regression (where k = number of slope coefficients). For simple regression, df = n - 2.
- One-tailed vs two-tailed: Always check whether the alternative hypothesis uses ≠ (two-tailed) or < / > (one-tailed). This affects your critical value.
- Sign interpretation: When interpreting coefficients, always explain both the magnitude AND the direction (positive/negative).
- Confidence interval for hypothesis testing: If the hypothesized value falls within the 100(1-α)% confidence interval, you fail to reject H₀ at significance level α in a two-tailed test.
- Practical vs statistical significance: A coefficient can be statistically significant (large t-statistic) but economically small, or vice versa. Always consider both!
End of Practice Exercises