# Econometrics Practice Exercises
## Classical Linear Regression Model (CLRM)

---

# Practice Exercise 1: Hypothesis Testing with t-statistics

## Problem Statement

You are analyzing the relationship between years of education and hourly wages using a simple linear regression model. A researcher collected data from 45 randomly selected workers and estimated the following regression equation:

$$\text{Wage}_i = \beta_0 + \beta_1 \text{Education}_i + u_i$$

### Estimated Results:
| Coefficient | Estimate | Standard Error |
|-------------|----------|----------------|
| $\hat{\beta}_0$ (Intercept) | 3.25 | 1.84 |
| $\hat{\beta}_1$ (Education) | 1.78 | 0.42 |
| $R^2$ | 0.62 | - |
| Sample size ($n$) | 45 | - |

### Additional Information:
- Wage is measured in dollars per hour
- Education is measured in years of schooling completed
- The classical assumptions of the CLRM hold (homoskedasticity, no autocorrelation, normality of errors)

---

## Questions

### Part A: Two-Tailed Test for Slope Coefficient
**Test whether education has a statistically significant effect on wages at the 5% significance level.**

1. State the null and alternative hypotheses.
2. Calculate the t-statistic.
3. Determine the critical value(s).
4. State your conclusion in statistical terms.
5. Interpret your conclusion in the context of the wage-education relationship.

**Calculation Space:**
```
H₀: ________________________________________________
H₁: ________________________________________________

t-statistic formula: t = (β̂₁ - β₁₀) / SE(β̂₁)

t = ________________________________________________
t = ________________________________________________
t = ________________________________________________

degrees of freedom = _______________________________

critical values (α = 0.05, two-tailed): ____________

Decision: __________________________________________

Interpretation: ____________________________________
____________________________________________________
```

### Part B: One-Tailed Test for Slope Coefficient
**Test whether each additional year of education increases wages by more than $1.50 per hour at the 1% significance level.**

1. State the null and alternative hypotheses.
2. Calculate the t-statistic.
3. Determine the critical value.
4. State your conclusion.

**Calculation Space:**
```
H₀: ________________________________________________
H₁: ________________________________________________

t = ________________________________________________
t = ________________________________________________

critical value (α = 0.01, one-tailed): _____________

Decision: __________________________________________

Interpretation: ____________________________________
____________________________________________________
```

### Part C: Test for Intercept
**Test whether the intercept is significantly different from zero at the 10% significance level.**

1. State the hypotheses.
2. Calculate the t-statistic.
3. Make your decision and interpret.

**Calculation Space:**
```
H₀: ________________________________________________
H₁: ________________________________________________

t = ________________________________________________
t = ________________________________________________

degrees of freedom = _______________________________

critical values (α = 0.10, two-tailed): ____________

Decision: __________________________________________

Interpretation: ____________________________________
____________________________________________________
```

### Part D: Economic Interpretation
Explain what the coefficient $\hat{\beta}_1 = 1.78$ means in practical terms. If someone completes an additional 4 years of college education, what would this model predict as their wage increase, assuming all else equal?

---

# Practice Exercise 2: Confidence Intervals and Joint Hypothesis Testing

## Problem Statement

A regional transportation authority wants to understand factors affecting monthly public transit ridership across 35 cities. They estimate the following multiple regression model:

$$\text{Ridership}_i = \beta_0 + \beta_1 \text{Fare}_i + \beta_2 \text{Income}_i + \beta_3 \text{PopDensity}_i + u_i$$

Where:
- **Ridership**: Monthly ridership per 1,000 residents (number of trips)
- **Fare**: Average one-way fare in dollars
- **Income**: Median household income in thousands of dollars
- **PopDensity**: Population density (thousands of people per square km)

### Estimated Results:
| Variable | Coefficient | Standard Error |
|----------|-------------|----------------|
| Intercept ($\hat{\beta}_0$) | 48.6 | 12.3 |
| Fare ($\hat{\beta}_1$) | -4.20 | 1.15 |
| Income ($\hat{\beta}_2$) | 0.85 | 0.32 |
| PopDensity ($\hat{\beta}_3$) | 3.40 | 1.08 |
| Sample size ($n$) | 35 | - |
| $R^2$ | 0.71 | - |
| Adjusted $R^2$ | 0.68 | - |

---

## Questions

### Part A: 95% Confidence Interval for Fare Coefficient
**Construct and interpret a 95% confidence interval for $\beta_1$ (the effect of fare on ridership).**

**Calculation Space:**
```
Confidence interval formula: β̂₁ ± t(α/2, df) × SE(β̂₁)

degrees of freedom = n - k - 1 = ____________________
                               = ____________________

t-critical for 95% CI: ____________________________

Margin of error = __________________________________
                = __________________________________

Lower bound = ______________________________________
Upper bound = ______________________________________

95% CI for β₁: [ _______ , _______ ]
```

**Interpretation:** What does this confidence interval tell us about the relationship between fares and ridership?

### Part B: Hypothesis Test Using Confidence Interval
**Using the confidence interval from Part A, test H₀: β₁ = -2.5 vs. H₁: β₁ ≠ -2.5 at the 5% significance level.**

**Decision Rule:** Does -2.5 fall inside or outside the confidence interval?

**Conclusion:** _______________________________________________

### Part C: 90% Confidence Interval for Income Coefficient
**Construct a 90% confidence interval for $\beta_2$ and interpret its meaning.**

**Calculation Space:**
```
t-critical for 90% CI: ____________________________

Margin of error = __________________________________

90% CI for β₂: [ _______ , _______ ]
```

**Interpretation:** What does this tell us about the relationship between income and transit ridership?

### Part D: Testing a Specific Hypothesis
**Test whether population density has a positive effect on ridership at the 5% significance level.**

1. State the hypotheses.
2. Calculate the t-statistic.
3. Determine the p-value range using the t-distribution table.
4. Make your decision and interpret.

**Calculation Space:**
```
H₀: ________________________________________________
H₁: ________________________________________________

t-statistic = _______________________________________
            = _______________________________________
            = _______________________________________

One-tailed critical value (α = 0.05): ______________

t-statistic > critical value? _______________________

p-value is: (circle one)
p < 0.01        0.01 < p < 0.025        0.025 < p < 0.05
0.05 < p < 0.10        p > 0.10

Decision: __________________________________________

Interpretation: ____________________________________
____________________________________________________
```

### Part E: Joint Interpretation
Suppose a city is considering two policies:
1. **Policy X:** Reduce fare by $0.50
2. **Policy Y:** Increase population density by 0.2 (through zoning changes)

Based on your regression results, calculate the **expected change in ridership per 1,000 residents** for each policy. Which policy would be predicted to have a larger impact on ridership?

**Calculation Space:**
```
Policy X (Fare reduction):
Expected ΔRidership = ______________________________
                    = ______________________________

Policy Y (Density increase):
Expected ΔRidership = ______________________________
                    = ______________________________

Larger predicted impact: ____________________________
```

---

# ANSWER KEY

---

## Exercise 1 Answers

### Part A: Two-Tailed Test for Slope

1. **Hypotheses:**
   - H₀: β₁ = 0 (Education has no effect on wages)
   - H₁: β₁ ≠ 0 (Education has an effect on wages)

2. **t-statistic:**
   $$t = \frac{1.78 - 0}{0.42} = 4.238$$

3. **Critical values:**
   - df = 45 - 2 = 43
   - t-critical (two-tailed, α=0.05) = ±2.017

4. **Decision:** Reject H₀ because |4.238| > 2.017

5. **Conclusion:** There is statistically significant evidence at the 5% level that education affects wages. The p-value is approximately 0.0001 (much less than 0.05).
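
The key quotes a p-value without showing where it comes from. The following is a minimal sketch, using only the Python standard library, that integrates the Student's t density numerically to recover the two-tailed p-value for Part A. The helper names `t_pdf` and `upper_tail` are illustrative, and the critical value 2.017 is taken from the key rather than computed.

```python
import math

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))

def upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """P(T > t) for t >= 0, using Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half of the probability mass lies above zero; subtract the mass in [0, t].
    return 0.5 - total * h / 3

beta_hat, null_value, se, n = 1.78, 0.0, 0.42, 45
df = n - 2                                  # simple regression: n - 2
t_stat = (beta_hat - null_value) / se
p_two_tailed = 2 * upper_tail(abs(t_stat), df)
reject = abs(t_stat) > 2.017                # critical value from the answer key

print(f"t = {t_stat:.3f}, df = {df}, p = {p_two_tailed:.5f}, reject H0: {reject}")
```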

---

### Part B: One-Tailed Test

1. **Hypotheses:**
   - H₀: β₁ ≤ 1.50
   - H₁: β₁ > 1.50

2. **t-statistic:**
   $$t = \frac{1.78 - 1.50}{0.42} = \frac{0.28}{0.42} = 0.667$$

3. **Critical value:**
   - t-critical (one-tailed, α=0.01, df=43) = 2.416

4. **Decision:** Fail to reject H₀ because 0.667 < 2.416

5. **Conclusion:** At the 1% significance level, we do NOT have sufficient evidence to conclude that each year of education increases wages by more than $1.50.
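
The only differences between Parts A and B are the hypothesized value and which tail defines the rejection region. As a rough sketch (the function name `t_test` and its `tail` argument are hypothetical, and the critical values are copied from the key), both decisions can be reproduced with one helper:

```python
def t_test(estimate, null_value, se, critical, tail):
    """Return (t, reject) for a t-test with a given positive critical value.

    tail: "two" for H1: beta != null, "upper" for H1: beta > null,
          "lower" for H1: beta < null.
    """
    t = (estimate - null_value) / se
    if tail == "two":
        reject = abs(t) > critical
    elif tail == "upper":
        reject = t > critical
    elif tail == "lower":
        reject = t < -critical
    else:
        raise ValueError(f"unknown tail: {tail!r}")
    return t, reject

# Part A: H0: beta1 = 0, two-tailed, alpha = 0.05
t_a, reject_a = t_test(1.78, 0.0, 0.42, critical=2.017, tail="two")
# Part B: H0: beta1 <= 1.50, upper-tailed, alpha = 0.01
t_b, reject_b = t_test(1.78, 1.50, 0.42, critical=2.416, tail="upper")

print(f"Part A: t = {t_a:.3f}, reject H0: {reject_a}")
print(f"Part B: t = {t_b:.3f}, reject H0: {reject_b}")
```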

---

### Part C: Test for Intercept

1. **Hypotheses:**
   - H₀: β₀ = 0
   - H₁: β₀ ≠ 0

2. **t-statistic:**
   $$t = \frac{3.25 - 0}{1.84} = 1.766$$

3. **Critical values:**
   - t-critical (two-tailed, α=0.10, df=43) = ±1.681

4. **Decision:** Reject H₀ because |1.766| > 1.681

5. **Conclusion:** The intercept is statistically significant at the 10% level: the predicted wage at zero years of education ($3.25 per hour) differs significantly from zero. (Note: This result may carry little economic meaning. If the sample contains few or no workers with zero years of education, the intercept is an extrapolation beyond the observed data.)

---

### Part D: Economic Interpretation

- **β̂₁ = 1.78** means: Each additional year of education is associated with an increase of **$1.78 per hour** in wages, holding all else constant.

- **For 4 years of college:**
  - Predicted wage increase = 4 × $1.78 = **$7.12 per hour**
  - If working 2,000 hours/year, this translates to approximately **$14,240 additional annual income**
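
The arithmetic above can be checked in a few lines. This is only a sketch: the 2,000 hours per year figure is the key's own assumption, and the variable names are illustrative.

```python
beta1_hat = 1.78        # estimated slope: dollars per hour per year of education
extra_years = 4         # four additional years of college
hours_per_year = 2000   # assumed full-time hours, as in the answer key

hourly_gain = beta1_hat * extra_years
annual_gain = hourly_gain * hours_per_year

print(f"Predicted hourly increase: ${hourly_gain:.2f}")
print(f"Predicted annual increase: ${annual_gain:,.0f}")
```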

---

## Exercise 2 Answers

### Part A: 95% Confidence Interval for Fare Coefficient

- **df** = 35 - 3 - 1 = **31** (k = 3 regressors)
- **t-critical** (two-tailed, α=0.05, df=31) = **2.040**

- **Margin of error** = 2.040 × 1.15 = **2.346**
- **Lower bound** = -4.20 - 2.346 = **-6.546**
- **Upper bound** = -4.20 + 2.346 = **-1.854**

**95% CI for β₁: [ -6.55 , -1.85 ]**

**Interpretation:** We are 95% confident that a $1 increase in fare is associated with a decrease in ridership of between 1.85 and 6.55 trips per 1,000 residents per month, holding income and population density constant. Since the entire interval is negative, there is strong evidence of an inverse relationship.
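
The interval calculation follows the same three steps for any coefficient, so it can be wrapped in a small function. A minimal sketch, assuming the critical value is looked up in a t-table (here 2.040, from the key); the name `confidence_interval` is illustrative:

```python
def confidence_interval(estimate, se, t_critical):
    """Return (lower, upper) for estimate ± t_critical × se."""
    margin = t_critical * se
    return estimate - margin, estimate + margin

n, k = 35, 3
df = n - k - 1                      # 31 residual degrees of freedom

lower, upper = confidence_interval(estimate=-4.20, se=1.15, t_critical=2.040)
excludes_zero = not (lower <= 0 <= upper)

print(f"df = {df}, 95% CI for beta1: [{lower:.2f}, {upper:.2f}]")
print(f"Interval excludes zero: {excludes_zero}")
```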

---

### Part B: Hypothesis Test Using Confidence Interval

**H₀: β₁ = -2.5 vs. H₁: β₁ ≠ -2.5**

- **Decision:** Since -2.5 falls **WITHIN** the 95% CI [-6.55, -1.85], we **fail to reject H₀**

- **Conclusion:** At the 5% significance level, we do not have sufficient evidence to reject the claim that the true effect of fare on ridership is -2.5 trips per 1,000 residents for each $1 increase in fare.
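
As a cross-check, the same decision follows from the equivalent two-tailed t-test, using the critical value 2.040 from Part A:

$$t = \frac{-4.20 - (-2.5)}{1.15} = \frac{-1.70}{1.15} \approx -1.478, \qquad |t| = 1.478 < 2.040$$

Because $|t|$ is below the critical value, the t-test also fails to reject H₀, in agreement with the confidence-interval approach.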

---

### Part C: 90% Confidence Interval for Income Coefficient

- **t-critical** (two-tailed, α=0.10, df=31) = **1.696**

- **Margin of error** = 1.696 × 0.32 = **0.543**
- **Lower bound** = 0.85 - 0.543 = **0.307**
- **Upper bound** = 0.85 + 0.543 = **1.393**

**90% CI for β₂: [ 0.31 , 1.39 ]**

**Interpretation:** We are 90% confident that a $1,000 increase in median household income is associated with an increase in transit ridership of between 0.31 and 1.39 trips per 1,000 residents per month, holding fare and population density constant. The positive relationship suggests higher-income cities use transit more (perhaps due to downtown employment).

---

### Part D: Testing Population Density Effect

1. **Hypotheses:**
   - H₀: β₃ ≤ 0 (Population density has no positive effect)
   - H₁: β₃ > 0 (Population density has a positive effect)

2. **t-statistic:**
   $$t = \frac{3.40 - 0}{1.08} = 3.148$$

3. **Critical value:**
   - t-critical (one-tailed, α=0.05, df=31) = **1.696**

4. **Decision:** **Reject H₀** because 3.148 > 1.696

5. **p-value range:** **p < 0.01** (actually p ≈ 0.002)

6. **Conclusion:** There is strong statistical evidence that higher population density increases transit ridership. Holding fare and income constant, cities with greater density have significantly more transit usage per capita.
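
Reading a p-value range off a t-table amounts to finding which pair of tabulated critical values the t-statistic falls between. The sketch below illustrates that lookup. The df = 31 critical values are assumed from a standard one-tailed t-table (only 1.696 and 2.040 appear in this key), so check them against your own table; `p_value_range` is a hypothetical helper name.

```python
# One-tailed critical values for df = 31 as (alpha, critical value) pairs,
# ordered from largest alpha to smallest. Assumed from a standard t-table.
T_TABLE_DF31 = [(0.10, 1.309), (0.05, 1.696), (0.025, 2.040), (0.01, 2.453)]

def p_value_range(t_stat, table):
    """Bracket the upper-tail p-value of t_stat between tabulated alphas."""
    lower_bound, upper_bound = 0.0, 1.0
    for alpha, critical in table:
        if t_stat > critical:
            upper_bound = alpha      # p is below this alpha
        else:
            lower_bound = alpha      # p is at or above this alpha
            break
    return lower_bound, upper_bound

t_stat = (3.40 - 0) / 1.08
low, high = p_value_range(t_stat, T_TABLE_DF31)

print(f"t = {t_stat:.3f}, so {low} < p < {high}")
```

A statistic of 1.8, for comparison, would land between the 0.05 and 0.025 columns, giving 0.025 < p < 0.05.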

---

### Part E: Policy Comparison

**Policy X (Fare reduction of $0.50):**
$$\Delta \text{Ridership} = (-4.20) \times (-0.50) = +2.10 \text{ trips per 1,000 residents}$$

**Policy Y (Density increase of 0.2):**
$$\Delta \text{Ridership} = 3.40 \times 0.2 = +0.68 \text{ trips per 1,000 residents}$$

**Larger predicted impact: Policy X (fare reduction)**

The fare reduction is predicted to increase ridership by about **three times as much** as the density increase (2.10 / 0.68 ≈ 3.1), based on these coefficient estimates.
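
Each predicted change is simply the coefficient multiplied by the change in that regressor, with the other regressors held fixed. A short sketch of the comparison, with illustrative dictionary names:

```python
# Point estimates from the Exercise 2 regression table.
coefficients = {"Fare": -4.20, "PopDensity": 3.40}

# Each policy changes one regressor by the stated amount.
policies = {
    "X": ("Fare", -0.50),        # fare reduced by $0.50
    "Y": ("PopDensity", 0.2),    # density raised by 0.2 (thousand people per sq km)
}

effects = {
    name: coefficients[variable] * change
    for name, (variable, change) in policies.items()
}
larger = max(effects, key=lambda name: abs(effects[name]))
ratio = effects["X"] / effects["Y"]

for name, effect in effects.items():
    print(f"Policy {name}: {effect:+.2f} trips per 1,000 residents")
print(f"Larger predicted impact: Policy {larger} (ratio {ratio:.2f})")
```

These are point predictions only; they ignore the sampling uncertainty reflected in the standard errors.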

---

## Common Mistakes to Avoid

1. **Degrees of freedom:** Remember df = n - k - 1 for multiple regression (where k = number of slope coefficients). For simple regression, df = n - 2.

2. **One-tailed vs two-tailed:** Always check whether the alternative hypothesis uses ≠ (two-tailed) or < / > (one-tailed). This affects your critical value.

3. **Sign interpretation:** When interpreting coefficients, always explain both the magnitude AND the direction (positive/negative).

4. **Confidence interval for hypothesis testing:** If the hypothesized value falls within the 100(1-α)% confidence interval, you fail to reject H₀ in a two-tailed test at significance level α.

5. **Practical vs statistical significance:** A coefficient can be statistically significant (large t-statistic) but economically small, or vice versa. Always consider both!

---

*End of Practice Exercises*