1  Linear Regression

1.1 A simple linear regression

To get us started, we run Example 2.4 from Wooldridge (2015), where we find the relationship between wage (wage) and years of education (educ).

Our model is, then:

\[ \text{wage} = \beta_0 + \beta_1 \times \text{wage} + \epsilon \]

We run the statsmodel code below to estimate \(\beta_0\) and \(\beta_1\).

import pandas as pd
import statsmodels.formula.api as smf
from markdownify import markdownify as md
from IPython.display import display, Markdown

# Load the data
df_wage = pd.read_csv("data/wage1.csv")

# Create an OLS model using the R syntax - assumes an intercept
mod = smf.ols(formula="wage ~ educ", data=df_wage)

# Fit the model
res = mod.fit()

# Show the results
display(Markdown(md(res.summary().tables[1].as_html())))
coef std err t P> t
Intercept -0.9049 0.685 -1.321 0.187 -2.250 0.441
educ 0.5414 0.053 10.167 0.000 0.437 0.646