Demo

Ordinary Least Squares / Linear Regression

import numpy as np
import pandas as pd
import statsmodels.api as sm
import seaborn as sns

# simulate two covariates and i.i.d. standard-normal noise
n = 500
np.random.seed(0)
df = pd.DataFrame({
    "x1": np.random.normal(10, 1, n),
    "x2": np.random.normal(2, 1, n),
    "e": np.random.normal(0, 1, n)
})

# true model: y = 2 + 3*x1 - 2*x2 + e
df["y"] = 2 + 3*df["x1"] + (-2)*df["x2"] + df["e"]

# add an intercept column (appended after the covariates) and fit OLS
X = sm.add_constant(df[["x1", "x2"]], prepend=False)
mod = sm.OLS(df["y"], X)
res = mod.fit()
print(res.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.933
Model:                            OLS   Adj. R-squared:                  0.933
Method:                 Least Squares   F-statistic:                     3480.
Date:                Thu, 01 Jun 2023   Prob (F-statistic):          5.01e-293
Time:                        12:46:04   Log-Likelihood:                -691.18
No. Observations:                 500   AIC:                             1388.
Df Residuals:                     497   BIC:                             1401.
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x1             2.9548      0.043     68.145      0.000       2.870       3.040
x2            -2.0109      0.044    -45.317      0.000      -2.098      -1.924
const          2.5011      0.446      5.602      0.000       1.624       3.378
==============================================================================
Omnibus:                        0.355   Durbin-Watson:                   2.034
Prob(Omnibus):                  0.837   Jarque-Bera (JB):                0.203
Skew:                           0.010   Prob(JB):                        0.903
Kurtosis:                       3.097   Cond. No.                         106.
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
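
The coefficient table above can also be pulled out programmatically rather than read off the printed summary. A minimal sketch using standard attributes of the fitted statsmodels results object:

print(res.params)      # coef column
print(res.bse)         # std err column
print(res.tvalues)     # t column
print(res.pvalues)     # P>|t| column
print(res.conf_int())  # the [0.025, 0.975] confidence-interval columns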

Terms:

Goodness of Fit:

  • R-squared: proportion of the variance of the dependent variable explained by the covariates (each statistic below is also available on the fitted results object; see the code sketch after this list)

  • Adjusted R-squared: adjusts R-squared for the number of predictors in the model

  • F-statistic: tests the null hypothesis that all slope coefficients (excluding the intercept) are zero

  • Prob (F-statistic): a low p-value means the model explains significantly more variance than an intercept-only model

  • Log-Likelihood: \(\log p(X \mid \mu, \Sigma)\), the log of the probability that the data were produced by this model

  • AIC: \(-2\log L + kp\) with \(k=2\), where \(p\) is the number of estimated parameters; lower AIC = better fit

  • BIC: \(-2\log L + kp\) with \(k=\log(N)\); lower BIC = better fit

    • BIC penalizes model complexity more than AIC
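
The summary's goodness-of-fit block is a formatted view of quantities that are exposed as attributes of the fitted results object. A minimal sketch (the AIC/BIC recomputation assumes statsmodels counts the intercept as an estimated parameter, which is consistent with the values reported above):

print(res.rsquared, res.rsquared_adj)   # R-squared, Adj. R-squared
print(res.fvalue, res.f_pvalue)         # F-statistic, Prob (F-statistic)
print(res.llf)                          # Log-Likelihood

# recompute AIC/BIC from the log-likelihood; p counts x1, x2, and the intercept
p = res.df_model + 1
print(res.aic, -2*res.llf + 2*p)
print(res.bic, -2*res.llf + np.log(res.nobs)*p)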

Tests for normal, i.i.d. residuals

  • Omnibus: D'Agostino's \(K^2\) statistic, a normality test based on skew and kurtosis

  • Prob(Omnibus): small p-value means reject null of normal dist

  • Skew: perfect symmetry = 0

  • Kurtosis: normal distribution = 3

  • Durbin-Watson: tests for autocorrelation of the errors (i.e., whether they are independent)

    • ranges from 0 to 4; values near 2 indicate no autocorrelation, values toward 0 indicate positive autocorrelation, values toward 4 indicate negative autocorrelation

  • Jarque-Bera (JB): also tests normality of residuals

  • Prob(JB): small p-value means reject null of normal dist

  • Cond. No.: the condition number of the design matrix of the covariates, used to diagnose multicollinearity

    • large values indicate possible multicollinearity or other numerical problems (see the code sketch after this list for computing these diagnostics directly)
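
The residual diagnostics in this block can be computed directly with functions from statsmodels.stats.stattools, and the condition number with numpy. A minimal sketch (the values should match the summary table above):

from statsmodels.stats.stattools import durbin_watson, jarque_bera, omni_normtest

# Omnibus (D'Agostino K^2) normality test: statistic and p-value
omni_stat, omni_p = omni_normtest(res.resid)

# Jarque-Bera test: statistic, p-value, plus the skew and kurtosis it is built from
jb_stat, jb_p, skew, kurt = jarque_bera(res.resid)

# Durbin-Watson statistic for autocorrelation of the residuals
dw = durbin_watson(res.resid)

# condition number of the design matrix (large values flag multicollinearity)
cond_no = np.linalg.cond(X)

print(omni_stat, omni_p, jb_stat, jb_p, skew, kurt, dw, cond_no)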