Demo

Ordinary Least Squares / Linear Regression

import numpy as np
import pandas as pd
import statsmodels.api as sm
import seaborn as sns

# simulate two covariates and i.i.d. standard-normal noise
n = 500
np.random.seed(0)
df = pd.DataFrame({
    "x1": np.random.normal(10, 1, n),
    "x2": np.random.normal(2, 1, n),
    "e": np.random.normal(0, 1, n)
})

# true model: y = 2 + 3*x1 - 2*x2 + e
df["y"] = 2 + 3*df["x1"] + (-2)*df["x2"] + df["e"]

# add an intercept column (appended after the covariates) and fit OLS
X = sm.add_constant(df[["x1", "x2"]], prepend=False)
mod = sm.OLS(df["y"], X)
res = mod.fit()
print(res.summary())
                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.933
Model:                            OLS   Adj. R-squared:                  0.933
Method:                 Least Squares   F-statistic:                     3480.
Date:                Thu, 01 Jun 2023   Prob (F-statistic):          5.01e-293
Time:                        12:46:04   Log-Likelihood:                -691.18
No. Observations:                 500   AIC:                             1388.
Df Residuals:                     497   BIC:                             1401.
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
x1             2.9548      0.043     68.145      0.000       2.870       3.040
x2            -2.0109      0.044    -45.317      0.000      -2.098      -1.924
const          2.5011      0.446      5.602      0.000       1.624       3.378
==============================================================================
Omnibus:                        0.355   Durbin-Watson:                   2.034
Prob(Omnibus):                  0.837   Jarque-Bera (JB):                0.203
Skew:                           0.010   Prob(JB):                        0.903
Kurtosis:                       3.097   Cond. No.                         106.
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
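
The coefficient table above can also be pulled out programmatically rather than read off the printed summary. A minimal sketch using standard attributes of the fitted statsmodels results object:

print(res.params)      # coef column
print(res.bse)         # std err column
print(res.tvalues)     # t column
print(res.pvalues)     # P>|t| column
print(res.conf_int())  # the [0.025, 0.975] confidence-interval columns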

Terms:

Goodness of Fit:

  • R-squared: proportion of the variance of the dependent variable explained by the covariates (each statistic below is also available on the fitted results object; see the code sketch after this list)

  • Adjusted R-squared: adjusts R-squared for the number of predictors in the model

  • F-statistic: tests the null hypothesis that all slope coefficients (excluding the intercept) are zero

  • Prob (F-statistic): a low p-value means the model explains significantly more variance than an intercept-only model

  • Log-Likelihood: \(\log p(X \mid \mu, \Sigma)\), the log of the probability that the data were produced by this model

  • AIC: \(-2\log L + kp\) with \(k=2\), where \(p\) is the number of estimated parameters; lower AIC = better fit

  • BIC: \(-2\log L + kp\) with \(k=\log(N)\); lower BIC = better fit

    • BIC penalizes model complexity more than AIC
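
The summary's goodness-of-fit block is a formatted view of quantities that are exposed as attributes of the fitted results object. A minimal sketch (the AIC/BIC recomputation assumes statsmodels counts the intercept as an estimated parameter, which is consistent with the values reported above):

print(res.rsquared, res.rsquared_adj)   # R-squared, Adj. R-squared
print(res.fvalue, res.f_pvalue)         # F-statistic, Prob (F-statistic)
print(res.llf)                          # Log-Likelihood

# recompute AIC/BIC from the log-likelihood; p counts x1, x2, and the intercept
p = res.df_model + 1
print(res.aic, -2*res.llf + 2*p)
print(res.bic, -2*res.llf + np.log(res.nobs)*p)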

Tests for normal, i.i.d. residuals

  • Omnibus: D'Agostino's \(K^2\) statistic, a normality test based on skew and kurtosis

  • Prob(Omnibus): small p-value means reject null of normal dist

  • Skew: perfect symmetry = 0

  • Kurtosis: normal distribution = 3

  • Durbin-Watson: tests for autocorrelation of the errors (i.e., whether they are independent)

    • ranges from 0 to 4; values near 2 indicate no autocorrelation, values toward 0 indicate positive autocorrelation, values toward 4 indicate negative autocorrelation

  • Jarque-Bera (JB): also tests normality of residuals

  • Prob(JB): small p-value means reject null of normal dist

  • Cond. No.: the condition number of the design matrix of the covariates, used to diagnose multicollinearity

    • large values indicate possible multicollinearity or other numerical problems (see the code sketch after this list for computing these diagnostics directly)
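
The residual diagnostics in this block can be computed directly with functions from statsmodels.stats.stattools, and the condition number with numpy. A minimal sketch (the values should match the summary table above):

from statsmodels.stats.stattools import durbin_watson, jarque_bera, omni_normtest

# Omnibus (D'Agostino K^2) normality test: statistic and p-value
omni_stat, omni_p = omni_normtest(res.resid)

# Jarque-Bera test: statistic, p-value, plus the skew and kurtosis it is built from
jb_stat, jb_p, skew, kurt = jarque_bera(res.resid)

# Durbin-Watson statistic for autocorrelation of the residuals
dw = durbin_watson(res.resid)

# condition number of the design matrix (large values flag multicollinearity)
cond_no = np.linalg.cond(X)

print(omni_stat, omni_p, jb_stat, jb_p, skew, kurt, dw, cond_no)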