Advanced Analytics

Regression & Forecasting Lab

Build, test, and validate regression, trend, and autoregressive models on the monthly modeling dataset. Select a target (Y), optional predictors (X), then review fit, forecast quality, diagnostics, and interpretation generated by the backend regression engine.

Notes: Results describe statistical associations, not causality. Autocorrelation and heteroskedasticity are common in time-series data; use robust (HC) or HAC inference when diagnostics indicate either.

Regression Settings

1. Target Variable (Y)

Dependent variable you want to explain or forecast.

2. Predictors (X)

Independent variables used to explain the target.

3. Settings

Train / Test Split: 100%
Advanced settings
Seasonality
Seasonal differencing (each value minus its value 12 months earlier) is available per predictor as Seasonal diff (Δ12) in that predictor's transform dropdown.
Walk-Forward Backtest
Forecast horizons
Source: processed_features_monthly_model.parquet
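
As an illustration of the Seasonal diff (Δ12) transform and the walk-forward backtest named above, here is a minimal sketch in Python with NumPy. It is not the backend engine's implementation, which this page does not show. The function names `seasonal_diff` and `walk_forward_errors`, the synthetic data, the expanding training window, and the 36-month minimum training length are all assumptions made for the example. The predictor is lagged by the forecast horizon so that each forecast uses only information available at its origin.

```python
import numpy as np

def seasonal_diff(x, period=12):
    """Return x[t] - x[t - period]; the first `period` values are NaN."""
    x = np.asarray(x, dtype=float)
    out = np.full_like(x, np.nan)
    out[period:] = x[period:] - x[:-period]
    return out

def walk_forward_errors(y, x, horizon=1, min_train=36):
    """Expanding-window backtest of y[t] regressed on x[t - horizon].

    At each origin the model is refit on data up to that origin only,
    then used to forecast `horizon` months ahead.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    errors = []
    # `origin` is the index of the last observation known at forecast time.
    for origin in range(min_train - 1, len(y) - horizon):
        x_train = x[: origin + 1 - horizon]   # predictor, lagged by `horizon`
        y_train = y[horizon : origin + 1]     # matching targets
        design = np.column_stack([np.ones(len(x_train)), x_train])
        beta, *_ = np.linalg.lstsq(design, y_train, rcond=None)
        forecast = beta[0] + beta[1] * x[origin]
        errors.append(y[origin + horizon] - forecast)
    return np.asarray(errors)

# Synthetic monthly data (hypothetical): trend + 12-month cycle + noise.
rng = np.random.default_rng(0)
months = np.arange(120)
x_raw = 0.1 * months + 5.0 * np.sin(2 * np.pi * months / 12) + rng.normal(size=120)

x = seasonal_diff(x_raw)[12:]                 # drop the 12 undefined leading values
noise = rng.normal(scale=0.5, size=len(x))
y = noise.copy()
y[1:] += 0.8 * x[:-1]                         # target responds to last month's x

errors = walk_forward_errors(y, x, horizon=1, min_train=36)
rmse = float(np.sqrt(np.mean(errors ** 2)))
print(f"origins: {len(errors)}, out-of-sample RMSE: {rmse:.3f}")
```

With pandas, `Series.diff(12)` gives the same seasonal difference. A rolling (fixed-length) window is a common alternative to the expanding window used here.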

4. Run Model

Runs the selected model on the cleaned monthly sample. Results include fit, forecast quality, diagnostics, and interpretation.

Select variables to begin.

Ready to Analyze

Choose a target (Y), select a model type, optionally add predictors (X), then run the model. You’ll get fit metrics, forecast checks, diagnostics, plots, coefficient tables, VIF, and ANOVA when applicable.

Tip: If diagnostics flag heteroskedasticity or autocorrelation, prefer robust (HC) or HAC (Newey–West) standard errors when interpreting p-values and confidence intervals.

Methodology Guide

t-test & p-value: Per-coefficient test of whether a predictor’s coefficient differs from zero under the chosen standard errors.
F-test (overall model): Tests whether the predictors jointly improve fit relative to a constant-only model.
Adj. R²: In-sample explanatory power adjusted for model complexity (penalizes adding weak predictors).
AIC & BIC: Relative model selection scores (lower is better) for comparing models fit on the same target and sample window.
Breusch–Pagan / White: Tests for heteroskedasticity (non-constant error variance). If detected, prefer robust (HC) or HAC inference.
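Durbin–Watson / Breusch–Godfrey: Diagnostics for residual autocorrelation. Durbin–Watson checks first-order autocorrelation only (values near 2 indicate none) and is unreliable when lagged values of the target are among the regressors; Breusch–Godfrey tests higher orders and remains valid in that case. If autocorrelation is present, HAC (Newey–West) inference is recommended.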
VIF (Multicollinearity): Indicates redundancy among predictors. VIF ≥ 5 suggests moderate multicollinearity; ≥ 10 is high risk.
ANOVA: Decomposes explained vs unexplained variance to support model-fit interpretation.
Practical caution: In financial datasets, predictor transforms may embed trends and shared exposures. Thorough diagnostics and robust/HAC inference improve reliability, but model design is equally important: lag predictors so that each forecast uses only information available at forecast time, and avoid leakage.