What regression is
Regression estimates how an outcome (Y) changes as one or more predictors (X1…Xk) change. In linear regression the fitted equation is Ŷ = b0 + b1X1 + … + bkXk. Each coefficient gives the expected change in Y for a one-unit increase in that predictor, holding the other predictors constant. Regression supports prediction and adjusted association; on its own it does not establish causation.
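For example, with hypothetical coefficients Ŷ = 1.20 + 0.07 × StudyHours, a student who studies 10 hours gets a predicted value of 1.20 + 0.07 × 10 = 1.90, and each additional hour of study raises the prediction by 0.07.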
When to use regression
– Continuous outcome (e.g., score, income): Linear regression (ordinary least squares).
– Binary outcome (0/1): Binary logistic regression.
– Multicategory nominal outcome: Multinomial logistic regression.
– Ordered categories: Ordinal logistic regression.
– Counts: Poisson or Negative Binomial regression (prefer Negative Binomial when counts are overdispersed, i.e., the variance exceeds the mean).
– Time to event: Cox proportional hazards (survival).
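These outcome types map onto R model functions roughly as follows (a sketch; y, x, time, event, and dat are placeholders, and MASS, nnet, and survival are recommended packages that ship with R):

  lm(y ~ x, data = dat)                         # continuous outcome (OLS)
  glm(y ~ x, family = binomial, data = dat)     # binary outcome (logistic)
  nnet::multinom(y ~ x, data = dat)             # nominal multicategory outcome
  MASS::polr(y ~ x, data = dat)                 # ordered categories (y must be an ordered factor)
  glm(y ~ x, family = poisson, data = dat)      # counts (Poisson)
  MASS::glm.nb(y ~ x, data = dat)               # overdispersed counts (Negative Binomial)
  survival::coxph(survival::Surv(time, event) ~ x, data = dat)  # time to event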
Pre-checks before running
– Inspect scatterplots to check linearity; consider transformations or splines if curved.
– Look for outliers and influence (leverage, Cook’s D).
– Check multicollinearity (rule of thumb: VIF below about 5–10; see the R sketch after this list).
– Code categorical predictors with a clear reference level (dummy/indicator coding).
– Plan interactions a priori; center or standardize continuous predictors if you include them.
– Decide how to handle missing data (prefer multiple imputation over listwise deletion when feasible).
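A minimal R sketch of these pre-checks (dat, y, x1, x2 are placeholders; vif() is from the car package):

  pre <- lm(y ~ x1 + x2, data = dat)  # preliminary fit for diagnostics
  plot(dat$x1, dat$y)                 # scatterplot to eyeball linearity
  cooks.distance(pre)                 # influence (flag large Cook's D)
  hatvalues(pre)                      # leverage
  car::vif(pre)                       # variance inflation factors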
How to run it
SPSS (linear)
– Analyze → Regression → Linear.
– Move outcome to Dependent, predictors to Independent(s).
– Statistics: Estimates, Model fit, Confidence intervals, Collinearity diagnostics; add Durbin–Watson for time-ordered data.
– Plots: ZPRED vs ZRESID; histogram and normal P-P of residuals.
– Save: Predicted values and residuals if you need diagnostics.
– OK.
SPSS (binary logistic)
– Analyze → Regression → Binary Logistic.
– Put the outcome (coded 0/1) in Dependent and predictors in Covariates; specify categorical predictors and reference levels under Categorical.
– Options: Confidence intervals for exp(B); classification table. For ROC/AUC, save predicted probabilities and use Analyze → ROC Curve.
– OK.
jamovi
– Linear: Regression → Linear Regression. Add Dependent, Covariates/Factors. Turn on Standardized estimates, Confidence intervals, VIF, residual plots.
– Logistic: Regression → Logistic Regression (or GAMLj module). Request Odds ratios (exp(β)), Pseudo-R², ROC/AUC.
JASP
– Linear: Regression → Linear Regression. Add variables; enable Standardized coefficients, VIF, residual diagnostics (QQ plot, residual vs fitted).
– Logistic: Regression → Logistic Regression (Binary/Multinomial/Ordinal). Request Odds ratios, ROC curve, classification table, Pseudo-R², diagnostics.
R
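A minimal sketch in base R (the data frame dat and the variable names are placeholders, not a real dataset):

  # Linear regression (OLS)
  fit <- lm(gpa ~ hours + motivation, data = dat)
  summary(fit)    # coefficients, t-tests, R², adjusted R², F-test
  confint(fit)    # 95% CIs for the coefficients

  # Binary logistic regression (outcome coded 0/1)
  logit <- glm(admit ~ gre + gpa + research, family = binomial, data = dat)
  summary(logit)  # coefficients on the log-odds scale
  exp(cbind(OR = coef(logit), confint(logit)))  # odds ratios with 95% CIs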

How to interpret results
Linear regression (OLS)
– Coefficients (b): expected change in Y for a one-unit increase in X, holding others constant.
– Standardized beta (β): effect in SD units; useful for comparing predictors on different scales.
– Confidence intervals: report 95% CI for each coefficient.
– Model fit: R² (variance explained) and Adjusted R²; F-test for overall model; check residual plots for homoscedasticity and normality (for inference).
Example: “Study hours predicted GPA, b = 0.07, 95% CI [0.03, 0.11], t(196) = 3.45, p = .001, β = .24. Model F(3,196) = 18.2, p < .001, R² = .22.”
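Base R does not report standardized betas directly; one common workaround (continuing the hypothetical sketch above) is to refit the model on z-scored variables, whose coefficients are the β values:

  fit_z <- lm(scale(gpa) ~ scale(hours) + scale(motivation), data = dat)
  coef(fit_z)  # standardized betas (β)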
Logistic regression
– Coefficients are on the log-odds scale; exponentiate them to get odds ratios (OR). OR > 1 means higher odds of the outcome; OR < 1 means lower odds.
– Report 95% CI for OR, Pseudo-R² (e.g., Nagelkerke), AUC for discrimination, and calibration checks.
Example: “GRE score predicted admission, OR = 1.12, 95% CI [1.05, 1.21], p = .002, controlling for GPA and research experience. AUC = .81; Nagelkerke R² = .29.”
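In R, the pieces of such a write-up can be pulled from the hypothetical logit fit above (a sketch; the AUC assumes the pROC package, and pseudo-R² measures like Nagelkerke also need an add-on package):

  exp(coef(logit))     # odds ratios
  exp(confint(logit))  # 95% CIs for the ORs
  # discrimination: AUC from predicted probabilities
  p_hat <- predict(logit, type = "response")
  pROC::auc(pROC::roc(dat$admit, p_hat))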
Diagnostics and good practice
– Outliers/influence: inspect leverage, Cook’s D, DFBETAs; justify any exclusions.
– Heteroscedasticity (OLS): residual vs fitted plot; consider robust SEs.
– Nonlinearity: add polynomial or spline terms; check linearity of log-odds for logistic.
– Multicollinearity: high VIF suggests redundant predictors; consider dropping, combining, or regularizing.
– Multiple testing: control the false discovery rate (or family-wise error) when fitting many models or testing many predictors.
– Model selection: avoid blind stepwise; prefer theory, cross-validation, or regularization (ridge/LASSO).
– Transparency: preregister, share code, and include diagnostics (a short R sketch of common checks follows this list).
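A sketch of these checks for an OLS fit in R (assumes the car, lmtest, and sandwich packages are installed):

  plot(fit)            # residuals vs fitted, Q-Q, scale-location, leverage
  dfbetas(fit)         # influence of each case on each coefficient
  car::vif(fit)        # variance inflation factors
  lmtest::bptest(fit)  # Breusch–Pagan test for heteroscedasticity
  # heteroscedasticity-robust (HC3) standard errors
  lmtest::coeftest(fit, vcov = sandwich::vcovHC(fit, type = "HC3"))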
Quick decision guide
– Continuous outcome → Linear regression.
– Binary outcome → Logistic regression.
– Counts → Poisson or Negative Binomial.
– Ordered categories → Ordinal logistic.
– Time-to-event → Cox regression.
Takeaway
Match the regression type to your outcome, check assumptions and diagnostics, report effect sizes with confidence intervals, and avoid causal claims unless your design supports them.