# Regression & Linear Modeling
Best Practices and Modern Methods

- Jason W. Osborne - Clemson University, USA

**Regression & Linear Modeling** provides conceptual, user-friendly coverage of the generalized linear model (GLM). Readers will become familiar with applications of ordinary least squares (OLS) regression, binary and multinomial logistic regression, ordinal regression, Poisson regression, and loglinear models. The author returns to certain themes throughout the text, such as testing assumptions, examining data quality, and, where appropriate, nonlinear and non-additive effects modeled within different types of linear models.

**Available with**

**Perusall: an eBook that makes it easier to prepare for class**

*Perusall* is an award-winning eBook platform featuring social annotation tools that allow students and instructors to collaboratively mark up and discuss their SAGE textbook. Backed by research and supported by technological innovations developed at Harvard University, this process of learning through collaborative annotation keeps your students engaged and makes teaching easier and more effective.

The Variables Lead the Way |

Different Classifications of Measurement |

It’s All About Relationships! |

A Brief Review of Basic Algebra and Linear Equations |

The GLM in One Paragraph |

A Brief Consideration of Prediction |

A Brief Primer on Null Hypothesis Statistical Testing |

A Tale of Two Errors |

What Conclusions Can We Draw Based on NHST Results? |

So What Does Failure to Reject the Null Hypothesis Mean? |

Moving Beyond NHST |

The Importance of Replication and Generalizability |

Where We Go From Here |

Enrichment |

Estimation and the GLM |

What Is OLS Estimation? |

ML Estimation—A Gentle but Deeper Look |

Assumptions for OLS and ML Estimation |

Simple Univariate Data Cleaning and Data Transformations |

What If We Cannot Meet the Assumptions? |

Where We Go From Here |

Enrichment |

Advance Organizer |

It’s All About Relationships! |

Basics of the Pearson Product-Moment Correlation Coefficient |

Calculating r |

Effect Sizes and r |

A Real Data Example |

The Basics of Simple Regression |

Basic Calculations for Simple Regression |

Standardized Versus Unstandardized Regression Coefficients |

Hypothesis Testing in Simple Regression |

A Real Data Example |

Does Centering or z-Scoring Make a Difference? |

Some Simple Multivariate Data Cleaning |

Summary |

Enrichment |

Advance Organizer |

It’s All About Relationships! (Part 2) |

Analyzing These Data via t-Test |

Analyzing These Data via ANOVA |

ANOVA Within an OLS Regression Framework |

When Your IV Has More Than Two Groups: Dummy Coding Your Unordered Polytomous Variable |

Smoking and Diabetes Analyzed via ANOVA |

Smoking and Diabetes Analyzed via Regression |

What If the Dummy Variables Are Coded Differently? |

Unweighted Effects Coding |

Weighted Effects Coding |

Common Alternatives to Dummy or Effects Coding |

Summary |

Enrichment |

Advance Organizer |

It’s All About Relationships! (Part 3) |

The Linear Probability Model |

How Logistic Regression Solves This Issue: The Logit Link Function |

A Brief Digression Into Probabilities, Conditional Probabilities, and Odds |

Simple Logistic Regression Using Statistical Software |

The Logistic Regression Equation |

Interpreting the Constant |

What If You Want CIs for the Constant? |

Summary So Far |

Logistic Regression With a Continuous IV |

Some Best Practices When Using a Continuous Variable in Logistic Regression |

Testing Assumptions and Data Cleaning in Logistic Regression |

Hosmer and Lemeshow Test for Model Fit |

Summary |

Enrichment |

Appendix 5A: A Brief Primer in Probit Regression |

Advance Organizer |

Understanding Marijuana Use |

Dummy-Coded DVs and Our Hypotheses to Be Tested |

Basics and Calculations |

Multinomial Logistic Regression (Unordered) With Statistical Software |

Multinomial Logistic Regression With a Continuous Predictor |

Multinomial Logistic Regression as a Series of Binary Logistic Regressions |

Data Cleaning and Multinomial Logistic Regression |

Testing Whether Groups Can Be Combined |

Ordered Logit (Proportional Odds) Model |

Assumptions of the Ordinal Logistic Model |

Interpreting the Results of the Ordinal Regression |

Interpreting the Intercepts/Thresholds |

Interpreting the Parameter Estimates |

Data Cleaning and More Advanced Models in Ordinal Logistic Regression |

The Measured Variable Is Continuous, Why Not Just Use OLS Regression for This Type of Analysis? |

A Brief Note on Log-Linear Analyses |

Summary and Conclusions |

Enrichment |

Advance Organizer |

Zeno’s Paradox, a Nerdy Science Joke, and Inherent Curvilinearity in the Universe… |

A Brief Review of Simple Algebra |

Hypotheses to Be Tested |

Illegitimate Causes of Curvilinearity |

Detection of Nonlinear Effects |

Basic Principles of Curvilinear Regression |

Curvilinear OLS Regression Example: Size of the University and Faculty Salary |

Data Cleaning |

Interpreting Curvilinear Effects Effectively |

Reality Testing This Effect |

Summary of Curvilinear Effects in OLS Regression |

Curvilinear Logistic Regression Example: Diabetes and Age |

Curvilinear Effects in Multinomial Logistic Regression |

Replication Becomes Important |

More Fun With Curves: Estimating Minima and Maxima as Well as Slope at Any Point on the Curve |

Summary |

Enrichment |

Advance Organizer |

The Basics of Multiple Predictors |

What Are the Implications of This Act? |

Hypotheses to Be Tested in Multiple Regression |

Assumptions of Multiple Regression and Data Cleaning |

Predicting Student Achievement From Real Data |

Testing Assumptions and Data Cleaning in the NELS88 Data |

Methods of Entering Variables |

Using Multiple Regression for Theory Testing |

Logistic Regression With Multiple IVs |

Assessing the Overall Logistic Regression Model: Why There Is No R² for Logistic Regression |

Summary and Conclusions |

Exercises |

Advance Organizer |

What Is an Interaction? |

Procedural and Conceptual Issues in Testing for Interactions Between Continuous Variables |

Procedural and Conceptual Issues in Testing for Interactions Containing Categorical Variables |

Hypotheses to Be Tested in Multiple Regression With Interactions Present |

An OLS Regression Example: Predicting Student Achievement From Real Data |

Interpreting the Results From a Significant Interaction |

Graphing Interaction Effects |

An Interaction Between a Continuous and a Categorical Variable in OLS Regression |

Interactions With Logistic Regression |

Example Summary of Interaction Analysis |

Interactions and Multinomial Logistic Regression |

Example Summary of Findings |

Can These Effects Replicate? |

Post Hoc Probing of Interactions |

Summary |

Enrichment |

Advance Organizer |

What Is a Curvilinear Interaction? |

A Quadratic Interaction Between X and Z |

A Cubic Interaction Between X and Z |

A Real-Data Example and Exploration of Procedural Details |

Curvilinear Interactions Between Continuous and Categorical Variables |

Curvilinear Interactions With Categorical DVs (Multinomial Logistic) |

Curvilinear Interaction Effects in Ordinal Regression |

Chapter Summary |

Enrichment |

Advance Organizer |

The Basics and Assumptions of Poisson Regression |

Why Can’t We Just Analyze Count Data via OLS, Multinomial, or Ordinal Regression? |

Hypotheses Tested in Poisson Regression |

Poisson Regression With Real Data |

Interactions in Poisson Regression |

Data Cleaning in Poisson Regression |

Refining the Model by Eliminating Excess (Inappropriate) Zeros |

A Refined Analysis With Excess Zeros Removed |

Curvilinear Effects in Poisson Regression |

Dealing With Overdispersion or Underdispersion |

Negative Binomial Model |

Summary and Conclusions |

Enrichment |

Advance Organizer |

The Basics of Loglinear Analysis |

Hypotheses Being Tested |

Assumptions of Loglinear Models |

A Slightly More Complex Loglinear Model |

Can We Replicate These Results in Logistic Regression? |

Data Cleaning in Loglinear Models |

Summary and Conclusions |

Enrichment |

Advance Organizer |

Why HLM Models Are Necessary |

How Do Hierarchical Models Work? A Brief Primer |

Generalizing the Basic HLM Model |

Residuals in HLM |

Results of DROPOUT Analysis in HLM |

Summary and Conclusions |

Enrichment |

Advance Organizer |

Not All Missing Data Are the Same |

Categories of Missingness: Why Do We Care If Data Are MCAR or Not? |

How Do You Know If Your Data Are MCAR, MAR, or MNAR? |

What Do We Do With Randomly Missing Data? |

Data MCAR |

Data MNAR |

How Missingness Can Be an Interesting Variable in and of Itself |

Summing Up: Benefits of Appropriately Handling Missing Data |

Enrichment |

Advance Organizer |

What Is Power, and Why Is It Important? |

Power in Linear Models |

Summary of Points Thus Far |

Who Cares as Long as p < .05? Volatility in Linear Models |

A Brief Introduction to Bootstrap Resampling |

Summary and Conclusions |

Enrichment |

Advance Organizer |

A More Modern View of Reliability |

What Is Cronbach’s Alpha (and What Is It Not)? |

Factors That Influence Alpha |

What Is “Good Enough” for Alpha? |

Reliability and Simple Correlation or Regression |

Reliability and Multiple IVs |

Reliability and Interactions in Multiple Regression |

Protecting Against Overcorrecting During Disattenuation |

Other (Better) Solutions to the Issue of Measurement Error |

Does Reliability Influence Other Analyses, Such as Analysis of Variance? |

Reliability in Logistic Models |

But Other Authors Have Argued That Poor Reliability Isn’t That Important. Who Is Right? |

Sample Size and the Precision/Stability of Alpha: Empirical CIs |

Summary and Conclusions |

Advance Organizer |

Prediction vs. Explanation |

How Is a Prediction Equation Created? |

Shrinkage and Evaluating the Quality of Prediction Equations |

An Example Using Real Data |

Improving on Prediction Models |

Calculating a Predicted Score, and CIs Around That Score |

Prediction (Prognostication) in Logistic Regression (and Other) Models |

An Example of External Validation of a Prognostic Equation Using Real Data |

External Validation of a Prediction Equation |

Using Bootstrap Analysis to Estimate a More Robust Prognostic Equation |

Summary |

Advance Organizer |

What Types of Studies Use Complex Sampling? |

Why Does Complex Sampling Matter? |

What Are Best Practices in Accounting for Complex Sampling? |

Does It Really Make a Difference in the Results? |

Conditions Used |

Comparison of Unweighted Versus Weighted Analyses |

Summary |

Enrichment |

### Supplements

Data sets for the exercises and additional resources are available on the free open-access site.

“I really enjoyed reading this, which is rare to say about a statistics textbook. The style of writing is very approachable, and the material is presented in a way that is informative even to someone who thinks about these topics often.”

**Saint Louis University**

“The author has taught this subject matter for years. . . . He speaks to me as I face similar situations in the classroom. He writes in an accessible way for those who are not methodologists.”

**The University of North Carolina at Greensboro**

“The conversational language is a strength of the text. I can see it helping to put some otherwise anxious readers at ease. The author’s sharing of their experience in data analysis is a nice touch, too. The manner in which the material is presented is not at all threatening or intimidating.”

**University of Pennsylvania**