Fundamentals of Statistical Interactions


Questions:
What is the difference between "main effects" and "interaction effects"?
What are the advantages of including an interaction that is relevant to a model?
What are the disadvantages of including a relevant product term in a model?
Why is the inclusion of a product term in a regression model referred to as nonadditive?
How does one obtain standard errors and t-ratios for the parameter estimates in an interactive model?
Is it possible that the effect of X1 on the dependent variable can have fluctuating significance?
Should you drop the constituent variable if it is non-significant?
Is it true that including a multiplicative term in a model drastically increases the level of collinearity?
Does the inclusion of a product term in model estimation dramatically change the parameter estimates of the constituent variables?
Does the addition of a product term cause certain coefficients that are significant in an additive model to be non-significant in an interactive model?
The multiplicative term is statistically significant, but the level of explained variance does not increase much. Should you leave the interaction in the model?

What is the difference between "main effects" and "interaction effects"?

The terms "interaction" and "main effects" were adopted from the analysis of variance (ANOVA). In ANOVA, a "main effect" is the effect of one factor averaged over the levels of the other factor, and an "interaction" is the extent to which the effect of one factor differs across the levels of another. In multiple regression, this terminology is not suitable when an interaction is present. It is quite common in the case of an interactive model to refer to b1 and b2 as "main effects", but the use of this term is held here to be misleading. In the presence of an interaction, these coefficients in no instance represent a constant effect of the independent variable on the dependent variable. The term "main effect" implies that b1 and b2 are somehow interpretable alone, when each actually represents only a portion of the effect of the corresponding variable on the dependent variable. In other words, the value of b1 represents only part of the effect of X1 on the dependent variable; the remaining effect is in the interaction term. A more appropriate label for b1 is the "constituent effect", which indicates that the effect of X1 on the dependent variable is conditional upon the value of another independent variable. Likewise, an "interaction effect" has traditionally implied a separate effect of an independent variable on the dependent variable, whereas the "product term" actually represents a portion of the effect of the independent variables on the dependent variable.

What are the advantages of including an interaction that is relevant to a model?

If an interaction exists in the data, there are several advantages to including the multiplicative term. First, if an interaction does in fact exist and is not included in the estimation, the model suffers from specification error in the form of omitted variable bias: it will not provide an accurate estimate of the true relationship between the dependent and independent variables. Second, a model that includes the product term provides a better description of that relationship and explains more of the variation in the dependent variable. Finally, according to Friedrich (1982), including a product term is a "low-risk strategy": if the product term is significant, keep it in the model; otherwise, it can be dropped.

What are the disadvantages of including a relevant product term in a model?

Criticism of the inclusion of "product terms" in regression analysis has been heavy; however, it is held here that there are no real disadvantages to including such a term when the interaction is relevant. The criticisms leveled against multiplicative or interaction terms are discussed in the questions and answers that follow. Addressing them makes it clearer that including a relevant multiplicative term in a regression model has no real disadvantages.

Why is the inclusion of a product term in a regression model referred to as nonadditive?

In regular regression, the relationship between the independent and dependent variables is referred to as additive. This rests on the assumption that the effect of an independent variable on the dependent variable is constant regardless of the value of any other independent variable. For example, we are taught to interpret a coefficient produced by regular regression along these lines: for every one-unit increase in X1 there is a ____ change in the dependent variable. This implies that the effect of X1 on the dependent variable is the same at every value of X1 and regardless of the value of a second independent variable. A model that includes an interaction term is "nonadditive", meaning that the effect of one independent variable on the dependent variable varies according to the value of a second independent variable.
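
The contrast can be written out explicitly. The notation below is an assumption made for illustration: b0 is the intercept, b3 is the coefficient on the product term X1X2, and e is the error term, chosen to match the b1 and b2 used elsewhere in this FAQ.

```latex
% Additive model: the effect of X_1 is the constant b_1.
\[
Y = b_0 + b_1 X_1 + b_2 X_2 + e,
\qquad \frac{\partial Y}{\partial X_1} = b_1
\]

% Nonadditive (interactive) model: the effect of X_1 depends on X_2.
\[
Y = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_1 X_2 + e,
\qquad \frac{\partial Y}{\partial X_1} = b_1 + b_3 X_2
\]
```

In the second model, the effect of X1 equals b1 only when X2 is zero.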

How does one obtain standard errors and t-ratios for the parameter estimates in an interactive model?

The standard errors and t-ratios for the effect of X1 at different values of X2 can be obtained with the appropriate formulas. Writing b3 for the coefficient on the product term X1X2, the metric effect of X1 at a given value of X2 is b1 + b3X2, and its standard error is the square root of the following expression:

$$\operatorname{var}(b_1) + X_2^2\,\operatorname{var}(b_3) + 2X_2\,\operatorname{cov}(b_1, b_3)$$

Remember that this is the standard error for the effect of X1 at the specified value of X2. The t-ratio is then obtained by dividing the metric effect of X1 at that value of X2 by this standard error.
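
The formula above can be sketched in Python as follows. This is a minimal illustration, not output from any real model: the function name and every numeric value are hypothetical, and b3 is assumed to be the coefficient on the product term X1X2.

```python
import math


def conditional_effect(b1, b3, var_b1, var_b3, cov_b1_b3, x2):
    """Effect of X1 on Y at X2 = x2, with its standard error and t-ratio."""
    effect = b1 + b3 * x2
    # var(b1) + X2^2 * var(b3) + 2 * X2 * cov(b1, b3)
    variance = var_b1 + x2 ** 2 * var_b3 + 2 * x2 * cov_b1_b3
    se = math.sqrt(variance)
    return effect, se, effect / se


# Hypothetical coefficient estimates and (co)variances, for illustration only.
effect, se, t_ratio = conditional_effect(
    b1=0.50, b3=0.20, var_b1=0.04, var_b3=0.01, cov_b1_b3=-0.015, x2=2.0
)
print(f"effect={effect:.3f}  se={se:.3f}  t={t_ratio:.3f}")
```

In practice the variances and the covariance would be read from the estimated variance-covariance matrix of the coefficients reported by the regression software.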

Is it possible that the effect of X1 on the dependent variable can have fluctuating significance?

It is possible for the impact of X1 on Y to be significant at some values of X2 and non-significant at others. How is this possible? The effect of X1, and its standard error, are conditional upon the value of the other independent variable. An example could be the effect of education on voter turnout at various ages. The results could conceivably indicate that the level of education has no effect on voter turnout among senior citizens and the young, whereas education level plays a role among the middle-aged portion of the population.
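
A small numerical sketch shows how significance can change across values of X2. All numbers are hypothetical, b3 is assumed to be the coefficient on the product term, and the cutoff of 1.96 is an assumed large-sample two-sided 5% critical value.

```python
import math

# Hypothetical estimates, for illustration only.
b1, b3 = -0.30, 0.40
var_b1, var_b3, cov_b1_b3 = 0.04, 0.01, -0.015
CRITICAL_T = 1.96  # assumed large-sample cutoff for a two-sided 5% test

rows = []
for x2 in range(0, 5):
    effect = b1 + b3 * x2  # effect of X1 at this value of X2
    se = math.sqrt(var_b1 + x2 ** 2 * var_b3 + 2 * x2 * cov_b1_b3)
    t = effect / se
    rows.append((x2, effect, se, t, abs(t) > CRITICAL_T))

for x2, effect, se, t, significant in rows:
    print(f"X2={x2}  effect={effect:+.2f}  se={se:.3f}  t={t:+.2f}  significant={significant}")
```

With these made-up values, the effect of X1 is not significant at X2 = 0 or X2 = 1 but is significant from X2 = 2 upward.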

Should you drop the constituent variable if it is non-significant?

The constituent variables of the interaction model should always be included, regardless of whether they are significant. In this type of model, b1 represents the effect of X1 on the dependent variable when X2 equals zero, and vice versa. The fact that the constituent variables are non-significant does not imply that they are dispensable: dropping X1 would force its effect to be exactly zero when X2 equals zero. If the product term is significant, the effect of X1 changes with X2, so X1 may have a significant effect on the dependent variable at some other value of X2. As previously noted, the significance of X1 can vary at differing values of X2, and the coefficient on the constituent variable reflects only one of those values.

Is it true that including a multiplicative term in a model drastically increases the level of collinearity?

Yes, including a product term in a general linear model can drastically increase the level of collinearity. The product term (X1X2) is an exact nonlinear function of the constituent variables (X1 and X2), so correlations of the constituent variables with the product term are usually high. Critics speculate that the increase in collinearity degrades the quality of the parameter estimates by increasing the variances and covariances of the regression coefficients (Friedrich 1982). According to the regression assumptions, however, the only time that multicollinearity is crippling to the analysis is in the presence of perfect collinearity, in which case the model cannot be estimated at all.
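
The correlation between a product term and its constituents can be illustrated with simulated data. The sketch below assumes two independent variables drawn uniformly from 10 to 20; the distributions, sample size, and seed are arbitrary choices for illustration, not data from any study.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


random.seed(0)  # arbitrary seed so the run is repeatable
n = 2000
x1 = [random.uniform(10, 20) for _ in range(n)]
x2 = [random.uniform(10, 20) for _ in range(n)]
product = [a * b for a, b in zip(x1, x2)]

r_x1_x2 = pearson(x1, x2)            # constituents are independent by construction
r_x1_product = pearson(x1, product)  # X1 with X1*X2
r_x2_product = pearson(x2, product)  # X2 with X1*X2
print(f"{r_x1_x2:.2f} {r_x1_product:.2f} {r_x2_product:.2f}")
```

Even though X1 and X2 are essentially uncorrelated here, each correlates strongly with the product term (roughly 0.7 is expected for this setup). The correlation is high but not perfect, so the model can still be estimated.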

Does the inclusion of a product term in model estimation dramatically change the parameter estimates of the constituent variables?

Critics assert that increased levels of collinearity in models that include a multiplicative term distort the beta coefficients. The beta coefficients in the multiplicative model often differ drastically from those in the additive model because the two models describe different relationships. The additive model describes a constant effect of each independent variable on the dependent variable. The interactive model describes a conditional relationship, meaning that the effect of each independent variable on the dependent variable varies according to the level of the other independent variable.

Does the addition of a product term cause certain coefficients that are significant in an additive model to be non-significant in an interactive model?

It is possible for coefficients that are significant in an additive model to be non-significant in the interactive model. It is important to recognize that this does not mean that the parameter estimates of the interactive model are wrong; rather, these coefficients are estimates of particular trends of change in Y with changes in the independent variables. Specifically, b1 and b2 in the interactive model are estimates of the change in Y with changes in X1 and X2 when X2 and X1, respectively, equal zero. These beta coefficients estimate particular conditional relationships rather than general ones. Thus it is possible that at this level the effect of the independent variables on the dependent variable is non-significant.
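
One way to see that b1 is tied to the point X2 = 0 is to rewrite the same model with X2 measured from a different origin. The sketch below is purely algebraic. The coefficient values, the shift of 10, and the function names are hypothetical, and b0 and b3 are assumed to denote the intercept and the coefficient on the product term.

```python
def predict(coefs, x1, x2):
    """Prediction from Y = b0 + b1*X1 + b2*X2 + b3*X1*X2."""
    b0, b1, b2, b3 = coefs
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def shift_x2_origin(coefs, c):
    """Coefficients of the same model expressed in terms of (X2 - c)."""
    b0, b1, b2, b3 = coefs
    return (b0 + b2 * c, b1 + b3 * c, b2, b3)


original = (1.0, -0.5, 2.0, 0.25)  # hypothetical coefficients
shifted = shift_x2_origin(original, c=10.0)

# Both parameterizations give identical predictions for every observation.
same = all(
    abs(predict(original, x1, x2) - predict(shifted, x1, x2 - 10.0)) < 1e-9
    for x1 in (0.0, 3.0, 7.5)
    for x2 in (0.0, 10.0, 12.0)
)
print(original[1], shifted[1], same)
```

The coefficient on X1 moves from -0.5 to 2.0 although the fitted relationship is unchanged, because it now describes the effect of X1 at X2 = 10 rather than at X2 = 0. The coefficient on the product term stays the same.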

The multiplicative term is statistically significant, but the level of explained variance does not increase much. Should you leave the interaction in the model?

In estimating an interactive model, one may find that the explained variance increases very little when the product term is included. This has led to questions of parsimony, with critics arguing that the product variable uses an additional degree of freedom yet provides little in the way of explanatory power. For example, suppose the additive model explains 80% of the variance in the dependent variable and the interactive model explains 81%. Some critics might argue that including the interaction is more harmful than helpful. In this case it is once again important to remember that the product term matters not solely for explaining the variance in the dependent variable, but also for establishing the presence of the conditional relationship between the independent variables and the dependent variable.
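
A small gain in explained variance can still correspond to a clearly significant product term. The sketch below applies the standard F test for comparing nested least-squares models to the 80% versus 81% example. That test is not part of the discussion above, and the sample size of 200 is a hypothetical value chosen for illustration.

```python
def incremental_f(r2_reduced, r2_full, n, k_full, q=1):
    """F statistic for adding q terms to a nested least-squares model.

    n is the number of observations and k_full the number of predictors
    (excluding the intercept) in the larger model.
    """
    numerator = (r2_full - r2_reduced) / q
    denominator = (1.0 - r2_full) / (n - k_full - 1)
    return numerator / denominator


# Additive model: X1, X2. Interactive model: X1, X2, X1*X2 (three predictors).
f_stat = incremental_f(r2_reduced=0.80, r2_full=0.81, n=200, k_full=3)
t_ratio = f_stat ** 0.5  # with one added term, F equals the squared t-ratio
print(f"F={f_stat:.2f}  |t|={t_ratio:.2f}")
```

Under these assumptions the F statistic is a little above 10, well beyond the conventional 0.05 critical value of roughly 3.9, even though explained variance rises by only one percentage point.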


Last updated: August 16, 1999