Fundamentals of Statistical Interactions
Questions:
What is the difference between "main effects" and "interaction effects"?
What are the advantages of including an interaction that is relevant to a model?
What are the disadvantages of including a relevant product term in a model?
Why is the inclusion of a product term in a regression model referred to as nonadditive?
How does one obtain standard errors and t-ratios for the parameter estimates in an interactive model?
Is it possible that the effect of X1 on the dependent variable can have fluctuating significance?
Should you drop the constituent variable if it is non-significant?
Is it true that including a multiplicative term in a model drastically increases the level of collinearity?
Does the inclusion of a product term in model estimation dramatically change the parameter estimates of the constituent variables?
Does the addition of a product term cause certain coefficients that are significant in an additive model to be non-significant in an interactive model?
The multiplicative term is statistically significant, but the level of explained variance does not increase much. Should you leave the interaction in the model?

What is the difference between "main effects" and "interaction effects"?
The terms "interaction" and "main effects" were adopted from the analysis
of variance (ANOVA). In that context a "main effect" refers to the effect
of one factor averaged over the levels of another factor, while an
"interaction" refers to the way the effect of one factor differs across the
levels of another. In multiple regression this terminology is not suitable
when an interaction is present. It is quite common to refer to b1 and b2 in
an interactive (nonadditive) model as "main effects", but the use of this
term is held here to be misleading. In the presence of an interaction these
coefficients in no instance represent a constant effect of the independent
variable on the dependent variable. The term "main effect" implies that b1
and b2 are somehow interpretable alone, when each actually represents only a
portion of the effect of the corresponding variable on the dependent
variable. In other words, the value of b1 represents only part of the effect
of X1 on the dependent variable; the remaining effect is in the interaction
term. A more appropriate label for b1 is the "constituent effect",
indicating that the effect of X1 on the dependent variable is conditional
upon the value of another independent variable. Similarly, an "interaction
effect" has traditionally implied a separate effect of an independent
variable on the dependent variable, whereas the "product term" actually
represents a portion of the effect of the independent variables on the
dependent variable. Neither the "main effects" nor the "interaction effect"
in a multiplicative model represents a constant effect of an independent
variable on the dependent variable.

What are the advantages of including an interaction that is relevant to a model?
If an interaction exists in the data, there are several advantages to
including the multiplicative term. First, if an interaction does in fact
exist and is not included in the estimation, the model suffers from a
specification error in the form of omitted variable bias. A model that fails
to account for the interaction will not accurately estimate the true
relationship between the dependent and independent variables, whereas a
model that includes the interaction term describes that relationship better.
Second, the inclusion of the product term will explain more of the variation
in the dependent variable. Finally, according to Friedrich (1982), including
a product term is a "low-risk strategy": if the product term is significant
it is kept in the model; otherwise it can be dropped.

What are the disadvantages of including a relevant product term in a model?
Criticism of the inclusion of "product terms" in regression analysis has
been heavy; however, there are no real disadvantages to including such a
term in a regression model. The criticisms leveled against multiplicative or
interaction terms are discussed in the questions and answers that follow.
Addressing them makes it clearer that including a relevant multiplicative
term in a regression model has no disadvantages.

Why is the inclusion of a product term in a regression model referred to as nonadditive?
In regular regression, the relationship between the independent and
dependent variables is referred to as additive. This is based on the
assumption that the effect of an independent variable on the dependent
variable is constant regardless of the value of any other independent
variable. For example, in interpreting a beta coefficient produced by
regular regression we are taught to interpret the relationship along these
lines: for every one-unit increase in X1 there is a ____ change in the
dependent variable. This implies that the effect of X1 on the dependent
variable is the same at all values of X1, regardless of the value of a
second independent variable. A model that includes an interaction term is
"nonadditive", meaning that the effect of one independent variable on the
dependent variable varies according to the value of a second independent
variable.
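
The contrast can be written out explicitly. The following is only a sketch in the b1, b2, b3 notation used elsewhere in this FAQ; the constant b0 and the error term e are assumptions added here for completeness.

```latex
\begin{align*}
\text{Additive:} \quad & Y = b_0 + b_1 X_1 + b_2 X_2 + e,
  & \frac{\partial Y}{\partial X_1} &= b_1 \\
\text{Interactive:} \quad & Y = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_1 X_2 + e,
  & \frac{\partial Y}{\partial X_1} &= b_1 + b_3 X_2
\end{align*}
```

In the additive model the effect of X1 is the single number b1. In the interactive model it is b1 + b3 X2, which changes with X2 and equals b1 only when X2 = 0.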

How does one obtain standard errors and t-ratios for the parameter estimates in an interactive model?
The relevant standard errors and t-ratios at different levels of X2 can
also be obtained with the appropriate formulas. Following the example
outlined above, the standard error for the metric effect of X1 is the
square root of the following expression, where b1 is the coefficient on X1
and b3 is the coefficient on the product term X1X2:
var(b1) + X2^2 var(b3) + 2 X2 cov(b1, b3)
Remember that this is the standard error for the effect of X1 at the
specified value of X2. The t-ratio can then be derived by dividing the
metric effect of X1 at that value of X2 by this standard error.
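
As a rough illustration of the formula, the sketch below fits an interactive model by ordinary least squares with NumPy and evaluates the effect of X1, its standard error, and its t-ratio at a few values of X2. The data are synthetic and invented for this example, and the helper names `fit_ols` and `conditional_effect` are hypothetical, not part of any library.

```python
import numpy as np

def fit_ols(X, y):
    """Return OLS coefficients and their estimated covariance matrix."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    sigma2 = resid @ resid / (X.shape[0] - X.shape[1])  # residual variance
    return b, sigma2 * XtX_inv

def conditional_effect(b, V, x2_value):
    """Effect of X1 at X2 = x2_value, with its standard error and t-ratio.

    Assumes the column order: constant, X1, X2, X1*X2.
    """
    effect = b[1] + b[3] * x2_value
    variance = V[1, 1] + x2_value**2 * V[3, 3] + 2 * x2_value * V[1, 3]
    se = np.sqrt(variance)
    return effect, se, effect / se

# Synthetic data, invented for illustration only.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.uniform(0, 10, size=n)
y = 1.0 + 0.5 * x1 + 0.3 * x2 + 0.4 * x1 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2, x1 * x2])
b, V = fit_ols(X, y)

for value in (0.0, 5.0, 10.0):
    effect, se, t = conditional_effect(b, V, value)
    print(f"X2 = {value:4.1f}: effect = {effect:.3f}, se = {se:.3f}, t = {t:.2f}")
```

Because both the effect and its standard error depend on the chosen value of X2, the t-ratio generally differs from one value of X2 to the next.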

Is it possible that the effect of X1 on the dependent variable can have fluctuating significance?
It is possible that the impact of X1 on Y is significant at some values of
X2 and non-significant at other values of X2. How is this possible? The
effect of X1 is conditional upon the value of the other independent
variable, and so is its standard error. An example could be the effect of
education on voter turnout at various ages. The results could conceivably
indicate that the level of education has no effect on voter turnout among
senior citizens and the young, whereas education level plays a role among
the middle-aged portion of the population.

Should you drop the constituent variable if it is non-significant?
The constituent variables of the interaction model should always be
included, regardless of whether they are significant. In this type of model
b1 represents the effect of X1 on the dependent variable when X2 equals
zero, and vice versa. The fact that the constituent variables are
non-significant does not imply that they are dispensable. If the product
term is significant, this means that the effect of X1 changes with X2, so
X1 can have a significant effect on the dependent variable at some other
value of X2. As previously noted, the significance of X1 can vary at
differing values of X2, and in some instances the value at which it is
non-significant is X2 = 0, which is the effect that the constituent
coefficient reports.

Is it true that including a multiplicative term in a model drastically increases the level of collinearity?
Yes, including a product term in a general linear model can drastically
increase the level of collinearity. The product term (X1X2) is an exact
nonlinear function of the constituent variables (X1 and X2), so the
correlations of the constituent variables with the product term are usually
high. Critics argue that the increase in collinearity degrades the quality
of the parameter estimates by increasing the variances and covariances of
the regression coefficients (Friedrich 1982). According to the regression
assumptions, however, the only time that multicollinearity is crippling to
analysis is in the presence of perfect collinearity, in which case the model
cannot be estimated at all.
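
The difference between high and perfect collinearity can be checked numerically. The sketch below uses synthetic data, invented for this example, with two independent variables whose values lie well away from zero. It reports the correlation of each constituent variable with the product term, then confirms that the design matrix still has full column rank, so the model remains estimable.

```python
import numpy as np

# Synthetic data, invented for illustration only.
rng = np.random.default_rng(1)
n = 500
x1 = rng.uniform(10, 20, size=n)
x2 = rng.uniform(10, 20, size=n)
product = x1 * x2

# Correlation of each constituent variable with the product term.
r_x1 = np.corrcoef(x1, product)[0, 1]
r_x2 = np.corrcoef(x2, product)[0, 1]
print(f"corr(X1, X1X2) = {r_x1:.2f}, corr(X2, X1X2) = {r_x2:.2f}")

# Perfect collinearity would make the design matrix rank deficient.
X = np.column_stack([np.ones(n), x1, x2, product])
rank = np.linalg.matrix_rank(X)
print(f"rank = {rank} of {X.shape[1]} columns")
```

The correlations are substantial even though X1 and X2 were generated independently, yet no column is an exact linear combination of the others.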

Does the inclusion of a product term in model estimation dramatically change the parameter estimates of the constituent variables?
Critics assert that increased levels of collinearity in models including a
multiplicative term distort the beta coefficients. The beta coefficients in
the multiplicative model often do differ drastically from those in the
additive model, but this is because the two models describe different
relationships. The additive model describes a constant effect of each
independent variable on the dependent variable. The interactive model
describes a conditional relationship, meaning that the effect of each
independent variable on the dependent variable varies according to the
level of the other independent variable.

Does the addition of a product term cause certain coefficients that are significant in an additive model to be non-significant in an interactive model?
It is possible that coefficients that are significant in an additive model
are non-significant in the interactive model. It is important to recognize
that this occurrence does not mean that the parameter estimates of the
interactive model are wrong; rather, these coefficients are estimates of
particular trends of change in Y with changes in the independent variables.
Specifically, b1 and b2 in the interactive model are estimates of the change
in Y with changes in X1 and X2 when X2 and X1, respectively, equal zero.
These beta coefficients estimate particular conditional relationships rather
than general ones. Thus it is possible that at this particular level, where
the other variable equals zero, the effect of an independent variable on the
dependent variable is non-significant.
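
One way to see that b1 describes the effect of X1 only at X2 = 0 is to move the zero point of X2 and refit. The sketch below does this on synthetic data invented for this example. Subtracting the mean of X2 is an illustrative choice made here, not a recommendation from this FAQ, and the helpers `ols` and `design` are hypothetical names.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients for the design matrix X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def design(x1, x2):
    """Design matrix with columns: constant, X1, X2, X1*X2."""
    return np.column_stack([np.ones_like(x1), x1, x2, x1 * x2])

# Synthetic data, invented for illustration only.
rng = np.random.default_rng(2)
n = 300
x1 = rng.normal(5, 1, n)
x2 = rng.normal(50, 10, n)
y = 2.0 + 0.1 * x1 + 0.05 * x2 + 0.02 * x1 * x2 + rng.normal(size=n)

b_raw = ols(design(x1, x2), y)          # b1 is the effect of X1 at X2 = 0
shift = x2.mean()
b_ctr = ols(design(x1, x2 - shift), y)  # b1 is the effect of X1 at X2 = mean

print(f"b1 at X2 = 0:      {b_raw[1]:.4f}")
print(f"b1 at X2 = mean:   {b_ctr[1]:.4f}")
print(f"b1 + b3 * mean:    {b_raw[1] + b_raw[3] * shift:.4f}")
print(f"b3 (raw, shifted): {b_raw[3]:.4f}, {b_ctr[3]:.4f}")
```

The coefficient on X1 changes when the zero point of X2 moves, while the coefficient on the product term does not. Both fits describe the same conditional relationship, reported at different values of X2.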

The multiplicative term is statistically significant, but the level of explained variance does not increase much. Should you leave the interaction in the model?
In estimating an interactive model one may find that the explained variance
increases very little when the product term is included. This has led to
questions of parsimony: the argument is that the product variable uses an
additional degree of freedom yet provides little in the way of explanatory
power. For example, suppose the first-order effects in the additive model
explain 80% of the variance in the dependent variable, while the interactive
model explains 81%. Some critics might argue that including the interaction
is harmful rather than helpful. In this case it is once again important to
remember that the interaction is included not solely to explain variance in
the dependent variable, but also to establish the presence of a conditional
relationship between the independent variables and the dependent variable.
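
The situation described above is easy to reproduce. The sketch below uses synthetic data invented for this example, with a modest interaction added to two strong first-order effects. It compares the explained variance of the additive and interactive models and reports the t-ratio of the product term. The helper `fit` is a hypothetical name, and the coefficient values are arbitrary.

```python
import numpy as np

def fit(X, y):
    """Return OLS coefficients, standard errors, and R-squared."""
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y
    resid = y - X @ b
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * XtX_inv))
    r2 = 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)
    return b, se, r2

# Synthetic data, invented for illustration only.
rng = np.random.default_rng(3)
n = 2000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 2.0 * x2 + 0.3 * x1 * x2 + rng.normal(size=n)

ones = np.ones(n)
_, _, r2_add = fit(np.column_stack([ones, x1, x2]), y)
b, se, r2_int = fit(np.column_stack([ones, x1, x2, x1 * x2]), y)
t_product = b[3] / se[3]

print(f"R-squared, additive model:    {r2_add:.3f}")
print(f"R-squared, interactive model: {r2_int:.3f}")
print(f"t-ratio of the product term:  {t_product:.2f}")
```

A small gain in explained variance can therefore coexist with a clearly significant product term, which is evidence of the conditional relationship itself.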
Last updated: August 16, 1999