
Youth Engagement and Mobilization in the 2010 Toronto Municipal Election

Appendix D – Multiple Regression Explained

Multiple regression is a statistical technique that examines the relationship between a dependent variable (e.g., height) and a number of independent variables (e.g., parents' height, diet, exercise and gender). Rather than examining the relationship between height and each of its possible causes separately, multiple regression considers all these causes at the same time and determines the independent effect of each.
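
In equation form, the height example can be sketched as a single model. The notation is illustrative and not taken from the study: each β stands for the independent effect of one factor, β₀ is a baseline value, and ε captures whatever the listed factors do not explain.

```latex
\[
\text{height} = \beta_0
  + \beta_1\,\text{parents' height}
  + \beta_2\,\text{diet}
  + \beta_3\,\text{exercise}
  + \beta_4\,\text{gender}
  + \varepsilon
\]
```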

The estimated effect of each factor is represented by a regression coefficient. A coefficient tells us how much the dependent variable is expected to change when an independent variable increases by one unit, holding the other independent variables constant. Each coefficient is accompanied by a p-value, which tells us how likely it is that a relationship at least this strong would appear by chance alone if no real relationship existed. For variables measured on comparable scales, the larger the regression coefficient (in absolute value), the stronger its effect. The smaller the p-value, the surer we can be that the relationship is real and not due to chance. We say that a relationship that is unlikely to be due to chance is statistically significant.

Returning to the example of the determinants of height, imagine that we found that the only statistically significant predictors of height were parents' height and gender. This would suggest that diet and exercise have no detectable effect on height once we control for parents' height and gender. It would also tell us that parents' height and gender each matter independently, such that a brother and sister could expect to be of different heights (because despite sharing the same parents, they are of different genders). Likewise, two women with different parents could expect to be of different heights, provided their parents were not of the same height.
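
The height example can be tried out on made-up data. The sketch below is only an illustration: the numbers are simulated, not drawn from the study, and the names (`solve`, `fit`, `parents`, `diet`, `exercise`, `male`) are hypothetical. The data are built so that only parents' height and gender affect height; the regression should then return coefficients near zero for diet and exercise. The p-values use a normal approximation, which is an assumption that is reasonable only for large samples.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: use the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit(rows, y):
    """Return (coefficients, approximate p-values), intercept first."""
    X = [[1.0] + list(r) for r in rows]
    n, k = len(X), len(X[0])
    # Normal equations: (X'X) b = X'y
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    coefs = solve(xtx, xty)
    # Residual variance, then standard errors from the diagonal of (X'X)^-1.
    resid = [yi - sum(c * xi for c, xi in zip(coefs, x)) for x, yi in zip(X, y)]
    sigma2 = sum(e * e for e in resid) / (n - k)
    p_values = []
    for i in range(k):
        unit = [1.0 if j == i else 0.0 for j in range(k)]
        inv_ii = solve(xtx, unit)[i]
        t = coefs[i] / math.sqrt(sigma2 * inv_ii)
        # Two-sided p-value under a normal approximation (assumes large n).
        p_values.append(math.erfc(abs(t) / math.sqrt(2)))
    return coefs, p_values


# Simulated data (hypothetical): only parents' height and gender matter.
random.seed(0)
rows, heights = [], []
for _ in range(2000):
    parents = random.gauss(170, 7)     # average height of the parents, in cm
    diet = random.uniform(0, 10)       # made-up diet score
    exercise = random.uniform(0, 10)   # made-up weekly exercise hours
    male = random.randint(0, 1)        # gender coded as 0 or 1
    rows.append((parents, diet, exercise, male))
    heights.append(85 + 0.5 * parents + 12 * male + random.gauss(0, 5))

coefs, p_values = fit(rows, heights)
names = ["intercept", "parents_height", "diet", "exercise", "male"]
for name, c, p in zip(names, coefs, p_values):
    print(f"{name:15s} coef={c:8.3f}  p={p:.4f}")
```

In the printed table, the coefficients for parents' height and gender should sit close to the values used to generate the data (0.5 and 12), with very small p-values, while diet and exercise should have coefficients close to zero. In practice an analyst would use a statistics package rather than hand-written code; the point here is only to make the roles of the coefficient and the p-value concrete.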