This tutorial shows how to fit a multiple regression model (that is, a linear regression with more than one independent variable) using Stata. The details of the underlying calculations can be found in our multiple regression tutorial.

The data used in this post come from the study More Tweets, More Votes: Social Media as a Quantitative Indicator of Political Behavior by DiGrazia J, McKelvey K, Bollen J, Rojas F (2013), which investigated the relationship between social media mentions of candidates in US House elections and actual vote results. The replication data in Stata format can be downloaded from our github repo.

The model uses three variables:

- vote_share (dependent variable): The percent of votes for a Republican candidate.
- mshare (independent variable): The percent of social media posts for a Republican candidate.
- pct_white (independent variable): The percent of white voters in a given Congressional district.

All three variables are measured as percentages ranging from zero to 100.

We run the regression with Stata's regress command, listing the dependent variable first and the independent variables after it. This returns output that begins with the header row: Source | SS df MS Number of obs = 406

The box at the top left provides us with an ANOVA table that gives 1) the sum of squares (SS) for the model, often called the regression sum of squares, 2) the residual sum of squares, and 3) the total sum of squares. Dividing the SS column by the df (degrees of freedom) column returns the mean squares in the MS column. These values go into calculating the \(F\)-statistic, \(R^2\), adjusted \(R^2\), and Root Mean Square Error (Root MSE) shown in the top right of the output.

Looking at the top right, we see that the number of observations used to fit the model was 406. The \(F\)-statistic tests the null hypothesis that the independent variables together do not help explain any variance in the outcome. We clearly reject the null hypothesis: Stata reports Prob > F = 0.0000, that is, \(p < 0.0001\). The \(R^2\) value tells us that the independent variables explain 55.41% of the variation in the outcome. The adjusted \(R^2\) provides a slightly more conservative estimate of the percentage of variance explained, 55.19%.

The Root MSE is the square root of the residual MS from the top left table, \(\sqrt{MS_{residual}} = 11.349\). This value gives a summary of how much the observed values vary around the predicted values, with better models having lower Root MSEs. It is also used in the formula for the standard error of the coefficient estimates, shown in the next table.

The final table tells us the results of the regression model. The mshare coefficient means that for each one-unit increase in mshare, the predicted vote share increases by 0.178, holding percent white constant. The pct_white coefficient means that for each one-unit increase in pct_white, the predicted vote share increases by 0.55, holding tweet share constant. The standard error tells us how much sample-to-sample variability we should expect in the coefficient estimates. Dividing the coefficient by the standard error gives us the \(t\)-statistic used to calculate the \(p\)-value. Here we see that the mshare and pct_white coefficient estimates are clearly significant, \(p < 0.001\). The 95% confidence intervals for the coefficients are also presented.

The constant, _cons, is the vote share we expect when tweet share and percent white both equal zero. Here we see that the predicted value is 0.865, and its 95% confidence interval is shown alongside it.
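As a minimal sketch of the steps described in this post, the regression and the summary quantities discussed here could be reproduced along the following lines. It assumes the replication data are already loaded in memory with the variable names used in this post. The `display` lines are our own illustration, not part of the original analysis; they use the results that `regress` stores in `e()`.

```stata
* Assumption: the replication data set is already loaded in memory.
* Fit the model: dependent variable first, then the independent variables.
regress vote_share mshare pct_white

* Quantities from the top right of the output, taken from stored results.
display "N           = " e(N)
display "F           = " e(F)
display "R-squared   = " e(r2)
display "Adj. R-sq.  = " e(r2_a)
display "Root MSE    = " e(rmse)

* Root MSE recomputed as the square root of the residual mean square
* (residual SS divided by residual df); should match e(rmse).
display "sqrt(MS_residual) = " sqrt(e(rss) / e(df_r))

* t-statistic for mshare: coefficient divided by its standard error.
display "t (mshare)  = " _b[mshare] / _se[mshare]
```

Recomputing the Root MSE and the \(t\)-statistic by hand is not required, since `regress` prints both, but it makes the link between the ANOVA table and the coefficient table explicit.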
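The reported fit statistics can be checked against one another. The worked example below is a sketch that assumes the standard formulas for adjusted \(R^2\) and the overall \(F\)-statistic, with \(n = 406\) observations and \(k = 2\) independent variables. It starts from the rounded \(R^2 = 0.5541\), so the results are approximate.

\[
R^2_{adj} = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1} = 1 - 0.4459 \times \frac{405}{403} \approx 0.5519
\]

\[
F = \frac{R^2 / k}{(1 - R^2) / (n - k - 1)} = \frac{0.5541 / 2}{0.4459 / 403} \approx 250
\]

The first result matches the adjusted \(R^2\) of 55.19% reported in the output. An \(F\)-statistic of this size on 2 and 403 degrees of freedom is consistent with Prob > F = 0.0000.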