Joseph Nathan Cohen

Department of Sociology, CUNY Queens College, New York, NY

Correlations in Stata

This video is part of the “Teaching through YouTube” series. For more, see this post.

Original Video Description

Using pairwise correlations to summarize the bivariate relationship between two continuous variables

Transcription (Auto-Generated)

In this video, I’ll show you how to describe the relationship between two continuous variables using pairwise correlations. Correlations summarize the linear relationship between two continuous variables. You can get correlations by using the command pwcorr, for pairwise correlations.

Correlations describe the relationship between two continuous variables on a -1 to +1 scale. When a correlation is negative, it means that as the value of one variable goes up, the other one goes down. Positive correlations mean that as the value of one variable goes up, the other one goes up. The closer a correlation is to 0, the less related the two are: changes in one variable do not appear to correspond to changes in the other.

It can be hard to make correlations concrete, so here are some real-world examples. This is an example of a strong negative relationship. It describes the relationship between countries’ birth rates and the prevalence of contraceptive use among the female population of childbearing age. This is a strong negative relationship: the higher the birth rate, the lower the contraceptive use, and the higher the contraceptive use, the lower the birth rate. The graph exhibits a pretty strong-looking relationship, and it’s reflected in a strong negative correlation.

This is an example of a pairwise correlation that’s weak; it’s close to zero. It shows the relationship between poverty and the murder rate. As the graph suggests, countries with high murder rates can have a lot of poverty or not much, and countries with low murder rates can have a lot of poverty or not much.

This is an example of a positive relationship, with a pairwise correlation of 0.82, which is close to +1. It shows the relationship between countries’ GDP per capita and the number of Internet users per hundred people. The variables are strongly correlated: really poor countries tend not to have many Internet users, and richer ones have a lot of them.

We use the pwcorr command to get pairwise correlations. The syntax is pwcorr and then a list of variables separated by spaces. This command will get you the pairwise correlations among three variables: GDP per capita, life expectancy, and Internet users. The results suggest that there’s a reasonably strong positive correlation between all three variables. As an economy tends to get richer, it tends to have longer lifespans and more Internet use. Likewise, it suggests that Internet use and lifespans are positively related.

There are at least two options you can use with the pwcorr command. The option obs is for observations. It asks Stata to report how many observations were used in calculating each pairwise correlation. This is useful because sometimes we don’t realize that we’re making inferences about correlations based on really small sample sizes. It’s always good to run it with the obs option once. This is an example of what it looks like when we use the obs option. It shows that these pairwise correlations are calculated from between 171 and 188 observations.

The sig option asks Stata to give the results of a significance test that the correlation is nonzero. When you use this option, you’re looking for a significance score that’s less than 0.05. In this case, all of these relationships score below 0.05, suggesting that they are all significant. This means that we predict there’s a high likelihood that all of these variables have nonzero relationships.

Let’s review. Correlations describe the relationship between two continuous variables. To get pairwise correlations, we use the command pwcorr and the list of variables that we want to have correlated. The option obs will ask Stata to report the number of observations upon which our correlation estimates are based, and sig will ask Stata to perform a significance test to determine whether or not we have good evidence that two variables have a nonzero pairwise correlation. For more information on this command, type help pwcorr in the Stata command window.
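
The commands described in the transcript can be sketched roughly as follows. This is only an illustration: the country-level dataset from the video is not provided here, so the sketch assumes Stata's built-in auto example dataset and uses its variables price, mpg, and weight as stand-ins for GDP per capita, life expectancy, and Internet users.

```stata
* Load a built-in example dataset (an assumption for illustration;
* the video uses country-level data that is not available here).
sysuse auto, clear

* Basic syntax: pwcorr followed by a space-separated list of variables.
pwcorr price mpg weight

* obs adds the number of observations used for each correlation.
pwcorr price mpg weight, obs

* sig adds the significance level (p-value) for each correlation.
pwcorr price mpg weight, sig

* The two options can be combined in a single call.
pwcorr price mpg weight, obs sig
```

Because pwcorr uses every observation available for each pair of variables, the counts reported by obs can differ from one pair to the next when some variables have missing values. That would be consistent with the range of 171 to 188 observations mentioned in the video.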