Joseph Nathan Cohen

Department of Sociology, CUNY Queens College, New York, NY

Summarizing Variables on Stata Using Sum

A short video on how to run the sum command in Stata and interpret its output

For more about the “Teaching through YouTube” series, see this post.

Original Video Description

How to use the command “sum” to summarize variables in Stata

Transcription (Auto-Generated)

In this video, I will demonstrate how to retrieve summary statistics for a variable using the summarize command in Stata. By the end, you’ll know how to use the sum command to summarize any variable.

To execute this command, type:

sum [variable name], detail

For instance, to summarize a variable named “IQ”, you’d enter:

sum IQ, detail

Make sure you’ve loaded your data into Stata’s memory, or the command won’t work.

To provide a practical example, I’ve preloaded a dataset into Stata. Let’s summarize the “per capita GDP” variable, which measures a country’s average economic output per person. The variable is named “GDP_PC”. So, I’d type:

sum GDP_PC, detail

And here’s the resulting output.

Interpreting the output: The left column displays various percentiles for the variable. The 50th percentile (or median) score, for instance, is 4955, indicating that half of the observations are below this value and half are above it. The smallest “per capita GDP” score was $140, while the largest was $123,263. The output reports the average (mean) value as 9815 and the standard deviation as 12104. The “obs” line shows that this variable has 5,151 observations.

Additional summary statistics describe the variable’s distribution. The variance measures how spread out the values are (it is the square of the standard deviation), skewness measures how asymmetric the distribution is around the mean, and kurtosis measures how heavy the distribution’s tails are.

In summary, to get a variable’s summary statistics, use:

sum [variable name], detail

But remember, your data must be loaded into Stata’s memory for this to work. For further insights, please visit my website at josephinecohen.org.
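The steps described in the transcript can be tried without the video's dataset. The sketch below is an illustration only: it assumes the "auto" example dataset that ships with Stata and its "price" variable, neither of which appears in the video. It also shows that the statistics printed by the command are stored in r() afterwards, so they can be reused.

```stata
* Load an example dataset that ships with Stata.
* (Assumption for illustration: the video uses a different dataset containing GDP_PC.)
sysuse auto, clear

* Detailed summary of one variable; "sum" is an accepted abbreviation of "summarize".
summarize price, detail

* summarize leaves its results in r(); list everything that was stored.
return list

* Reuse a few of the stored results.
display "Observations: " r(N)
display "Mean:         " r(mean)
display "Median (p50): " r(p50)
display "Std. dev.:    " r(sd)
display "Skewness:     " r(skewness)
display "Kurtosis:     " r(kurtosis)
```

To apply the same pattern to your own data, load your dataset first (for example with the use command) and replace price with your variable's name, keeping in mind that Stata variable names are case-sensitive.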