How to summarize a continuous variable in Stata.

In this video, I’ll tell you how to summarize the distribution of a continuous variable. Continuous distributions can be described using measures of central tendency and measures of dispersion. Central tendency measures give you an idea of what a typical observation looks like. Dispersion measures ask how spread out the values are. To get either set of measures, we use the command sum, which is short for summarize.

Central Tendency Measures

When we think of averages, we generally think of means. A mean is the sum of all scores divided by the number of respondents whose scores are summed. The median is the middle value: fifty percent of a sample will score above the median and fifty percent below. The mode refers to the most common answer. In general, we don’t use modes that much when describing continuous variables.
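
To make the mean and median concrete, here is a minimal sketch on a made-up five-person dataset. The variable name income and its values are assumptions for illustration only, not data from the video.

```stata
* Hypothetical toy data: five made-up income values (in thousands)
clear
input income
20
30
40
50
110
end

* With detail, summarize stores the mean in r(mean) and the median in r(p50)
summarize income, detail
display "mean   = " r(mean)
display "median = " r(p50)
```

In this toy sample the one large value pulls the mean (50) above the median (40), which is one reason to look at both.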

Dispersion measures give us an idea of the spread of values. By spread, I mean how often observations score close to, or well away from, typical measurements. A common measure of dispersion is the standard deviation. This measure captures, roughly, the typical departure from the sample mean, either above or below. When the standard deviation is higher, our observations tend to be further away from the mean. When it’s lower, our observations are closely clustered around the mean.
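
As a rough illustration of that last point, the sketch below compares two made-up variables that share the same mean but differ in spread. The names tight and wide and all of the values are hypothetical.

```stata
* Hypothetical toy data: two made-up variables, both with a mean of 50
clear
input tight wide
48 10
49 30
50 50
51 70
52 90
end

* Without detail, summarize still reports the mean and standard deviation
summarize tight wide

* Run each variable separately so r(sd) refers to that variable
summarize tight
display "sd of tight = " r(sd)
summarize wide
display "sd of wide  = " r(sd)
```

Both variables have a mean of 50, but wide should show a much larger standard deviation because its values sit further from that mean.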

Command

To get these measures, we use the sum command, which is short for summarize. The syntax is sum, a space, the variable name, a comma, a space, and detail. For example, if I want to summarize a variable called income, I would enter sum income, detail. The detail option asks Stata to deliver a wider array of summary statistics. If you don’t use detail, you’ll get a more limited number of distribution summary statistics. The detail option is useful when you want to look at one variable at a time. I tend not to use the detail option when I want Stata to summarize multiple variables at once.
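
Here is a minimal sketch of the two ways of calling the command. It assumes the auto example dataset that is installed with Stata, which is a stand-in and not the data used in the video.

```stata
* auto.dta is an example dataset installed with Stata (not the video's data)
sysuse auto, clear

* One variable with detail: adds percentiles, variance, skewness, kurtosis
summarize price, detail

* Several variables without detail: observations, mean, sd, min, max
summarize price mpg weight

* sum is an accepted abbreviation of summarize
sum price, detail
```

The multi-variable call prints one compact row per variable, which is why leaving off detail tends to be easier to read when you have several variables.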

Let’s see what it looks like to use the sum command in a real Stata session. I’m going to summarize the sei variable, which shows people’s socioeconomic index score on a scale of zero to one hundred. I type sum sei, detail. That’s the mean of the sample and the standard deviation. This part gives us the different scores at the sample’s percentiles. The median is 41.2; 28.4 is the 10th percentile score and 78.5 is the 90th.
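
If you want to pull those percentiles out rather than read them off the screen, one possible approach is to use the results that summarize stores after it runs. The sketch below uses mpg from Stata’s auto example dataset as a stand-in, since the dataset containing sei is not available here.

```stata
* Stand-in example: mpg from the auto dataset installed with Stata
sysuse auto, clear
summarize mpg, detail

* With detail, selected percentiles are stored as r(p10), r(p50), r(p90), etc.
display "10th percentile = " r(p10)
display "median          = " r(p50)
display "90th percentile = " r(p90)

* List everything summarize stored
return list
```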

Review

Let’s review. To get a summary of a continuous variable’s distribution, we use the sum command. The syntax is sum, a space, the variable name, a comma, a space, and detail. For more help on this command, type help sum in the Stata command window.

