SECTION 3: Tables and Graphs

We will again use the data in the file ‘lipids.dta’ to illustrate the commands in this section.

I. Tables

A. One-way table: The tabulate command (tab for short) will tabulate

categorical variables. It produces frequencies, relative frequencies, and

cumulative frequencies.

. tab smoke01

B. Two-way table: When given two variables, tab will create a two-way table

of frequencies, with the first variable defining the rows and the second

the columns. Options for two-way tables include row, which gives row

percents (the percentages in each row sum to 100%), col, which gives

column percents (analogous to row percents), and cell, which gives the

percent of the total sample in each cell.

. tab smoke01 group

. tab smoke01 group, row

. tab smoke01 group, row col cell

C. Chi-square Tests: You can test for “independence” or “trend” in tables by

using a chi-square statistic. Stata will calculate the chi-square statistic and

associated p-value for a two-way table if you specify chi as an option.

. tab smoke01 group, cell chi
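
If you have only the cell counts rather than the raw data, the immediate form of the command, tabi, runs the same test. The following is a minimal sketch; the counts are made up for illustration and are not taken from lipids.dta.

```stata
* Two-way table entered by hand: rows are separated by a backslash.
* The four counts below are hypothetical.
tabi 30 18 \ 38 14, row col chi2
```

As with tab, the row and col options add row and column percents to each cell.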

II. Graphs: Almost all graphs produced in Stata are displayed in the Graphics Window

and use the command graph (an exception to this is the stem and leaf plot).

A. Stem and Leaf Plots: The command to produce a stem and leaf plot for a

variable is stem followed by the variable name.

. stem tothdl

B. Histograms: Histograms are the easiest plots to produce in Stata. When Stata

is given the command graph followed by one variable name it creates a

histogram with five bars/bins.

. graph tothdl

You can change the number of bars/bins using the bin option. The

following increases the number of bins to 8:

. graph tothdl, bin(8)

Notice that Stata only gives the minimum and maximum values on the x

and y axes. To make the graph have more numbers on the axes specify the

xlabel and ylabel options (xlab and ylab for short). Lastly, notice

that the y-axis indicates the percent in each of the bins. We can change the

y-axis to instead indicate the frequency by using the freq option.

. graph tothdl, bin(8) xlab ylab freq

C. Boxplots: Do the same as when creating a histogram, but give the box option.

. graph tothdl, box ylab

Obviously, the xlab, freq, and bin options are not applicable to

boxplots. You can, however, use the by option. For example, we can

look at Total HDL for smokers and non-smokers:

. sort smoke01

. graph tothdl, by(smoke01) box ylab

D. Scatterplots: If you use the graph command followed by two or more

variables, Stata will create a scatterplot. The following creates a

scatterplot of Total Cholesterol versus Triglycerides:

. graph plasmatc plasmatg, xlab ylab

Notice that the variable on the y-axis is first, followed by the variable on

the x-axis. Two options specific to scatterplots are symbol and

connect. symbol (s for short) specifies the plotting symbol and

connect (c for short) specifies if the points should be connected or not

and how they should be connected. By default, the plotting symbol is ‘o’

and the points are not connected. We could write this as

. graph plasmatc plasmatg, xlab ylab s(o) c(.)

The ‘.’ means do not connect and ‘o’ means use an ‘o’ as the plotting

symbol. Other symbol options are

O          large circle                p   small plus

T          large triangle              d   small diamond

S          large square                .   dot

[_n]       use obs. number as symbol   o   small circle

[varname]  use varname as symbol       i   invisible

Some connect options are

. do not connect

l (lowercase L) draw straight lines between points

J connect in steps

L connect x-ascending points
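
To see non-default symbol and connect choices together, the sketch below plots tothdl against age with large triangles joined by straight lines. It uses the same older graph syntax as the rest of this section. The sort option is included on the assumption that your release of graph supports it; it orders the points by the x-variable before they are connected.

```stata
* Large triangles, connected by straight lines in order of age.
* sort is assumed to be available; without it, the lines follow
* the order of the observations in the data set.
graph tothdl age, xlab ylab s(T) c(l) sort
```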

E. Scatterplots with more than two variables. You can give Stata as many

variables as you want in a graph command and it will plot all of them. It

treats all of the variables except for the last variable as y-variables and the

last variable as the x-variable. In other words, it plots each of the variables

versus the last one in the list on the same graph. Stata will give each one a

different symbol so that you can distinguish which y-variable is which:

. graph tothdl plsmaldl age, xlab ylab

This command plots tothdl versus age and plsmaldl versus age on

the same graph.

F. Other options:

1. Specifying where to put labels: You can tell Stata where to put the x

and y values on the axes. For example,

. graph plasmatc plasmatg, xlab(0,300,600,900) ylab

2. Putting a title on the graph: Use the title (ti for short) option and a

title will be placed under your graph in large letters.

. graph plasmatc plasmatg, xlab ylab title(Total Cholesterol vs. Triglycerides)

3. Other titles: You can put smaller titles on the right, left, top, and

bottom of your graph using l1title, l2title, r1title,

r2title, t1title, t2title, b1title, and

b2title. For example, t1title and t2title both put titles

on the top of your graph and t1title is closer to the graph.

. graph plasmatc plasmatg, xlab ylab t1(Scatterplot:) t2(Triglycerides and Cholesterol)

4. The by option: If you have a grouping variable you may create graphs

for each group. Including the total option also includes a graph

with all observations (i.e. observations in all groups).

. sort group

. graph plasmatc plasmatg, by(group) total xlab ylab

G. Advanced Options to investigate: xtick, ytick, rtick and ttick

put tick marks on the axes. yline and xline put horizontal and vertical

lines on your graph at specified y and x values, respectively.
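
As a rough illustration of these options, the sketch below adds reference lines and extra tick marks to the cholesterol scatterplot. It assumes the same older graph syntax used throughout this section, with values separated by commas as in xlab. The cut-off values are arbitrary and chosen only for the example.

```stata
* Vertical line at x = 300 and horizontal line at y = 200
* (hypothetical cut-offs), plus extra x-axis ticks between the labels.
graph plasmatc plasmatg, xlab(0,300,600,900) ylab xline(300) yline(200) xtick(150,450,750)
```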

III. Saving and Printing Graphs:

A. Saving graphs: Below are three ways to save Stata graphs.

1. Choose ‘Save Graph’ from the File menu. This will prompt you for a

path and name for your graph. A graph saved this way can be reopened in Stata (see below).

2. Include the saving option in your graph command. For example,

. graph plasmatc plasmatg, saving(a:\graph1)

This will save the graph to the file graph1.gph on the a:\

drive. Stata will not let you overwrite a graph unless you tell it that

it is okay to replace it with a new graph:

. graph plasmatc plasmatg, saving(a:\graph1,replace)

To reopen the graph in Stata, you would type at the Stata command

line

. graph using a:\graph1

3. Save the graph to a word processing document. You can select ‘Copy

Graph’ from the Edit menu and paste the graph into a Word or

WordPerfect document. This is a very attractive option because in

the document you can resize the graph or put several graphs on one

page. However, this graph cannot be reopened in Stata.

B. Printing Graphs

1. After creating a graph, select ‘Print Graph’ from the File menu.

2. Save the graph to a word processing document and print the document.

IV. Multiple Imaging: You can put several plots on one graphics image. To do this, you need to create each of the graphs separately and save them in .gph files. Then use the using option to bring them together into one graph:

. graph plasmatc, saving(a:\g1) xlab ylab

. graph plasmatg, saving(a:\g2) xlab ylab

. graph plsmaldl, saving(a:\g3) xlab ylab

. graph tothdl, saving(a:\g4) xlab ylab

. graph using a:\g1 a:\g2 a:\g3 a:\g4
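
The commands in this section use the graph syntax of Stata 7 and earlier. If you are working in Stata 8 or later, where that syntax was replaced, the sketch below shows rough equivalents of a few of the commands above. The file names g1 and g2 are placeholders, and the options shown are only the ones needed to mirror the earlier examples.

```stata
* Histogram of tothdl with 8 bins and frequencies on the y-axis
histogram tothdl, bin(8) frequency

* Boxplots of tothdl for each level of smoke01
graph box tothdl, over(smoke01)

* Scatterplot: y-variable first, then x-variable
twoway scatter plasmatc plasmatg, title("Total Cholesterol vs. Triglycerides")

* Save two graphs, then combine them into one image
histogram plasmatc, saving(g1, replace)
histogram plasmatg, saving(g2, replace)
graph combine g1.gph g2.gph
```

In the newer syntax the data do not need to be sorted before a grouped boxplot is drawn, and graph combine takes the place of graph using with several files.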
