Lecture 1: Introduction to Stata
1. The Stata Interface
§ The command window (Stata prompt)
§ The results window
§ The review window
§ The variable window
2. Change directory
§ Stata starts in its default working folder or directory, typically C:\data or C:\stata (pwd displays the name of the current working directory)
§ Stata is case sensitive: type pwd, not Pwd
§ Normally you need the cd (Change Directory) command to move to the folder that holds your data
§ cd "U:\Qi Liu homepage" (use straight double quotes when the path contains spaces)
§ cd C:\data
§ cd C:/data
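The directory commands above can be tried together as a short sequence. This is a minimal sketch; it assumes a folder C:/data exists on your machine, so substitute any folder you actually have.

```stata
* Show the current working directory
pwd

* Move to the data folder (assumed to exist; replace with your own path)
cd "C:/data"

* Confirm the change and list the files found there
pwd
dir
```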
3. Keeping Track of Your Work
§ Keep permanent copies of the work you do in Stata.
Ø log using code.log
§ From then on, everything you type and all the output Stata produces is automatically recorded in the .log file you specify
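A log session might look like the sketch below. The replace option and the closing log close command are not mentioned above; they are shown here because a log stays open until it is closed, and because log using refuses to overwrite an existing file unless replace (or append) is given.

```stata
* Start a plain-text log; replace overwrites code.log if it already exists
log using code.log, replace text

* Anything typed from here on, and its output, goes into the log
display "This line is recorded in code.log"

* Stop recording and close the file
log close
```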
4. Read data (1): Stata format
§ Check which files are in the directory: dir
§ .dta
Ø *Datasets stored in Stata format
§ use births
Ø *load the dataset into memory
5. Read data (2): spreadsheet
§ Save data as a text file (tab or comma separated)
§ Variable names must not contain spaces
§ Save file
§ clear
§ insheet using births.tab
§ insheet using births.tab,clear
§ insheet using births.csv,comma clear
§ insheet id bweight lowbw gestwks preterm matage hyp sex sexalph using noname.tab, clear (try insheet using births.csv,comma clear and see the difference)
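If births.csv is not at hand, the same steps can be practised with a file you create yourself. The sketch below assumes the auto dataset that ships with Stata (loaded with sysuse); the file name auto_demo.csv is made up for illustration.

```stata
* Load an example dataset that ships with Stata
sysuse auto, clear

* Write three variables to a comma-separated text file (hypothetical name)
outsheet make price mpg using auto_demo.csv, comma replace

* Empty the memory, then read the text file back in
clear
insheet using auto_demo.csv, comma clear

* The first row of the file supplied the variable names
describe
```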
6. Input data in Stata
§ clear
§ Data Editor icon
§ Type the values (not the variable names), pressing Enter after each
§ Double-click the column heading to change variable names
§ Preserve
§ outfile using try.txt (Exercise: clear the memory, read try.txt)
§ outsheet using try.tab (Exercise: clear the memory, read try.tab)
7. Basic commands
§ browse / Data Browse icon
§ describe / desc
§ summarize (mean, sd and range of non-string variables)
§ summarize var, detail
§ list matage (change screen: Go button/space-bar)
§ list varlist in 1/5
§ count *Count observations satisfying specified condition
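The basic commands can be run in sequence on any dataset. A minimal sketch, assuming the auto dataset that ships with Stata rather than births:

```stata
* Load an example dataset that ships with Stata
sysuse auto, clear

* Variable names, storage types and labels
describe

* Mean, sd and range of every numeric variable
summarize

* Percentiles, skewness and kurtosis for one variable
summarize price, detail

* First five observations of two variables
list make price in 1/5

* Number of observations, overall and under a condition
count
count if price>10000
```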
8. Stata Syntax
§ command varlist if_expression in_range,options
§ list bweight hyp if sex==1 in 1/10, noobs
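To see how the pieces of the general syntax line up in a real command, here is a sketch on the auto dataset that ships with Stata (an assumption; the variable names make, price and foreign belong to that dataset, not to births).

```stata
sysuse auto, clear

* command  varlist      if expression   in range   , options
list       make price   if foreign==1   in 50/60   , noobs
```

Only observations 50 to 60 are considered, and of those only the rows where foreign equals 1 are listed; noobs suppresses the observation numbers.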
9. Conditions
§ ==
§ <,<=,>,>=
§ !=,~=
§ &,|
§ count if bweight<2000 & sex==1
§ count if bweight<2000 | sex==1
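The operators can be compared side by side. A small sketch, again assuming the auto dataset that ships with Stata:

```stata
sysuse auto, clear

* Both conditions must hold
count if price<4000 & foreign==1

* At least one condition must hold
count if price<4000 | foreign==1

* != and ~= both mean "not equal", so these two counts agree
count if foreign!=1
count if foreign~=1
```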
10. Missing values
§ count if gestwks==.
§ count if gestwks>15
§ count if gestwks>15 & gestwks<.
Ø Rule:
All numbers are less than . (missing), so gestwks>15 is also true for missing values; add & gestwks<. to exclude them
§ Change missing values to numeric values
Ø list gestwks in 1/5
Ø mvencode _all, mv(.=-9)
Ø list gestwks in 1/5
§ Change numeric values to missing values
Ø mvdecode _all, mv(-9=.)
Ø list gestwks in 1/5
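The rule and the two recoding commands can be checked on a variable known to contain missing values. This sketch assumes the auto dataset that ships with Stata, where rep78 is missing for a few cars; it recodes that single variable instead of _all.

```stata
sysuse auto, clear

* How many values of rep78 are missing?
count if rep78==.

* This count includes the missing values, because . exceeds every number
count if rep78>3

* This count excludes them
count if rep78>3 & rep78<.

* Replace missing with -9, inspect, then turn -9 back into missing
mvencode rep78, mv(.=-9)
summarize rep78
mvdecode rep78, mv(-9=.)
summarize rep78
```

After mvencode, summarize reports a minimum of -9; after mvdecode, the number of observations drops back because the missing values are excluded again.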
11. Generating new variables
§ generate var1=1
§ generate var2=3
§ generate var3=(var1+var2)/var2*var1
§ generate var4=log10(var2)/ln(var3)
§ replace var1=10 if sex==1 (the if condition is optional)
§ gen str5 name="John"
§ recode sex 2=0, generate(sex2)
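The same commands applied to real variables might look as follows. This is a sketch on the auto dataset that ships with Stata; the new variable names (weight_kg, cheap, source, domestic) are invented for illustration.

```stata
sysuse auto, clear

* New numeric variable computed from an existing one (pounds to kilograms)
generate weight_kg = weight*0.4536

* Start with a constant, then change it for a subset of observations
generate cheap = 0
replace cheap = 1 if price<5000

* New string variable of at most 8 characters
generate str8 source = "auto"

* Swap the 0/1 coding of foreign and store the result in a new variable
recode foreign (1=0) (0=1), generate(domestic)

* Check the results
list make price cheap foreign domestic in 1/5
```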
12. Exercise:
§ List observations 301 to 320
§ Summarize matage for hypertensive women (hyp=1)
§ Use count to find how many hypertensive women have babies with birth weight less than 2000g
§ Generate bwkgs: birth weight in kilograms
§ list in 301/320
§ summarize matage if hyp==1
§ count if hyp==1 & bweight<2000
§ generate bwkgs=bweight/1000
Stata's Online Help
There is online help available inside Stata. To get help for a command, simply type "help" and the name of the command.
If you are not sure which command you need, you can type "search" and a keyword.
Alternatively, you can use the "Help" menu, and click on "Stata Command" if you know the command or "Search" if you don't.
Stata displays information one screen at a time. To proceed to the next screen, hit the space bar or click on the more prompt at the bottom left corner of the results window. If you've seen enough, hit control-k (hold down the control key and type k) on Windows or control-c on Unix to cut off the flow of information. Alternatively, click the break button, which is a red circle with an X through it near the top of the Stata window.
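For instance, the two ways of asking for help look like this (the command and keywords are chosen only for illustration):

```stata
* Help for a command whose name you know
help summarize

* Keyword search when you do not know the command name
search missing values
```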