To get help in Stata, type 'help' followed by a topic or command, e.g., 'help codebook'.
Most Stata commands follow the same basic syntax: command varlist, options.
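As a minimal sketch of this syntax pattern, the lines below use the 'auto' example data set that ships with Stata. That data set is an assumption made here so the example runs before any workshop data is loaded; it is not part of the workshop files.

```stata
* load the example data set that ships with Stata
sysuse auto, clear
* command = summarize, varlist = price mpg, option = detail
summarize price mpg, detail
* the same pattern with a different command and option
tabulate foreign, nolabel
```

Everything before the comma says what to run and on which variables; everything after it modifies how the command behaves.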
Start with a comment describing your do-file, and use comments throughout.
* Use '*' to comment a line and '//' for in-line comments
* Make Stata say hello:
disp "Hello " "World!" // 'disp' is short for 'display'
Hello World!
Use '///' to break varlists over multiple lines:
disp "Hello" ///
" World!"
Hello World!
* change directory
// cd "C://Users/dataclass/Desktop/StataIntro"
cd dataSets
// open the gss.dta data set
use gss.dta, clear
// save data file:
save newgss.dta, replace // "replace" option means OK to overwrite existing file
/home/izahn/Documents/Work/Classes/IQSS_Stats_Workshops/Stata/StataIntro/dataSets
file newgss.dta saved
* import data from a .csv file
import delimited gss.csv, clear
* save data to a .csv file
export delimited gss_new.csv, replace
(7 vars, 451 obs)
file gss_new.csv saved
* import/export SAS xport files
clear
import sasxport gss.xpt
export sasxport gss_new, replace
file gss_new.xpt saved
Open your .do file and change directory ('cd') to the dataSets folder.
use gss.dta, clear
sum educ // statistical summary of education
    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
        educ |        217    13.52995       3.0687         1         20
codebook region // information about how region is coded
-----------------------------------------------------------------------
region                                                      (unlabeled)
-----------------------------------------------------------------------

                  type:  string (str5)

         unique values:  4                        missing "":  0/217

            tabulation:  Freq.   Value
                            54   "east"
                            48   "north"
                            48   "south"
                            67   "west"
tab sex // numbers of male and female participants
respondents |
        sex |      Freq.     Percent        Cum.
------------+-----------------------------------
       male |        114       52.53       52.53
     female |        103       47.47      100.00
------------+-----------------------------------
      Total |        217      100.00
/* Histograms */
hist educ
(bin=14, start=1, width=1.3571429)
// histogram with normal curve; see 'help hist' for other options
hist age, normal
(bin=14, start=18, width=4.2142857)
/* scatterplots */
twoway (scatter educ age)
graph matrix educ age inc
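Graphs can also be given titles and written to disk. The following is a hedged sketch rather than part of the original workshop code: it assumes the working directory is still the dataSets folder, and 'educ_hist.png' is a hypothetical file name. Which export formats are available depends on the platform and Stata edition.

```stata
* assumes the working directory is the dataSets folder used above
* note: 'clear' discards any unsaved changes to the data in memory
use gss.dta, clear
* histogram of education with a normal curve and a title
hist educ, normal title("Years of education")
* write the graph currently in memory to a file (hypothetical file name)
graph export educ_hist.png, replace
```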
* By Processing
bysort sex: tab happy // tabulate happy separately for men and women
-----------------------------------------------------------------------
-> sex = male

      general |
    happiness |      Freq.     Percent        Cum.
--------------+-----------------------------------
   very happy |         32       28.07       28.07
 pretty happy |         68       59.65       87.72
not too happy |         14       12.28      100.00
--------------+-----------------------------------
        Total |        114      100.00

-----------------------------------------------------------------------
-> sex = female

      general |
    happiness |      Freq.     Percent        Cum.
--------------+-----------------------------------
   very happy |         33       32.04       32.04
 pretty happy |         61       59.22       91.26
not too happy |          9        8.74      100.00
--------------+-----------------------------------
        Total |        103      100.00
bysort marital: sum educ // summarize education by marital status
-----------------------------------------------------------------------
-> marital = married

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
        educ |        103    13.65049     3.374381         1         20

-----------------------------------------------------------------------
-> marital = widowed

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
        educ |          6    12.33333      1.36626        11         15

-----------------------------------------------------------------------
-> marital = divorced

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
        educ |         39    13.46154     2.501012         6         19

-----------------------------------------------------------------------
-> marital = separate

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
        educ |          9    12.11111     2.803767         6         14

-----------------------------------------------------------------------
-> marital = never ma

    Variable |        Obs        Mean    Std. Dev.       Min        Max
-------------+---------------------------------------------------------
        educ |         60        13.7     3.004516         6         20
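The by-processing commands above print one block of output per group. As a sketch of two related approaches (not taken from the original workshop), the group mean can be stored as a new variable with 'egen', or the group statistics can be collected in one compact table with 'tabstat'. The variable name 'mean_educ' is a hypothetical choice, and the example assumes the working directory is the dataSets folder.

```stata
* note: 'clear' discards any unsaved changes to the data in memory
use gss.dta, clear
* mean education within each sex, stored as a new variable
bysort sex: egen mean_educ = mean(educ)
* one compact table of group statistics instead of one block per group
tabstat educ, by(marital) statistics(n mean sd)
```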
happy for married individuals only
/* Labelling and renaming */
// Label variable inc "household income"
label var inc "household income"
// change the name 'educ' to 'education'
rename educ education
// you can search names and labels with 'lookfor'
lookfor household
              storage   display    value
variable name   type    format     label      variable label
-----------------------------------------------------------------------
inc             byte    %8.0g      rincom06   household income
/*define a value label for sex */
label define mySexLabel 1 "Male" 2 "Female"
/* assign our label set to the sex variable*/
label val sex mySexLabel
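To check that a value label was defined and attached as intended, the sketch below repeats the two labelling steps and then inspects the result. It assumes the working directory is the dataSets folder; the 'replace' option on 'label define' is added here only so the example can be re-run without an "already defined" error.

```stata
* note: 'clear' discards any unsaved changes to the data in memory
use gss.dta, clear
label define mySexLabel 1 "Male" 2 "Female", replace
label values sex mySexLabel
* show the value-to-text mapping stored in the label set
label list mySexLabel
* tabulate with labels, then with the underlying numeric codes
tab sex
tab sex, nolabel
* describe shows which label set is attached to the variable
describe sex
```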
var | rename to | label with |
---|---|---|
v1 | marital | marital status |
v2 | age | age of respondent |
v3 | educ | education |
v4 | sex | respondent's sex |
v5 | inc | household income |
v6 | happy | general happiness |
v7 | region | region of interview |
value | label |
---|---|
1 | "married" |
2 | "widowed" |
3 | "divorced" |
4 | "separated" |
5 | "never married" |
Operator | Meaning |
---|---|
== | equal to |
!= | not equal to |
> | greater than |
>= | greater than or equal to |
< | less than |
<= | less than or equal to |
& | and |
\| | or |
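These operators are typically used in an 'if' qualifier to restrict a command to some observations. The lines below are a small sketch using variables from gss.dta, assuming the working directory is the dataSets folder. Because 'region' is stored as a string, its values must be quoted. When '&' and '|' are mixed in one condition, parentheses make the intended grouping explicit.

```stata
* note: 'clear' discards any unsaved changes to the data in memory
use gss.dta, clear
* count respondents aged 18 to 29
count if age >= 18 & age < 30
* string comparisons need quotes; region is stored as a string
tab sex if region == "east" | region == "west"
* summarize education for everyone whose region is not "north"
sum educ if region != "north"
* parentheses make the grouping explicit when mixing '&' and '|'
count if age < 30 & (region == "east" | region == "west")
```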
// create a new variable named mc_inc
// equal to inc minus the mean of inc
gen mc_inc = inc - 15.37
/* the 'generate and replace' strategy */
// generate a column of missings
gen age_wealth = .
// Next, start adding your qualifications
replace age_wealth=1 if age<30 & inc < 10
replace age_wealth=2 if age<30 & inc > 10
replace age_wealth=3 if age>30 & inc < 10
replace age_wealth=4 if age>30 & inc > 10
// conditions can also be combined with "or"
gen young=0
replace young=1 if age_wealth==1 | age_wealth==2
(217 missing values generated)
(19 real changes made)
(26 real changes made)
(22 real changes made)
(134 real changes made)
(45 real changes made)
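Two details of the code above are worth a hedged sketch. First, the mean of 'inc' does not have to be typed in as 15.37: 'summarize' stores it in 'r(mean)'. Second, the four 'replace' counts sum to 201 of 217 observations, so some cases are never assigned, most likely those with age exactly 30 or inc exactly 10, which no condition covers. Stata also treats missing values as larger than any number, so a condition such as 'inc > 10' is true when 'inc' is missing. The version below assigns boundary cases to the upper group and excludes missing values explicitly; both are assumptions made for illustration, not the workshop's stated intent.

```stata
* note: 'clear' discards any unsaved changes to the data in memory
use gss.dta, clear
* summarize stores the mean in r(mean); use it instead of typing 15.37
sum inc
gen mc_inc = inc - r(mean)
* start from a column of missings, then fill in each group
gen age_wealth = .
* assumption: boundary values (age 30, inc 10) go to the upper group
replace age_wealth = 1 if age <  30 & inc <  10
replace age_wealth = 2 if age <  30 & inc >= 10 & !missing(inc)
replace age_wealth = 3 if age >= 30 & !missing(age) & inc <  10
replace age_wealth = 4 if age >= 30 & !missing(age) & inc >= 10 & !missing(inc)
* check the result, including any observations left missing
tab age_wealth, missing
```

With these conditions, only observations with a missing 'age' or 'inc' remain unassigned.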