



Cheat sheet on data processing, data transformation, data visualization, plotting, data analysis and programming with Stata 15
Typology: Cheat Sheet
search mdesc
Import Data
import excel "yourSpreadsheet.xlsx", sheet("Sheet1") cellrange(A2:H11) firstrow
Basic Syntax
To find out more about any command, like what options it takes, type help command
Arithmetic
!  not
Explore Data
Data Transformation
Select Parts of Data (Subsetting)
CHANGE COLUMN NAMES
Reshape Data
TIDY DATASETS
WIDE
Manipulate Strings
Plotting in Stata 15
For more info see Stata's reference manual (stata.com)
ANATOMY OF A PLOT
Plots contain many features: annotation, title, subtitle, y-axis, graph region, plot region, line, tick marks.
scatter price mpg, graphregion(fcolor("192 192 192") ifcolor("208 208 208"))
    specify the fill of the background in RGB or with a Stata color
scatter price mpg, plotregion(fcolor("224 224 224") ifcolor("240 240 240"))
    specify the fill of the plot background in RGB or with a Stata color
SYMBOLS
Arguments for the plot objects (in green) go in the options portion of these commands (in orange); for example: scatter price mpg, xline(20, lwidth(vthick))
mcolor("145 168 208")  mcolor(none)
    specify the fill and stroke of the marker in RGB or with a Stata color
mfcolor("145 168 208")  mfcolor(none)
    specify the fill of the marker
msize(medium)
    specify the marker size: ehuge, vhuge, huge, vlarge, medlarge, medium, medsmall, small, tiny
msymbol(Dh)
    specify the marker symbol: o, Oh, Dh, Th, Sh
jitter(#)
    randomly displace the markers
jitterseed(#)
    set seed
axes: xscale() yscale()
tick marks, grid lines: xlabel() ylabel()
lcolor("145 168 208")  lcolor(none)
    specify the stroke color of the line or border
    marker: mlcolor("145 168 208")
    tick marks: tlcolor("145 168 208")
    grid lines: glcolor("145 168 208")
lwidth(medthick)
    specify the thickness (stroke) of a line: vvvthick, medthin, thin, vthin, vvthin, vvvthin, none
    marker: mlwidth(thin)
    tick marks: tlwidth(thin)
    grid lines: glwidth(thin)
lpattern(dash)
    specify the line pattern
    grid lines: glpattern(dash)
axes: noline    tick marks: noticks
axes: off (no axis or labels)    tick marks: tlength(2)    grid lines: nogrid nogmin nogmax
tick marks: xlabel(#10, tposition(crossing))
    number of tick marks, position (outside | crossing | inside)
Text elements of a plot: titles, subtitle, x-axis title, annotation, legend, marker label, axis labels.
titles: title() subtitle() xtitle() ytitle()
color("145 168 208")  color(none)
    specify the color of the text
    marker label: mlabcolor("145 168 208")
size(medsmall)
    specify the size of the text: vhuge, huge, vlarge, large, medlarge, medium, medsmall, small, vsmall, tiny, half_tiny, third_tiny, quarter_tiny, minuscule
    marker label: mlabsize(medsmall)
    axis labels: labsize(medsmall)
mlabel(foreign)
    label the points with the values of the foreign variable
nolabels
    no axis labels
format(%12.2f)
    change the format of the axis labels
legend(off)
    turn off legend
legend(label(# "label"))
    change legend label text
mlabposition(5)
    marker label location relative to marker (clock position: 0 - 12)
Apply Themes
Schemes are sets of graphical parameters, so you don't have to specify the look of the graphs every time.
USING A SAVED THEME
twoway scatter mpg price, scheme(customTheme)
help scheme entries
    see all options for setting scheme properties
adopath ++ "~/…"
set scheme customTheme, permanently
    change the theme
net inst brewscheme, from("https://wbuchanan.github.io/brewscheme/") replace
    install William Buchanan's package to generate custom schemes and color palettes (including ColorBrewer)
USING THE GRAPH EDITOR
twoway scatter mpg price, play(graphEditorTheme)
Select the Graph Editor
Click Record
Double click on symbols and areas on plot, or regions on sidebar, to customize
Unclick Record
Save theme as a .grec file
Save Plots
graph twoway scatter y x, saving("myPlot.gph", replace)
    save the graph when drawing
graph save "myPlot.gph", replace
    save current graph to disk
graph combine plot1.gph plot2.gph...
    combine 2+ saved graphs into a single plot
graph export "myPlot.pdf", as(pdf)
    export the current graph as an image file
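As a rough sketch of how the marker, axis, and saving options listed in this section might fit together, the do-file fragment below draws one scatter plot and exports it. It assumes the built-in auto.dta example data; the title text and the file names are placeholders rather than anything prescribed by the sheet, and the `///` line continuations only work inside a do-file.

```stata
* Sketch only: combines options from this section on the auto.dta example data.
sysuse auto, clear

* Hollow-diamond markers in a custom RGB color, roughly 10 x-axis ticks that
* cross the axis line; the graph is saved to disk while it is drawn.
twoway scatter price mpg,                            ///
    mcolor("145 168 208") msize(medium) msymbol(Dh)  ///
    xlabel(#10, tposition(crossing))                 ///
    title("Price vs. mileage")                       ///
    saving("myPlot.gph", replace)

* Export the graph currently in memory as a PDF (placeholder file name).
graph export "myPlot.pdf", as(pdf) replace
```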
updated June 2016
Data Analysis with Stata 15 Cheat Sheet
For more info see Stata's reference manual (stata.com)
Summarize Data
Examples use auto.dta (sysuse auto, clear) unless otherwise noted
univar price mpg, boxplot
    calculate univariate summary, with box-and-whiskers plot
stem mpg
    return stem-and-leaf display of mpg
summarize price mpg, detail
    calculate a variety of univariate summary statistics
ci mean mpg price, level(99)
    compute standard errors and confidence intervals
correlate mpg price
    return correlation or covariance matrix
pwcorr price mpg weight, star(0.05)
    return all pairwise correlation coefficients with sig. levels
mean price mpg
    estimates of means, including standard errors
proportion rep78 foreign
    estimates of proportions, including standard errors for categories identified in varlist
ratio
    estimates of ratio, including standard errors
total price
    estimates of totals, including standard errors
Statistical Tests
tabulate foreign rep78, chi2 exact expected
    tabulate foreign and repair record and return chi^2 and Fisher's exact statistic alongside the expected values
ttest mpg, by(foreign)
    estimate t test on equality of means for mpg by foreign
prtest foreign == 0.5
    one-sample test of proportions
ksmirnov mpg, by(foreign) exact
    Kolmogorov-Smirnov equality-of-distributions test
ranksum mpg, by(foreign)
    equality tests on unmatched data (independent samples)
anova systolic drug
    analysis of variance and covariance
pwmean mpg, over(rep78) pveffects mcompare(tukey)
    estimate pairwise comparisons of means with equal variances, include multiple comparison adjustment
Declare Data
By declaring data type, you enable Stata to apply data munging and analysis functions specific to certain data types

TIME SERIES
tsset time, yearly
    declare data to be a yearly time series
tsreport
    report time series aspects of a dataset
generate lag_spot = L1.spot
    create a new variable of annual lags
tsline spot
    plot time series of sunspots
arima spot, ar(1/2)
    estimate an auto-regressive model with 2 lags

TIME SERIES OPERATORS
L.    lag                           x_{t-1}
F.    lead                          x_{t+1}
L2.   2-period lag                  x_{t-2}
F2.   2-period lead                 x_{t+2}
D.    difference                    x_t - x_{t-1}
S.    seasonal difference           x_t - x_{t-1}
D2.   difference of difference      (x_t - x_{t-1}) - (x_{t-1} - x_{t-2})
S2.   lag-2 (seasonal difference)   x_t - x_{t-2}

USEFUL ADD-INS
tscollap
    compact time series into means, sums and end-of-period values
carryforward
    carry non-missing values forward from one obs. to the next
tsspell
    identify spells or runs in time series

PANEL / LONGITUDINAL
webuse nlswork, clear
xtset id year
    declare national longitudinal data to be a panel
xtdescribe
    report panel aspects of a dataset
xtsum hours
    summarize hours worked, decomposing standard deviation into between and within components
xtline ln_wage if id <= 22, tlabel(#3)
    plot panel data as a line plot
xtreg ln_w c.age##c.age ttl_exp, fe vce(robust)
    estimate a fixed-effects model with robust standard errors

SURVIVAL ANALYSIS
stset studytime, failure(died)
    declare survival-time data for a dataset
stsum
    summarize survival-time data
stcox drug age
    estimate a Cox proportional hazard model
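The time-series commands in this section can be strung together as in the sketch below. It assumes Stata's sunspot example dataset (loaded with webuse, with variables time and spot), which the sheet does not name explicitly; the variable d_spot is a hypothetical name added here to illustrate the D. operator, and arima with ar(1/2) is one way, assumed here, to fit the two-lag autoregressive model.

```stata
* Sketch only: assumes the sunspot example data (variables: time, spot).
webuse sunspot, clear

* Declare the data as a yearly time series so that L., D., etc. are available.
tsset time, yearly

* One-period lag and first difference of the sunspot series.
generate lag_spot = L1.spot
generate d_spot   = D.spot      // hypothetical variable name

* Plot the series, then fit an autoregressive model with lags 1 and 2.
tsline spot
arima spot, ar(1/2)
```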
Estimate Models
stores results as e-class
regress price mpg weight, vce(robust)
    estimate ordinary least squares (OLS) model of price on mpg and weight, apply robust standard errors
regress price mpg weight if foreign == 0, vce(cluster rep78)
    regress price only on domestic cars, cluster standard errors
rreg price mpg weight, genwt(reg_wt)
    estimate robust regression to reduce the influence of outliers
probit foreign turn price, vce(robust)
    estimate probit regression with robust standard errors
logit foreign headroom mpg, or
    estimate logistic regression and report odds ratios
bootstrap, reps(100): regress mpg weight gear foreign
    estimate regression with bootstrapping
jackknife r(mean), double: sum mpg
    jackknife standard error of sample mean

ADDITIONAL MODELS

SURVEY DATA
svyset psuid [pweight = finalwgt], strata(stratid)
    declare survey design for a dataset
svydescribe
    report survey data details
svy: mean age, over(sex)
    estimate a population mean for each subpopulation
svy, subpop(rural): mean age
    estimate a population mean for rural areas
svy: tabulate sex heartatk
    report two-way table with tests of independence
svy: reg zinc c.age##c.age female weight rural
    estimate a regression using survey weights

regress price headroom length
    used in all postestimation examples
estat hettest
    test for heteroskedasticity
estat ovtest
    test for omitted variable bias
estat vif
    report variance inflation factor
dfbeta length
    calculate measure of influence
rvfplot, yline(0)
    plot residuals against fitted values
avplots
    plot all partial-regression leverage plots in one graph
display _b[length]
display _se[length]
    return coefficient estimate or standard error for length from most recent regression model
margins, dydx(length)
    return the estimated marginal effect for length
Estimation with Categorical & Factor Variables
more details at http://www.stata.com/manuals/u25.pdf

CONTINUOUS VARIABLES    measure something
CATEGORICAL VARIABLES   identify a group to which an observation belongs
INDICATOR VARIABLES     denote whether something is true or false

OPERATOR  DESCRIPTION                     EXAMPLE
i.        specify indicators              regress price i.rep78
              specify rep78 variable to be an indicator variable
ib.       specify base indicator          regress price ib(3).rep78
              set the third category of rep78 to be the base category
fvset     command to change base          fvset base frequent rep78
              set the base to most frequently occurring category for rep78
c.        treat variable as continuous    regress price i.foreign#c.mpg i.foreign
              treat mpg as a continuous variable and specify an interaction between foreign and mpg
o.        omit a variable or indicator    regress price io(2).rep78
              set rep78 as an indicator; omit observations with rep78 == 2
#         specify interactions            regress price mpg c.mpg#c.mpg
              create a squared mpg term to be used in regression
##        specify factorial interactions  regress price c.mpg##c.mpg
              create all possible interactions with mpg (mpg and mpg^2)

margins, eyex(length)
    return the estimated elasticity for price
predict yhat if e(sample)
    create predictions for sample on which model was fit
predict double resid, residuals
    calculate residuals based on last fit model
test headroom = 0
    test linear hypotheses that headroom estimate equals zero
lincom headroom - length
    test linear combination of estimates (headroom = length)
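The postestimation commands listed here all act on the most recently fitted model, so their order matters. A minimal sketch of one such session follows, assuming the auto.dta example data and the regress price headroom length model; the new variable names yhat and resid are the ones used in the entries above.

```stata
* Sketch only: one postestimation session on the auto.dta example data.
sysuse auto, clear

* Fit the model that the later commands refer to.
regress price headroom length

* Coefficient and standard error for length from the stored results.
display _b[length]
display _se[length]

* Fitted values (estimation sample only) and residuals.
predict yhat if e(sample)
predict double resid, residuals

* Hypothesis tests on the fitted coefficients.
test headroom = 0
lincom headroom - length
```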