Stata
Stata is a statistical software system that includes:
- Basic statistics (e.g., univariate statistics, regression, ANOVA, logistic regression).
- Advanced econometric statistics (e.g., panel analysis with random coefficients, censored regression, simultaneous-equation analysis, conditional logistic regression).
- Full programming capabilities, including matrix syntax.
- Data management capabilities.
- High-resolution graphics.
Where to find Stata
- Stata/IC for UNIX is on the central UNIX server strauss.udel.edu. The run command is
stata
The run command for the X-Windows version is
xstata
- Stata/IC for Windows is in the Research & Data Management Services (RDMS) Lab, 002D Smith Hall. Research users have priority access to the RDMS systems.
- Information Technologies makes annual bulk purchases of Stata (Windows and Mac) on behalf of UD departments to reduce their licensing expenses. To take part in this purchase, send e-mail to RDMS.
Instructions for Stata on Strauss
Stata for UNIX may be run (1) in a full-screen environment, (2) in line-prompt mode, or (3) in batch mode.
Full-screen mode
To run Stata interactively in full-screen mode, you must be connected to Strauss using an X-Window server such as a SunRay, a UNIX workstation, or software such as Xming or Cygwin.
To start full-screen Stata, connect to Strauss and type:
xstata
at the UNIX prompt. Three panes appear in a new window like the one below:
The Results window may be difficult to read, as illustrated by the previous image. You may customize its appearance by clicking Edit/Preferences/General/Preferences/Results. Change "Color scheme:" from "Black background" to "White background" and change the font to Courier New, point size 12 as shown below:
The editor is the lower-right pane. Type Stata commands here. For example, type
sysuse auto, clear
summarize
The following image shows the results of these two commands:
The upper-left pane displays the Stata commands that were entered in the editor (lower-right pane). The lower-left pane lists the variable names, and the Stata Results pane (upper-right, black background) displays the output.
For example, to show a high-resolution scatterplot with overlaid linear-regression line, type
graph twoway (lfit mpg weight) (scatter mpg weight)
The result looks like
Print the plot by typing
print @Graph
or by clicking the right mouse-button on the plot and selecting "Print."
You can create a plot in an interactive mode or in a batch submission, as described below.
Line mode
To run a prompted Stata session but without the full-screen window, start Stata by typing
stata
at the UNIX prompt. The Stata prompt is the period at the beginning of the command line.
To exit Stata, type
. exit
at the Stata prompt. If you have unsaved work in memory, Stata will refuse to exit.
You can save your data and then exit by typing
. save filename
. exit
replacing filename with the name of your file. Stata adds an extension of .dta to your filename.
Alternatively, you can force Stata to exit without saving your data by typing
. exit, clear
at the Stata prompt.
You may enter data at the Stata prompt by typing the keyword input followed by a list of variable names. For example:
. input price mpg weight
Stata responds with a sequence of numbered prompts, one for each new observation. Terminate the input data with the keyword end:
  1. 4697 25 1930
  2. 8814 21 4060
  3. 3667 . 2750
  4. 4099 22 2930
  5. end
To list the data, type list at the Stata prompt.
. list

     +----------------------+
     | price   mpg   weight |
     |----------------------|
  1. |  4697    25     1930 |
  2. |  8814    21     4060 |
  3. |  3667     .     2750 |
  4. |  4099    22     2930 |
     +----------------------+
Notes:
- Variable names are case sensitive and must
  - consist of letters, numerals, and underscores,
  - contain no more than 32 characters, and
  - begin with a letter or underscore (underscore not recommended).
- A period denotes a missing numeric value.
To add more observations, type input with no variable list. For example:
. input
  5. 5079 24 2280
  6. 5189 20 3280
  7. 8129 21 2750
  8. end
List the cases again to check your typing.
. list

     +----------------------+
     | price   mpg   weight |
     |----------------------|
  1. |  4697    25     1930 |
  2. |  8814    21     4060 |
  3. |  3667     .     2750 |
  4. |  4099    22     2930 |
  5. |  5079    24     2280 |
     |----------------------|
  6. |  5189    20     3280 |
  7. |  8129    21     2750 |
     +----------------------+
To get univariate descriptive statistics, type
. summarize

Variable |     Obs        Mean   Std. Dev.       Min        Max
---------+-----------------------------------------------------
   price |       7    5667.714   1997.448       3667       8814
     mpg |       6    22.16667    1.94079         20         25
  weight |       7    2854.286   688.7878       1930       4060
To record your session in a file, type
. log using filename, text
substituting the name of your file for filename. The text option is required to get a plain text file that formats properly in a text editor such as pico or vim. The log file will be named filename.log.
To stop recording commands and output in the log, type
. log close
Batch mode
You can run a Stata job with your commands stored in a command file instead of typing them interactively at the Stata prompt.
Stata expects a filename extension of .do for its command files. For example, suppose a command file named mpgtest.do contains the following commands.
sysuse auto, clear
summarize
graph twoway (scatter mpg weight) (lfit mpg weight)
graph export mpgXweight.ps, replace
shell xv mpgXweight.ps
To run Stata using this command file, type the following at the UNIX prompt:
stata -b do mpgtest
The -b flag indicates a batch run. The do keyword tells Stata to execute the commands in the file named after it (mpgtest.do, in this example). Stata assumes an extension of .do if you omit that part of the filename. The output is saved in a file called mpgtest.log because the input file is named mpgtest.do.
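The batch run described above can be sketched as a small shell script. This is a minimal sketch, not a definitive recipe: it assumes a Bourne-style shell and that the stata command is on your PATH, and it leaves out the plot commands so that no X-Window connection is needed.

```shell
#!/bin/sh
# Write a small do-file. The plot commands from the example above are
# omitted so that this sketch needs no X-Window connection.
cat > mpgtest.do <<'EOF'
sysuse auto, clear
summarize
EOF

# Stata names the batch log after the do-file: mpgtest.do -> mpgtest.log
dofile=mpgtest.do
logfile="${dofile%.do}.log"

# Run only if the stata command is available (an assumption about your system).
if command -v stata >/dev/null 2>&1; then
    stata -b do "${dofile%.do}"
    echo "Output written to $logfile"
else
    echo "stata not found; would run: stata -b do ${dofile%.do}"
fi
```

The log name is derived from the do-file name here only to show the naming rule; Stata itself chooses the name when run with -b do.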
You also can run a batch Stata job by using the UNIX redirection symbols ("<" and ">!"):
stata < mpgtest.do >! mpgtest.log
This gives you complete control over the names of your command files and output files. The disadvantage is that you cannot use a command delimiter (e.g., ";"), which lets you type a single long Stata command over several lines in the command file. This is discussed further below.
To run the job in the background, put the & character at the end of the command. For example,
stata < mpgtest.do >! mpgtest.log &
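The ">!" form used above is C-shell syntax for overwriting an existing file. As a hedged sketch, the same background job might look like this in a Bourne-style shell (sh, bash, ksh), where ">|" plays that role. The file names repeat the example above, and the script assumes the stata command is on your PATH.

```shell
#!/bin/sh
# File names matching the example above.
dofile=mpgtest.do
logfile=mpgtest.log

# A tiny do-file so that the sketch is self-contained.
printf 'sysuse auto, clear\nsummarize\n' > "$dofile"

if command -v stata >/dev/null 2>&1; then
    # ">|" overwrites an existing log even when noclobber is set,
    # which is what ">!" does in the C shell.
    stata < "$dofile" >| "$logfile" &
    job=$!
    echo "Stata is running in the background as process $job"
    wait "$job"    # block until the background job finishes
    echo "Finished; see $logfile"
else
    echo "stata not found; would run: stata < $dofile >| $logfile &"
fi
```

The wait step is optional; without it the prompt returns immediately and the job keeps running.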
Output
The output log (.log) file also contains information to help you diagnose problems.
Plots are not automatically displayed on the terminal screen when you run Stata in batch mode. You can display a plot saved during a batch run using a UNIX viewer such as ghostview or xv. For example, the command file mpgtest.do shown above ends with a Stata shell command that displays the plot using the UNIX application named xv. Note that the xv program requires an X-Window connection to Strauss.
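One way to use the log for diagnosis is to search it for Stata return codes, which Stata prints after an error message in the form r(111);. The sketch below is illustrative only: sample.log is a made-up excerpt standing in for a real log such as mpgtest.log, and the misspelled variable name mpgg is hypothetical.

```shell
#!/bin/sh
# A made-up log excerpt standing in for a real batch log.
cat > sample.log <<'EOF'
. sysuse auto, clear
(1978 Automobile Data)

. summarize mpgg
variable mpgg not found
r(111);

end of do-file
r(111);
EOF

# Count the return-code lines, then list them with their line numbers.
# The error message itself is on the line just before the first match.
errors=$(grep -c '^r([0-9][0-9]*);' sample.log)
echo "return-code lines found: $errors"
grep -n '^r([0-9][0-9]*);' sample.log
```

To check a real run, replace sample.log with the name of your own log file.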
Command delimiter for long Stata commands
The command to set the character that divides Stata commands is
#delimit
This can be quite useful if you have commands that span more than one line, because it improves the readability of your code. For example, to set it to the semicolon, put
#delimit ;
at the top of your command file. Using the #delimit command, the previous example might be reformatted for easy readability like this:
#delimit ;
sysuse auto, clear;
summarize;
graph twoway (scatter mpg weight)
             (lfit mpg weight);
graph export mpgXweight.ps, replace;
shell xv mpgXweight.ps;
However, the #delimit command is ignored unless you run the command file with stata -b do. It has no effect when the file is read through input redirection ("<").
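The rule above can be sketched as a small check that picks the run command for a given command file. This is an illustration under stated assumptions: delimtest.do is a hypothetical file name, and the script only prints a suggested command rather than starting Stata.

```shell
#!/bin/sh
dofile=delimtest.do    # hypothetical file name

# A do-file that uses ";" as the delimiter so that one graph command
# can span two lines.
cat > "$dofile" <<'EOF'
#delimit ;
sysuse auto, clear;
graph twoway (scatter mpg weight)
             (lfit mpg weight);
graph export mpgXweight.ps, replace;
EOF

# Command files that use #delimit need "-b do"; others can use either form.
if grep -q '^#delimit' "$dofile"; then
    cmd="stata -b do ${dofile%.do}"
else
    cmd="stata < $dofile > ${dofile%.do}.log"
fi
echo "suggested command: $cmd"
```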
More help
There are several sources of additional information about Stata.
- The Stata interactive help facility. This works best from a full-screen Stata session. Click Help and follow the links, or type the help command in the full-screen Stata editor or at the command prompt in line mode.
  Help command: help command_name
  Search command: search search_topic
  Getting started: type help without a command name.
- The Stata support web site contains links to Stata documentation, training, and users groups.
- The Stata manual. A reference copy is kept in the Research & Data Management Services (RDMS) Lab, 002D Smith Hall.