You can summarize results from multiple studies with Stata’s new meta-analysis suite. Use random-effects, fixed-effects, or common-effect meta-analysis to combine individual results and compute an overall effect size. Forest plots allow you to visualize the results. With subgroup analysis or meta-regression, you can explore heterogeneity of studies. And you can evaluate publication bias using funnel plots and the trim-and-fill method.
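To make "combine individual results and compute an overall effect size" concrete, here is a minimal Python sketch of inverse-variance pooling, the calculation behind a common-effect meta-analysis, together with Cochran's Q as a simple heterogeneity statistic. The function name and the three studies are hypothetical, and this is not Stata's implementation.

```python
import math


def pool_inverse_variance(effects, std_errors):
    """Pool study effect sizes using inverse-variance weights.

    Returns the pooled effect, its standard error, and Cochran's Q.
    """
    # Each study is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviations from the pooled effect.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    return pooled, pooled_se, q


# Hypothetical effect sizes and standard errors from three studies.
effects = [0.10, 0.30, 0.20]
std_errors = [0.10, 0.20, 0.10]

pooled, pooled_se, q = pool_inverse_variance(effects, std_errors)
print(f"pooled effect = {pooled:.4f}, SE = {pooled_se:.4f}, Q = {q:.4f}")
```

More precise studies (smaller standard errors) pull the pooled estimate toward their own results. A random-effects analysis would add a between-study variance term to each weight.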
You can use lasso and elastic net for model selection and prediction. Want to estimate effects and test coefficients? With cutting-edge inferential methods, you can make inferences for variables of interest. Analyze continuous, binary, and count outcomes. You can even account for unobserved confounding.
With Stata’s reporting features, you can easily incorporate Stata results and graphs with formatted text and tables in Word, PDF, HTML, and Excel formats. Take advantage of Stata’s integrated versioning to create reproducible reports. Dynamic documents can be updated as your data change. With Stata 16, you can create Word documents from Markdown; easily include headers, footers, page numbers, and large blocks of text in Word; and convert HTML to Word or Word to PDF. The new Reporting Reference Manual guides you with many examples and workflows.
You can now embed and execute Python code within Stata. Invoke Python interactively or within do-files or ado-files. With the new Stata Function Interface (sfi) Python module, you can pass data back and forth seamlessly. This means that you can now use any Python package directly within Stata. For instance, you might use Matplotlib to draw 3-dimensional graphs, Scrapy to scrape data from the web, or TensorFlow and scikit-learn to access additional machine-learning techniques.
The most requested additions for Bayesian analysis, multiple chains and Bayesian predictions, are now available. You can use multiple chains with Bayesian estimation to evaluate MCMC convergence, and you can now assess that convergence with the Gelman–Rubin convergence diagnostic. With Bayesian predictions, you can check model fit and predict out-of-sample observations.
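The idea behind the Gelman–Rubin diagnostic can be sketched in a few lines of Python: compare the variance between chains with the variance within chains. This is the basic textbook form, assuming equal-length chains for a single parameter. The function name and the toy chains are hypothetical, and Stata's implementation may apply further corrections.

```python
from statistics import mean, variance


def gelman_rubin(chains):
    """Basic Gelman-Rubin statistic for equal-length chains of one parameter."""
    n = len(chains[0])                      # draws per chain
    chain_means = [mean(c) for c in chains]
    # W: average of the within-chain sample variances.
    within = mean([variance(c) for c in chains])
    # B: n times the sample variance of the chain means.
    between = n * variance(chain_means)
    # Pooled estimate of the posterior variance.
    var_hat = (n - 1) / n * within + between / n
    return (var_hat / within) ** 0.5


# Toy chains: the first pair overlaps, the second pair does not.
overlapping = [[1, 2, 3, 4], [2, 3, 4, 5]]
separated = [[1, 2, 3, 4], [11, 12, 13, 14]]

print(gelman_rubin(overlapping))
print(gelman_rubin(separated))
```

Values close to 1 suggest the chains are sampling the same distribution. Values well above 1, as with the separated chains, signal that the chains have not converged.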
Stata 16 introduces a new, unified suite of features for summarizing and modeling choice data. In addition, you can now fit mixed logit models for panel data. And here’s the best part: margins now works after fitting choice models. This means that you can now easily interpret your results. For example, estimate how much wait times at the airport affect the probability of traveling by air or even by train. And you can answer these types of questions whether you just fit a conditional logit, multinomial probit, mixed logit, rank-ordered probit, or another choice model.
Frames change the way you work: Stata can now hold multiple datasets in memory at once. Datasets in memory are stored in frames, and frames are named. When Stata launches, it creates a frame named default, but there is nothing special about it, and the name has no special or secret meaning. You can rename it. You can also create, delete, and rename other frames.
Extended regression models account for common problems—endogenous covariates, sample selection, and treatment—either alone or in combination. In Stata 16, add panel data to that list. Fit random-effects extended regression models for linear, interval-censored (including tobit), binary, and ordinal outcomes.
One of the first tasks of any research project is reading in data. import sas allows us to import SAS® data from version 7 or higher while import spss allows us to bring IBM® SPSS® files (version 16 or higher) and compressed IBM SPSS files (version 21 or higher) into Stata. We can import the entire dataset or only a subset of it. With import sas we may also import value labels. Dates, value labels, and missing values are all converted properly from SAS or SPSS to Stata format.
Don’t know the functional form of the relationship between your outcome and covariates? Don’t worry. Nonparametric series regression can select a polynomial, B-spline, or spline function that closely approximates the mean of your outcome. And you can still make inferences. Explore the response surface, estimate population-averaged effects, and obtain tests and confidence intervals.
Mixed logit models are models for choice outcomes. Choices might be modes of transportation, car insurance providers, or types of vacations. Sometimes individuals make the same decision repeatedly. For example, they choose whether to bike or take a car to work each day, choose a car insurance provider each year, or choose to vacation at the beach, mountains, or city each summer.
When data contain repeated choices, we have panel data.
With Stata 16's new cmxtmixlogit command, you can fit panel-data mixed logit models.
The new ciwidth command performs precision and sample-size analysis for confidence intervals (CIs). The goal is to optimally allocate study resources when CIs are to be used for inference or, said differently, to estimate the sample size required to achieve the desired precision of a CI.
ciwidth also lets you investigate the precision in various scenarios, which is useful at the planning stage. You can investigate the tradeoffs among sample size, required CI width, and the probability that the actual CI width will be less than required. And you can examine how each varies with other parameters.
Results can be presented in a table or graph.
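As a rough illustration of the tradeoff between sample size and CI width, the following Python sketch computes the sample size for a two-sided CI for a mean in the simplest case, where the standard deviation is assumed known. The function name and the numbers are hypothetical. This is not what ciwidth computes in general: with an estimated standard deviation the actual width is random, which is why the probability of achieving the width enters the calculation.

```python
import math
from statistics import NormalDist


def sample_size_for_width(sd, width, level=0.95):
    """Smallest n so that a normal-based CI for a mean is no wider than `width`.

    Assumes the standard deviation `sd` is known, so the CI width is
    2 * z * sd / sqrt(n) and does not vary from sample to sample.
    """
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return math.ceil((2 * z * sd / width) ** 2)


# Hypothetical planning values: sd of 10, target CI width of 4.
n = sample_size_for_width(sd=10, width=4)
achieved = 2 * NormalDist().inv_cdf(0.975) * 10 / math.sqrt(n)
print(n, round(achieved, 3))
```

Halving the target width roughly quadruples the required sample size, which is the kind of tradeoff worth examining at the planning stage.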
With IRT, you can explore the relationship between an unobserved latent trait, such as mathematical ability, and an instrument designed to measure that trait, such as a test. In Stata 16, you can now fit multiple-group IRT models and evaluate whether tests perform equally across different subpopulations. Do students in urban and rural schools respond in the same way to test questions, or are some questions worded unfairly for one group? With multiple-group IRT, you can perform an IRT model-based test of this hypothesis and of similar hypotheses related to differential item functioning.
Dynamic stochastic general equilibrium (DSGE) models consist of systems of equations that describe the structure of the economy. Equations in these models are almost always nonlinear. With Stata 16’s new dsgenl command, you no longer need to linearize the equations before fitting your DSGE models. And after fitting your model, you can obtain policy and transition matrices, identify the model’s steady state, estimate covariances and autocovariances, and create and graph impulse–response functions.
Ordinal variables are categorical and ordered, such as poor, fair, good, very good, and excellent. One way to think about ordered variables is that the categories represent ranges of an unobserved continuous variable z.
If z were normally distributed with mean 0 and standard deviation 1, this would be an ordered probit model. For particular cutpoints, it would correspond to 4% of subjects reporting poor, 13% reporting fair, and so on. Stata would fit this model if you used its ordered probit command, oprobit. You could instead specify a linear function for z in terms of age, bmi, and i.exercise by typing oprobit health age bmi i.exercise. The fitted model might be z = -0.0083*age - 0.0469*bmi + 0.5596*i.exercise. It would, however, be reasonable to assume that health status varies more as age increases; Stata's new hetoprobit command can handle that. To fit that model, you would type hetoprobit health age bmi i.exercise, het(age)
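The link between cutpoints on z and category percentages can be sketched in Python. The cutpoints below are hypothetical values chosen so that the first two categories come out near the 4% and 13% mentioned above; they are not estimates from any dataset. The second call only illustrates what a larger standard deviation does, which is the effect a heteroskedastic model lets vary with covariates such as age.

```python
from statistics import NormalDist


def category_probabilities(cutpoints, mean=0.0, sd=1.0):
    """Probability of each ordered category given cutpoints on a normal z."""
    dist = NormalDist(mean, sd)
    # Cumulative probabilities at the cutpoints, padded with 0 and 1.
    cdf = [0.0] + [dist.cdf(c) for c in cutpoints] + [1.0]
    # Each category is the probability mass between adjacent cutpoints.
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


labels = ["poor", "fair", "good", "very good", "excellent"]
cutpoints = [-1.75, -0.95, 0.0, 1.0]  # hypothetical cutpoints

probs = category_probabilities(cutpoints)
wider = category_probabilities(cutpoints, sd=1.5)

for label, p, pw in zip(labels, probs, wider):
    print(f"{label:>10}: sd=1 -> {p:.3f}   sd=1.5 -> {pw:.3f}")
```

With the larger standard deviation, more probability lands in the extreme categories (poor and excellent) even though the mean of z is unchanged.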
Heckman selection models adjust for bias when some outcomes are missing not at random. Imagine modeling income. The problem is that income is observed only for those who work. Missingness is not random.
Stata fits Heckman selection models and, new in Stata 16, Stata can fit them with panel (two-level) data.
The existing menl command has new features for fitting nonlinear mixed-effects models (NLMEMs) that may include lag, lead (forward), and difference operators. One important class of such models is the class of pharmacokinetic (PK) models and, specifically, multiple-dose PK models. menl's new features can also be used to fit other models, such as certain growth models and time-series nonlinear multilevel models.
Mata's new Quadrature() class provides adaptive Gaussian quadrature for numerically integrating univariate functions. It approximates the integral from a to b of f(x), where a can be minus infinity or finite and b can be finite or positive infinity.
Quadrature() uses the adaptive Gauss–Kronrod method. It also provides the Simpson method for use in teaching.
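For readers who want to see why the Simpson method is a good teaching tool, here is a minimal Python sketch of the composite Simpson rule on a finite interval. It is not adaptive, it does not handle infinite limits, and it is not the Mata Quadrature() interface. The function name and the test integrands are illustrative only.

```python
import math


def simpson(f, a, b, n=100):
    """Approximate the integral of f from a to b with the composite Simpson rule.

    n is the number of subintervals; it is rounded up to an even number.
    """
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # Interior points alternate weights 4 (odd index) and 2 (even index).
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


# The integral of sin(x) over [0, pi] is exactly 2.
print(simpson(math.sin, 0.0, math.pi))
# Simpson's rule is exact for polynomials up to degree 3.
print(simpson(lambda x: x ** 2, 0.0, 3.0))
```

An adaptive method refines the grid only where the integrand is hard to approximate, which is why it is generally preferred for production work.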
Stata provides statistical solutions developed by StataCorp, and it provides programming tools for those who want to develop their own solutions. There are two Stata programming languages: ado, which is easy to use, and Mata, which performs numerical heavy lifting. And Stata is integrated with Python.
The new Mata class LinearProgram() solves linear programs. It uses Mehrotra's (1992) interior-point method, which is faster for large problems than the traditional simplex method.
In Stata 16, you can now specify sizes of graph elements in printer points, inches, and centimeters. Simply add a unit suffix to the size: pt for printer points, in for inches, cm for centimeters, and rs for relative size.
Stata's Do-file Editor has long provided syntax highlighting for Stata. In Stata 16, it also provides syntax highlighting for Python and Markdown.
And Stata 16's Do-file Editor has autocompletion. The editor autocompletes words that already exist in the document, autocompletes Stata commands, and autocompletes quotes, parentheses, braces, and brackets.
Last but not least, you can now use spaces for indentation as well as tabs.
All of Stata's interface—all menus and all dialogs—is now available in Korean.
Dark Mode is a color scheme that darkens background windows and controls, so it directs your focus to what you are working on.