With missing data, full-information maximum likelihood (FIML) is an alternative to multiple imputation that requires considerably fewer decisions from the researcher, and fewer “researcher degrees of freedom” are arguably preferable (cf. here).
FIML in Stata
Stata implements FIML through its SEM suite. FIML requires the maximum likelihood estimation method option:
method(ml) // Normal maximum likelihood
To request FIML for missing values, simply add “mv” (for “missing values”) to the option:
method(mlmv) // Full-information maximum likelihood estimation
* Load example data
sysuse auto, clear
* Variable with missing data
codebook rep78
* OLS regression
regress price rep78 mpg
* Regression using SEM
sem (price <- rep78 mpg)
* Number of obs = 69: the 5 observations with missing rep78 are dropped
* Regression using SEM with full-information maximum likelihood
sem (price <- rep78 mpg), method(mlmv)
* Number of obs = 74: all observations are used, including those with missing rep78
Pitfalls with FIML
Always check whether your FIML results use all observations. FIML sometimes appears not to work: only complete observations are used, and observations with missing values are not taken into account.
The most common reason for FIML not working in Stata is the coding of missing values. For FIML to work, all missing values need to be coded as “.”, not as “.a” or “.b”, or worse, as “999” or “888” à la SPSS, or “NA” à la R.
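One way to make this check explicit is to compare the estimation sample size with the number of rows in the dataset. The lines below are a minimal sketch using the auto example from above. They assume that every row in memory is meant to enter the model (no if/in restriction), and they rely on sem storing the number of observations in e(N).

```stata
sysuse auto, clear
sem (price <- rep78 mpg), method(mlmv)
* e(N) is the number of observations used in estimation; _N is the number of rows in memory
display "Observations used: " e(N) " of " _N
* Stops the do-file with an error if any observation was dropped
assert e(N) == _N
```

If the assert fails, some observations were excluded, and the coding of missing values is the first thing to inspect.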
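A quick way to see how missing values are currently coded is to tabulate them before estimating. This is only a sketch of one possible check: misstable summarize separates system missing values from extended ones, and summarize helps spot numeric placeholder codes.

```stata
* "Obs=." counts system missing values (.), "Obs>." counts extended missing values (.a to .z)
misstable summarize
* Numeric placeholder codes such as 999 are not missing to Stata; they show up as implausible minima or maxima
summarize
* String variables (for example, ones containing "NA") are listed with a str storage type
describe
```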
For FIML, always recode missing values first:
mvdecode _all, mv(333=. \ 999=. \ 666=.)
recode VARNAME (.a = .) (.b = .)
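The commands above only handle numeric variables. If a variable was imported with “NA” as text, Stata stores it as a string, and it has to be converted first. The following is a sketch under that assumption; VARNAME is a placeholder, as in the recode line.

```stata
* Assumes VARNAME is a string variable in which missing values are written as "NA"
replace VARNAME = "" if VARNAME == "NA"
* Convert to numeric; empty strings become system missing (.)
destring VARNAME, replace
```

If destring reports that the variable contains other nonnumeric characters, it leaves the variable unchanged, which is a useful signal that further cleaning is needed.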
Excellent slides on Multiple Imputation and FIML in Stata: https://www.stata.com/meeting/switzerland16/slides/medeiros-switzerland16.pdf