Fitting finite Normal mixture models using SPSS and BMDP

Normal mixture models may be used to identify the locations and spreads of suspected multiple peaks in a distribution, given a number of hypothesised Normal distributions fixed a priori. The example below examines the possibility that the data arise from a mixture of two Normal distributions.

The bimodal example data (consisting of one variable) are given in an SPSS data file here and clearly show two bumps when viewed as a histogram. The data may be saved from SPSS into a file called bimodal2.dat (excluding the variable name), which can then be read into the statistical package BMDP.

This facility is not available in most statistical packages, but it is supported by maximum likelihood routines in BMDP and STATA (Haughton 1997). Neither is currently available at CBSU; however, you can use SPSS. A BMDP run using the syntax below will fit two Normal distributions, each given a fixed mixing weight of 0.5. The syntax needs to be saved in a file, say mlm.bmdp.

 / input         file= 'bimodal2.dat'.
                 variables=1.
                 format=free.

 / variable      names = bdat.

 / estimate      parameters=4.

 / parameter     names=mu, sigmasq, mu2, sigmasq2.
                 initial = 2, 0.5, 7, 2.

 / density       f = 0.5*exp(-(bdat-mu)**2/(2*sigmasq))/
                     sqrt(6.2832*sigmasq) +
                     0.5*exp(-(bdat-mu2)**2/(2*sigmasq2))/
                     sqrt(6.2832*sigmasq2).

 / end
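For readers without BMDP, the density paragraph above can be written out in Python as a minimal sketch. The function names are hypothetical and chosen only for illustration; the parameter names follow the BMDP syntax, the constant 6.2832 is replaced by 2π (which it approximates), and the equal weight of 0.5 is kept as a default.

```python
import math

def normal_pdf(x, mu, sigmasq):
    """Density of a Normal distribution with mean mu and variance sigmasq."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigmasq)) / math.sqrt(2.0 * math.pi * sigmasq)

def mixture_pdf(x, mu, sigmasq, mu2, sigmasq2, weight=0.5):
    """Two-component Normal mixture; weight=0.5 mirrors the BMDP density."""
    return (weight * normal_pdf(x, mu, sigmasq)
            + (1.0 - weight) * normal_pdf(x, mu2, sigmasq2))

def log_likelihood(data, mu, sigmasq, mu2, sigmasq2):
    """Sum of log mixture densities: the quantity maximum likelihood maximises."""
    return sum(math.log(mixture_pdf(x, mu, sigmasq, mu2, sigmasq2)) for x in data)

# Evaluate the density at the starting values used in the BMDP syntax.
print(mixture_pdf(2.0, 2.0, 0.5, 7.0, 2.0))
```

Because both weights are fixed at 0.5, only the four parameters mu, sigmasq, mu2 and sigmasq2 are free, which matches parameters=4 in the estimate paragraph.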

To run the job, type the following command, which assumes the syntax is in the file mlm.bmdp. Output is sent to a newly created file called mlm.out.

bmdp le mlm.bmdp mlm.out
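Where BMDP is not to hand, the same equal-weight model can be fitted by other means. The sketch below is an assumption-laden illustration, not a reproduction of BMDP's routine: it uses the EM algorithm with the mixing weights held at 0.5, the function name fit_equal_weight_mixture is hypothetical, and the data are simulated rather than read from bimodal2.dat. The starting values are those given in the BMDP syntax.

```python
import math
import random

def normal_pdf(x, mu, sigmasq):
    """Density of a Normal distribution with mean mu and variance sigmasq."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigmasq)) / math.sqrt(2.0 * math.pi * sigmasq)

def fit_equal_weight_mixture(data, mu, sigmasq, mu2, sigmasq2, iterations=200):
    """EM for a two-Normal mixture with both weights fixed at 0.5."""
    for _ in range(iterations):
        # E-step: probability that each observation belongs to component 1.
        resp = []
        for x in data:
            a = 0.5 * normal_pdf(x, mu, sigmasq)
            b = 0.5 * normal_pdf(x, mu2, sigmasq2)
            resp.append(a / (a + b))
        # M-step: responsibility-weighted means and variances.
        w1 = sum(resp)
        w2 = len(data) - w1
        mu = sum(r * x for r, x in zip(resp, data)) / w1
        mu2 = sum((1.0 - r) * x for r, x in zip(resp, data)) / w2
        sigmasq = sum(r * (x - mu) ** 2 for r, x in zip(resp, data)) / w1
        sigmasq2 = sum((1.0 - r) * (x - mu2) ** 2 for r, x in zip(resp, data)) / w2
    return mu, sigmasq, mu2, sigmasq2

# Simulated stand-in data (NOT the original bimodal2.dat): an equal mix of
# Normal(mean 2, sd 0.5) and Normal(mean 7, sd 2), chosen for illustration.
rng = random.Random(1)
data = [rng.gauss(2.0, 0.5) if rng.random() < 0.5 else rng.gauss(7.0, 2.0)
        for _ in range(600)]

# Starting values taken from the "initial" line of the BMDP syntax.
estimates = fit_equal_weight_mixture(data, 2.0, 0.5, 7.0, 2.0)
print("mu, sigmasq, mu2, sigmasq2 =", estimates)
```

To use real data, replace the simulated list with the values read from a one-column text file such as bimodal2.dat.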

The file mlm.out contains the estimated means and variances of the two Normal distributions which "best" explain the data. In this example the best-fitting Normal distributions have means of 1.99 and 6.72, with respective variances of 0.24 and 4.73 (standard deviations of about 0.49 and 2.17).

The log-likelihood from the above fit may be compared with that obtained assuming just one peak, and the estimated density functions may be plotted to assess fit graphically.
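The comparison of log-likelihoods can be sketched in Python as follows. This is an illustration under stated assumptions: the data are simulated from the estimates reported above rather than taken from bimodal2.dat, the function names are hypothetical, and the mixture log-likelihood is evaluated at those reported values instead of being re-maximised. The single-Normal log-likelihood uses its closed-form maximum likelihood estimates (sample mean, and variance with divisor n).

```python
import math
import random

def normal_pdf(x, mu, sigmasq):
    """Density of a Normal distribution with mean mu and variance sigmasq."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigmasq)) / math.sqrt(2.0 * math.pi * sigmasq)

def mixture_loglik(data, mu, sigmasq, mu2, sigmasq2):
    """Log-likelihood of the equal-weight two-Normal mixture."""
    return sum(math.log(0.5 * normal_pdf(x, mu, sigmasq)
                        + 0.5 * normal_pdf(x, mu2, sigmasq2)) for x in data)

def single_normal_loglik(data):
    """Maximised log-likelihood of one Normal distribution."""
    n = len(data)
    mean = sum(data) / n
    var = sum((x - mean) ** 2 for x in data) / n  # ML estimate, divisor n
    return -0.5 * n * (math.log(2.0 * math.pi * var) + 1.0)

# Simulated stand-in data (NOT the original file), generated from the
# reported estimates: means 1.99 and 6.72, variances 0.24 and 4.73.
rng = random.Random(0)
data = [rng.gauss(1.99, math.sqrt(0.24)) if rng.random() < 0.5
        else rng.gauss(6.72, math.sqrt(4.73)) for _ in range(500)]

ll_two = mixture_loglik(data, 1.99, 0.24, 6.72, 4.73)
ll_one = single_normal_loglik(data)
print("two-peak log-likelihood:", ll_two)
print("one-peak log-likelihood:", ll_one)
```

A larger (less negative) log-likelihood for the two-peak model indicates a better fit, although the mixture has four free parameters against two for the single Normal.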

References

BMDP Statistical Software Manual Volume 2 (1992) BMDP Statistical Software Inc.

Haughton, D. (1997) Packages for Estimating Finite Mixtures: A Review. The American Statistician, 51, 194-205.