In the EM algorithm, both complete data and missing data are defined: $\{y_i, \theta_i, w_i\}$ defines the complete data. $\{\theta_i, w_i\}$ is missing data, where $w_i$ is a K-dimensional vector whose $k$th component, $w_{ik}$, is 1 or 0 depending on whether $\theta_i$ belongs to the $k$th mixing component in the equation:

$$p(\theta_i) = \sum_{k=1}^{K} \pi_k\, N\big(\theta_i \mid \mu_k, \Sigma_k\big)$$

where
$\pi_k$ is the weight for the $k$th Gaussian distribution (a nonnegative number, normalized by $\sum_{k=1}^{K} \pi_k = 1$)
$\mu_k$ is the mean vector ($k = 1, \ldots, K$)
$\Sigma_k$ is the positive definite covariance matrix ($k = 1, \ldots, K$)
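To make the mixture density concrete, the following minimal sketch (not taken from the software; the names pi, mu, and Sigma are illustrative stand-ins for the weights, means, and covariances above) evaluates $p(\theta) = \sum_k \pi_k N(\theta \mid \mu_k, \Sigma_k)$ with NumPy and SciPy.

```python
# Minimal sketch (illustrative only): evaluate the K-component Gaussian
# mixture density p(theta) = sum_k pi_k * N(theta | mu_k, Sigma_k).
import numpy as np
from scipy.stats import multivariate_normal

def mixture_density(theta, pi, mu, Sigma):
    """theta: (p,) point; pi: (K,) nonnegative weights summing to 1;
    mu: (K, p) means; Sigma: (K, p, p) positive definite covariances."""
    return sum(pi[k] * multivariate_normal.pdf(theta, mean=mu[k], cov=Sigma[k])
               for k in range(len(pi)))

# Example: a two-component bivariate mixture.
pi = np.array([0.3, 0.7])
mu = np.array([[0.0, 0.0], [2.0, 1.0]])
Sigma = np.array([np.eye(2), 0.5 * np.eye(2)])
print(mixture_density(np.array([1.0, 0.5]), pi, mu, Sigma))
```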
The purpose of the EM algorithm is to start with an initial estimate $\Phi^{(0)}$ and iterate from $\Phi^{(r)}$ to $\Phi^{(r+1)}$ at the rth iteration, continuing the process until the desired parameters $\Phi$ are identified, such that

$$\Phi^{(r+1)} = \arg\max_{\Phi}\, Q\big(\Phi \mid \Phi^{(r)}\big)$$

where $Q(\Phi \mid \Phi^{(r)})$ is defined and calculated by the equations discussed below. This process guarantees convergence to a stationary point of the likelihood [3] [11] [12], and typically, a number of starting positions are suggested in an effort to ensure convergence to a global maximum [3].
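As a hedged illustration of this iteration and of the multiple-starting-position strategy, the sketch below organizes the outer loop; e_step, m_step, and log_likelihood are hypothetical placeholders for the calculations defined in the E-step and M-step subsections that follow, not functions provided by the software.

```python
# Illustrative outer EM loop with multiple starting positions.
# e_step, m_step, and log_likelihood are placeholders for the equations below.
import numpy as np

def run_em(y, Phi0, e_step, m_step, log_likelihood, max_iter=200, tol=1e-8):
    Phi, ll, ll_old = Phi0, -np.inf, -np.inf
    for r in range(max_iter):
        tau = e_step(y, Phi)            # responsibilities tau_ik^(r)
        Phi = m_step(y, tau)            # Phi^(r+1), the maximizer of Q(. | Phi^(r))
        ll = log_likelihood(y, Phi)     # log L(Phi^(r+1)), non-decreasing in r
        if ll - ll_old < tol:
            break
        ll_old = ll
    return Phi, ll

def multi_start_em(y, starts, e_step, m_step, log_likelihood):
    # Run EM from several starting positions and keep the best fit,
    # to reduce the risk of stopping at a local maximum.
    fits = [run_em(y, Phi0, e_step, m_step, log_likelihood) for Phi0 in starts]
    return max(fits, key=lambda fit: fit[1])
```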
The E-Step
During the Expectation Step, the function is defined as $Q(\Phi \mid \Phi^{(r)}) = E\big[\log L_c(\Phi) \mid y, \Phi^{(r)}\big]$, where the complete data likelihood $L_c(\Phi)$ is given by

$$L_c(\Phi) = \prod_{i=1}^{N} \prod_{k=1}^{K} \big[\pi_k\, p(y_i \mid \theta_i)\, N(\theta_i \mid \mu_k, \Sigma_k)\big]^{w_{ik}}$$

By using Bayes Theorem, the function can be written, up to an additive term that does not depend on $\Phi$, as [3]

$$Q\big(\Phi \mid \Phi^{(r)}\big) = \sum_{i=1}^{N} \sum_{k=1}^{K} \tau_{ik}^{(r)} \Big\{ \log \pi_k + E\big[\log N(\theta_i \mid \mu_k, \Sigma_k) \mid y_i, w_{ik} = 1, \Phi^{(r)}\big] \Big\}$$

where

$$\tau_{ik}^{(r)} = E\big[w_{ik} \mid y_i, \Phi^{(r)}\big] = P\big(w_{ik} = 1 \mid y_i, \Phi^{(r)}\big)$$

and, for some constant C,

$$E\big[\log N(\theta_i \mid \mu_k, \Sigma_k) \mid y_i, w_{ik} = 1, \Phi^{(r)}\big] = C - \tfrac{1}{2}\log\lvert\Sigma_k\rvert - \tfrac{1}{2}\, E\big[(\theta_i - \mu_k)^{T}\, \Sigma_k^{-1}\, (\theta_i - \mu_k) \mid y_i, w_{ik} = 1, \Phi^{(r)}\big]$$
Note that the probability that the ith individual belongs to the $k$th mixing component can be defined as

$$\tau_{ik}^{(r)} = \frac{\pi_k^{(r)}\, p\big(y_i \mid \mu_k^{(r)}, \Sigma_k^{(r)}\big)}{\sum_{l=1}^{K} \pi_l^{(r)}\, p\big(y_i \mid \mu_l^{(r)}, \Sigma_l^{(r)}\big)}$$
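A minimal sketch of this calculation is given below; it is illustrative only and assumes the per-subject component densities $p(y_i \mid \mu_k^{(r)}, \Sigma_k^{(r)})$ have already been evaluated and stored in a hypothetical array p_ik, since how those densities are computed depends on the model.

```python
# Illustrative E-step:
# tau_ik = pi_k * p(y_i | mu_k, Sigma_k) / sum_l pi_l * p(y_i | mu_l, Sigma_l)
import numpy as np

def responsibilities(pi, p_ik):
    """pi: (K,) mixing weights; p_ik: (N, K) per-subject component densities."""
    num = pi[np.newaxis, :] * p_ik           # pi_k * p(y_i | mu_k, Sigma_k)
    N_i = num.sum(axis=1, keepdims=True)     # denominator; does not depend on k
    return num / N_i                         # tau_ik; each row sums to 1
```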
The M-Step
In the Maximization Step, it is sufficient to find the unique solution of $\partial Q(\Phi \mid \Phi^{(r)})/\partial \Phi = 0$ such that

$$\sum_{k=1}^{K} \pi_k = 1$$

where $\Phi = \{\pi_k, \mu_k, \Sigma_k;\ k = 1, \ldots, K\}$. This leads to unique solutions [3] of $\pi_k^{(r+1)}$, $\mu_k^{(r+1)}$, and $\Sigma_k^{(r+1)}$. (See the “Solutions” section for details.)
The updating of $\pi_k$ can be calculated as the average of the contributions from each subject to the $k$th mixing component [3], i.e.,

$$\pi_k^{(r+1)} = \frac{1}{N} \sum_{i=1}^{N} \tau_{ik}^{(r)}$$
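In code, this update is just a column average of the responsibilities; the sketch below is illustrative and reuses the hypothetical tau array from the E-step sketch above.

```python
# pi_k^(r+1) = (1/N) * sum_i tau_ik^(r): the average contribution of the
# subjects to the kth mixing component.
def update_weights(tau):
    """tau: (N, K) array of responsibilities from the E-step."""
    return tau.mean(axis=0)
```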
To calculate the log of the likelihood function $L(\Phi)$ in

$$L(\Phi) = \prod_{i=1}^{N} \sum_{k=1}^{K} \pi_k\, p\big(y_i \mid \mu_k, \Sigma_k\big)$$

(discussed in “Two-Stage Nonlinear Random Effects Mixture Model”), first evaluate the denominator of $\tau_{ik}^{(r)}$, which does not depend on $k$. Define it as $N_i$ such that

$$N_i = \sum_{l=1}^{K} \pi_l^{(r)}\, p\big(y_i \mid \mu_l^{(r)}, \Sigma_l^{(r)}\big)$$

where

$$p\big(y_i \mid \mu_k, \Sigma_k\big) = \int p(y_i \mid \theta)\, N\big(\theta \mid \mu_k, \Sigma_k\big)\, d\theta$$

Once $p\big(y_i \mid \mu_k^{(r)}, \Sigma_k^{(r)}\big)$ and $N_i$ are obtained, the earlier equation:

$$\tau_{ik}^{(r)} = \frac{\pi_k^{(r)}\, p\big(y_i \mid \mu_k^{(r)}, \Sigma_k^{(r)}\big)}{\sum_{l=1}^{K} \pi_l^{(r)}\, p\big(y_i \mid \mu_l^{(r)}, \Sigma_l^{(r)}\big)}$$

can be immediately evaluated by $\tau_{ik}^{(r)} = \pi_k^{(r)}\, p\big(y_i \mid \mu_k^{(r)}, \Sigma_k^{(r)}\big)\big/ N_i$. The log of the likelihood function $L(\Phi^{(r)})$ is

$$\log L\big(\Phi^{(r)}\big) = \sum_{i=1}^{N} \log N_i$$
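A short sketch of this calculation, again using the hypothetical p_ik array of per-subject component densities from the E-step sketch:

```python
# N_i = sum_l pi_l * p(y_i | mu_l, Sigma_l);  log L(Phi) = sum_i log(N_i).
import numpy as np

def log_likelihood(pi, p_ik):
    """pi: (K,) weights; p_ik: (N, K) per-subject component densities."""
    N_i = p_ik @ pi                 # one denominator per subject
    return np.sum(np.log(N_i))
```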
The EM iterates $\Phi^{(r)}$ have the important property that the corresponding likelihoods $L(\Phi^{(r)})$ are non-decreasing, i.e., $L(\Phi^{(r+1)}) \geq L(\Phi^{(r)})$ for all r [11] [3].
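This property can also serve as a numerical sanity check on an implementation; a small hedged helper (illustrative only) that verifies a recorded sequence of log-likelihood values is non-decreasing:

```python
def check_monotone(ll_values, tol=1e-8):
    """Return True if successive log-likelihood values never decrease
    by more than a small floating-point tolerance."""
    return all(b >= a - tol for a, b in zip(ll_values, ll_values[1:]))
```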
Looking at $Q(\Phi \mid \Phi^{(r)})$ (from the E-step)

$$Q\big(\Phi \mid \Phi^{(r)}\big) = \sum_{i=1}^{N} \sum_{k=1}^{K} \tau_{ik}^{(r)} \Big\{ \log \pi_k + E\big[\log N(\theta_i \mid \mu_k, \Sigma_k) \mid y_i, w_{ik} = 1, \Phi^{(r)}\big] \Big\}$$

leads to the conclusions that

$$\frac{\partial Q\big(\Phi \mid \Phi^{(r)}\big)}{\partial \mu_k} = 0 \quad \text{and} \quad \frac{\partial Q\big(\Phi \mid \Phi^{(r)}\big)}{\partial \Sigma_k} = 0, \qquad k = 1, \ldots, K$$

Therefore, the unique solutions [3] of $\mu_k^{(r+1)}$ and $\Sigma_k^{(r+1)}$ (from the M-step) can be written as

$$\mu_k^{(r+1)} = \frac{\sum_{i=1}^{N} \tau_{ik}^{(r)}\, E\big[\theta_i \mid y_i, w_{ik} = 1, \Phi^{(r)}\big]}{\sum_{i=1}^{N} \tau_{ik}^{(r)}}$$

$$\Sigma_k^{(r+1)} = \frac{\sum_{i=1}^{N} \tau_{ik}^{(r)}\, E\big[\big(\theta_i - \mu_k^{(r+1)}\big)\big(\theta_i - \mu_k^{(r+1)}\big)^{T} \mid y_i, w_{ik} = 1, \Phi^{(r)}\big]}{\sum_{i=1}^{N} \tau_{ik}^{(r)}}$$
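The sketch below illustrates these two updates under the assumption that the per-subject conditional means and conditional covariances of $\theta_i$ within each component (hypothetical arrays theta_bar and theta_cov, however they are computed) are available together with the responsibilities tau; it is a sketch of the formulas, not the software's implementation.

```python
# mu_k^(r+1)    = sum_i tau_ik E[theta_i | y_i, k]                            / sum_i tau_ik
# Sigma_k^(r+1) = sum_i tau_ik E[(theta_i - mu_k)(theta_i - mu_k)^T | y_i, k] / sum_i tau_ik
# using E[(theta - mu)(theta - mu)^T] = Cov(theta) + (E[theta] - mu)(E[theta] - mu)^T.
import numpy as np

def update_means_covariances(tau, theta_bar, theta_cov):
    """tau: (N, K) responsibilities; theta_bar: (N, K, p) conditional means of theta_i;
    theta_cov: (N, K, p, p) conditional covariances of theta_i, per component."""
    denom = tau.sum(axis=0)                                       # (K,)
    mu = np.einsum('ik,ikp->kp', tau, theta_bar) / denom[:, None]
    diff = theta_bar - mu[None, :, :]                             # (N, K, p)
    second = theta_cov + np.einsum('ikp,ikq->ikpq', diff, diff)   # 2nd moments about new mu
    Sigma = np.einsum('ik,ikpq->kpq', tau, second) / denom[:, None, None]
    return mu, Sigma
```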
In a case described in reference [3], the parameter $\theta_i$ can be partitioned into two components $\theta_i = (\theta_{i1}, \theta_{i2})$, where
$\theta_{i1}$ is from a mixture of multivariate Gaussians.
$\theta_{i2}$ is from one single multivariate Gaussian.
The EM updates from $\Phi^{(r)}$ are then given by the same mixture-weighted forms as above for the $\theta_{i1}$ block, while the single-Gaussian block $\theta_{i2}$ is updated with one pooled mean and covariance:

$$\mu_2^{(r+1)} = \frac{1}{N} \sum_{i=1}^{N} E\big[\theta_{i2} \mid y_i, \Phi^{(r)}\big], \qquad \Sigma_2^{(r+1)} = \frac{1}{N} \sum_{i=1}^{N} E\big[\big(\theta_{i2} - \mu_2^{(r+1)}\big)\big(\theta_{i2} - \mu_2^{(r+1)}\big)^{T} \mid y_i, \Phi^{(r)}\big]$$