General linear mixed effects model

Phoenix’s Linear Mixed Effects object is based on the general linear mixed effects model, which can be represented in matrix notation as:

y = Xβ + Zγ + ε

(1)

where:

X is the matrix of known constants and the design matrix for β

β is the vector of unknown (fixed effect) parameters

γ is the unobserved random vector (the vector of random-effects parameters), normally distributed with mean zero and variance G

Z is the matrix of known constants, the design matrix for γ

ε is the random error vector, normally distributed with mean zero and variance R

γ and ε are uncorrelated. In the Linear Mixed Effects object, the Fixed Effects tab is used to specify the model terms for constructing X; the Random tabs in the Variance Structure tab specify the model terms for constructing Z and the structure of G; and the Repeated tab specifies the structure of R.

In the linear mixed effects model, if R = σ²I and Z = 0, then the LinMix model reduces to the ANOVA model. Using the assumptions for the linear mixed effects model, one can show that the variance of y is:

Var(y) = ZGZᵀ + R

(2)

If Z ≠ 0, i.e., the model includes one or more random effects, the G matrix is determined by the variance structures specified for the random models. If the model includes a repeated specification, then the R matrix is determined by the variance structure specified for the repeated model. If the model doesn’t include a repeated specification, then R is the residual variance, that is, R = σ²I.
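As a quick numeric illustration of equation (2), the following sketch builds Var(y) from small Z, G, and R matrices. The matrices are made-up examples (two subjects, two observations each), not Phoenix output.

```python
import numpy as np

# Hypothetical design: two subjects, two observations per subject.
Z = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0],
              [0.0, 1.0]])        # random-effect design matrix
G = 2.5 * np.eye(2)               # Var(gamma): one variance per subject effect
R = 0.8 * np.eye(4)               # residual variance, R = sigma^2 * I

V = Z @ G @ Z.T + R               # Var(y) = Z G Z' + R, equation (2)
print(V)
```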

It is instructive to see how a specific example fits into this framework. The fixed effects model is shown in equation 19 under “Linear mixed effects scenario”. Listing all the parameters in a vector, one obtains:

βᵀ = [μ, p1, p2, d1, d2, t1, t2]

(3)

For each element of β, there is a corresponding column in the X matrix. In this model, the X matrix is composed entirely of ones and zeros. The exact method of constructing it is discussed under “Construction of the X matrix”, below.

For each subject, there is one Sk(j). The collection can be directly entered into γ. The Z matrix will consist of ones and zeros. Element (i, j) will be one when observation i is from subject j, and zero otherwise. The variance of γ is of the form

G ≡ Var(γ) = Var{Sk(j)} In(1)+n(2)

(4)

where Ib represents a b×b identity matrix.

The parameter to be estimated for G is Var{Sk(j)}. The residual variance is:

R = Var{εijkm} = σ² I2[n(1)+n(2)]

(5)

The parameter to be estimated for R is σ².

A summary of the option tabs and the part of the model they generate follows.

Option tabs and the part of the model they generate

Tab                   Specifies             Tabs on page                Creates
Fixed Effects         Specifies X matrix                                Generates Xβ
Variance Structure    Specifies Z matrix    Random tabs (1 or more)     Generates G = Var(γ)
                                            Repeated tab (exactly 1)    Generates R = Var(ε)

Additional information is available on the following topics regarding the linear mixed effects general model:

Construction of the X matrix

A term is a variable or a product of variables. A user specifies a model by giving a sum of terms. For example, if A and B are variables in the dataset:

A+B+A*B

(6)

Linear Mixed Effects takes this sum of terms and constructs the X matrix. The way this is done depends upon the nature of the variables A and B.

By default, the first column of X contains all ones. The LinMix object references this column as the intercept. To exclude it, select the No Intercept checkbox in the Fixed Effects tab. The rules for translating terms to columns of the X matrix depend on the variable types, regressor/covariate or classification. This is demonstrated using two classification variables, Drug and Form, and two continuous variables, Age and Weight, in the “Indicator variables and product of classification with continuous variables” table. A term containing one continuous variable produces a single column in the X matrix.

An integer value is associated with every level of a classification variable. The association between levels and codes is determined by the numerical order for converted numerical variables and by the character collating sequence for converted character variables. When a single classification variable is specified for a model, an indicator variable is placed in the X matrix for each level of the variable. The “Indicator variables and product of classification with continuous variables” table demonstrates this for the variable Drug.

The model specification allows products of variables. The product of two continuous variables is the ordinary product within rows. The product of a classification variable with a continuous variable is obtained by multiplying the continuous value by each column of the indicator variables in each row. For example, the “Indicator variables and product of classification with continuous variables” table shows the product of Drug and Age. The “Two classification plus indicator variables” table shows two classification and indicator variables. The “Product of two classification variables” table gives their product.

The product operations described in the preceding paragraph can be succinctly defined using the Kronecker product. Let A=(aij) be an m×n matrix and let B=(bij) be a p×q matrix. Then the Kronecker product A ⊗ B = (aij B) is an mp × nq matrix expressible as a partitioned matrix with aij B as the (i, j)th partition, i=1,…, m and j=1,…, n.

For example, consider the fifth row of the “Indicator variables and product of classification with continuous variables” table. The product of Drug and Age is:

[0 1 0] ⊗ [45] = [0 45 0]

Now consider the fifth row of the “Two classification plus indicator variables” table. The product of Drug and Form is:

[0 1 0] ⊗ [1 0 0 0] = [0 0 0 0 1 0 0 0 0 0 0 0]

The result is the fifth row of the “Product of two classification variables” table.
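The following sketch reproduces the row-5 products above with plain numpy; the indicator vectors are copied from the tables, and np.kron is used for the Kronecker product.

```python
import numpy as np

# Row 5 of the tables: Drug = DrugB, Form = capsule, Age = 45.
drug = np.array([0.0, 1.0, 0.0])        # indicators for DrugA, DrugB, DrugC
form = np.array([1.0, 0.0, 0.0, 0.0])   # indicators for capsule, drops, syrup, tablet
age = 45.0

drug_age = age * drug                   # Drug*Age  -> [ 0. 45.  0.]
drug_form = np.kron(drug, form)         # Drug*Form -> 12 columns, a 1 in the DrugB*capsule slot
print(drug_age)
print(drug_form)
```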

Indicator variables and product of classification with continuous variables

Drug             Indicator variables         Age     Drug*Age
Name    Value    DrugA   DrugB   DrugC               DrugA   DrugB   DrugC
DrugA   0        1       0       0           48      48      0       0
DrugA   0        1       0       0           34      34      0       0
DrugA   0        1       0       0           27      27      0       0
DrugA   0        1       0       0           23      23      0       0
DrugB   1        0       1       0           45      0       45      0
DrugB   1        0       1       0           26      0       26      0
DrugB   1        0       1       0           41      0       41      0
DrugB   1        0       1       0           32      0       32      0
DrugC   2        0       0       1           38      0       0       38
DrugC   2        0       0       1           17      0       0       17
DrugC   2        0       0       1           32      0       0       32
DrugC   2        0       0       1           23      0       0       23

 

Two classification plus indicator variables

Drug            Indicators                   Form            Indicators
Name    Val.    DrugA   DrugB   DrugC        Name      Val.  capsule  drops  syrup  tablet
DrugA   0       1       0       0            capsule   0     1        0      0      0
DrugA   0       1       0       0            drops     1     0        1      0      0
DrugA   0       1       0       0            syrup     2     0        0      1      0
DrugA   0       1       0       0            tablet    3     0        0      0      1
DrugB   1       0       1       0            capsule   0     1        0      0      0
DrugB   1       0       1       0            drops     1     0        1      0      0
DrugB   1       0       1       0            syrup     2     0        0      1      0
DrugB   1       0       1       0            tablet    3     0        0      0      1
DrugC   2       0       0       1            capsule   0     1        0      0      0
DrugC   2       0       0       1            drops     1     0        1      0      0
DrugC   2       0       0       1            syrup     2     0        0      1      0
DrugC   2       0       0       1            tablet    3     0        0      0      1

Product of two classification variables

Drug     Form       Indicator
DrugA    capsule    1 0 0 0 0 0 0 0 0 0 0 0
DrugA    drops      0 1 0 0 0 0 0 0 0 0 0 0
DrugA    syrup      0 0 1 0 0 0 0 0 0 0 0 0
DrugA    tablet     0 0 0 1 0 0 0 0 0 0 0 0
DrugB    capsule    0 0 0 0 1 0 0 0 0 0 0 0
DrugB    drops      0 0 0 0 0 1 0 0 0 0 0 0
DrugB    syrup      0 0 0 0 0 0 1 0 0 0 0 0
DrugB    tablet     0 0 0 0 0 0 0 1 0 0 0 0
DrugC    capsule    0 0 0 0 0 0 0 0 1 0 0 0
DrugC    drops      0 0 0 0 0 0 0 0 0 1 0 0
DrugC    syrup      0 0 0 0 0 0 0 0 0 0 1 0
DrugC    tablet     0 0 0 0 0 0 0 0 0 0 0 1

Linear combinations of elements of beta

Most of the inference done by LinMix is done with respect to linear combinations of elements of β. Consider, as an example, the model where the expected value of the response of the jth subject on the ith treatment is:

E(yij) = μ + τi

(7)

where μ is a common parameter and τi is the effect of the ith treatment.

Suppose there are three treatments. The X matrix has a column of ones in the first position followed by three treatment indicators.

The β vector is:

βᵀ = (β0, β1, β2, β3) = (μ, τ1, τ2, τ3)

(8)

The first form is for the general representation of a model. The second form is specific to the model equation (7) above. The expression:

Σj lj βj

(9)

with the lj constant, is called a linear combination of the elements of β.

The range of the subscript j starts with zero if the intercept is in the model and starts with one otherwise. Most common functions of the parameters, such as μ+τ1, τ1 – τ3, and τ2, can be generated by choosing appropriate values of lj. Linear combinations of β are entered in LinMix through the Estimates and Contrasts tabs. A linear combination of β is said to be estimable if it can be expressed as a linear combination of expected responses. In the context of the model equation (7), a linear combination of the elements of β is estimable if it can be generated by choosing c1, c2, and c3 in:

c1(μ+τ1) + c2(μ+τ2) + c3(μ+τ3)

(10)

Note that c1=1, c2=0, and c3=0 generates μ+τ1, so μ+τ1 is estimable. A rearrangement of expression (10) is:

(c1+c2+c3)μ + c1τ1 + c2τ2 + c3τ3

(11)

The form in equation (11) makes some things more obvious. For example, lj=cj for j=1, 2, 3. Also, any linear combination involving only τ’s is estimable if, and only if, l1+l2+l3=0. A linear combination of effects, where the sum of the coefficients is zero, is called a contrast. The linear combination τ1 – τ3 is a contrast of the τ’s and thus is estimable. The linear combination τ2 is not a contrast of the τ’s and thus is not estimable.

Determining estimability numerically

The X matrix for the model in equation (7) has three distinct rows, and these are shown in the “Basis of the space orthogonal to the rows of X” table. Also shown in the same table is a vector (call it z) that is a basis of the space orthogonal to the rows of X. Let l be a vector whose elements are the lj of expression (9). The linear combination in vector notation is lᵀβ, and it is estimable if lᵀz = 0. Allowing for some rounding error in the computation of z, the linear combination is declared estimable if:

| lᵀz | / || l ||∞ < tol

(12)

where || l ||∞ is the largest absolute value of the elements of l, and tol is the Estimability Tolerance.

In general, the number of rows in the basis of the space orthogonal to the rows of X is equal to the number of linear dependencies among the columns of X. If there is more than one z, then equation (12) must be satisfied for each z.
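A small numerical sketch of this check follows. It assumes the orthogonal basis z is computed as the null space of X (any basis of the space orthogonal to the rows of X would do) and uses a made-up tolerance; the distinct rows of X are those shown in the table that follows.

```python
import numpy as np
from scipy.linalg import null_space

X = np.array([[1, 1, 0, 0],
              [1, 0, 1, 0],
              [1, 0, 0, 1]], dtype=float)   # distinct rows of X for model (7)

Zb = null_space(X)                           # columns span the space orthogonal to the rows of X
tol = 1e-8                                   # assumed estimability tolerance

def is_estimable(l, Zb, tol):
    # l'z must be (numerically) zero for every basis vector z
    return bool(np.all(np.abs(l @ Zb) / np.max(np.abs(l)) < tol))

print(is_estimable(np.array([1.0, 1.0, 0.0, 0.0]), Zb, tol))   # mu + t1  -> True
print(is_estimable(np.array([0.0, 0.0, 1.0, 0.0]), Zb, tol))   # t2 alone -> False
```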

Basis of the space orthogonal to the rows of X

                              Elements of β
                              μ     τ1    τ2    τ3
Distinct rows of X            1     1     0     0
                              1     0     1     0
                              1     0     0     1
Basis of orthogonal space    –1     1     1     1

Substitution of estimable functions for non-estimable functions

It is not always easy for a user to determine if a linear combination is estimable or not. LinMix tries to be helpful, and if a user attempts to estimate a non-estimable function, LinMix will substitute a “nearby” function that is estimable. Notice is given in the text output when the substitution is made, and it is the user’s responsibility to check the coefficients to see if they are reasonable. This section describes how the substitution is made.

The most common attempt to estimate a non-estimable function of the parameters is probably trying to estimate a contrast of main effects in a model with interaction, so this will be used as an example. Consider the model:

yijk = μ + αi + βj + (αβ)ij + εijk

(13)

where μ is the over-all mean; αi is the effect of the i-th level of a factor A, i=1, 2; βj is the effect of the j-th level of a factor B, j=1, 2; (αβ)ij is the effect of the interaction of the i-th level of A with the j-th level of B; and the εijk are independently distributed N(0, σ²). The canonical form of the QR factorization of X is displayed in rows zero through eight of the “QR factorization of X scaled to canonical form” table. The coefficients to generate α1 – α2 are shown in row 9. The process is a sequence of regressions followed by residual computations. In the μ column, row 9 is already zero, so no operation is done. Within the α columns, regress row 9 on row 1. (Row 2 is all zeros and is ignored.) The regression coefficient is one. Now calculate residuals: (row 9) – 1×(row 1). This operation goes across the entire matrix, i.e., not just the α columns. The result is shown in row 10, and the next operation applies to row 10. The numbers in the β columns of row 10 are zero, so skip to the αβ columns. Within the αβ columns, regress row 10 on row 5. (Rows 6 through 8 are all zeros and are ignored.) The regression coefficient is zero, so there is no need to compute residuals. At this point, row 10 is the deviation of row 9 from an estimable function. Subtract the deviation (row 10) from the original (row 9) to get the estimable function displayed in the last row. The result is the expected value of the difference between the marginal A means. This is likely what the user had in mind.

The following is a general description of the process just demonstrated on a specific model and linear combination of elements of β. Partition the QR factorization of X vertically and horizontally corresponding to the terms in X. Put the coefficients of a linear combination in a work row. For each vertical partition, from left to right, do the following (a short numeric sketch follows these steps):

Regress (i.e., do a least squares fit) the work row on the rows in the diagonal block of the QR factorization.

Calculate the residuals from this fit across the entire matrix. Overwrite the work row with these residuals.

Subtract the final values in the work row from the original coefficients to obtain coefficients of an estimable linear combination.
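A numeric sketch of these steps, using the rows of the table that follows (columns ordered μ, α1, α2, β1, β2, (αβ)11, (αβ)12, (αβ)21, (αβ)22), is:

```python
import numpy as np

row1 = np.array([0, 1, -1, 0, 0,  0.5,  0.5, -0.5, -0.5])   # alpha-block diagonal row
row5 = np.array([0, 0,  0, 0, 0,  1.0, -1.0, -1.0,  1.0])   # alpha*beta-block diagonal row
orig = np.array([0, 1, -1, 0, 0,  0.0,  0.0,  0.0,  0.0])   # row 9: coefficients of a1 - a2
work = orig.copy()

a_cols = slice(1, 3)                                        # alpha columns
b = (work[a_cols] @ row1[a_cols]) / (row1[a_cols] @ row1[a_cols])     # regression coefficient = 1
work = work - b * row1                                      # row 10: deviation from estimability

ab_cols = slice(5, 9)                                       # alpha*beta columns
b = (work[ab_cols] @ row5[ab_cols]) / (row5[ab_cols] @ row5[ab_cols]) # = 0, nothing to subtract
work = work - b * row5

estimable = orig - work                                     # substituted estimable function
print(estimable)   # [ 0.  1. -1.  0.  0.  0.5  0.5 -0.5 -0.5]
```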

QR factorization of X scaled to canonical form

 

                μ     α1    α2    β1    β2    (αβ)11   (αβ)12   (αβ)21   (αβ)22
0               1     0.5   0.5   0.5   0.5   0.25     0.25     0.25     0.25
1                     1     –1    0     0     0.5      0.5      –0.5     –0.5
2                           0     0     0     0        0        0        0
3                                 1     –1    0.5      –0.5     0.5      –0.5
4                                       0     0        0        0        0
5                                             1        –1       –1       1
6                                                      0        0        0
7                                                               0        0
8                                                                        0
9               0     1     –1    0     0     0        0        0        0
10              0     0     0     0     0     –0.5     –0.5     0.5      0.5
Final result    0     1     –1    0     0     0.5      0.5      –0.5     –0.5

Fixed effects

The specification of the fixed effects (as defined in “General linear mixed effects model” introduction) can contain both classification variables and continuous regressors. In addition, the model can include interaction and nested terms.

Output parameterization

Crossover example continued

Output parameterization

Suppose X is n×p, where n is the number of observations and p is the number of columns. If X has rank p, denoted rank(X), then each parameter can be uniquely estimated. If rank(X) < p, then there are infinitely many solutions. To solve this issue, one must impose additional constraints on the estimator of β.

Suppose column j is in the span of columns 0, 1,…, j – 1. Then column j provides no additional information about the response y beyond that of the columns that come before. In this case, it seems sensible to not use the column in the model fitting. When this happens, its parameter estimate is set to zero.

As an example, consider a simple one-way ANOVA model, with three levels. The design matrix is:

X0   X1   X2   X3
1    1    0    0
1    0    1    0
1    0    0    1
1    1    0    0
1    0    1    0
1    0    0    1

X0 is the intercept column. X1, X2, and X3 correspond to the three levels of the treatment effect. X0 and X1 are clearly linearly independent, as is X2 linearly independent of X0 and X1. But X3 is not linearly independent of X0, X1, and X2, since X3=X0 – X1 – X2. Hence β3 would be set to zero. The degrees of freedom for an effect are the number of columns remaining after this process. In this example, the treatment effect has two degrees of freedom, the number typically assigned in ANOVA.

See “Residuals worksheet and predicted values” for computation details.

Singularity tolerance 

If an intercept is in the model, the data are first centered. A Gram-Schmidt orthogonalization process is then performed. If the norm of a vector after Gram-Schmidt, divided by the norm of the original vector, is less than the singularity tolerance δ, then the vector is called singular.
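A sketch of this check is below. The handling of the intercept column (skipping it and centering the remaining columns) is an assumed reading of the rule above, and δ is given a made-up value.

```python
import numpy as np

def singular_columns(X, delta=1e-10, intercept=True):
    # Center the non-intercept columns when an intercept is present, then
    # Gram-Schmidt them in order; a column whose residual norm falls below
    # delta times its original norm is flagged as singular.
    Xw = (X[:, 1:] - X[:, 1:].mean(axis=0)) if intercept else X.astype(float)
    basis, flags = [], []
    for j in range(Xw.shape[1]):
        v = Xw[:, j].astype(float)
        norm0 = np.linalg.norm(v)
        for q in basis:
            v = v - (q @ v) * q
        if norm0 == 0 or np.linalg.norm(v) / norm0 < delta:
            flags.append(True)          # singular: its estimate would be set to zero
        else:
            basis.append(v / np.linalg.norm(v))
            flags.append(False)
    return flags

X = np.array([[1, 1, 0, 0], [1, 0, 1, 0], [1, 0, 0, 1],
              [1, 1, 0, 0], [1, 0, 1, 0], [1, 0, 0, 1]], dtype=float)
print(singular_columns(X))   # [False, False, True] for X1, X2, X3
```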

Transformation of the response 

The Dependent Variables Transformation menu provides three options: No transformation, ln transformation (log base e), or log transformation (log base 10). Transformations are applied to the response before model fitting. Note that all estimates using log will be the same estimates found using ln, only scaled by ln 10, since log(x)=ln(x)/ln(10).

If the standard deviation is proportional to the mean, taking the logarithm often stabilizes the variance, resulting in a better fit. This implicitly assumes that the error structure is multiplicative with a lognormal distribution.

Crossover example continued

The fixed effects are given in equation (9). Using that model, the design matrix X would be expanded into that presented in the “Design matrix expanded for fixed effects model Sequence+Period+Treatment” table. The vector of fixed effects corresponding to that design matrix would be:

β = (β0, β1, β2, β3, β4, β5, β6).

Notice that X2=X0 – X1. Hence set β2=0. Similarly, X4=X0 – X3 and X6=X0 – X5, and hence set β4=0 and β6=0.
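These dependencies can be checked numerically; the sketch below hard-codes the 24 rows of the design matrix shown in the table that follows.

```python
import numpy as np

X = np.array([1, 1, 0, 1, 0, 1, 0] * 6 +     # sequence AB, period 1, treatment A
             [1, 0, 1, 0, 1, 1, 0] * 6 +     # sequence BA, period 2, treatment A
             [1, 0, 1, 1, 0, 0, 1] * 6 +     # sequence BA, period 1, treatment B
             [1, 1, 0, 0, 1, 0, 1] * 6,      # sequence AB, period 2, treatment B
             dtype=float).reshape(24, 7)

print(np.linalg.matrix_rank(X))                    # 4, not 7
print(np.allclose(X[:, 2], X[:, 0] - X[:, 1]))     # X2 = X0 - X1
print(np.allclose(X[:, 4], X[:, 0] - X[:, 3]))     # X4 = X0 - X3
print(np.allclose(X[:, 6], X[:, 0] - X[:, 5]))     # X6 = X0 - X5
```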

Design matrix expanded for fixed effects model Sequence+Period+Treatment

Intercept   Sequence              Period              Treatment
X0          AB [X1]   BA [X2]     1 [X3]    2 [X4]    A [X5]    B [X6]
1           1         0           1         0         1         0
1           1         0           1         0         1         0
1           1         0           1         0         1         0
1           1         0           1         0         1         0
1           1         0           1         0         1         0
1           1         0           1         0         1         0
1           0         1           0         1         1         0
1           0         1           0         1         1         0
1           0         1           0         1         1         0
1           0         1           0         1         1         0
1           0         1           0         1         1         0
1           0         1           0         1         1         0
1           0         1           1         0         0         1
1           0         1           1         0         0         1
1           0         1           1         0         0         1
1           0         1           1         0         0         1
1           0         1           1         0         0         1
1           0         1           1         0         0         1
1           1         0           0         1         0         1
1           1         0           0         1         0         1
1           1         0           0         1         0         1
1           1         0           0         1         0         1
1           1         0           0         1         0         1
1           1         0           0         1         0         1

Variance structure

The LinMix Variance Structure tab specifies the Z, G, and R matrices defined in the “General linear mixed effects model” introduction. The user may have zero, one, or more random effects specified. Only one repeated effect may be specified. The Repeated effect specifies R = Var(ε). The Random effects specify Z and the corresponding elements of G = Var(γ).

Repeated effect

Random effects

Multiple random effects vs. multiple effects on one random effect

Covariance structure types in the Linear Mixed Effects object

Repeated effect

The repeated effect is used to model a correlation structure on the residuals. Specifying the repeated effect is optional. If no repeated effect is specified, then R = σ² IN is used, where N denotes the number of observations, and IN is the N × N identity matrix.

All variables used on this tab of the Linear Mixed Effects object must be classification variables.

To specify a particular repeated effect, one must have a classification model term that uniquely identifies each individual observation. Put the model term in the Repeated Specification field. Note that a model term can be a variable name, an interaction term (e.g., Time*Subject), or a nested term (e.g., Time(Subject)).

The variance blocking model term creates a block diagonal R matrix. Suppose the variance blocking model term has b levels, and further suppose that the data are sorted according to the variance blocking variable. Then R would have the form:

R = diag(Σ, Σ, …, Σ)    (b diagonal blocks)

(14)

where Σ is the variance structure specified.

The variance Σ is specified using the Type pull-down menu. Several variance structures are possible, including unstructured, autoregressive, heterogeneous compound symmetry, and no-diagonal factor analytic. See the Help file for the details of the variance structures. The autoregressive type is a first-order autoregressive model.

To model heterogeneity of variances, use the group variable. Group will accept any model term. The effect is to create additional parameters of the same variance structure. If a group variable has levels g=1, 2,…, ng, then the variance for observations within group g will be Σg.

An example will make this easier to understand. Suppose five subjects are randomly assigned to each of three treatment groups; call the treatment groups T1, T2, and T3. Suppose further that each subject is measured in periods t=0, 1, 2, 3, and that serial correlations among the measurements are likely. Suppose further that this correlation is affected by the treatment itself, so the correlation structure will differ among T1, T2, and T3. Without loss of generality, one may assume that the data are sorted by treatment group, then subject within treatment group.

First consider the effect of the group. It produces a variance structure that looks like:

R = diag(R1, R2, R3)

where each element of R is a 20×20 block (five subjects per treatment group, each measured four times). Each Rg has the same form. Because the variance blocking variable is specified, the form of each Rg is:

Rg = I5 ⊗ Σg

(15)

I5 is used because there are five subjects within each treatment group. Within each subject, the variance structure specified is:

Σg = σg² ×  [ 1      ρg     ρg²    ρg³
              ρg     1      ρg     ρg²
              ρg²    ρg     1      ρg
              ρg³    ρg²    ρg     1   ]

This structure is the autoregressive variance type. Other variance types are also possible. Often compound symmetry will effectively mimic autoregressive in cases where autoregressive models fail to converge.

The output will consist of six parameters: σg² and ρg for each of the three treatment groups.
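A sketch of this blocked R matrix follows; the σ² and ρ values are made-up numbers, and the AR(1) block is written out directly.

```python
import numpy as np
from scipy.linalg import block_diag

def ar1(sigma2, rho, n=4):
    # First-order autoregressive block for n serial measurements
    t = np.arange(n)
    return sigma2 * rho ** np.abs(np.subtract.outer(t, t))

# One (sigma^2, rho) pair per treatment group -- six parameters in total.
params = [(1.0, 0.5), (2.0, 0.3), (1.5, 0.7)]
R_groups = [np.kron(np.eye(5), ar1(s2, r)) for s2, r in params]   # I5 (x) Sigma_g, eq. (15)
R = block_diag(*R_groups)                                          # 60 x 60 overall R
print(R.shape)
```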

Random effects

Unlike the (single) repeated effect, it is possible to specify up to 10 random effects. Additional random effects can be added by clicking Add Random. To delete a previously specified random effect, click Delete Random. It is not possible to delete all random effects; however, if no entries are made in the random effect, then it will be ignored.

Model terms entered into the Random Effects model produce columns in the Z matrix, constructed in the same way that columns are produced in the X matrix. It is possible to put more than one model term in the Random Effects Model, but each random page will correspond to a single variance type. The variance matrix appropriate for that type is sized according to the number of columns produced in that random effect.

The Random Intercept checkbox puts an intercept column (that is, all ones) into the Z matrix.

The Variance Blocking Variables field has the effect of inducing independence among the levels of the variance blocking variables. Mathematically, it expands the Z matrix by taking the Kronecker product of the variance blocking variable columns with the columns produced by the intercept checkbox and the Random Effects Model terms.
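As a small sketch of the Kronecker expansion just described, suppose a random intercept is blocked by a two-level subject variable (four observations, two per subject; the data layout is hypothetical).

```python
import numpy as np

intercept = np.ones((4, 1))              # column from the Random Intercept checkbox
subject = np.array([[1, 0],              # variance-blocking indicators: Subject 1 / Subject 2
                    [1, 0],
                    [0, 1],
                    [0, 1]], dtype=float)

# Row-wise Kronecker product: one copy of the random-effect columns per block level
Z = np.einsum('ij,ik->ijk', subject, intercept).reshape(4, -1)
print(Z)   # the result is simply a separate intercept column for each subject
```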

The Group model term is used to model heterogeneity of the variances among the levels of the Group model term, similarly to Repeated Effects.

Multiple random effects vs. multiple effects on one random effect

Suppose one has two random effect variables, R1 and R2. Suppose further that R1 has three levels, A, B, and C, while R2 has two levels, Y and Z. The hypothetical data are shown as follows.

R1   R2
A    Y
B    Y
C    Y
A    Z
B    Z
C    Z

Specifying a random effect of R1+R2 will produce different results than specifying R1 on Random 1 and R2 on Random 2 pages. This example models the variance using the compound symmetry variance type. To illustrate this, build the resulting Z and G matrices.

For R1+R2 on Random 1 page, put the columns of each variable in the Z matrix to get the following.

R1              R2
A    B    C     Y    Z
1    0    0     1    0
0    1    0     1    0
0    0    1     1    0
1    0    0     0    1
0    1    0     0    1
0    0    1     0    1

The random effects can be placed in a vector γ = (γ1, γ2, γ3, γ4, γ5). R1 and R2 share the same variance matrix G, where G is a 5×5 compound-symmetry matrix: a common variance on the diagonal and a common covariance on every off-diagonal element.

Now, consider the case where R1 is placed on Random 1 tab, and R2 is placed on Random 2 tab. For R1, the Z matrix columns are:

R1
A    B    C
1    0    0
0    1    0
0    0    1
1    0    0
0    1    0
0    0    1

and the corresponding G is a 3×3 compound-symmetry matrix (common variance on the diagonal, common covariance off the diagonal).

For R2, the columns in the Z matrix and the corresponding G are:

R2
Y    Z
1    0
1    0
1    0
0    1
0    1
0    1

and the corresponding G is a 2×2 compound-symmetry matrix.

In this case, G is block diagonal, with the 3×3 compound-symmetry block for R1 and the 2×2 compound-symmetry block for R2 on the diagonal, each with its own variance and covariance parameters.
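The contrast between the two specifications can be sketched with a generic compound-symmetry matrix (equal diagonal entries, equal off-diagonal entries); the numeric values below are made up.

```python
import numpy as np
from scipy.linalg import block_diag

def cs(n, var, cov):
    # Generic compound-symmetry matrix: 'var' on the diagonal, 'cov' elsewhere
    return cov * np.ones((n, n)) + (var - cov) * np.eye(n)

G_shared   = cs(5, 2.0, 0.8)                               # R1+R2 on one Random tab: one 5x5 block
G_separate = block_diag(cs(3, 2.0, 0.8), cs(2, 1.5, 0.4))  # R1 and R2 on separate tabs
print(G_shared.shape, G_separate.shape)
```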

Covariance structure types in the Linear Mixed Effects object

The Type menu in the Variance Structure tab allows users to select a variety of covariance structures. The variances of the random-effects parameters become the covariance parameters for this model. Mixed linear models contain both fixed-effects and random-effects parameters. See “Variance structure” for an explanation of variance in linear mixed effects models, and see “Covariance structure types” in the Bioequivalence section for descriptions of the covariance structures.

Contrasts

A contrast is a linear combination of the parameters associated with a term where the coefficients sum to zero. Contrasts facilitate the testing of linear combinations of parameters. For example, consider a completely randomized design with four treatment groups: Placebo, Low Dose, Medium Dose, and High Dose. The user may wish to test the hypothesis that the average of the dosed groups is the same as the average of the placebo group. One could write this hypothesis as:

H0: μPlacebo – (μLow + μMedium + μHigh)/3 = 0
or equivalently
3μPlacebo – (μLow + μMedium + μHigh) = 0

(16)

Both of these are contrasts since the sum of the coefficients is zero. Note that both contrasts make the same statement. This is true generally: contrasts are invariant under changes in scaling.

Contrasts are tested using the Wald (1943) test statistic:

F = (Lβ̂)ᵀ [ L (Xᵀ V̂⁻¹ X)⁻ Lᵀ ]⁻¹ (Lβ̂) / rank(L)

(17)

where:

β̂ is the estimator of β.

V̂ is the estimator of V.

L must be a matrix such that each row is in the row space of X.

This requirement is enforced in the wizard. The Wald statistic is compared to an F distribution with rank(L) numerator degrees of freedom and denominator degrees of freedom estimated by the program or supplied by the user.
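A sketch of the Wald test for the contrast in equation (16) follows, using made-up data and taking V̂ = σ̂²I (a purely fixed effects fit) for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)
groups = np.repeat([0, 1, 2, 3], 5)                      # Placebo, Low, Medium, High
X = np.column_stack([np.ones(20), np.eye(4)[groups]])    # intercept + treatment indicators
y = rng.normal(size=20) + np.array([0.0, 1.0, 1.0, 1.0])[groups]

beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta
sigma2 = resid @ resid / (20 - np.linalg.matrix_rank(X))
C = sigma2 * np.linalg.pinv(X.T @ X)                     # estimated Var(beta-hat)

L = np.array([[0.0, -1.0, 1/3, 1/3, 1/3]])               # dosed average vs placebo
F = (L @ beta) @ np.linalg.pinv(L @ C @ L.T) @ (L @ beta) / np.linalg.matrix_rank(L)
print(float(F))                                          # compare to F(rank(L), ddf)
```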

Joint versus single contrasts

Nonestimability of contrasts

Degrees of freedom

Other options

Joint versus single contrasts

Joint contrasts are constructed when the number of columns for the contrast is changed from the default value of one to a larger number. Note that this number must be no more than one less than the number of distinct levels of the model term of interest. The hypotheses represented by the columns are then tested jointly, i.e., in a single test with a predetermined level.

A second approach would be to put the same term in the neighboring contrast, with each contrast having one column. This approach will produce two separate tests, each of which is at the specified level.

To see the difference, compare Placebo to Low Dose and Placebo to Medium Dose. Using the first approach, enter the coefficients as follows.

 

          Contrast #1
Title     Joint tests
Effect    Treatment
Contrast  Treatment
#1        Placebo        –1    –1
#2        Low Dose        1     0
#3        Medium Dose     0     1
#4        High Dose       0     0

This will produce one test, testing the following hypothesis.

H0: [ –1  1  0  0 ] β = [ 0 ]
    [ –1  0  1  0 ]     [ 0 ]

Now compare this with putting the second contrast in its own contrast.

 

          Contrast #1                 Contrast #2
Title     Low vs. Placebo             Medium vs. Placebo
Effect    Treatment                   Treatment
Contrast  Treatment                   Treatment
#1        Placebo        –1           Placebo        –1
#2        Low Dose        1           Low Dose        0
#3        Medium Dose     0           Medium Dose     1
#4        High Dose       0           High Dose       0

This produces two tests:

H01: [–1 1 0 0] β = [0]          H02: [–1 0 1 0] β = [0]

Note that this method inflates the overall Type I error rate to approximately 2α, whereas the joint test maintains the overall Type I error rate at α, at the possible expense of some power.

Nonestimability of contrasts

If a contrast is not estimable, then an estimable function will be substituted for the nonestimable function. See “Substitution of estimable functions for non-estimable functions” for the details.

Degrees of freedom

The User specified degrees of freedom checkbox, available on the Contrasts, Estimates, and Least Squares Means tabs, allows the user to control the denominator degrees of freedom in the F approximation. The numerator degrees of freedom are always calculated as rank(L). If the checkbox is not selected, then the default denominator degrees of freedom will be used. The default calculation method for the degrees of freedom is controlled on the General Options tab and is initially set to Satterthwaite.

Note: For a purely fixed effects model, Satterthwaite and residual degrees of freedom yield the same number. For details of the Satterthwaite approximation, see “Satterthwaite approximation for degrees of freedom”.

To override the default degrees of freedom, select the checkbox and type an appropriate number in the field. The degrees of freedom must be greater than zero.

Other options

To control the estimability tolerance, enter an appropriate tolerance in the Estimability Tolerance field. For more information see “Substitution of estimable functions for non-estimable functions”.

The Show Coefficients checkbox produces extra output, including the coefficients used to construct the contrast. If the contrast entered is estimable, the output repeats the contrasts entered. If the contrasts are not estimable, it shows the estimable function used instead of the nonestimable function.

Estimates

The Estimates tab produces estimates output instead of contrasts. As such, the coefficients need not sum to zero. Additionally, multiple model terms may be included in a single estimate.

Unlike contrasts, estimates do not support joint estimates. The intervals are marginal. The level must be between zero and 100. Note that it is entered as a percent: Entering 0.95 will yield a very narrow interval with coverage just less than 1%.

All other options act similarly to the corresponding options in the Contrasts tab.

Least squares means

Sometimes a user wants to see the mean response for each level of a classification variable or for each combination of levels for two or more classification variables. If there are covariables with unequal means for the different levels, or if there are unbalanced data, the subsample means are not estimates that can be validly compared. This section describes least squares means, statistics that make proper adjustments for unequal covariable means and unbalanced data in the linear mixed effects model.

Consider a completely randomized design with a covariable. The model is:

yij = αi + xij β + εij

(18)

where yij is the observed response on the jth individual on the ith treatment; αi is the intercept for the ith treatment; xij is the value of the covariable for the jth individual in the ith treatment; β is the slope with respect to x; and εij is a random error with zero expected value. Suppose there are two treatments; the average of the x1j is 5; and the average of the x2j is 15. The respective expected values of the sample means are α1+5β and α2+15β. These are not comparable because of the different coefficients of β. Instead, one can estimate α1+10β and α2+10β, where the overall mean of the covariable is used in each linear combination.

Now consider a 2 × 2 factorial design. The model is:

yijk = μ + αi + βj + εijk

(19)

where yijk is the observed response on the kth individual on the ith level of factor A and the jth level of factor B; μ is the over-all mean; αi is the effect of the ith level of factor A; βj is the effect of the jth level of factor B; and εijk is a random error with zero expected value. Suppose there are six observations for the combinations where i=j and four observations for the combinations where i ≠ j. The respective expected values of the averages of all values on level 1 of A and the averages of all values on level 2 of A are μ+(0.6β1+0.4β2)+α1 and μ+(0.4β1+0.6β2)+α2. Thus, sample means cannot be used to compare levels of A because they contain different functions of β1 and β2. Instead, one compares the linear combinations μ+(β1+β2)/2+α1 and μ+(β1+β2)/2+α2.

The preceding examples constructed linear combinations of parameters, in the presence of unbalanced data, that represent the expected values of sample means in balanced data. This is the idea behind least squares means. Least squares means are given in the context of a defining term, though the process can be repeated for different defining terms for the same model. The defining term must contain only classification variables and it must be one of the terms in the model. Treatment is the defining term in the first example, and factor A is the defining term in the second example. When the user requests least squares means, LinMix automatically generates the coefficients lj of expression (9) and processes them almost as it would process the coefficients specified in an estimate statement. This section describes the generation of linear combinations of elements of β that represent least squares means. A set of coefficients is created for each combination of levels of the classification variables in the defining term. For all variables in the model, but not in the defining term, average values of the variables are the coefficients. The average value of a numeric variable (covariable) is the average for all cases used in the model fitting. For a classification variable with k levels, the average of each indicator variable is taken to be 1/k. The value 1/k would be the actual average if the data were balanced. The values of all variables in the model have now been defined. If some terms in the model are products, the products are formed using the same rules used for constructing rows of the X matrix, as described in the “Fixed effects” section. It is possible that some least squares means are not estimable.

For example, suppose the fixed portion of the model is:

Drug+Form+Age+Drug*Form.

To get means for each level of Drug, the defining term is Drug. Since Drug has three levels, three sets of coefficients are created. Consider building the coefficients associated with the first level of Drug, DrugA. The first coefficient is one, for the implied intercept. The next three coefficients are 1, 0, and 0, the indicator variables associated with DrugA. Form is not in the defining term, so average values are used: the next four coefficients are all 0.25, the average of the indicator variables for a four-level factor with balanced data. The next coefficient is 32.17, the average of Age. The next twelve elements are:

(0.25, 0.25, 0.25, 0.25, 0, 0, 0, 0, 0, 0, 0, 0)

The final result is shown in the DrugA column in the following table. The results for DrugB and DrugC are also shown in the table. No new principles would be illustrated by finding the coefficients for the Form least squares means. The coefficients for the Drug*Form least squares means would be like representative rows of X except that Age would be replaced by the average of Age.
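A sketch assembling the DrugA column of the table below (the Age mean of 32.17 is taken from the text; the level ordering matches the table):

```python
import numpy as np

drug = np.array([1.0, 0.0, 0.0])          # own level: DrugA
form = np.full(4, 0.25)                   # Form is not in the defining term -> 1/k per level
age_mean = 32.17                          # average of Age over the fitted cases

l_drugA = np.concatenate([[1.0],          # implied intercept
                          drug,           # Drug indicators
                          form,           # Form averages
                          [age_mean],     # covariate at its mean
                          np.kron(drug, form)])   # Drug*Form, built by the X-matrix product rule
print(np.round(l_drugA, 2))               # matches the lj DrugA column (beta0 ... beta20)
```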

Coefficients of least squares means

Column of X                      β      lj DrugA   lj DrugB   lj DrugC
Intercept                        β0     1          1          1
Drug         DrugA               β1     1          0          0
             DrugB               β2     0          1          0
             DrugC               β3     0          0          1
Form         capsule             β4     0.25       0.25       0.25
             drops               β5     0.25       0.25       0.25
             syrup               β6     0.25       0.25       0.25
             tablet              β7     0.25       0.25       0.25
Age                              β8     32.17      32.17      32.17
Drug*Form    DrugA*capsule       β9     0.25       0          0
             DrugA*drops         β10    0.25       0          0
             DrugA*syrup         β11    0.25       0          0
             DrugA*tablet        β12    0.25       0          0
             DrugB*capsule       β13    0          0.25       0
             DrugB*drops         β14    0          0.25       0
             DrugB*syrup         β15    0          0.25       0
             DrugB*tablet        β16    0          0.25       0
             DrugC*capsule       β17    0          0          0.25
             DrugC*drops         β18    0          0          0.25
             DrugC*syrup         β19    0          0          0.25
             DrugC*tablet        β20    0          0          0.25

Comparing the Linear Mixed Effects object to SAS PROC MIXED

The following table outlines differences between LinMix and PROC MIXED in SAS.

Degrees of freedom
    LinMix: Uses the Satterthwaite approximation for all degrees of freedom, by default. The user can also select Residual.
    PROC MIXED: Chooses different defaults based on the model, random, and repeated statements. To make LinMix and SAS consistent, set the SAS ‘ddfm=satterth’ option.

Multiple factors on the same random command
    LinMix: Effects associated with the matrix columns from a single random statement are assumed to be correlated per the variance specified. Regardless of variance structure: if random effects are correlated, put them on the same random tab; if independent, put them on separate tabs.
    PROC MIXED: SAS behaves similarly, except that it does not follow this rule in the case of Variance Components, so LinMix and SAS will differ for models with multiple terms on a random statement using the VC structure.

REML
    LinMix: REML (Restricted Maximum Likelihood) estimation, as defined in Corbeil and Searle (1976), does maximum likelihood estimation on the residual vector. This is how it is done in LinMix.
    PROC MIXED: By default, SAS modifies the REML by ‘profiling out’ residual variance. To be consistent with LinMix, set the ‘noprofile’ option on the PROC MIXED statement. In particular, the LinMix –log(likelihood) may not equal twice the SAS ‘–2 Res Log Like’ if noprofile is not set.

Akaike and Schwarz’s criteria
    LinMix: Uses the smaller-is-better form of AIC and SBC in order to match Phoenix.
    PROC MIXED: Uses the bigger-is-better form. In addition, AIC and SBC are functions of the likelihood, and the likelihood calculation method in LinMix also differs from SAS.

Saturated variance structure
    LinMix: If the random statement saturates the variance, the residual is not estimable. LinMix sets the residual variance to zero. A user-specified variance structure takes priority over that supplied by LinMix.
    PROC MIXED: SAS sets the residual to some other number. The SAS residual needs to be added to the other SAS variance parameter estimates in order to match the LinMix output.

Convergence criteria
    LinMix: The convergence criterion is gᵀH⁻¹g < tolerance with H positive definite, where g is the gradient and H is the Hessian of the negative log-likelihood.
    PROC MIXED: The default convergence criterion options in SAS are ‘convh’ and ‘relative’. The LinMix criterion is equivalent to SAS with the ‘convh’ and ‘absolute’ options set.

Modified likelihood
    LinMix: Always tries to keep the final Hessian (and hence the variance matrix) positive definite. It will determine when the variance is over-specified, and set some parameters to zero.
    PROC MIXED: SAS allows the final Hessian to be nonpositive definite, and hence will produce different final parameter estimates. This often occurs when the variance is over-specified (i.e., too many parameters in the model).

Keeping VC positive
    LinMix: For the VC structure, LinMix keeps the final variance matrix positive definite, but will not force individual variances > 0.
    PROC MIXED: If SAS estimates a negative variance component, it outputs 0.

Partial tests of model effects
    LinMix: The partial tests in LinMix are not equivalent to the Type III method in SAS, though they coincide in most situations. See “Partial Tests worksheet”.
    PROC MIXED: The Type III results can change depending on how one specifies the model. For details, consult the SAS manual.

Not estimable parameters
    LinMix: If a variable is not estimable, LinMix outputs ‘Not estimable.’ LinMix provides the option to write out 0 instead.
    PROC MIXED: SAS writes the value of 0.

ARH, CSH variance structures
    LinMix: Uses the standard deviations for the parameters, and prints the standard deviations in the output.
    PROC MIXED: SAS prints the variances in the output for final parameters, though the model is expressed in terms of standard deviations.

CS variance structure
    LinMix: Does not estimate a Residual parameter, but includes it in the csDiag parameter.
    PROC MIXED: SAS estimates Residual separately. csDiag in WNL = Variance + Residual in SAS.

UN variance structure
    LinMix: Does not estimate a Residual parameter, but includes it in the diagonal term of the UN matrix.
    PROC MIXED: SAS estimates Residual separately. UN(1,1) in WNL = UN(1,1) + Residual in SAS, etc. (If a weighting variable is used, the Residual is multiplied by 1/weight.)

Repeated without any time variable
    LinMix: Requires an explicit repeated model. To run some SAS examples, a time variable would need to be added.
    PROC MIXED: SAS doesn’t require an explicit repeated model. It instead creates an implicit time field and assumes the data are sorted appropriately.

Group as classification variable
    LinMix: Requires variables used for the subject and group options to be declared as classification variables.
    PROC MIXED: SAS allows regressors, but requires the data to be sorted and considers each new regressor value to indicate a new subject or group. Classification variables are treated similarly.

Singularity tolerance
    LinMix: Define dii as the i-th diagonal element of the QR factorization of X, and Dii as the sum of squares of the i-th column of the QR factorization. The i-th column of X is deemed linearly dependent on the preceding columns if dii² < (singularity tolerance)×Dii. If an intercept is present, the first row of Dii is omitted.
    PROC MIXED: Define dii as the diagonal element of the XᵀX matrix during the sweeping process. If the absolute diagonal element |dii| < (singularity tolerance)×Dii, where Dii is the original diagonal of XᵀX, then the parameter associated with the column is set to zero, and the parameter estimates are flagged as biased.

References

Allen and Cady (1982). Analyzing Experimental Data by Regression. Lifetime Learning Publications, Belmont, CA.

Corbeil and Searle (1976). Restricted maximum likelihood (REML) estimation of variance components in the mixed model. Technometrics 18:31–38.

Fletcher (1980). Practical Methods of Optimization, Vol. 1: Unconstrained Optimization. John Wiley & Sons, New York.

Giesbrecht and Burns (1985). Two-stage analysis based on a mixed model: large sample asymptotic theory and small-sample simulation results. Biometrics 41:477–486.

Gill, Murray and Wright (1981). Practical Optimization. Academic Press.

Wald (1943). Tests of statistical hypotheses concerning several parameters when the number of observations is large. Transactions of the American Mathematical Society 54.

 

