
Structural Equation Modeling: A Multidisciplinary Journal
Publisher: Psychology Press (Informa Ltd, registered in England and Wales, registered number 1072954; registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK)
Publication details, including instructions for authors and subscription information: http://www.informaworld.com/smpp/title~content=t775653699

Confirmatory Factor Analysis of Ordinal Variables With Misspecified Models
Fan Yang-Wallentin (a), Karl G. Jöreskog (a), Hao Luo (b)
(a) Uppsala University and Norwegian School of Management; (b) Uppsala University

Online publication date: 08 July 2010

To cite this article: Yang-Wallentin, Fan, Jöreskog, Karl G., and Luo, Hao (2010). 'Confirmatory Factor Analysis of Ordinal Variables With Misspecified Models', Structural Equation Modeling: A Multidisciplinary Journal, 17: 3, 392–423.
To link to this article: DOI 10.1080/10705511.2010.489003, URL http://dx.doi.org/10.1080/10705511.2010.489003


Structural Equation Modeling, 17:392–423, 2010
Copyright Taylor & Francis Group, LLC
ISSN: 1070-5511 print/1532-8007 online
DOI: 10.1080/10705511.2010.489003

Confirmatory Factor Analysis of Ordinal Variables With Misspecified Models

Fan Yang-Wallentin and Karl G. Jöreskog
Uppsala University and Norwegian School of Management

Hao Luo
Uppsala University

Ordinal variables are common in many empirical investigations in the social and behavioral sciences. Researchers often apply the maximum likelihood method to fit structural equation models to ordinal data. This assumes that the observed measures have normal distributions, which is not the case when the variables are ordinal. A better approach is to use polychoric correlations and fit the models using methods such as unweighted least squares (ULS), maximum likelihood (ML), weighted least squares (WLS), or diagonally weighted least squares (DWLS). In this simulation evaluation we study the behavior of these methods in combination with polychoric correlations when the models are misspecified. We also study the effect of model size and number of categories on the parameter estimates, their standard errors, and the common chi-square measures of fit when the models are both correct and misspecified. When used routinely, these methods give consistent parameter estimates but ULS, ML, and DWLS give incorrect standard errors. Correct standard errors can be obtained for these methods by robustification using an estimate of the asymptotic covariance matrix W of the polychoric correlations. When used in this way the methods are here called RULS, RML, and RDWLS.

Structural equation modeling (SEM) is widely used in the social and behavioral sciences, and within this area, confirmatory factor analysis (CFA) is the most common type of analysis.

CFA was originally developed for continuous variables using the maximum likelihood (ML) method, which assumes that the observed variables have a multivariate normal distribution (see, e.g., Jöreskog, 1969). For continuous nonnormal variables, Browne (1984) developed an asymptotically distribution free (ADF) method, which is a weighted least squares (WLS) method using the inverse of the asymptotic covariance matrix W of the sample variances and covariances as a weight matrix.

The variables used in many empirical studies in the social and behavioral sciences are often ordinal rather than continuous. Observations on an ordinal variable are assumed to represent responses to a set of ordered categories such as a five-category Likert scale. This is typical when data are collected through questionnaires. Although a question might be designed to measure a theoretical concept, the observed responses are only a discrete realization of a small number of categories.

Correspondence should be addressed to Fan Yang-Wallentin, Uppsala University, Department of Statistics, S-751 20 Uppsala, Sweden. E-mail: [email protected]

Methods developed for continuous variables are not suitable for such ordinal variables (see B. O. Muthén & Kaplan, 1985, 1992). Jöreskog (1990) suggested that polychoric correlations and WLS could be used to estimate a CFA model based on ordinal variables. B. O. Muthén (1984) developed a general WLS methodology for continuous and categorical variables (CVM). Jöreskog (1994) derived the asymptotic covariance matrix of the polychoric correlations. In the special case of only ordinal variables the weight matrices in B. O. Muthén (1984) and in Jöreskog (1994) are very similar and applying them to WLS gives almost identical results.

A common experience with WLS both for continuous and categorical variables is the poor performance of the WLS estimators and their associated standard errors and chi-squares (see, e.g., Bentler, 1995; Dolan, 1994; Flora & Curran, 2004; Potthast, 1993; West, Finch, & Curran, 1995). Most likely these problems occur because the estimate of $W^{-1}$ used as a weight matrix is very unstable unless the sample size is very large.

A better method of estimation for ordinal variables was proposed by B. O. Muthén, du Toit, and Spisic (1997). This method is called Robust WLS in L. K. Muthén and Muthén (1998, pp. 357–358). It is similar to the method termed DWLS by Jöreskog and Sörbom (1996a, pp. 23–24). Both methods use only the diagonal elements of W in the fitting of the model and use the full W to obtain correct standard errors and chi-squares. Simulation studies by B. O. Muthén et al. (1997) and by Flora and Curran (2004) show that Robust WLS works much better than full WLS, especially for large models and small to moderate sample sizes, under correct specification of the model. In this article, we study the behavior of robust unweighted least squares (RULS), robust maximum likelihood (RML), and robust diagonally weighted least squares (RDWLS) under misspecified models and we find that these methods perform better than full WLS. We also find that none of these three methods is uniformly better than the others. However, our results also show that the simpler method of RULS and even RML, if used in a specific way, also produce estimates and standard errors that are equally good.

    RESEARCH QUESTIONS

The purpose of this study is to compare the performance of different estimators (i.e., RULS, RML, WLS, and RDWLS) for estimating the parameters in confirmatory factor analysis models under conditions of correctly and incorrectly specified models using data on ordinal variables. We also examine the effect of number of categories and their probability distribution. We study one small and one large model. Similar simulation studies have been reported by Potthast (1993) and Dolan (1994). However, as far as we are aware, no one has studied all these methods under misspecified models.

    We attempt to answer the following research questions:

1. Is any of these methods uniformly better or worse than the others on criteria such as bias and mean square error?

2. Is any of these methods uniformly better or worse than the others in estimating the standard errors of the estimated parameters, using the same criteria?

3. Do bias and mean square error increase with increasing degree of misspecification or decreasing number of categories?

4. Do problems of nonconvergence increase with increasing degree of misspecification or decrease with increasing number of categories? Nonconvergence is discussed later.

5. How large a sample is needed to avoid problems of nonconvergence? How does this sample size depend on the degree of misspecification and on the number of categories?

    We are not aware of any studies that address all these questions.

    CONFIRMATORY FACTOR ANALYSIS WITH ORDINAL VARIABLES

    Ordinal Variables

Let $x_1, x_2, \ldots, x_p$ be $p$ ordinal variables to be analyzed. Following B. O. Muthén (1984), Lee, Poon, and Bentler (1990), Jöreskog (1990), and others, it is assumed that there is a continuous variable $x_i^*$ underlying the ordinal variable $x_i$. This continuous variable $x_i^*$ represents the attitude underlying the ordered responses to $x_i$ and is assumed to have a range from $-\infty$ to $+\infty$. It is the underlying variable $x_i^*$ that is assumed to follow a confirmatory factor analysis model.

The underlying variable $x_i^*$ is unobservable. Only the ordinal variable $x_i$ is observed. For an ordinal variable $x_i$ with $m_i$ categories, the connection between the ordinal variable $x_i$ and the underlying variable $x_i^*$ is

$$x_i = c \iff \tau_{c-1}^{(i)} < x_i^* < \tau_c^{(i)}, \quad c = 1, 2, \ldots, m_i, \tag{1}$$

where

$$\tau_0^{(i)} = -\infty, \quad \tau_1^{(i)} < \tau_2^{(i)} < \cdots < \tau_{m_i-1}^{(i)}, \quad \tau_{m_i}^{(i)} = +\infty, \tag{2}$$

are threshold parameters. For variable $x_i$ with $m_i$ categories, there are $m_i - 1$ strictly increasing threshold parameters $\tau_1^{(i)}, \tau_2^{(i)}, \ldots, \tau_{m_i-1}^{(i)}$.

Because only ordinal information is available about $x_i$, the distribution of $x_i^*$ is determined only up to a monotonic transformation. It is convenient to let $x_i^*$ have the standard normal distribution with density function $\phi(\cdot)$ and distribution function $\Phi(\cdot)$. Then the probability $\pi_c^{(i)}$ of a response in category $c$ on variable $x_i$ is

$$\pi_c^{(i)} = \Pr[x_i = c] = \Pr[\tau_{c-1}^{(i)} < x_i^* < \tau_c^{(i)}] = \int_{\tau_{c-1}^{(i)}}^{\tau_c^{(i)}} \phi(u)\,du = \Phi(\tau_c^{(i)}) - \Phi(\tau_{c-1}^{(i)}), \tag{3}$$

for $c = 1, 2, \ldots, m_i - 1$, so that

$$\tau_c^{(i)} = \Phi^{-1}(\pi_1^{(i)} + \pi_2^{(i)} + \cdots + \pi_c^{(i)}), \tag{4}$$

where $\Phi^{-1}$ is the inverse of the standard normal distribution function. The quantity $(\pi_1^{(i)} + \pi_2^{(i)} + \cdots + \pi_c^{(i)})$ is the probability of a response in category $c$ or lower.

The probabilities $\pi_c^{(i)}$ are unknown population quantities. In practice, $\pi_c^{(i)}$ can be estimated consistently by the corresponding percentage $p_c^{(i)}$ of responses in category $c$ on variable $x_i$. Then, estimates of the thresholds can be obtained as

$$\hat{\tau}_c^{(i)} = \Phi^{-1}(p_1^{(i)} + p_2^{(i)} + \cdots + p_c^{(i)}), \quad c = 1, \ldots, m_i - 1. \tag{5}$$

The quantity $(p_1^{(i)} + p_2^{(i)} + \cdots + p_c^{(i)})$ is the proportion of cases in the sample responding in category $c$ or lower on variable $x_i$.
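As an illustration of Equation 5, the following minimal Python sketch turns the category counts of one ordinal variable into threshold estimates. The function name `estimate_thresholds` and the example counts are hypothetical and not part of the article; the sketch assumes that neither the first nor the last category is empty, because the inverse normal distribution function is undefined at 0 and 1.

```python
from statistics import NormalDist


def estimate_thresholds(counts):
    """Estimate the m - 1 finite thresholds of one ordinal variable (Equation 5).

    counts[c] is the number of sample responses in category c + 1.
    """
    total = sum(counts)
    inv_cdf = NormalDist().inv_cdf  # inverse standard normal distribution function
    thresholds = []
    cumulative = 0
    for count in counts[:-1]:  # the last threshold is +infinity, so it is skipped
        cumulative += count
        thresholds.append(inv_cdf(cumulative / total))
    return thresholds


# Hypothetical five-category item with a symmetric response distribution (N = 400).
print(estimate_thresholds([40, 80, 160, 80, 40]))
```

With a symmetric distribution of responses the estimated thresholds are symmetric around zero, which is a quick sanity check on the computation.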

    Polychoric Correlations

Let $x_i$ and $x_j$ be two ordinal variables with $m_i$ and $m_j$ categories, respectively. Their marginal distribution in the sample is represented by a contingency table

$$\begin{pmatrix}
n_{11}^{(ij)} & n_{12}^{(ij)} & \cdots & n_{1m_j}^{(ij)} \\
n_{21}^{(ij)} & n_{22}^{(ij)} & \cdots & n_{2m_j}^{(ij)} \\
\vdots & \vdots & & \vdots \\
n_{m_i 1}^{(ij)} & n_{m_i 2}^{(ij)} & \cdots & n_{m_i m_j}^{(ij)}
\end{pmatrix}, \tag{6}$$

where $n_{ab}^{(ij)}$ is the number of cases in the sample in category $a$ on variable $x_i$ and in category $b$ on variable $x_j$. The underlying variables $x_i^*$ and $x_j^*$ are assumed to be bivariate normal with zero means, unit variances, and with correlation $\rho_{ij}$, the polychoric correlation.

Let $\tau_1^{(i)}, \tau_2^{(i)}, \ldots, \tau_{m_i-1}^{(i)}$ be the thresholds for variable $x_i^*$ and let $\tau_1^{(j)}, \tau_2^{(j)}, \ldots, \tau_{m_j-1}^{(j)}$ be the thresholds for variable $x_j^*$.

The polychoric correlation can be estimated by maximizing the log-likelihood of the multinomial distribution (see Olsson, 1979)

$$\ln L = \sum_{a=1}^{m_i} \sum_{b=1}^{m_j} n_{ab}^{(ij)} \log \pi_{ab}^{(ij)}, \tag{7}$$

where

$$\pi_{ab}^{(ij)} = \Pr[x_i = a, x_j = b] = \int_{\tau_{a-1}^{(i)}}^{\tau_a^{(i)}} \int_{\tau_{b-1}^{(j)}}^{\tau_b^{(j)}} \phi_2(u, v)\,du\,dv, \tag{8}$$

and

$$\phi_2(u, v) = \frac{1}{2\pi\sqrt{1 - \rho^2}} \exp\left[-\frac{1}{2(1 - \rho^2)}(u^2 - 2\rho uv + v^2)\right], \tag{9}$$

is the standard bivariate normal density with correlation $\rho = \rho_{ij}$. Maximizing $\ln L$ gives the sample polychoric correlation denoted $r_{ij}$.

The polychoric correlation can be estimated by a two-step procedure (see Olsson, 1979). In the first step, the thresholds are estimated from the univariate marginal distributions by Equation 5. In the second step, the polychoric correlations are estimated from the bivariate marginal distributions by maximizing $\ln L$ for given thresholds. The parameters can also be estimated by a one-step procedure that maximizes $\ln L$ with respect to the thresholds and the polychoric correlation simultaneously, but this is not necessary because the estimates are almost the same as with the two-step procedure, and it is not practical because it would yield different thresholds for the same variable when paired with different variables. For an example, see Jöreskog (2002–2005, Table 3, p. 13).
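The two-step procedure can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the implementation used in the article (which relies on LISREL): the function names are hypothetical, infinite thresholds are replaced by ±8, the cell probabilities of Equation 8 are obtained by Simpson's rule, and the maximization over the correlation uses a golden-section search that assumes the log-likelihood has a single maximum on (-0.99, 0.99). It also assumes that no marginal category is empty.

```python
import math
from statistics import NormalDist

_N = NormalDist()
_LIMIT = 8.0  # stands in for +/- infinity; the normal mass beyond it is negligible


def thresholds(margin):
    """Step 1 (Equation 5): thresholds from one variable's marginal counts."""
    total, cum, out = sum(margin), 0.0, [-_LIMIT]
    for count in margin[:-1]:
        cum += count
        out.append(_N.inv_cdf(cum / total))
    return out + [_LIMIT]


def cell_prob(lo_u, hi_u, lo_v, hi_v, rho, steps=200):
    """Equation 8: Simpson's rule over u; the integral over v is in closed form."""
    scale = math.sqrt(1.0 - rho * rho)

    def inner(u):
        # Given u, v is normal with mean rho * u and variance 1 - rho^2.
        upper = _N.cdf((hi_v - rho * u) / scale)
        lower = _N.cdf((lo_v - rho * u) / scale)
        return _N.pdf(u) * (upper - lower)

    h = (hi_u - lo_u) / steps
    total = inner(lo_u) + inner(hi_u)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * inner(lo_u + k * h)
    return total * h / 3.0


def log_likelihood(table, tau_i, tau_j, rho):
    """Equation 7 for fixed thresholds."""
    ll = 0.0
    for a, row in enumerate(table):
        for b, n_ab in enumerate(row):
            if n_ab > 0:
                p = cell_prob(tau_i[a], tau_i[a + 1], tau_j[b], tau_j[b + 1], rho)
                ll += n_ab * math.log(max(p, 1e-300))
    return ll


def polychoric(table, tol=1e-5):
    """Step 2: maximize ln L over rho by golden-section search."""
    tau_i = thresholds([sum(row) for row in table])
    tau_j = thresholds([sum(col) for col in zip(*table)])
    g = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = -0.99, 0.99
    while hi - lo > tol:
        x1, x2 = hi - g * (hi - lo), lo + g * (hi - lo)
        if log_likelihood(table, tau_i, tau_j, x1) < log_likelihood(table, tau_i, tau_j, x2):
            lo = x1  # the maximum lies to the right of x1
        else:
            hi = x2  # the maximum lies to the left of x2
    return (lo + hi) / 2.0


# Hypothetical 3 x 3 contingency table of counts.
print(polychoric([[60, 30, 10], [30, 40, 30], [10, 30, 60]]))
```

Because the thresholds are fixed in the first step, the second step is a one-dimensional search, which is what makes the two-step procedure cheap compared with the one-step alternative.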

Jöreskog (1994) showed that the polychoric correlation $r_{ij}$ is asymptotically linear in the bivariate marginal proportions $P_{ij}$, where $P_{ij}$ is a matrix of order $m_i \times m_j$ whose elements are $p_{ab}^{(ij)} = n_{ab}^{(ij)}/N$, where $N$ is the sample size. Thus, $r_{ij} \simeq \mathrm{tr}(\Gamma_{ij}' P_{ij})$. The elements of the matrix $\Gamma_{ij}$ are given in Jöreskog (1994, Equation 16). Using this result one can estimate the asymptotic covariance $N\,\mathrm{ACov}(r_{gh}, r_{ij})$ for all $g > h$ and $i > j$ (see Jöreskog, 1994, for details).

    ESTIMATION METHODS

For continuous variables, several methods are available for estimating structural equation models and confirmatory factor analysis models: ML and various least squares methods. If combined with an estimated asymptotic covariance matrix, these methods can also provide correct standard errors even for nonnormal variables under certain assumptions. None of these methods can be used directly with ordinal variables, but they can be used in modified forms to fit the models to polychoric correlations.

The model to be estimated is a factor model of the form

$$x^* = \Lambda\xi + \delta, \tag{10}$$

where $x^*$ is a vector of order $p \times 1$ of underlying variables corresponding to the $p \times 1$ vector of the observed ordinal variables $x$, as defined earlier. The vectors $\xi$ of order $k \times 1$ and $\delta$ of order $p \times 1$ represent the factors and the unique variables that are assumed to be uncorrelated. The matrix $\Lambda$ of order $p \times k$ contains the factor loadings $\lambda_{ij}$. Some elements of $\Lambda$ might be fixed at zero.

Let $\Phi$ and $\Psi$ be the covariance matrices of $\xi$ and $\delta$, respectively. We assume that the unique factors are uncorrelated so that $\Psi$ is a diagonal matrix. For convenience we assume that $\Phi$ is a correlation matrix with ones in the diagonal. The covariance matrix of $x^*$ is

$$\Sigma = \Lambda\Phi\Lambda' + \Psi. \tag{11}$$

Because the underlying variables $x_i^*$ have variances equal to 1, it follows that

$$\Psi = I - \mathrm{diag}(\Lambda\Phi\Lambda'), \tag{12}$$

so that

$$\Sigma(\Lambda, \Phi) = \Lambda\Phi\Lambda' + I - \mathrm{diag}(\Lambda\Phi\Lambda'). \tag{13}$$

We write $\Sigma(\Lambda, \Phi)$ to emphasize that $\Sigma$ is a function of $\Lambda$ and $\Phi$. This is the correlation matrix implied by the model that is to be fitted to the matrix of polychoric correlations $R$.
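To make Equation 13 concrete, the sketch below builds the model-implied correlation matrix from a loading matrix and a factor correlation matrix using plain Python lists. The function name `implied_correlation` and the numerical values (a two-factor model with six indicators and a factor correlation of 0.6) are illustrative assumptions.

```python
def implied_correlation(lam, phi):
    """Return Sigma = Lam Phi Lam' + I - diag(Lam Phi Lam') (Equation 13)."""
    p, k = len(lam), len(phi)
    common = [
        [sum(lam[i][a] * phi[a][b] * lam[j][b] for a in range(k) for b in range(k))
         for j in range(p)]
        for i in range(p)
    ]
    # Psi = I - diag(Lam Phi Lam') forces unit variances for the underlying
    # variables, so the diagonal of Sigma is exactly 1.
    return [[1.0 if i == j else common[i][j] for j in range(p)] for i in range(p)]


# Illustrative loadings: three indicators per factor, no cross-loadings.
lam = [[0.9, 0.0], [0.8, 0.0], [0.7, 0.0], [0.0, 0.6], [0.0, 0.7], [0.0, 0.8]]
phi = [[1.0, 0.6], [0.6, 1.0]]
sigma = implied_correlation(lam, phi)
print(sigma[1][0], sigma[3][0])
```

Only $\Lambda$ and $\Phi$ enter as free quantities; the unique variances never need to be estimated separately because they are determined by Equation 12.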

To estimate the model, four alternative methods are considered and compared in this article, namely unweighted least squares (ULS), diagonally weighted least squares (DWLS), WLS, and ML. In the estimation, the constraints in Equation 13 are achieved by using the constrained parameter feature in LISREL (see Jöreskog & Sörbom, 1996a, pp. 345–349). An example LISREL syntax file is given in the Appendix.

    Three Least Squares Methods

The three least squares methods are two-step methods. In the first step the polychoric correlations $r$ and their asymptotic covariance matrix $W$ are estimated as described earlier. Note that $r = (r_{21}, r_{31}, r_{32}, \ldots, r_{p,p-1})'$ is a vector of the polychoric correlations below the diagonal of the polychoric correlation matrix $R$. The 1s in the diagonal are not included in the vector $r$. As described earlier, both $r$ and $W$ are estimated from the sample data without the use of the model. Let $s = p(p-1)/2$. The vector $r$ is of order $s \times 1$ and the matrix $W$ is of order $s \times s$. The matrix $W$ contains the elements of the estimated $N\,\mathrm{ACov}(r_{gh}, r_{ij})$ arranged to correspond to $r$.

In the second step $\Lambda$ and $\Phi$ are fitted to $r$ by minimizing the fit function

$$F(r, \Lambda, \Phi) = [r - \sigma(\Lambda, \Phi)]' V [r - \sigma(\Lambda, \Phi)], \tag{14}$$

where $V$ is a positive definite matrix and $\sigma(\Lambda, \Phi)$ is a vector of the elements of $\Lambda\Phi\Lambda'$ below the diagonal. The three least squares methods differ in the choice of weight matrix $V$:

$$\text{ULS:} \quad V = I \tag{15}$$

$$\text{DWLS:} \quad V = (\mathrm{diag}\,W)^{-1} \tag{16}$$

$$\text{WLS:} \quad V = W^{-1} \tag{17}$$

The main difference between these weight matrices is that for ULS and DWLS the weight matrix is diagonal, whereas for WLS the weight matrix is the inverse of the full matrix $W$. For DWLS only the diagonal elements of $W$ are used. In scalar form these fit functions can be written as

$$\text{ULS:} \quad F(r, \Lambda, \Phi) = \sum_i (r_i - \sigma_i)^2 \tag{18}$$

$$\text{DWLS:} \quad F(r, \Lambda, \Phi) = \sum_i (r_i - \sigma_i)^2 / w_{ii} \tag{19}$$

$$\text{WLS:} \quad F(r, \Lambda, \Phi) = \sum_i \sum_j (r_i - \sigma_i)(r_j - \sigma_j) w^{ij} \tag{20}$$

where $w^{ij}$ is an element of $W^{-1}$.
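The scalar forms in Equations 18 to 20 translate directly into code. The sketch below is only an illustration with hypothetical function names: it evaluates the three fit functions for a given vector of polychoric correlations `r`, a vector of model-implied correlations `sigma`, and a matrix `w`. It does not perform the minimization over the model parameters, and the WLS version solves a linear system rather than forming the inverse of `w` explicitly.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x


def fit_uls(r, sigma):
    """Equation 18: unweighted sum of squared residuals."""
    return sum((ri - si) ** 2 for ri, si in zip(r, sigma))


def fit_dwls(r, sigma, w):
    """Equation 19: squared residuals weighted by 1 / w_ii."""
    return sum((ri - si) ** 2 / w[i][i] for i, (ri, si) in enumerate(zip(r, sigma)))


def fit_wls(r, sigma, w):
    """Equation 20: e' W^{-1} e, computed by solving W z = e."""
    e = [ri - si for ri, si in zip(r, sigma)]
    return sum(ei * zi for ei, zi in zip(e, solve(w, e)))


# Hypothetical residuals and weight matrix for three correlations.
r, sigma = [0.5, 0.3, 0.2], [0.4, 0.3, 0.1]
w = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
print(fit_uls(r, sigma), fit_dwls(r, sigma, w), fit_wls(r, sigma, w))
```

When `w` is diagonal, DWLS and WLS coincide, which shows that the two methods differ only through the off-diagonal elements of $W$.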


    The Maximum Likelihood Method

The method of ML has no theoretical justification for use with ordinal variables. Nevertheless, it works if used as follows. Let $R$ be the matrix of polychoric correlations with ones in the diagonal and let $\Sigma$ be defined as in Equation 13. The ML fit function is

$$F_1(R, \Lambda, \Phi) = \log|\Sigma| + \mathrm{tr}(R\Sigma^{-1}) - \log|R| - p, \tag{21}$$

which is to be minimized with respect to the free elements of $\Lambda$ and $\Phi$.

This is of a totally different form from Equation 14. We show that this can also be written in the form of Equation 14. Let $r^* = (1, r_{21}, 1, r_{31}, r_{32}, 1, \ldots, r_{p,p-1}, 1)'$; that is, $r^*$ is a vector of the elements of $R$ below the diagonal and including the 1s in the diagonal of $R$. Similarly, let $\sigma^*(\Lambda, \Phi)$ be a vector of the corresponding elements of $\Sigma(\Lambda, \Phi)$, noting that $\Sigma$ also has 1s in the diagonal. Let $K$ be a matrix of order $s \times p(p+1)/2$ with elements 0 and 1 such that $r = Kr^*$.

Minimizing $F_1$ in Equation 21 is equivalent to minimizing

$$F_2(r^*, \Lambda, \Phi) = [r^* - \sigma^*(\Lambda, \Phi)]' V^* [r^* - \sigma^*(\Lambda, \Phi)], \tag{22}$$

with

$$V^* = D'(\hat{\Sigma}^{-1} \otimes \hat{\Sigma}^{-1})D, \tag{23}$$

where $\otimes$ denotes a Kronecker product, $D$ is the duplication matrix (Magnus & Neudecker, 1999, pp. 48–53), and $\hat{\Sigma} = \Sigma(\hat{\Lambda}, \hat{\Phi})$. This equation should be understood as follows. Suppose $\hat{\Lambda}$ and $\hat{\Phi}$ are estimates of $\Lambda$ and $\Phi$ in the $i$th iteration. New estimates of $\Lambda$ and $\Phi$ can be obtained by minimizing $F_2$ using $V^*$ in Equation 23. Update $V^*$ in each iteration. This iteration algorithm converges to the ML estimates $\hat{\Lambda}$ and $\hat{\Phi}$, which minimize $F_1$. This shows that the ML estimates can be obtained by iteratively reweighted least squares. Minimizing Equation 22 is equivalent to minimizing Equation 14 using

$$V = V_{ML} = KV^*K', \tag{24}$$

which shows that ML also fits in the same framework as ULS, DWLS, and WLS. The only difference is that its weight matrix $V$ is a bit more complicated.
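For readers who want to evaluate Equation 21 numerically, here is a small self-contained sketch. It assumes that both `r` and `sigma` are symmetric positive definite, so that a Cholesky factorization exists; the helper names are hypothetical, and the iteratively reweighted least squares scheme of Equations 22 to 24 is not implemented here.

```python
import math


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def log_det(a):
    """log|A| from the diagonal of the Cholesky factor."""
    return 2.0 * sum(math.log(row[i]) for i, row in enumerate(cholesky(a)))


def trace_r_sigma_inv(r, sigma):
    """tr(R Sigma^{-1}), obtained by solving Sigma x = (column j of R) for each j."""
    n, low = len(sigma), cholesky(sigma)
    total = 0.0
    for j in range(n):
        y = [0.0] * n
        for i in range(n):  # forward substitution: low y = R[:, j]
            y[i] = (r[i][j] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
        x = [0.0] * n
        for i in reversed(range(n)):  # back substitution: low' x = y
            x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
        total += x[j]
    return total


def fit_ml(r, sigma):
    """Equation 21: log|Sigma| + tr(R Sigma^{-1}) - log|R| - p."""
    p = len(r)
    return log_det(sigma) + trace_r_sigma_inv(r, sigma) - log_det(r) - p


# Hypothetical 2 x 2 example: the fit is zero when Sigma reproduces R exactly.
r = [[1.0, 0.5], [0.5, 1.0]]
print(fit_ml(r, r), fit_ml(r, [[1.0, 0.0], [0.0, 1.0]]))
```

The function is zero when $\Sigma = R$ and positive otherwise, which is the property that makes it usable as a discrepancy measure even though the polychoric correlations are not sample covariances of normal data.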

    Standard Errors and Chi-Squares

For continuous variables, various formulas for asymptotic standard errors and chi-squares have been developed, notably by Browne (1984) and Satorra (1989). As far as we are aware it has not been shown that these formulas are also valid for ordinal variables and polychoric correlations. Because the vector $r$ has an asymptotic normal distribution we conjecture that these formulas can be used in modified form also in the situation considered here.

Let $\theta$ be a $t \times 1$ vector of the free elements of $\Lambda$ and $\Phi$ and let $\hat{\theta}$ be the minimizer of $F(r, \theta)$ in Equation 14 for ULS, DWLS, ML, and WLS. Then a consistent estimate of $N\,\mathrm{ACov}(\hat{\theta})$ is

$$(\hat{\Delta}'V\hat{\Delta})^{-1}\hat{\Delta}'VWV\hat{\Delta}(\hat{\Delta}'V\hat{\Delta})^{-1}, \tag{25}$$

where $\hat{\Delta} = \partial\sigma/\partial\theta'$ evaluated at $\hat{\theta}$. This is Browne's (1984) formula (2.12a) applied to polychoric correlations. The standard error of $\hat{\theta}_i$ is $1/\sqrt{N}$ times the square root of the $i$th diagonal element of this matrix. Thus, the same formula is used to obtain the standard errors for all methods. Here $W$ is the same for all methods, whereas $V$ and $\hat{\Delta}$ vary over methods. Note again that the vector $\sigma$ does not include the diagonal elements of $\Sigma$.
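The sandwich form of Equation 25 can be sketched with plain Python lists as follows. This is an illustration with hypothetical function names: the Jacobian `delta`, the weight matrix `v`, and the asymptotic covariance matrix `w` are assumed to be supplied by the caller, `v` is assumed to be symmetric, and the standard errors are computed on the reading that Equation 25 estimates $N\,\mathrm{ACov}(\hat{\theta})$, so each diagonal element is divided by the sample size before taking the square root.

```python
import math


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inverse(a):
    """Gauss-Jordan inverse with partial pivoting."""
    n = len(a)
    aug = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [value / scale for value in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def robust_standard_errors(delta, v, w, n_obs):
    """Standard errors from Equation 25: sqrt(diagonal of the sandwich / N)."""
    dt_v = matmul(transpose(delta), v)               # Delta' V
    bread = inverse(matmul(dt_v, delta))             # (Delta' V Delta)^{-1}
    meat = matmul(matmul(dt_v, w), transpose(dt_v))  # Delta' V W V Delta (V symmetric)
    acov = matmul(matmul(bread, meat), bread)
    return [math.sqrt(acov[i][i] / n_obs) for i in range(len(acov))]


# Hypothetical Jacobian for three correlations and two parameters, ULS weights.
delta = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
w = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 4.0]]
print(robust_standard_errors(delta, identity, w, n_obs=100))
```

When $V = W^{-1}$ the sandwich collapses to $(\hat{\Delta}'W^{-1}\hat{\Delta})^{-1}$, which is why full WLS does not need the robust correction while ULS, DWLS, and ML do.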

At least four so-called chi-squares have been suggested for testing structural equation models with continuous variables. Following the notation in Jöreskog, Sörbom, Du Toit, and Du Toit (2003), these chi-squares are denoted $c_1$, $c_2$, $c_3$, and $c_4$. These are valid under different conditions. If the observed variables have a multivariate normal distribution, $c_1$ and $c_2$ have an asymptotic chi-square distribution if the model is correctly specified. $c_3$ is the Satorra and Bentler (1988) SB statistic, which is $c_1$ or $c_2$ multiplied by a scale factor that is estimated from the sample and involves an estimate of the asymptotic covariance matrix (ACM) of the sample variances and covariances. Although the asymptotic distribution of $c_3$ is not exactly $\chi^2$, it is used as a $\chi^2$ statistic because the scale factor is estimated such that $c_3$ has an asymptotically correct mean. The test statistic $c_3$ is considered as a way of correcting $c_1$ or $c_2$ for the effects of nonnormality. $c_4$ is the ADF statistic in Browne (1984, Equation 2.20a). This involves the inverse of the ACM. Browne (1984) showed that $c_4$ has an asymptotic $\chi^2$ distribution under certain standard conditions.

As far as we are aware these test statistics have not been shown to be valid for ordinal variables and polychoric correlations. For the situation considered in this article, we modify the definitions of $c_1$, $c_2$, $c_3$, and $c_4$ as follows:

$c_1$ is $N - 1$ times the minimum value of Equation 21.

$c_2$ is $N - 1$ times the minimum value of Equation 22.

$c_3$ is $d/h$ times $c_2$, where $d$ is the degrees of freedom and

$$h = \mathrm{tr}\left[(\hat{\Delta}_c' V^{-1} \hat{\Delta}_c)^{-1}(\hat{\Delta}_c' W \hat{\Delta}_c)\right]. \tag{26}$$

Here $\hat{\Delta}_c$ is an orthogonal complement to $\hat{\Delta}$ such that $\hat{\Delta}_c'\hat{\Delta} = 0$.

$c_4$ is $N - 1$ times the minimum value of the WLS fit function.

We are not claiming that any of these $c$'s have an asymptotic $\chi^2$ distribution with $d$ degrees of freedom even if the model holds. The properties of the $c$'s are investigated by simulation.

In the simulation study reported in this article, we found that $c_1$, $c_2$, and $c_4$ performed much worse than $c_3$, which performed reasonably well. We therefore report only the results for $c_3$. The standard errors reported in this article have been obtained from Equation 25 and the chi-squares are $c_3$ for RULS, RDWLS, and RML and $c_4$ for WLS.

Nothing is known about the estimated standard errors and chi-squares when the model does not hold (see Footnote 1). The behavior of the standard errors and chi-squares under misspecified models is studied by simulation.

    SIMULATION DESIGN

    Models

CFA is one of the most widely used applications in the social and behavioral sciences. The models we use in the simulation study are typical examples of CFA. Two models are used, referred to as Model 1 and Model 2. Model 1 is a small model with $p = 6$ observed variables and $k = 2$ factors and Model 2 is a large model with $p = 16$ observed variables and $k = 4$ factors. The models will be studied under correct specification and under three levels of misspecification. By correct specification we mean that the model is properly specified such that the estimated model matches the population model that is used to generate the data. Structurally misspecified models are models where the population model used to generate the data differs from the model actually estimated. Thus, for each model (Model 1 and Model 2) there are four different population models. In addition we study three different numbers of categories with symmetric and nonsymmetric distributions of the observed ordinal variables. Altogether there are 24 experimental cells for each model.
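The count of 24 experimental cells per model follows from crossing four population models with three numbers of categories and two distribution shapes. A short sketch makes the crossing explicit; the variable names are hypothetical, and the levels (0, 0.1, 0.3, 0.5 for the omitted loadings and two, five, or seven categories) are the ones given in the design description of the article.

```python
from itertools import product

misspecification = (0.0, 0.1, 0.3, 0.5)  # population values of the omitted loading(s)
categories = (2, 5, 7)                   # number of response categories
shapes = ("symmetric", "nonsymmetric")   # shape of the category distribution

cells = list(product(misspecification, categories, shapes))
print(len(cells))  # 24
```

Each cell is then studied at every sample size, so the sample sizes multiply the number of conditions but not the number of cells.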

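Model 1. A path diagram of Model 1 is shown in Figure 1. In matrix form the model is:

$$\begin{pmatrix} x_1^* \\ x_2^* \\ x_3^* \\ x_4^* \\ x_5^* \\ x_6^* \end{pmatrix} =
\begin{pmatrix} \lambda_{11} & 0 \\ \lambda_{21} & 0 \\ \lambda_{31} & 0 \\ \lambda_{41} & \lambda_{42} \\ 0 & \lambda_{52} \\ 0 & \lambda_{62} \end{pmatrix}
\begin{pmatrix} \xi_1 \\ \xi_2 \end{pmatrix} +
\begin{pmatrix} \delta_1 \\ \delta_2 \\ \delta_3 \\ \delta_4 \\ \delta_5 \\ \delta_6 \end{pmatrix}, \tag{27}$$

where the correlation matrix of $\xi_1$ and $\xi_2$ is

$$\Phi = \begin{pmatrix} 1 & \\ \phi_{21} & 1 \end{pmatrix}, \tag{28}$$

and the covariance matrix of $(\delta_1, \delta_2, \delta_3, \delta_4, \delta_5, \delta_6)$ is

$$\Psi = \mathrm{diag}(\psi_1, \psi_2, \psi_3, \psi_4, \psi_5, \psi_6). \tag{29}$$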

Footnote 1. For continuous variables, Satorra (1989, 2003) developed a robustness theory for structural equation models where it is assumed that the degree of misspecification is of the order of magnitude $1/\sqrt{N}$ as the sample size $N$ increases. We do not make this assumption in this article.

FIGURE 1 Path diagram of Model 1.

When the model is estimated it is assumed that $\lambda_{41} = 0$ but the data are generated using different values of $\lambda_{41}$. Because $x_i^*$ has variance 1, the value of $\lambda_{41}$ affects the variance of $\delta_4$, denoted by the symbol $\psi_4$.

The parameter values used to generate data for Model 1 are

$$(\lambda_1, \lambda_2, \lambda_3, \lambda_{41}, \lambda_4, \lambda_5, \lambda_6) = (0.9, 0.8, 0.7, \lambda_{41}, 0.6, 0.7, 0.8), \tag{30}$$

$$(\phi_{11}, \phi_{21}, \phi_{22}) = (1.0, 0.6, 1.0), \tag{31}$$

$$(\psi_1, \psi_2, \psi_3, \psi_4, \psi_5, \psi_6) = (0.19, 0.36, 0.51, \psi_4, 0.51, 0.36), \tag{32}$$

where

$$\lambda_{41} = (0, 0.1, 0.3, 0.5), \tag{33}$$

and

$$\psi_4 = (0.64, 0.558, 0.334, 0.03). \tag{34}$$

Thus, in addition to the base model where $\lambda_{41} = 0$, there are three levels of misspecification where $\lambda_{41} = 0.1, 0.3, 0.5$, respectively. Note that the parameters $\psi$ are only used in the data generating process; as explained earlier, they are not involved in the estimation of the model. Thus, there are seven parameters to be estimated in $\Lambda$ and $\Phi$ for Model 1.
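The four values of $\psi_4$ in Equation 34 follow from the unit-variance constraint in Equation 12 applied to the fourth variable, which loads on both factors in the population model. The short check below uses the Model 1 values (a second-factor loading of 0.6 for $x_4^*$ and a factor correlation of 0.6); the function name is hypothetical.

```python
def unique_variance_x4(lam41, lam42=0.6, phi21=0.6):
    """psi_4 = 1 - Var(lam41 * xi_1 + lam42 * xi_2) with unit factor variances."""
    return 1.0 - (lam41 ** 2 + lam42 ** 2 + 2.0 * lam41 * lam42 * phi21)


for lam41 in (0.0, 0.1, 0.3, 0.5):
    print(lam41, round(unique_variance_x4(lam41), 3))
```

Rounded to three decimals this reproduces 0.64, 0.558, 0.334, and 0.03, and it shows how little unique variance is left for $x_4^*$ at the largest cross-loading.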

    Model 2. A path diagram for Model 2 is shown in Figure 2.

    FIGURE 2 Path diagram of Model 2.


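In matrix form Model 2 is as follows:

$$\begin{pmatrix} x_1^* \\ x_2^* \\ x_3^* \\ x_4^* \\ x_5^* \\ x_6^* \\ x_7^* \\ x_8^* \\ x_9^* \\ x_{10}^* \\ x_{11}^* \\ x_{12}^* \\ x_{13}^* \\ x_{14}^* \\ x_{15}^* \\ x_{16}^* \end{pmatrix} =
\begin{pmatrix}
\lambda_{11} & 0 & 0 & \lambda_{14} \\
\lambda_{21} & 0 & 0 & 0 \\
\lambda_{31} & 0 & 0 & 0 \\
\lambda_{41} & 0 & 0 & 0 \\
0 & \lambda_{52} & 0 & 0 \\
0 & \lambda_{62} & 0 & 0 \\
0 & \lambda_{72} & 0 & 0 \\
0 & \lambda_{82} & 0 & 0 \\
0 & 0 & \lambda_{93} & 0 \\
0 & 0 & \lambda_{10,3} & 0 \\
0 & 0 & \lambda_{11,3} & 0 \\
0 & 0 & \lambda_{12,3} & 0 \\
0 & 0 & 0 & \lambda_{13,4} \\
0 & 0 & 0 & \lambda_{14,4} \\
0 & 0 & 0 & \lambda_{15,4} \\
\lambda_{16,1} & 0 & 0 & \lambda_{16,4}
\end{pmatrix}
\begin{pmatrix} \xi_1 \\ \xi_2 \\ \xi_3 \\ \xi_4 \end{pmatrix} +
\begin{pmatrix} \delta_1 \\ \delta_2 \\ \delta_3 \\ \delta_4 \\ \delta_5 \\ \delta_6 \\ \delta_7 \\ \delta_8 \\ \delta_9 \\ \delta_{10} \\ \delta_{11} \\ \delta_{12} \\ \delta_{13} \\ \delta_{14} \\ \delta_{15} \\ \delta_{16} \end{pmatrix}, \tag{35}$$

where the correlation matrix of $(\xi_1, \xi_2, \xi_3, \xi_4)$ is

$$\Phi = \begin{pmatrix} 1 & & & \\ \phi_{21} & 1 & & \\ \phi_{31} & \phi_{32} & 1 & \\ \phi_{41} & \phi_{42} & \phi_{43} & 1 \end{pmatrix}, \tag{36}$$

and the covariance matrix of $(\delta_1, \delta_2, \ldots, \delta_{16})$ is

$$\Psi = \mathrm{diag}(\psi_1, \psi_2, \ldots, \psi_{16}). \tag{37}$$

The parameter values for Model 2 are chosen as:

$$(\lambda_{11}, \lambda_{21}, \lambda_{31}, \lambda_{41}, \lambda_{52}, \lambda_{62}, \lambda_{72}, \lambda_{82}, \lambda_{93}, \lambda_{10,3}, \lambda_{11,3}, \lambda_{12,3}, \lambda_{13,4}, \lambda_{14,4}, \lambda_{15,4}, \lambda_{16,4}) = (0.4, 0.5, 0.6, 0.7, 0.8, 0.7, 0.6, 0.5, 0.6, 0.7, 0.8, 0.9, 0.8, 0.7, 0.5, 0.3),$$

$$(\phi_{21}, \phi_{31}, \phi_{32}, \phi_{41}, \phi_{42}, \phi_{43}) = (0.2, 0.4, 0.6, 0.8, 0.5, 0.3),$$

$$(\psi_1, \psi_2, \ldots, \psi_{16}) = (\psi_1, 0.75, 0.64, 0.51, 0.36, 0.51, 0.64, 0.75, 0.64, 0.51, 0.36, 0.19, 0.36, 0.51, 0.75, \psi_{16}),$$

where

$$\lambda_{14} \ \&\ \lambda_{16,1} = (0, 0.1, 0.3, 0.5), \tag{38}$$

$$\psi_1 = (0.84, 0.798, 0.654, 0.43), \tag{39}$$

$$\psi_{16} = (0.91, 0.788, 0.748, 0.54). \tag{40}$$

Again, the values of the $\psi$ parameters are only used to generate the data. In the model to be estimated there are 22 independent parameters in $\Lambda$ and $\Phi$.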

    Number of Categories

Each ordinal variable $x_i$ is assigned to have two, five, or seven categories with and without symmetric distributions, as shown in Figure 3. The category probabilities are the same for all variables.

    Sample Sizes and Number of Replications

Five sample sizes are used in the simulation study. They represent sample sizes commonly encountered in applied research, ranging from fairly small to fairly large. The sample sizes are 100, 200, 400, 800, and 1,600.

    FIGURE 3 Category probabilities.


alternatives are not available because it is impossible to intervene in the simulation process and give special treatment to those samples that do not converge. The nonconverged samples are not known until after the 2,000 replicates are finished.

There seem to be two approaches to dealing with this problem in the literature, both of which are unsatisfactory. One approach is to ignore the nonconverged samples. This gives rise to sample selection bias in estimates of bias and mean square error and other outcome variables. The other approach is to continue replications until 2,000 converged samples have been obtained. This approach also gives biased estimates of outcome variables and it gives no information about the frequency of occurrence of nonconvergence.

Nonconverged solutions only occur in small samples. Users of SEM should be aware of these problems and know how large a sample size is needed to avoid them. In this study we report the frequency of occurrence of nonconverged solutions for the sample sizes $N = 100$ and $N = 200$, and for these sample sizes we do not report results on other outcome variables such as bias and mean square error. For the sample sizes $N = 400$, $N = 800$, and $N = 1{,}600$, problems of nonconvergence do not occur and the results on bias and mean square error and other outcome variables that we report are therefore unbiased.

    Bias and Root Mean Square Error

    Our primary interest was in the overall properties of the different estimation methods, and theindividual parameter estimates were of secondary importance. Because of the large number of estimated factor loadings both within and across different conditions, we examined the averagevalues of factor loadings to summarize our results to be more efcient. We considered three

    major outcome variables of interest: parameter estimates (including both factor loadings andfactor correlations), standard errors, and chi-square test statistics. We examined the averagerelative bias or percentage bias of each outcome variables across all study conditions. Basedon prior simulation studies (Curran, West, & Finch, 1996; Kaplan, 1989), we considered relativebias values less than 5% indicating a trivial bias, values between 5% and 10% indicating amoderate bias, and values greater than 10% indicating a substantial bias. Because the bias andmean square error of an individual parameter can depend on the size of the true parametervalue we use the following estimates of bias and mean square error.

    Let $\hat{\theta}_{ij}$ be the estimated value of the $j$th parameter in the $i$th sample (replicate), $i = 1, 2, \ldots, R$, $j = 1, 2, \ldots, n$ (thus $n$ = number of parameters, $R$ = number of replicates), and let $\theta_j$ be the corresponding true parameter value.

    Average relative bias (ARB).

    $$\mathrm{ARB} = 100\,(1/R) \sum_i (1/n) \sum_j \left( \frac{\hat{\theta}_{ij} - \theta_j}{\theta_j} \right). \tag{42}$$

    Average root mean square error (AMSE).

    $$\mathrm{AMSE} = (1/R) \sum_i \sqrt{(1/n) \sum_j \left[ (\hat{\theta}_{ij} - \theta_j)/\theta_j \right]^2}. \tag{43}$$
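    The two summary measures in Equations 42 and 43, together with the 5% and 10% bias thresholds stated above, can be sketched in a few lines of Python. This is only an illustration: the function names, the list-of-lists data layout, and the toy numbers are assumptions and are not taken from the study, and applying the thresholds to the absolute value of ARB is likewise an assumption.

```python
import math


def average_relative_bias(estimates, truth):
    """Equation 42: ARB in percent.

    estimates[i][j] is the estimate of parameter j in replicate i;
    truth[j] is the true value of parameter j (must be nonzero).
    """
    n_reps, n_params = len(estimates), len(truth)
    total = 0.0
    for row in estimates:
        # mean relative deviation over the parameters of one replicate
        total += sum((est - t) / t for est, t in zip(row, truth)) / n_params
    return 100.0 * total / n_reps


def average_root_mse(estimates, truth):
    """Equation 43: mean over replicates of the root mean squared relative error."""
    n_reps, n_params = len(estimates), len(truth)
    total = 0.0
    for row in estimates:
        mean_sq = sum(((est - t) / t) ** 2 for est, t in zip(row, truth)) / n_params
        total += math.sqrt(mean_sq)
    return total / n_reps


def classify_bias(arb_percent):
    """Label a relative bias using the 5% / 10% cutoffs (absolute value assumed)."""
    size = abs(arb_percent)
    if size < 5.0:
        return "trivial"
    if size <= 10.0:
        return "moderate"
    return "substantial"


# Hypothetical toy data: two replicates, two parameters.
truth = [0.8, 0.5]
estimates = [[0.84, 0.50], [0.80, 0.55]]
arb = average_relative_bias(estimates, truth)
print(arb, average_root_mse(estimates, truth), classify_bias(arb))
```

    Dividing each deviation by the true value before averaging is what allows loadings and correlations of different sizes to be pooled into one number per condition.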


    TABLE 1
    Model 1: Percentage of Samples With Nonconvergent Solutions

                             Symmetric                    Nonsymmetric
                     ULS   DWLS     ML    WLS       ULS   DWLS     ML    WLS
    $\lambda_{41} = 0$
      N = 100        0.0    0.1    1.4    0.2       0.1    0.3    8.7    0.8
      N = 200        0.0    0.0    0.0    0.0       0.0    0.0    1.0    0.0
    $\lambda_{41} = 0.1$
      N = 100        0.0    0.0    1.0    0.1       0.0    0.0    9.0    0.6
      N = 200        0.0    0.0    0.0    0.0       0.0    0.0    1.1    0.0
    $\lambda_{41} = 0.3$
      N = 100        0.0    0.0    1.2    0.0       0.0    0.1    6.9    0.2
      N = 200        0.0    0.0    0.0    0.0       0.0    0.0    2.8    0.0
    $\lambda_{41} = 0.5$
      N = 100        0.0    0.0    3.9    0.1       0.0    0.0    0.0    0.0
      N = 200        0.0    0.0    0.1    0.0       0.0    0.0    0.0    0.0

    Note. Number of categories = 2. ULS = unweighted least squares; DWLS = diagonally weighted least squares; ML = maximum likelihood; WLS = weighted least squares.

    Bias and Root Mean Square Error for Standard Errors

    Equations 42 and 43 can be applied to the estimated standard errors $\hat{\sigma}_{ij}$ if the true value $\theta_j$ is replaced by $\sigma_{ij}$, the standard deviation of the parameter estimates in the distribution of the R = 2,000 replicates.
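    As a hedged illustration of this substitution, the sketch below computes the average relative bias of reported standard errors, taking the empirical standard deviation of each parameter's estimates across replicates as the reference value. The function name and toy numbers are hypothetical, and the use of the n - 1 denominator (Python's `statistics.stdev`) is an assumption, since the text does not say which denominator was used.

```python
import statistics


def arb_for_standard_errors(estimates, reported_se):
    """Average relative bias (percent) of reported standard errors.

    estimates[i][j]   : estimate of parameter j in replicate i
    reported_se[i][j] : standard error reported for that estimate
    The reference ("true") standard error of parameter j is the standard
    deviation of its estimates across all replicates.
    """
    n_reps = len(estimates)
    n_params = len(estimates[0])
    true_se = [
        statistics.stdev([estimates[i][j] for i in range(n_reps)])
        for j in range(n_params)
    ]
    total = 0.0
    for i in range(n_reps):
        total += sum(
            (reported_se[i][j] - true_se[j]) / true_se[j] for j in range(n_params)
        ) / n_params
    return 100.0 * total / n_reps


# Hypothetical toy data: three replicates of one parameter.
estimates = [[0.7], [0.8], [0.9]]          # empirical SD across replicates is 0.1
reported_se = [[0.08], [0.08], [0.08]]     # every replicate reports 0.08
print(arb_for_standard_errors(estimates, reported_se))  # close to -20 (underestimation)
```

    A negative value means the reported standard errors are on average smaller than the sampling variability actually observed across replicates.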

    RESULTS

    Nonconvergence

    The percentage of nonconverged solutions is given in Table 1 for Model 1 and in Table 2 for Model 2. Nonconverged samples occur only for sample sizes N = 100 and N = 200. Even for those sample sizes, nonconvergence is not a serious problem. The largest percentage of nonconvergent samples is 9.0 for Model 1 and 7.2 for Model 2. The largest percentages occur for the ML method (see Footnote 2) at sample size N = 100. For other methods and for N = 200 the percentages are very small. The percentages seem to decrease with increasing number of categories and do not seem to increase with increasing degree of misspecification. If the number of categories is five or seven, there are no nonconvergent samples for Model 1, nor at N = 200 for Model 2, regardless of the degree of misspecification. There are nonconverged samples at N = 100 for Model 2, but the nonconvergence rates are not larger than 2%.

    2 We suspect that the reason for this is that the matrix of polychoric correlations is not positive definite, as required for the ML fit function. At the time when the simulations were performed it was not possible to distinguish this reason for nonconvergence from other reasons.


    TABLE 2
    Model 2: Percentage of Samples With Nonconvergent Solutions

                             Symmetric                    Nonsymmetric
                     ULS   DWLS     ML    WLS       ULS   DWLS     ML    WLS
    $\lambda_{4,1} = \lambda_{16,1} = 0$
      N = 100        0.2    0.7    5.6    0.0       0.2    0.9    7.2    0.0
      N = 200        0.0    0.0    0.1    0.3       0.0    0.1    0.4    0.4
    $\lambda_{4,1} = \lambda_{16,1} = 0.1, 0.1$
      N = 100        0.0    0.7    4.2    0.0       0.1    0.5    4.4    0.0
      N = 200        0.0    0.1    0.2    0.1       0.0    0.0    0.2    0.1
    $\lambda_{4,1} = \lambda_{16,1} = 0.3$
      N = 100        0.0    0.5    2.6    0.0       0.0    0.2    1.8    0.0
      N = 200        0.0    0.0    0.0    0.0       0.0    0.0    0.3    0.0
    $\lambda_{4,1} = \lambda_{16,1} = 0.5$
      N = 100        0.1    0.4    2.2    0.0       0.0    0.4    1.7    0.0
      N = 200        0.0    0.0    0.3    0.0       0.0    0.1    0.3    0.1

    Properties of Parameter Estimates

    The ARB for parameter estimates as a function of the degree of misspecification and different category distributions is shown in Figure 4 for sample size N = 400 for Model 1. It is seen that the bias increases almost linearly with increasing degree of misspecification. It is also seen that the bias is larger for WLS than for the other three methods, which are very similar. The number of categories and their distributions do not seem to have any effect on bias. The corresponding figures for sample sizes N = 800 and N = 1,600, not shown here, look very similar, although the differences between methods seem to become smaller for the larger sample sizes.

    The corresponding figure for Model 2 is shown in Figure 5, which exhibits the same characteristics as Figure 4, except that the distance between WLS and the other three methods seems to be even larger.

    Figure 6 shows ARB as a function of sample size for Model 1 with no specification error. Again it is seen that the bias is much larger for WLS than for the other methods and that the relative bias converges to 0 as the sample size increases. Even with N = 1,600, however, there is considerable bias for WLS. Again the number of categories and their distributions do not seem to make a difference. We have similar figures, not shown here, for the cases $\lambda_{41} = 0.1, 0.3, 0.5$. They exhibit similar characteristics except that the biases are larger for the larger degrees of misspecification. The corresponding figures for Model 2 show similar characteristics.

    The number and shape of the distribution of categories do not seem to make any difference. The reason for this is probably that these characteristics do not have an effect on the estimation of the polychoric correlations and their asymptotic covariance matrix, and because these are the same for all methods, the methods are unaffected. In the following we therefore show figures only for the case of seven categories with a nonsymmetric distribution. For this case, we can show results for both models in the same figure, as in Figure 7, where ARB is shown as a function of degree of misspecification and sample size. This figure shows that ARB

    - increases as the degree of misspecification increases.
    - decreases as the sample size increases.
    - increases from Model 1 to Model 2.
    - is essentially unaffected by the number and shapes of category probabilities.
    - is significantly larger for WLS than for the other three methods, regardless of condition.

    FIGURE 4 Model 1: Average relative bias (ARB) as a function of degree of misspecification (N = 400).

    None of the methods underestimates the parameters on average. With no specification error, ULS, DWLS, and ML are essentially unbiased, whereas WLS slightly overestimates the parameters on average. With any positive degree of misspecification, all methods overestimate the parameters on average.

    The previously shown figures give only estimates of average bias. From these figures it is clear that WLS is worse in terms of bias than the other three methods, which in turn look very similar. Are the biases of ULS, DWLS, and ML equal? Because we have R = 2,000 replicates of ARB for each degree of misspecification and each combination of number and shape of categories, we estimated a multivariate analysis of covariance model to test for significant differences in the mean of ARB after controlling for the effects of sample size. Although there were no significant differences in several cases, in the majority of cases ULS was significantly better than DWLS, and in some cases ULS was significantly better than ML.

    FIGURE 5 Model 2: Average relative bias (ARB) as a function of degree of misspecification (N = 400).

    Figure 8 shows AMSE as a function of sample size for Model 2 with no specification error and for different numbers and shapes of category probabilities. Figure 9 shows AMSE as a function of degree of misspecification for Model 2 estimated at N = 800, for different numbers and shapes of category probabilities. Similarly to ARB, these figures show that AMSE

    - increases as the degree of misspecification increases.
    - decreases as the sample size increases.
    - increases from Model 1 to Model 2.
    - does not seem to depend much on the number and shapes of category probabilities.
    - is larger for WLS than for the other three methods, regardless of condition.

    FIGURE 6 Model 1 with no specification error: Average relative bias (ARB) as a function of sample size.

    We have similar figures for Model 1 and for different degrees of misspecification and sample sizes, not shown here. Figure 10, for seven categories with a nonsymmetric distribution, summarizes the situation for both Model 1 and Model 2.

    Properties of Standard Errors

    In structural equation modeling (SEM) one is not only interested in estimating the parameters of the model; one would also like to know how precise the estimates are. This is answered by the standard errors provided in most computer programs for SEM. The parameter estimate and standard error are usually transformed into a t value or z value to judge whether the parameter estimate is statistically significant.
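    The transformation mentioned here is simply the ratio of the estimate to its standard error. The short Python sketch below shows it together with a two-sided p value from the standard normal distribution; the function name and the numbers are hypothetical, and treating the ratio as standard normal is the usual large-sample assumption rather than something specific to this study.

```python
import math


def z_test(estimate, standard_error):
    """Return the z value and its two-sided p value under a standard normal."""
    z = estimate / standard_error
    # P(|Z| > |z|) for a standard normal Z, expressed through erfc
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided


# Hypothetical estimate and standard error.
z, p = z_test(0.50, 0.25)
print(z, p)
```

    Because the standard error sits in the denominator, any systematic underestimation of standard errors inflates z values and makes estimates look more significant than they are.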


    FIGURE 7 Seven categories nonsymmetric: Average relative bias (ARB) as a function of degree of misspecification and sample size.

    The standard error depends on the model, on the method of estimation, and on the sample size. Here we also investigate how it depends on the degree of misspecification and the number and shape of the category probabilities.

    As explained earlier, we use the standard deviation of the R = 2,000 estimates as the true standard error and compute ARB and AMSE in the same way as for parameter estimates. For the standard errors we use the notation ARBSE (average relative bias for standard errors) and AMSESE (average root mean square error for standard errors).

    Figure 11 shows ARBSE for Model 2 with no specification error as a function of sample size for the different category probabilities. It is seen that the standard errors are grossly underestimated with WLS at all sample sizes. The bias is about 35% at N = 400 and about 10% at N = 1,600. ML also underestimates standard errors at N = 400, but the bias is small and vanishes at larger sample sizes. The negative bias for ML seems to decrease with increasing numbers of categories. This is particularly noticeable at N = 400. ULS and DWLS seem to estimate the standard errors best, with essentially no bias at any of the sample sizes investigated.


    FIGURE 8 Model 2 with no specification error: Average root mean square error (AMSE) as a function of sample size.

    Figure 12 shows ARBSE for Model 2 estimated at N = 800 as a function of the degree of misspecification for the different category probabilities. It is seen that for ULS, DWLS, and ML, the bias in standard errors remains fairly constant as a function of degree of misspecification, whereas the bias for WLS seems to get worse with increasing degree of misspecification. Thus, an important result is that the standard errors for ULS, DWLS, and ML are very good even with misspecified models. Furthermore, there seems to be no effect of the number and shape of categories on the bias of standard errors.

    Figure 13 shows AMSESE for both models as a function of sample size and degree of misspecification for the case of seven categories with a nonsymmetric distribution. Here one can see that the root mean square error in the standard errors increases with increasing degree of misspecification. For Model 1 the rate of increase is larger for ML and WLS than for ULS and DWLS. For Model 2 such a rate of increase is noticeable only for WLS. For all methods the root mean square error decreases with increasing sample size.


    FIGURE 9 Model 2 estimated at N = 800: Average root mean square error (AMSE) as a function of degree of misspecification.

    Model Fit Chi-Square

    One important issue in SEM is the testing of model fit and the assessment of fit. In testing the model, certain chi-squares are often used. Sometimes other measures of fit are also applied, but most of these depend on chi-square.

    As explained earlier, under certain assumptions the chi-square should have a $\chi^2$ distribution with $d$ degrees of freedom if the model holds. Therefore, we investigate whether the R = 2,000 replicates of chi-squares follow a $\chi^2_d$ distribution when the model is correctly specified. Although this can be done more accurately, we focus on only two characteristics of the chi-square distribution, namely the mean, which should be $d$, and the proportion of times chi-square exceeds the 95th percentile of the $\chi^2_d$ distribution, which should be 0.05. Results are shown in Table 3 for Model 1 and in Table 4 for Model 2. These results are for the case of seven nonsymmetric categories.
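    The two checks described above (mean close to $d$, rejection rate close to 0.05) can be sketched as follows. This is only a sanity-check illustration under stated assumptions: the draws are generated directly from a chi-square distribution with 8 degrees of freedom rather than from fitted models, the function names and seed are hypothetical, and the critical value 15.507 is the standard tabulated 95th percentile for 8 degrees of freedom.

```python
import random

CRIT_95_DF8 = 15.507  # tabulated 95th percentile of chi-square with 8 df


def summarize_chi_squares(values, critical_value):
    """Return (mean, proportion of values exceeding the critical value)."""
    mean = sum(values) / len(values)
    rejection_rate = sum(v > critical_value for v in values) / len(values)
    return mean, rejection_rate


def draw_chi_square(df, rng):
    """One chi-square draw as a sum of df squared standard normal draws."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))


rng = random.Random(2010)  # arbitrary seed, for reproducibility only
draws = [draw_chi_square(8, rng) for _ in range(2000)]
mean, rejection_rate = summarize_chi_squares(draws, CRIT_95_DF8)
print(mean, rejection_rate)  # expected to be near 8 and near 0.05
```

    With 2,000 replicates, sampling error alone moves the mean of a correctly distributed statistic with 8 degrees of freedom by roughly 0.1 and the rejection rate by roughly 0.005, which gives a sense of scale for reading deviations from the nominal values.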


    FIGURE 10 Seven categories nonsymmetric: Average root mean square error (AMSE) as a function of degree of misspecification and sample size.

    TABLE 3
    Model 1 With No Specification Error

         N       ULS     DWLS       ML      WLS
    Average chi-square with 8 df
       400      8.03     8.05     8.03     8.30
       800      7.88     7.89     7.89     8.04
     1,600      8.13     8.13     8.13     8.24
    Estimated probability of p value < 0.05
       400     0.049    0.049    0.049    0.048
       800     0.053    0.053    0.053    0.053
     1,600     0.046    0.046    0.045    0.043

    Note. ULS = unweighted least squares; DWLS = diagonally weighted least squares; ML = maximum likelihood; WLS = weighted least squares.


    FIGURE 11 Model 2 with no specification error: Average relative bias for standard errors (ARBSE) as a function of sample size.

    TABLE 4
    Model 2 With No Specification Error

         N       ULS     DWLS       ML      WLS
    Average chi-square with 98 df
       400     98.58    98.65    98.50   141.31
       800     98.28    98.31    98.24   116.57
     1,600     98.43    98.45    98.41   107.28
    Estimated probability of p value < 0.05
       400     0.041    0.040    0.042    0.001
       800     0.042    0.041    0.040    0.004
     1,600     0.054    0.054    0.054    0.019

    Note. ULS = unweighted least squares; DWLS = diagonally weighted least squares; ML = maximum likelihood; WLS = weighted least squares.


    FIGURE 12 Model 2 estimated at N = 800: Average relative bias for standard errors (ARBSE) as a function of degree of misspecification.

    For Model 1 the mean of chi-square should be 8. It seems to be overestimated at N = 1,600 and underestimated at N = 800 for ULS, DWLS, and ML. P values are slightly underestimated at N = 400, more underestimated at N = 1,600, and overestimated at N = 800. For Model 1 there are no large differences between methods, although the large mean for WLS at N = 400 and N = 1,600 stands out as different.

    For Model 2 the mean of chi-square should be 98. It is rather well estimated for ULS, DWLS, and ML at all sample sizes but highly overestimated for WLS at all sample sizes. The p values are highly underestimated for WLS at all sample sizes. They are also underestimated for ULS, DWLS, and ML at N = 400 and N = 800 and slightly overestimated at N = 1,600.

    From these results it is clear that the chi-square for WLS does not work well for large models. The reason for this is probably the use of $W^{-1}$ in the formula. This requires very large samples to be estimated accurately. For Model 2 this matrix is of the order 120 × 120.
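    The size of this matrix follows from simple counting, sketched below in Python. The sketch assumes that the order of the weight matrix equals the number of distinct polychoric correlations, k(k - 1)/2 for k observed variables; the helper names are hypothetical. Under that assumption an order of 120 corresponds to 16 observed variables, and for the six-variable Model 1 the count of 15 correlations minus seven free parameters (six loadings plus, as a further assumption, one factor correlation) gives the 8 degrees of freedom reported in Table 3.

```python
def n_correlations(k):
    """Number of distinct correlations among k observed variables."""
    return k * (k - 1) // 2


def n_distinct_elements(order):
    """Number of distinct elements in a symmetric matrix of the given order."""
    return order * (order + 1) // 2


def variables_for_order(order, max_k=1000):
    """Smallest k with k(k - 1)/2 == order, or None if there is none up to max_k."""
    for k in range(2, max_k + 1):
        if n_correlations(k) == order:
            return k
    return None


print(n_correlations(6))         # correlations in the six-variable model
print(n_correlations(6) - 7)     # degrees of freedom with 7 free parameters (assumed count)
print(variables_for_order(120))  # number of variables implied by an order of 120
print(n_distinct_elements(120))  # distinct elements to estimate in a 120 x 120 symmetric matrix
```

    The last count, several thousand distinct elements, gives a rough sense of why a sample of a few hundred observations is too small to estimate the matrix and its inverse accurately.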


    FIGURE 13 Seven categories nonsymmetric: AMSESE as a function of degree of misspecification and sample size.

    The previous results are based on the case of seven nonsymmetric categories. To see if the number of categories and the shape of distribution have any effect on these results, consider Figure 14, showing the mean of chi-squares for Model 2 with no specification error as a function of sample size. It is seen that the mean remains fairly constant at about 98 for ULS, DWLS, and ML, whereas it is highly overestimated for WLS. These results were already clear from Table 4, but here we can also see that the number of categories and the shape of distribution have no effect on these results.

    Figure 15 shows the mean of chi-square as a function of the degree of misspecification for Model 2 estimated at N = 400. If the model is misspecified we do not expect chi-square to have a $\chi^2_d$ distribution. Figure 15 shows that the mean increases with increasing degree of misspecification.

    Figure 16 shows the corresponding results for p values. If the model is misspecified, one expects the p values to be less than .05 so that the model is rejected and one expects the p


    FIGURE 15 Model 2 estimated at N = 400: Average chi-square as a function of degree of misspecification.

    In sum, we draw the following conclusions based on our experimental design and associated findings.

    First, WLS performs poorly under all conditions compared to ULS, DWLS, and ML, although WLS performs better for the small model than for the large. Second, the number of categories and shape of distribution do not seem to matter. All methods work equally well for two, five, and seven categories. Third, in general, the differences between the ULS, DWLS, and ML methods are small over all conditions. The striking result is the good performance of ULS.

    ACKNOWLEDGMENT

    The research reported in this article has been supported by the Swedish Research Council (VR) under the program Structural Equation Modeling With Ordinal Variables.


    FIGURE 16 Model 2 estimated at N = 400: P values as a function of degree of misspecification.

    REFERENCES

    Bentler, P. M. (1995). EQS structural equations program manual. Encino, CA: Multivariate Software.

    Browne, M. W. (1984). Asymptotically distribution-free methods for the analysis of covariance structures. British Journal of Mathematical & Statistical Psychology, 37, 62-83.

    Curran, P. J., West, S. G., & Finch, J. F. (1996). The robustness of test statistics to nonnormality and specification error in confirmatory factor analysis. Psychological Methods, 1, 16-29.

    Dolan, C. V. (1994). Factor analysis of variables with 2, 3, 5, and 7 response categories: A comparison of categorical variable estimators using simulated data. British Journal of Mathematical & Statistical Psychology, 47, 309-326.

    Flora, D. B., & Curran, P. J. (2004). An empirical evaluation of alternative methods of estimation for confirmatory factor analysis with ordinal data. Psychological Methods, 9, 466-491.

    Jöreskog, K. G. (1969). A general approach to confirmatory maximum likelihood factor analysis. Psychometrika, 34, 183-202.

    Jöreskog, K. G. (1990). New developments in Lisrel: Analysis of ordinal variables using polychoric correlations and weighted least squares. Quality and Quantity, 24, 387-404.

    Jöreskog, K. G. (1994). On the estimation of polychoric correlations and their asymptotic covariance matrix. Psychometrika, 59, 381-389.

    Jöreskog, K. G. (2002-2005). Structural equation modeling with ordinal variables using Lisrel. Available at http://www.ssicentral.com/lisrel/techdocs/ordinal.pdf

    Jöreskog, K. G., & Sörbom, D. (1996a). LISREL 8: User's reference guide. Chicago, IL: Scientific Software International.

    Jöreskog, K. G., & Sörbom, D. (1996b). PRELIS 2: User's reference guide. Chicago, IL: Scientific Software International.

    Jöreskog, K. G., Sörbom, D., du Toit, S., & du Toit, M. (2003). LISREL 8: New statistical features. Chicago, IL: Scientific Software International.

    Kaplan, D. (1989). A study of the sampling variability and z-values of parameter estimates from misspecified structural equation models. Multivariate Behavioral Research, 24, 41-57.

    Lee, S., Poon, W., & Bentler, P. M. (1990). Full maximum likelihood analysis of structural equation models with polytomous variables. Statistics & Probability Letters, 9, 91-97.

    Magnus, J. R., & Neudecker, H. (1999). Matrix differential calculus with applications in statistics and econometrics (2nd ed.). Hoboken, NJ: Wiley.

    Muthén, B. O. (1984). A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators. Psychometrika, 49, 115-132.

    Muthén, B. O., du Toit, S., & Spisic, D. (1997). Robust inference using weighted least squares and quadratic estimating equations in latent variable modeling with categorical and continuous outcomes. Unpublished manuscript.

    Muthén, B. O., & Kaplan, D. (1985). A comparison of some methodologies for the factor analysis of non-normal Likert variables. British Journal of Mathematical & Statistical Psychology, 38, 171-189.

    Muthén, B. O., & Kaplan, D. (1992). A comparison of some methodologies for the factor analysis of non-normal Likert variables: A note on the size of the model. British Journal of Mathematical & Statistical Psychology, 45, 19-30.

    Muthén, L. K., & Muthén, B. O. (1998). Mplus user's guide. Los Angeles, CA: Muthén & Muthén.

    Olsson, U. (1979). Maximum likelihood estimation of the polychoric correlation coefficient. Psychometrika, 44, 443-460.

    Potthast, M. J. (1993). Confirmatory factor analysis of ordered categorical variables with large models. British Journal of Mathematical & Statistical Psychology, 46, 273-286.

    Satorra, A. (1989). Alternative test criteria in covariance structure analysis: A unified approach. Psychometrika, 54, 131-151.

    Satorra, A. (2003). Power of chi-square goodness-of-fit test in structural equation models: The case of nonnormal data. In H. Yanai, A. Okada, K. Shigemasu, Y. Kano, & J. J. Meulman (Eds.), New developments of psychometrics (pp. 57-68). Tokyo: Springer Verlag.

    Satorra, A., & Bentler, P. M. (1988). Scaling corrections for chi-square statistics in covariance structure analysis. In American Statistical Association, Proceedings of the Business and Economic Section, pp. 308-313.

    APPENDIX

    Assuming that the raw data on six ordinal variables are in the text file ordata.raw, one can use the following PRELIS syntax file to estimate the polychoric correlations and their asymptotic covariance matrix:

    DA NI=6
    RA=ORDATA.RAW
    OR ALL
    OU MA=PM PM=ORDATA.PM AC=ORDATA.ACP


    Model 1 can then be estimated by RML using the following LISREL syntax file:

    DA NI=6 MA=PM NO=400
    LA
    X1 X2 X3 X4 X5 X6
    CM=ORDATA.PM
    AC=ORDATA.ACP
    MO NX=6 NK=2
    FR LX(1,1) LX(2,1) LX(3,1) LX(4,2) LX(5,2) LX(6,2)
    CO TD(1)=1-LX(1,1)**2
    CO TD(2)=1-LX(2,1)**2
    CO TD(3)=1-LX(3,1)**2
    CO TD(4)=1-LX(4,2)**2
    CO TD(5)=1-LX(5,2)**2
    CO TD(6)=1-LX(6,2)**2
    OU ME=ML

    To obtain ULS or DWLS estimates, replace ML by ULS or DWLS in the last line.
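    When all three estimators are to be run on the same data, the substitution in the last line can be scripted. The following Python sketch only builds the syntax text for each method from the LISREL file shown above; the function name, the line breaks assumed in the template, and the idea of generating the files programmatically are illustrative assumptions and not part of the original study.

```python
# Template mirroring the Appendix syntax; only the estimation method varies.
LISREL_TEMPLATE = """DA NI=6 MA=PM NO=400
LA
X1 X2 X3 X4 X5 X6
CM=ORDATA.PM
AC=ORDATA.ACP
MO NX=6 NK=2
FR LX(1,1) LX(2,1) LX(3,1) LX(4,2) LX(5,2) LX(6,2)
CO TD(1)=1-LX(1,1)**2
CO TD(2)=1-LX(2,1)**2
CO TD(3)=1-LX(3,1)**2
CO TD(4)=1-LX(4,2)**2
CO TD(5)=1-LX(5,2)**2
CO TD(6)=1-LX(6,2)**2
OU ME={method}
"""

ALLOWED_METHODS = ("ML", "ULS", "DWLS")  # the methods named in the Appendix


def lisrel_syntax(method):
    """Return the Model 1 syntax text with the requested estimation method."""
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method!r}")
    return LISREL_TEMPLATE.format(method=method)


for method in ALLOWED_METHODS:
    # Show the last line of each generated file to confirm the substitution.
    print(lisrel_syntax(method).splitlines()[-1])
```

    Each returned string could then be saved to its own syntax file, with file names chosen by the user.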