Anonymous:
I. Specific Question: Fit Measures: Chi-Square and RMSEA
This week we learned about fit indices. There are different kinds of fit indices because there are different ways of measuring the discrepancy between two matrices: S and S^. For instance, we learned that the Chi-square index has a problem: the chi-square value is inflated as the sample size increases. For this reason, big discrepancies for small samples are not significant and small discrepancies for big sample sizes are significant. Given this situation, the RMSEA solves the problem by eliminating the effect of the sample size, since RMSEA “normalizes” the Chi-square index, right? However, Grimm and Yarnold (2000) mention that there is a second problem associated with the Chi-square: the index is also sensitive to the assumption of multivariate normality of the Y’s. Thus, the Chi-square index is also inflated when the normality assumption does not hold. So I was wondering whether this second problem is still present with the RMSEA index. Is the RMSEA robust to non-normality? Should this be a concern? Would larger sample sizes solve this second problem?
In addition, as I mentioned at the beginning, different indices evaluate fit in different ways. I assume that the indices we saw in class are the most relevant (e.g. Chi-Square, RMR, RMSEA, GFI), and that we would prefer the indices to converge. But what if the RMR is fairly good, the RMSEA is not so good, and the GFI is marginal: on which index should I base my conclusions concerning model fit? What would the decision rule be in this case? If one fit index is bad, should I conclude that the estimated covariance matrix is not good? My rationale is as follows: if there are different ways of measuring fit, then the model can be evaluated without subjectivity, that is, without relying on the good judgment of the person who is analyzing the model. Is my thinking correct?
Grimm, L. G., & Yarnold, P. R. (2000). Reading and understanding more multivariate statistics. Washington, DC: American Psychological Association.
You said, "For this reason, big discrepancies for small samples are not significant and small discrepancies for big sample sizes are significant. Given this situation, the RMSEA solves the problem by eliminating the effect of the sample size, since RMSEA “normalizes” the Chi-square index, right?"
Very good.
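To make the sample-size point concrete, here is a minimal sketch. It assumes the common textbook definitions chi-square = (N - 1) * F and RMSEA = sqrt(max(chi-square - df, 0) / (df * (N - 1))), where F is the minimized discrepancy between S and S^; some software uses N instead of N - 1. The values F = 0.3 and df = 10 are made up for illustration. The same discrepancy is held fixed while only N changes.

```python
import math

def chi_square(f_min, n):
    """Model chi-square for a fixed discrepancy f_min and sample size n."""
    return (n - 1) * f_min

def rmsea(chi2, df, n):
    """RMSEA point estimate (assumed textbook formula, see lead-in)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))

F_MIN = 0.3   # hypothetical discrepancy between S and S^, held fixed
DF = 10       # hypothetical model degrees of freedom

for n in (101, 1001, 10001):
    chi2 = chi_square(F_MIN, n)
    print(f"N={n:6d}  chi2={chi2:8.1f}  RMSEA={rmsea(chi2, DF, n):.4f}")
```

Under these assumptions the chi-square grows in proportion to N - 1, while the RMSEA settles toward sqrt(F / df). The estimate still changes a little at small N because of the df / (N - 1) correction, and nothing in this sketch addresses non-normality.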
Then you said, "Is the RMSEA robust to non-normality? Should this be a concern? Would larger sample sizes solve this second problem?"
Yes, non-normality is also a concern. I am going to de-emphasize it, not because it isn't important, but rather because I think there is only so much we can get across clearly at one time.
For now, I want you to understand what all these indices are, as clearly as possible, without the additional confounding concern about non-normality. Later, we will talk about polychoric correlation, which is a solution to the non-normality problem that results from discrete Likert-scale (1, 2, 3, 4, 5) questionnaire data.
The other major concern about non-normality is outliers, and we have already discussed that.
How to deal with nonnormality is still an area of ongoing research in SEM. Much of this research looks at the effects of nonnormally distributed factors and errors but still assumes additivity of the model. The additivity of the model is as questionable as nonnormality, or more so, so this line of research is incomplete. More needs to be done.
Larger sample sizes do not solve the problem of nonnormality here, because these are tests on variances, and these methods get the variance of the estimated variances wrong when normality is assumed. Tests of means, on the other hand, are robust to nonnormality because the methods typically get the variance of the estimated mean right, whether or not the data come from a normal process; also because the Central Limit Theorem implies that the estimated mean has an approximately normal distribution, regardless of the parent distribution.
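The contrast between means and variances can be checked with a small simulation. This is only a sketch under stated assumptions: the data are drawn from an exponential distribution with rate 1 (so the true variance is 1), chosen purely as a convenient non-normal example, and the replication count and seed are arbitrary. It compares the actual sampling variance of the sample mean and of the sample variance with what normal theory says they should be (sigma^2 / n and 2 * sigma^4 / (n - 1), respectively).

```python
import random

def mean_and_var(xs):
    """Sample mean and unbiased sample variance of a list of numbers."""
    n = len(xs)
    m = sum(xs) / n
    v = sum((x - m) ** 2 for x in xs) / (n - 1)
    return m, v

def variance_ratios(n, reps=2000, seed=0):
    """Ratio of the simulated sampling variance to the normal-theory value,
    for the sample mean and for the sample variance, with exponential data."""
    rng = random.Random(seed)
    means, variances = [], []
    for _ in range(reps):
        xs = [rng.expovariate(1.0) for _ in range(n)]  # non-normal parent
        m, v = mean_and_var(xs)
        means.append(m)
        variances.append(v)
    sigma2 = 1.0  # true variance of the exponential(1) distribution
    ratio_mean = mean_and_var(means)[1] / (sigma2 / n)
    ratio_var = mean_and_var(variances)[1] / (2 * sigma2 ** 2 / (n - 1))
    return ratio_mean, ratio_var

results = {n: variance_ratios(n) for n in (50, 200)}
for n, (ratio_mean, ratio_var) in results.items():
    print(f"n={n:4d}  mean: {ratio_mean:.2f}  variance: {ratio_var:.2f}")
```

A ratio near 1 means the normal-theory formula gets the sampling variance right. For this parent distribution the ratio for the mean should stay near 1, while the ratio for the variance should stay near (kurtosis - 1) / 2 = 4 at both sample sizes, which is the sense in which a larger n does not repair the problem.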
As far as making your case for model fit, I want you to have some idea what all these indices are doing, and to understand the logic of why "lower is better" or why "higher is better". For publication purposes, the best strategy is to look at your literature and see what others have been able to get away with. Certainly, the better all statistics are, the easier your task.
Realize too that no model, SEM or otherwise, can *ever* be stated to be exactly right. For example, people use regression all the time, even though linearity is violated (to some degree) in most applications, but linearity is rarely questioned. So the question isn't "is the model correct", but rather, "is the model reasonable."
A model is reasonable if the data produced by the model mimic real data. If so, then you can use the model to make predictions of external data, and to generalize.
100 100 70 90
II. General Question: Convergent and Discriminant Validity.
One of the topics of Thursday’s lecture was convergent and discriminant validity. If I understood correctly, our concern is to validate the constructs defined by each factor. Convergent validity shows the degree to which multiple measures (Y’s) of the same construct (factor) agree; that is, whether we can view certain Y’s as measures of F. Now, suppose that I have a survey with which I want to measure the work environment in terms of two latent factors: comfort and safety. My survey therefore has questions related to Y’s such as:
Comfort: number of complaints and number of breaks requested from the supervisor.
Safety: number of accidents at the workstation, number of deviations from safety procedures (not following procedures), and number of visits to the medical department due to small injuries.
In this sense, the survey is intended to be a measurement instrument for two constructs, comfort and safety, that will give me an idea of the state of the work environment. So, if I collect data from 50 workstations and test whether the measures are convergent or divergent, can I consider convergent validity a validation test for the survey? That is, can I validate that the questions are properly defined as measures of comfort and safety by using the concept of convergent validity? What I am trying to say is: would this convergent validity be an indicator of the accuracy of the measurement device (survey)?
By the same token, discriminant validity tests whether the constructs are really measuring different things, so the correlation among these factors should be close to zero. Right? Now, the first method we saw for testing convergent and discriminant validity was to look at the partitioned correlation matrix: high correlations in the diagonal submatrices indicate convergent validity and low correlations in the off-diagonal submatrices indicate discriminant validity. So I was wondering whether it makes sense to think about a partitioned matrix with low correlations in the diagonal submatrices and in the off-diagonal submatrices at the same time. I am thinking of the question of whether a glass of water is half empty or half full. Or can we say that convergent and discriminant validity are mutually exclusive events?
"What I am trying to say is: would this convergent validity be an indicator of the accuracy of the measurement device (survey)?"
Sure, it's just the reliability that we discussed earlier.
"By the same token, discriminant validity tests whether the constructs are really measuring different things, so the correlation among these factors should be close to zero."
Definitely NOT! That's why I said RELATIVELY smaller covariances. Say the correlations within are all .9 and the correlations between are all .8. There is clear discriminant validity in this case. Remember, the goal of the study is to estimate the strength of the relationship between the factors. So if those correlations are close to zero, then you have little relationship between factors. Sure, you may have discriminant validity, but you have no interesting main result of your analysis, so the whole project goes into the garbage can.
Then you said, "high correlations in the diagonal submatrices indicate convergent validity and low correlations in the off-diagonal submatrices indicate discriminant validity."
Again, "RELATIVELY LOWER," not "low".
Then you said,
"So I was wondering whether it makes sense to think about a partitioned matrix with low correlations in the diagonal submatrices and in the off-diagonal submatrices at the same time. I am thinking of the question of whether a glass of water is half empty or half full. Or can we say that convergent and discriminant validity are mutually exclusive events?"
This is confusing. Why couldn't all the correlations be small? If all the variables are independent, then they are all zero.
As far as mutually exclusive goes, "mutually exclusive" means that the one excludes the other, and no, they do not exclude each other. You can have high convergent validity and no discriminant validity. An example is the single-factor parallel model with high reliability that I indicated in class, with all the correlations equal to .9.
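The two cases mentioned in these replies can be laid out side by side. The sketch below is only illustrative: it builds hypothetical correlation matrices for the five measures in the question (two comfort items, three safety items; the labels and the helper names are assumptions) and compares the average within-factor correlation to the average between-factor correlation.

```python
from itertools import combinations

# Hypothetical item-to-factor assignment, following the survey in the question.
GROUPS = ["comfort", "comfort", "safety", "safety", "safety"]

def build_corr(groups, within, between):
    """Correlation matrix with one value inside factors and another across them."""
    p = len(groups)
    return [[1.0 if i == j else (within if groups[i] == groups[j] else between)
             for j in range(p)] for i in range(p)]

def mean_within_between(corr, groups):
    """Average correlation for same-factor pairs and for different-factor pairs."""
    same, different = [], []
    for i, j in combinations(range(len(groups)), 2):
        (same if groups[i] == groups[j] else different).append(corr[i][j])
    return sum(same) / len(same), sum(different) / len(different)

cases = {
    "within .9, between .8": build_corr(GROUPS, 0.9, 0.8),
    "all correlations .9": build_corr(GROUPS, 0.9, 0.9),
}
for label, corr in cases.items():
    w, b = mean_within_between(corr, GROUPS)
    print(f"{label}: within={w:.2f}  between={b:.2f}  gap={w - b:.2f}")
```

In the first case the between-factor correlations are high but still RELATIVELY lower than the within-factor ones, so the factors can be told apart. In the second case there is no gap at all, which corresponds to high convergent validity with no discriminant validity.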
100 90 70 90