<?xml version="1.0" encoding="UTF-8" ?>
<?xml-stylesheet type="text/xsl" href="http://tltc.ttu.edu/cs/utility/FeedStylesheets/rss.xsl" media="screen"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:wfw="http://wellformedweb.org/CommentAPI/"><channel><title>ISQS 6348 (Dr. Westfall)</title><link>http://tltc.ttu.edu/cs/forums/43.aspx</link><description /><dc:language>en</dc:language><generator>CommunityServer 2007 SP2 (Build: 20611.960)</generator><item><title>D47295184  I. latent variable for discrete data  II. SEM structural error</title><link>http://tltc.ttu.edu/cs/forums/thread/426.aspx</link><pubDate>Sat, 22 Nov 2008 12:10:46 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:426</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/426.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=426</wfw:commentRss><description>&lt;b style="mso-bidi-font-weight:normal;"&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;Specific Question:&lt;/font&gt;&lt;/font&gt;&lt;/b&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt; In this Tuesday’s class, we started &lt;i style="mso-bidi-font-style:normal;"&gt;latent variables for discrete data&lt;/i&gt; with a simple example in which Y represents success in a task and X represents experience. This gave a probit model, Y*=β0+β1 X +&lt;span style="mso-bidi-font-family:Arial;"&gt;ε, &lt;/span&gt;where Y* is the latent variable while X is not. You then showed us &lt;i style="mso-bidi-font-style:normal;"&gt;factor analysis with discrete data &lt;/i&gt;using the outsourcing example Yij= βj Fi +&lt;span style="mso-bidi-font-family:Arial;"&gt;εij. The FA model we got from this example is Yij*= &lt;/span&gt;βj Fi +&lt;span style="mso-bidi-font-family:Arial;"&gt;εij, where Yij* is a latent, continuous success propensity variable. 
Here, you emphasized that Fi is itself a latent variable, so there is no separate latent variable Fi*. I agree, but what confuses me is the simple example from the beginning of the class. In that &lt;/span&gt;probit model, Y*=β0+β1 X +&lt;span style="mso-bidi-font-family:Arial;"&gt;ε, X is not a latent variable, yet we did not write the model as &lt;/span&gt;Y*=β0+β1 X* +&lt;span style="mso-bidi-font-family:Arial;"&gt;ε. Why not? In what kinds of cases, when X is not a latent variable, do we need to replace it with a latent variable X* to match the latent variable Y*?&lt;/span&gt;&lt;b style="mso-bidi-font-weight:normal;"&gt;&lt;/b&gt;&lt;/font&gt;&lt;/font&gt; 
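&lt;p&gt;&lt;font face="Calibri" size="3"&gt;A minimal sketch of the probit setup described above, assuming ε is standard normal and Y = 1 whenever Y* is positive. Both are assumptions, as are the function names and the coefficient values. X enters as an observed number; only the discrete outcome Y has a latent counterpart Y*.&lt;/font&gt;&lt;/p&gt;

```python
import math

def normal_cdf(z):
    """Standard normal CDF, Phi(z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def probit_success_probability(x, beta0, beta1):
    """P(Y = 1 | X = x) when Y* = beta0 + beta1 * x + eps, eps ~ N(0, 1).

    Assumes Y = 1 exactly when the latent Y* is positive, so the
    probability equals Phi(beta0 + beta1 * x).
    """
    return normal_cdf(beta0 + beta1 * x)

# Hypothetical coefficients, chosen only for illustration.
for experience in (0, 1, 2):
    p = probit_success_probability(experience, beta0=-1.0, beta1=0.5)
    print(experience, round(p, 3))
```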
&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;"&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;&lt;b style="mso-bidi-font-weight:normal;"&gt;General Question:&lt;/b&gt; &lt;/font&gt;&lt;/font&gt;&lt;/p&gt;
&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;"&gt;&lt;font face="Calibri" size="3"&gt;My question refers to an article from &lt;/font&gt;&lt;a href="http://www2.gsu.edu/~mkteer/sem2.html"&gt;&lt;font face="Calibri" color="#800080" size="3"&gt;http://www2.gsu.edu/~mkteer/sem2.html&lt;/font&gt;&lt;/a&gt;&lt;font face="Calibri" size="3"&gt;. In the &lt;/font&gt;&lt;a class="" name="structerr"&gt;&lt;/a&gt;&lt;b style="mso-bidi-font-weight:normal;"&gt;&lt;font face="Calibri" size="3"&gt;Structural Error&lt;/font&gt;&lt;/b&gt;&lt;font face="Calibri" size="3"&gt; part, it says “To achieve consistent parameter estimation, these error terms are assumed to be uncorrelated with the model&amp;#39;s exogenous constructs. Violations of this assumption come about as a result of the &lt;b style="mso-bidi-font-weight:normal;"&gt;excluded predictor problem&lt;/b&gt;.” How should I understand the excluded predictor problem here? The article then says “However, structural error terms may be modeled as being correlated with other structural error terms. Such a specification indicates that the endogenous constructs associated with those error terms share common variation that is not explained by predictor relations in the model.” My question is: in the path diagram shown in the article, if zeta1 and zeta2 are correlated, can we say that there is reciprocal causation between the latent variables eta1 and eta2? My guess is no, because reciprocal causation is a relation between the latent variables themselves, not between their error terms. Is that right? &lt;/font&gt;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description></item><item><title>B24231599: I:Estimating SEM parameters II: Supply-Demand path diagram, instrumental variables</title><link>http://tltc.ttu.edu/cs/forums/thread/425.aspx</link><pubDate>Sat, 22 Nov 2008 07:12:58 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:425</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/425.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=425</wfw:commentRss><description>&lt;p class="MsoNormal" style="MARGIN:0cm 0cm 0pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;&lt;strong&gt;Specific Question&lt;/strong&gt;: The first part of my specific question concerns SEM models. Equation (8) in the “Structural Equations Models” document specifies the relationships between latent variables in the form of several equations. The paper discusses methods for estimating the parameters of the model, such as (1) matching the observed sample moments to the model’s moments, or (2) maximum likelihood solutions.&lt;/font&gt;&lt;/p&gt;&amp;nbsp; 
&lt;p class="MsoNormal" style="MARGIN:0cm 0cm 0pt;TEXT-INDENT:36pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;Assume for the sake of discussion that all the variables in equation (8) are observable. Then I suppose the only reason we do not use a simple OLS procedure to estimate the parameters (such as the αs and γs) is the endogenous relation between η&lt;sub&gt;1&lt;/sub&gt; and η&lt;sub&gt;2&lt;/sub&gt;. But if I can rearrange the equations in (8) so that the right-hand side contains only variables uncorrelated with the error term, then I see no reason not to use OLS estimates. Consider the following equations obtained from (8):&lt;/font&gt;&lt;/p&gt;
&lt;p class="MsoNormal" style="MARGIN:0cm 0cm 0pt;TEXT-INDENT:36pt;"&gt;&lt;font size="3"&gt;&lt;font face="Times New Roman"&gt;&lt;span style="mso-ansi-language:EN-US;"&gt;&lt;/span&gt;&lt;/font&gt;&lt;/font&gt;&amp;nbsp;&lt;/p&gt;&lt;font size="3"&gt;&lt;font face="Times New Roman"&gt;&lt;span style="mso-ansi-language:EN-US;"&gt;&amp;nbsp;&lt;/span&gt;&lt;/font&gt;&lt;/font&gt;&lt;span style="mso-ansi-language:EN-US;"&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt;&lt;/span&gt; 
&lt;p class="MsoNormal" style="MARGIN:0cm 0cm 0pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;η&lt;sub&gt;1&lt;/sub&gt; = α&lt;sub&gt;1&lt;/sub&gt; + γ&lt;sub&gt;11&lt;/sub&gt; ξ&lt;sub&gt;1&lt;/sub&gt; + γ&lt;sub&gt;12&lt;/sub&gt; ξ&lt;sub&gt;2&lt;/sub&gt; + ζ&lt;sub&gt;1&lt;/sub&gt;, &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; (1)&lt;/font&gt;&lt;/p&gt;&amp;nbsp; 
&lt;p class="MsoNormal" style="MARGIN:0cm 0cm 0pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;η&lt;sub&gt;2&lt;/sub&gt; = α&lt;sub&gt;2&lt;/sub&gt; + β&lt;sub&gt;21&lt;/sub&gt; η&lt;sub&gt;1&lt;/sub&gt; + ζ&lt;sub&gt;2&lt;/sub&gt;, &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; (2)&lt;/font&gt;&lt;/p&gt;&amp;nbsp;
&lt;p class="MsoNormal" style="MARGIN:0cm 0cm 0pt;TEXT-INDENT:36pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;Since Cov(&lt;b&gt;ξ′&lt;/b&gt;, &lt;b&gt;ζ&lt;/b&gt;) = 0 and there is no endogenous variable on the right-hand side of (1), OLS can be used on (1). Equation (2) contains the endogenous variable η&lt;sub&gt;1&lt;/sub&gt; on the RHS, but ζ&lt;sub&gt;2&lt;/sub&gt; is still uncorrelated with η&lt;sub&gt;1&lt;/sub&gt; because there is no η&lt;sub&gt;2&lt;/sub&gt; term on the RHS of (1). Therefore OLS can again be used to estimate the parameters.&lt;/font&gt;&lt;/p&gt;&amp;nbsp; 
&lt;p class="MsoNormal" style="MARGIN:0cm 0cm 0pt;TEXT-INDENT:36pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;Now consider both latent and observable variables. If I rearrange equations (8), (9), and (10) in the document so that the dependent observable variables are on the left-hand sides, and I can make the same argument as in the paragraph above to show that the error terms are uncorrelated with the variables on the right-hand sides, is that sufficient reason to use OLS estimates?&lt;/font&gt;&lt;/p&gt;&amp;nbsp; 
&lt;p class="MsoNormal" style="MARGIN:0cm 0cm 0pt;TEXT-INDENT:36pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;My second specific question is about assigning scales to latent variables to achieve identifiability. The author of the “Structural Equations Model” article contends that setting the mean of a latent variable to zero and its variance to one is not a good idea with panel or cross-sectional data. I am wondering what the alternative would be. Since the latent variable could have a different mean in different groups, does that mean we should scale the latent variable within each group?&lt;/font&gt;&lt;/p&gt;&amp;nbsp; 
&lt;p class="MsoNormal" style="MARGIN:0cm 0cm 0pt;TEXT-INDENT:36pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;&lt;strong&gt;General Question: &lt;/strong&gt;&lt;/font&gt;&lt;/p&gt;&lt;font face="Times New Roman" size="3"&gt;The first part of my general question is related to path diagrams. I am trying to draw a path diagram like Figure 1 in the “Structural Equation Models” document. Assume that the demand for houses in the aggregate economy depends on two variables, the price of a new house and the price of a substitute product (such as rental costs), so that&lt;br /&gt;Q&lt;sub&gt;t&lt;/sub&gt; = α&lt;sub&gt;1&lt;/sub&gt; + β P&lt;sub&gt;t&lt;/sub&gt; + γ R&lt;sub&gt;t&lt;/sub&gt; + ζ&lt;sub&gt;1t&lt;/sub&gt;&lt;/font&gt;&amp;nbsp; 
&lt;p class="MsoNormal" style="MARGIN:0cm 0cm 0pt;TEXT-INDENT:36pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;On the other hand, assume that the supply of housing depends on the price of new real estate and also on some measure of technology in house building:&lt;/font&gt;&lt;/p&gt;&lt;font face="Times New Roman" size="3"&gt;Q&lt;sub&gt;t&lt;/sub&gt; = α&lt;sub&gt;2&lt;/sub&gt; + β P&lt;sub&gt;t&lt;/sub&gt; + γ T&lt;sub&gt;t&lt;/sub&gt; + ζ&lt;sub&gt;2t&lt;/sub&gt;&lt;br /&gt;To draw the path diagram, I treat R and T as exogenous and P and Q as endogenous, since they affect each other. Does that mean the path diagram should have two arrows, one from P to Q and the other from Q to P?&lt;/font&gt;&amp;nbsp; 
&lt;p class="MsoNormal" style="MARGIN:0cm 0cm 0pt;TEXT-INDENT:36pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;My second general question is about instrumental variables. Econometricians use instrumental variables, which are highly correlated with the endogenous variables and uncorrelated with the errors, to overcome the endogeneity issue. Is it feasible to replace the observable variables in structural equation models with their respective IVs in order to get better estimates of the parameters? For example, in the “Structural Equations Models” document, equation (18) shows the moment matrix for the observable vector of variables &lt;b&gt;x&lt;sup&gt;*&lt;/sup&gt;&lt;/b&gt;, which is &lt;b&gt;x&lt;/b&gt; in this case. The moment matrix for the estimated instrumental variables of &lt;b&gt;x&lt;/b&gt; would have zero off-diagonal elements. How does that affect the identifiability of the model?&lt;/font&gt;&lt;/p&gt;&amp;nbsp;</description></item><item><title>H44134794, I. Principal components and factor analysis, II. Formative and reflective constructs</title><link>http://tltc.ttu.edu/cs/forums/thread/423.aspx</link><pubDate>Sat, 22 Nov 2008 06:01:07 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:423</guid><dc:creator>salee</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/423.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=423</wfw:commentRss><description>&lt;p&gt;&lt;strong&gt;1. Principal components and factor analysis&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In the factor analysis class, we learned that principal components are the same as the “one factor” factor analysis. In the last class, we also learned that principal components differ from the factors in factor analysis in that principal components are obtained as linear transformations of the manifest variables, whereas the factors are assumed to be pre-existing variables. Regarding the first statement, I am confused about the relation between the number of principal components in principal components analysis and the number of factors in factor analysis. Why are principal components the same as the “one factor” model? Because the number of principal components (PCs) varies with the number of variables, I would expect the number of PCs to match the number of factors in the factor analysis. However, the statement refers to only “one factor” for multiple PCs.&lt;/p&gt;
&lt;p&gt;Regarding the second statement, I think PCs and factors are the same in terms of how they are extracted. The PCs are linear combinations derived from the covariance matrix of the manifest variables. The factors in factor analysis are likewise derived from the covariance matrix of the observable variables. Thus, both PCs and factors are generated from a covariance matrix of observable variables. So, apart from the theoretical assumption that factors are pre-existing, what is the distinct difference between the two analyses? Furthermore, PROC FACTOR in SAS has an option, “method=prin”, for factor analysis. If PCs and factor analysis are different, is it reasonable to employ PCs in a factor analysis? In addition, can we call principal component analysis a model? Because a PC is just a linear transformation, it does not seem to give a model in the way regression does. &lt;/p&gt;
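&lt;p&gt;A small sketch of the point about the number of principal components, assuming two manifest variables with a hypothetical covariance matrix (the values and the function name are made up for illustration). Two variables give two principal components, and keeping only the first one explains part of the total variance, not all of it.&lt;/p&gt;

```python
import math

def principal_component_variances_2x2(var1, var2, cov12):
    """Eigenvalues of the 2x2 covariance matrix [[var1, cov12], [cov12, var2]].

    Each eigenvalue is the variance of one principal component,
    returned largest first. Closed form for the 2x2 case only.
    """
    mean = (var1 + var2) / 2.0
    half_gap = math.sqrt(((var1 - var2) / 2.0) ** 2 + cov12 ** 2)
    return [mean + half_gap, mean - half_gap]

# Hypothetical covariance matrix: unit variances, covariance 0.6.
eigenvalues = principal_component_variances_2x2(1.0, 1.0, 0.6)
share_first = eigenvalues[0] / sum(eigenvalues)
print(len(eigenvalues), [round(v, 3) for v in eigenvalues], round(share_first, 3))
```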
&lt;p&gt;&lt;strong&gt;2. Formative and reflective constructs&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;We have studied that the items or indicators used to measure latent variables are reflective. Does this mean that we can use only reflective variables in SEM? Or can we consider the two different kinds of constructs at the same time in SEM? I thought that in the following path, the relation between F1 and F2 is formative, and that both kinds of constructs can be used in an SEM model.&lt;br /&gt;Y1&lt;br /&gt;Y2&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; F1&amp;nbsp;&amp;nbsp; -&amp;gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; F2&lt;br /&gt;Y3&lt;/p&gt;
&lt;p&gt;Also, are there any rules for classifying variables as formative or reflective? Is this distinction fixed, or does it change according to how we look at the variables? &lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description></item><item><title>J80032900: I) Specific Question: Principle Component and Goodness of Fit II) General Question: combining the usage of factor analysis and manifest modeling.</title><link>http://tltc.ttu.edu/cs/forums/thread/422.aspx</link><pubDate>Sat, 22 Nov 2008 05:27:54 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:422</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/422.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=422</wfw:commentRss><description>&lt;p&gt;&amp;nbsp;I) Specific Question: Principal Components and Goodness of Fit&lt;br /&gt;In the lecture on Thursday, November 20th, you mentioned that the goal is not to explain all of the principal components; the goal is to explain as much information as possible with a few principal components. I note that there is no golden rule saying we have to use one or two principal components to explain the information, but if we pick only one or two principal components, will this cause the model to have a bad fit? During the same lecture you also mentioned that the R-squared statistics make latent variables preferred. Is that how we can justify the bad fit of principal component models?&lt;br /&gt;&lt;br /&gt;II) General Question: combining the usage of factor analysis and manifest modeling.&lt;br /&gt;At the beginning of the Thursday, November 20th lecture, you said that FA is more theory-based and that we cannot use the model to predict outcomes because we do not know what the factor is, whereas manifest modeling is more practical because the variables are manifest, so we can use the model to predict outcomes. Then you briefly mentioned combining the two methods in business applications. 
I think combining these two methods in business applications makes sense, because human behavior is unpredictable most of the time, and the theory that explains human behavior might help the manifest models produce outcomes that they could not arrive at on their own. I understand that if a company wants to perform FA before building manifest models, it will need to spend more time and money. If the company feels that this is worth it, what steps should it take?&lt;/p&gt;</description></item><item><title>K65419096 I. SEM Factor errors    II. Polychoric threshold clarification</title><link>http://tltc.ttu.edu/cs/forums/thread/421.aspx</link><pubDate>Fri, 21 Nov 2008 23:35:49 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:421</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/421.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=421</wfw:commentRss><description>I.&lt;span&gt;&amp;nbsp;
&lt;/span&gt;SEM Factor errors.&lt;span&gt;&amp;nbsp; &lt;/span&gt;If you make an
observation, there will be measurement error, so I understand the use of ε.&lt;span&gt;&amp;nbsp; &lt;/span&gt;But
if a factor is the underlying, unobservable truth, why would there be any error, δ?&lt;span&gt;&amp;nbsp;
&lt;/span&gt;At first I thought this applied only when one combined multiple factors, but
as I review my notes, whenever we use the schematic portrayal (a path diagram
with circles and squares), we have a δ
going into an F circle.&amp;nbsp;

&lt;p style="margin:0in 0in 0.0001pt;"&gt;&amp;nbsp;&lt;/p&gt;

II.&lt;span&gt;&amp;nbsp; &lt;/span&gt;In class on
Tuesday, the 18&lt;sup&gt;th&lt;/sup&gt;, we covered polychoric analysis using outsourcing,
the speaker survey, and a simulation as examples.&lt;span&gt;&amp;nbsp; &lt;/span&gt;To be sure I understand the concept: we are
taking discrete data and finding the thresholds as if the data were continuous,
correct?&lt;span&gt;&amp;nbsp; &lt;/span&gt;&lt;b&gt;Is each of the lines c1 and c2 a midpoint of the existing data, merely
allowing for values that fall between the allowable discrete data points?&lt;/b&gt;&lt;span&gt;&amp;nbsp; &lt;/span&gt;In the outsourcing example, there was little
chance of Y1=1 and Y2=0 (the lower right quadrant), given the positive tilt of
the ellipse for the underlying propensities to outsource, Y1* (R&amp;amp;D) and
Y2* (Service).&lt;span&gt;&amp;nbsp; &lt;/span&gt;So where the thresholds
are is not the important part in itself; they help us understand something only
in relation to the correlations, through the probabilities we calculate.
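&lt;p&gt;A minimal sketch of how a threshold is commonly obtained, assuming the latent propensity Y* is standard normal and Y = 0 whenever Y* falls below the threshold (the proportions and the function name are hypothetical). Under that assumption the threshold is the normal quantile of the observed share of Y = 0 responses rather than a midpoint of the data.&lt;/p&gt;

```python
from statistics import NormalDist

def threshold_from_proportion(proportion_below):
    """Threshold c on a standard normal Y* such that
    the probability of Y* falling below c equals proportion_below.

    proportion_below must lie strictly between 0 and 1.
    """
    return NormalDist().inv_cdf(proportion_below)

# Hypothetical shares of Y = 0 responses for two survey items.
c1 = threshold_from_proportion(0.30)
c2 = threshold_from_proportion(0.50)
print(round(c1, 3), round(c2, 3))
```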
</description></item><item><title>D66201843 I. Free Parameters &amp; SEM  II. Bollen’s SEM Paper-Respecification</title><link>http://tltc.ttu.edu/cs/forums/thread/420.aspx</link><pubDate>Fri, 21 Nov 2008 18:43:40 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:420</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/420.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=420</wfw:commentRss><description>&lt;b style="mso-bidi-font-weight:normal;"&gt;&lt;font size="3"&gt;&lt;font face="Times New Roman"&gt;I. Specific Question (Understanding Free Parameters with SEM)&lt;/font&gt;&lt;/font&gt;&lt;/b&gt;&lt;font face="Times New Roman" size="3"&gt;&lt;/font&gt;&amp;nbsp; 
&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;font size="3"&gt;&lt;font face="Times New Roman"&gt;In last week Tuesday’s lecture, we discussed free parameters with SEM examples.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;We discussed the two requirements which are 1) no more than m(m-1)/2 free parameters and 2) identifiable parameters.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;In trying to understand parameters and free parameters I have a few questions.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;It is my understanding that there are a certain number of parameters in the model and this is fixed?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Then there are a certain number of free parameters in the model and this is also fixed however, you don’t have to use all the free parameters?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Do you just want the right amount of free parameters that optimizes the model and provides the best fit?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;If you have too many parameters then it is over-parameterized or non-identifiable but if you have too few parameters then it is under-parameterized and there will be less fit?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Is this correct?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;&lt;/font&gt;&lt;/font&gt;&lt;/p&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt; 
&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;Then, based on the SEM Wikipedia entry from the class website, for the estimation of free parameters – is it correct that the parameter estimation compares the actual covariance matrices with the estimated covariance matrices to get the best fitting model?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Then, does the researcher determine how many free parameters to use or does SAS generate the best model?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;I believe in class, we had to specify the free parameters and then SAS produced the fits and we compared these with several different free parameters to get the best model fit?&lt;/font&gt;&lt;/p&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt;&lt;b style="mso-bidi-font-weight:normal;"&gt;&lt;font size="3"&gt;&lt;font face="Times New Roman"&gt;II. General Question&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp;&amp;nbsp; &lt;/span&gt;(Ken Bollen’s SEM Paper-Respecification) &lt;/font&gt;&lt;/font&gt;&lt;/b&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt; 
&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;font size="3"&gt;&lt;font face="Times New Roman"&gt;This question is in reference to Ken Bollen’s paper on Structural Equation Models (SEM) from the Encyclopedia of Biostatistics, pages 4363 – 5372 from the class website.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;In this paper he talks about doing an SEM and indicates the step in the process, with the last one being respecification.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;He said that you look at the overall fit and the fit for the components in the model.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;He indicates “it is not unusual to find that the initial model specification provides an inadequate match to the data.” So the researcher attempts to improve the model by respecification.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;He notes that the researcher should replicate the final model on an independent data set.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;So it would be my understanding that if this was applied to research, you have a dataset and theory and run an SEM and then end up respecifying your model but then is it true that you have to find another dataset and test your model?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Is this common in SEM for publication and if so, do you include both datasets but only the final model?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;I believe the value in testing your model on another dataset is so the model is valid and generalizable but I’m wondering about the costs involved?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;It seems to me the limited papers I have read that use SEM were only with one model and one dataset?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;&lt;/font&gt;&lt;/font&gt;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description></item><item><title>J68703397. I. Identification Rules for SEM II. RMSEA Issues</title><link>http://tltc.ttu.edu/cs/forums/thread/419.aspx</link><pubDate>Fri, 21 Nov 2008 15:18:54 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:419</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/419.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=419</wfw:commentRss><description>&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;I. SPECIFIC QUESTION.&amp;nbsp; IDENTIFICATION RULES FOR SEM&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;This question is related to the SEM examples provided by the UCLA website at &lt;a href="http://www.ats.ucla.edu/stat/sas/library/proc_calis.htm"&gt;http://www.ats.ucla.edu/stat/sas/library/proc_calis.htm&lt;/a&gt;.&amp;nbsp; In example 4, the path diagram involves 3 latent factors: F1 (mediator factor), F2 (endogenous factor) and F3 (exogenous factor).&amp;nbsp; Looking at the SAS code, I noticed that the variance of the exogenous factor is not constrained to one; instead, the code tells SAS to estimate the variance of F3 from the data (Std F3 = ef3).&amp;nbsp; From my standpoint, this code does not follow the rules of identification provided in class: 1) set variances to 1 for latent variables that are completely exogenous; and 2) allow correlations among exogenous variables. Am I right?&amp;nbsp; So I ran the code in SAS to see if I had a non-identifiability issue, which I shouldn&amp;#39;t have, since the number of free parameters is 3 (F3, D1 and D2, right?), and this does not exceed 3(3-1)/2 = 3 (where m=3).&amp;nbsp; The code used was the following (you may want to look at the attached file):&lt;/p&gt;
&lt;p&gt;&lt;b&gt;data&lt;/b&gt; power(TYPE=COV);&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; _type_ = &amp;#39;cov&amp;#39;; &lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; input _name_ $ v1-v6; &lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; datalines;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; v1&amp;nbsp;&amp;nbsp; 11.834&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; .&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; .&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; .&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; .&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; .&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; v2&amp;nbsp;&amp;nbsp;&amp;nbsp; 6.947&amp;nbsp;&amp;nbsp;&amp;nbsp; 9.364&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; .&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; .&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; .&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; .&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; v3&amp;nbsp;&amp;nbsp;&amp;nbsp; 6.819&amp;nbsp;&amp;nbsp;&amp;nbsp; 5.091&amp;nbsp;&amp;nbsp; 12.532&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; .&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; .&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; .&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; v4&amp;nbsp;&amp;nbsp;&amp;nbsp; 4.783&amp;nbsp;&amp;nbsp;&amp;nbsp; 5.028&amp;nbsp;&amp;nbsp;&amp;nbsp; 7.495&amp;nbsp;&amp;nbsp;&amp;nbsp; 9.986&amp;nbsp;&amp;nbsp;&amp;nbsp; .&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; .&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; v5&amp;nbsp;&amp;nbsp; -3.839&amp;nbsp;&amp;nbsp; -3.889&amp;nbsp;&amp;nbsp; -3.841&amp;nbsp;&amp;nbsp; -3.625&amp;nbsp;&amp;nbsp; 9.610&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; .&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; v6&amp;nbsp;&amp;nbsp; -2.189&amp;nbsp;&amp;nbsp; -1.883&amp;nbsp;&amp;nbsp; -2.175&amp;nbsp;&amp;nbsp; -1.878&amp;nbsp;&amp;nbsp; 3.552&amp;nbsp; 4.503&lt;/p&gt;
&lt;p&gt;;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;run&lt;/b&gt;; &lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;proc&lt;/b&gt; &lt;b&gt;calis&lt;/b&gt; cov data=power method = ml nobs = &lt;b&gt;932&lt;/b&gt;;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; lineqs&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; V1 =&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; F1 + E1,&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; V2 = &lt;b&gt;.833&lt;/b&gt; F1 + E2,&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; V3 =&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; F2 + E3,&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; V4 = &lt;b&gt;.833&lt;/b&gt; F2 + E4,&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; V5 =&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; F3 + E5,&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; V6 =&amp;nbsp;&amp;nbsp; a6 F3 + E6,&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; F1 =&amp;nbsp;&amp;nbsp; c1 F3 + D1,&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; F2 =&amp;nbsp;&amp;nbsp; c2 F1 + c3 F3 + D2;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;std&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; D1 - D2 = ed:,&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&amp;nbsp;&amp;nbsp; F3 = ef3,&amp;nbsp;&amp;nbsp;&amp;nbsp; /* HERE is my concern */&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; E1 = ee1,&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; e3 = ee1,&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; e2 = ee2,&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; e4 = ee2,&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; e5 = ee3,&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp; e6 = ee4;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;cov&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; E1 E3 = theta1,&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; E2 E4 = theta1;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;run&lt;/b&gt;;&lt;/p&gt;
&lt;p&gt;When I looked at the SAS results, I first checked if I had any warning message in the Log, and I got the following: &lt;/p&gt;
&lt;p&gt;WARNING: Shorter parameter list than variable list in STD statement.&amp;nbsp; The parameter list is&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp; filled up with 1 entries ed . .&lt;/p&gt;
&lt;p&gt;NOTE: ABSGCONV convergence criterion satisfied.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;I assumed that the warning appears because SAS didn&amp;#39;t like the specification in the STD statement, &amp;quot;D1 - D2 = ed:,&amp;quot;, right?&amp;nbsp; So I changed the code to &amp;quot;D1 - D2 = ed1-ed2,&amp;quot; and the message did not appear again.&amp;nbsp; Additionally, the Note says that the ABSGCONV convergence criterion is satisfied, which I take to mean that the model is identifiable. However, this ABSGCONV criterion seems to be different from the GCONV convergence criterion that we saw in class.&amp;nbsp; I am not sure whether this difference in convergence criterion represents a problem.&amp;nbsp; Should this be a concern?&amp;nbsp; Now, referring to the fit statistics, I got the following: RMR = 0.1507; Chi-Sq = 13.4764, DF = 9; and RMSEA = 0.0231.&amp;nbsp; This tells me that the model fits quite well.&amp;nbsp; Should I feel confident with this model even if the identification rules were not used?&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;I tried to answer that question for myself by changing the code to impose the restriction Var(F3) = 1.&amp;nbsp; That is, under the STD statement, I only changed &amp;quot;F3=ef3&amp;quot; to &amp;quot;FE=1&amp;quot;.&amp;nbsp; I expected better results, but instead I got this warning message: &amp;quot;&lt;b&gt;The central parameter matrix _PHI_ has probably 1 negative eigenvalue(s).&lt;/b&gt;&amp;quot;&amp;nbsp; This definitely looks like something is wrong, but I am not sure what it is.&amp;nbsp; After I got this message I looked at the fit statistics, and they were indeed worse: RMR = 1.3593, Chi-Squared = 178.2328, DF = 10, and RMSEA = 0.1344.&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Given the results, I can say that using the identification rules made the model fit worse! I am confused by these results and do not understand what is going on here: on the one hand, example 4 does not seem to follow the identification rules, and yet I got a good fit; on the other hand, when I constrained the model, I got worse results instead of better ones. Is this because in this example there is only one exogenous latent variable?&amp;nbsp; Under what conditions should I use the identification rules? I am assuming that, whatever the reason is, it has nothing to do with the fact that the code uses the covariance matrix as input.&amp;nbsp; Am I right?&lt;/p&gt;
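&lt;p&gt;The fit statistics quoted above can be cross-checked by hand. The Python sketch below counts degrees of freedom as the p(p+1)/2 distinct covariance elements minus the number of free parameters, and computes RMSEA from the chi-square using the N-1 form of the formula. The parameter tally (a6, c1, c2, c3, two disturbance variances, ef3, ee1 to ee4, theta1) is one reading of the LINEQS, STD and COV statements, not something SAS reports in this form, and the function names are hypothetical.&lt;/p&gt;

```python
import math


def degrees_of_freedom(n_observed, n_free):
    """Distinct variances and covariances minus free parameters."""
    return n_observed * (n_observed + 1) // 2 - n_free


def rmsea(chi_square, df, n_obs):
    """Point estimate of RMSEA using the (N - 1) form of the formula."""
    excess = max(chi_square - df, 0.0)
    return math.sqrt(excess / (df * (n_obs - 1)))


# Free parameters as tallied from the posted code (an assumption):
# paths a6, c1, c2, c3; variances ed1, ed2, ef3, ee1..ee4; covariance theta1.
free_original = 4 + 2 + 1 + 4 + 1
free_constrained = free_original - 1  # Var(F3) fixed at 1 instead of estimated

df_original = degrees_of_freedom(6, free_original)
df_constrained = degrees_of_freedom(6, free_constrained)

rmsea_original = rmsea(13.4764, df_original, 932)
rmsea_constrained = rmsea(178.2328, df_constrained, 932)
print(df_original, round(rmsea_original, 4))
print(df_constrained, round(rmsea_constrained, 4))
```

&lt;p&gt;Under these assumptions the sketch gives 9 and 10 degrees of freedom and RMSEA values that round to 0.0231 and 0.1344, in line with the SAS output quoted above. Note also that V5 = F3 + E5 already sets the scale of F3 through a unit loading, so fixing Var(F3) = 1 as well adds a restriction rather than only setting the scale, which may be part of why the chi-square rises.&lt;/p&gt;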
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;II. GENERAL QUESTION. RMSEA Issues&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;My questions are related to the article &amp;quot;An Empirical Evaluation of the Use of Fixed Cutoff Points in RMSEA Test Statistic in Structural Equation Models&amp;quot;.&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/p&gt;
&lt;p&gt;1) The RMSEA formula in the article involves the term &lt;i&gt;N-1&lt;/i&gt;, whereas the formula given in class involves &lt;i&gt;n&lt;/i&gt;.&amp;nbsp; I am assuming that &lt;i&gt;N&lt;/i&gt; and &lt;i&gt;n&lt;/i&gt; both stand for the sample size of the dataset.&amp;nbsp; As far as I understand, the choice between &lt;i&gt;N-1&lt;/i&gt; and &lt;i&gt;n&lt;/i&gt; depends on the type of covariance matrix we are using.&amp;nbsp; That is, we use &lt;i&gt;N-1&lt;/i&gt; if the COV or CORR matrix is analyzed, and we use &lt;i&gt;n&lt;/i&gt; if the &lt;i&gt;uncorrected&lt;/i&gt; correlation or covariance matrix is analyzed.&amp;nbsp; Is this correct?&amp;nbsp; However, I do not exactly see the difference between an uncorrected and a corrected correlation or covariance matrix. Does it have to do with standardization?&lt;/p&gt;
&lt;p&gt;2) According to my understanding, the RMSEA index is a standardized version of the Chi-Squared index.&amp;nbsp; We learned that the Chi-Squared gets larger as the sample size increases, and this is not a desirable behavior for a fit index, right?&amp;nbsp; Thus, we need fit indices that converge as the sample size increases.&amp;nbsp; I thought that the RMSEA solved this problem of the Chi-Squared by getting rid of the &amp;quot;n&amp;quot; in the RMSEA formula.&amp;nbsp; From that, I inferred that the RMSEA index is not sensitive to the sample size.&amp;nbsp; After reading the paper, I am not so sure.&amp;nbsp; The article mentions several times that the &lt;b&gt;RMSEA tends to over-reject with small sample sizes&lt;/b&gt;. So that means that it is sensitive to the sample size, right?&amp;nbsp; As a matter of fact, the authors mention that the RMSEA depends on the sample size, model misspecification and degrees of freedom.&amp;nbsp; So it seems that the RMSEA solves the problem of convergence, but a good interpretation of the RMSEA depends on the experience of the researcher.&amp;nbsp; So I wonder: whenever I have a small sample size, should I focus more on the Chi-Squared index, and with larger sample sizes should I focus more on the RMSEA index?&amp;nbsp; And then, how large does n need to be in order to be considered large?&lt;/p&gt;
&lt;p&gt;3) The authors also mention the conclusions of a study carried out by Nevitt and Hancock (2000).&amp;nbsp; These conclusions relate to the difficulty of evaluating a hypothesis test for misspecified models under nonnormal conditions.&amp;nbsp; If I understand this correctly, then the hypothesis H0: Σ = Σ(θ) may not be true if the manifest variables are not normally distributed and/or the estimated parameters are incorrect.&amp;nbsp; Is this right?&amp;nbsp; Then, it is not clear to me why the normality condition on the manifest variables is involved if the model being tested involves latent variables.&amp;nbsp; Isn&amp;#39;t the structure of Σ implied by the SEM model we are specifying?&amp;nbsp; Should we check the normality assumption before any attempt at SEM analysis?&lt;/p&gt;
&lt;p&gt;4) My last question concerns the method of analysis.&amp;nbsp; The authors used 84 experimental conditions and 500 replications for each condition.&amp;nbsp; Then they did hypothesis testing.&amp;nbsp; We learned about the problem with multiple hypothesis testing: the probability of committing at least one type I error increases.&amp;nbsp; Thus, the more tests I perform, the higher the probability of reaching a false conclusion.&amp;nbsp; False rejections are easily observed in the graphs shown in the article.&amp;nbsp; Shouldn&amp;#39;t this issue be considered?&amp;nbsp; To do that, should the cut-off values be adjusted in a way similar to the FDR_p-value or Bonferroni_p-value?&amp;nbsp; Does this make sense?&lt;/p&gt;
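&lt;p&gt;The sample-size point in question 2) can be illustrated numerically. The Python sketch below holds the minimized fit function value F fixed, so that the chi-square is (N-1)F, and rewrites RMSEA as sqrt(max(F/df - 1/(N-1), 0)). The value of F, the degrees of freedom and the sample sizes are hypothetical illustration values, and the function names are made up for this sketch.&lt;/p&gt;

```python
import math


def chi_square_from_fit(f_min, n_obs):
    """Chi-square statistic implied by a minimized fit function value."""
    return (n_obs - 1) * f_min


def rmsea_from_fit(f_min, df, n_obs):
    """RMSEA in terms of the fit function value F and sample size N."""
    return math.sqrt(max(f_min / df - 1.0 / (n_obs - 1), 0.0))


# Hypothetical fixed misfit and model size (illustration only).
f_min = 0.05
df = 9
sample_sizes = [100, 500, 1000, 10000]
chi_squares = [chi_square_from_fit(f_min, n) for n in sample_sizes]
rmseas = [rmsea_from_fit(f_min, df, n) for n in sample_sizes]
for n, chi2, value in zip(sample_sizes, chi_squares, rmseas):
    print(n, round(chi2, 2), round(value, 4))
```

&lt;p&gt;In this sketch the chi-square keeps growing in proportion to N-1, while RMSEA levels off just below sqrt(F/df). Sample size still enters RMSEA through the 1/(N-1) term, which matters most when N is small, so the index is less sensitive to sample size than the chi-square but not free of it.&lt;/p&gt;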
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description></item><item><title>D47295184  I. SAS-Comparing RMR with GFI  II. Convergent &amp; Discriminant Validity</title><link>http://tltc.ttu.edu/cs/forums/thread/409.aspx</link><pubDate>Sat, 08 Nov 2008 11:55:52 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:409</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/409.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=409</wfw:commentRss><description>&lt;b style="mso-bidi-font-weight:normal;"&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;Specific Question:&lt;/font&gt;&lt;/font&gt;&lt;/b&gt; 
&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;LINE-HEIGHT:normal;mso-layout-grid-align:none;"&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;In Tuesday’s class, we were given the formula sqrt( &lt;/font&gt;&lt;span style="FONT-FAMILY:Arial;"&gt;∑∑&lt;/span&gt;&lt;font face="Calibri"&gt;(Sij-Sijhat)^2/p(p+1)) where i&amp;lt;j to calculate the RMR. Here S is the observed covariance matrix and Shat is the fitted covariance matrix. Then we ran SAS code to compare RMR and GFI, but you said there might be a mistake in that code. After I looked into that code, I got an answer, but I don’t know whether it’s right or not. That part of my code is &lt;/font&gt;&lt;/font&gt;&lt;/p&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;&lt;span style="BACKGROUND:white;COLOR:black;mso-bidi-font-family:&amp;#39;Courier New&amp;#39;;mso-fareast-language:ZH-CN;"&gt;RMR1 = sqrt(( sum(R1#R1)) +trace(R1#R1) /&lt;/span&gt;&lt;b&gt;&lt;span style="BACKGROUND:white;COLOR:teal;mso-bidi-font-family:&amp;#39;Courier New&amp;#39;;mso-fareast-language:ZH-CN;"&gt;2&lt;/span&gt;&lt;/b&gt;&lt;span style="BACKGROUND:white;COLOR:black;mso-bidi-font-family:&amp;#39;Courier New&amp;#39;;mso-fareast-language:ZH-CN;"&gt;*(q1*(q1+&lt;/span&gt;&lt;b&gt;&lt;span style="BACKGROUND:white;COLOR:teal;mso-bidi-font-family:&amp;#39;Courier New&amp;#39;;mso-fareast-language:ZH-CN;"&gt;1&lt;/span&gt;&lt;/b&gt;&lt;span style="BACKGROUND:white;COLOR:black;mso-bidi-font-family:&amp;#39;Courier New&amp;#39;;mso-fareast-language:ZH-CN;"&gt;)) ) ;&lt;/span&gt;&lt;/font&gt;&lt;/font&gt; 
&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;LINE-HEIGHT:normal;mso-layout-grid-align:none;"&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;&lt;span style="BACKGROUND:white;COLOR:black;mso-bidi-font-family:&amp;#39;Courier New&amp;#39;;mso-fareast-language:ZH-CN;"&gt;RMR2 = sqrt( (sum(R2#R2)) +trace(R2#R2) /&lt;/span&gt;&lt;b&gt;&lt;span style="BACKGROUND:white;COLOR:teal;mso-bidi-font-family:&amp;#39;Courier New&amp;#39;;mso-fareast-language:ZH-CN;"&gt;2&lt;/span&gt;&lt;/b&gt;&lt;span style="BACKGROUND:white;COLOR:black;mso-bidi-font-family:&amp;#39;Courier New&amp;#39;;mso-fareast-language:ZH-CN;"&gt;*(q2*(q2+&lt;/span&gt;&lt;b&gt;&lt;span style="BACKGROUND:white;COLOR:teal;mso-bidi-font-family:&amp;#39;Courier New&amp;#39;;mso-fareast-language:ZH-CN;"&gt;1&lt;/span&gt;&lt;/b&gt;&lt;span style="BACKGROUND:white;COLOR:black;mso-bidi-font-family:&amp;#39;Courier New&amp;#39;;mso-fareast-language:ZH-CN;"&gt;)) ) ;&lt;/span&gt;&lt;/font&gt;&lt;/font&gt;&lt;/p&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;&lt;span style="BACKGROUND:white;COLOR:black;mso-bidi-font-family:&amp;#39;Courier New&amp;#39;;mso-fareast-language:ZH-CN;"&gt;&lt;/span&gt;&lt;/font&gt;&lt;/font&gt;&lt;span style="FONT-SIZE:11pt;FONT-FAMILY:Calibri;"&gt;After run this, RMR1= RMR2=0.3464102.&lt;/span&gt;&lt;span style="FONT-SIZE:11pt;FONT-FAMILY:Calibri;"&gt;Is this code right?&lt;/span&gt;&lt;b style="mso-bidi-font-weight:normal;"&gt;&lt;font face="Calibri" size="3"&gt;&amp;nbsp;&lt;/font&gt;&lt;/b&gt;&lt;b style="mso-bidi-font-weight:normal;"&gt;&lt;/b&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;&lt;b style="mso-bidi-font-weight:normal;"&gt;General Question:&lt;/b&gt; &lt;/font&gt;&lt;/font&gt;
&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;"&gt;&lt;font face="Calibri" size="3"&gt;After went trough the article on the web page (Convergent &amp;amp; Discriminant Validity), I am confused about the last part which says that”It does show that, as you predicted, the three self esteem measures seem to reflect the same construct (whatever that might be), the three locus of control measures also seem to reflect the same construct (again, whatever that is) and that the two sets of measures seem to be reflecting two different constructs (whatever they are).” But there is no simple answer to the question “How do we show that our measures are actually measuring self esteem or locus of control?” This article mentions some ideas to address this question, but my general idea has two steps. First, we need to find the correlation matrix of all the Y’s and group them into different measures based on how highly correlated they are and our prior-knowledge to get the construct validity (let’s say 3 measures). Second, we can perform EFA or CFA (if we know there are correlations between the measures) to conform what the measures might be (while, we can not be 100% sure about that). And also, if we cannot conduct a valid construct, then all the following procedures won’t make sense. Is my idea right?&lt;/font&gt;&lt;/p&gt;</description></item><item><title>H44134794, I. Model test, II. Too many model fit indices</title><link>http://tltc.ttu.edu/cs/forums/thread/408.aspx</link><pubDate>Sat, 08 Nov 2008 06:00:04 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:408</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/408.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=408</wfw:commentRss><description>&lt;p&gt;&lt;b&gt;1. 
Model test&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;The structural equation model is composed of a measurement model and the structure among constructs. Suppose the following model, in which F1 and F2 are constructs measured by the indicators Y1, Y2, Y3 and Y4, Y5, respectively. Also, F1 and F2 affect F3, and subsequently F3 affects F4.&lt;/p&gt;
&lt;p&gt;&lt;pre&gt;Y1 &amp;lt;-   F1 -----           
Y2 &amp;lt;-        ---&amp;gt;  F3   --&amp;gt;   F4
Y3 &amp;lt;-         --
             ----
Y4 &amp;lt;-    F2 ----
Y5 &amp;lt;-
&lt;/pre&gt;
&lt;p&gt;When testing this model with a survey, we usually ask questions that consider only two constructs at a time, such as F1toF3, F2toF3, and F3toF4. I think this test has a problem because it assumes that if F1toF3 and F2toF3 are true, then F3toF4 is true. We do not know whether F1 affects F4 and then F3. So why do we not ask a question such as “I think F1 and F2 affect F3, and then subsequently F4. 1 2 3 4 5 6 7 (7-point Likert scale)”?&lt;/p&gt;
&lt;p&gt;Also, if we assume that latent variables such as satisfaction exist, why don’t we ask respondents about the latent variables directly, rather than writing indicator questions and inferring the latent variables from them? Because respondents also understand what satisfaction is, I think it is better to ask about the latent variable directly. Sometimes it feels as if researchers treat respondents as inferior when administering questionnaires. Generally, most constructs are measured by items that rephrase the same content or context, so we can tell what the researcher is asking even though there are many items. If so, why do researchers ask the difficult questions and leave out the easy one?&lt;/p&gt;&lt;br /&gt;
&lt;p&gt;&lt;b&gt;2. Too many model fit indices&lt;/b&gt;&lt;br /&gt;&lt;br /&gt;In regression analysis, we need relatively few measures of model fit, such as the t-test and F-test. However, in structural equation modeling we see a lot of model fit indices, usually occupying a full page of the SAS output. Given that fit indices usually measure the difference between the observed data and the estimated data, why does SEM require so many of them? If we can apply the five assumptions of regression analysis to SEM, is it enough to use just a few tests such as the t-test and F-test? Also, why does SEM have no single statistical test of significance that identifies a correct model given the sample data? I can conjecture two reasons for the many model fit indices. One is that SEM is more complicated than the regression model. However, the complexity of SEM cannot be the reason, because a regression model can also be very complicated. The other is that small samples, violations of normality and independence, and estimation methods can affect the number of fit indices. However, these restrictions also apply to regression analysis. &lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description></item><item><title>J80032900 I) Conducting new research instead of using model modification. II) Factor Analysis for Pre-test and Post-test type survey.</title><link>http://tltc.ttu.edu/cs/forums/thread/407.aspx</link><pubDate>Sat, 08 Nov 2008 04:16:30 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:407</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/407.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=407</wfw:commentRss><description>&lt;p&gt;&amp;nbsp;Specific question: Conducting new research instead of using model modification.&lt;br /&gt;In Tuesday’s lecture you talked about model modification. The two main points I took from this part are that if we use modification indices we are going back from CFA to EFA, and that if we want to modify the model we need a theory-based explanation to justify the modification. These two points make me wonder whether we can use the information from the modification indices to conduct new research, because you said EFA can be used to help formulate better research for CFA. I ask because we may not have a good theory to back up the modification we want to make to the model, but we might find a valid theory to justify it after further investigation of the issue. If this is an option, what is the limit? I am aware that this process could become a never-ending loop that wastes the researcher’s time, effort, and financial resources.&lt;br /&gt;&lt;br /&gt;General question: Factor Analysis for Pre-test and Post-test type survey.&lt;br /&gt;If a pre-test and post-test type survey was conducted, how can I conduct factor analysis on the results? Here is some background on the survey. 
The main concern of the survey is the students’ understanding of business ethics over the semester. Students from a freshman-level class answered a survey containing the same questions twice during the semester: the first survey was given on the first day of class, and the second one week after the instructor went over materials about ethics. &lt;br /&gt;Here are some of my questions. Is it possible for me to get two different or unrelated sets of factors for the pre-test and post-test results? If I get two different sets of factors, can I relate a factor from the pre-test result to a factor from the post-test result? If my concern is whether the students have a better understanding of the issue (ethics) over the semester, would regression be a better option? If that is the case, how can FA help me understand the students’ understanding of ethics from the pre-test and post-test surveys?&lt;br /&gt;&lt;br /&gt;&lt;/p&gt;</description></item><item><title>K65419096 I. Parameters into GFI? II. Divergent Validity</title><link>http://tltc.ttu.edu/cs/forums/thread/406.aspx</link><pubDate>Fri, 07 Nov 2008 18:04:14 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:406</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/406.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=406</wfw:commentRss><description>I.&lt;span&gt;&amp;nbsp;
&lt;/span&gt;When we discussed the goodness-of-fit measures on Tuesday, for the GFI
measure we never discussed numbers of parameters.&lt;span&gt;&amp;nbsp; &lt;/span&gt;I understand the idea of GFI being a way to
solve one of the problems of the RMR because the squaring of the differences
weighs the differences which are near the upper end of correlation more heavily
than those at the lower end.&lt;span&gt;&amp;nbsp; &lt;/span&gt;This is
very slick and good, but it leaves out the number of parameters, which RMR
includes, and raises a question in my mind.&lt;span&gt;&amp;nbsp;
&lt;/span&gt;It seems to me that if a model has a lot of parameters then the trace
will include more numbers and thus be impacted differently than would a model
with fewer parameters and fewer figures in the trace.&lt;span&gt;&amp;nbsp; &lt;/span&gt;Even if the differences are very slight, by
the time you add more and more small numbers, you are more likely to get
something which could total to a significant amount; whereas when you only have
a few small numbers, you may remain with an insignificant total.&lt;span&gt;&amp;nbsp; &lt;/span&gt;Am I looking at this properly?

&lt;p style="margin:0in 0in 0.0001pt;"&gt;&amp;nbsp;&lt;/p&gt;

&lt;p class="MsoNormal"&gt;II.&lt;span&gt;&amp;nbsp; &lt;/span&gt;In Thursday’s
lecture about divergent validity, you said that we want the off-diagonal &lt;span style="font-family:Symbol;"&gt;S&lt;/span&gt;AB and &lt;span style="font-family:Symbol;"&gt;S&lt;/span&gt;BA
to be relatively smaller than &lt;span style="font-family:Symbol;"&gt;S&lt;/span&gt;AA and &lt;span style="font-family:Symbol;"&gt;S&lt;/span&gt;BB, though still large.&lt;span&gt;&amp;nbsp; &lt;/span&gt;I understand why we don’t want them so large
that A and B appear to be one factor and not two, but why would we want them to be
anything but as small as possible?&lt;span&gt;&amp;nbsp; &lt;/span&gt;Were
you just setting us up for the following example when we have a situation with
such high &lt;span style="font-family:Symbol;"&gt;S&lt;/span&gt;AB and &lt;span style="font-family:Symbol;"&gt;S&lt;/span&gt;BA that we would need to conduct a &lt;span style="font-family:Symbol;"&gt;c&lt;/span&gt;2 test to test for divergent validity?&lt;span&gt;&amp;nbsp; &lt;/span&gt;You are trying to show that your Ys are
related to Factor A and not to Factor B (or vice versa), so why would you not
want the lowest possible &lt;span style="font-family:Symbol;"&gt;S&lt;/span&gt;AB and &lt;span style="font-family:Symbol;"&gt;S&lt;/span&gt;BA?&lt;/p&gt;


</description></item><item><title>D66201843 I. Model and Fit Measures  II. EFA, CFA &amp; SEM - Applicability to Research</title><link>http://tltc.ttu.edu/cs/forums/thread/405.aspx</link><pubDate>Fri, 07 Nov 2008 17:55:32 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:405</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/405.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=405</wfw:commentRss><description>&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;b style="mso-bidi-font-weight:normal;"&gt;&lt;font size="3"&gt;&lt;font face="Times New Roman"&gt;I. Specific Question (Model and Understanding Fit Measures)&lt;/font&gt;&lt;/font&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;This week we discussed four fit measures, where a fit measure indicates how well the model fits the data and we discussed this in the context of CFA.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;I have two questions regarding fit measures.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;First, do we look at how well the loadings load on the various factors in our model first and then look at the fit measures to determine if we have a good model?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;It would seem to me that if the loadings don’t seem to load high on any factor and load on most factors then we don’t have a very good model and looking at the fit measures wouldn’t be necessary or hopefully would only confirm this?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Is this correct?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Second, for the various fit measures discussed in class, it would seem to me that GFI and RMSEA are the best to use but that RMR is not that good because of the two problems indicated in class: 1) it is not clear from the covariance matrices what large or small values should be or what they mean and 2) different sizes of correlations should be given different weights.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Is this correct?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Further, when would someone want to use the Chi Squared because it doesn’t seem to be a good measure because of the dependency with sample size?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Do people just throw this in because if they have a large sample size, then the number looks good? 
&lt;/font&gt;&lt;/p&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt;&lt;font face="Times New Roman" size="3"&gt;&lt;/font&gt;&amp;nbsp;&lt;b style="mso-bidi-font-weight:normal;"&gt;&lt;font size="3"&gt;&lt;font face="Times New Roman"&gt;II. General Question&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp;&amp;nbsp; &lt;/span&gt;(EFA, CFA and SEM – Putting it all together for research) &lt;/font&gt;&lt;/font&gt;&lt;/b&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt;&lt;font size="3"&gt;&lt;font face="Times New Roman"&gt;Thus far, we have discussed EFA, CFA and SEM.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;EFA is exploring the data to determine how many factors we may have in our data.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;CFA is estimating the correlation between the factors or latent variables based on theory and SEM is estimating the relationship or making an argument about the data based on theory because we can’t really get at cause.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;But when I think of all three of these together, it would seem to me that there could be an over-use of analysis on the data, or that &lt;i style="mso-bidi-font-style:normal;"&gt;data-snooping&lt;/i&gt; could be an issue here?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Is there a threat for this or are these methods just intended for one to gain a better understanding of the data?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;&lt;/font&gt;&lt;/font&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt; 
&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;font size="3"&gt;&lt;font face="Times New Roman"&gt;Further, when I think of these items, I also think of &lt;i style="mso-bidi-font-style:normal;"&gt;deductive&lt;/i&gt; versus &lt;i style="mso-bidi-font-style:normal;"&gt;inductive reasoning&lt;/i&gt; (from a research methods class I took) where deductive reasoning is &lt;span style="mso-ansi-language:EN;"&gt;&lt;a title="Reasoning" href="http://en.wikipedia.org/wiki/Reasoning"&gt;&lt;font color="#22229c"&gt;reasoning&lt;/font&gt;&lt;/a&gt; which uses deductive &lt;a title="Argument" href="http://en.wikipedia.org/wiki/Argument"&gt;&lt;font color="#22229c"&gt;arguments&lt;/font&gt;&lt;/a&gt; to move from given statements (&lt;a title="Premise" href="http://en.wikipedia.org/wiki/Premise"&gt;&lt;font color="#22229c"&gt;premises&lt;/font&gt;&lt;/a&gt;) to &lt;a title="Conclusion" href="http://en.wikipedia.org/wiki/Conclusion"&gt;&lt;font color="#22229c"&gt;conclusions&lt;/font&gt;&lt;/a&gt; whereas inductive reasoning reasons from a large number of particular examples to a general rule (both definitions obtained from Wikipedia).&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;It would seem to me that data-snooping and inductive reasoning could be very similar here because if we run several analyses on the data and come up with a pattern this might be considered inductive reasoning vs. data snooping.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;&lt;/span&gt;&lt;/font&gt;&lt;/font&gt;&lt;/p&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt; 
&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;font size="3"&gt;&lt;font face="Times New Roman"&gt;For example, let’s say I have an interest in accounting research and I have gained access to a special charitable contributions dataset.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;I have read the literature and understand what the concepts are related to charitable contributions and have developed some thoughts on what I would expect to see in the dataset.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;So then I run EFA and play with the factors to determine how many factors I feel are appropriate based on looking at the factor loadings.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Then I feel that three factors are important so I run CFA on the dataset with these three factors and look at the fit measures and determine my three factors appear appropriate.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Then I run SEM on the dataset to determine the relationship of the factors to make an argument for how I think things work.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Is this how it usually works?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Because I could see where there is room here for a researcher to change their mind.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;For example, if they find that in EFA there are other factors or if in CFA the factors are too highly correlated or there doesn’t appear to be convergent or discriminant validity, they might try other things until something works.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Would this be data-snooping or inductive reasoning?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;I’m thinking it could be inductive reasoning because maybe the researcher has come up with something new to add to current theory?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; 
&lt;/span&gt;&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/font&gt;&lt;/font&gt;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description></item><item><title>J68703397.   I. Fit Measures: Chi-Square and RMSEA II. Convergent and Discriminant Validity</title><link>http://tltc.ttu.edu/cs/forums/thread/404.aspx</link><pubDate>Fri, 07 Nov 2008 16:40:06 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:404</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/404.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=404</wfw:commentRss><description>&lt;p&gt;&lt;b&gt;I. Specific Question:&amp;nbsp;&amp;nbsp;Fit Measures: Chi-Square and RMSEA&lt;/b&gt;&lt;/p&gt;
&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;TEXT-ALIGN:justify;"&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;This week we learned about Fit Indices.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;There are different kinds of fit indices since there are different ways of measuring discrepancies between two matrices: S and S^.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;For instance, we learned that the Chi-square index has a problem: the chi-square value is inflated as the sample size increases.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp;&lt;/span&gt;For this reason, big discrepancies in small samples may not be significant, while small discrepancies in big samples may be significant.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp;&amp;nbsp; &lt;/span&gt;Given this situation, the RMSEA solves the problem by eliminating the effect of the sample size, since RMSEA “normalizes” the Chi-square index, right?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;However, Grimm and Yarnold (2000) mention that there is a second problem associated with the Chi-square:&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;the index is also sensitive to the assumption of multivariate normality of the Y’s.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Thus, the Chi-square index is also inflated when the normality assumption does not hold.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;So I was wondering whether this second problem is still present with the RMSEA index.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Is the RMSEA robust to non-normality? Should this be a concern? &lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp;&lt;/span&gt;Would larger sample sizes solve this second problem? &lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp;&amp;nbsp;&lt;/span&gt;&lt;/font&gt;&lt;/font&gt;&lt;/p&gt;
&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;TEXT-ALIGN:justify;"&gt;&lt;font face="Calibri" size="3"&gt;In addition and as I mentioned at the beginning, different indices evaluate different ways of fit.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;And I assume that the indices we saw in class are the most relevant (e.g. Chi-Square, RMR, RMSEA, GFI), and we may have a higher preference of convergent indices.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;But what about if we have the case in which RMR is fairly good, the RMSEA is not so good, and the GFI has a marginal value, in which index should I base my conclusions concerning model fit?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;What would the decision rule be in this case?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;If one fit index is bad, should I conclude that the estimated-covariance matrix is not good?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp;&amp;nbsp; &lt;/span&gt;My rationale is as follows: if there are different ways of measuring fitness, then the model can be evaluated without subjectivity or without relying on the good judgment of the person who is analyzing the model. Is my thinking correct?&lt;/font&gt;&lt;/p&gt;
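&lt;p&gt;The sample-size point above can be sketched numerically. This assumes the common textbook forms chi-square = (N - 1) * F and RMSEA = sqrt(max(chi-square - df, 0) / (df * (N - 1))), where F is the minimized discrepancy; some software uses N instead of N - 1. The values of F and df are hypothetical, and the sketch says nothing about the normality question.&lt;/p&gt;
&lt;pre&gt;
```python
import math

# Hypothetical inputs: a fixed amount of misfit evaluated at several N.
F_MIN = 0.10  # assumed minimized discrepancy function value
DF = 8        # assumed model degrees of freedom


def chi_square(f_min, n):
    """Model chi-square under the assumed form (N - 1) * F."""
    return (n - 1) * f_min


def rmsea(chi2, df, n):
    """RMSEA under the assumed form; truncated at zero."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


results = {}
for n in (50, 200, 1000):
    chi2 = chi_square(F_MIN, n)
    results[n] = (chi2, rmsea(chi2, DF, n))
    # chi-square keeps growing with N; RMSEA levels off near sqrt(F / df).
    print(n, round(chi2, 2), round(results[n][1], 4))
```
&lt;/pre&gt;
&lt;p&gt;With the same misfit, the chi-square grows in proportion to N - 1, while the RMSEA approaches a fixed limit, which is the sense in which it is less affected by sample size.&lt;/p&gt;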
&lt;p&gt;
&lt;p&gt;Grimm, L.G., &amp;amp; Yarnold P. R. (2000). &lt;i&gt;Reading and Understanding more Multivariate Statistics&lt;/i&gt;. Washington, DC: American Psychological Association.&lt;/p&gt;&amp;nbsp;&lt;b style="mso-bidi-font-weight:normal;"&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;II.General Question: Convergent and Discriminant Validity.&lt;/font&gt;&lt;/font&gt;&lt;/b&gt; 
&lt;p&gt;&lt;font face="Calibri" size="3"&gt;One of the topics of Thursday’s lecture was Convergent and Discriminant Validity.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp;&amp;nbsp; &lt;/span&gt;If I got the right idea, our concern is to validate the constructs defined by each factor.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Convergent validity shows the degree in which multiple measures (Y’s) of the same construct (Factor) demonstrate an agreement. &lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp;&amp;nbsp;&lt;/span&gt;That is, if we can view certain Y’s as a measure of F.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp;&amp;nbsp; &lt;/span&gt;Now, suppose that I have a survey in which I want to measure the &lt;i style="mso-bidi-font-style:normal;"&gt;work environment&lt;/i&gt; into two different latent factors: &lt;i style="mso-bidi-font-style:normal;"&gt;comfort&lt;/i&gt; and &lt;i style="mso-bidi-font-style:normal;"&gt;safety&lt;/i&gt;.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Thus, my survey has questions related to Y’s such as:&lt;/font&gt;&lt;/p&gt;
&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;TEXT-ALIGN:justify;"&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;&lt;b style="mso-bidi-font-weight:normal;"&gt;Comfort:&lt;/b&gt;&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;number of complaints and number of breaks requested from the supervisor.&lt;/font&gt;&lt;/font&gt;&lt;/p&gt;
&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;TEXT-ALIGN:justify;"&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;&lt;b style="mso-bidi-font-weight:normal;"&gt;Safety:&lt;/b&gt; number of accidents in work station, number of discrepancies of safety procedures (not following procedures), and number of visits to medical department due to small injuries.&lt;/font&gt;&lt;/font&gt;&lt;/p&gt;
&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;TEXT-ALIGN:justify;"&gt;&lt;font face="Calibri" size="3"&gt;In this sense, the survey is intended to be a measurement instrument for two constructs, comfort and safety, that will give me an idea of the work environment status.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;So, if I collect data from 50 work stations and test whether it is convergent or divergent, can I consider convergent validity as a validation test for the survey? That is, can I validate the proper definition of questions as a measure of comfort and safety by using the concept of convergent validity?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp;&amp;nbsp;&lt;/span&gt;What I am trying to say is: would this convergent validity be an indicator of the accuracy of the measurement device (survey)?&lt;/font&gt;&lt;/p&gt;
&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;TEXT-ALIGN:justify;"&gt;&lt;font face="Calibri" size="3"&gt;By the same token, discriminant validity tests if the constructs are really measuring different things, then the correlation among these factors should be close to zero.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Right?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Now, the method for testing Convergent and Discriminant Validity (DV) that we first saw, was by looking at the correlations of the matrix: high correlations in the diagonals matrices indicate convergent validity and low correlations off-diagonal matrices indicate discriminant validity.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;So I was wondering if it makes sense to think about a partitioned matrix with low correlations in the diagonals-matrices and in the off-diagonals matrices at the same time?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp;&amp;nbsp; &lt;/span&gt;I am thinking about the case of looking a glass of water if it is half empty or half full.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp;&amp;nbsp; &lt;/span&gt;Or can we say that the convergent and discriminant are mutually exclusive events?&lt;/font&gt;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description></item><item><title>D47295184  I. Questions about a paper on the class web page  II. Model fit indices</title><link>http://tltc.ttu.edu/cs/forums/thread/391.aspx</link><pubDate>Sat, 01 Nov 2008 09:26:46 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:391</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/391.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=391</wfw:commentRss><description>&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;"&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;&lt;b style="mso-bidi-font-weight:normal;"&gt;Specific Question:&lt;/b&gt; &lt;/font&gt;&lt;/font&gt;&lt;/p&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;After going through the paper &lt;i style="mso-bidi-font-style:normal;"&gt;&lt;span style="mso-bidi-font-weight:bold;mso-bidi-font-family:Arial;"&gt;Basic&lt;/span&gt;&lt;/i&gt;&lt;span style="mso-bidi-font-weight:bold;mso-bidi-font-family:Arial;"&gt; &lt;i style="mso-bidi-font-style:normal;"&gt;Concepts and Procedures of Confirmatory Factor Analysis&lt;/i&gt; on the class web page several times, I have the following questions.&lt;/span&gt;&lt;/font&gt;&lt;/font&gt; 
&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;"&gt;&lt;font size="3"&gt;&lt;span style="mso-bidi-font-weight:bold;mso-fareast-language:ZH-CN;mso-no-proof:yes;"&gt;①&lt;/span&gt;&lt;font face="Calibri"&gt;&lt;span style="mso-bidi-font-weight:bold;mso-bidi-font-family:Arial;"&gt; It says in the first part (&lt;/span&gt;&lt;em&gt;Confirmatory Factor Analysis&lt;/em&gt;&lt;span style="mso-bidi-font-weight:bold;"&gt;):&lt;/span&gt; Factor analysis is a generic term that we use to describe a number of methods designed to analyze interrelationships within a set of variables or objects…&lt;/font&gt;&lt;/font&gt;&lt;/p&gt;
&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;"&gt;&lt;font face="Calibri" size="3"&gt;In class until now, we have covered two different factor analysis methods: EFA and CFA. I previously thought there were only two methods of factor analysis. However, after reading “a number of methods” in the paper, I used Google to search for other methods and found no answers. I am curious whether there are other factor analysis methods.&lt;/font&gt;&lt;/p&gt;&lt;span style="FONT-SIZE:11pt;mso-no-proof:yes;"&gt;②&lt;/span&gt;&lt;span style="FONT-SIZE:11pt;FONT-FAMILY:Calibri;"&gt; It says (in &lt;span&gt;&lt;em&gt;Exploratory Factor Analysis&lt;/em&gt; part&lt;/span&gt;): The determination as to which form to use in an analysis is made based on the purpose of the &lt;u&gt;data analysis&lt;/u&gt;. Exploratory factor analysis is used to explore data to determine the number or the nature of factors that account for the covariation between variables when the researcher does not have, a priori, sufficient evidence to form a hypothesis about the number of factors underlying the data. Therefore, exploratory factor analysis is generally thought of as more of a theory-generating procedure as opposed to a theory-testing procedure.&lt;/span&gt;&lt;span style="FONT-SIZE:11pt;FONT-FAMILY:Calibri;"&gt; I know that when performing EFA in SAS, we can use the PC model as the default to get the number of factors that SAS will give us. But if we use the ML method, we need to set the number of factors ourselves. 
So, can I conclude that when we believe that there are 3 factors for example, and the correlations of the factors are not clear, we can use both EFA-ML and CFA and then use the fit statistic to choose the better fit one?&lt;/span&gt;&lt;span style="FONT-SIZE:11pt;mso-no-proof:yes;"&gt;③&lt;/span&gt;&lt;span style="FONT-SIZE:11pt;FONT-FAMILY:Calibri;"&gt; It says (in &lt;span&gt;&lt;em&gt;Criticisms of exploratory factor analysis&lt;/em&gt;&lt;strong&gt; &lt;/strong&gt;part&lt;/span&gt;): In a practical sense, there is no question that exploratory factor analysis serves a useful purpose in suggesting hypotheses for further research.&lt;/span&gt;&lt;span style="FONT-FAMILY:Calibri;"&gt;&lt;font size="3"&gt;Here, does the “suggesting hypotheses” mean the number of factors we can get from the EFA model?&lt;/font&gt;&lt;/span&gt;&lt;span style="FONT-SIZE:11pt;mso-no-proof:yes;"&gt;④&lt;/span&gt;&lt;span style="FONT-SIZE:11pt;FONT-FAMILY:Calibri;"&gt; It says (in &lt;span&gt;&lt;em&gt;Confirmatory Factor Analysis&lt;/em&gt; part): &lt;/span&gt;In addition, confirmatory factor analysis offers the researcher a more viable method for evaluating construct validity. The researcher is able to explicitly test hypotheses concerning the factor structure of the data due to having the predetermined model specifying the number and composition of the factors.&lt;/span&gt;&lt;span style="FONT-SIZE:11pt;FONT-FAMILY:Calibri;"&gt;My understanding of this sentence is: we can use the fit statistics to test if the predetermined model we set is good fitted. Am I right? 
&lt;/span&gt;&lt;span style="FONT-SIZE:11pt;mso-no-proof:yes;"&gt;⑤&lt;/span&gt;&lt;span style="FONT-SIZE:11pt;FONT-FAMILY:Calibri;"&gt; It says (in &lt;span&gt;&lt;em&gt;Interpreting Confirmatory Factor Analyses&lt;/em&gt; part&lt;/span&gt;): It is important to remember when interpreting the findings from a confirmatory factor analysis that more than one model can be determined that will adequately fit the data.&lt;/span&gt;&lt;span style="FONT-SIZE:11pt;FONT-FAMILY:Calibri;"&gt; Does it mean we may assume many different models by using different numbers and compositions of the factors, and we may find that the results of these models are all good? If my understanding is right, then does it also mean the hypothesis prior to the CFA is not unique?&lt;/span&gt; 
&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;"&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;&lt;b style="mso-bidi-font-weight:normal;"&gt;General Question:&lt;/b&gt; &lt;/font&gt;&lt;/font&gt;&lt;/p&gt;
&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;"&gt;&lt;font face="Calibri" size="3"&gt;I read a paper about model fit indices online.&lt;/font&gt;&lt;/p&gt;
&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;"&gt;&lt;a href="http://eric.ed.gov/ERICDocs/data/ericdocs2sql/content_storage_01/0000019b/80/16/cc/b3.pdf"&gt;&lt;font face="Calibri" color="#800080" size="3"&gt;http://eric.ed.gov/ERICDocs/data/ericdocs2sql/content_storage_01/0000019b/80/16/cc/b3.pdf&lt;/font&gt;&lt;/a&gt;&lt;/p&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;&lt;span style="COLOR:black;"&gt;The paper &lt;/span&gt;&lt;span style="COLOR:black;mso-fareast-language:ZH-CN;"&gt;reviews the most frequently used structural equation modeling (SEM) fit statistics including &lt;/span&gt;&lt;span style="COLOR:black;"&gt;chi squared, goodness of fit (GFI) and adjusted goodness of fit (AGFI) indices etc.&lt;/span&gt;&lt;/font&gt;&lt;/font&gt; 
&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;"&gt;&lt;font face="Calibri" size="3"&gt;I have some questions about the model fit indices related to this paper. The paper holds that Chi-squared is the conventional overall test of fit in structural equation modeling. However, one of the main shortcomings of the Chi-squared test is that it may not be a good enough guide to model adequacy when the sample size is small or big (pages 6-7). I understand this when the sample size is small, but I can’t understand it when the sample size is big. I thought the larger the sample size, the better the test is.&lt;/font&gt;&lt;/p&gt;
&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;"&gt;&lt;font face="Calibri" size="3"&gt;The paper maintains that GFI and AGFI have the benefit of being more specific indices of fit than chi-squared statistics, and that they take degrees of freedom into account and eliminate some of the problems inherent in the chi-squared statistics alone (pages 7-8). I am not clear about the GFI and AGFI here. What are the null hypotheses of GFI and AGFI in this paper, or more generally in structural equation models? Are they the same as those in the chi-squared test?&lt;/font&gt;&lt;/p&gt;
&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;"&gt;&lt;font face="Calibri" size="3"&gt;The paper also supports that the root mean squared error of approximation (RMSEA) is one of most recently proposed tests of model fit and has been seen as a better indicator of fit than root mean square residual (RMR). And RMSEA is less affected by sample size, unlike the chi-squared statistics. I then run the SAS Job Satisfaction example we went through on Thursday again, finding that the RMSEA= 0.0975 with uncorrelated factors and RMSEA=0.0762 with correlated factor which means the second model is “fair fit”. I do not see the difference between RMSEA and RMR, RMSEA and GFI/AGFI actually, but based on the emphasis of the importance of RMR you made in class, I believe that RMSEA and RMR can be the most important fit statistics in our class. But in reality, we may refer to different fit statistics depending on different cases and models. Right?&lt;/font&gt;&lt;/p&gt;&lt;font face="Calibri" size="3"&gt;&amp;nbsp;&lt;/font&gt; 
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description></item><item><title>H44134794, I. SEM and Multivariate Regression, II. Re-specification of a model</title><link>http://tltc.ttu.edu/cs/forums/thread/390.aspx</link><pubDate>Sat, 01 Nov 2008 05:02:38 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:390</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/390.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=390</wfw:commentRss><description>&lt;p&gt;&lt;b&gt;1. SEM and Multivariate Regression&lt;/b&gt;&lt;br /&gt;In the recent class on SEM, we learned how to use PROC CALIS in SAS. In that procedure, we specify many lineq equations such as c1 = b1 f1 + e1, c2 = b2 f1 + e2, and so on. From a regression perspective, the relation between indicators and constructs can be expressed as follows.&lt;/p&gt;
&lt;p&gt;Y_i = b_i F_k + e_i, &lt;br /&gt;where Y_i = the ith indicator, b_i = coefficient representing effect of latent variable on indicator, F_k = latent variable, and e_i = measurement error for indicator i. This model implies that a common factor affects indicators.&lt;/p&gt;
&lt;p&gt;However, I am wondering about reversing this relation, as in the following multivariate regression.&lt;br /&gt;F = b_1 X_1 + ... + b_n X_n + e, &lt;br /&gt;where F = the construct being estimated, b_i = beta weights for items, X_i = item scores, and e = a disturbance term. If we consider that a construct is composed of multiple items, why don’t we use the multivariate regression method rather than the structural equation model? If we use the multivariate regression, we can account for the correlation between indicators, and the betas in the regression also show the degree of relation of each item to the construct.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;2. Re-specification of a model&lt;/b&gt;&lt;br /&gt;We have learned that SEM&amp;nbsp;can be&amp;nbsp;used as a confirmatory factor analysis, meaning that we have to set the model before analyzing data. Suppose we set a model after scrutinizing several theories. After collecting and analyzing data, however, we find that another relation between latent variables, one we did not specify, is significant. When we find this kind of unexpected relation in a model, do we need to modify the original model? Or can we regard it as an abnormal or exceptional result for the specified model? Are there any fit indices to check the unexpected relation?&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description></item><item><title>K65419096 I. Forcing correlation to test the null II. 3 factors but only 2 correlated</title><link>http://tltc.ttu.edu/cs/forums/thread/389.aspx</link><pubDate>Sat, 01 Nov 2008 01:52:34 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:389</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/389.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=389</wfw:commentRss><description>I.&lt;span&gt;&amp;nbsp;
&lt;/span&gt;In researching some more about allowing a correlation among variables, I
ran across some code that said “in order to test the hypothesis that the
correlation between X and Y is the same as that between Z and Q” we would want “to
impose the constraint that the COV(X,Y) = COV(Z,Q)” by giving the same name to
both in the “cov” statement. &lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;span&gt;I.e., &lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;/span&gt;

&lt;p style="margin:0in 0in 0.0001pt;text-indent:0.5in;"&gt;&lt;span&gt;Cov &lt;span&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; &lt;/span&gt;X Y = A,&lt;/span&gt;&lt;/p&gt;

&lt;p style="margin:0in 0in 0.0001pt;text-indent:0.5in;"&gt;&lt;span&gt;Z Q = A, &lt;/span&gt;&lt;/p&gt;

&lt;p style="margin:0in 0in 0.0001pt;text-indent:0.5in;"&gt;&lt;span&gt;X Z = B, &lt;/span&gt;&lt;/p&gt;

&lt;p style="margin:0in 0in 0.0001pt;text-indent:0.5in;"&gt;&lt;span&gt;X Q = C, &lt;/span&gt;&lt;/p&gt;

&lt;p style="margin:0in 0in 0.0001pt;text-indent:0.5in;"&gt;&lt;span&gt;Y Z = D, &lt;/span&gt;&lt;/p&gt;

&lt;p style="margin:0in 0in 0.0001pt;text-indent:0.5in;"&gt;Y Q = E;&lt;/p&gt;

&lt;p style="margin:0in 0in 0.0001pt;"&gt;The code makes sense, but why would
that test the hypothesis that they are the same if you are imposing that on it?&lt;span&gt;&amp;nbsp; &lt;/span&gt;Would you force them to be the same and then
allow them to differ and then compare the fit statistics? &lt;span&gt;&amp;nbsp;&lt;/span&gt;Which statistics would you look at and how
would you compare them? &lt;span&gt;&amp;nbsp;&lt;/span&gt;One method
(forced fit or allow to be different) may fit better, but at what point could
you say the fit statistics are different enough to reject the null? &lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;
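&lt;p&gt;One common way to make this comparison is a chi-square difference test between the two nested models: fit the model with COV(X,Y) and COV(Z,Q) forced equal, fit it again with them free, and refer the difference in model chi-squares to a chi-square distribution with degrees of freedom equal to the number of constraints (one here). The sketch below assumes that setup; the two chi-square values are hypothetical, not PROC CALIS output, and the function name is made up.&lt;/p&gt;
&lt;pre&gt;
```python
import math


def chi2_diff_pvalue_df1(chi2_constrained, chi2_free):
    """Chi-square difference and its p-value for a single constraint (1 df)."""
    diff = max(chi2_constrained - chi2_free, 0.0)
    # For 1 degree of freedom, P(chi-square exceeds d) = erfc(sqrt(d / 2)).
    return diff, math.erfc(math.sqrt(diff / 2.0))


# Hypothetical fit results for the constrained and unconstrained models.
diff, p = chi2_diff_pvalue_df1(chi2_constrained=31.2, chi2_free=25.1)
print(round(diff, 2), round(p, 4))
```
&lt;/pre&gt;
&lt;p&gt;A small p-value would suggest that forcing the two covariances to be equal fits noticeably worse, that is, evidence against the null that they are equal.&lt;/p&gt;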

&lt;p style="margin:0in 0in 0.0001pt;"&gt;&amp;nbsp;&lt;/p&gt;

II.&lt;span&gt;&amp;nbsp; &lt;/span&gt;I
understand that the model default is that all latent factors are
uncorrelated.&lt;span&gt;&amp;nbsp; &lt;/span&gt;I understand that we can
tell the SAS software “cov” with the factors to allow a correlation.&lt;span&gt;&amp;nbsp; &lt;/span&gt;If we have three factors and our theory says
two are correlated, we can restrict the “cov” to just one relationship instead
of all, but what if they really are correlated?&lt;span&gt;&amp;nbsp;
&lt;/span&gt;We wouldn’t find this in EFA, so do we need to try different CFA
combinations?&lt;span&gt;&amp;nbsp; &lt;/span&gt;Not difficult with just
three factors, but as you increase the number of factors, you really increase
the potential combinations.&lt;span&gt;&amp;nbsp; &lt;/span&gt;In
regression we learned about forward/backward/stepwise model selection.&lt;span&gt;&amp;nbsp; &lt;/span&gt;Is there anything like this for CFA?
</description></item><item><title>J80032900 1) Identifiability Definition Clarification 2) Similarity between Regression Analysis and Confirmatory Factor analysis.</title><link>http://tltc.ttu.edu/cs/forums/thread/388.aspx</link><pubDate>Fri, 31 Oct 2008 22:53:39 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:388</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/388.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=388</wfw:commentRss><description>&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;&lt;strong&gt;Specific question: Identifiability Definition Clarification&lt;/strong&gt; &lt;/font&gt;&lt;/p&gt;
&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;You started Tuesday’s lecture with identifiability, and I understood some of the concept but am still confused about the rest. From a covariance structure model standpoint, you defined identifiability as follows: the parameter (theta) is identifiable if Sigma(theta1) is not equal to Sigma(theta2) whenever theta1 is not equal to theta2. Conversely, the parameter (theta) is not identifiable if Sigma(theta1) is equal to Sigma(theta2) for some theta1 not equal to theta2. It is this second statement that I could not grasp. Does it mean that, because the covariance matrix is the same for both parameter values, we cannot tell them apart and hence cannot estimate the parameters?&lt;/font&gt;&lt;/p&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt; 
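&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;To check whether I am reading the definition correctly, I tried a small numeric sketch. It rests on my own assumptions: a single observed variable, a one-factor model in which both the loading l and the factor variance phi are free, an error variance psi, and invented values. Two different parameter vectors give exactly the same implied variance.&lt;/font&gt;&lt;/p&gt;

&lt;pre&gt;
```python
# One observed variable Y = l*F + e, with Var(F) = phi and Var(e) = psi.
# The implied variance is Sigma(theta) = l**2 * phi + psi, theta = (l, phi, psi).

def implied_variance(theta):
    l, phi, psi = theta
    return l ** 2 * phi + psi

theta1 = (1.0, 4.0, 0.5)   # loading 1, factor variance 4
theta2 = (2.0, 1.0, 0.5)   # loading 2, factor variance 1

same_sigma = implied_variance(theta1) == implied_variance(theta2)
print(theta1 != theta2, same_sigma)

# Fixing phi = 1 is one common way to remove the scale ambiguity
# between l and phi shown here.
```
&lt;/pre&gt;

&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;If this is what "not identifiable" means, then the data alone could never tell theta1 from theta2 in this sketch.&lt;/font&gt;&lt;/p&gt;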
&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;You also mentioned that we will be using the concept of identifiability later in this course. Can you tell me what I should pay attention to now, and what is the most important thing I need to know about identifiability?&lt;/font&gt;&lt;/p&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt; 
&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;&lt;strong&gt;General question: Similarity between Regression Analysis and Confirmatory Factor analysis.&lt;/strong&gt;&lt;/font&gt;&lt;/p&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt;&lt;font size="3"&gt;&lt;font face="Times New Roman"&gt;When I did some research online about the similarity between regression analysis and CFA, I came across the idea that factor analysis is part of the general linear model (GLM) family of procedures. Hence factor analysis makes many of the same assumptions as regression. In the regression analysis class you mentioned five assumptions that we use for regression analysis: Assumption 0 – the data come from a random process; Assumption 1 – the means of the distributions fall on a function that we specify; Assumption 2 – the data have constant variance, or homoscedasticity; Assumption 3 – the residuals are uncorrelated; and Assumption 4 – the distribution p(y|&lt;b&gt;X&lt;/b&gt;) is a normal distribution&amp;nbsp; for every &lt;b&gt;X. &lt;/b&gt;&lt;/font&gt;&lt;/font&gt;&lt;b&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt;&lt;/b&gt; 
&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;span style="mso-bidi-font-weight:bold;"&gt;&lt;font face="Times New Roman" size="3"&gt;I know from the class discussion that we are violating Assumption 2 because of the correlation of the factors, so that we can explain the outcomes more easily. Am I understanding this concept correctly? Furthermore, since the variances are correlated, could the errors be correlated as well? If that is possible, how do we justify the violation of Assumption 3?&lt;/font&gt;&lt;/span&gt;&lt;/p&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt; 
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description></item><item><title>J68703397. I. Communality and Heywood Case II. EFA and CFA</title><link>http://tltc.ttu.edu/cs/forums/thread/387.aspx</link><pubDate>Fri, 31 Oct 2008 22:41:25 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:387</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/387.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=387</wfw:commentRss><description>&lt;p&gt;&lt;b&gt;I. SPECIFIC QUESTION.&amp;nbsp; COMMUNALITY AND HEYWOOD.&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;One of the problems that can come up when doing EFA or CFA is getting a communality greater than 1.&amp;nbsp; This error occurred for me on the homework assignment.&amp;nbsp; We learned that communality is the proportion of variance in Yi explained by the model, which is basically the sum of the squared loadings of the model.&amp;nbsp; I thought that a communality greater than 1 implied a redundancy in the variables when trying to explain the variance of Yi, but I am not sure about that.&amp;nbsp; The related demonstration file from class explains the Heywood case, which occurs when one of the values of Psi is negative or the correlations are high; it may even occur when we have positive values for Psi but an imprecise estimate of S.&amp;nbsp; Then, when we have the &amp;quot;Heywood&amp;quot; case, SAS gives an error message: &amp;quot;ERROR: Communality greater than 1.0&amp;quot;.&amp;nbsp; So my question is: Is a communality greater than 1 the effect of the Heywood case?&amp;nbsp; I do not understand the connection between communality &amp;gt; 1 and a negative Psi.&amp;nbsp; As far as I understand, communality is the sum of the squared loadings, so what does it have to do with a negative Psi?&amp;nbsp; Should I look at VAR(Yi), which is the communality + Psi?&amp;nbsp; &lt;/p&gt;
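&lt;p&gt;To check my own arithmetic on this connection, here is a small sketch. It assumes the variables are standardized, so that VAR(Yi) = communality + Psi = 1, and the loadings are invented rather than taken from the homework.&lt;/p&gt;

&lt;pre&gt;
```python
# For a standardized variable, communality + psi = 1,
# so a communality above 1 goes together with a psi below 0.

def communality(loadings):
    """Sum of squared loadings of one variable across the factors."""
    return sum(l ** 2 for l in loadings)

ordinary = [0.70, 0.40]   # hypothetical loadings on two factors
heywood = [0.95, 0.45]    # hypothetical loadings that overshoot

for row in (ordinary, heywood):
    h2 = communality(row)
    psi = 1.0 - h2
    print(round(h2, 4), round(psi, 4))
```
&lt;/pre&gt;

&lt;p&gt;If this reasoning holds, then the communality error and the negative Psi would be two views of the same problem, but I would like to confirm that.&lt;/p&gt;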
&lt;p&gt;Furthermore, there are four different countermeasures when we have the Heywood case: 1) Use Principal Components instead, but we can have a bad fit of the model; 2) Use PROC FACTOR with the &amp;quot;Heywood&amp;quot; option, and we can have a bad fit too; 3) Use PROC FACTOR with the &amp;quot;Ultraheywood&amp;quot; option, in which case we can have a perfect fit; and 4) Set boundaries on the values of Psi (no negative values) when using PROC CALIS.&amp;nbsp; So it seems that whichever countermeasure we use, we compromise the model fit in one direction or the other (lack of fit, or a perfect fit with RMR = 0).&amp;nbsp; Is that correct?&amp;nbsp; These countermeasures raise several questions: What is the difference between using the Heywood and the Ultraheywood option?&amp;nbsp; Are we constraining the values of Psi in a different way?&amp;nbsp; If during our FA analysis we have the Heywood case and use one of the countermeasures mentioned, and suppose we get a reasonable RMR value, how confident should I be in that model?&amp;nbsp; I am asking this because, without applying any countermeasure when the Heywood case is present, that can be translated to a lack-of-fit issue.&amp;nbsp; Right?&amp;nbsp; But when applying the &amp;quot;Heywood&amp;quot; option, I understand that I am telling SAS that if it gets a negative value of Psi, it should just make it zero, and what do I get? A model that is not bad at all.&amp;nbsp; Is my understanding correct?&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;II. GENERAL QUESTION.&amp;nbsp; EFA &amp;amp; CFA&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;This week, you emphasized the difference between Exploratory Factor Analysis (EFA) and Confirmatory Factor Analysis (CFA). Among all the differences we saw in class, in this question I am referring to the difference in their purpose: EFA is theory-generating, while CFA is theory-testing.&amp;nbsp; However, in both analyses we get a model.&amp;nbsp; As far as I understand, in the EFA case we suggest a model from the data, while for the CFA we use a model to estimate the parameters of that model.&amp;nbsp; So, the difference is that in order to perform an EFA we need raw data in the form of the correlation matrix, while the CFA needs an input model to estimate the parameters and the correlations between the factors. Am I right? Then, I was wondering if there is a connection between the EFA and the CFA.&amp;nbsp; That is, can we take the EFA model as an input model to perform the CFA?&amp;nbsp; Or where do we get the input model for the CFA?&amp;nbsp; Is the theory generated from the EFA what we are testing through the CFA?&amp;nbsp; Are we confirming the model with the same information that comes from our original data set?&amp;nbsp; Is it using the same correlation matrix?&amp;nbsp; I am asking this because I am familiar with Design of Experiments (DOE), and after we perform a DOE, there is a solution with a model that basically defines the levels of each factor (independent variables) to obtain a desirable output of the response variable (dependent variable).&amp;nbsp; So in this sense, we are generating &amp;quot;theory&amp;quot; from data.&amp;nbsp; Then, we use confirmatory experiments, setting the values of the process parameters according to the DOE results, and we see if we get the desirable outcome from the response variable.&amp;nbsp; In this sense, we are confirming our model obtained from our initial DOE with new data.&amp;nbsp; The 
difference that I notice here is that we generate new data in order to confirm the model.&amp;nbsp; But I think this is a different kind of &amp;quot;confirmation&amp;quot; compared with the &amp;quot;confirmation&amp;quot; from the CFA, isn’t it?&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description></item><item><title>D66201843 I. Varimax Rotation  II. CFA - Multicollinearity and Construct Validity</title><link>http://tltc.ttu.edu/cs/forums/thread/386.aspx</link><pubDate>Fri, 31 Oct 2008 16:40:56 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:386</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/386.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=386</wfw:commentRss><description>&lt;b style="mso-bidi-font-weight:normal;"&gt;&lt;font size="3"&gt;&lt;font face="Times New Roman"&gt;I. Specific Question (Varimax Rotation – Understanding the details)&lt;/font&gt;&lt;/font&gt;&lt;/b&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt; 
&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;font size="3"&gt;&lt;font face="Times New Roman"&gt;In Tuesday’s lecture on Varimax Rotation, we discussed that you want to choose a rotation that maximizes the sum of the variances of the within factor squared loadings.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;What specifically does ‘within factor squared loadings’ mean?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Is the ‘within factor’ just the items in the L matrix and you square them? - Because in class you showed the L matrix and then showed each item squared.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;&lt;/font&gt;&lt;/font&gt;&lt;/p&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt; 
&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;But then how does this tie into the formula S&lt;sub&gt;1&lt;/sub&gt;&lt;sup&gt;2&lt;/sup&gt; + S&lt;sub&gt;2&lt;/sub&gt;&lt;sup&gt;2&lt;/sup&gt; + … + S&lt;sub&gt;m&lt;/sub&gt;&lt;sup&gt;2&lt;/sup&gt; = criterion, where you want to choose the rotation to maximize the criterion?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;I understand that S = LL’ + ψ, but I’m having trouble understanding the connection between the L and S, because it seems that the L is specific to each S but in the formula you square and add up the S’s? &lt;/font&gt;&lt;/p&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt; 
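&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;Here is my attempt to work the criterion out by hand, as a sketch only. I am assuming that each S&lt;sub&gt;j&lt;/sub&gt;&lt;sup&gt;2&lt;/sup&gt; in the formula is the variance of the squared loadings in column j of L (and not the covariance matrix S), that no normalization of the rows is applied, and the loadings are invented.&lt;/font&gt;&lt;/p&gt;

&lt;pre&gt;
```python
import math
from statistics import pvariance

def varimax_criterion(L):
    """Sum over factors (columns) of the variance of the squared loadings."""
    n_factors = len(L[0])
    return sum(pvariance([row[j] ** 2 for row in L]) for j in range(n_factors))

def rotate(L, angle):
    """Rotate a two-factor loading matrix by the given angle in radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[a * c - b * s, a * s + b * c] for a, b in L]

# Hypothetical loadings with a clean pattern: two variables per factor.
L_simple = [[0.8, 0.0], [0.7, 0.0], [0.0, 0.6], [0.0, 0.9]]
L_mixed = rotate(L_simple, math.pi / 4)  # same LL', blurrier pattern

print(round(varimax_criterion(L_simple), 4), round(varimax_criterion(L_mixed), 4))
```
&lt;/pre&gt;

&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;In this sketch the clean pattern gives the larger criterion, which is what I would expect if the rotation is chosen to maximize it.&lt;/font&gt;&lt;/p&gt;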
&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;font size="3"&gt;&lt;font face="Times New Roman"&gt;I believe I understand the concept of rotation based on the homework Question 2 (EFA) but I’m trying to gain a better understanding of the details of the formulas.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;I believe the concept of rotation can be explained with the example from Question 2 of the homework.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;For Question 2, we ran the EFA and generated unrotated and rotated results and clearly the rotated results had larger differences within the loadings and it was clearer which variables loaded on to what factors.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;So varimax rotation made it clearer which variables loaded on which factors.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; Is this correct?&lt;/span&gt;&lt;/font&gt;&lt;/font&gt;&lt;/p&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt;&lt;b style="mso-bidi-font-weight:normal;"&gt;&lt;font size="3"&gt;&lt;font face="Times New Roman"&gt;II. General Question&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp;&amp;nbsp; &lt;/span&gt;(Understanding Confirmatory Factor Analysis with respect to Multicollinearity and Construct Validity) &lt;/font&gt;&lt;/font&gt;&lt;/b&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt;&lt;font size="3"&gt;&lt;font face="Times New Roman"&gt;In Thursday’s lecture, we discussed that the main point for confirmatory factor analysis (CFA) is to estimate the correlation between the factors or latent variables.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;We went over the example of job salience and job satisfaction as two factors or latent variables that represent the variables studied.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;What about when the factors are too highly correlated, is there such a thing for CFA? 
&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp;&lt;/span&gt;Would you want it to happen?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;I would think not because, when I think of correlation, I can’t help but think about multicollinearity and construct validity, concepts I learned about in a regression and research methods class I have taken.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;For multicollinearity, I would say that if the factors are too highly correlated then maybe we are measuring the same thing.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;For construct validity, we want to make sure that what we think we are measuring is really what we are measuring.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Or according to Wikipedia, construct validity &lt;span style="mso-ansi-language:EN;"&gt;refers to whether a &lt;a title="Scale (social sciences)" href="http://en.wikipedia.org/wiki/Scale_(social_sciences)"&gt;&lt;font color="#22229c"&gt;scale&lt;/font&gt;&lt;/a&gt; measures or correlates with a theorized psychological &lt;a title="Construct" href="http://en.wikipedia.org/wiki/Construct"&gt;&lt;font color="#22229c"&gt;construct&lt;/font&gt;&lt;/a&gt;.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;&lt;/span&gt;&lt;/font&gt;&lt;/font&gt;&lt;span style="mso-ansi-language:EN;"&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt;&lt;/span&gt; 
&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;span style="mso-ansi-language:EN;"&gt;&lt;font face="Times New Roman" size="3"&gt;In understanding these concepts with respect to CFA, I understand from class that CFA is driven by theory and the factors are known by theory, so let’s say we are trying to measure mood and depression and we ask certain questions in a survey to determine the mood of a person (good or bad) and depression (depressed or not, and the level).&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;I would think that if mood and depression are very highly correlated (multicollinearity) in the CFA, then we have a problem in our research because we would be measuring the same thing (a construct validity issue, because we are not measuring what we think we are for one of the factors)?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;In which case, we would need to redesign our study or look back to theory regarding mood and depression.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Is this correct when trying to understand CFA with multicollinearity and construct validity?&lt;/font&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description></item><item><title>D47295184  I. PC model  II. The Chi-squared goodness of fit test</title><link>http://tltc.ttu.edu/cs/forums/thread/382.aspx</link><pubDate>Sat, 25 Oct 2008 20:16:32 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:382</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/382.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=382</wfw:commentRss><description>&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;"&gt;&lt;font face="Calibri"&gt;&lt;font size="3"&gt;&lt;span class="mw-headline"&gt;&lt;span style="FONT-FAMILY:Calibri;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/font&gt;&lt;/font&gt;&lt;/p&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;&lt;b style="mso-bidi-font-weight:normal;"&gt;Specific Question:&lt;/b&gt; PC model&lt;/font&gt;&lt;/font&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;In Tuesday’s class, at the end of the PC model part, you calculated &lt;span style="mso-bidi-font-family:Arial;"&gt;ψ= R-[LL’], where R is the observed covariance matrix and LL’ is the matrix without the smaller eigenvalues. You said this ψ matrix shows lack of fit, and you also said the residual matrix shows lack of fit. So is ψ here exactly the residual matrix, which equals the observed R minus the fitted R hat? You also indicated that ψ can be a matrix with non-zero off the diagonal, meaning ψ=diag [R-LL’], from which we can infer that LL’+ ψ has diagonals equal to 1. Why can we draw this inference? Is that because R itself is a correlation matrix? Lastly, you said that by choosing ψ, the residual matrix can have 0 on the diagonal. I thought ψ could only be obtained by ψ=diag [R-LL’]. 
Then what do you mean by “choosing”?&lt;/span&gt;&lt;b style="mso-bidi-font-weight:normal;"&gt;&lt;/b&gt;&lt;/font&gt;&lt;/font&gt;&lt;b style="mso-bidi-font-weight:normal;"&gt;&lt;font face="Calibri" size="3"&gt;&amp;nbsp;&lt;/font&gt;&lt;/b&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;&lt;b style="mso-bidi-font-weight:normal;"&gt;General Question:&lt;/b&gt; The&lt;span class="mw-headline"&gt;&lt;span style="FONT-FAMILY:Calibri;"&gt; Chi-squared goodness of fit test&lt;/span&gt;&lt;/span&gt;&lt;/font&gt;&lt;b style="mso-bidi-font-weight:normal;"&gt;&lt;/b&gt;&lt;/font&gt; 
&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;"&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;In Thursday’s class, we covered the fit measures part, which includes RMSODR and the Chi-squared test, and you said the model is good if S is “close” to S hat. It is not difficult to understand that the smaller the RMSODR is, the better the model is, because this measure is based on the residual matrix (S- S hat). However, the chi-squared test is a little difficult for me to understand. We don’t want to reject Ho because we want &lt;span style="mso-bidi-font-family:Arial;"&gt;∑&lt;/span&gt;=LL’+&lt;span style="mso-bidi-font-family:Arial;"&gt;ψ&lt;/span&gt;. But then, what do you mean by “likelihood ratio chi-square”? From the textbook for Dr. Conover’s class, in the &lt;span class="mw-headline"&gt;&lt;span style="FONT-FAMILY:Calibri;"&gt;Chi-squared goodness of fit test section,&lt;/span&gt;&lt;/span&gt; I learned that we reject Ho if T is greater than the 1-α quantile of the chi-squared distribution with c-1 degrees of freedom, where T= &lt;span style="mso-bidi-font-family:Arial;"&gt;∑[(O-E)^2/E]. From that textbook, we went through some exercises on goodness of fit calculated by the formula above, whether the random variables are discrete or continuous. However, my understanding is that you used &lt;/span&gt;“likelihood ratio chi-square” rather than &lt;span style="mso-bidi-font-family:Arial;"&gt;∑[(O-E)^2/E]. I am confused: are they the same thing or totally different methods of Chi-square test? &lt;/span&gt;&lt;/font&gt;&lt;/font&gt;&lt;/p&gt;&lt;b style="mso-bidi-font-weight:normal;"&gt;&lt;font face="Calibri" size="3"&gt;&amp;nbsp;&lt;/font&gt;&lt;/b&gt;&lt;font face="Calibri" size="3"&gt;&amp;nbsp;&lt;/font&gt; 
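&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;"&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;For my own comparison, here is a sketch with invented counts that computes both the Pearson statistic from the textbook and a likelihood ratio statistic on the same observed and expected counts. This is only the count-data version; I am assuming that the likelihood ratio chi-square in factor analysis follows the same idea but compares S with the fitted covariance matrix rather than counts.&lt;/font&gt;&lt;/font&gt;&lt;/p&gt;

&lt;pre&gt;
```python
import math

observed = [18, 22, 20]        # hypothetical counts in c = 3 cells
expected = [20.0, 20.0, 20.0]  # counts expected under Ho

# Pearson statistic: T = sum of (O - E)^2 / E
pearson = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Likelihood ratio statistic: G2 = 2 * sum of O * ln(O / E)
likelihood_ratio = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected))

print(round(pearson, 4), round(likelihood_ratio, 4))
```
&lt;/pre&gt;

&lt;p class="MsoNormal" style="MARGIN:0in 0in 10pt;"&gt;&lt;font size="3"&gt;&lt;font face="Calibri"&gt;In this sketch the two statistics come out very close to each other, and both would be compared with the same chi-squared distribution with c-1 degrees of freedom.&lt;/font&gt;&lt;/font&gt;&lt;/p&gt;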
&lt;p&gt;&amp;nbsp;&lt;/p&gt;</description></item><item><title>H44134794, I. Mixed data and missing values II. Symmetric matrix and asymmetric matrix for PCA and FA</title><link>http://tltc.ttu.edu/cs/forums/thread/376.aspx</link><pubDate>Sat, 25 Oct 2008 05:02:23 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:376</guid><dc:creator>salee</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/376.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=376</wfw:commentRss><description>&lt;p&gt;&lt;b&gt;1. Mixed data and missing values&lt;/b&gt;&lt;br /&gt;
&lt;p&gt;I have a data set (n=180) that has 20 variables. In the data set, 13 variables are measured on 7-point Likert scales, and the remaining 7 variables are nominal data such as sex, education level, and religion. Using this data, I conducted principal component analysis (PCA), and found that the variables that stand out most are the nominal ones. The variables measured on 7-point Likert scales are mostly not important in the PCA. So, I want to figure out why this happens. Can I use a data set with mixed measurement scales for PCA? If so, do nominal data carry more weight than data measured on wide Likert scales? If not, is this a normal result?&lt;/p&gt;
&lt;p&gt;In addition, when making a covariance matrix for this data, I deleted an entire row when a respondent did not answer some of the questions. Alternatively, I think I could use only the answered portion of each incomplete response. Between these two options, which method is more appropriate for PCA and factor analysis? In my opinion, if we use the answered portion of incomplete responses, the covariance matrix will be distorted because some covariances are based on more data than others, so their relative importance will differ. &lt;/p&gt;
&lt;p&gt;&lt;b&gt;2. Symmetric matrix and asymmetric matrix for PCA and FA&lt;/b&gt;&lt;br /&gt;
&lt;p&gt;From the last two classes, I thought that PCA and FA are only applied to a covariance matrix or a correlation matrix. However, by reading a paper (Sidorova et al, 2008)1, I found that these analyses can be applied to an asymmetric matrix. The authors collect abstracts of papers in the MIS area and build a term-document frequency table with 1,318 terms and 1,615 documents. Then, they conduct singular value decomposition (SVD) and factor analyses, determine the number of factors, and label them. I think that PCA can be implemented internally with SVD, and that they are the same kind of analysis. So, is it possible to apply PCA to an asymmetric matrix?&lt;/p&gt;
&lt;p&gt;1UNCOVERING THE INTELLECTUAL CORE OF THE INFORMATION SYSTEMS DISCIPLINE, MIS Quarterly, Sep2008, Vol. 32 Issue 3, p467-A20&lt;br /&gt;&lt;/p&gt;</description></item><item><title>K65419096 I. Interpretation of Factors  II. Potential Misuse of EFA</title><link>http://tltc.ttu.edu/cs/forums/thread/375.aspx</link><pubDate>Sat, 25 Oct 2008 02:29:12 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:375</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/375.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=375</wfw:commentRss><description>I.&lt;span&gt;&amp;nbsp; &lt;/span&gt;I am confused
about the factor loadings and factors.&lt;span&gt;&amp;nbsp;
&lt;/span&gt;If potential speaker Barbara Bush’s model is determined to be Y1=-.15F1
+ .78F2 + e after looking at all the observations and we’ve hypothesized, by
looking at the other potential speakers’ F2s, that F2 is an inherent trait of
“those who have a social-conservative stance,” how can we interpret and not
interpret this?&lt;span&gt;&amp;nbsp; &lt;/span&gt;Per your posting on
interpretation of factors, we can only say that a person/observation with a
high F2 tends to have high Ys (more likely to support choosing that speaker),
therefore F2 appears to represent a characteristic of the people surveyed.&lt;span&gt;&amp;nbsp; &lt;/span&gt;

&lt;p class="MsoNormal"&gt;&amp;nbsp;&lt;/p&gt;

&lt;p class="MsoNormal"&gt;I would tend to interpret this instead to say that a person
with a high F2 tends to choose those speakers with high loadings on F2 (i.e.,
Barbara with .78 and George with .86).&lt;span&gt;&amp;nbsp;
&lt;/span&gt;And a person/observation that has an opposite stance, say a
social-liberal stance, would choose a speaker with a low loading (i.e. Dan
Barker at -.48).&lt;span&gt;&amp;nbsp; &lt;/span&gt;&lt;span&gt;&amp;nbsp;&lt;/span&gt;Then to complicate this one more level, if F2
is social-conservative and F1 is interest in science, people in quadrant II who
are less interested in science (anti-science) than they are interested in conservative
issues (pro-conservative) might discount F2 and go purely with F1 as what is
the unobserved reason for their choice.&lt;span&gt;&amp;nbsp; &lt;/span&gt;&lt;/p&gt;

&lt;p class="MsoNormal"&gt;&amp;nbsp;&lt;/p&gt;

&lt;p class="MsoNormal"&gt;Am I making an unsupported leap of logic or have I missed
the point entirely?&lt;/p&gt;

&lt;p class="MsoNormal"&gt;&amp;nbsp;&lt;/p&gt;

&lt;p class="MsoNormal"&gt;&amp;nbsp;&lt;/p&gt;

&lt;p class="MsoNormal"&gt;II. &lt;span&gt;&amp;nbsp;&lt;/span&gt;I understood from
other classes that exploratory factor analysis was done on test/sample data to potentially
reduce the number of necessary variables needing to be measured in a full
sample while still maintaining enough information to accomplish the goal of
your research – testing support for your theory.&lt;span&gt;&amp;nbsp; &lt;/span&gt;&lt;/p&gt;

&lt;p class="MsoNormal"&gt;&amp;nbsp;&lt;/p&gt;

&lt;p class="MsoNormal"&gt;If Y1 (self-reported stress levels) and Y2 (blood pressure measurements)
both measured the underlying truth of an F1 you have labeled “anxiety” yet Y2 (blood
pressure) also contributed heavily to an F2 you have labeled “stress” and an F3
which you have labeled “tendency towards irritability,” you might consider
dropping Y1 (self-reported stress levels) in favor of more Y2 (blood pressure) measurements.&lt;span&gt;&amp;nbsp; &lt;/span&gt;&lt;/p&gt;

&lt;p class="MsoNormal"&gt;&amp;nbsp;&lt;/p&gt;

&lt;p class="MsoNormal"&gt;But this seems to rely on the assumption that the
factors are correlated (individuals with high values of F1, F2, and F3 tend to
have high values of Y2) and with EFA the factors are supposed to be
uncorrelated (one of the prime assumptions). &lt;span&gt;&amp;nbsp;&lt;/span&gt;&lt;/p&gt;

&lt;p class="MsoNormal"&gt;&amp;nbsp;&lt;/p&gt;

&lt;p class="MsoNormal"&gt;Have I been learning an improper use of EFA?&lt;/p&gt;


</description></item><item><title>J80032900 1) What is m in Exploratory Factor Analysis? 2) EFA vs. goodness of fit, over fitting, and data snooping.</title><link>http://tltc.ttu.edu/cs/forums/thread/374.aspx</link><pubDate>Fri, 24 Oct 2008 22:47:57 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:374</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/374.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=374</wfw:commentRss><description>&lt;p&gt;Specific Question: What is m in Exploratory Factor Analysis?&lt;br /&gt;When I was looking at the notes for Exploratory Factor Analysis from Tuesday’s lecture, you showed us the case where m=1. My first question is: what does m represent? From my understanding, after going through the notes a few more times, m is the number of columns of the Y. In other words, m is the number of loadings for factors in a model. Am I on the right path? Then, my next question is: do you use m=1 just so that the model is simple and we do not get confused? Can m be more than one? If m is more than one, does that mean the model has more than one latent factor in it? In other words, in the class example, Faculty and staff ratings of potential speakers and topics, you picked nfactors = 2; does that mean m=2 for this model?&lt;br /&gt;&lt;br /&gt;General Question: EFA vs. goodness of fit, over fitting, and data snooping.&lt;br /&gt;We talked about goodness of fit, which describes how well a statistical model fits a set of observations. We strive to have good models that mimic the real-life process, and the main purpose of EFA is to find a model with good fit. However, you went over the issue of over fitting in the regression analysis class, where over fitting is fitting a statistical model that has too many parameters. 
My concern is that if we have too many factors in the model we created, wouldn’t the model be over fitted? In that case, is the model only good for the set of data that we are analyzing? Another issue that came to my mind is data snooping, which is a form of statistical bias generated by the misuse of data mining techniques, and which can lead to bogus results in scientific research.&amp;nbsp; I know that it is a fine line we have to be aware of, but is this not our main concern when we are doing EFA? (I know I should have known this, but I still have a hard time understanding it from the regression analysis class.)&lt;br /&gt;&lt;br /&gt;&lt;/p&gt;</description></item><item><title>D66201843 I. EFA and Estimation Methods  II. EFA and Assumptions</title><link>http://tltc.ttu.edu/cs/forums/thread/373.aspx</link><pubDate>Fri, 24 Oct 2008 22:37:44 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:373</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/373.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=373</wfw:commentRss><description>&lt;b style="mso-bidi-font-weight:normal;"&gt;&lt;font size="3"&gt;&lt;font face="Times New Roman"&gt;I. Specific Question (Exploratory Factor Analysis (EFA) – Method to use?)&lt;/font&gt;&lt;/font&gt;&lt;/b&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt; 
&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;font size="3"&gt;&lt;font face="Times New Roman"&gt;In Tuesday’s lecture for EFA, we went through the direct equation method.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Is it correct that we went through this method to get L and ψ?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;However, this method is not common so is it correct to say that the other methods that we talked about on Tuesday and Thursday: principal components (PC), unweighted least squares (ULS), weighted least squares (WLS) and maximum likelihood (ML) are also used to get L and ψ?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;So these would be methods for how we estimate EFA?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;&lt;/font&gt;&lt;/font&gt;&lt;/p&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt; 
&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;I understand that the PC method is robust, so if you have problems in the data (as in the class example with the singular matrix) you can use PC, but is there an ordering for when you should use the others?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;It would seem to me that ML would be my first choice, followed by WLS and then ULS.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;Is this correct?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;I would think that my choice would be data driven.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;For example, if I can assume some distribution, such as MVN, then I would use ML, but if I couldn’t assume MVN, then I would choose WLS?&lt;/font&gt;&lt;/p&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt; 
&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;font face="Times New Roman" size="3"&gt;Finally, to clarify further: when we have a dataset, I understand that we run EFA to determine how many factors should be in the model.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;When I say ‘run EFA’, I&amp;nbsp;just set the number of factors in SAS (1, 2, 3), but then does SAS use one of the methods discussed above to calculate the factor loadings?&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;If this is correct, then I understand that SAS&amp;nbsp;uses PC unless you specify another method.&lt;/font&gt;&lt;/p&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt;&lt;b style="mso-bidi-font-weight:normal;"&gt;&lt;font size="3"&gt;&lt;font face="Times New Roman"&gt;II. General Question&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp;&amp;nbsp; &lt;/span&gt;(Exploratory Factor Analysis - Assumptions?) &lt;/font&gt;&lt;/font&gt;&lt;/b&gt;&lt;font face="Times New Roman" size="3"&gt;&amp;nbsp;&lt;/font&gt; 
&lt;p class="MsoNormal" style="MARGIN:0in 0in 0pt;"&gt;&lt;font size="3"&gt;&lt;font face="Times New Roman"&gt;In class this week we learned that one of the assumptions of exploratory factor analysis is that the F’s are uncorrelated, which is a big assumption.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;I’m wondering whether this assumption can really hold in practice when we analyze data.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;For example, you talked in class about the homework where we are looking at police applicants, and you indicated that they might load on a strength factor, an endurance factor, and maybe a speed or acuity factor.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;However, if I think about these carefully, I would expect them to be correlated: if I am stronger, then my endurance factor would probably also be higher, whereas if I never work out, my endurance factor is low and my strength factor would probably also be low.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;So how should we understand this assumption? It seems good in theory, but I question its practicality.&lt;span style="mso-spacerun:yes;"&gt;&amp;nbsp; &lt;/span&gt;&lt;/font&gt;&lt;/font&gt;&lt;/p&gt;</description></item><item><title>J68703397. I. FA assumptions   II. Linking concepts &amp; PC vs. FA</title><link>http://tltc.ttu.edu/cs/forums/thread/372.aspx</link><pubDate>Fri, 24 Oct 2008 16:04:08 GMT</pubDate><guid isPermaLink="false">4d9299ce-34a7-4813-8f2c-27fe3b84faa4:372</guid><dc:creator>Anonymous</dc:creator><slash:comments>1</slash:comments><comments>http://tltc.ttu.edu/cs/forums/thread/372.aspx</comments><wfw:commentRss>http://tltc.ttu.edu/cs/forums/commentrss.aspx?SectionID=43&amp;PostID=372</wfw:commentRss><description>&lt;p&gt;&lt;b&gt;I. Specific Question.&amp;nbsp; FA assumptions.&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;In Tuesday&amp;#39;s lecture you explained the Congeneric Model, which is also known as Factor Analysis (FA) with 1 factor.&amp;nbsp;&amp;nbsp; The link you provided on the class website says that FA is a method to explain the observed variables (Y&amp;#39;s) in terms of &lt;i&gt;fewer &lt;/i&gt;unobservable (latent) variables called factors. &amp;nbsp;&amp;nbsp;In this definition, I can see two objectives of FA: 1) to explain the variability of the Y&amp;#39;s in terms of F&amp;#39;s, and 2) to reduce the number of factors that explain the Y&amp;#39;s.&amp;nbsp; I am inferring that the word &amp;quot;fewer&amp;quot; implies reduction.&amp;nbsp; Am I correct?&amp;nbsp; &amp;nbsp;&amp;nbsp;The Wikipedia FA definition also says that &amp;quot;FA assumes that all the rating data can be reduced to a &lt;i&gt;few&lt;/i&gt; important dimensions&amp;quot; and that this is &amp;quot;&lt;u&gt;because all the attributes are related&lt;/u&gt;&amp;quot;.&amp;nbsp; My understanding of this is as follows: because the factors are correlated, I can determine the &lt;i&gt;critical few &lt;/i&gt;that explain the variation of the Y&amp;#39;s.&amp;nbsp; Right?&amp;nbsp; I went back to my notes and noticed that I only have the case of FA with 1 factor.&amp;nbsp; So I did some more reading about FA, and I found that one of the assumptions of the FA model is that &lt;u&gt;the covariance matrix of the factors is the identity matrix&lt;/u&gt;! 
[Rencher, 1998].&amp;nbsp; Isn&amp;#39;t this a contradiction?&amp;nbsp;&amp;nbsp; We know that an identity covariance matrix means zero correlation between the factors and a variance of 1 for each factor.&amp;nbsp;&amp;nbsp; This is confusing: on the one hand, FA is said to use the correlation among factors to reduce the number of factors that explain the Y&amp;#39;s, and on the other hand, FA assumes uncorrelated factors.&amp;nbsp; How can this be?&amp;nbsp;&amp;nbsp; I do not know if what I read is a &amp;quot;special&amp;quot; case of FA.&amp;nbsp; The rest of the assumptions of the model are: E(F)=0; COV(F)=I (which triggered my question); E(e)=0; cov(e)=psi-matrix; and cov(F,e)=0. &lt;/p&gt;
&lt;p&gt;Rencher, Alvin C., &amp;quot;Multivariate Statistical Inference and Applications,&amp;quot; Wiley-Interscience Publication. New York: 1998.&lt;/p&gt;
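&lt;p&gt;A short worked derivation may help frame the question. This is only a sketch under stated assumptions: the model is written as Y = LF + e for mean-centered Y, the symbols L (loading matrix, with entries \ell_{jm}) and Ψ (the psi-matrix above) are notation chosen here for illustration, and Ψ is assumed to be diagonal, which the list of assumptions above does not state. Using COV(F)=I and cov(F,e)=0:&lt;/p&gt;
&lt;p&gt;\[ \operatorname{Cov}(Y) = L\,\operatorname{Cov}(F)\,L^{\top} + \operatorname{Cov}(e) = L L^{\top} + \Psi \]&lt;/p&gt;
&lt;p&gt;\[ \operatorname{Cov}(Y_j, Y_k) = \sum_{m} \ell_{jm}\,\ell_{km} \quad (j \neq k) \]&lt;/p&gt;
&lt;p&gt;Under these assumptions, the covariances among the Y&amp;#39;s come entirely from their shared loadings on the factors, so the Y&amp;#39;s can be strongly related even though the factors themselves are uncorrelated. It may also be worth noting that for any orthogonal matrix T the rotated loadings LT reproduce the same covariance matrix, which suggests that COV(F)=I acts at least partly as a normalization of the factors:&lt;/p&gt;
&lt;p&gt;\[ (L T)(L T)^{\top} + \Psi = L\,T T^{\top} L^{\top} + \Psi = L L^{\top} + \Psi \]&lt;/p&gt;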
&lt;p&gt;&amp;nbsp;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;II. GENERAL QUESTION.&amp;nbsp; PC vs. FA&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;This is a two-fold question.&amp;nbsp; First, I would like to confirm that I am following the logic of the concepts seen in class so far.&amp;nbsp; After the midterm exam, we learned the Latent Variable Measurement Model.&amp;nbsp; Generally speaking, when we measure things, our observable measure is expressed in terms of two unobservable quantities: a true value and an error term.&amp;nbsp; Then, we learned how to measure the consistency of the observable measure relative to the true value.&amp;nbsp; This measure is called reliability. &amp;nbsp;&amp;nbsp;After that, we learned that Cronbach&amp;#39;s alpha coefficient is a lower bound of the reliability if the true values are independent of the errors and if these errors are independent among themselves.&amp;nbsp; Since reliability does not necessarily mean &amp;quot;closeness&amp;quot; to the true value, it may be the case that the measurement model is consistently far from the true value. Therefore, we may have a model that is consistent but not useful.&amp;nbsp; Then, we need to validate the model somehow.&amp;nbsp; Right?&amp;nbsp; One way of validating the model is to use Exploratory Factor Analysis (EFA).&amp;nbsp; For that purpose, we estimate a model using one of the EFA estimation methods, such as Principal Components (PC). Then the estimated model is compared with the observed data and we see how well it fits the data.&amp;nbsp; The important issue here is that we need a model that is useful; can I define validity in terms of usefulness, and usefulness in terms of consistency and fit to the data?&amp;nbsp; &amp;nbsp;Is my understanding correct?&lt;/p&gt;
&lt;p&gt;Second, while doing some reading I found out that Factor Analysis is different from Principal Components (PC).&amp;nbsp;&amp;nbsp; If I understood it correctly, the difference is at a conceptual level:&amp;nbsp; the PC model is a linear combination of the &amp;quot;Y&amp;#39;s&amp;quot;, whereas the FA model is a linear combination of &amp;quot;latent&amp;quot; variables, which are the factors.&amp;nbsp; In practice, I really do not see any difference at all.&amp;nbsp; I thought that latent variables were always present whenever we measure things.&amp;nbsp;&amp;nbsp; In both cases, FA and PC, we are trying to explain response variables (Y&amp;#39;s) in terms of factors or other variables (X&amp;#39;s).&amp;nbsp;&amp;nbsp; In practice, what would be the difference between PC and FA? &amp;nbsp;Is it in the way we collect data?&amp;nbsp; Would the difference be between observing data and &amp;quot;generating&amp;quot; data in a simulation process? &lt;/p&gt;</description></item></channel></rss>