The Freedman article distributed in class, Wikipedia, and class discussions (4/23 @ 1:08:14) have used the economic theory of supply and demand as an example of endogeneity and/or reciprocal causation. These sources state that a variable is considered endogenous when it is determined by another variable in the model; more specifically, the equilibrium price of a good or service is endogenous because suppliers change the price of a product in response to demand and consumers change their demand for a product in response to changes in price.
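To make the supply and demand example concrete, here is a minimal simulation sketch in Python. Everything in it is an assumption made for illustration: the structural equations, the coefficient values, the supply shifter `z` (a hypothetical cost variable that affects supply but not demand), and the function name `estimate_demand_slope`. It is not taken from the Freedman article or from the ATM study. It shows that regressing quantity on price by OLS does not recover the demand slope when price and quantity are determined jointly, while a simple instrumental variable estimate (the building block of 2SLS and 3SLS) does.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def estimate_demand_slope(n=20000, seed=0):
    """Simulate a market and estimate the demand slope by OLS and by IV.

    Assumed structural model (illustrative values only):
        demand: q = 10 - 1.0 * p + u_d
        supply: q =  2 + 1.0 * p - 1.0 * z + u_s
    where z is an observed cost shifter that enters supply only.
    The true demand slope is therefore -1.0.
    """
    rng = random.Random(seed)
    prices, quantities, shifters = [], [], []
    for _ in range(n):
        z = rng.gauss(0, 1)    # supply shifter (the instrument)
        u_d = rng.gauss(0, 1)  # unobserved demand shock
        u_s = rng.gauss(0, 1)  # unobserved supply shock
        # Equilibrium price solves demand = supply.
        p = (8 + z + u_d - u_s) / 2
        q = 10 - p + u_d
        prices.append(p)
        quantities.append(q)
        shifters.append(z)

    # OLS slope of q on p: biased, because p is correlated with u_d.
    ols = cov(prices, quantities) / cov(prices, prices)
    # IV slope using z as the instrument for p.
    iv = cov(shifters, quantities) / cov(shifters, prices)
    return ols, iv


if __name__ == "__main__":
    ols, iv = estimate_demand_slope()
    print(f"true demand slope: -1.0, OLS: {ols:.3f}, IV: {iv:.3f}")
```

Under these assumed values the OLS slope settles near -1/3 rather than -1, because the demand shock moves price and quantity together; the instrument isolates the price variation that comes from the supply side only.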
In an article I recently read, the authors examine the substitutability of ATMs and EFTPOS (electronic funds transfer at point of sale, i.e. debit cards) by consumers when purchasing a good. The authors argue that the consumer’s choice of payment method (ATM or EFTPOS) is associated with (1) the ATM transaction fee, (2) the size of the ATM network, as measured by the number of ATMs in a particular bank’s network, (3) ATM transactions by members, (4) ATM transactions by non-members, and (5) EFTPOS transactions.
The study uses 46 banks, with data collected from 1997 to 2003. The empirical methodology is a system of five equations estimated by three-stage least squares (3SLS) with fixed effects.
First, is three-stage least squares used rather than OLS because of endogeneity and reciprocal causation among all five variables? For example, there is endogeneity between (i) the transaction fee a bank charges for the use of an ATM in its network and (ii) the size of the ATM network, because banks set the fee they charge for ATM usage based on consumer demand, and consumers’ demand for ATMs is affected by the transaction fee charged for use. Likewise, a bank’s choice to expand its network will be a function of the fee it can charge for usage, and consumers’ demand for ATMs in a network will be affected by the transaction fee charged by that network. Is 3SLS also used because the data are panel data with autocorrelation and cross-sectional correlation, since the same banks (within subject) are observed across time (4/9 @ 40:15 and 4/23 @ 41:26)?
However, would not a maximum likelihood model be “better,” possibly using the PROC SYSLIN and SUR procedures to analyze the five simultaneous equations, one for each dependent variable?
Furthermore, would not a random effects model be preferable to a fixed effects model? There are 46 banks, or levels, in the data, which is relatively large (more than 20, but fewer than the 133 that were in the “major” example we analyzed on 4/14 @ 52:47), and we are not concerned with each individual bank, but more with the cross-sectional “average” bank (4/14 @ 1:13:27).
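One consideration that bears on this choice can be sketched with simulated data. The Python snippet below is an illustration under loudly stated assumptions, not a reconstruction of the article's model: the panel dimensions mirror the study (46 banks, 7 years), but the data-generating process, the true slope of 2.0, and the function name `pooled_vs_within` are all hypothetical. It compares a pooled OLS slope with the within (fixed effects) slope when the unobserved bank effect is correlated with the regressor.

```python
import random


def slope(x, y):
    """Slope from a simple regression of y on x (with intercept)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def demean_by_group(values, groups):
    """Subtract each group's mean from its observations (within transform)."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def pooled_vs_within(n_banks=46, n_years=7, beta=2.0, seed=0):
    """Return (pooled OLS slope, within slope) for a simulated panel.

    Assumed process: y_it = beta * x_it + a_i + noise, where the bank
    effect a_i also shifts x_it, so a_i is correlated with the regressor.
    """
    rng = random.Random(seed)
    x, y, bank = [], [], []
    for i in range(n_banks):
        a_i = rng.gauss(0, 1)  # unobserved bank-level effect
        for _ in range(n_years):
            x_it = a_i + rng.gauss(0, 1)
            y_it = beta * x_it + a_i + rng.gauss(0, 0.5)
            x.append(x_it)
            y.append(y_it)
            bank.append(i)

    pooled = slope(x, y)  # ignores the bank effect entirely
    within = slope(demean_by_group(x, bank), demean_by_group(y, bank))
    return pooled, within


if __name__ == "__main__":
    pooled, within = pooled_vs_within()
    print(f"true slope: 2.0, pooled OLS: {pooled:.3f}, within (FE): {within:.3f}")
```

In this assumed setup the within estimator stays close to the true slope while pooled OLS is pushed upward. A random effects estimator combines within and between variation, so it would inherit part of that bias whenever the bank effects are correlated with the regressors; if they are uncorrelated, random effects is the more efficient choice.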