November 23, 2018

Controlling for the marginals


In the log-linear modeling approach, we are taught that we should control for the 'marginal distribution' in contingency tables. Let's think about why. In the sample table below, C is cohort (just 2 categories), W and H refer to wife's and husband's education, and each number corresponds to an educational rank (the larger, the higher). A design matrix (quasihetero.txt) lets us specify a kind of quasi-independence model, but in this example we do not control for the marginal distributions. As usual, I'm interested in the magnitude of the homogamy parameters. The results are against my expectation: I supposed that the diagonal cells would be more likely to occur, yet some of the estimates are negative. But of course this model did not consider the marginals, so the results are understandable given that unions between highly educated men and women are less likely to occur, presumably because there are fewer highly educated people to begin with. This is why we need to control for the marginal distributions of wife's and husband's education. As this example shows, even when we are not interested in the marginals themselves, we need to consider adding those parameters. By the same token, conventionally a lower BIC means a better fit, and a positive BIC (based on L-squared) means the saturated model is preferred, so we need to do something when we see that the BIC is not negative, even if it seems we have added enough parameters.
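To make the point about marginals concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the 3x3 table is made up (it is not sample.fre), and the "no marginals" model is taken to be a constant plus one dummy per diagonal cell, which may differ from the design in quasihetero.txt. For that simple model the ML fit reproduces each diagonal cell exactly and sets every off-diagonal cell to the off-diagonal mean, so the diagonal parameters have a closed form. The second set of numbers is a crude comparison against plain independence, not a fitted quasi-independence model.

```python
import math

# Hypothetical 3x3 table (rows: wife's rank, columns: husband's rank).
# Made-up counts: the highest rank is rare for both spouses.
table = [
    [400, 60, 10],
    [60, 200, 20],
    [10, 20, 15],
]
k = len(table)

# Model WITHOUT marginals: log m_ij = main + delta_i if i == j, else main.
# ML fits each diagonal cell exactly and every off-diagonal cell at the
# off-diagonal mean, so delta_i = log(n_ii / off-diagonal mean).
off_diag = [table[i][j] for i in range(k) for j in range(k) if i != j]
off_mean = sum(off_diag) / len(off_diag)
diag_no_margins = [math.log(table[i][i] / off_mean) for i in range(k)]

# Crude comparison WITH marginals: log ratio of the observed diagonal count
# to its expected count under independence (row total * column total / N).
row = [sum(r) for r in table]
col = [sum(table[i][j] for i in range(k)) for j in range(k)]
n = sum(row)
diag_vs_independence = [
    math.log(table[i][i] / (row[i] * col[i] / n)) for i in range(k)
]

for i in range(k):
    print(i + 1, round(diag_no_margins[i], 3), round(diag_vs_independence[i], 3))
```

In this toy table the top-rank diagonal cell gets a negative parameter when the marginals are ignored, simply because the cell is small, but it is well above what independence would predict once the row and column totals are taken into account.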




LEM: log-linear and event history analysis with missing data.
Developed by Jeroen Vermunt (c), Tilburg University, The Netherlands.
Version 1.0 (September 18, 1997).


*** INPUT ***

    man 3                     
    dim 2 6 6                 
    lab C W H 
   ***HOMOGAMY (MAT1) with changing RCII
    mod {fac(WH,7) }      des quasihetero.txt
        dat sample.fre                   
      nco     


*** STATISTICS ***

  Number of iterations = 20
  Converge criterion   = 0.0000003082

  X-squared            = 9082.8294 (0.0000)
  L-squared            = 8078.8462 (0.0000)
  Cressie-Read         = 8310.0803 (0.0000)
  Dissimilarity index  = 0.3188
  Degrees of freedom   = 64
  Log-likelihood       = -34936.45669
  Number of parameters = 7 (+1)
  Sample size          = 9684.0
  BIC(L-squared)       = 7491.4395
  AIC(L-squared)       = 7950.8462
  BIC(log-likelihood)  = 69937.1610
  AIC(log-likelihood)  = 69886.9134

  Eigenvalues information matrix
    2241.6153   522.8455   441.5313   291.4483   182.6207   151.1644
      68.8027


*** LOG-LINEAR PARAMETERS ***

* TABLE CWH [or P(CWH)] *

  effect           beta  std err  z-value   exp(beta)     Wald  df  prob
  main           4.5501                       94.6463 
  fac(WH)
   1             1.0333   0.0458   22.552      2.8104 
   2             2.8773   0.0227  126.488     17.7661 
   3             0.0037   0.0740    0.050      1.0037 
   4             0.5093   0.0583    8.741      1.6641 
   5            -0.7967   0.0466  -17.092      0.4508 
   6            -0.1744   0.0807   -2.162      0.8400 
   7            -0.9948   0.1204   -8.260      0.3698 18807.52   7 0.000
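The fit statistics in the output above can be checked by hand. The sketch below assumes the usual definitions, BIC(L-squared) = L-squared - df * ln(N) and AIC(L-squared) = L-squared - 2 * df, and for the log-likelihood versions -2 * logL plus npar * ln(N) or 2 * npar with npar = 7. These formulas are my assumption about what LEM computes, but they do reproduce the printed values. The variable names are mine.

```python
import math

# Values copied from the STATISTICS section of the LEM output.
l_squared = 8078.8462
df = 64
log_lik = -34936.45669
n_params = 7          # reported as "7 (+1)"; the +1 is the main effect
n = 9684.0

# Degrees of freedom: cells in the 2 x 6 x 6 table minus (7 + 1) parameters.
df_check = 2 * 6 * 6 - (n_params + 1)

# Information criteria based on L-squared (comparison with the saturated model).
bic_l2 = l_squared - df * math.log(n)
aic_l2 = l_squared - 2 * df

# Information criteria based on the log-likelihood.
bic_ll = -2 * log_lik + n_params * math.log(n)
aic_ll = -2 * log_lik + 2 * n_params

print(df_check, round(bic_l2, 4), round(aic_l2, 4))
print(round(bic_ll, 4), round(aic_ll, 4))
```

Since BIC(L-squared) is large and positive here, the comparison favors the saturated model over this one, which is in line with the poor fit of a model that ignores the marginals.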

