Does an uneven sample size distribution across settings matter in cross-classified multilevel modeling? Results of a simulation study.

Carly E Milliren, Clare R Evans, Tracy K Richmond, Erin C Dunn
Author Information
  1. Carly E Milliren: Center for Applied Pediatric Quality Analytics, Boston Children's Hospital, 300 Longwood Avenue, Boston, MA 02115, USA; Division of Adolescent/Young Adult Medicine, Boston Children's Hospital, 300 Longwood Avenue, Boston, MA 02115, USA. Electronic address: carly.milliren@childrens.harvard.edu.
  2. Clare R Evans: Department of Sociology, University of Oregon, 736 PLC 1291, Eugene, OR 97403, USA. Electronic address: cevans@uoregon.edu.
  3. Tracy K Richmond: Division of Adolescent/Young Adult Medicine, Boston Children's Hospital, 300 Longwood Avenue, Boston, MA 02115, USA; Department of Pediatrics, Harvard Medical School, 25 Shattuck Street, Boston, MA 02115, USA. Electronic address: tracy.richmond@childrens.harvard.edu.
  4. Erin C Dunn: Psychiatric and Neurodevelopmental Genetics Unit, Center for Human Genetic Research, Massachusetts General Hospital, 55 Fruit Street, Boston, MA 02114, USA; Department of Psychiatry, Harvard Medical School, 401 Park Drive, Boston, MA 02215, USA; Stanley Center for Psychiatric Research, The Broad Institute of Harvard and MIT, 75 Ames Street, Cambridge, MA 02142, USA. Electronic address: edunn2@mgh.harvard.edu.

Abstract

BACKGROUND: Recent advances in multilevel modeling allow for modeling non-hierarchical levels (e.g., youth in non-nested schools and neighborhoods) using cross-classified multilevel models (CCMM). Current practice is to draw a clustered sample from one context (e.g., schools) and use the observations however they happen to be distributed across the second context (e.g., neighborhoods). However, it is unknown whether an uneven distribution of sample size across these contexts leads to incorrect estimates of random effects in CCMMs.
METHODS: Using the school and neighborhood data structure in Add Health, we examined the effect of neighborhood sample size imbalance on the estimation of variance parameters in models predicting BMI. We differentially assigned students from a given school to neighborhoods within that school's catchment area using three scenarios of (im)balance. For each of five combinations of school- and neighborhood-level variance under each of the three imbalance scenarios, 1000 random datasets were simulated, for a total of 15,000 simulated datasets. For each simulated dataset, we calculated 95% CIs for the variance parameters to determine whether the true simulated variance fell within the interval.
RESULTS: Across all simulations, the 95% CIs captured the "true" school and neighborhood variance parameters 93-96% of the time. Only 5% of models failed to capture neighborhood variance; 6% failed to capture school variance.
CONCLUSIONS: These results suggest that there is no systematic bias in the ability of CCMM to capture the true variance parameters regardless of the distribution of students across neighborhoods. Ongoing efforts to use CCMM are warranted and can proceed without concern for the sample imbalance across contexts.
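
The simulation design described in METHODS can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the study: the three scenario names and their assignment probabilities, the sample sizes, the overlapping catchment windows, the residual variance, and the function names. The sketch only generates cross-classified data and summarizes interval coverage; fitting the CCMM and computing the 95% CIs is not shown.

```python
import numpy as np

def catchment_probs(k, scenario):
    """Probability of a student landing in each of k catchment neighborhoods.

    The three scenarios are hypothetical stand-ins for the paper's
    (im)balance scenarios; the actual probabilities are not given there.
    """
    if scenario == "balanced":
        return np.full(k, 1.0 / k)
    if scenario == "moderate":
        w = np.arange(1, k + 1, dtype=float)
        return w / w.sum()
    if scenario == "extreme":
        p = np.full(k, 0.2 / (k - 1))  # requires k >= 2
        p[0] = 0.8                     # one neighborhood takes 80% of students
        return p
    raise ValueError(f"unknown scenario: {scenario}")

def simulate_dataset(scenario, var_school, var_nbhd, var_resid=1.0,
                     n_schools=20, students_per_school=50, k=5, seed=0):
    """Simulate y = u[school] + v[neighborhood] + e for one dataset."""
    rng = np.random.default_rng(seed)
    n_nbhd = 2 * n_schools  # assumed size of the neighborhood pool
    u = rng.normal(0.0, np.sqrt(var_school), n_schools)  # school effects
    v = rng.normal(0.0, np.sqrt(var_nbhd), n_nbhd)       # neighborhood effects
    school = np.repeat(np.arange(n_schools), students_per_school)
    # Each school's catchment is a window of k consecutive neighborhoods;
    # windows of adjacent schools overlap, so the two contexts are crossed
    # rather than nested.
    offset = rng.choice(k, size=school.size, p=catchment_probs(k, scenario))
    nbhd = (2 * school + offset) % n_nbhd
    e = rng.normal(0.0, np.sqrt(var_resid), school.size)
    return school, nbhd, u[school] + v[nbhd] + e

def coverage_rate(lower, upper, true_value):
    """Share of replications whose interval contains the true variance."""
    lower, upper = np.asarray(lower), np.asarray(upper)
    return float(np.mean((lower <= true_value) & (true_value <= upper)))

def monte_carlo_band(p=0.95, n_reps=1000, z=1.96):
    """Range of observed coverage expected from simulation noise alone."""
    se = np.sqrt(p * (1.0 - p) / n_reps)
    return p - z * se, p + z * se

# Design size stated in the abstract: 5 variance combinations
# x 3 imbalance scenarios x 1000 replications = 15,000 datasets.
n_datasets = 5 * 3 * 1000
```

With 1000 replications per condition, binomial noise alone would put observed coverage of a correctly calibrated 95% interval between roughly 93.6% and 96.4%, which gives some context for reading coverage figures of this kind.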

Grants

  1. K01 MH102403/NIMH NIH HHS
  2. P01 HD031921/NICHD NIH HHS
  3. R01 MH113930/NIMH NIH HHS

MeSH Terms

Bias
Catchment Area, Health
Computer Simulation
Humans
Likelihood Functions
Multilevel Analysis
Residence Characteristics
Sample Size
Schools
