Fast Inference for the Latent Space Network Model Using a Case-Control Approximate Likelihood.

Adrian E Raftery, Xiaoyue Niu, Peter D Hoff, Ka Yee Yeung
Author Information
  1. Adrian E Raftery: Department of Statistics, University of Washington, Seattle, Wash., USA.
  2. Xiaoyue Niu: Department of Statistics, University of Washington, Seattle, Wash., USA.
  3. Peter D Hoff: Department of Statistics, University of Washington, Seattle, Wash., USA.
  4. Ka Yee Yeung: Department of Statistics, University of Washington, Seattle, Wash., USA.

Abstract

Network models are widely used in the social sciences and genome sciences. The latent space model proposed by Hoff et al. (2002), and extended by Handcock et al. (2007) to incorporate clustering, provides a visually interpretable, model-based spatial representation of relational data and takes account of several intrinsic network properties. Because of the structure of the likelihood function of the latent space model, the computational cost is of order O(N^2), where N is the number of nodes. This makes it infeasible for large networks. In this paper, we propose an approximation of the log-likelihood function. We adopt the case-control idea from epidemiology and construct a case-control likelihood that is an unbiased estimator of the full likelihood. Replacing the full likelihood by the case-control likelihood in the MCMC estimation of the latent space model reduces the computational time from O(N^2) to O(N), making it feasible for large networks. We evaluate its performance using simulated and real data. We fit the model to a large protein-protein interaction data set using the case-control likelihood and use the model-fitted link probabilities to identify false positive links.
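The case-control idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the distance form of the latent space model, in which the log-odds of a link from node i to node j is beta minus the Euclidean distance between latent positions z_i and z_j. It also assumes simple random sampling of non-links ("controls") for each node. The function names, the `n_controls` parameter, and the uniform sampling scheme are hypothetical choices made for illustration. Every observed link ("case") is kept, and the sampled non-link terms are scaled up by the inverse sampling fraction, so the expected value of the approximation matches the full log-likelihood.

```python
import numpy as np


def full_loglik(Y, Z, beta):
    """Full log-likelihood: one term per ordered pair i != j, so O(N^2) terms."""
    N = Y.shape[0]
    # Pairwise Euclidean distances between latent positions.
    D = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    eta = beta - D                             # assumed log-odds of a link
    ll = Y * eta - np.logaddexp(0.0, eta)      # y*eta - log(1 + exp(eta))
    off_diagonal = ~np.eye(N, dtype=bool)      # exclude self-pairs
    return ll[off_diagonal].sum()


def case_control_loglik(Y, Z, beta, n_controls, rng):
    """Approximation: all links (cases) plus a reweighted sample of non-links."""
    N = Y.shape[0]
    total = 0.0
    for i in range(N):
        others = np.delete(np.arange(N), i)
        links = others[Y[i, others] == 1]
        nonlinks = others[Y[i, others] == 0]

        # Cases: every observed link from node i contributes exactly.
        eta1 = beta - np.linalg.norm(Z[links] - Z[i], axis=1)
        total += np.sum(eta1 - np.logaddexp(0.0, eta1))

        # Controls: sample at most n_controls non-links and scale the sum
        # by (number of non-links) / (number sampled).
        if nonlinks.size > 0:
            m = min(n_controls, nonlinks.size)
            sample = rng.choice(nonlinks, size=m, replace=False)
            eta0 = beta - np.linalg.norm(Z[sample] - Z[i], axis=1)
            total += (nonlinks.size / m) * np.sum(-np.logaddexp(0.0, eta0))
    return total
```

With a fixed number of controls per node and a sparse network, the number of likelihood terms evaluated grows linearly in N instead of quadratically. This sketch still scans a dense adjacency matrix to find each node's links, which is itself O(N^2) bookkeeping; a practical version would presumably store edge lists instead.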

Grants

  1. R01 GM084163/NIGMS NIH HHS
  2. R01 HD054511/NICHD NIH HHS
  3. R01 HD070936/NICHD NIH HHS
