Fuzzy association rules for biological data analysis: a case study on yeast.

Francisco J Lopez, Armando Blanco, Fernando Garcia, Carlos Cano, Antonio Marin
Author Information
  1. Francisco J Lopez: Department of Computer Science and AI, University of Granada, 18071, Granada, Spain. fjavier@decsai.ugr.es

Abstract

BACKGROUND: In recent years, the mapping of diverse genomes has generated huge amounts of biological data, which are currently dispersed across many databases. Integrating the information available in these databases is required to unveil possible associations among already known data. Biological data are often imprecise and noisy. Fuzzy set theory is especially suitable for modeling imprecise data, while association rules are well suited to integrating heterogeneous data.
RESULTS: In this work we propose a novel methodology for biological knowledge extraction based on a fuzzy association rule mining method. We apply this methodology to a yeast genome dataset containing heterogeneous information on structural and functional genome features. A number of association rules were found, many of them in agreement with previous research in the area. In addition, a comparison between crisp and fuzzy results shows the fuzzy associations to be more reliable than the crisp ones.
CONCLUSION: An integrative approach such as the one carried out in this work can unveil significant knowledge that is currently hidden and dispersed across the existing biological databases. Fuzzy association rules are shown to model this knowledge in an intuitive way, using linguistic labels and a few easily understandable parameters.
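
ILLUSTRATION: The abstract does not specify how rule quality is measured, so the following is only a minimal sketch of one common formulation of fuzzy association rules, not the authors' method. It assumes each gene is already described by membership degrees in [0, 1] for linguistic labels, combines labels with the minimum operator, and averages over genes to obtain support. The label names and the membership values in GENES are hypothetical.

```python
from typing import Dict, List

# Hypothetical data: each record holds membership degrees in [0, 1]
# for linguistic labels. Names and values are illustrative only.
GENES: List[Dict[str, float]] = [
    {"gc_content=LOW": 1.0, "expression=HIGH": 0.5},
    {"gc_content=LOW": 0.5, "expression=HIGH": 1.0},
    {"gc_content=LOW": 0.0, "expression=HIGH": 1.0},
    {"gc_content=LOW": 0.5, "expression=HIGH": 0.0},
]


def fuzzy_support(records: List[Dict[str, float]], items: List[str]) -> float:
    """Mean over records of the minimum membership degree across `items`.

    Assumption: the minimum is used to combine labels; other
    combination operators are possible.
    """
    if not records:
        return 0.0
    total = sum(min(r.get(item, 0.0) for item in items) for r in records)
    return total / len(records)


def fuzzy_confidence(records: List[Dict[str, float]],
                     antecedent: List[str],
                     consequent: List[str]) -> float:
    """Support of antecedent and consequent together, divided by
    the support of the antecedent alone."""
    denominator = fuzzy_support(records, antecedent)
    if denominator == 0.0:
        return 0.0
    return fuzzy_support(records, antecedent + consequent) / denominator


if __name__ == "__main__":
    rule_if = ["gc_content=LOW"]
    rule_then = ["expression=HIGH"]
    print("support:", fuzzy_support(GENES, rule_if + rule_then))
    print("confidence:", fuzzy_confidence(GENES, rule_if, rule_then))
```

In a crisp setting every degree would be 0 or 1, so a gene lying just outside an interval boundary would contribute nothing to a rule. With graded degrees such a gene still contributes partially, which is one way to read the abstract's point about imprecise data.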

MeSH Terms

Algorithms
Base Sequence
Chromosome Mapping
Databases, Genetic
Fuzzy Logic
Genome, Fungal
Molecular Sequence Data
Pattern Recognition, Automated
Saccharomyces cerevisiae
