Biocuration: Distilling data into knowledge.

International Society for Biocuration

Abstract

Data, including information generated from them by processing and analysis, are an asset with measurable value. The assets that biological research funding produces are the data generated, the information derived from these data, and, ultimately, the discoveries and knowledge these lead to. From the time when Henry Oldenburg published the first scientific journal in 1665 (Philosophical Transactions of the Royal Society) to the founding of the United States National Library of Medicine in 1879 to the present, there has been a sustained drive to improve how researchers can record and discover what is known. Researchers' experimental work builds upon years and (collectively) billions of dollars' worth of earlier work. Today, researchers are generating data at ever-faster rates because of advances in instrumentation and technology, coupled with decreases in production costs. Unfortunately, the ability of researchers to manage and disseminate their results has not kept pace, so their work cannot achieve its maximal impact. Strides have recently been made, but more awareness is needed of the essential role that biological data resources, including biocuration, play in maintaining and linking this ever-growing flood of data and information. The aim of this paper is to describe the nature of data as an asset, the role biocurators play in increasing its value, and consistent, practical means to measure effectiveness that can guide planning and justify costs in biological research information resources' development and management.

Grants

  1. R13 GM109648/NIGMS NIH HHS

MeSH Term

Data Aggregation
Data Science
Humans
Information Dissemination
Information Management
Knowledge
