Assessing the primary data hosted by the Spanish node of the Global Biodiversity Information Facility (GBIF).

Javier Otegui, Arturo H Ariño, María A Encinas, Francisco Pando
Author Information
  1. Javier Otegui: Department of Zoology and Ecology, University of Navarra, Pamplona, Navarra, Spain. javier.otegui@gmail.com

Abstract

Understanding and coping effectively with the current 'biodiversity crisis' requires sufficiently large sets of qualified data. Information facilitators such as the Global Biodiversity Information Facility (GBIF) are increasing the availability of primary biodiversity records by linking data collections spread over several institutions that have agreed to publish their data in a common access schema. We have assessed the primary records that one such publisher, the Spanish node of GBIF (GBIF.ES), hosts on behalf of a number of institutions, in order to determine the quantity and quality of the information made available. These records are considered a highly representative sample of the total mass of available data for a country, so our results may provide an indication of the overall fitness-for-use of these data. We have found a number of patterns in the availability and accrual of data that seem to arise naturally from the digitization processes. Knowing these patterns and features may help in deciding when and how these data can be used. Broadly, the error level seems low. The available data may be of capital importance for the development of biodiversity research, both locally and globally. However, wide swaths of records lack data elements such as georeferencing or taxonomic levels. Although the remaining information is ample and fit for many uses, improving the completeness of the records would likely widen the range of uses for these data.

MeSH Terms

Biodiversity
Databases, Factual
Forms and Records Control
Government Programs
Humans
Internet
Publishing
Spain

Links to CNCB-NGDC Resources

Database Commons: DBC004810 (GBIF)
