Metadata analysis of retracted fake papers in Naunyn-Schmiedeberg's Archives of Pharmacology.

Jonathan Wittau, Roland Seifert
Author Information
  1. Jonathan Wittau: Institute of Pharmacology, Hannover Medical School, Carl-Neuberg-Straße 1, 30625, Hannover, Germany.
  2. Roland Seifert: Institute of Pharmacology, Hannover Medical School, Carl-Neuberg-Straße 1, 30625, Hannover, Germany. seifert.roland@mh-hannover.de.

Abstract

An increasing fake paper problem is a cause for concern in the scientific community. These papers look scientific but contain manipulated data or are completely fictitious. So-called paper mills produce fake papers on a large scale and publish them in the name of people who buy authorship. The aim of this study was to learn more about the characteristics of fake papers at the metadata level. We also investigated whether some of these characteristics could be used to detect fake papers. For that purpose, we examined metadata of 12 fake papers that were retracted by Naunyn-Schmiedeberg's Archives of Pharmacology (NSAP) in recent years. We also compared many of these metadata with those of a reference group of 733 articles published by NSAP. It turned out that in many characteristics the fake papers we examined did not differ substantially from the other articles. It was only noticeable that the fake papers came almost exclusively from a certain country, used non-institutional email addresses more often than average, and referenced dubious literature significantly more often. However, these three features are only of limited use in identifying fake papers. We were also able to show that fake papers not only contaminate the scientific record while they are unidentified but also continue to do so even after retraction. Our results indicate that fake papers are well made and resemble honest papers even at the metadata level. Because they contaminate the scientific record in the long term and this cannot be fully contained even by their retraction, it is particularly important to identify them before publication. Further research on the topic of fake papers is therefore urgently needed.

MeSH Term

Metadata
Pharmacology
Periodicals as Topic
Scientific Misconduct
Humans
Retraction of Publication as Topic
Authorship
