Shokouh Shakouri: Department of Medical Informatics, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.
Mohammad Amin Bakhshali: Department of Medical Informatics, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.
Parvaneh Layegh: Department of Radiology, Faculty of Medicine, Imam Reza Hospital, Mashhad University of Medical Sciences, Mashhad, Iran.
Behzad Kiani: Department of Medical Informatics, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.
Farid Masoumi: Department of Medical Informatics, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.
Saeedeh Ataei Nakhaei: Nuclear Medicine Research Center, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran.
Sayyed Mostafa Mostafavi: Department of Medical Informatics, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran. MostafaviTM@mums.ac.ir.
OBJECTIVES: The ongoing coronavirus disease 2019 (COVID-19) pandemic has drastically affected global health and the economy. Computed tomography (CT) is the primary imaging modality for diagnosing lung infections in COVID-19 patients. Data-driven and artificial intelligence (AI)-powered solutions for automatic processing of CT images rely predominantly on large-scale, heterogeneous datasets. Owing to privacy and data availability issues, open-access COVID-19 CT datasets are difficult to obtain, which limits the development of AI-enabled automatic diagnostic solutions. To tackle this problem, large CT image datasets encompassing diverse patterns of lung infection are in high demand.
DATA DESCRIPTION: In the present study, we provide an open-source repository containing 1000+ CT images of COVID-19 lung infections established by a team of board-certified radiologists. The CT images were acquired at two main general university hospitals in Mashhad, Iran, from March 2020 until January 2021. COVID-19 infections were confirmed by corroborating tests, including reverse transcription polymerase chain reaction (RT-PCR), and by accompanying clinical symptoms. All data are 16-bit grayscale images of 512 × 512 pixels, stored in the DICOM standard format. Patient privacy is preserved by removing all patient-specific information from the image headers. Subsequently, all images belonging to each patient are compressed and stored in RAR format.
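The image format stated in the data description (512 × 512 pixels, 16-bit grayscale, de-identified headers) can be checked for each slice once a patient archive has been extracted. The following is a minimal sketch in Python, not part of the published repository. It assumes a header object that exposes the standard DICOM attributes `Rows`, `Columns`, `BitsAllocated`, `PhotometricInterpretation` and `PatientName` as Python attributes, as the dataset returned by `pydicom.dcmread` does. The function name `check_slice` and the list `IDENTIFYING_FIELDS` are hypothetical, and the abstract does not state which header fields were actually removed.

```python
from types import SimpleNamespace

# Header attributes treated as patient-identifying in this sketch (an assumed,
# non-exhaustive list; the abstract does not say which fields were removed).
IDENTIFYING_FIELDS = ("PatientName", "PatientID", "PatientBirthDate")


def check_slice(ds):
    """Return a list of deviations from the described format for one slice."""
    problems = []
    rows = getattr(ds, "Rows", None)
    cols = getattr(ds, "Columns", None)
    if (rows, cols) != (512, 512):
        problems.append(f"unexpected size {rows} x {cols}")
    if getattr(ds, "BitsAllocated", None) != 16:
        problems.append("pixel data is not stored in 16 bits")
    photometric = getattr(ds, "PhotometricInterpretation", None)
    if photometric not in ("MONOCHROME1", "MONOCHROME2"):
        problems.append("image is not grayscale")
    # A missing or empty identifying field counts as de-identified.
    for name in IDENTIFYING_FIELDS:
        if str(getattr(ds, name, "") or "").strip():
            problems.append(f"{name} is still populated")
    return problems


# Stand-in for a header read from disk, so the sketch runs without any files.
example = SimpleNamespace(Rows=512, Columns=512, BitsAllocated=16,
                          PhotometricInterpretation="MONOCHROME2",
                          PatientName="")
print(check_slice(example))
```

With real data, the same function could be applied to `pydicom.dcmread(path)` for every file extracted from a patient's RAR archive. An empty list means the slice matches the description given above.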