Research data infrastructure for high-throughput experimental materials science.
Kevin R Talley, Robert White, Nick Wunder, Matthew Eash, Marcus Schwarting, Dave Evenson, John D Perkins, William Tumas, Kristin Munch, Caleb Phillips, Andriy Zakutayev
Author Information
Kevin R Talley: Materials, Chemical and Computational Science Directorate, National Renewable Energy Laboratory, Golden, CO 80401, USA.
Robert White: Materials, Chemical and Computational Science Directorate, National Renewable Energy Laboratory, Golden, CO 80401, USA.
Nick Wunder: Materials, Chemical and Computational Science Directorate, National Renewable Energy Laboratory, Golden, CO 80401, USA.
Matthew Eash: Materials, Chemical and Computational Science Directorate, National Renewable Energy Laboratory, Golden, CO 80401, USA.
Marcus Schwarting: Materials, Chemical and Computational Science Directorate, National Renewable Energy Laboratory, Golden, CO 80401, USA.
Dave Evenson: Materials, Chemical and Computational Science Directorate, National Renewable Energy Laboratory, Golden, CO 80401, USA.
John D Perkins: Materials, Chemical and Computational Science Directorate, National Renewable Energy Laboratory, Golden, CO 80401, USA.
William Tumas: Materials, Chemical and Computational Science Directorate, National Renewable Energy Laboratory, Golden, CO 80401, USA.
Kristin Munch: Materials, Chemical and Computational Science Directorate, National Renewable Energy Laboratory, Golden, CO 80401, USA.
Caleb Phillips: Materials, Chemical and Computational Science Directorate, National Renewable Energy Laboratory, Golden, CO 80401, USA.
Andriy Zakutayev: Materials, Chemical and Computational Science Directorate, National Renewable Energy Laboratory, Golden, CO 80401, USA.
The High-Throughput Experimental Materials Database (HTEM-DB, htem.nrel.gov) is a repository of inorganic thin-film materials data collected during combinatorial experiments at the National Renewable Energy Laboratory (NREL). This data asset is enabled by NREL's Research Data Infrastructure (RDI), a set of custom data tools that collect, process, and store experimental data and metadata. Here, we describe the experimental data flow from the RDI to the HTEM-DB to illustrate the strategies and best practices currently used for materials data at NREL. Integration of the data tools with experimental instruments establishes a data communication pipeline between experimental researchers and data scientists. This work motivates the creation of similar workflows at other institutions to aggregate valuable data and increase their usefulness for future machine learning studies. In turn, such data-driven studies can greatly accelerate the pace of discovery and design in the materials science domain.