Rapidly growing data volumes at light sources demand increasingly automated data collection, distribution, and analysis processes, in order to enable new scientific discoveries while not overwhelming finite human capabilities. We present here the case for automating and outsourcing light source science using cloud-hosted data automation and enrichment services, institutional computing resources, and high-performance computing facilities to provide cost-effective, scalable, and reliable implementations of such processes. We discuss three specific services that accomplish these goals for data distribution, automation, and transformation. In the first, Globus cloud-hosted data automation services are used to implement data capture, distribution, and analysis workflows for Advanced Photon Source and Advanced Light Source beamlines, leveraging institutional storage and computing. In the second, such services are combined with cloud-hosted data indexing and institutional storage to create a collaborative data publication, indexing, and discovery service, the Materials Data Facility (MDF), built to support a host of informatics applications in materials science. The third integrates components of the previous two projects with machine learning capabilities provided by the Data and Learning Hub for science (DLHub) to enable on-demand access to machine learning models from light source data capture and analysis workflows, and provides simplified interfaces to train new models on data from sources such as MDF on leadership-scale computing resources. We draw conclusions about best practices for building next-generation data automation systems for future light sources.
15 January 2019
PROCEEDINGS OF THE 13TH INTERNATIONAL CONFERENCE ON SYNCHROTRON RADIATION INSTRUMENTATION – SRI2018
11–15 June 2018
Taipei, Taiwan
Research Article
Data automation at light sources
Ben Blaiszik (1)
Kyle Chard (1)
Ryan Chard (1)
Ian Foster (1, 2, a)
Logan Ward (1)
1 Data Science and Learning Division, Argonne National Laboratory, Argonne IL 60439, USA
2 Department of Computer Science, University of Chicago, Chicago IL 60637, USA
a) Corresponding author: foster@anl.gov
AIP Conf. Proc. 2054, 020003 (2019)
Citation
Ben Blaiszik, Kyle Chard, Ryan Chard, Ian Foster, Logan Ward; Data automation at light sources. AIP Conf. Proc. 15 January 2019; 2054 (1): 020003. https://doi.org/10.1063/1.5084563