The Open Knowledgebase of Interatomic Models (OpenKIM) is a framework intended to facilitate access to standardized implementations of interatomic models for molecular simulations along with computational protocols to evaluate them. These protocols include tests to compute material properties predicted by models and verification checks to assess their coding integrity. While housing this content in a unified, publicly available environment constitutes a major step forward for the molecular modeling community, it further presents the opportunity to understand the range of validity of interatomic models and their suitability for specific target applications. To this end, OpenKIM includes a computational pipeline that runs tests and verification checks using all available interatomic models contained within the OpenKIM Repository at https://openkim.org. The OpenKIM Processing Pipeline is built on a set of Docker images hosted on distributed, heterogeneous hardware and utilizes open-source software to automatically run test–model and verification check–model pairs and resolve dependencies between them. The design philosophy and implementation choices made in the development of the pipeline are discussed as well as an example of its application to interatomic model selection.
REFERENCES
For HPC environments, where Docker is typically unavailable because it requires a root-owned daemon, Singularity images can be built directly from Docker images.
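For instance, Singularity's `build` command accepts a `docker://` reference and produces an image file. A minimal sketch, invoked from Python for consistency with the other examples here (the image name and registry path are hypothetical placeholders):

```python
import subprocess

# Convert a Docker image into a Singularity image file.
# "openkim-worker" and the registry path are hypothetical placeholders.
subprocess.run(
    ["singularity", "build", "openkim-worker.sif",
     "docker://example-registry.org/openkim/worker:latest"],
    check=True,  # raise if the build fails
)
```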
There are only two parts of the process shown in Fig. 2 that the Web App is aware of: (1) the submission of a new item, at which point it notifies the Gateway’s control API in step 1; and (2) the periodic scan of some of its directories to check whether the Gateway has uploaded new results or errors. This loose coupling obviates the need for explicit synchronization between the Web App and the Gateway.
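A minimal sketch of these two touchpoints, with hypothetical endpoint, payload, and directory names (the actual control API and directory layout are not specified here):

```python
import requests
from pathlib import Path

GATEWAY_API = "https://gateway.example.org/control"  # hypothetical endpoint
RESULTS_DIR = Path("/pipeline/incoming/results")     # hypothetical paths
ERRORS_DIR  = Path("/pipeline/incoming/errors")

def notify_new_item(item_id):
    """Step 1: inform the Gateway's control API that a new item was submitted."""
    requests.post(f"{GATEWAY_API}/new-item", json={"id": item_id}, timeout=10)

def scan_for_uploads():
    """Periodic check: collect any results or errors the Gateway has uploaded.
    The directories themselves are the interface; no handshake is required."""
    return sorted(RESULTS_DIR.iterdir()) + sorted(ERRORS_DIR.iterdir())
```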
Note that the KIM-property Python package (https://github.com/openkim/kim-property) can be used to create and write property instances. A native implementation in LAMMPS is also available.
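A minimal sketch of that workflow, following the create/modify/dump pattern in the package's documentation (the property name and key values here are illustrative):

```python
from kim_property import (
    kim_property_create,
    kim_property_modify,
    kim_property_dump,
)

# Create an empty instance (ID 1) of a property definition; valid
# property names are listed at https://openkim.org/properties.
prop = kim_property_create(1, "cohesive-energy-relation-cubic-crystal")

# Fill in keys of the instance using the package's key/source-value syntax.
prop = kim_property_modify(
    prop, 1,
    "key", "short-name", "source-value", "1", "fcc",
    "key", "species", "source-value", "1:4", "Al", "Al", "Al", "Al",
)

# Serialize all property instances to an EDN results file.
kim_property_dump(prop, "results.edn")
```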
Strictly speaking, what are listed are lineages of Tests, which encompass all versions of a given Test. The dependency is always taken to correspond to the latest existing version in that lineage.
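For illustration, resolution to the latest version within a lineage can be done by comparing the three-digit version suffix of the extended KIM IDs (the 12-digit code identifies the lineage); the helper below is a hypothetical sketch, not pipeline code:

```python
import re

# Extended KIM IDs end in "<prefix>_<12-digit code>_<3-digit version>",
# e.g. "..._TE_012345678901_005".  Items sharing the 12-digit code form
# a lineage.
KIM_ID = re.compile(r"_(?P<prefix>[A-Z]{2})_(?P<code>\d{12})_(?P<version>\d{3})$")

def latest_in_lineage(item_ids):
    """Return the member of a lineage with the highest version number."""
    return max(item_ids, key=lambda i: int(KIM_ID.search(i)["version"]))
```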
This applies when a new version of an existing Test is uploaded, which forces its downstream dependents to be rerun. Removing these downstream dependents from the list is necessary because their jobs could otherwise eventually be run twice when downstream resolution is performed on the Test Results of the jobs associated with the other items. However, this mechanism can fail when more complicated structures exist in the dependency graph. A point of future work is to address this shortcoming with a global graph traversal method, e.g., a topological sorting algorithm, while taking care not to needlessly sequentialize jobs in independent branches.
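To make the proposed remedy concrete, the sketch below shows one way (not the pipeline's implementation) that a Kahn-style topological sort can group jobs into levels, so that independent branches within a level remain free to run concurrently:

```python
from collections import defaultdict, deque

def schedule_levels(deps):
    """Group jobs into topologically ordered levels.

    `deps` maps each job to the set of jobs it depends on (all jobs are
    assumed to appear as keys).  Jobs within a level are mutually
    independent and may run concurrently, so independent branches are
    not needlessly sequentialized.
    """
    indegree = {job: len(d) for job, d in deps.items()}
    dependents = defaultdict(list)
    for job, d in deps.items():
        for dep in d:
            dependents[dep].append(job)

    ready = deque(job for job, n in indegree.items() if n == 0)
    levels = []
    while ready:
        level = list(ready)
        ready.clear()
        for job in level:
            for child in dependents[job]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    ready.append(child)
        levels.append(level)

    if sum(map(len, levels)) != len(deps):
        raise ValueError("cycle detected in dependency graph")
    return levels

# Example: D depends on B and C, which both depend on A.
# schedule_levels({"A": set(), "B": {"A"}, "C": {"A"}, "D": {"B", "C"}})
# -> [["A"], ["B", "C"], ["D"]]
```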