Pesticides benefit agriculture by increasing crop yield, quality, and security. However, pesticides may inadvertently harm bees, which are valuable as pollinators. Thus, candidate pesticides in development pipelines must be assessed for toxicity to bees. Leveraging a dataset of 382 molecules with toxicity labels from honey bee exposure experiments, we train a support vector machine (SVM) to predict the toxicity of pesticides to honey bees. We compare two representations of the pesticide molecules: (i) a random walk feature vector listing counts of length-L walks on the molecular graph with each vertex- and edge-label sequence and (ii) the Molecular ACCess System (MACCS) structural key fingerprint (FP), a bit vector indicating the presence/absence of a list of pre-defined subgraph patterns in the molecular graph. We explicitly construct the MACCS FPs but, for the random walk representation, rely on the fixed-length-L random walk graph kernel (RWGK) in place of the dot product rather than constructing the feature vectors. The L-RWGK-SVM achieves an accuracy, precision, recall, and F1 score (mean over 2000 runs) of 0.81, 0.68, 0.71, and 0.69, respectively, on the test data set, with L = 4 being the modal (most frequently selected) optimal walk length. The MACCS-FP-SVM performs on par with or marginally better than the L-RWGK-SVM and lends more interpretability, but varies more in performance. We interpret the MACCS-FP-SVM by illuminating which subgraph patterns in the molecules tend to strongly push them toward the toxic/non-toxic side of the separating hyperplane.
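As an illustration of the kernel idea in the abstract, the following is a minimal sketch, not the authors' implementation, of a fixed-length-L random walk kernel on small labeled graphs. It assumes a hypothetical graph representation (a list of atom labels plus an adjacency map with bond labels), counts walks as directed vertex/edge label sequences, and uses toy heavy-atom graphs for ethanol and methanol. The function names `make_graph`, `walk_features`, `rw_kernel`, and `dot` are illustrative only. The sketch shows that the kernel, computed by dynamic programming over label-matching vertex pairs (the direct product graph), equals the dot product of the explicit walk-count feature vectors.

```python
from collections import Counter

def make_graph(atoms, bonds):
    """atoms: list of vertex labels; bonds: list of (u, v, edge_label), undirected."""
    adj = {u: [] for u in range(len(atoms))}
    for u, v, b in bonds:
        adj[u].append((v, b))
        adj[v].append((u, b))
    return atoms, adj

def walk_features(graph, L):
    """Explicit feature vector: count of length-L walks per label sequence."""
    atoms, adj = graph
    counts = Counter()

    def extend(v, seq, steps):
        if steps == L:
            counts[seq] += 1
            return
        for w, b in adj[v]:
            extend(w, seq + (b, atoms[w]), steps + 1)

    for v in adj:
        extend(v, (atoms[v],), 0)
    return counts

def dot(f1, f2):
    """Dot product of two sparse count vectors."""
    return sum(c * f2[s] for s, c in f1.items())

def rw_kernel(g1, g2, L):
    """Number of pairs of length-L walks (one per graph) with identical
    vertex- and edge-label sequences, without enumerating the walks."""
    atoms1, adj1 = g1
    atoms2, adj2 = g2
    # n[(u, v)]: matching walk pairs of the current length ending at (u, v)
    n = {(u, v): 1 for u in adj1 for v in adj2 if atoms1[u] == atoms2[v]}
    for _ in range(L):
        nxt = Counter()
        for (u, v), c in n.items():
            for u2, b1 in adj1[u]:
                for v2, b2 in adj2[v]:
                    if b1 == b2 and atoms1[u2] == atoms2[v2]:
                        nxt[(u2, v2)] += c
        n = nxt
    return sum(n.values())

# Toy heavy-atom graphs (assumption: hydrogens omitted, "-" = single bond)
ethanol = make_graph(["C", "C", "O"], [(0, 1, "-"), (1, 2, "-")])
methanol = make_graph(["C", "O"], [(0, 1, "-")])

for L in range(4):
    k = rw_kernel(ethanol, methanol, L)
    explicit = dot(walk_features(ethanol, L), walk_features(methanol, L))
    print(L, k, explicit)
```

A Gram matrix of such kernel values between all pairs of molecules could then be passed to an SVM that accepts a precomputed kernel, which avoids building the walk-count vectors explicitly.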
Classifying the toxicity of pesticides to honey bees via support vector machines with random walk graph kernels
21 July 2022
Research Article
July 15 2022
Ping Yang,1 E. Adrian Henle,1 Xiaoli Z. Fern,2 and Cory M. Simon1,a)
Author contributions:
Ping Yang (Data curation, Formal analysis, Investigation, Methodology, Software, Writing – original draft, Writing – review & editing)
E. Adrian Henle (Formal analysis, Methodology, Software, Validation, Writing – review & editing)
Xiaoli Z. Fern (Methodology, Supervision, Writing – original draft, Writing – review & editing)
Cory M. Simon (Conceptualization, Software, Supervision, Writing – original draft, Writing – review & editing)
1 School of Chemical, Biological, and Environmental Engineering, Oregon State University, Corvallis, Oregon 97331, USA
2 School of Electrical Engineering and Computer Science, Oregon State University, Corvallis, Oregon 97331, USA
a) Author to whom correspondence should be addressed: [email protected]
J. Chem. Phys. 157, 034102 (2022)
Article history: Received March 08 2022; Accepted May 24 2022
Citation
Ping Yang, E. Adrian Henle, Xiaoli Z. Fern, Cory M. Simon; Classifying the toxicity of pesticides to honey bees via support vector machines with random walk graph kernels. J. Chem. Phys. 21 July 2022; 157 (3): 034102. https://doi.org/10.1063/5.0090573