BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Memento EPFL//
BEGIN:VEVENT
SUMMARY:"Machine learning in chemistry and beyond" (ChE-651) seminar by Co
 ry Simon "Classifying the toxicity of pesticides to honey bees via support
  vector machines with random walk graph kernels"
DTSTART:20220607T171500
DTEND:20220607T181500
DTSTAMP:20260610T054736Z
UID:df9776a2259c6c969819022e6104faed7462c0f984b2af7d21d2ee39
CATEGORIES:Conferences - Seminars
DESCRIPTION:Cory Simon hails from a small town in Ohio. He earned his B.S.
  in Chemical Engineering from the University of Akron. He then studied mat
 hematics at the University of British Columbia in Vancouver\, Canada for t
 wo years. In 2016\, he earned his Ph.D. in Chemical Engineering from the U
 niversity of California\, Berkeley. He conducted scientific research at Vi
 rginia Tech\, Okinawa Institute of Science and Technology\, Lawrence Berke
 ley National Laboratory\, École Polytechnique Fédérale de Lausanne\, an
 d Altius Institute for Biomedical Sciences and interned in industry at Bri
 dgestone Research (chemical engineering) and Stitch Fix (data science). Si
 nce 2017\, Cory has been an assistant professor at Oregon State University
  in th
 e School of Chemical\, Biological\, and Environmental Engineering. His res
 earch group employs molecular models and simulations\, machine learning\, 
 and statistical mechanics to discover nanoporous materials for gas storage
 \, separations\, and sensing. Cory digs hiking/backpacking in scenic place
 s\, snowboarding\, wine\, and going on walks with his dog\, Oslo.\n\n
 Pestici
 des benefit agriculture by increasing crop yield\, quality\, and security.
  However\, pesticides may inadvertently harm bees\, which are valuable as 
 pollinators. Thus\, candidate pesticides in development pipelines must be 
 assessed for toxicity to bees. \n\nLeveraging a data set of 382 molecules
  with toxicity labels from honey bee exposure experiments\, we train a sup
 port vector machine (SVM) to predict the toxicity of pesticides to honey b
 ees. We compare two representations of the pesticide molecules: (i) a rand
 om walk feature vector listing counts of length-L walks on the molecular g
 raph with each vertex- and edge-label sequence and (ii) the MACCS structur
 al key fingerprint (FP)\, a bit vector indicating the presence/absence of 
 a list of pre-defined subgraph patterns in the molecular graph. We explici
 tly construct the MACCS FPs\, but rely on the fixed-length-L random walk g
 raph kernel (RWGK) in place of the dot product for the random walk represe
 ntation. \n\nThe L-RWGK-SVM achieves an accuracy\, precision\, recall\, a
 nd F1 score (mean over 2000 runs) of 0.81\, 0.68\, 0.71\, and 0.69 on the 
 test data set\, with L=4 being the most frequently optimal walk length.
  The MACCS-FP-SVM performs on par with or marginally better than the
  L-RWGK-SVM and lends more interpretability\, but varies more in
  performance. We interpret the MACCS-FP-SVM by
  illuminating which subgraph patterns in the molecules tend to strongly pu
 sh them towards the toxic/non-toxic side of the separating hyperplane. 
LOCATION:https://epfl.zoom.us/j/64473017589?pwd=Vmpnd1pleGhEb1hFb3kxUlNIUW
 JyQT09
STATUS:CONFIRMED
END:VEVENT
END:VCALENDAR
