The Fall 2021 MatSE 590 seminar series for graduate students has an exciting, jam-packed schedule. MatSE 590 is a colloquium (1-3 credits) consisting of a series of individual lectures by faculty, students, or outside speakers.
Graduate students will receive a weekly email with seminar information at their @psu.edu address. Graduate students are required to attend all 590 seminars. If you have any questions, please email Hayley Barnes at firstname.lastname@example.org.
*Due to the ongoing COVID-19 pandemic, this program is being offered virtually through Zoom. Please refer to the weekly email from Hayley Barnes (email@example.com) for the Zoom link.
December 2, 2021
“Making Machine Learning Work in Chemical and Materials Research”
Johannes Hachmann, Ph.D.
Associate Professor, Department of Chemical and Biological Engineering, University at Buffalo, The State University of New York
The process of developing new chemistry and materials is increasingly driven by computational modeling and simulation, which allow us to characterize candidates before pursuing them in the laboratory. The use of modern machine learning, informatics, and virtual screening approaches is a relatively new development in the chemical and materials domain. Yet, it holds tremendous promise for the practical realization of accelerated discovery, rational design, and inverse engineering. We present a high-throughput computational study to identify novel polymers with exceptional refractive index (RI) values for use as optic or optoelectronic materials. Our study utilizes an RI prediction protocol based on a combination of first-principles and data modeling, which we employ on a large-scale library of candidate compounds. We deploy our virtual screening software ChemHTPS to automate the assessment of this extensive pool of polymer structures in order to determine the performance potential of each candidate. This rapid and efficient approach yields a number of highly promising lead compounds. Using the data mining and machine learning program package ChemML, we perform a materials informatics analysis of the top candidates, e.g., with respect to prevalent structural features and feature combinations that distinguish them from less promising ones. In particular, we explore the utility of various strategies that introduce highly polarizable moieties into the polymer backbone to increase its RI yield. We will show how the results of traditional modeling approaches can be calibrated, as well as how we can devise new data-derived models that are predictive and that can augment or replace physics-derived models at a fraction of the computational cost. The structure-property relationships revealed by data mining serve as the foundation for targeted de novo design of novel compounds with tailored properties.
Data science techniques have been exceedingly successful in other application fields, and there is no fundamental reason why they should not have a similarly transformative impact on chemical and materials research. However, adapting techniques from other application domains for the study of chemical and materials systems requires a substantial rethinking and redevelopment of the existing methods. In this presentation, we will also discuss our work on designing advanced, physics-infused neural network architectures, the fusion of unsupervised clustering with supervised regression for local ensemble models, active and transfer learning techniques, bootstrapping approaches to minimize our training data footprint, methods to increase the applicability domain of data-derived models, and automated hyperparameter optimization.
Johannes Hachmann is an Associate Professor of Chemical Engineering at the University at Buffalo (UB), the Director of the Engineering Science in Data Science graduate program, a Core Member of the UB Computational and Data-Enabled Science and Engineering graduate program, and a Faculty Member of the New York State Center of Excellence in Materials Informatics. He earned a Dipl.-Chem. degree (2004) after undergraduate studies at the universities of Jena and Cambridge, M.Sc. (2007) and Ph.D. (2010) degrees in Chemistry from Cornell University, and he conducted postdoctoral research at Harvard University before joining the UB faculty in 2014. The research of the Hachmann Group fuses (first-principles) molecular and materials modeling with virtual high-throughput screening and modern data science (i.e., the use of database technology, machine learning, and informatics) to advance a data-driven discovery and rational design paradigm in the chemical and materials disciplines. One of the centerpieces of the group’s efforts is the creation of an open, general-purpose software ecosystem for the data-driven design of chemical systems and the exploration of chemical space. This work was recognized with a 2018 NSF CAREER Award.