CBE - Solving the Structure Identification Problem with Machine Learning

The structure identification problem consists of determining the chemical structure of a species from indirect pieces of information such as spectra or property measurements. Structure identification is a bottleneck in many research, industrial, and regulatory contexts, and the absence of a general solution has enormous societal costs. For example, consider the importance of identifying impurities in plastics, foodstuffs, or pharmaceuticals, or the mechanistic information that could be gleaned from inexpensively determining degradation products, minor products of chemical reactions, or natural products produced by organisms.

The state of the art for structure identification is still manual expert interpretation of spectra. This project seeks to change that by using machine learning to reason more reliably and comprehensively from commonly available pieces of information. The ideal candidate will be eager to learn, have a deep passion for puzzles, and be excited by applying machine learning to a chemical context. The technical skills learned through participation include training machine learning models (transformer and U-Net architectures to start), working with large datasets, understanding some common physics-based modeling approaches, and working with GPU resources. Student(s) will work with senior graduate students and Prof. Savoie and will be expected to work at a level that merits authorship on an eventual publication. A two-semester commitment is required.

Name of research group, project, or lab
The Savoie Research Group
Why join this research group or lab?

The Savoie Research Group develops new methods for predicting, simulating, and designing organic materials. Working with this group will give you broad experience beyond just machine learning.

Logistics Information:
Project categories
Chemical and Biomolecular Engineering
Student ranks applicable
First Year
Sophomore
Junior
Student qualifications

The ideal candidate will be eager to learn, have a deep passion for puzzles, and be excited by applying machine learning to a chemical context. No other prerequisites are needed. Student(s) will work with senior graduate students and Prof. Savoie and will be expected to work at a level that merits authorship on an eventual publication. A two-semester commitment is required.

Hours per week
3 credits / 12+ hours
Compensation
Research for Credit
Number of openings
3
Techniques learned

The technical skills learned through participation include training machine learning models (transformer and U-Net architectures to start), working with large datasets, understanding some common physics-based modeling approaches, and working with GPU resources.

Project start
Spring 2025
Contact Information:
Mentor
bsavoie2@nd.edu
Professor
Name of project director or principal investigator
Brett Savoie
Email address of project director or principal investigator
bsavoie2@nd.edu