Prediction of molecular and reaction properties
Project Description
Machine learning is increasingly used to accelerate discovery in chemistry. Most commonly, supervised learning is used to predict the properties of molecules and reactions, and these predictions inform new developments. The prediction of molecular and reaction properties has therefore been at the center of machine learning research in chemistry. The key challenges lie in developing appropriate representations, choosing a suitable model architecture, and, in the long run, developing better-quality datasets.

This project focuses on developing models that represent chemical molecules and reactions, and on using them to predict molecular properties such as toxicity and solubility, and reaction properties such as yield and impurities. Different representations will be explored, and physical or chemical knowledge can be incorporated to improve the effectiveness of the models.
Supervisor
GAO Hanyu
Quota
1
Course type
UROP1100
Applicant's Roles
In this project, you will build machine learning models for the prediction of molecular and reaction properties. Initially, the focus will be on solubility and reaction yield, both of which are important considerations when designing synthetic routes for new molecules. You will go through the full pipeline: defining and acquiring data, constructing and testing models, and connecting the models with downstream applications.
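As a rough illustration of that pipeline (data, representation, model, testing), here is a minimal Python sketch. It is not the project's actual method. The character-count featurizer, the `VOCAB` list, the nearest-neighbour model, and all function names are assumptions chosen for brevity. The target is a synthetic stand-in (the number of heavy atoms read directly from the SMILES string) rather than measured solubility or yield, so no experimental values are implied.

```python
import random
from collections import Counter

# Hypothetical feature vocabulary: a few SMILES characters to count.
VOCAB = ["C", "O", "N", "=", "(", "c"]


def featurize(smiles):
    """Represent a SMILES string as a fixed-length vector of character counts."""
    counts = Counter(smiles)
    return [counts[ch] for ch in VOCAB]


def synthetic_target(smiles):
    """Stand-in label: number of atom letters in the string (NOT a measured property)."""
    return sum(ch.isalpha() for ch in smiles)


def split(records, test_fraction=0.25, seed=0):
    """Shuffle the records and split them into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def predict_1nn(train, x):
    """Return the label of the training point closest to x (squared Euclidean distance)."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    nearest = min(train, key=lambda rec: dist(rec[0], x))
    return nearest[1]


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between true and predicted labels."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# 1. Data: a handful of simple molecules written as SMILES strings.
smiles_list = ["CCO", "CCCO", "CC(=O)O", "c1ccccc1", "CCN", "CCCC", "CC=O", "OCCO"]

# 2. Representation and labels.
records = [(featurize(s), synthetic_target(s)) for s in smiles_list]

# 3. Model construction and testing on held-out molecules.
train, test = split(records)
predictions = [predict_1nn(train, x) for x, _ in test]
mae = mean_absolute_error([y for _, y in test], predictions)
```

In practice, the character counts would likely be replaced by a cheminformatics representation such as molecular fingerprints or graphs, and the synthetic label by curated experimental data. The overall structure of featurizing, splitting, fitting, and evaluating stays the same.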
Applicant's Learning Objectives
Through working on the project, you are expected to gain experience and expertise in the following:
1. Programming;
2. Knowledge about organic chemistry and chemical reactions;
3. Machine learning algorithms;
4. Cheminformatics methods.
Complexity of the project
Challenging