Exploring The Chemical Subspace of RPLC: A Data-driven Approach

Publication
Article
Column, November 2024
Volume 20
Issue 11
Pages: 2–4

Saer Samanipour from the Van ‘t Hoff Institute for Molecular Sciences (HIMS) at the University of Amsterdam spoke to LCGC International about the benefits of a data-driven reversed-phase liquid chromatography (RPLC) approach his team developed to enhance RPLC method development, including increased efficiency for non-targeted analysis and suspect screening, a reduction in the number of false positives produced, and a predictive way to determine if a chemical can be separated using RPLC.

You recently published a paper entitled “Exploring The Chemical Subspace of RPLC: A Data-driven Approach” (1). What was the rationale behind this research? Why is exploring the chemical subspace important and what applications did you explore?

The main hypothesis here is that not all organic small molecules can actually be analyzed using reversed-phase liquid chromatography (RPLC) or any other specific selectivity. Moreover, it is difficult, almost impossible, to say which organic molecules are measurable by a specific method. At the moment, this is mainly done by assuming a linear relationship between the hydrophobicity of chemicals and their retention behavior. We have seen time and time again that this assumption is not accurate, which implies that a lack of detection for a chemical does not necessarily mean that it is absent from the sample. Therefore, we decided to see whether the structure of a chemical can give us enough information to assess its measurability by RPLC.

In terms of application, this approach has mainly been used for method development and structural elucidation during non-targeted analysis. As an example, you could easily exclude chemicals that are not measurable using RPLC when screening your samples against large databases such as Norman SuSDat. This reduces the number of candidates and consequently the number of false identifications.
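The candidate-exclusion step described here can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the `Candidate` record, the `rplc_measurable` flag (standing in for the output of some measurability model), the 5 ppm mass tolerance, and the placeholder entries, none of which come from the paper or from Norman SuSDat.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One suspect-list entry (all fields are illustrative)."""
    name: str
    mass_error_ppm: float   # mass match against the measured feature
    rplc_measurable: bool   # hypothetical flag from a measurability model


def filter_candidates(candidates, max_ppm=5.0):
    """Keep candidates that match by mass AND are predicted RPLC-measurable."""
    return [
        c for c in candidates
        if abs(c.mass_error_ppm) <= max_ppm and c.rplc_measurable
    ]


# Made-up placeholder entries, not real database records.
suspects = [
    Candidate("candidate_A", 1.2, True),
    Candidate("candidate_B", -0.8, False),  # mass fits, but outside RPLC space
    Candidate("candidate_C", 2.5, True),
    Candidate("candidate_D", 9.0, True),    # mass error too large
]

kept = filter_candidates(suspects)
print([c.name for c in kept])
```

In this toy case, `candidate_B` would otherwise survive the mass filter and become a potential false identification; the measurability flag removes it before any further scoring.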

Can you expand on your findings on what this data-driven approach discovered?

We have shown for the first time that molecular fingerprints alone, when optimized, have enough structural information to be used in quantitative structure-activity relationship (QSAR) models.
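The interview does not describe the model itself, so the following is only a rough Python sketch of how a fingerprint, treated as a set of "on" bits, could feed a structure-based classifier. It uses a nearest-neighbour vote on Tanimoto similarity as a stand-in technique; this is not the published model, and the fingerprints, labels, and the function names `tanimoto` and `predict_measurable` are all hypothetical.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def predict_measurable(query, training, k=3):
    """Majority vote among the k training fingerprints most similar to query."""
    ranked = sorted(
        training, key=lambda item: tanimoto(query, item[0]), reverse=True
    )
    votes = [label for _, label in ranked[:k]]
    return sum(votes) * 2 > len(votes)


# Toy training set: (on-bit indices, measurable-by-RPLC label). Invented data.
training = [
    (frozenset({1, 2, 3, 8}), True),
    (frozenset({1, 2, 4, 8}), True),
    (frozenset({2, 3, 8, 9}), True),
    (frozenset({5, 6, 7}), False),
    (frozenset({5, 6, 10}), False),
    (frozenset({6, 7, 11}), False),
]

print(predict_measurable(frozenset({1, 2, 3, 9}), training))
print(predict_measurable(frozenset({5, 7, 10}), training))
```

In practice the bit sets would come from a cheminformatics toolkit and the classifier would be trained and validated on measured retention data; the sketch only shows that the structural encoding alone is the model input.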

Around 20,000 environmentally relevant chemicals in the Norman SuSDat database are not measurable with RPLC and need a different separation strategy. It should be noted that this does not mean that all chemicals measurable with RPLC can be separated in a single run.

The approach can help streamline the identification of compounds from complex environmental and biological samples by focusing on chemicals that are realistically detectable using RPLC, leading to more accurate and faster analysis.

What is novel about this approach and why is it useful to separation scientists?

This approach reduces false positives in chemical analysis. Separation scientists, particularly those working in non-targeted analysis (NTA), often face the challenge of dealing with false positives—chemicals that are predicted to be present in samples but are undetectable by the chosen method (in this case, RPLC). This approach directly addresses that by narrowing down the list of candidate chemicals to those likely to be retained in RPLC.

For suspect screening, it helps in filtering out compounds that will not elute properly, thereby reducing the list of potential matches. This significantly lowers the number of false positives and reduces the computational resources needed to process large datasets.

Additionally, this approach increases efficiency in NTA and suspect screening. By identifying chemicals that are outside the RPLC subspace, this approach allows scientists to focus on analyzable compounds, thereby streamlining the process of chemical analysis. Instead of spending time on compounds that cannot be detected, separation scientists can target those that fit within the method’s capabilities.

This model also saves computational time during suspect screening by automatically filtering out chemicals that will not fit the RPLC separation, thus accelerating the analysis workflow.

It also guides method development by offering separation scientists a predictive understanding of whether a given chemical can be separated using RPLC. This is particularly useful in method development, where knowing the chemical space covered by RPLC can inform decisions on solvent gradients, column selection, and mobile phase compositions. By applying this model early in the method development process, analysts can determine whether RPLC is suitable for a specific sample or whether an alternative chromatography method (such as hydrophilic interaction LC [HILIC]) is necessary, thus avoiding trial-and-error experimentation.

Are you planning to extend this approach further?

We are planning to expand this to other selectivities as well as to detection via mass spectrometry. The ultimate goal here is to comprehensively map the fraction of chemical space that is measurable with our current analytical technologies.

About the Interviewee

Saer Samanipour is an Assistant Professor at the University of Amsterdam, where he leads the Environmental Modeling & Computational Mass Spectrometry (EMCMS) group. He is also an Honorary Associate Professor at the Queensland Alliance for Environmental Health Sciences (QAEHS) at the University of Queensland, Australia. Samanipour earned his Ph.D. from the École polytechnique fédérale de Lausanne (EPFL) in Switzerland in 2015, where his research focused on the fate and behavior of hydrophobic organic pollutants. Following his Ph.D., he worked as a research scientist at the Norwegian Institute for Water Research (NIVA), where he developed analytical tools for resolving complex mixtures and worked on a variety of projects funded by public and private sectors. He specializes in environmental modeling, computational mass spectrometry, and the development of data analysis tools for non-targeted analysis (NTA) using liquid chromatography and high-resolution mass spectrometry (LC-HRMS).

In addition to his work on machine learning models and NTA, Samanipour has made significant contributions to understanding the chemical subspaces of the exposome, which is essential for the protection of environmental and human health. His efforts have enhanced the speed and accuracy of chemical screening processes, particularly for complex environmental samples. His interdisciplinary research continues to push the boundaries of chemical analysis and offers tools that support environmental monitoring, regulatory science, and public health.

Direct correspondence to: s.samanipour@protonmail.com

Reference

(1) van Herwerden, D.; Nikolopoulos, A.; Barron, L. P.; O’Brien, J. W.; Pirok, B. W. J.; Thomas, K. V.; Samanipour, S. Exploring the Chemical Subspace of RPLC: A Data-Driven Approach. Anal. Chim. Acta 2024, 1317, 342869. DOI: 10.1016/j.aca.2024.342869
