Robust linear regression, stepwise linear regression, and linear regression were tested, compared, and combined in this experimental setup, in an effort to address an environmental problem with short- and long-term effects.
“Ensemble learning” is a machine learning approach in which several models are trained to solve a common problem and then combined to yield better performance (1). The concept was put to the test in a recent study in the Journal of Chromatography A by a group of seven authors from the Interdisciplinary Research Center for Membrane and Water Security at King Fahd University of Petroleum and Minerals in Dhahran, Saudi Arabia.
The researchers emphasized the importance of wastewater management procedures that are both health-conscious and sustainable, which they said can be demonstrated through reliable modeling of oily wastewater (1). Discharge of this byproduct, they said, not only degrades water resources, presaging the eventual destruction of the ecosystem, but on a more immediate level can also be carcinogenic to people exposed to it.
In 2006, the World Health Organization (WHO) published its guidelines for the safe use of wastewater, excreta, and greywater, a four-volume set that has been downloadable online as of 2013. The volumes cover policy and regulatory aspects; wastewater use in agriculture; wastewater and excreta use in aquaculture; and excreta and greywater use in agriculture (2). (WHO, as the co-custodian of the Sustainable Development Goals [SDG], is the leading authority in global monitoring of wastewater treatment.)
In this experimental setup, the researchers modeled the separation efficiency (SE) and permeate flux (PF) of polypyrrole-coated ceramic-polymeric membranes used to treat oily wastewater, comparing and combining three models: robust linear regression (RLR), stepwise linear regression (SWLR), and linear regression (LR) (1). A new simple-average ensemble paradigm served to reduce errors and improve predictability. RLR, SWLR, and LR are known as “soft-computing” models that also fit into the category of chemometric modeling, a fairly novel approach in membrane technology that predicts performance from a membrane’s intrinsic physicochemical properties (1).
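As a rough illustration of what a simple average ensemble does, the sketch below averages the predictions of several already-fitted models, sample by sample. It is not the authors' implementation: the function name `simple_average_ensemble` and the toy permeate-flux numbers are hypothetical, invented only to show the arithmetic.

```python
from statistics import fmean


def simple_average_ensemble(predictions_by_model):
    """Return the per-sample mean of the predictions from several models.

    predictions_by_model maps a model name to its list of predictions;
    every model must predict the same samples in the same order.
    """
    columns = list(predictions_by_model.values())
    if not columns:
        raise ValueError("at least one model is required")
    n_samples = len(columns[0])
    if any(len(column) != n_samples for column in columns):
        raise ValueError("all models must predict the same number of samples")
    # zip(*columns) groups the predictions that belong to the same sample.
    return [fmean(sample) for sample in zip(*columns)]


# Hypothetical predictions, NOT values from the study.
toy_predictions = {
    "LR": [10.0, 12.0, 14.0],
    "RLR": [11.0, 12.0, 13.0],
    "SWLR": [9.0, 12.0, 15.0],
}
ensemble = simple_average_ensemble(toy_predictions)
print(ensemble)
```

Equal weighting is the simplest choice: where the individual models err in different directions, their errors partly cancel in the average, which is one way such an ensemble can reduce overall error.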
The authors shared the results of the predictive regression models in detail. SE values were more consistent than the more variable PF values, with an average of 99.92% and a minimal standard deviation of 0.026% (1). With a root mean square error (RMSE) of 0.21951, the LR model was more accurate than the other models tested; however, SWLR had an advantage in speed, processing a “remarkable” 110 observations per second. The RLR model, according to the authors’ findings, was reliable in its own way, balancing accuracy and efficiency with an error reduction of 35.29% relative to SWLR.
In closing, the researchers advocated for applying interpretable machine learning to oily wastewater modeling, but offered a few words of caution. First, they said real-world deployment of their strategies could present challenges not present in a controlled environment. Second, they acknowledged that both membrane types and research conditions might vary, adding that recognizing these limiting factors would go a long way toward effective interpretation of the findings they presented.
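For readers unfamiliar with the metrics quoted here, the following minimal sketch shows how RMSE is computed and one common way to express an error reduction as a percentage. The article does not say how the authors derived the 35.29% figure, so the relative-reduction formula is an assumption; the helper names and the toy data are hypothetical.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def percent_error_reduction(baseline_error, new_error):
    """Relative drop in error versus a baseline, in percent (assumed formula)."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - new_error) / baseline_error


# Toy data, NOT values from the study.
observed = [1.0, 2.0, 3.0, 4.0]
model_a = [1.0, 2.0, 3.0, 6.0]  # one prediction off by 2
model_b = [1.0, 2.0, 3.0, 5.0]  # one prediction off by 1

rmse_a = rmse(observed, model_a)
rmse_b = rmse(observed, model_b)
reduction = percent_error_reduction(rmse_a, rmse_b)
print(rmse_a, rmse_b, reduction)
```

Because RMSE squares each residual before averaging, a single large miss weighs more heavily than several small ones, which is why it is a common headline metric for regression accuracy.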
(1) Baig, U.; Usman, J.; Abba, S. I.; et al. Insight into Soft Chemometric Computational Learning for Modelling Oily-Wastewater Separation Efficiency and Permeate Flux of Polypyrrole-Decorated Ceramic-Polymeric Membranes. J. Chromatogr. A 2024, 1725, 464897. DOI: 10.1016/j.chroma.2024.464897
(2) Water Sanitation and Health: Wastewater. World Health Organization, 2024. https://www.who.int/teams/environment-climate-change-and-health/water-sanitation-and-health/sanitation-safety/wastewater (accessed 2024-04-30).