Cheminformatics approaches for analysing and modelling the gas-ionic liquid distribution of organic solutes
Date
Authors
Journal title
Journal ISSN
Volume title
Publisher
Tartu Ülikooli Kirjastus
Abstract
The application of artificial intelligence and machine learning (ML) within the framework of quantitative structure-property relationships supports studies aimed at understanding the physicochemical properties of materials and substances. One such class of substances is ionic liquids (ILs): understanding and evaluating how organic solutes partition in them provides a basis for studying and developing a range of applied chemical environments. As potential environmentally friendly alternatives to organic solvents, ILs are an important research subject because of their numerous applications. However, few systematic studies have addressed the structure-property relationships of ILs with respect to partitioning properties.
The aim was to investigate the relationships between the gas-IL partition coefficient (log K) and the structure of the organic solutes and/or the ionic components of the IL, using cheminformatics approaches. This involved using theoretical molecular descriptors and advanced ML methods to model the structure-based interaction mechanisms of the solute and the IL in a multicomponent system.
Modelling and analysis of data sets corresponding to the structures of organic solutes, cations and anions showed that the ML methods random forest, support vector regression and Gaussian process regression represent the solute-IL relationships encoded in molecular descriptors more effectively than conventional multiple linear regression. The latter, however, is easier to interpret. Both linear and nonlinear models highlight the critical influence of cation and anion composition on solute distribution. The results also show that modelling the entire solute-IL system by combining solute, cation and anion descriptors improves predictive power for large and chemically diverse data sets, underscoring the importance of multicomponent approaches. The molecular descriptors included in the derived models explain possible interactions in terms of dispersion forces, Coulombic-dipolar interactions and hydrogen bonding. Beyond these mechanistic insights, the derived relationships allow more selective and efficient IL environments to be designed for targeted industrial, environmental and scientific applications.
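The multicomponent idea described in the abstract, combining solute, cation and anion descriptors into one feature vector per solute-IL system, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the descriptor values, the names `SOLUTES`, `CATIONS`, `ANIONS` and `synthetic_log_k`, and the linear response are invented and are not taken from the thesis. Only a plain multiple linear regression is fitted here; the random forest, support vector regression and Gaussian process regression models named in the abstract are not reproduced.

```python
from itertools import product

# Hypothetical descriptor values, invented purely for illustration.
SOLUTES = {"s1": [0.5, 1.0], "s2": [1.5, 0.2], "s3": [2.0, 0.7], "s4": [0.9, 1.8]}
CATIONS = {"c1": [0.3], "c2": [1.1], "c3": [1.9]}
ANIONS = {"a1": [0.4], "a2": [1.6]}


def combined_row(solute, cation, anion):
    """Concatenate the three descriptor blocks into one feature vector."""
    return SOLUTES[solute] + CATIONS[cation] + ANIONS[anion]


def synthetic_log_k(row):
    """Made-up linear response so the example has a target to fit."""
    weights = [0.8, -0.3, 0.5, -0.6]
    return 1.0 + sum(w * x for w, x in zip(weights, row))


def fit_mlr(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    X = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept term
    n = len(X[0])
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)] for i in range(n)]
    b = [sum(x[i] * y for x, y in zip(X, targets)) for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * p for a, p in zip(A[r], A[col])]
                b[r] -= f * b[col]
    return [b[i] / A[i][i] for i in range(n)]


def rmse(coefs, rows, targets):
    """Root-mean-square error of the fitted linear model on the given rows."""
    preds = [coefs[0] + sum(c * x for c, x in zip(coefs[1:], r)) for r in rows]
    return (sum((p - y) ** 2 for p, y in zip(preds, targets)) / len(targets)) ** 0.5


# Every solute paired with every cation and anion: one row per solute-IL system.
systems = list(product(SOLUTES, CATIONS, ANIONS))
full_rows = [combined_row(s, c, a) for s, c, a in systems]
log_k = [synthetic_log_k(r) for r in full_rows]
solute_rows = [SOLUTES[s] for s, _, _ in systems]

rmse_full = rmse(fit_mlr(full_rows, log_k), full_rows, log_k)
rmse_solute_only = rmse(fit_mlr(solute_rows, log_k), solute_rows, log_k)
print(f"combined descriptors RMSE: {rmse_full:.4f}")
print(f"solute-only RMSE:          {rmse_solute_only:.4f}")
```

Because the invented target depends on all three blocks, the solute-only fit cannot account for the variation that comes from the cation and anion, while the combined fit can. This mirrors only the data layout of a multicomponent model and says nothing about the actual descriptors or model performance reported in the thesis.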
Description
The electronic version of the dissertation does not include the publications
Keywords
doctoral theses