Equilibrium partition coefficients or partition ratios are a fundamental concept in physical chemistry, with wide applications in environmental chemistry. While comprehensive data compilations for the octanol–water partition ratio and the Henry’s law constant have existed for many years, no comparable effort for the octanol–air partition ratio (KOA) exists. Considering the increasing use of KOA in understanding a chemical’s partitioning between a wide variety of organic phases (organic phases in atmospheric particles, plant foliage, polymeric sorbents, soil organic matter, animal tissues, etc.) and the gas phase, we have compiled all KOA values reported in the published literature. The dataset includes more than 2500 experimentally derived values and more than 10 000 estimated values for KOA, in total covering over 1500 distinct molecules. The range of measured log10 KOA values extends from −2 to 13. Many more measured values have been reported in the log10 KOA range from 2 to 5 and from 6 to 11 compared to the range from 5 to 6, which is due to the complementary applicability range of static and dynamic measurement techniques. The compilation also identifies measured data that are judged not reliable. KOA values for substances capable of undergoing strong hydrogen bonding derived from regressions with retention times on nonpolar gas chromatographic columns deviate strongly from values estimated by prediction techniques that account for such intermolecular interactions and should be considered suspect. It is hoped that the database will serve as a source for locating existing KOA data and for the calibration and evaluation of new KOA prediction techniques.
1. Introduction
Understanding the affinity of a chemical for liquid octanol, liquid water, and the gas phase is often the first step to understanding its potential environmental and biological fate and behavior. The physical–chemical properties to quantify those affinities include equilibrium partition ratios, saturation solubilities, and vapor pressure, which are related to one another through a series of thermodynamic triangles (Fig. 1). Chemical equilibrium partition ratios, hereafter simply referred to as partition ratios, are a concept fundamental to physical chemistry, with many applications in the fields of environmental, medicinal, and pharmaceutical sciences. While in the literature the thermodynamic property is more commonly referred to as a partition coefficient, we follow IUPAC nomenclature guidelines and describe the distribution of a chemical between two phases at equilibrium as a partition ratio.
Thermodynamic triangles of physical–chemical properties relating solvation in octanol, water, and in the pure liquid with the gas phase.
The unitless octanol–air partition ratio (KOA) describes the distribution of a chemical between octan-1-ol (CAS No. 111-87-5) and the gas phase at equilibrium,
$$K_{OA} = \frac{C_O}{C_A} \tag{1}$$
where CO and CA are the concentrations of a compound in n-octanol and the gas phase in mol m−3, respectively. KOA has many possible applications, most notably in linear free–energy relationships for predicting the equilibrium distribution of compounds between the gas phase and atmospheric particles (Finizio et al., 1997), blood (Batterman et al., 2002), soil (Hippelein and McLachlan, 2000; 1998), foliage (Müller et al., 1994; Paterson et al., 1990), and some of the polymers used in passive air samplers (Ockenden et al., 1998; Shoeib and Harner, 2002a).
Many comprehensive reviews (Mackay et al., 2015) and databases of octanol–water partition ratios (KOW) (Leo et al., 1971), Henry’s law constants (kH) (Mackay and Shiu, 1981; Sander 2015), and other physical–chemical properties [e.g., Mackay et al. (2006); Rumble et al. (2019); US EPA (2012)] exist in the literature. While Jin et al. (2017) compiled KOA data for the development of an estimation model, there has been no comprehensive collection or review of KOA data to date. Our aim is to assemble comprehensively and critically all previously published experimental and estimated KOA data. This work further includes an overview of the different techniques that have been used to obtain KOA values. The assembled database should be an easy-to-look-up repository of existing KOA data but also be suitable for evaluating existing KOA prediction techniques and the development of new ones.
1.1. Reporting KOA values
In this section, we briefly review the various ways in which KOA has been reported in the literature. In most cases, the values included in the database were reported as KOA or log10 KOA values; however, 1409 values were derived from reported Ostwald coefficients in octanol (Loct), Henry’s law constants in octanol (kHoct, Pa m3 mol−1), the Gibbs energies of dissolution into octanol from the gas phase (∆G°, J mol−1), or activity coefficients of a chemical in octanol at infinite dilution (γ∞oct). While various papers report values using different units for pressure, temperature, and volume, we have reported all equations and variables using SI units (e.g., Pa for pressure, K for temperature, and m3 for volume) unless otherwise stated.
1.1.1. Ostwald coefficient in octanol (Loct)
The earliest measurement of the solvation of a compound in octan-1-ol from the gas phase that we found was published in 1960 and was reported as an Ostwald coefficient in octanol (Loct) by Boyer and Bircher (1960). The Ostwald coefficient has been used for over a century to describe the solubility of gases in liquids (Ostwald, 1891). Since Ostwald initially coined the term, the following definitions for the Ostwald coefficient at equilibrium have been used (Battino, 1984):
- LV0 is the volume of gas (VG) dissolved in a volume of pure liquid (VL0),
$$L_{V}^{0} = \frac{V_G}{V_{L}^{0}} \tag{2}$$
- LV is the volume of gas (VG) dissolved in a volume of solution (VL),
$$L_{V} = \frac{V_G}{V_L} \tag{3}$$
- LC represents the concentration of a gas in the liquid phase (CL) divided by its concentration in the vapor phase (CV),
$$L_{C} = \frac{C_L}{C_V} \tag{4}$$
and
- LC∞ is LC at the infinite-dilution concentration of the gas in the liquid,
$$L_{C}^{\infty} = \lim_{C_L \to 0} \frac{C_L}{C_V} \tag{5}$$
Battino (1984) reviews more comprehensively the differences between these definitions and judges concentration-based definitions for equilibrium ratios to be the most thermodynamically reliable and useful method for reporting Ostwald coefficients (Battino, 1984; Wilhelm and Battino, 1985). The use of Loct to describe octanol–air partitioning is not altogether common and is to our knowledge limited to Boyer and Bircher (1960), Wilcock et al. (1978), Pollack et al. (1984), and Bo et al. (1993). Unless the reference states otherwise, we assume all published Loct values to be concentration ratios, equivalent to the KOA value (Abraham et al., 2001).
1.1.2. Gibbs energy (ΔG)
The Gibbs energy describing the energy required to transfer a solute between two phases can also be expressed in two ways (Schwarzenbach et al., 2005). If the Gibbs energy for octanol–air transfer is reported on a concentration basis (ΔG°), we can directly solve for KOA using
$$K_{OA} = \exp\left(-\frac{\Delta G^{\circ}}{RT}\right) \tag{6}$$
where R is the ideal gas constant (8.314 J K−1 mol−1) and T is the absolute temperature (in K). If the Gibbs energy is reported using partial pressure and mole fraction (ΔG*), a conversion to ΔG° is first required (Berti et al., 1986; Cabani et al., 1991),
$$\Delta G^{\circ} = \Delta G^{*} - RT \ln\left(\frac{RT}{v_{oct} \times 101\,325\ \mathrm{Pa}}\right) \tag{7}$$
where voct is the molar volume of octanol (0.000 158 m3 mol−1 at 25 °C) (Rumble et al., 2019; Yaws, 2012).
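As a minimal illustration of the two Gibbs energy conventions described above, the sketch below converts a Gibbs energy of transfer from the gas phase into octanol to a KOA value. The function names are hypothetical. The ΔG* correction term assumes standard states of 1 atm (101 325 Pa) partial pressure and unit mole fraction, together with a dilute solution; a source using other standard states would need a different term.

```python
import math

R = 8.314          # J K^-1 mol^-1, as quoted in the text
V_OCT = 0.000158   # m^3 mol^-1, molar volume of octanol at 25 °C (from the text)
P_STD = 101325.0   # Pa; assumed standard pressure (1 atm) of the ΔG* convention


def koa_from_dg_conc(dg_conc, temp_k):
    """KOA from a concentration-based Gibbs energy of gas-to-octanol transfer (J/mol)."""
    return math.exp(-dg_conc / (R * temp_k))


def dg_conc_from_dg_star(dg_star, temp_k, v_oct=V_OCT):
    """Convert a pressure/mole-fraction-based ΔG* (J/mol) to the concentration basis.

    Assumes a dilute solution, so that C_O = x / v_oct and C_A = p / (R T).
    """
    return dg_star - R * temp_k * math.log(R * temp_k / (P_STD * v_oct))


# Illustrative input only (not a value from the database)
dg_star = -10000.0
dg_conc = dg_conc_from_dg_star(dg_star, 298.15)
log_koa = math.log10(koa_from_dg_conc(dg_conc, 298.15))
```

Under these assumptions the change of standard states shifts the Gibbs energy by roughly 12.5 kJ mol−1 at 25 °C, which corresponds to more than two log units in KOA, so identifying the convention used in a source matters.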
1.1.3. Henry’s law constant in octanol (kHoct)
Air–water equilibrium is often expressed with the Henry’s law constant (kH) (Fig. 1), typically with units of Pa m3 mol−1. Likewise, partitioning between octanol and air can be described as the kH in octanol (kHoct) with units of Pa m3 mol−1. Leng et al. (2015) and Roberts (2005) described octanol–air partitioning using the Henry’s law solubility constant, the reciprocal of kHoct (k′Hoct, mol m−3 Pa−1). KOA is obtained using
$$K_{OA} = \frac{RT}{k_{H}^{oct}} = k_{H}^{\prime\,oct}\,RT \tag{8}$$
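The relation between KOA and the two forms of the Henry’s law constant in octanol can be sketched as follows. The function names and the numerical input are hypothetical and serve only to show that both forms lead to the same KOA.

```python
R = 8.314  # J K^-1 mol^-1, as quoted in the text


def koa_from_kh_oct(kh_oct, temp_k):
    """KOA from a Henry's law constant in octanol (Pa m^3 mol^-1)."""
    return R * temp_k / kh_oct


def koa_from_kh_oct_solubility(kh_sol, temp_k):
    """KOA from a Henry's law solubility constant in octanol (mol m^-3 Pa^-1)."""
    return kh_sol * R * temp_k


kh_oct = 2.5  # Pa m^3 mol^-1, illustrative value only
koa_a = koa_from_kh_oct(kh_oct, 298.15)
koa_b = koa_from_kh_oct_solubility(1.0 / kh_oct, 298.15)
```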
1.1.4. Activity coefficients (γ∞oct) and liquid vapor pressures (PL)
KOA can be related to a chemical’s solubility SO (in units of mol m−3 octanol) or activity coefficient at infinite dilution (hereafter referred to as the activity coefficient) in octanol,
$$K_{OA} = \frac{S_O\,RT}{P_L} = \frac{RT}{\gamma_{oct}^{\infty}\,v_{oct}\,P_L} \tag{9}$$
where PL is the liquid-phase vapor pressure (in Pa). A KOA value can therefore be calculated from a reported activity coefficient using the thermodynamic triangle of Eq. (9) if the vapor pressure of the liquid solute PL is available. For the purposes of this database, we include KOA values calculated using Eq. (9) if γ∞oct and PL were measured for the same system (Hussam and Carr, 1985) or if the PL was used to derive γ∞oct (Bhatia and Sandler, 1995; Dallas and Carr, 1992; Fukuchi et al., 2001; 1999; Tse and Sandler 1994). Chemicals for which the reported solubilities or activity coefficients and the vapor pressures derive from different studies are not currently included in the database of measured KOA values.
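A minimal sketch of this calculation is given below, assuming the relation KOA = RT/(γ∞oct · voct · PL) that follows from the thermodynamic triangle for a dilute solution. The function name and the input values are hypothetical. The default molar volume of octanol is the 25 °C value quoted earlier, so other temperatures would need a temperature-specific molar volume.

```python
R = 8.314         # J K^-1 mol^-1
V_OCT = 0.000158  # m^3 mol^-1, molar volume of octanol at 25 °C (from the text)


def koa_from_activity_coefficient(gamma_inf, p_liquid, temp_k=298.15, v_oct=V_OCT):
    """KOA from an infinite-dilution activity coefficient in octanol and the
    liquid-phase vapor pressure of the solute (Pa)."""
    return R * temp_k / (gamma_inf * v_oct * p_liquid)


# Illustrative inputs only: activity coefficient of 2 and vapor pressure of 100 Pa
koa = koa_from_activity_coefficient(gamma_inf=2.0, p_liquid=100.0)
```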
1.1.5. KOA and K′OA
Using another thermodynamic triangle, KOA can be related to the ratio of the octanol–water (KOW in units of m3 water m−3 octanol) and air–water partition ratios (KAW in units of m3 water m−3 air) or the Henry’s law constant in water (kH in units of Pa m3 water mol−1),
$$K_{OA} = \frac{K_{OW}}{K_{AW}} = \frac{K_{OW}\,RT}{k_H} \tag{10}$$
Because during a KOW determination, water–saturated octanol is being equilibrated with octanol–saturated water, the thermodynamic triangle of Eq. (10) yields the partitioning ratio between water–saturated octanol (referred to occasionally as “wet” octanol) and the gas phase, which we call K′OA. The presence of water in octanol may increase the octanol solubility of more hydrophilic chemicals and reduce the octanol solubility of more hydrophobic chemicals (Beyer et al., 2002).
In most instances, the KOA reported in the literature refers to the ratio of concentrations of a chemical in pure octanol and the gas phase at equilibrium. However, this is not always the case [e.g., Xu and Kropscott (2014; 2012)]. Therefore, we note within the database whether KOA or K′OA is reported.
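The thermodynamic triangle linking KOW, KAW, and the wet-octanol partition ratio can be sketched as follows. The function names and inputs are hypothetical, and the result is labeled K′OA because, as discussed above, a value derived this way refers to water-saturated octanol.

```python
import math

R = 8.314  # J K^-1 mol^-1


def kaw_from_kh(kh_water, temp_k):
    """Dimensionless air-water partition ratio from kH in water (Pa m^3 mol^-1)."""
    return kh_water / (R * temp_k)


def koa_wet_from_kow_kaw(kow, kaw):
    """K'OA (water-saturated octanol vs. gas phase) from KOW and KAW."""
    return kow / kaw


# Illustrative inputs only: log10 KOW = 5 and kH = 1 Pa m^3 mol^-1 at 25 °C
kaw = kaw_from_kh(1.0, 298.15)
log_koa_wet = math.log10(koa_wet_from_kow_kaw(1.0e5, kaw))
```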
1.1.6. Internally consistent K values
The thermodynamic constraints imposed on the partitioning properties by the four thermodynamic triangles displayed in Fig. 1 have been used to adjust properties that are subject to measurement errors to yield a set of properties, called final adjusted values (FAVs), that is internally consistent and, by inference, subject to reduced error (Beyer et al., 2002). Those efforts also take into account the potential discrepancy between KOA and K′OA. Whereas FAVs for the KOA of hexachlorocyclohexanes (Xiao et al., 2004), other organochlorine pesticides (Shen and Wania, 2005), polycyclic aromatic hydrocarbons (Ma et al., 2010), polybrominated diphenyl ethers (Wania and Dugani, 2003), polychlorinated biphenyls (Li et al., 2003), polychlorinated dibenzo-p-dioxins and -furans (Åberg et al., 2008), and volatile methylsiloxanes (Xu et al., 2014) have been reported in the literature, this database does not include them.
1.2. Temperature dependence of KOA
KOA is often highly temperature dependent. At higher temperatures, the KOA of a chemical will be lower, as it becomes more volatile; at low temperatures, KOA is higher. As an example, Fig. 2 plots log10 KOA of DDT (CAS No. 50-29-3) against reciprocal absolute temperature T, where KOA spans multiple orders of magnitude over a 50 °C temperature range. The slope m of the linear regression between log10 KOA and 1/T is related to the molar internal energy of octanol-to-air phase transfer (∆U°OA, J mol−1),
$$m = -\frac{\Delta U^{\circ}_{OA}}{\ln(10)\,R} \tag{11}$$
If ∆U°OA is assumed to be constant over a small range of temperatures, the van’t Hoff equation can be used to calculate the KOA at different temperatures,
$$\log_{10} K_{OA}(T_2) = \log_{10} K_{OA}(T_1) - \frac{\Delta U^{\circ}_{OA}}{\ln(10)\,R}\left(\frac{1}{T_2} - \frac{1}{T_1}\right) \tag{12}$$
Here, ∆U°OA expresses the temperature dependence of a partition ratio with the gas phase if the abundance of the chemical in the gas phase is expressed using a volumetric concentration (Goss and Eisenreich, 1996; Atkinson and Curthoys, 1978). The molar enthalpy of solution in octanol from the gas phase (∆H°OA, J mol−1) is used when the chemical’s abundance in air is expressed as partial pressure. ∆U°OA is related to ∆H°OA as follows:
$$\Delta U^{\circ}_{OA} = \Delta H^{\circ}_{OA} + RT \tag{13}$$
While KOA is almost invariably reported on a volume basis, we found that in some instances ∆U°OA has been mistakenly referred to as ∆H°OA. We note the difference between the two variables because prediction techniques for ΔH°OA (Mintz et al., 2008; 2007) and direct measurements of ∆H°OA using calorimetric techniques (Fuchs and Stephenson, 1985; Stephenson and Fuchs, 1985a; 1985b; 1985c; 1985d; 1985e) exist in the literature.
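To make the distinction concrete, the sketch below converts an enthalpy of solution to an internal energy of phase transfer and then adjusts a log10 KOA value to another temperature with the integrated van’t Hoff relation. It assumes ΔU°OA = ΔH°OA + RT, a ΔU°OA that is constant over the temperature interval, and a sign convention in which both energies are negative for transfer into octanol. The function names and input values are hypothetical.

```python
import math

R = 8.314  # J K^-1 mol^-1


def du_from_dh(dh_oa, temp_k):
    """Internal energy of phase transfer (J/mol) from the enthalpy of solution,
    assuming ideal-gas behavior (ΔU = ΔH + RT)."""
    return dh_oa + R * temp_k


def log_koa_at_temperature(log_koa_ref, t_ref, t_new, du_oa):
    """Shift log10 KOA from t_ref to t_new (both in K), assuming constant ΔU."""
    return log_koa_ref - du_oa / (math.log(10) * R) * (1.0 / t_new - 1.0 / t_ref)


# Illustrative inputs only: ΔH = -80 kJ/mol and log10 KOA = 9.0 at 25 °C
du = du_from_dh(-80000.0, 298.15)
log_koa_10c = log_koa_at_temperature(9.0, 298.15, 283.15, du)
```

With these inputs, using ΔH°OA in place of ΔU°OA would change the temperature adjustment by only a few hundredths of a log unit over 15 K, but the discrepancy grows with the size of the temperature interval.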
Example of the temperature dependence of log10 KOA for DDT (CAS No. 50-29-3) between −10 and 45 °C (Harner and Mackay, 1995; Shoeib and Harner, 2002b).
The ∆U°OA must be negative because the slope m in Eq. (11) has a positive value (as log10 KOA decreases with increasing temperature). Many papers report positive ΔU°OA values, which we believe to be ΔU°AO values.
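The sign argument can be checked with a short regression sketch. The data below are synthetic, generated from an assumed ΔU°OA of −80 kJ mol−1 rather than taken from the database, and the helper names are hypothetical. The fit of log10 KOA against 1/T returns a positive slope and therefore a negative ΔU°OA.

```python
import math

R = 8.314  # J K^-1 mol^-1


def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def internal_energy_from_fit(temps_k, log_koas):
    """Return (ΔU in J/mol, slope m) from a regression of log10 KOA on 1/T."""
    slope, _ = fit_line([1.0 / t for t in temps_k], log_koas)
    return -slope * math.log(10) * R, slope


# Synthetic data from an assumed ΔU of -80 kJ/mol and an arbitrary intercept
true_du = -80000.0
temps = [263.15, 278.15, 293.15, 308.15, 318.15]
logs = [-5.0 - true_du / (math.log(10) * R * t) for t in temps]
du_fit, slope_m = internal_energy_from_fit(temps, logs)
```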
2. Experimental Techniques
The different experimental techniques used to measure log10 KOA can be grouped into three broad categories: dynamic, static, and indirect. Many of the reported values are direct measurements made using the dynamic generator column technique or indirect measurements using gas chromatography retention time (GC-RT) methods. Dynamic techniques typically involve streaming air through or over a stationary octanol phase. In static measurement techniques, the octanol phase and air phase are in direct contact with each other in a closed vessel; however, neither phase is moving. Indirect techniques require a reference compound with a well-established measured KOA value, and the elution time of the analyte relative to that of the reference compound is used to determine KOA. In this section, we discuss each of these measurement techniques in greater detail. Table 1 summarizes the different techniques used to measure KOA. Most techniques have a specific applicability range for KOA. We also list the temperature range for these different measurements.
Summary of the different techniques used to obtain experimental KOA values, including the KOA and temperature ranges of the values reported in the database
2.1. Static methods
In static techniques, either the gas phase, the octanol phase, or both are directly sampled and analyzed for the solutes once they have reached equilibrium within a closed system. This includes a variety of headspace techniques [e.g., Dallas (1995); Hussam and Carr (1985); Park et al. (1987); Treves et al. (2001); Xu and Kropscott (2014; 2013; 2012); and Lei et al. (2019)], a vacuum distillation method (Hiatt 1998; 1997), and a method based on measuring the kinetics of approaching an equilibrium distribution (Ha and Kwon 2010; Lee and Kwon 2016).
2.1.1. Headspace techniques
In the basic headspace technique, the solute is equilibrated between octanol and headspace in a closed container, whose temperature is controlled, for example, with a water bath. The concentration in the headspace is then quantified using gas chromatography and an external calibration. The concentration in octanol is determined by dissolving a known quantity of solute into a known volume of octanol, and the KOA is then determined using Eq. (1). Headspace techniques can measure multiple solutes at the same time, at different temperatures, and at low solute concentrations.
Rohrschneider (1973) was one of the first to use headspace analysis to measure solvent–air interactions in many different solvents, including octanol. A small volume of solute was added to 2 ml of solvent and allowed to equilibrate for two to fifteen hours in a temperature bath. The headspace of the vial was sampled and calibrated against the response for the solute in a solvent for which KiA is known (where i is a solvent).
The group of Carr et al. (Hussam and Carr, 1985; Park et al., 1987; Dallas, 1995; and Castells et al., 1999) refined the headspace technique for measuring solute partitioning between solvents and the gas phase. This technique has also been used by Cheong (1989), Abraham et al. (2001), and Dallas and Carr (1992). Typically, γ∞oct and PL are reported, allowing for KOA to be derived using Eq. (9), or KOA was reported directly. The data by Castells et al. (1999) are excluded from the database as no PL values were reported.
Instead of a headspace vial, Xu and Kropscott (2013) used a 100 ml Hamilton syringe to equilibrate a solute between octanol and air. For analysis, air and octanol samples are taken through the same sampling port, with the former being collected onto a cold trap. A more complex apparatus involving two syringes connected by a small valve was used by Xu and Kropscott (2012) to simultaneously measure the partitioning equilibria between two solvents and the headspace. Using octanol saturated with water and water saturated with octanol as the two solvents, Xu and Kropscott (2012) measured the K′OA with this system. While this technique can determine multiple phase equilibria of relatively volatile chemicals at the same time, it is extremely challenging to implement because all three phases need to be sampled quickly to avoid disturbing the equilibrium of the system.
The variable phase ratio headspace technique introduced by Ettre et al. (1993), and first applied to the measurement of KOA by Lei et al. (2019), improves on the basic headspace technique by doing away with the need to quantify the solute concentration in the headspace. Variable volumes of the same octanol solution are placed into sealed vials and allowed to equilibrate. The reciprocal signal strength obtained from headspace analysis is regressed against the phase ratio, which is the volume of air to the volume of octanol solution present in each vial (Lei et al., 2019). The KOA is then determined as the intercept divided by the slope of the linear regression (Lei et al., 2019), i.e., no calibration or quantification is required.
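The regression step of the variable phase ratio technique can be sketched as follows. The signals are synthetic, generated from an assumed KOA of 500 and an arbitrary response factor, and the helper names are hypothetical. The point of the example is that the absolute response factor cancels when the intercept is divided by the slope.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def koa_from_phase_ratio_series(phase_ratios, signals):
    """KOA as intercept/slope of reciprocal headspace signal vs. phase ratio
    (volume of air divided by volume of octanol solution in each vial)."""
    slope, intercept = fit_line(phase_ratios, [1.0 / s for s in signals])
    return intercept / slope


# Synthetic vials: signal proportional to 1 / (KOA + phase ratio)
true_koa = 500.0
response = 1.0e6  # arbitrary detector response factor; cancels in the ratio
phase_ratios = [1.0, 4.0, 9.0, 19.0, 39.0]
signals = [response / (true_koa + beta) for beta in phase_ratios]
koa_fit = koa_from_phase_ratio_series(phase_ratios, signals)
```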
Whereas headspace techniques work well for volatile compounds, they are unsuitable for chemicals with log10 KOA greater than about 4 (Lei et al., 2019). One challenge of applying headspace techniques to less volatile solutes is that the concentrations in the headspace are often too small for reliable quantification. Treves et al. (2001) used solid-phase microextraction (SPME) fibers to collect the solute from the headspace and thus increase the amount delivered onto the GC column for analysis. A quantification of the headspace concentration, however, would require knowledge of a solute’s gas–fiber partition ratio (KFG) and a fiber-specific constant (kF). Treves et al. (2001) eliminated the need to empirically determine KFG and kF of a chemical by using a reference compound with a known kH. The response of each sample is plotted against the solution concentration, where the slope is equal to KFG·kF divided by k′Hoct·RT.
Seeking to measure the partitioning of anesthetic gases between air and blood, Strum and Eger (1987) developed a headspace technique that can work with small amounts of solvent, which is particularly advantageous when working with human samples (e.g., blood). This technique was widely used in the field of anesthesiology for a range of solvents. The technique as described by Taheri et al. (1991), and variations thereof, have been employed by Eger and colleagues to measure KOA at 37 °C (Eger et al., 1997; 2001; Fang et al., 1997a; 1997b; 1996; Ionescu et al., 1994; and Taheri et al., 1993). A volume of the gaseous analyte is dissolved into octanol and the concentration of the solute in the headspace is determined using gas chromatography. A small aliquot of the octanol solution is then added to a larger evacuated flask. The pressure in the flask is slowly released, and a syringe is used to pump additional air to the system and mix the gaseous phase. The air in the syringe is then analyzed to determine the concentration of the solute in the gaseous phase. KOA in this method is derived as a function of the volume of the flask, the volume of the aliquot of octanol solution in the flask, and the initial and final concentrations of the solute in the gas phase sampled above the octanol solution. This is a highly complex methodology and is therefore more likely to be prone to error. It is also limited to gaseous solutes, which must be available in a relatively pure form. These gaseous solutes will have low log10 KOA values.
2.1.2. Vacuum distillation and gas chromatography
Hiatt reported KOA values while working to improve upon earlier designs of a vacuum distillation with gas chromatography and mass spectrometry (VD/GC/MS) technique for quantifying volatile organic compounds (VOCs) in complex environmental matrices, such as fish tissue and vegetation (Hiatt, 1998; 1997). A sample and a spike containing the analytes of interest are placed in the sample chamber and allowed to equilibrate for three hours (Hiatt 1995). The sample chamber is then evacuated using a vacuum pump for five minutes, and the evacuated air passes first through a condenser column, to collect water vapor, and then a cryo-loop, submerged in liquid nitrogen, to collect the distillate (Hiatt, 1995). A carrier gas is then used to push the distillate through to a GC/MS for analysis (Hiatt, 1995). KOA is then calculated based on the analyte recovery from the organic phase and a calculated KOA of surrogate analytes (Hiatt, 1997). A major flaw of this measurement technique is the use of calculated K′OA values for the surrogate analytes. The method also assumes that fish tissue and leaves are representative of pure octanol—however, we note that the values reported in these works are not explicitly indicated to be KOA measurements. While the reverse can be used as an estimation technique, this assumption is not ideal for deriving physical–chemical properties of chemicals. Some of the reported KOA values have a large degree of error (Hiatt, 1997).
2.1.3. Gas solubility techniques
Two general techniques were found to measure the Loct of gaseous compounds. The first is used specifically for measuring the solubility of xenon (Xe, CAS No. 7440-63-3). Here, Pollack et al. (1984) used a NaI(Tl) crystal paired with a photomultiplier, which is directed at a fixed amount of gaseous Xe held within a sealed chamber (Pollack and Himm, 1982). The chamber is connected to a flask containing a known amount of solvent, in this case octanol, with some headspace (Pollack and Himm, 1982). The Xe is allowed to reach equilibrium with the solvent and excess gas. KOA can be determined based on the volume of the gaseous phases in the two chambers and the volume of the solvent and by quantifying the amount of Xe present before and after equilibrium is reached (Pollack and Himm, 1982).
The second gas solubility technique often involves the use of specific equipment, such as the Van Slyke–Neill blood gas apparatus (Boyer and Bircher, 1960), modified Morrison–Billett apparatus (Wilcock et al., 1978), or the Ben–Naim/Baer-type apparatus (Bo et al., 1993). These techniques are scarcely described in the original literature; however, Battino and Clever (1966) described the technique using the Morrison–Billett apparatus and the Ben–Naim/Baer-type apparatus in an early review. An excess amount of gas is dissolved into a solvent and then the solution is degassed into an apparatus. The solvent is then saturated with the gas analyte at a constant temperature (Battino and Clever, 1966). Knowledge of the volume of the solvent in which the gas was dissolved and the pressure and volume of gas dissolved yields the gas solubility, and this combined with the partial pressure of the system can provide the Loct.
2.1.4. Droplet kinetics
In the technique by Ha and Kwon (2010), a tiny droplet of octanol is suspended above an octanol solution within a sealed vial. The kinetics of uptake in the droplet of the solutes of interest and of a reference chemical with a well-established log10 KOA is recorded by measuring the concentrations in the droplet after variable periods of time. The KOA can then be derived from the kinetics of uptake if the thickness of the air boundary layer, the molecular diffusivity of the chemical in air, and the surface area and volume of the octanol droplet are known. The reference compound serves to calibrate the thickness of the air diffusive boundary layer. The KOA of the analytes of interest must be sufficiently high so that the mass transfer resistance of the chemical in the octanol is negligible relative to that in air (Ha and Kwon 2010). The length of the experiment depends on the anticipated KOA value, as it will take longer for a change in the chemical concentration in the octanol droplet to be quantifiable for chemicals with high log10 KOA values (Ha and Kwon, 2010). Although their measurements were conducted at 25 °C, Ha and Kwon suggested that this method can be used to obtain KOA at different temperatures, as long as the octanol drop does not evaporate (Ha and Kwon, 2010). This measurement technique is applicable to chemicals with a log10 KOA between 5 and 9 (Ha and Kwon, 2010), i.e., it extends to higher values than are typically accessible with static headspace techniques.
2.1.5. Partial pressure
Measurements of the partial vapor pressure of solutes in octanol can be used to determine the ΔG° of solvation into octanol (Berti et al., 1986; Cabani et al., 1991). The vapor pressure of octanol is first determined using a static apparatus (Berti et al., 1986). The partial pressure of the solute over solution is measured at varying molar ratios and is used to solve for ΔG′OA (Berti et al., 1986). By regressing the molar ratio with ΔG′OA, the authors extrapolated to solve for ΔG*OA where the pressure (in atm) and molar ratio are equal to 1 (Berti et al., 1986). Equation (7) is then used to solve for ΔG°OA (Berti et al., 1986).
2.2. Dynamic methods
The challenge of static techniques for KOA determination is that the amount of less volatile compounds in the gas phase is too small for reliable determination. It is therefore often necessary to greatly increase the volume of air that is being equilibrated with the octanol phase. If the determination is based on the amount of solute being lost from the octanol phase, it can also be beneficial to minimize the volume of octanol in the experimental system. Dynamic techniques for measuring KOA involve passing a stream of air through or past a stationary octanol solution. Therefore, the volume of air can be increased by extending the length of time that the air is flowing past the octanol.
The generator column techniques require the amount of analyte transferred from octanol to the gas stream to be quantified, whereas in gas stripping techniques only the rate of change in the concentration of the analyte in the gas stream or the solvent must be recorded. In the dynamic gas–liquid chromatography technique, KOA is derived from the time it takes for a chemical to travel through a gas chromatographic column with octanol as a stationary phase. The generator column technique is by far the most commonly applied dynamic method because it is one of the few techniques readily applicable to less volatile solutes.
2.2.1. Generator column or fugacity meter
The generator column technique, sometimes also referred to as the fugacity meter technique, involves passing large volumes of air through a stationary octanol phase. First used by Harner and Mackay (1995), this technique has since been used extensively in different configurations (Kömp and McLachlan, 1997; Dreyer et al., 2009; etc.). Either glass wool or glass beads are coated with a small volume of an octanol solution and are placed in a column. Air passing through the column at a controlled rate for a measured length of time equilibrates with the octanol. The air is saturated with octanol prior to passage through the column to prevent the vaporization of octanol. The amount of chemical that partitions from the spiked octanol into the air phase is trapped and quantified to determine a concentration in air, CA. Using the known concentration of the chemical in octanol, CO, yields KOA from Eq. (1).
This method requires the validity of several assumptions to yield reliable results. The concentration of the analytes of interest in the octanol needs to be sufficiently high to remain constant throughout the measurement. The flow rate must be sufficiently slow for the chemicals to reach equilibrium between octanol and air. The length of an experiment must balance the need to collect an amount of chemical from the air stream that is sufficient for reliable quantification but not so much that it would deplete the chemical from the spiked column.
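The arithmetic behind a generator column measurement, and a simple check of the depletion assumption, can be sketched as follows. All names and numbers are hypothetical; the sketch assumes that the air leaving the column is in equilibrium with the octanol and that the trapping efficiency is complete.

```python
import math


def koa_generator_column(c_octanol, amount_trapped, flow_rate, duration):
    """KOA from a generator column run.

    c_octanol: solute concentration in octanol (mol m^-3)
    amount_trapped: solute collected from the air stream (mol)
    flow_rate: air flow (m^3 min^-1); duration: sampling time (min)
    """
    c_air = amount_trapped / (flow_rate * duration)
    return c_octanol / c_air


def depleted_fraction(amount_trapped, c_octanol, v_octanol):
    """Fraction of the solute initially in the octanol coating (v_octanol in m^3)
    that was carried away by the air stream."""
    return amount_trapped / (c_octanol * v_octanol)


# Illustrative run: 100 mL/min for 1000 min, 1 nmol trapped, 0.1 mL of octanol
koa = koa_generator_column(10.0, 1.0e-9, 1.0e-4, 1000.0)
log_koa = math.log10(koa)
loss = depleted_fraction(1.0e-9, 10.0, 1.0e-7)
```

In this illustrative run only about 0.1% of the solute leaves the column, so treating CO as constant would be reasonable; a much longer run or a lower KOA would call that assumption into question.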
2.2.2. Gas stripping and bubbling techniques
This technique is commonly applied for measuring kH and involves passing air past a stationary solvent phase. Two variations of this technique have been applied to measuring KOA.
Adopting the gas stripping method by Leroi et al. (1977), Fukuchi et al. (2001; 1999) moved small air bubbles through a very small volume of octanol containing the solute of interest. Equilibration is assured by a slow flow rate and small bubble size. A temperature bath allows for measurements at different temperatures. By recording the concentration change of the solute in the gas phase over time, Fukuchi et al. derived γ∞oct from the gas flow rate and the solute’s estimated PL. The volume of octanol is assumed to be constant (Leroi et al., 1977). We used Eq. (9), the measured γ∞oct, and the estimated PL to derive the KOA value. This technique has only been used for four ether compounds.
In the technique by Roberts (2005), the solute is not added directly to the octanol, but the gas is first bubbled through a small volume of the liquid solute prior to being bubbled through a volume of octanol. Once the solute has reached equilibrium between the gas and octanol, the gas concentration of the solute at the outlet will be constant. At this point, the solute is removed from the gas flow, and the gas begins to strip the octanol of any solute (Roberts, 2005). Measuring the change in the concentration of the solute at the outlet allows for the determination of a first-order rate loss constant for the chemical from octanol. When combined with the octanol volume and gas flow rate, KOA can be obtained (Roberts, 2005). This technique has been applied to measure the KOA of peroxyacetyl nitrate (CAS No. 2278-22-0) (Roberts, 2005) and triethylamine (CAS No. 121-44-8) (Leng et al., 2015).
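One way to see how a first-order loss rate constant, the octanol volume, and the gas flow rate combine into KOA is the mass balance sketched below. It is a simplified model, not necessarily the exact treatment used in the cited studies: it assumes a well-mixed octanol phase of constant volume, exiting gas in equilibrium with the octanol, and negligible headspace, which gives k = F/(KOA·V). The names and the synthetic data are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def koa_from_stripping(times_s, signals, flow_m3_s, v_oct_m3):
    """KOA from the decay of the outlet signal during gas stripping.

    The slope of ln(signal) vs. time is -k, and KOA = F / (k * V) under the
    equilibrium mass balance assumed here.
    """
    slope, _ = fit_line(times_s, [math.log(s) for s in signals])
    return flow_m3_s / (-slope * v_oct_m3)


# Synthetic decay for an assumed KOA of 2000, 60 mL/min gas flow, 5 mL octanol
true_koa, flow, volume = 2000.0, 1.0e-6, 5.0e-6
k_loss = flow / (true_koa * volume)
times = [0.0, 1800.0, 3600.0, 5400.0, 7200.0]
signals = [100.0 * math.exp(-k_loss * t) for t in times]
koa_fit = koa_from_stripping(times, signals, flow, volume)
```

Because only the slope of the logarithmic signal enters the result, the detector response does not need to be calibrated.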
Among the advantages of the gas stripping techniques are that analysis of only one phase is required and that no quantification is necessary because the change in signal strength over time can be plotted in place of concentration. This also eliminates the need for a calibration curve. Finally, this technique uses multiple measurements to obtain a single KOA value, which increases the reliability of the experimental value. However, solute volatility limits the applicability of gas stripping techniques to a fairly narrow range of KOA. The technique employed by Roberts (2005) is also limited to liquid solutes.
2.2.3. Gas–liquid chromatography retention time
Some dynamic methods rely on the determination of the retention of a solute in a gas chromatographic column containing octanol as a stationary phase. No quantification of the amount of solute in either octanol or gas phase is necessary. The use of octanol as a stationary phase sets these methods apart from other retention time techniques using commercial columns, which rely on correlations and always require reference compounds with a known KOA. They will be discussed in the next section.
Gruber et al. (1997) recorded the net retention volume (VN) on columns with variable volumes of octanol (VL) coated on the inside. When VN/VL is regressed against the reciprocal of VL, the intercept yields KOA (Gruber et al., 1997). This technique has similarities with the static variable phase ratio technique by Lei et al. (2019) described above.
Sandler et al. (Tse and Sandler, 1994; Bhatia and Sandler, 1995) used a slightly different gas chromatographic method, relying on reference compounds with a known infinite-dilution activity coefficient in octanol, γoct (hexane and heptane), to measure the γoct of halogenated alkanes. The ratio of the elution time of the reference compound and the solutes of interest relative to that of methane is used, together with an estimated PL. We utilize the reported γoct and PL to calculate KOA using Eq. (9).
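Converting an activity coefficient in octanol and a liquid-state vapor pressure into KOA can be illustrated with the commonly used relation KOA = RT/(γoct·VO·PL), where VO is the molar volume of octanol. Whether Eq. (9) is written in exactly this form is an assumption here, as are the molar volume (about 1.58 × 10⁻⁴ m3 mol−1 near 25 °C) and the input values, which are hypothetical.

```python
import math

R = 8.314462618                  # gas constant in J mol-1 K-1
V_OCTANOL_M3_PER_MOL = 1.58e-4   # assumed molar volume of 1-octanol near 25 degC

def koa_from_activity_coefficient(gamma_oct, p_l_pa, temp_k=298.15,
                                  v_oct=V_OCTANOL_M3_PER_MOL):
    """KOA (dimensionless) from gamma_oct and the liquid vapor pressure PL.

    Assumed relation: KOA = R * T / (gamma_oct * v_oct * PL).
    This refers to dry octanol, since no water enters the calculation.
    """
    return R * temp_k / (gamma_oct * v_oct * p_l_pa)

# Hypothetical inputs for a volatile, nonpolar solute.
koa_gamma = koa_from_activity_coefficient(gamma_oct=3.0, p_l_pa=2.0e4)
log_koa_gamma = math.log10(koa_gamma)
print(log_koa_gamma)
```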
2.3. Indirect gas-chromatographic retention time methods
Indirect techniques seek to derive KOA from the retention time of solutes on commercial gas chromatographic columns, i.e., the stationary phases of those columns serve as surrogates for the octanol phase. Because these surrogates are imperfect, indirect methods always require a calibration and often relate the retention times of the analytes of interest to those of reference compounds with previously measured KOA values. There are a few variations of the gas chromatography retention time (GC-RT) technique; however, they all have in common that at least one chemical with a well-established KOA value at different temperatures is required.
The first instance of measuring KOA using GC-RT was by Zhang et al. (1999), who regressed capacity factors of chemicals on multiple columns with their KOA to obtain a multiple linear regression (MLR) equation. The KOA of multiple calibration chemicals needs to be known as a function of temperature, as separate MLR equations are required for different temperatures. While the use of multiple columns with different stationary phases is meant to better account for different types of interactions of a chemical with octanol (Zhang et al., 1999), Su et al. (2002) showed that a linear regression with a single column’s capacity factor worked equally well and yielded KOA values with a smaller error.
The retention time index (RTI) method is essentially a technique for extrapolating known KOA values within a group of structurally related compounds by linearly regressing directly determined log10 KOA values against the compounds’ RTI (Harner et al., 2000). The RTI relates the retention time of the solute to that of linear alkanes. The regression equation is then used to estimate KOA for other related compounds using their RTI. A separate regression for different experimental temperatures is required. By further regressing the slope and intercept of these linear regressions against temperature, the log10 KOA at different temperatures can be determined solely from the RTI of a chemical. This method relies heavily on having direct measurements of KOA at different temperatures for different congeners and RTI values for each congener. When applying this method to polychlorinated dibenzo-dioxins and -furans (PCDD/Fs), Harner et al. (2000) also accounted for the position and number of chlorine substitutions because measurements with the generator column technique had revealed that tetra-, penta-, and hexa-PCDD/Fs with 3–4 chlorines in the 2, 3, 7, and/or 8 positions had a higher affinity to the octanol phase (Harner et al., 2000). This illustrates the need for good calibration and reference data when using indirect KOA measurement techniques.
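The two-stage regression behind the RTI method can be sketched as follows. The sketch regresses the stage-one slopes and intercepts linearly against temperature, as described above; whether the original work used temperature itself or its reciprocal is not stated here, so that choice is an assumption. All names and calibration values are synthetic.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def calibrate_rti(rti, log_koa_by_temp):
    """Stage 1: fit log10 KOA against RTI separately at each temperature.
    Stage 2: fit the stage-1 slopes and intercepts against temperature."""
    temps = sorted(log_koa_by_temp)
    fits = [fit_line(rti, log_koa_by_temp[t]) for t in temps]
    slope_fit = fit_line(temps, [f[0] for f in fits])
    intercept_fit = fit_line(temps, [f[1] for f in fits])
    return slope_fit, intercept_fit

def predict_log_koa(rti_value, temp, slope_fit, intercept_fit):
    """Predict log10 KOA from an RTI and a temperature alone."""
    a = slope_fit[0] * temp + slope_fit[1]
    b = intercept_fit[0] * temp + intercept_fit[1]
    return a * rti_value + b

# Synthetic calibration congeners (hypothetical RTI and log10 KOA values).
rti = [1800.0, 2000.0, 2200.0, 2400.0]
log_koa_by_temp = {
    t: [(0.004 - 0.00002 * t) * r + (1.0 - 0.01 * t) for r in rti]
    for t in (10.0, 20.0, 30.0)   # temperatures in degC
}

slope_fit, intercept_fit = calibrate_rti(rti, log_koa_by_temp)
log_koa_new = predict_log_koa(2100.0, 25.0, slope_fit, intercept_fit)
print(log_koa_new)
```

The sketch makes the data requirement explicit: every temperature in the calibration needs directly measured KOA values for each calibration congener.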
Adapting a technique for the determination of PL, Wania et al. (2002) used the retention time of a chemical relative to a single reference chemical in order to obtain KOA. In principle, the relative retention times of the analyte (tRi) and the reference compound (tRref) are proportional to the partition ratio between the stationary phase of the column and air, which is also proportional to KOA (Wania et al., 2002). Thus, this method only requires a single reference compound to have well-established KOA values at different temperatures. These log10 KOA values are plotted against ln (tRi/tRref) to produce a linear regression with a slope equal to ∆UOAi/∆UOAref − 1. The internal energy of octanol–gas phase transfer ∆UOAi then allows for the determination of KOA at different temperatures (Wania et al., 2002). The obtained KOA values are then regressed against literature values of KOA obtained using direct measurement techniques. Therefore, even though only one chemical is needed as a reference compound, calibration compounds with established KOA values are needed to improve the reliability of the results. Wania et al. (2002) also showed the importance of selecting an appropriate reference compound because interactions of different compounds with the stationary phase and octanol may be dissimilar. This technique is the most commonly applied GC-RT technique for KOA determination.
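A minimal sketch of the single-reference approach is given below. It assumes the working relation ln(tRi/tRref) = S·ln KOAref + C with S = ∆UOAi/∆UOAref − 1, and further assumes that tRi/tRref equals KOAi/KOAref, so that ln KOAi = (S + 1)·ln KOAref + C. Natural logarithms are used throughout for consistency; the exact formulation in Wania et al. (2002) should be checked before use. The final calibration against directly measured values is not shown, and all numbers are synthetic.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def calibrate_relative_retention(ln_koa_ref, ln_rel_rt):
    """Fit ln(tRi/tRref) = S * ln(KOA_ref) + C over several isothermal runs."""
    return fit_line(ln_koa_ref, ln_rel_rt)

def ln_koa_analyte(ln_koa_ref_at_t, s, c):
    """Assumed relation: ln KOA_i = (S + 1) * ln KOA_ref + C."""
    return (s + 1.0) * ln_koa_ref_at_t + c

# Synthetic isothermal runs (hypothetical values, one per oven temperature).
ln_koa_ref_runs = [20.0, 18.5, 17.0, 15.5]  # ln KOA of the reference compound
ln_rel_rt_runs = [1.50, 1.35, 1.20, 1.05]   # ln(tRi / tRref) in the same runs

S, C = calibrate_relative_retention(ln_koa_ref_runs, ln_rel_rt_runs)
ln_koa_i = ln_koa_analyte(21.0, S, C)  # reference ln KOA at the target temperature
print(ln_koa_i / math.log(10))         # log10 KOA of the analyte
```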
3. Estimation Techniques
Numerous techniques for estimating KOA exist. We describe here a few of the major techniques, provided that they were specifically designed for estimating KOA and that KOA values estimated with them have been reported in the literature. KOA values calculated in the context of studies on passive air sampling, atmospheric particle–gas partitioning, or environmental fate modeling are not considered. Only articles focusing on physical–chemical property estimation techniques or work comparing experimental and/or estimated KOA values are included within the database and in this review. Table 2 summarizes the different techniques for estimating KOA. These techniques tend to have a wider applicability range than the experimental ones. We also list the temperature range for these methods. Most of the estimation models for KOA are Quantitative Structure–Property Relationships (QSPRs). Density functional theory-based solvation models have also been used to determine KOA by first obtaining ΔG°OA of a chemical in octanol [see Eq. (6)].
Table 2. Summary of the different techniques used to obtain estimated KOA values, including the KOA and temperature ranges of the values reported in the database.
3.1. QSPR techniques
QSPR techniques typically involve the regression of descriptors against the property of interest to obtain an equation of best fit that will most accurately predict KOA. These models range from very simple ones, which use basic thermodynamic relationships and linear regressions, to machine learning algorithms that estimate KOA based on a series of chemical descriptors.
3.1.1. Thermodynamic triangles
KOA can be derived from other properties using thermodynamic triangles [see Fig. 1 and Eqs. (9) and (10)]. The two property values used in such an estimation should ideally be experimentally derived. If they are themselves estimated values, the uncertainty of their prediction propagates to KOA.
Most estimations of KOA reported in the literature are derived using Eq. (10), using either experimental or estimated values of KOW and KAW. This can be a useful estimation method for chemicals with well-established KOW and KAW values. However, Finizio et al. (1997) already noted that six KOA values estimated this way were between 0.48 and 1.04 log10 units smaller than experimental values, which may be related to the estimation yielding wet octanol–air partition ratio (K′OA) (see Sec. 1.1.5). Meylan and Howard (2005) conducted the first comprehensive assessment of this technique for estimating KOA using KOW and KAW. They also explored the temperature dependence of KOA by combining a temperature-adjusted KAW value with the KOW of a chemical at 25 °C and using KAW and KOW values estimated with EPI Suite™’s HENRYWIN and KOWWIN (Meylan and Howard, 2005). This estimation technique is what is used in the KOAWIN model included in EPI Suite™ (EPI Suite Data, 2012).
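The triangle calculation and the propagation of input uncertainty can be illustrated with a short sketch. It assumes that Eq. (10) has the usual form KOA = KOW/KAW, i.e., log10 KOA = log10 KOW − log10 KAW, and that the errors of the two inputs are independent so that they add in quadrature. The function name and the input values are hypothetical.

```python
import math

def log_koa_from_triangle(log_kow, log_kaw, sd_log_kow=0.0, sd_log_kaw=0.0):
    """Return (log10 KOA, standard deviation) from log10 KOW and log10 KAW.

    Assumed form of the triangle: log10 KOA = log10 KOW - log10 KAW.
    Independent input errors are combined in quadrature.
    The result refers to water-saturated octanol because KOW does.
    """
    log_koa = log_kow - log_kaw
    sd = math.sqrt(sd_log_kow ** 2 + sd_log_kaw ** 2)
    return log_koa, sd

# Hypothetical inputs: log10 KOW = 5.0 +/- 0.3 and log10 KAW = -2.0 +/- 0.4.
log_koa_tri, sd_tri = log_koa_from_triangle(5.0, -2.0, 0.3, 0.4)
print(log_koa_tri, sd_tri)
```

The combined uncertainty is always at least as large as the larger of the two input uncertainties, which is why estimated inputs degrade the KOA estimate.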
The use of Eq. (9) is less common but advantageous as it does not yield a K′OA value. Abraham et al. (2001) presented the KOA of some chemicals derived from measured PL and SO. Sepassi and Yalkowsky (2007) used PL and SO estimated from other physical–chemical properties of a compound, including boiling-point temperature and enthalpy of boiling. Best et al. (1997) applied a combination of Eqs. (6) and (10) to estimate ΔG°OA using ΔG°AW and log10 KOW.
Many works report KOA values calculated using thermodynamic triangles or estimated using EPI Suite™ [e.g., Alarie et al. (1995), Sühring et al. (2016), Tamaru et al. (2019), and Xu et al. (2014)]. We have elected to not include all these KOA values. When we did include KOA values obtained through thermodynamic triangles in the database, we also report the original source of the two property values in the property table (see Sec. 4.2.5).
3.1.2. Regression models
Numerous regression models for predicting KOA exist, most frequently restricted in applicability to a specific set of closely related compounds, such as the polychlorinated biphenyls (PCBs) and naphthalenes (PCNs) or the polybrominated diphenyl ethers (PBDEs). The models differ based on the compound group, the type of regression, and the source and type of chemical descriptors. The statistical techniques applied include ordinary least-squares or multiple linear regressions (MLRs), partial least-squares models, and principal component regression models. Some models also incorporate temperature into the regression analysis (Chen et al., 2003c; 2003b; 2002b; Jin et al., 2017; and Li et al., 2006). Tables 2 and 3 include a list of the KOA models whose predictions are included in the database and the parameterization as described in this paper.
Table 3. A list of all the regression models whose KOA predictions are included in the database. MLR: stepwise multiple linear regression models; OLS: ordinary least squares; PCR: principal component regression; PLS: partial least squares; SLR: single linear regression; and MC: Monte Carlo. Note that some of the referenced papers utilize existing models with new descriptors to obtain novel KOA values.
Compound class | Regression method | Descriptors | References |
---|---|---|---|
Methyl and alkyl substituted naphthalenes | MLR | Abraham descriptors | Abraham et al. (2005) |
Methyl and alkyl substituted naphthalenes | MLR^a | Abraham descriptors | Abraham et al. (2005) |
PCDD/Fs | PLS | MOPAC descriptors | Chen et al. (2001) |
PCBs | PLS | MOPAC descriptors | Chen et al. (2002a) |
PCDD/Fs | PLS | MOPAC descriptors | Chen et al. (2002b) |
PCNs, CBz | PLS | MOPAC descriptors | Chen et al. (2003a) |
PCBs | PLS | MOPAC descriptors, theoretical descriptors (CS ChemOffice) | Chen et al. (2003b) |
PBDEs | PLS | MOPAC descriptors, theoretical descriptors (CS ChemOffice) | Chen et al. (2003c) |
PCBs | PLS | CoMFA | Chen et al. (2016) |
PCBs | PLS | CoMSIA | Chen et al. (2016) |
Phthalate esters | SLR | LeBas molar volume | Cousins and Mackay (2000) |
Simple diverse compounds | MC and MLR | Total solvent-accessible surface area, solute–solvent Coulomb energy, hydrophobic SASA, number of solute as donor hydrogen bonds | Duffy and Jorgensen (2000) |
PAHs | PLS | Electronic descriptors (MOPAC), topological descriptors [see Ferreira (2001) for equations], geometric descriptors [Sanders and Wise Database, see Ferreira (2001) for equations] | Ferreira (2001) |
PBDEs | MLR | Molecular distance-edge vector indexes | Jiao et al. (2014) |
POPs, other hydrocarbons | MLR | Abraham descriptors | Jin et al. (2017) |
PCDDs | SLR | Molecular descriptors^b | Kim et al. (2016) |
POPs | MLR | Fragment constant approach | Li et al. (2006) |
PBDEs | PLS | CoMFA | Liu et al. (2013) |
PBDEs | PLS | CoMSIA | Liu et al. (2013) |
Diverse compounds | MLR | Additive approach using geometric fragments^c | Mathieu (2020) |
Nonpolar organic compounds | MLR | Abraham descriptors | Nabi et al. (2014) |
Nonpolar organic compounds | MLR | CODESSA PRO QSAR software and hydrogen bonding descriptor | Nabi et al. (2014) |
PBDEs, other hydrocarbons | OLS | DRAGON descriptors | Papa et al. (2009) |
PCNs | PCR | Quantum-chemical descriptors (GAUSSIAN 03), topological descriptors (DRAGON) | Puzyn and Falandysz (2005) |
PCDD/Fs | SLR | Quantum-chemical descriptors^b | Vikas and Chayawan (2015) |
PBDEs | MLR | Quantum-chemical based structural parameters (Gaussian98) | Wang et al. (2008) |
PBDEs | MLR | Electrostatic potential indices (MOPAC and Gaussian98), physicochemical properties (TSAR) | Xu et al. (2007) |
PCBs | MLR | DRAGON descriptors | Yuan et al. (2016) |
PCBs | PLS | HQSAR descriptors | Yuan et al. (2016) |
PCDDs | MLR | Quantum-chemical based structural parameters (Gaussian98) | Zeng et al. (2013) |
Pesticides | MLR | Abraham descriptors^d | Zhang et al. (2016) |
CBz | MLR | Molecular connectivity indexes | Zhao et al. (2005) |
PAHs | MLR | Molecular connectivity indexes | Zhao et al. (2005) |
PBDEs | MLR | Molecular connectivity indexes | Zhao et al. (2005) |
PCDD/Fs | MLR | Molecular connectivity indexes | Zhao et al. (2005) |
PCNs | MLR | Molecular connectivity indexes | Zhao et al. (2005) |
^a For wet octanol.
^b Multiple models using different descriptors are presented in the papers and included in the database.
^c Coefficients for each fragment (characteristic temperature) are obtained via MLR.
^d Uses the ABSOLV model from ACD/Labs.
3.1.3. UPPER
The Unified Physical Property Estimation Relationship (UPPER) model by Yalkowsky et al. (1994a) uses the thermodynamic triangle between KOA, PL, and SO [Eq. (9)]. Molecular descriptors are obtained from the structure of a chemical using additive-group contribution estimations or the geometry of the structure (Lian and Yalkowsky, 2014). The descriptors are then used to derive basic physical–chemical properties (referred to as component properties, including melting and boiling points), which allow for the calculation of KOA, KAW, and KOW (Yalkowsky et al., 1994a).
3.1.4. UNIFAC
The UNIFAC model estimates activity coefficients, here γoct, with an additive fragment-based approach with group-interaction parameters (Fredenslund et al., 1975). It also considers the volume, surface area, and the number of different groups present in the solute (Fredenslund et al., 1975). Dallas (1995) used UNIFAC to estimate KOA and compare it to direct measurements and the MOSCED model (see Sec. 3.2). This author also compared the performance of the UNIFAC model with an infinite-dilution activity based UNIFAC model, which uses interaction parameters calculated from activity coefficients at infinite dilution, and a modified UNIFAC model that combines the original and the infinite-dilution activity based UNIFAC models. A summary of publicly available group-interaction parameters can be obtained from the UNIFAC Consortium webpage (http://unifac.ddbst.de/unifac_.html). Note that within the database, we include estimates that directly report KOA or include a PL for the calculation of KOA. Papers that only report a UNIFAC-estimated γoct are not included [e.g., Castells et al. (1999), Eikens (1993), and Li et al. (1995)].
3.1.5. Machine learning
While machine learning algorithms resemble regression models in that they use descriptors to predict KOA, they differ in the approach to correlating the different variables. Jiao et al. (2014) created an artificial neural network model that uses molecular distance-edge vector index descriptors to predict KOA. The model is designed to have the smallest root-mean-square error (RMSE) for the validation set (Jiao et al., 2014).
The OPERA model (Mansouri et al., 2018; Mansouri and Williams, 2017) is also a QSAR model developed using machine learning. OPERA uses the k-nearest neighbor approach and PaDEL descriptors for the number of hydrogen bond donors and the hexadecane–air partition ratio to estimate KOA. While estimates from OPERA are not included in the database, these values are easily obtained from the CompTox Dashboard (Williams et al., 2017) or the model can be downloaded from GitHub (https://github.com/kmansouri/OPERA).
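The general idea of a k-nearest neighbor prediction can be illustrated with the toy sketch below. It is not the OPERA implementation: the unweighted averaging, the unscaled Euclidean distance, the two descriptors, and all training values are assumptions chosen only to show the principle that a query chemical inherits the property values of its closest neighbors in descriptor space.

```python
import math

def knn_predict(train_x, train_y, query, k=3):
    """Predict a property as the mean over the k nearest training points.

    Distances are plain Euclidean distances in descriptor space; real models
    would normally scale the descriptors and may weight the neighbors.
    """
    ranked = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)

# Synthetic training set: (number of H-bond donors, log10 hexadecane-air ratio).
train_x = [(0, 2.0), (0, 3.0), (1, 3.0), (1, 4.0), (2, 5.0)]
train_y = [2.1, 3.0, 4.2, 5.1, 7.0]   # made-up log10 KOA values

knn_log_koa = knn_predict(train_x, train_y, query=(1, 3.5), k=2)
print(knn_log_koa)
```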
3.2. Solvation models
Solvation models estimate the ΔG°i of a chemical in a solvent i. The difference of ΔG°i in octanol and the gas phase can be used to estimate ΔG°OA, which, in turn, can be used to estimate KOA (Nedyalkova et al., 2019). Such models have been applied to estimate KOA of a wide range of chemicals, and numerous variations of models for estimating ΔG°OA exist in the literature. The information included in the database is limited to models that have been specifically designed to estimate ΔG°OA and to predictions made during the comparison and assessment of these solvation estimation techniques. A subset of universal solvation models that estimate ΔG° for various air–solvent interactions are also considered. Specifically, this includes estimates from MOSCED (Modified Separation of Cohesive Energy Density) (Thomas and Eckert, 1984) and various universal solvation models [e.g., Best et al. (1997)].
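The conversion from ΔG°OA to KOA can be sketched as follows, assuming that Eq. (6) has the form ΔG°OA = −RT ln KOA, with ΔG°OA defined for transfer from the gas phase to octanol and with standard states chosen such that KOA is the dimensionless concentration ratio. Both the sign convention and the standard state are assumptions that must be checked against the source of each ΔG° value.

```python
import math

R = 8.314462618  # gas constant in J mol-1 K-1

def log10_koa_from_dg(dg_oa_kj_mol, temp_k=298.15):
    """log10 KOA from the free energy of gas-to-octanol transfer in kJ mol-1.

    Assumed relation: dG = -R * T * ln(KOA), so a more negative dG
    (more favorable solvation in octanol) gives a larger KOA.
    """
    return -dg_oa_kj_mol * 1000.0 / (math.log(10) * R * temp_k)

# Hypothetical free energy of transfer of -20 kJ mol-1 at 25 degC.
log_koa_dg = log10_koa_from_dg(-20.0)
print(log_koa_dg)
```

At 25 °C, one log10 unit of KOA corresponds to about 5.7 kJ mol−1, so errors of a few kJ mol−1 in a computed ΔG°OA translate into errors of roughly half a log unit in KOA.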
The MOSCED model estimates ΔG°i in a solvent as the difference between the cohesive energy density of the pure phase and the solution (Thomas and Eckert, 1984). The SM8AD and SMD models are universal solvation models that solve for the electrostatic contribution using either the generalized Born approximation with asymmetric de-screening (SM8AD) (Marenich et al., 2009a) or the nonhomogeneous Poisson equation (SMD) (Marenich et al., 2009b). These models can be parameterized using different density functionals, which can produce slightly different results (Nedyalkova et al., 2019). Multiple variations of these solvation models for multiple solvents exist, and we have included a selection of estimates, such as those by Best et al. (1997), Duffy and Jorgensen (2000), Giesen et al. (1997), Li et al. (1999), and Zhu et al. (1998). While there are very likely far more universal solvation models for ΔG°OA in the literature, we have included only selected estimates in the database because these models often merely improve upon previous iterations of the SM8AD and SMD models and predict the ΔG°OA for sets of chemicals that also have experimental ΔG°OA values.
The COnductor-like Screening Model for Realistic Solvents (COSMO-RS) software suite can also be used to estimate KOA for chemicals [e.g., Parnis et al. (2015)]. COSMO-RS applies quantum chemical density functional theory and statistical thermodynamics to derive ΔG° values (Klamt et al., 2009). Endo and Hammer (2020) introduced a fragment contribution model for extrapolating COSMO-RS predicted KOA for short-chain chlorinated paraffins, which reduces calculation times.
3.3. Other models for estimating KOA
Another tool for estimating KOA is the online physicochemical calculator SPARC (SPARC Performs Automated Reasoning in Chemistry; available at http://archemcalc.com/). The details on how exactly SPARC works are not widely available. However, it is noted that linear free energy relationships are used to estimate thermodynamic properties such as KOA (Hilal et al., 2003). While SPARC has been applied repeatedly to estimate KOA (Zhang et al., 2016), the calculated KOA values are not often reported [e.g., Stenzel et al. (2014) and Wang et al. (2012)]. We only include KOA estimates from SPARC in the database if they are compared to experimental values or other estimates; thus, KOA values reported in papers such as Weschler and Nazaroff (2010) have not been included.
Some other publications on KOA estimation models, including various MLR models such as poly-parameter linear free-energy relationships (ppLFERs), COSMOtherm (Klamt, 2018; 2011), and OPERA (Mansouri et al., 2018), do not always report the estimated KOA values. Thus, KOA estimates made with these approaches are not included in the database. In addition, there are published MLR models for predicting KOA that do not report estimated KOA values and thus are not included in the database; a summary of these models is included in Table 4.
Table 4. MLR models for predicting KOA, which have not been included in the database because no KOA estimates are published directly.
Chemical specificity | Regression method | Descriptor | References |
---|---|---|---|
POPs | PLS | Quantum chemical descriptors | Chen et al. (2004) |
Diverse compounds | PLS | Quantum chemical descriptors (DRAGON) | Fu et al. (2016) |
Diverse compounds | PLS | Atom-centered fragments (DRAGON) | Fu et al. (2016) |
Chlorinated compounds | MLR | Molecular polarizabilities and multipole moments (GAUSSIAN 98) | Staikova et al. (2004) |
PCNs | MLR | Abraham descriptors | Abraham and Al-Hussaini (2001) |
N-nitrosodialkylamines | MLR | Abraham descriptors | Abraham and Al-Hussaini (2002) |
Diverse compounds | MLR | Abraham descriptors | Abraham and Acree (2008) |
Diverse compounds | MLR | Abraham descriptors | Abraham et al. (2008) |
Diverse compounds | MLR | Abraham descriptors | Endo and Goss (2014) |
Diverse compounds | MLR | General treatment of solute–solvent interactions (GSSI) descriptors | Deanda et al. (2004) |
PCNs | PLS | MOPAC and 3D-HoVAIF descriptors | Li et al. (2012) |
Organophosphorus compounds | MLR | Abraham descriptors | Abraham and Acree (2013) |
Some models have been designed to estimate the temperature dependence of log10 KOA for a series of compounds. For example, the model by Yang et al. (2018) estimates the temperature dependence of KOA for PBDEs. Mintz et al. (2008; 2007) published two ppLFERs using Abraham descriptors to estimate ∆H°OA of a wide range of chemicals.
4. KOA Data
4.1. Data collection
This database includes all measured or estimated KOA values that we could locate in the literature by searching the Web of Science with variations of the keywords: Octanol–air partition coefficient (KOA), octanol–gas partition coefficient, Ostwald coefficient octanol, and Gibbs free energy octanol. References were also found by looking up citations included in the identified papers. A total of 112 literature sources were found to contain KOA data. Forty-seven sources included estimated KOA values, while 70 contained measured values. The database incorporates 209 KOA values from three theses (Cheong, 1989; Dallas, 1995; and Özcan, 2013). While a large portion of the work by Dallas (1995) was published in Abraham et al. (2001) and Dallas and Carr (1992), a portion of KOA estimates from this thesis are not available in the peer-reviewed literature. To the best of our knowledge, KOA data from Cheong (1989) and Özcan (2013) have not been published in the peer-reviewed literature. The search was limited to publications written in English, although articles containing log10 KOA estimates have also been published in other languages [e.g., Zhang et al. (2005) and Zou et al. (2005)].
We have included the error of a measurement or estimation in the database. We have also noted where the KOA reported is for a mixture of isomers or the chemical structure is ambiguous [e.g., Harner and Bidleman (1998), Kömp and McLachlan (1997), and Vuong et al. (2020)]. In some instances where a single paper has reported more than one KOA using different techniques, we note which technique or value was recommended by the authors [e.g., Su et al. (2002)].
4.2. Database structure
The database is provided in a Microsoft Excel workbook and as an R package. The data are stored in seven distinct tables (Fig. 3) to allow users to sort and filter the data based on various criteria, including author, publication year, and measurement or estimation technique.
Fig. 3. A relational schematic representation of the KOA database. PK denotes a primary key, which is unique to a specific table. FK denotes a foreign key, which indicates how the data in the different tables are connected.
The Chemical Table provides information regarding the name, CAS number, SMILES notation, and other chemical identifiers for each solute. Each unique chemical is associated with a chemical identification number within the database (chemID). Similarly, each unique literature source and author are assigned a unique identifier in the Reference Table (refID) and Author Table (auID). The Author-Reference Table is used to connect the information presented in both. Each method for measuring or estimating KOA within a paper is also assigned a unique identifier (methID). The Property Table contains the PL, γoct, KAW, KOW, and ΔG° values originally reported in the paper in SI units (with the exception of ΔG°, which is in kJ mol−1) and used to calculate the KOA; each value is assigned a propID. Finally, the KOA values are reported in the KOA Table, with each datapoint uniquely associated with a dataID.
4.2.1. Chemical table
The QSAR-ready SMILES notation for the 1643 compounds with literature data was taken from EPA’s CompTox Dashboard (Williams et al., 2017) or PubChem (Kim et al., 2017) or, if none were available, created using ChemDraw. CAS numbers and names of chemicals were verified using both SciFinder and the CompTox Dashboard. Canonical SMILES for compounds were produced using Open Babel (version 3.00) (O’Boyle et al., 2011). We also include the IUPAC name, common name, common acronym, and alternative names and acronyms for each chemical. The list of names for each compound is not exhaustive, and we recommend searching for chemicals using their CAS number. We also group chemicals into over 50 broad categories, including PCBs, PBDEs, PCNs, amines, ketones, and so on. This categorization of chemicals is also non-exhaustive as many chemicals may fall into more than one group. Within the database, each chemical is assigned a unique identifier, a chemID. In some instances, the KOA of a mixture of two or more chemicals is also reported. These mixtures are also given a unique chemID, and the CAS number and the identifying information for all chemicals in the mixture are included.
4.2.2. KOA table
KOA data are stored in the KOA Table. Each KOA value is assigned a unique identifier (dataID), which is associated with a specific chemical (chemID) and method (methID). For each KOA value, we also include the temperature of the measurement, any errors reported for the measurement, and comments or notes for the datapoint. The comments indicate if the KOA reported is for an isotopically labeled species or if any typo corrections and assumptions were made during the data curation process of the original work.
4.2.3. Methods and reference table
Each reference is stored in the Reference Table and assigned a unique identifier (refID). Some references may report or compare the results of different KOA measurement or estimation techniques; therefore, a single reference can be associated with multiple experimental and estimation techniques. Each technique from each reference is also assigned a unique identifier (methID). Different measurement and estimation techniques have been employed and published at different times over the past few decades, which is shown in Fig. SI 7 in the supplementary material.
4.2.4. Property table
In Sec. 1.1, we discussed the different ways log10 KOA has been reported in the literature. The Property Table (dark blue table, Fig. 3) includes the data originally reported in the literature in standardized units, except ΔG°, which is reported in units of kJ mol−1. This includes converting all reported kH values to KAW for convenience. Each property value within the database is associated with exactly one KOA datapoint in the main KOA Table; however, a single KOA datapoint may be associated with more than one property value. Each property datapoint is associated with a chemical (chemID), method (methID), reference (refID), and KOA value (dataID). There are 2228 dataIDs that are associated with 3723 propIDs. Figure SI 4 shows the distributions of the different property data included in the database.
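How the identifiers link the tables can be illustrated with a small in-memory SQLite sketch. The database itself is distributed as an Excel workbook and an R package, so this is only an illustration of the relational idea: the key names chemID, refID, methID, and dataID follow the text, whereas the table layout, the remaining column names, and all rows are hypothetical placeholders.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE chemical  (chemID INTEGER PRIMARY KEY, name TEXT, cas TEXT);
CREATE TABLE reference (refID INTEGER PRIMARY KEY, citation TEXT);
CREATE TABLE method    (methID INTEGER PRIMARY KEY,
                        refID INTEGER REFERENCES reference(refID),
                        technique TEXT, kind TEXT);
CREATE TABLE koa       (dataID INTEGER PRIMARY KEY,
                        chemID INTEGER REFERENCES chemical(chemID),
                        methID INTEGER REFERENCES method(methID),
                        temp_C REAL, log_koa REAL);
""")

# Placeholder rows only; none of these values come from the real database.
con.execute("INSERT INTO chemical VALUES (1, 'placeholder chemical A', '0000-00-0')")
con.execute("INSERT INTO reference VALUES (1, 'Placeholder et al. (2000)')")
con.executemany("INSERT INTO method VALUES (?, ?, ?, ?)",
                [(1, 1, 'generator column', 'experimental'),
                 (2, 1, 'thermodynamic triangle', 'estimated')])
con.executemany("INSERT INTO koa VALUES (?, ?, ?, ?, ?)",
                [(1, 1, 1, 25.0, 8.1), (2, 1, 1, 10.0, 8.9), (3, 1, 2, 25.0, 7.8)])

# Experimental values at 25 degC for one CAS number, with method and source.
rows = con.execute("""
    SELECT k.dataID, c.name, m.technique, r.citation, k.log_koa
    FROM koa AS k
    JOIN chemical  AS c ON c.chemID = k.chemID
    JOIN method    AS m ON m.methID = k.methID
    JOIN reference AS r ON r.refID  = m.refID
    WHERE c.cas = ? AND m.kind = 'experimental' AND k.temp_C = 25.0
    ORDER BY k.dataID
""", ("0000-00-0",)).fetchall()
print(rows)
```

Searching by CAS number rather than by name mirrors the recommendation given for the Chemical Table.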
4.3. Quality of the reporting
Each technique and method (i.e., methID) for determining and reporting KOA values was assigned a score based on whether
(1) the method is an estimation or empirical technique,
(2) the description of the used methodology is sufficiently detailed,
(3) some analysis of the methods is provided (including, e.g., their possible limitations and scope), and
(4) an error of the KOA value is reported and/or whether an external validation of the KOA value was performed.
Each factor is weighted equally, and the methods are thus scored from 0 to 4. The point assigned for each factor is binary, and thus, there are no half points allocated. These points are only ascribed to method, description, analysis, and error of the reported KOA value. For example, a method from a paper about empirical measurements of KOA, providing a detailed description and analysis of the methodology and the error of the measurement, will have a score of 4, whereas a thermodynamic triangle method without any additional details or analysis will have a score of 1. Based on this categorization, a total of 36 methods scored 4, 81 methods scored 3, 32 methods scored 2, and 14 methods scored 1. Note that we define each method as having a unique identifier (methID). Figure SI 2 shows the distribution of the scores across the log10 KOA range for experimental and estimated data.
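The scoring scheme amounts to counting how many of the four binary factors are satisfied. The function below is a minimal sketch of that arithmetic; the function and argument names are hypothetical, and the text does not spell out exactly which techniques earn the point for the first factor, so that decision is left to the caller.

```python
def reporting_score(method_point, detailed_description,
                    analysis_provided, error_or_validation):
    """Sum four equally weighted binary factors into a score from 0 to 4.

    Each argument is a bool stating whether the corresponding factor is
    satisfied. How the first factor (estimation vs. empirical technique)
    is awarded is an assumption left to the caller.
    """
    factors = (method_point, detailed_description,
               analysis_provided, error_or_validation)
    # No half points: each factor contributes either 0 or 1.
    return sum(1 for satisfied in factors if satisfied)
```

A method satisfying all four factors yields `reporting_score(True, True, True, True) == 4`, and a method satisfying only one factor yields 1.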
4.4. KOA data
The database contains 13 264 KOA values for 1643 different chemicals between −50 and 110 °C. Of these, 2517 values (19%) are experimentally derived and 10 747 values (81%) are estimated (Fig. 4). Notably, the data are bimodally distributed with respect to log10 KOA, with one maximum between log10 KOA 2 and 4 and a second between log10 KOA 6 and 11. If we consider only measurements or estimates made at 25 °C, the peaks become even more pronounced (Fig. SI 1). These peaks are a result of the difference in the applicability range of the different measurement techniques. Many estimation techniques require a set of KOA data to develop and train new models, so the distribution of estimated KOA values follows a similar pattern.
Distribution of all experimental and estimated KOA values included in the database.
4.5. Measured KOA values
There are 2517 measured KOA values in the database for 704 chemicals. Of these, 1524 are directly measured values, while 993 are obtained by indirect measurements using chromatographic retention time techniques. The lowest and highest determined log10 KOA values are −1.8 and +14 for helium (CAS No. 7440-59-7) and trichloro-benzo[a]pyrene (CAS No. 97303-27-0), respectively. Most of the reported values (52%) are for log10 KOA values between 6 and 11, followed by measurements between 2 and 5 (30%). Few measurements have been reported for chemicals with a log10 KOA less than 2 (4%) or greater than 11 (8%). There is also a drop in the number of measurements between log10 KOA 5 and 6 (4.92%). Most log10 KOA values less than 4 were measured directly using static techniques, while dynamic or indirect techniques were used to measure log10 KOA values greater than 6 [Fig. 6, Panel (a)]. Approximately a quarter (26.2%) of the measurements in the database were obtained using static techniques, a third (34.2%) using dynamic techniques, and 39.4% using indirect techniques. Figure 5 displays in more detail the KOA range in which different static and dynamic techniques have been applied.
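The percentage shares quoted above can be reproduced from a list of log10 KOA values by simple binning. The sketch below is illustrative only: the function name is hypothetical, and the treatment of values falling exactly on a bin edge (2, 5, 6, or 11) is an assumption, since the text does not state it.

```python
def range_shares(log10_koa_values):
    """Return the percentage of values in each log10 KOA range.

    Bin edges follow the ranges named in the text. Assignment of values
    lying exactly on an edge is an assumption of this sketch.
    """
    if not log10_koa_values:
        raise ValueError("at least one value is required")
    counts = {"<2": 0, "2-5": 0, "5-6": 0, "6-11": 0, ">11": 0}
    for value in log10_koa_values:
        if value < 2:
            counts["<2"] += 1
        elif value < 5:
            counts["2-5"] += 1
        elif value < 6:
            counts["5-6"] += 1
        elif value <= 11:
            counts["6-11"] += 1
        else:
            counts[">11"] += 1
    total = len(log10_koa_values)
    return {label: 100 * n / total for label, n in counts.items()}
```

Applied to the measured values in the database, a calculation of this kind would be expected to show the dip in the 5 to 6 range relative to its neighbors.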
Distribution of different methods used across a log10 KOA range of < −1 to > 13. The generator column, gas stripping, and dynamic GLC-RT are dynamic techniques. Headspace, phase equilibrium, droplet kinetics, partial pressure, and gas solubility methods are static techniques.
Distribution of experimentally derived log10 KOA values included in the database. Each panel shows the distribution based on the experimental technique used [Panel (a)], the temperature range of the reported value [Panel (b)], and common classes of chemicals with measured KOA values [Panel (c)].
There are 924 KOA values (37%) for 573 chemicals at 25 °C, and most measurements (51%) are made within the 20–30 °C range. There are relatively more measurements made at high temperatures (>30 °C) when log10 KOA is less than 3, presumably to extend the applicability of static techniques to somewhat less volatile chemicals. More surprisingly, there are also relatively more measurements at cooler temperatures (<20 °C) when log10 KOA is greater than 7 [Fig. 6, Panel (b)]. This is likely because many of those less volatile chemicals are environmental contaminants and the partitioning behavior at environmentally relevant temperatures is of primary interest. Measurements in the log10 KOA range between 7 and 10 have been made at the most diverse range of temperatures.
The types of chemicals for which directly measured KOA values have been reported are shown in Table 5. Chemicals with measured log10 KOA values greater than 6 are generally persistent organic pollutants, including CBz, PCBs, PAHs, PCNs, and PBDEs [Fig. 6, Panel (c)]. There is greater diversity among the chemicals with low measured log10 KOA values, which include simple hydrocarbons (alkanes, alkenes, and cyclic hydrocarbons), haloalkanes, alcohols, and organosilicons. Measurements of KOA for small, volatile molecules are often motivated by explorations of basic partitioning behavior (Abraham et al., 2001) or methodological issues (e.g., Lei et al., 2019). Chemicals in the very low log10 KOA range (<1) are typically short chain alkanes, noble gases, and inorganic gases [e.g., xenon (CAS No. 7440-63-3) and carbon monoxide (CAS No. 630-08-0)].
A summary of all papers and techniques reporting experimental KOA values that are included in the database, including the type of methodology and the log10 KOA and temperature ranges for each method
References | Method | Technique | n | log10 KOA range | Temperature range (°C) | Compound groups |
---|---|---|---|---|---|---|
Wilcock et al. 1978 | MA | Gas solubility | 26 | −1.79–0.24 | 9.3–40.49 | Gases |
Bo et al. 1993 | BN-B | Gas solubility | 9 | −1.57–0.24 | 25 | Gases |
Abraham et al. 2001 | HS and GC | Headspace | 81 | −1.29–5.36 | 25 | Hydrocarbons, halogenated |
Boyer and Bircher 1960 | Vgas | Gas solubility | 5 | −0.99–0.42 | 25 | Gases |
Taheri et al. 1993 | HS Vac | Headspace | 6 | −0.43–4.36 | 37 | Alkanes |
Fang et al. 1997b | HS Vac | Headspace | 6 | −0.31–1.78 | 37 | Haloalkanes |
Pollack et al. 1984 | PM | Gas solubility | 5 | 0.27–0.45 | 10–50 | Xenon |
Hiatt 1997 | VD/GC/MS | Headspace | 113 | 0.48–5.57 | 25 | Hydrocarbons, PAHs, CBz, halogenated, amines, labeled |
Taheri et al. 1991 | HS Vac | Headspace | 7 | 1.14–2.5 | 37 | Haloalkanes |
Ionescu et al. 1994 | HS Vac | Headspace | 2 | 1.52–1.73 | 37 | Halogenated compounds |
Ionescu et al. 1994 | HS Vac | Headspace | 2 | 1.56–1.78 | 37 | Halogenated compounds |
Lei et al. 2019 | VPHS | Headspace | 78 | 1.58–4.4 | 25–110 | Various hydrocarbons |
Gruber et al. 1997 | GLC-RT | GLC-RT | 96 | 1.63–3.92 | 20.29–50.28 | Alkanes, alkenes, cyclic, arenes, alcohols |
Fukuchi et al. 2001 | GS | Gas stripping | 4 | 1.65–2.12 | 10–40 | Haloethers |
Eger et al. 2001 | HS Vac | Headspace | 4 | 1.75–2.51 | 37 | Haloalkanes, haloarenes |
Dallas 1995 | HS and GC | Headspace | 75 | 1.75–5.36 | 25 | Simple hydrocarbons |
Bhatia and Sandler 1995 | GLC-RT | Retention time | 32 | 1.91–3.6 | 25–50 | Haloalkanes, alkanes |
Tse and Sandler 1994 | GLC-RT | Retention time | 15 | 1.95–3.6 | 25 | Alkanes, Cl and Br alkyl halides |
Cheong 1989 | HS and GC | Headspace | 11 | 2.02–3.9 | 25 | Alkanes |
Eger et al. 1997 | HS Vac | Headspace | 3 | 2.13–2.14 | 37 | Isoflurane |
Fukuchi et al. 1999 | GS | Gas stripping | 9 | 2.16–3.1 | 10–30 | Ether |
Berti et al. 1986 | PP | Vapor pressure | 8 | 2.16–3.66 | 25 | Simple hydrocarbons |
Fang et al. 1996 | HS Vac | Headspace | 20 | 2.16–3.87 | 37 | Haloarenes, arenes, cyclic |
Fang et al. 1997a | HS Vac | Headspace | 5 | 2.21–6.01 | 37 | Alcohols |
Park et al. 1987 | HS and GC | Headspace | 6 | 2.53–3.42 | 25 | Simple hydrocarbons |
Batterman et al. 2002 | HS and GC | Headspace | 4 | 2.55–3.97 | 37 | Halogenated alkanes |
Hussam and Carr 1985 | HS and GC | Headspace | 2 | 2.59–3.3 | 25.01 | Nitroxy, arene |
Rohrschneider 1973 | HS | Headspace | 6 | 2.61–3.37 | 25 | Nitromethane, toluene |
Su et al. 2002 | MR-SC-GC-RT | Retention time | 230 | 2.65–12.39 | 10–50 | PCNs, CBz |
Xu and Kropscott 2014 | 3P-Eqbm | Phase equilibrium | 26 | 2.69–5.68 | 4.2–35.2 | Organosiloxanes |
Xu and Kropscott 2013 | 2P-Eqbm | Phase equilibrium | 49 | 2.71–6.85 | −5–40.2 | Organosiloxanes |
Cabani et al. 1991 | PP | Vapor pressure | 10 | 2.75–4.07 | 25 | Simple hydrocarbons |
Su et al. 2002 | MR-GC-RT | Retention time | 110 | 2.86–11.31 | 10–50 | PCNs, CBz |
Dallas and Carr 1992 | HS and GC | Headspace | 11 | 2.87–5.18 | 25 | Alcohols |
Roberts 2005 | Bubbler | Gas stripping | 5 | 2.92–3.39 | 0–25 | Peroxyacetyl nitrate |
Su et al. 2002 | MR-GC-RT | Retention time | 78 | 2.99–11.28 | 10–50 | PCNs, CBz |
Eger et al. 1999 | HS Vac | Headspace | 19 | 3–5.99 | 37 | Alcohols, FTOHs |
Treves et al. 2001 | SPME | Headspace | 9 | 3.03–7.88 | 25 | Alkyl dinitrates, Alkyl nitrates, chlorobenzenes, PAHs |
Eger et al. 1999 | HS Vac | Headspace | 19 | 3.06–6.01 | 37 | Alcohols, FTOHs |
Lei et al. 2004 | SR-GC-RT | Retention time | 12 | 3.19–7.09 | 25 | Fluorinated |
Leng et al. 2015 | Bubbler | Gas stripping | 5 | 3.45–3.85 | 5–25 | Triethylamine |
Hiatt 1998 | VD/GC/MS | Headspace | 8 | 3.68–4.28 | 25 | Terpenes |
Dreyer et al. 2009 | FM | Generator column | 52 | 3.99–6.95 | 5–40 | FTAs, FOSA, FOSE |
Thuens et al. 2008 | FM | Generator column | 37 | 4.1–6.79 | 5–40 | FTOHs |
Xu and Kropscott 2012 | 2P-Eqbm | Phase equilibrium | 4 | 4.29–6.4 | 20.1–24.6 | Organosiloxanes |
Harner and Mackay 1995 | FM | Generator column | 60 | 4.36–11.83 | −10–25 | CBz, PCBs, DDT |
Xu and Kropscott 2012 | 3P-Eqbm | Phase equilibrium | 3 | 4.4–5.72 | 21.7–24.6 | Organosiloxanes |
Goss et al. 2006 | FM | Generator column | 11 | 4.8–6.72 | 0–25 | FTOHs |
Yaman et al. 2020 | SR-GC-RT | Retention time | 14 | 5.15–11.78 | 25 | OPEs |
Ha and Kwon 2010 | droplet kinetics | Droplet kinetics | 10 | 5.37–10.48 | 25 | PAHs |
Harner and Bidleman 1998 | FM | Generator column | 159 | 6.09–10.62 | 0–50 | PAHs, PCNs |
Zhang et al. 1999 | MR-GC-RT | Retention time | 208 | 6.09–13.36 | 0–20 | PCBs |
Odabasi et al. 2006a | SR-GC-RT | Retention time | 14 | 6.34–12.59 | 25 | PAHs |
Özcan 2013 | SR-GC-RT | Retention time | 11 | 6.43–8.77 | 25 | Musks |
Kömp and McLachlan 1997 | FM | Generator column | 96 | 6.52–10.66 | 10–43 | PCBs |
Okeme et al. 2020 | SR-GC-RT | Retention time | 49 | 6.59–11.44 | 25 | PCBs, musk, PAHs, DDTs, other hydrocarbons |
Harner and Bidleman 1996 | FM | Generator column | 86 | 6.64–12.57 | −10–30 | PCBs |
Wania et al. 2002 | SR-GC-RT | Retention time | 45 | 6.78–12.15 | 25 | PBDEs, PCBs, PCNs |
Wania et al. 2002 | FM | Generator column | 8 | 6.95–8.93 | 5–45 | Alkanes |
Pegoraro et al. 2015 | SR-GC-RT | Retention time | 8 | 7–11.18 | 25 | Phthalates, cinnamate |
Shoeib and Harner 2002b | FM | Generator column | 112 | 7.38–11.38 | 5–45 | OCPs |
Harner et al. 2000 | FM | Generator column | 57 | 7.4–11.66 | 0–50 | PCDD/Fs, PCB |
Shoeib et al. 2004 | FM | Generator column | 12 | 7.44–8.8 | 0–25 | PFAS |
Wang et al. 2017 | SR-GC-RT | Retention time | 14 | 7.55–13.5 | 25 | Organophosphates |
Zhang et al. 2009 | SR-GC-RT | Retention time | 7 | 7.61–9.87 | 25 | DDT, HCH |
Odabasi et al. 2006b | SR-GC-RT | Retention time | 2 | 7.68–8.03 | 25 | PAH, carbazole |
Yao et al. 2007 | FM | Generator column | 4 | 7.93–8.88 | 20 | Pesticides |
Vuong et al. 2020 | SR-GC-RT | Retention time | 34 | 8.06–13.98 | 25 | PAHs |
Shoeib and Harner 2002a | MR-GC-RT | Retention time | 16 | 8.12–10.8 | 23 | PCBs |
Zhao et al. 2010 | SR-GC-RT | Retention time | 29 | 8.3–13.29 | 25 | PBDEs |
Chen et al. 2001 | RTI | Retention time | 29 | 8.36–12.05 | 25 | PCDD/Fs |
Odabasi and Cetin 2012 | SR-GC-RT | Retention time | 7 | 8.41–10.57 | 25 | Cyclodienes |
Zhao et al. 2009 | SR-GC-RT | Retention time | 12 | 8.5–12.7 | 10–25 | FTOHs, PFASs |
Harner and Shoeib 2002 | FM | Generator column | 51 | 8.52–12.64 | 15–45 | PBDEs |
Lee and Kwon 2016 | droplet kinetics | Droplet kinetics | 8 | 8.85–11.01 | 25 | BFRs |
Harner et al. 2000 | RTI | Retention time | 17 | 10.9–13 | 7 | PCDD/Fs |
4.6. Reliability of KOA measurements
A subset of experimental log10 KOA data was assessed to be unreliable. These measurements were made for polar compounds using a gas chromatography retention time (GC-RT) technique. Figure 7 compares KOA values obtained using GC-RT methods against directly measured values, where available. While there is generally very good agreement between the reported values, some notable exceptions become apparent. KOA values for a series of fluorotelomer alcohols (FTOHs) measured with the GC-RT technique are much lower than those measured using the generator column technique. We suspect that the GC-RT measurements are erroneous due to the high polarity of these compounds and their ability to undergo hydrogen bonding. These compounds would be expected to interact much more strongly with octanol than with the nonpolar GC column used, particularly relative to hexachlorobenzene (CAS No. 118-74-1), the reference compound used in the study (Lei et al. 2004). Thus, when measured with GC-RT, log10 KOA for such chemicals tends to be too low.
Plot of directly versus indirectly measured KOA for compounds for which values from both techniques exist. The solid line indicates a one-to-one relationship, while dashed lines represent ±1.
Large discrepancies are also apparent for benz[a]anthracene (CAS No. 56-55-3) and benzo[a]pyrene (CAS No. 50-32-8), where KOA values from the GC-RT techniques (Odabasi et al. 2006a) are much higher compared to those obtained with the droplet kinetics technique (Ha and Kwon 2010). In Fig. SI 3, we compare the measurements made using the droplet kinetics technique against other experimental measurements for the same compounds. The KOA for benz[a]anthracene by Ha and Kwon (2010) is almost an order of magnitude smaller than most other measured and estimated values for this compound. On the other hand, the value obtained by GC-RT is within 0.5 log10 units of estimates using solvation models (Fu et al. 2016), the UPPER model (Lian and Yalkowsky 2014), and thermodynamic triangles (Sepassi and Yalkowsky 2007). The log10 KOA for benzo[a]pyrene by Ha and Kwon (2010) is also lower than almost all other reported values. The GC-RT derived log10 KOA for both PAHs is in excellent agreement with the final adjusted value derived by Ma et al. (2010). In addition, as these PAHs are relatively non-polar, the GC-RT technique should be applicable and, in any case, not lead to KOA values that are too high. We therefore suspect that in this case, the values reported by Ha and Kwon (2010) are more likely to be erroneous than the GC-RT values.
There are many more GC-RT-derived KOA values without complementary directly measured values. To identify other potentially flawed values, we compared the GC-RT measured value with predictions made by three different prediction models. Figure 8 displays the residuals between predicted and GC-RT measured values, whereby chemicals are color-coded by the strengths of their H-bonding with octanol. The latter is quantified as aA + bB, where A and B are the Abraham solute descriptors for hydrogen bonding acidity and basicity of the solute and a and b are the respective system constants from the poly-parameter linear free energy equation for log10 KOA by Endo and Goss (2014). Experimental solute descriptors were obtained using the UFZ-LSER Database (Ulrich et al. 2017); if experimental solute descriptors were unavailable, estimated solute descriptors were obtained using the IFSQSAR model developed by Brown and available on GitHub (https://github.com/tnbrowncontam/ifsqsar) (Brown 2014; Brown et al. 2012).
Comparing the indirectly measured KOA values against estimates made using ppLFER equations (Endo and Goss 2014) with estimated and experimental solute descriptors and COSMOtherm. Gray circles indicate limited polarity (aA + bB < 0.5), blue triangles indicate moderate polarity (0.5 < aA + bB < 1), yellow + indicate strong polarity (1 < aA + bB < 2), and red Xs indicate very strong polarity (aA + bB > 2). The dashed lines indicate a residual of ±0.5 log10 units between the experimental and estimated value.
GC-RT-derived log10 KOA values of hydrogen-bonding chemicals (aA + bB > 0.5) have unusually large residuals with all three prediction techniques, suggesting a large bias. Most residuals are negative, implying that the KOA for such chemicals is biased low, which is consistent with expectations. We conclude that the GC-RT method is unsuitable for measuring the log10 KOA of polar, and especially hydrogen-bonding, chemicals because (i) the interactions between the octanol and the reference chemical are not necessarily comparable to the interactions between octanol and the analyte of interest and (ii) the way the analyte interacts with the stationary phase will not be similar to its interaction with octanol due to the latter’s capacity to undergo hydrogen bonding.
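The hydrogen-bonding term aA + bB and the polarity classes used for color-coding can be written down compactly. The sketch below is illustrative: the function names are hypothetical, the system constants a and b must be supplied by the user from the ppLFER equation (no values are assumed here), class membership for values lying exactly on a threshold is an assumption, and the screening rule in `is_suspect` is a hypothetical example rather than the criterion applied in the database.

```python
def hbond_term(a, A, b, B):
    """Hydrogen-bonding contribution aA + bB.

    A and B are the solute descriptors for H-bond acidity and basicity;
    a and b are the system constants of the ppLFER equation, supplied
    by the caller.
    """
    return a * A + b * B


def polarity_class(hb):
    """Map aA + bB to the four polarity classes named in the caption.

    Values exactly equal to 0.5, 1, or 2 are assigned to the upper
    class, which is an assumption of this sketch.
    """
    if hb < 0.5:
        return "limited"
    if hb < 1:
        return "moderate"
    if hb < 2:
        return "strong"
    return "very strong"


def is_suspect(hb, residual, hb_cutoff=0.5, residual_cutoff=0.5):
    """Hypothetical screening rule: hydrogen-bonding chemical with a
    residual larger in magnitude than the cutoff (in log10 units)."""
    return hb > hb_cutoff and abs(residual) > residual_cutoff
```

With made-up inputs, `polarity_class(hbond_term(2.0, 0.5, 1.0, 0.25))` evaluates the term as 1.25 and returns `"strong"`.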
Within the database, we have noted which KOA values obtained by GC-RT techniques may be erroneous due to the high polarity of the chemical.
4.7. Estimated KOA values
The range of 10 747 estimated KOA values in the literature is much larger than that of the experimentally derived values. The lowest estimated log10 KOA value is −3.02 for propylnitrile (CAS No. 107-12-0) by Best et al. (1997) using a thermodynamic triangle approach based on ΔG°AW and log10 KOW. The highest estimated log10 KOA value, 30.20 for 1,2-bis[(2,3,4,5,6-pentabromophenyl) methyl] 3,4,5,6-tetrabromo-1,2-benzenedicarboxylate (CAS No. 82001-21-6) by Zhang et al. (2016), was obtained using the thermodynamic triangle approach implemented in EPISuite’s KOAWIN.
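A thermodynamic triangle of the kind mentioned above combines two partition ratios to obtain the third. The sketch below assumes the commonly used form KOA = KOW/KAW, with KAW obtained from a Henry's law constant in Pa m3 mol−1 as kH/(RT); the function names are hypothetical, and the sketch ignores any difference between water-saturated and dry octanol. It is not a reproduction of the KOAWIN implementation.

```python
import math

R = 8.314462618  # molar gas constant, J mol-1 K-1 (equivalently Pa m3 mol-1 K-1)


def kaw_from_henry(kh_pa_m3_per_mol, temperature_k=298.15):
    """Dimensionless air-water partition ratio from a Henry's law
    constant expressed in Pa m3 mol-1 (assumed unit)."""
    return kh_pa_m3_per_mol / (R * temperature_k)


def log10_koa_triangle(log10_kow, log10_kaw):
    """log10 KOA = log10 KOW - log10 KAW (assumed triangle form)."""
    return log10_kow - log10_kaw


def log10_koa_from_kow_and_henry(log10_kow, kh_pa_m3_per_mol,
                                 temperature_k=298.15):
    """Convenience wrapper chaining the two steps above."""
    kaw = kaw_from_henry(kh_pa_m3_per_mol, temperature_k)
    return log10_koa_triangle(log10_kow, math.log10(kaw))
```

Because the two inputs are subtracted on a log scale, errors in log10 KOW and log10 KAW propagate directly into the estimate, which is one reason triangle-based values can reach extremes well outside the measured range.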
The general distribution of estimated KOA values is similar to that of the experimentally derived values, with the majority of estimated values (70%) within the log10 KOA 6–12 range (Fig. 9). A large portion of estimated log10 KOA values are also in the 2–5 range (17.5%). Fewer estimates are made above log10 KOA 13 (3.5%) or below 1 (1.3%). Between log10 KOA 5 and 6, there are also few estimates (3.4%).
Distribution of estimated log10 KOA values included in the database. Each panel shows the distribution based on the estimation technique [Panel (a)], the temperature range of the reported value [Panel (b)], and common classes of chemicals where estimated KOA values exist [Panel (c)].
Half (50.7%) of the estimated values are derived from some form of linear regression, as described in Sec. 3.1.2. Solvation models for estimating KOA are also very commonly used (34.6%), followed by thermodynamic triangle estimation techniques (12.0%). Both solvation models and thermodynamic triangles are used across a broad KOA range. Likewise, the UPPER model is not restricted to a specific range of chemicals because it is rooted in principles applied to thermodynamic triangles. However, UPPER estimates are not commonly available in the literature, and almost two thirds of published values obtained with UPPER (63.7%) fall within the log10 KOA range between 1 and 5. The UNIFAC model is typically applied to estimate KOA of volatile chemicals, and thus, reported values range only from 0.4 (tetrahydropyran; CAS No. 142-68-7) to 2.38 (dimethyl sulfoxide; CAS No. 67-68-5). Estimates made using machine learning techniques are limited to the work by Jiao et al. (2014) on PBDEs. As methods for estimating physical–chemical properties using neural networks and machine learning are developed further and because these approaches are not limited to a specific subset of chemicals, we expect their estimation range to widen significantly.
Most estimates are for log10 KOA at 25 °C (71.9%). There are 3023 estimated KOA values for 486 different chemicals at non-standard temperatures, which have been reported by nine publications using either linear regressions, thermodynamic triangles, or solvation models. Linear regression models use temperature-dependent experimental KOA values for training and validation (Chen et al. 2003c, 2003b, 2002b; Li et al. 2006; Mathieu 2020). The descriptors for these models are temperature-dependent (e.g., Xi/T) because temperature and KOA are inversely correlated. Meylan and Howard (2005) estimated the temperature dependence of KOA from that of kH, i.e., they ignored the temperature dependence of KOW when applying the thermodynamic triangle of Eq. (10). Li et al. (2020) estimated a temperature-dependent KOA by estimating ΔG°OA at 25 °C using a solvation model and then solving Eq. (6) with different values of T. The assumption that ΔG°OA is not strongly temperature-dependent is similar to assuming that ΔUOA or ΔHOA are weakly temperature-dependent. Some solvation models, such as COSMOtherm, can directly estimate ΔG°OA at different temperatures (Parnis et al. 2015).
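The approach of holding ΔG°OA fixed and varying T can be illustrated numerically. The sketch below assumes that Eq. (6) has the standard form ΔG°OA = −RT ln KOA for transfer from the gas phase to octanol, with ΔG°OA in kJ mol−1; the function name and the example value of −40 kJ mol−1 are hypothetical and chosen only for illustration.

```python
import math

R_KJ = 8.314462618e-3  # molar gas constant, kJ mol-1 K-1


def log10_koa_from_dg(delta_g_oa_kj_mol, temperature_c):
    """log10 KOA from a free energy of transfer held constant with T.

    Assumes delta_G_OA = -R * T * ln(KOA); the sign and standard-state
    conventions of the source equation may differ.
    """
    temperature_k = temperature_c + 273.15
    return -delta_g_oa_kj_mol / (R_KJ * temperature_k * math.log(10))


# Illustrative only: a made-up free energy evaluated at three temperatures.
example = {t: log10_koa_from_dg(-40.0, t) for t in (0.0, 25.0, 50.0)}
```

For a negative ΔG°OA, the computed log10 KOA increases as the temperature decreases, consistent with the inverse relationship between temperature and KOA noted above.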
The chemical classes for which KOA is commonly estimated reflect the availability of experimental data for KOA. The most commonly estimated KOA values are for PCBs (22.7%), PCDDs (13.6%), PAHs (12.4%), PBDEs (11.1%), and PCNs (0.7%) within the log10 KOA 9–12 range. At the lower range of KOA values, there are more estimates for alcohols and haloalkanes. There are also KOA estimates for different CBz, arenes, alkanes, and OPEs. A full list of all methods and papers reporting estimated log10 KOA values is included in Table 6.
A summary of all papers and techniques that report estimated KOA values included in the database, including the type of methodology and the log10 KOA and temperature ranges for each method. The chemical classes used are not included as estimation models are not all intended for specific compound groups
References | Method | Technique | n | log10 KOA range | Temperature range (°C) | Compound groups |
---|---|---|---|---|---|---|
Abraham et al. (2001) | Triangle | Thermodynamic triangles | 23 | 4.46–9.59 | 25 | PAHs, CBz, Hydrocarbons |
Abraham et al. (2005) | MLR | Linear regressions | 21 | 5.52–7.67 | 25 | Methyl and alkyl substituted naphthalenes |
Abraham et al. (2005) | MLR | Linear regressions | 21 | 5.52–7.85 | 25 | Methyl and alkyl substituted naphthalenes |
Best et al. (1997) | Triangle | Thermodynamic triangles | 66 | −3.02–7.55 | 25 | Diverse compounds |
Best et al. (1997) | OPLS | Solvation models | 63 | −0.6–6.51 | 25 | Diverse compounds |
Best et al. (1997) | MMFF | Solvation models | 66 | 0.05–6.66 | 25 | Diverse compounds |
Chen et al. (2001) | PLS | Linear regressions | 33 | 7.08–12.46 | 25 | PCDD/Fs |
Chen et al. (2002a) | PLS | Linear regressions | 57 | 7.4–11.66 | −50–0 | PCDD/Fs |
Chen et al. (2002b) | PLS | Linear regressions | 210 | 5.61–12.29 | 25 | PCBs |
Chen et al. (2003a) | PLS | Linear regressions | 52 | 8.33–13.26 | 15–45 | PBDEs |
Chen et al. (2003b) | PLS | Linear regressions | 97 | 6.73–12.76 | −10–30 | PCBs |
Chen et al. (2003c) | PLS | Linear regressions | 31 | 4.47–10.11 | 25 | PCNs, CBz |
Chen et al. (2016) | PLS | Linear regressions | 208 | 6.3–12 | 25 | PCBs |
Chen et al. (2016) | PLS | Linear regressions | 208 | 6.64–12.56 | 25 | PCBs |
Cousins and Mackay (2000) | LR | Linear regressions | 22 | 7.01–13.01 | 25 | Phthalate esters |
Dallas (1995) | MOSCED | Solvation models | 39 | 0.09–1.93 | 25 | Simple hydrocarbons |
Dallas (1995) | UNIFAC | UNIFAC | 73 | 0.4–2.38 | 25 | Simple hydrocarbons |
Duffy and Jorgensen (2000) | MC | Linear regressions | 85 | −0.21–7.38 | 25 | Simple diverse compounds |
Ferreira (2001) | PLS | Linear regressions | 16 | 5.01–14.09 | 25 | PAHs |
Finizio et al. (1997) | Triangle | Thermodynamic triangles | 32 | 6.68–11.19 | 25 | CBz, PCBs, DDT, PAHs, HCH |
Fu et al. (2016) | SM8AD | Solvation models | 376 | −0.65–12.78 | 25 | PCNs, PBDEs, PCBs, DDT, other hydrocarbons |
Fu et al. (2016) | SM8AD | Solvation models | 376 | −0.65–12.78 | 25 | PCNs, PBDEs, PCBs, DDT, other hydrocarbons |
Giesen et al. (1997) | SM5.4/PM3 | Solvation models | 31 | 0.07–5.35 | 25 | Diverse compounds |
Giesen et al. (1997) | SM5.4/AM1 | Solvation models | 30 | 0.44–5.43 | 25 | Diverse compounds |
Hiatt (1997) | Triangle | Thermodynamic triangles | 113 | 0.3–6.7 | 25 | Hydrocarbons, PAHs, CBz, halogenated, amines, labeled |
Jiao et al. (2014) | ANN | Machine learning | 22 | 7.38–12.23 | 25 | PBDEs |
Jiao et al. (2017) | MLR | Linear regressions | 22 | 7.4–12.26 | 25 | PBDEs |
Jin et al. (2017) | MLR | Linear regressions | 98 | 4.63–11.5 | −10–50 | PAHs, CBz, PCNs, PCBs, PBDEs, PCDDs, etc. |
Kim et al. (2016) | SLRa | Linear regressions | 76 | 7.28–12.07 | 25 | PCDDs |
Kim et al. (2016) | SLRa | Linear regressions | 76 | 7.33–12.11 | 25 | PCDDs |
Kim et al. (2016) | SLRa | Linear regressions | 76 | 7.33–12.11 | 25 | PCDDs |
Kim et al. (2016) | SLRa | Linear regressions | 76 | 7.33–12.11 | 25 | PCDDs |
Kim et al. (2016) | SLRa | Linear regressions | 76 | 7.33–12.11 | 25 | PCDDs |
Kim et al. (2016) | SLRa | Linear regressions | 76 | 7.33–12.12 | 25 | PCDDs |
Kim et al. (2016) | SLRa | Linear regressions | 76 | 7.34–12.11 | 25 | PCDDs |
Kim et al. (2016) | SLRa | Linear regressions | 76 | 7.38–12.14 | 25 | PCDDs |
Kim et al. (2016) | SLRa | Linear regressions | 76 | 7.51–12.26 | 25 | PCDDs |
Kim et al. (2016) | SLRa | Linear regressions | 76 | 7.59–12.11 | 25 | PCDDs |
Kim et al. (2016) | SLRa | Linear regressions | 76 | 7.59–12.34 | 25 | PCDDs |
Kim et al. (2016) | SLRa | Linear regressions | 76 | 7.6–12.3 | 25 | PCDDs |
Kim et al. (2016) | SLRa | Linear regressions | 76 | 7.74–12.26 | 25 | PCDDs |
Kim et al. (2016) | SLRa | Linear regressions | 76 | 8.33–12.19 | 25 | PCDDs |
Kurz and Ballschmiter (1999) | Triangle | Thermodynamic triangles | 107 | 6.33–9.13 | 25 | PCDEs |
Li et al. (1999) | SM5.4/PM3 | Solvation models | 192 | −1.24–10.98 | 25 | Diverse compounds |
Li et al. (1999) | SM5.4/AM1 | Solvation models | 192 | −1.24–11.48 | 25 | Diverse compounds |
Li et al. (2006) | MLR | Linear regressions | 598 | 3.45–14.54 | 10–40 | PAHs, CBz, PCNs, PCBs, PBDEs, PCDD/Fs |
Li et al. (2020) | SMD/HF/MIDI!6D | Solvation models | 836 | 6.22–10.89 | 10–30 | Diverse compounds |
Lian and Yalkowsky (2014) | UPPER | UPPER | 170 | 1.15–15.63 | 25 | Hydrocarbons, PAHs |
Liu et al. (2013) | PLS | Linear regressions | 39 | 8.43–13.3 | 25 | PBDEs |
Liu et al. (2013) | PLS | Linear regressions | 39 | 8.7–13.29 | 25 | PBDEs |
Mathieu (2020) | MLR and PLS | Linear regressions | 935 | −0.38–13.78 | −10–50 | Diverse compounds |
Meylan and Howard (2005) | Triangle | Thermodynamic triangles | 434 | −1.14–13.67 | 10–25 | Various hydrocarbons |
Nabi et al. (2014) | LFER | Linear regressions | 52 | 4.1–10.56 | 25 | Nonpolar organic compounds |
Nabi et al. (2014) | ppLFER | Linear regressions | 52 | 4.12–11.11 | 25 | Nonpolar organic compounds |
Nedyalkova et al. (2019) | M11 | Solvation models | 55 | −0.4–7.3 | 25 | Hydrocarbons |
Nedyalkova et al. (2019) | B3LYP | Solvation models | 55 | −0.08–7.65 | 25 | Hydrocarbons |
Nedyalkova et al. (2019) | M06-2X | Solvation models | 55 | 1.65–9.1 | 25 | Hydrocarbons |
Odabasi et al. (2006a) | Triangle | Thermodynamic triangles | 14 | 6.23–13.67 | 25 | PAHs |
Oliferenko et al. (2004) | MLR | Linear regressions | 47 | −0.15–5.15 | 25 | Aliphatic compounds |
Papa et al. (2009) | OLS | Linear regressions | 220 | 6.65–17.97 | 25 | PBDEs, other hydrocarbons |
Parnis et al. (2015) | COSMO | Solvation models | 1060 | 4.65–11.57 | −5–40 | PAHs |
Puzyn and Falandysz (2005) | PCR | Linear regressions | 75 | 5.76–11.52 | 25 | PCNs |
Raevsky et al. (2006) | Triangle | Thermodynamic triangles | 98 | −1.11–8.93 | 25 | Hydrocarbons, CBz, PAHs, etc. |
Raevsky et al. (2006) | Triangle | Thermodynamic triangles | 98 | −0.75–8.46 | 25 | PAHs, CBz, hydrocarbons |
Sepassi and Yalkowsky (2007) | Triangle | Thermodynamic triangles | 219 | 1.99–12.99 | 25 | Hydrocarbons, PCBs, CBz, PAHs, etc. |
Vikas and Chayawan (2015) | SLRa | Linear regressions | 18 | 6.66–12.07 | 25 | PCDD/Fs |
Vikas and Chayawan (2015) | SLRa | Linear regressions | 18 | 6.94–12.14 | 25 | PCDD/Fs |
Vikas and Chayawan (2015) | SLRa | Linear regressions | 18 | 7.21–12.25 | 25 | PCDD/Fs |
Vikas and Chayawan (2015) | SLRa | Linear regressions | 18 | 7.21–12.25 | 25 | PCDD/Fs |
Vikas and Chayawan (2015) | SLRa | Linear regressions | 18 | 7.22–12.15 | 25 | PCDD/Fs |
Wang et al. (2008) | MLR | Linear regressions | 209 | 7.38–15.26 | 25 | PBDEs |
Xu et al. (2007) | MLR | Linear regressions | 209 | 7.17–15.73 | 25 | PBDEs |
Yalkowsky et al. (1994b) | UPPER | UPPER | 12 | 2.74–5.76 | 25 | CBz |
Yuan et al. (2016) | PLS | Linear regressions | 209 | 5.7–11.14 | 25 | PCBs |
Yuan et al. (2016) | MLR | Linear regressions | 209 | 6.6–11.6 | 25 | PCBs |
Zeng et al. (2013) | MLR | Linear regressions | 76 | 7.15–12.25 | 25 | PCDDs |
Zhang et al. (2016) | SPARC | Solvation models | 93 | 2.6–28.4 | 25 | Novel flame retardants |
Zhang et al. (2016) | Triangle | Thermodynamic triangles | 93 | 4.4–30.2 | 25 | Novel flame retardants |
Zhang et al. (2016) | ABSOLV | Linear regressions | 93 | 5.6–29.1 | 25 | Novel flame retardants |
Zhao et al. (2005) | MLR | Linear regressions | 6 | 4.44–6.91 | 25 | PCNs |
Zhao et al. (2005) | MLR | Linear regressions | 4 | 6.83–8.86 | 25 | CBs |
Zhao et al. (2005) | MLR | Linear regressions | 24 | 6.93–10.18 | 25 | PBDEs |
Zhao et al. (2005) | MLR | Linear regressions | 10 | 7.79–11.68 | 25 | PAHs |
Zhao et al. (2005) | MLR | Linear regressions | 13 | 9.4–12.26 | 25 | PCDD/Fs |
Zhu et al. (1998) | SM5.42R/BPW91/MIDI!6D | Solvation models | 192 | −1.25–9.93 | 25 | Diverse compounds |
Zhu et al. (1998) | SM5.42R/BPW91/6-31G* | Solvation models | 192 | −1.24–9.91 | 25 | Diverse compounds |
Zhu et al. (1998) | SM5.42R/BPW91/DZVP | Solvation models | 192 | −1.21–9.95 | 25 | Diverse compounds |
a Different molecular descriptors are used to develop multiple single linear regression (SLR) models for KOA.
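Several entries in the table use the thermodynamic triangle technique, which combines two known partition ratios to obtain the third. The following is a minimal Python sketch of that idea, assuming the usual definitions KOA = CO/CA, KOW = CO/CW, and KAW = CA/CW, so that log10 KOA = log10 KOW − log10 KAW. The function name and the input values are illustrative only and are not taken from the database; the sketch also ignores any difference between wet and dry octanol.

```python
def log_koa_from_triangle(log_kow: float, log_kaw: float) -> float:
    """Estimate log10 KOA from log10 KOW and log10 KAW.

    Uses KOA = KOW / KAW, which follows from KOA = CO/CA,
    KOW = CO/CW and KAW = CA/CW. Assumption: the octanol phase
    in the KOW measurement is treated as equivalent to dry octanol.
    """
    return log_kow - log_kaw


# Illustrative inputs only (hypothetical chemical, not database values).
example_log_kow = 5.0
example_log_kaw = -2.0
example_log_koa = log_koa_from_triangle(example_log_kow, example_log_kaw)
print(f"log10 KOA = {example_log_koa:.2f}")
```

Working in log10 units turns the ratio into a simple difference, which is why triangle estimates inherit the combined uncertainty of both input properties.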
4.8. Differences between KOA and K′OA
In Sec. 1.1.5, we remarked on the use of wet-octanol in place of dry-octanol. In Fig. 10, we compare the KOA and K′OA values for the same chemicals; there is no visible difference between the two sets of values that can be attributed to the polarity of the compound. Although only a limited number of chemicals have both empirically derived KOA and K′OA values, the two sets of values are very similar. Estimated KOA and K′OA values are also well correlated, and the deviations seen could be attributed more to differences in the estimation approach than to the difference between wet- and dry-octanol. The effects of using wet-octanol will likely be more evident for more polar compounds in the higher log10 KOA range, for which data are currently lacking.
Comparison of log10 KOA and log10 K′OA values for the same chemicals at the same temperature. The dashed lines have a slope of 1.
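A comparison of this kind can be summarized numerically by measuring how far paired values fall from the 1:1 line. The sketch below is one possible way to do so in Python; the function name, the choice of summary statistics, and the placeholder numbers are assumptions made for illustration and do not reproduce the analysis or the data behind the figure.

```python
import math


def compare_to_unity_line(log_koa_dry, log_koa_wet):
    """Summarize deviations of paired log10 K'OA values from log10 KOA.

    Both inputs are equal-length sequences for the same chemicals at the
    same temperature. Returns the number of pairs, the mean difference
    (wet minus dry), the root-mean-square deviation from the 1:1 line,
    and the largest absolute difference.
    """
    if len(log_koa_dry) != len(log_koa_wet) or len(log_koa_dry) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [wet - dry for dry, wet in zip(log_koa_dry, log_koa_wet)]
    n = len(diffs)
    return {
        "n": n,
        "mean_diff": sum(diffs) / n,
        "rmsd": math.sqrt(sum(d * d for d in diffs) / n),
        "max_abs_diff": max(abs(d) for d in diffs),
    }


# Placeholder values for illustration only (not taken from the database).
dry = [3.0, 5.0, 7.0, 9.0]
wet = [3.1, 4.9, 7.2, 9.0]
summary = compare_to_unity_line(dry, wet)
print(summary)
```

A mean difference near zero with a small root-mean-square deviation would be consistent with points scattering evenly about the 1:1 line, whereas a systematic offset would show up as a non-zero mean difference.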
5. Conclusions
The earliest KOA data included in this work were published in 1960 by Boyer and Bircher. Following these first measurements, interest in KOA waned for almost 30 years, likely due to the difficulty in measuring this property and the lack of direct applicability. In the 1990s, KOA became of increasing interest due to its applicability in pharmaceutical and environmental chemistry and as technological advances and new analytical techniques became more widely accessible (Figs. SI 5–SI 7). The database assembled here is an effort to catalog the work of various researchers to measure and estimate KOA and to assess the applicability ranges of the different techniques used. The database currently includes 13 264 KOA values for 1643 different chemicals. Of these, 2517 KOA values are experimentally derived and the remaining 10 747 are estimated.
In almost all cases, the development of a new model or estimation technique for log10 KOA requires good reference data that can be used to train and validate the model. Large training and validation datasets, including diverse chemicals, are necessary to generate robust models. We hope that this database will serve as a basis for new estimation techniques and experimental measurements of KOA and as a reference dataset.
6. Supplementary Material
See the supplementary material for additional figures (Figs. SI 1–SI 7—the distribution of data with respect to time, methodology, reliability scores, and additional properties included in the database). A Microsoft Excel file containing the KOA database is included.
ACKNOWLEDGMENTS
We thank Tom Harner for sharing his KOA measurement data with us. We thank Alessandro Sangion for his advice on database structure and formatting. We acknowledge the European Chemical Industry Council (CEFIC) for funding from Project No. ECO-41 of the Long-range Research Initiative (LRI).
7. Data Availability
The data that support the findings of this study are openly available on GitHub (https://github.com/sivanibaskaran/koadata) and within the supplementary material.
List of Symbols
Within this work, we utilize a variety of variables and abbreviations. In some cases, the abbreviations have not been explicitly defined in the text. For convenience, we have included all abbreviations and variables here.
Variables
- CA
concentration in air
- CO
concentration in octanol
- CW
concentration in water
- kH
Henry’s law constant in water
- kHoct
Henry’s law constant in octanol
- k′Hoct
reciprocal of Henry’s law constant in octanol
- K′OA
wet octanol–air partition ratio
- KAW
air–water partition ratio
- KOA
octanol–air partition ratio
- KOW
octanol–water partition ratio
- Loct
Ostwald coefficient in octanol
- PL
liquid vapor pressure or subcooled liquid vapor pressure
- voct
molar volume of octanol
- γ∞oct
activity coefficient at infinite dilution in octanol
- ΔG°
Gibbs free energy
- ΔG°OA
Gibbs free energy of solvation in octanol
- SO
solubility in octanol
Compounds/compound groups
- BFRs
brominated flame retardants
- CBz
chlorobenzenes
- DDT
dichlorodiphenyltrichloroethane
- FOSA
perfluorinated alkyl sulfonamides
- FOSE
perfluorinated sulfonamido ethanols
- FTAs
fluorotelomer acrylates
- FTOHs
fluorotelomer alcohols
- HCH
hexachlorocyclohexane
- OCPs
organochlorine pesticides
- OPEs
organophosphate esters
- POPs
persistent organic pollutants
- PAHs
polycyclic aromatic hydrocarbons
- PBDEs
polybrominated diphenyl ethers
- PCBs
polychlorinated biphenyls
- PCDD/Fs
polychlorinated dibenzodioxins and polychlorinated dibenzofurans
- PCDDs
polychlorinated dibenzodioxins
- PCDEs
polychlorinated diphenyl ethers
- PCDFs
polychlorinated dibenzofurans
- PCNs
polychlorinated naphthalenes
- PFASs
per/poly-fluoroalkyl substances
- VOCs
volatile organic compounds
Experimental/estimation techniques
- 2P-Eqbm
two-phase equilibrium technique
- 3P-Eqbm
three-phase equilibrium technique
- ABSOLV
ACD/ABSOLV program from ACD/Labs
- ANN
artificial neural network
- B3LYP
parameterization condition of a solvation model
- BN-B
Ben-Naim/Baer-type apparatus
- CoMFA
comparative molecular field analysis
- CoMSIA
comparative molecular similarity indices analysis
- COSMO
conductor-like screening model
- Dynamic
dynamic techniques
- FM
fugacity meter or generator column
- GasSol
gas solubility
- GC-RT
gas chromatography retention time
- GLC-RT
gas–liquid chromatography retention time
- GS
gas stripping
- HS
headspace
- HS and GC
headspace with gas chromatography
- HS Vac
headspace with vacuum
- LFER
linear free energy relationship
- LR
linear regression
- M06-2X
parameterization condition of a solvation model
- M11
parameterization condition of a solvation model
- MA
modified Morrison–Billett apparatus
- MC
Monte Carlo analysis
- MCIs
molecular connectivity indexes
- MLR
multiple linear regression
- MMFF
parameterization condition of a solvation model
- MOSCED
modified separation of cohesive energy density model
- MR-GC-RT
multi reference gas chromatography retention time
- MR-SC-GC-RT
multi reference, single column, gas chromatography retention time
- OLS
ordinary least squares
- OPLS
optimized potentials for liquid simulations force field—the parameterization used in the continuum solvation model
- PCR
principal component regression
- PLS
partial least squares
- PM
photomultiplier
- ppLFER
poly-parameter linear free energy relationships
- PP
partial pressure technique
- QSARs
quantitative structure–activity relationships
- QSPRs
quantitative structure–property relationships
- RT
retention time
- RTI
retention time index
- SC-GC-RT
single column, gas chromatography retention time
- SLR
single linear regression
- SM5.4/AM1
parameterization condition of a solvation model
- SM5.4/PM3
parameterization condition of a solvation model
- SM5.42R/BPW91/6-31G*
parameterization condition of a solvation model
- SM5.42R/BPW91/DZVP
parameterization condition of a solvation model
- SM5.42R/BPW91/MIDI!6D
parameterization condition of a solvation model
- SM8AD
parameterization condition of a solvation model
- SMD/HF/MIDI!6D
parameterization condition of a solvation model
- SPARC
SPARC performs automated reasoning in chemistry, software by ARChem
- SPME
solid phase microextraction
- SR-GC-RT
single reference, gas chromatography retention time
- Triangle
thermodynamic triangle techniques
- UNIFAC
UNIQUAC functional-group activity coefficients
- UNIQUAC
universal quasichemical
- UPPER
unified physical property estimation relationship
- VD/GC/MS
vacuum distillation gas chromatography mass spectrometry
- Vgas
Van Slyke–Neill blood gas apparatus
- VPHS
variable phase ratio headspace