Polymer properties are inherently multi-scale in nature, where delicate local interaction details play a key role in describing their global conformational behavior. In this context, deriving coarse-grained (CG) multi-scale models for polymeric liquids is a non-trivial task. Further complexities arise when dealing with copolymer systems with varying microscopic sequences, especially when they are of amphiphilic nature. In this work, we derive a segment-based generic CG model for amphiphilic copolymers consisting of repeat units of hydrophobic (methylene) and hydrophilic (ethylene oxide) monomers. The system is a simulation analogue of polyacetal copolymers [S. Samanta *et al*., Macromolecules **49**, 1858 (2016)]. The CG model is found to be transferable over a wide range of copolymer sequences and also to be consistent with existing experimental data.

## I. INTRODUCTION

Polymer processing requires a detailed understanding of their structure-property relationships, which not only is fundamentally challenging but also has a wide range of applications ranging from physics to biology.^{1,2} In this context, smart responsive polymers serve as excellent candidates. A polymer is commonly known as “smart responsive” when a small change in external stimuli can drastically change its structure, function, and/or stability. Furthermore, in these systems, where the relevant energy scale is of the order of the thermal energy *k*_{B}*T*, large conformational and local compositional fluctuations play a delicate role dictating their properties.^{3,4} When the external stimulus is temperature *T*, these polymers are referred to as thermoresponsive polymers.^{5,6} One of the interesting classes of thermoresponsive polymers is those that remain expanded at low *T* and collapse into compact objects when $T>T\u2113$, where $T\u2113$ is referred to as a lower critical solution temperature (LCST). The properties of LCST polymers are usually dictated by hydrogen bonding of water molecules with polymer segments, which break down when $T>T\u2113$. During this process, when hydrogen bonds are broken, water molecules are expelled from near the polymer structure and thus gain a large amount of translational entropy. Therefore, simply speaking, the polymer chain will collapse whenever the total translational entropy gained by water molecules is larger than the conformational entropy lost by the polymer upon collapse.^{7}

For a given homopolymer structure, there exists a characteristic $T\u2113$.^{8–10} This $T\u2113$ can, however, be tuned by introducing more hydrophobic or more hydrophilic units along the homopolymer backbone.^{6,11–15} In particular, $T\u2113$ increases (decreases) with increasing hydrophilic (hydrophobic) units. Moreover, the changes in $T\u2113$ are usually difficult to predict and non-linear with changing copolymer sequences.^{11–13} However, in a recent work, based on the acetal linkage of methylene and ethylene oxide units, it has been shown that $T\u2113$ can be linearly tuned by adjusting the fractions of methylene and ethylene oxide monomers along the polymer backbone.^{6} In addition to predictability with changes in sequence, the acetal-based copolymers are advantageous due to their biodegradable properties. Moreover, these polymers show strong chain length effects dependent on end-group nature (see Fig. 5 in Ref. 6). For example, it has been shown that a molecular of weight $Mw\u2009\u223c\u2009104$ g/mol is necessary to avoid chain length effects, which corresponds to an end-to-end distance $Ree\u2009\u223c\u200910$ nm for a poly(ethylene oxide). Within a simulation setup, this would require a box size of at least 15 to 20 nm. For a water box, with a number density of 32 water molecules per cubic nm, more than 10^{5} water molecules would be needed (or equivalent of $\u223c3\xd7105$ atoms). This poses a serious challenge for all-atom simulations of these systems and motivates the development of an accurate, lower resolution coarse-grained (CG) model for further theoretical investigations. While there are studies in deriving implicit and explicit solvent CG models of PEO,^{16–19} in this work we derive a sequence transferable, systematic CG model of an analogue of polyacetal that represents each copolymer units with single CG beads. We employ the model to study different copolymer architectures that are complementary to the earlier experimental results.^{6} Additionally, we also propose a broad range of molecular morphologies, going beyond available experimental data. Note that here we discuss a case where hydrophobicity (or hydrophilicity) is tuned along the backbone and not added as a side group. However, there are cases, such as poly(*N*-isopropylmethacrylamide), where an extra methylene group added as a side group to poly(*N*-isopropylacrylamide) increases $T\u2113$.^{20} The possible reason behind an added hydrophobic group that increases $T\u2113$ can be attributed to the solvent structure around the chain.^{15}

## II. MODEL AND METHOD

We start by commenting on the underlying reference all-atom simulations. For this purpose, the chemical structures are taken to be the same as in experiment.^{6} In Fig. 1(a), we show a typical chemical configuration of the polyacetal copolymer architecture. In this amphiphilic structure, $T\u2113$ is tuned by changing the number of methylene (*n*_{1} and *n*_{2}) and ethylene oxide (*m*_{1} and *m*_{2}) units. For the simplicity of computation, we have chosen the repeat unit of the chain as *N* = 5 [see Fig. 1(b)]. The standard OPLS force field, consisting of explicit atoms, is used to model the polymer structure^{21} that is solvated in a water box consisting of 17 000 water molecules described by the SPC/E water model.^{22} Here we consider a copolymer structure represented with *n*_{1} = 4, *m*_{1} = 0, *n*_{2} = 2, and *m*_{2} = 3. Note that the molecular weight of the simulated chain presented in Fig. 1(b) is almost one order of magnitude smaller than *M*_{w} needed to avoid strong chain length effects.^{6} However, it is noteworthy that the system size effects in the experiments are associated with the end group effects. Therefore, to avoid the system size effect and to make a reasonable estimate of $T\u2113$, we have terminated the ends of a copolymer with inert CH_{3} groups, as shown in Fig. 1(b). As will be discussed in the latter part of this manuscript, this approximation allows us to obtain a reasonable $T\u2113$, while not attempting to make any claims on the system size effects in simulations. Moreover, this small chain length cannot be used for any structural predictions that only happen for the longer chains.

All-atom molecular dynamics (MD) simulations are performed using the GROMACS 4.5.5 package.^{23} The temperature is varied from 290 K to 320 K using velocity rescaling with a coupling constant 0.5 ps.^{24} Each of these simulations is performed for 50 ns production runs, which is at least one order of magnitude larger than the longest relaxation time. The average is taken over the last 10 ns of MD data. The electrostatics are treated using particle mesh Ewald.^{25} The interaction cutoff for non-bonded interactions is chosen as 1.0 nm. The simulations are performed with a constant pressure ensemble, where the pressure is controlled using a Berendsen barostat^{26} with a coupling time of 0.5 ps and 1 atm pressure. The time step for the simulations is set to 2 fs and the equations of motion are integrated using the leap-frog algorithm. The LINCS algorithm is used to constraint all bond vibrations.^{27}

We find the gyration radius *R*_{g} to be $1.33\xb10.15$ nm (for *T* = 290 K), $1.38\xb10.16$ nm (for *T* = 300 K), $0.97\xb10.04$ nm (for *T* = 310 K), and $0.98\xb10.09$ nm (for *T* = 320 K) for the system presented in Fig. 1(b). There is a reasonably sharp change in *R*_{g} between 300 K and 310 K, suggesting $T\u2113$ to be between 300 K and 310 K. The experimental phase diagram, however, suggests $T\u2113\u223c320$ K for the same copolymer sequence. Therefore, our all-atom data underestimate $T\u2113$ by about 10–20 K.

For the derivation of the CG model, we have used a combination of two structure-based techniques for solutions,^{28} namely, the iterative Boltzmann inversion (IBI)^{29} and the cumulative iterative Boltzmann inversion ($C\u2212$IBI).^{30} The IBI procedure starts from an initial guess of the interaction potential of the CG model $V0(r)=\u2212kBT\u2009lngijtarget(r)$. Here $gijtarget(r)$ is the pair distribution function between different solvent components obtained from the reference all atom simulation. Then the potentials are updated over several iterations *n* using the protocol, $VnIBI(r)\u2009=\u2009Vn\u22121IBI(r)+kBTlngijn\u22121(r)/gijtarget(r)$. Moreover, in the IBI protocol, solution component fluctuations that are related to the tail of g_{ij}(*r*) sometimes need fine tuning, especially when dealing with multi-component systems. For this purpose, the $C\u2212$IBI might serve as a possible candidate. In $C\u2212$IBI, the initial guess of potential is taken from IBI, i.e., $VnIBI(r)$, which is then updated with a protocol, $Vn+1C\u2212IBI(r)\u2009=\u2009VnC\u2212IBI(r)+kBT\u2009lnCijn(r)/Cijtarget(r)$, with cumulative integral $Cijn(r)=\u20094\pi \u222b0rgijn(r\u2032)r\u20322dr\u2032$. Note that in a structure-based CG method, the microscopic interaction details known from all atom simulations, including electrostatic interactions, are integrated into the pairwise effective interaction potentials.

To derive the CG model, which is sequence transferable, we make two assumptions: (1) Upon changing composition and sequencing along the backbone, there are no cross correlations between different monomer units. In this context, experiments have shown that $T\u2113$ changes linearly with changing fractions of hydrophobic or hydrophilic units.^{6} Therefore, presenting a situation where a segment-based coarse-graining (only building CG models based on separate simulations of different monomer units) can be performed, which can later be incorporated into a polymer chain with different sequences. In this context, a generic study of copolymer sequences, within a mean-field picture, has shown linear or non-linear pair interpolation depending on the interaction parameters.^{14} (2) The acetal linker (represented by red in Fig. 1) only contributes to a negligible shift in $T\u2113$. Therefore, we do not incorporate an acetal linker in our simulations. We note that one might use a hard sphere type representation of the acetal unit, although this is beyond the motivation of the present study. The step-by-step procedure for building the CG model is as follows:

*Step 1*: For the non-bonded interaction between two ethylene oxides, we perform the all-atom simulation of ethylene oxide monomers solvated in water, with the mapping scheme presented in Fig. 2(a). For this purpose, we use 800 ethylene oxide molecules in a box of 8200 water molecules. Then an ethylene oxide-ethylene oxide interaction potential is obtained using combined IBI^{29} and $C\u2212$IBI^{30} protocols. We perform 10 iterations of IBI each with 2 ns long MD runs, followed by another 20 iterations of $C\u2212$IBI to obtain final potentials that are used for the CG simulations. Moreover, the direct use of monomer-based CG potentials can lead to unphysical attractive interactions when adding into a polymer chain. In this context, it should be mentioned that the monomer units only present a spherical symmetric solvation volume that gets reduced when the same monomer is put into a polymer chain. This solvation volume correction is approximately half for a monomer in a chain in comparison to an isolated monomer.^{31} Therefore, by scaling the solvation volume, we estimate the interaction potential using the expression, $u(r)=\u2212RTlne\u2212um(r)/RT+1\u2212ln2$, with *R* being the gas constant. Here *u*(*r*) is the interaction between a monomer of a chain and the other component and *u*^{m}(*r*) is the interaction of an isolated monomer with another component. We utilize this expression to obtain the ethylene oxide-ethylene oxide potential. Note that, for $r<0.1$ nm, *u*(*r*) = 10^{6} kJ/mol.

*Step 2*: For the methylene-methylene interaction, we simply take the GROMOS-type united atom model representing CH_{2} and CH_{3} as spheres without explicitly accounting for hydrogen atoms, as shown in Fig. 2(a).

*Step 3*: The methylene-ethylene oxide interaction is obtained using a combination rule $uME(r)\u2009=\u2009uMM(r)\u2009+\u2009uEE(r)/2$. Here, *u*^{ME}, *u*^{MM}, and *u*^{EE} are the interactions between methylene-ethylene oxide, methylene-methylene, and ethylene oxide-ethylene oxide, respectively.

*Step 4*: In order to obtain potentials between the different monomers and water molecules, we have simulated a diblock polymer of three methylene units and three ethylene oxide units. This system is chosen in order to capture any hydrophobic effect that might originate because of the neighboring hydrophobic methylene units, as well as capturing the correct solvation volume and the neighboring effects. The potentials are obtained between the middle monomers of each block and the water molecules.

*Step 5*: The water-water CG potential is obtained from all-atom simulations of pure water using the IBI protocol.

## III. RESULTS AND DISCUSSIONS

The resultant interaction potentials,^{32} shown in Fig. 3, are used to simulate the whole range of copolymer sequences. Note that the derived potentials from the monomer level are used in simulations of a polymer chain with one-four exclusion of bonded neighbors. Importantly, we derive only one set of *u*(*r*) that we will use for a wide range of copolymer sequences. Bonded interactions are obtained by the Boltzmann inversion of bonded, angle, and dihedral distributions. CG simulations are also performed in GROMACS 4.5.5 package,^{23} where the interaction cutoff is chosen as 1.8 nm and the time step is chosen as 2 fs. Simulations are performed for 100 ns long CG MD trajectory.

We start by investigating the conformation of alkane and poly(ethylene oxide) chains in water. For this purpose, the molecular weight is chosen as *M*_{w} = 10^{4} g/mol. The CG model reproduces the expected conformations for both these chains, i.e., an expanded chain for poly(ethylene oxide) with $Rg=3.91\xb10.46$ nm, which is consistent with the previous simulations^{33} and a collapsed structure for the alkane chain with $Rg=1.18\xb10.05$ nm. The structures are characterized by their single chain structure factor *S*(*q*) shown in Fig. 4. For example, the expanded chain shows a scaling law $q\u22121/\nu $ with $\nu =3/5$ being Flory’s exponent for a good solvent chain (see black curve in Fig. 4). When a polymer collapses, its conformation is well described by a hard sphere scattering function with a scaling law *q*^{−4} (see red curve in Fig. 4).^{7} To further validate the CG model, we have calculated the Kuhn length $\u2113k$ of poly(ethylene oxide) by simulating a short segment with the all-atom model. This results in approximately $\u2113k\u223c1.26$ nm and, for *M*_{w} = 10^{4} g/mol or *N*_{l} = 230, the number of Kuhn segments $Nk\u223c48$ per chain. The end-to-end distance is then estimated using the expression $Ree=Nk\u2113k2\u223c8.73$ nm, while the CG model gives $Ree\u223c9.33$ nm.

Now we focus on the main theme of this work, i.e., to study the conformational behavior of complex amphiphilic structures and their comparison to known experimental data obtained from the polyacetal system.^{6} In Table I, we summarize data from experimental copolymers and its comparison to these CG simulations. Note that turbid solutions found by experiment indicate collapsed structures, while clear solutions are associated with expanded conformations. Comparing experimental and simulation data in the last two columns, we find that the polymer conformation is reasonably well reproduced by the CG model, except for two cases. We would also like to point out that for methylene mole fraction $xm>50%$, chains are always collapsed as shown in the column five of Table I.

n_{1}
. | m_{1}
. | n_{2}
. | m_{2}
. | $\u2009\u2009\u2009xm\u2009\u2009\u2009$ . | Experiment^{6}
. | Simulation . |
---|---|---|---|---|---|---|

4 | 0 | 2 | 0 | 1.00 | Turbid | Globule |

4 | 0 | 2 | 1 | 0.86 | Turbid | Globule |

4 | 0 | 2 | 2 | 0.75 | Turbid | Globule |

4 | 0 | 2 | 3 | 0.67 | Turbid | Globule |

4 | 0 | 2 | 4 | 0.60 | Turbid | Globule |

4 | 0 | 2 | 5 | 0.54 | Turbid | Globule |

2 | 1 | 2 | 1 | 0.67 | Turbid | Globule |

2 | 1 | 2 | 2 | 0.57 | Turbid | Globule |

2 | 1 | 2 | 3 | 0.50 | Clear | Lamellar |

2 | 1 | 2 | 4 | 0.44 | Clear | Expanded |

2 | 1 | 2 | 5 | 0.40 | Clear | Expanded |

2 | 2 | 2 | 1 | 0.57 | Clear | Globule |

2 | 2 | 2 | 2 | 0.50 | Clear | Expanded |

2 | 2 | 2 | 3 | 0.44 | Clear | Expanded |

2 | 2 | 2 | 4 | 0.40 | Clear | Expanded |

2 | 2 | 2 | 5 | 0.36 | Clear | Expanded |

n_{1}
. | m_{1}
. | n_{2}
. | m_{2}
. | $\u2009\u2009\u2009xm\u2009\u2009\u2009$ . | Experiment^{6}
. | Simulation . |
---|---|---|---|---|---|---|

4 | 0 | 2 | 0 | 1.00 | Turbid | Globule |

4 | 0 | 2 | 1 | 0.86 | Turbid | Globule |

4 | 0 | 2 | 2 | 0.75 | Turbid | Globule |

4 | 0 | 2 | 3 | 0.67 | Turbid | Globule |

4 | 0 | 2 | 4 | 0.60 | Turbid | Globule |

4 | 0 | 2 | 5 | 0.54 | Turbid | Globule |

2 | 1 | 2 | 1 | 0.67 | Turbid | Globule |

2 | 1 | 2 | 2 | 0.57 | Turbid | Globule |

2 | 1 | 2 | 3 | 0.50 | Clear | Lamellar |

2 | 1 | 2 | 4 | 0.44 | Clear | Expanded |

2 | 1 | 2 | 5 | 0.40 | Clear | Expanded |

2 | 2 | 2 | 1 | 0.57 | Clear | Globule |

2 | 2 | 2 | 2 | 0.50 | Clear | Expanded |

2 | 2 | 2 | 3 | 0.44 | Clear | Expanded |

2 | 2 | 2 | 4 | 0.40 | Clear | Expanded |

2 | 2 | 2 | 5 | 0.36 | Clear | Expanded |

We also would like to draw attention to the fact that it is generally challenging to incorporate composition (or sequence) and temperature transferability in a CG model. Table I demonstrates that our CG model is well transferable over a wide range of copolymer sequences. Further testing reveals that the CG model only reproduces reliable structures for this particular thermodynamic state of parameterization, i.e., 320 K. This is not surprising given that the many-body potential of mean force (PMF) is state point-dependent. However, there are examples where the many-body features are weak and thus the CG model can be *T* transferable.^{3,4,34} Moreover, the hydrogen bonding nature of the interactions, as in the case of polyacetal, adds up an additional complexity, making the many-body effect more relevant and thus the CG potential becomes less transferable.

The advantage of a segment-transferable CG model, which originates because of the lack in cross correlation between different monomer units over large length scales along the polymer backbone, is not only that it allows a rather consistent comparison with the experiments^{6} but that it also enables structural predictions to be made for several more macromolecular architectures. Therefore, the CG model presented here can be viewed as a molecular toolbox to investigate the properties of different polymer architectures for advanced functional uses.^{32} In this context, it has been previously predicted that amphiphilic copolymers can exhibit interesting structures,^{35,36} which was later studied by generic simulations^{37} and Monte Carlo simulations.^{38} These complex structures are highly interesting for biomedical applications, such as drug delivery and materials for tissue engineering.^{1,38} Therefore, to investigate a broad range of conformational properties of polyacetal-based systems, we have constructed a set of 192 copolymer configurations in water with varying amphiphilic sequences. Each of these simulations was performed for 100 ns in CG units, generating a total of about 15 $\mu $s of CG MD data.

In Fig. 5, we present three sets of representative phase diagrams for the copolymer representing with varying *n*_{1}, *n*_{2}, *m*_{1}, and *m*_{2}. It can be seen that depending on the sequences, we observe a variety of copolymer configurations. For example, we find flower, micellar, multiple-flower, lamellar, and/or worm-like micellar structures. This presents a fully flexible and versatile molecular toolbox for the simulation of these amphiphilic systems. In addition to the 192 configurations presented here, many additional configurations may be generated by employing distinct combinations of hydrophobic and hydrophilic units. Furthermore, the applicability of this CG model is not only restricted to the linear amphiphilic chains but may also be useful to study the solvation and/or aggregation of branched or brush-like copolymers.^{39}

## IV. CONCLUSIONS

We have presented a protocol to obtain a sequence-transferable coarse-grained (CG) model for amphiphilic copolymer architectures. Our CG model was derived from a segment (monomer)-based level and then was translated into longer chain simulations with different amphiphilic sequences. The derived CG model was validated by one-to-one comparison with experimental data.^{6} Additionally, we also make several structural predictions that go beyond the polymers synthesized in experiments. While the derived CG model is sequence transferable, it shows poor temperature transferability. This is due to the hydrogen bonded nature of the underlying interaction details, which leads to complex many-body effects that cannot be captured within a structure-based CG model. Here, however, parameterization of a CG model at different temperatures *T* can be performed to obtain a *T* dependent interaction capturing correct nature of hydrogen bonded interaction.^{40,42} Furthermore, the segment transferability is important because one set of CG potentials can describe, predict, and validate many amphiphilic polymer architectures. Therefore, the potentials presented here can be viewed as a molecular toolbox for a wide range of alkane and ethylene oxide-based architectures.

## SUPPLEMENTARY MATERIAL

See supplementary material for information about the bonded potentials that are used to simulate different copolymer sequences.

## ACKNOWLEDGMENTS

We thank Burkhard Dünweg and Carlos M. Marques for countless interesting discussions and Tiago E. de Oliveira for the help with the $C\u2212$IBI scripts implemented in the VOTCA package^{41} and all atom force field. We further acknowledge Joseph Rudzinski, Nancy Carolina Forero-Martinez, and Torsten Stühn for critical reading of the manuscript. C.D.S., P.L., and J.T.K. wish to acknowledge support by the National Science Foundation under Grant No. DMR-1206191. D.M. thanks Joachim Dzubiella for discussions related to his manuscript draft on the CG model of poly(ethylene oxide) homopolymer in water.^{42}

## REFERENCES

Interaction potentials and all the automated scripts to generate different molecular architectures will be compiled in a web domain located at the MPIP.

*et al*., private communications (