Eukaryotic DNA is packaged in the cell nucleus into chromatin, composed of arrays of DNA–histone protein octamer complexes, the nucleosomes. Over the past decade, it has become clear that chromatin structure in vivo is not a hierarchy of well-organized folded nucleosome fibers but displays considerable conformational variability and heterogeneity. In vitro and in vivo studies, as well as computational modeling, have revealed that attractive nucleosome–nucleosome interaction with an essential role of nucleosome stacking defines chromatin compaction. The internal structure of compacted nucleosome arrays is regulated by the flexible and dynamic histone N-terminal tails. Since DNA is a highly negatively charged polyelectrolyte, electrostatic forces make a decisive contribution to chromatin formation and require the histones, particularly histone tails, to carry a significant positive charge. This also results in an essential role of mobile cations of the cytoplasm (K+, Na+, Mg2+) in regulating electrostatic interactions. Building on a previously successfully established bottom-up coarse-grained (CG) nucleosome model, we have developed a CG nucleosome array (chromatin fiber) model with the explicit presence of mobile ions and studied its conformational variability as a function of Na+ and Mg2+ ion concentration. With progressively elevated ion concentrations, we identified four main conformational states of nucleosome arrays characterized as extended, flexible, nucleosome-clutched, and globular fibers.
I. INTRODUCTION
To fit within the limited volume in the eukaryotic cell nucleus (diameter of about 10 μm), the genetic material comprising double-helical DNA (46 chromosomes of ∼2-m total length in humans) is compacted in the form of a protein-DNA complex called chromatin. In vivo, molecular-resolution studies of chromatin structure and dynamics are notoriously difficult and have progressed only recently, profiting from rapidly advancing methods of genomic mapping (chromosome conformation capture, 3C; such as Hi-C, Micro-C, and others), fluorescent super-resolution microscopy, cryogenic electron microscopy (cryo-EM) and tomography (cryo-ET).1–3 Several decades of in vitro studies on cell-extracted and recombinant-reconstituted model systems have revealed detailed structures of chromatin constituents and varying features of chromatin packaging. Eukaryotic chromatin consists of linear arrays of nucleosomes made of nucleosome core particles (NCPs) and linker DNA of variable lengths (10–70 bp). The NCP is a DNA–protein complex of about 145 bp DNA wrapped around a histone octamer, HO, comprised of two histone H2A/H2B dimers and one (H3/H4)2 tetramer, as a ∼1.8-turn superhelix, forming a flat, wedge-like cylinder (∼10 nm diameter, ∼5 nm height).4–8 Histone N-terminal tails of each histone plus the C-terminus of the H2A protrude outside the globular domain and are flexible and disordered, being capable of dynamic interactions not only with DNA of its nucleosome but also with other chromatin constituents.7,9,10
Double helical DNA is a densely charged polyelectrolyte (−1e per 0.17 nm of its length). Therefore, reducing electrostatic DNA–DNA repulsion by neutralization of its negative charge is the primary precondition for genome compaction inside the nucleus.11–13 Long-range, non-specific electrostatic interactions between DNA and positively charged histones and other nuclear proteins contribute decisively to chromatin folding in general and to nucleosome formation in particular.12–14 The electrostatic component of the DNA–nuclear protein binding depends on the charge and charge densities of the proteins and their oligocationic, Lys- and Arg-enriched domains (e.g., histone tails). The binding is mainly driven by entropy gain due to the release of mobile ions (K+, Na+, and Mg2+) interacting with polyelectrolyte DNA and, as a consequence, DNA condensation in general and relevant in vivo phenomena (NCP formation, nucleosome array folding, chromatin compaction) are highly sensitive to the ion concentration in the test tube and in the cytoplasm.13,15–19 Attractive interactions between NCPs are a foundation for higher-order chromatin compaction. Phase separation in NCP solutions is dependent on the concentrations of the cationic species, such as biologically relevant monovalent (K+, Na+) and divalent (Mg2+) ions. Histone tails also play critical roles by mediating NCP–NCP interactions.20–23 An imperative feature of the NCP condensed phases (solid and liquid crystals, lamellar, columnar, and aggregate forms in solutions) is the close NCP stacking by the adjacent flat surfaces of its wedge-like cylinders, observed in vitro,7,8,14,24–31 and in vivo.32–34
In vitro studies of nucleosome arrays (NCPs connected by linker DNA) have previously been carried out using arrays reconstituted from recombinant histones and DNA with the high-affinity nucleosome-positioning Widom “601” sequence.35 It was deduced that array folding, self-association, and phase transitions follow general rules of DNA condensation, mainly defined by electrostatic contributions.13,15,16,36 In addition to the salt conditions,15,18,37 the compaction, micro- and meso-structures of the arrays are dependent on the distribution of the charges on the overall negatively charged NCP surface with a prominent role of the acidic “islets,”38,39 and different contributions from the basic histone tails,20–23,39–43 addition of linker histones,44,45 as well as other nuclear proteins.45,46 X-ray crystallography and cryo-EM studies revealed that nucleosome arrays with nucleosome repeat length (NRL) = 147 + 10·n bp (n = 2,3, …, 6) tend to fold into compact two-start 30 nm fiber stabilized by stacking interactions between i and i + 2 NCPs.27,28,30,31,47–49 Under conditions favoring chromatin compaction, the array sedimentation coefficient50 and mechanical stiffness51 suggested the most pronounced compaction at 167 and 177 bp NRL. However, studies of nucleosome arrays produced by digestion and purification of native chromatin52–56 or reconstituted using DNA without precise nucleosome positioning sequence indicated the formation of irregularly folded fibers, with globular “clutches” of nucleosomes separated by stretches of NCP-free DNA.57–59 This has led to the vision that in vivo chromatin is folded irregularly, and only a minor fraction of it represents elements of regular structures observed in vitro.60–63
Chromatin condensation in vivo is heterogeneous and dynamic, exhibiting various coexisting phases occupying specific regions in the cell nucleus and having different viscoelastic and micro-environmental properties, ranging from solid-like to gel-like to fluid liquid.63–69 The data supports a dynamic phase-separated, fluid-like structure of chromatin lacking regular organization in favor of the long-standing hierarchical folding model.44,64,66,70,71 In vivo cryo-ET observations demonstrated the limited appearance of a regular chromatin fiber structure. The NCP structure is variable, with differences in DNA-HO attachment, while complete DNA wrapping was observed for the higher organisms.72–75 However, in fission yeast, NCPs are partially unwrapped from DNA.76 Chromatin might also be formed from non-canonically packed NCPs, e.g., columnar structures with no linker DNA, which were discovered in an in vitro study of telomeric chromatin.77 Inside the condensed environment of the eukaryotic nucleus, 11 nm nucleosome arrays fold into secondary and tertiary structures, which typically lack long-range periodicity in the linker DNA length and a clear hierarchy of folding.3,78 Compaction of the 11 nm fiber is assisted by the histone tails and a number of nuclear proteins with significant contributions from linker histones (H1) and heterochromatin protein 1 (HP1). In vivo data and modeling showed that common motifs in the secondary packing of the arrays are clutches79–83 consisting of several interacting nucleosomes with frequent occurrence of short two-start 30 nm fibers73,84 stabilized by attractive stacking between 1–3 and 2–4 NCPs in the basic tetranucleosome units.72,82
In light of the above-mentioned considerations, it is clear that chromatin structures display considerable variability and heterogeneity. One alternative method to investigate this issue and obtain detailed information on mechanisms and forces driving chromatin structure and condensation is by means of computer simulations. Over the past three decades, molecular modeling has made a valuable contribution to our understanding of the physical forces and mechanisms defining chromatin organization and has been thoroughly reviewed.85–89 However, even moderate-sized chromatin compartments are too large for detailed all-atom modeling, comprising a huge number of solvent particles. Some all-atom MD simulations of NCP systems have been reported (e.g.,90–93), including a 500 ns simulation of the octa-nucleosome.93 A multiscale simulation using an implicit solvent Born model at the atomic level of the whole chromatin fiber was also reported.94 A comprehensive review of nucleosome all-atom simulations was reported.95 Multiscale coarse-graining is the method of choice for understanding chromatin structure and dynamics at larger length scales. Simple models in the 1990s96,97 have now been developed to residue-based coarse-grained (CG) approaches98–104 with inter-NCP interactions described at higher resolution, enabling information on the structure and dynamics of NCP in condensed phases to be elucidated. It was recently shown that using coarse-grained models, the macroscopic chromatin and NCP phase behavior can be predicted.100,102 Notably, previous CG chromatin and nucleosome modeling has, with the exception of the recent study by Lin and Zhang,105 not been performed with the presence of explicit mobile ions, instead using Debye–Hückel (DH) treatment of the effects of cationic salt100,101,106,107 or resorting to a phenomenological approach to include the effects of divalent ions as in the work of Schlick and co-workers,108,109 which does not correctly capture the important ion correlation effects. From the above-mentioned discussion on the effects of ionic interactions, it is clear that the explicit presence of ions is significant in simulations of strongly coupled polyelectrolyte systems such as DNA and chromatin, where the DH treatment does not correctly model ionic interactions.106 Both analytic modeling and simulation studies confirmed the importance of ion correlation and electrostatic screening in the manifestations of the electrostatic interactions in NCP condensation.100,101,106,107
In the current work, we develop a CG nucleosome array model aimed at investigating the conformational variability of the chromatin fiber due to the effects of the explicit presence of mobile ions (K+, Na+, and Mg2+). The model is developed from our previously successfully validated bottom-up CG single NCP model for variable ion compositions and concentrations.104,110 The main difference between this model (as well as the recent explicit ion model by Lin and Zhang105) and other chromatin modeling studies, such as those of the Schlick, de Pablo, and Takada groups100,101,106–109 is the presence of explicit ions in our model, compared to DH approximations for treating ionic interactions in these other studies. Compared to our previous NCP model, we introduce improvements to the description of DNA unwrapping dynamics for the NCP at the residue/nucleotide level of resolution. As the previously developed NCP model has been proven to reproduce inter-nucleosome interactions accurately, these improvements then allow us to build a nucleosome array model simply by concatenating NCPs with linker DNA. We subsequently subject this nucleosome array model to simulations under various ionic conditions to demonstrate the conformational stages of nucleosome array folding.
II. MODEL AND SIMULATION
A. Improved CG NCP model
We have previously developed a (almost exclusively) bottom-up CG NCP model from all-atom simulations through a “divide-and-conquer” approach and used it in CG simulations of isolated NCPs at different ion compositions and concentrations.104,110 Briefly, atomistic molecular dynamics simulations of DNA in the presence of various ionic species, DNA with short peptides, and peptides with ions have been carried out. Radial distribution functions between CG units, as well as relevant bond and angle distributions, have been determined. They were used as input to the inverse Monte Carlo (IMC) procedure, which yielded nonbonded and bonded potentials for the CG model, reproducing within the CG model RDFs generated in the atomistic simulations.
Here, we aim to use and adapt this model for the simulation of a nucleosome array. To this end, we introduce some modifications that are parameterized in a top-down approach to obtain a more realistic description within the previously developed model and topology. This prioritizes a better structure and, hence, interactions between NCPs connected by linker DNA in an array. The nucleosomal DNA is modeled in the same way as in our previous work,111,112 where five coarse-grained beads represent two base pairs. Histone tails are modeled as chains of beads, whereas the well-structured histone core is represented by a distance-based elastic network. The same set of nonbonded interactions rigorously derived from the atomistic model by using the IMC method is used.113,114 To correctly represent the curved nucleosomal DNA using parameters derived from a free double helical DNA atomistic model in the previous work,111,112 harmonic bonds between nucleosomal DNA and histone core were introduced. It has been shown that such a model is capable of characterizing various condensed phases of NCPs induced by cations.110 Although these bonds maintain the stable structure of the NCP, they, at the same time, hinder the spontaneous unwrapping and rewrapping of DNA around the histone core, known as DNA breathing. In nucleosome arrays, DNA breathing is an integral part of its conformation. It can no longer be approximated in an ad hoc way if we aim to produce physically meaningful conformations. Therefore, in the current work, we seek to realistically model the DNA breathing motions by introducing two modifications to the original CG NCP model.
The first modification is to extend the distance criterion while building the elastic network in the histone core region. The distance cutoff to introduce a bond is now increased from 7 Å to 9 Å to make sure that in cases of DNA unwrapping from the histone core, the structure of the histone core is not altered. Second, we replace the harmonic bonds connecting DNA and the histone core with softer (extensible) bonds that allow DNA breathing.
After performing a number of tests, we finally settled on having each D bead in the CG DNA connected to 24 beads in the histone core, as shown in Fig. 1(a). The beads in the histone core are selected in a distance order, where every 5nth (n = 1, 2, …, 24) closest bead to a certain D bead is bonded to this D bead with the extensible bond. We decide to adopt uniform parameters for all DNA-histone core bonds to reduce the complexity. The parameters of the extensible bond potential were optimized by fitting experimental data on the radius of gyration and DNA unwrapping free energies [Figs. 1(b) and 1(c), see below]. The adopted parameters are 0.11kBT, b = 1.4 nm.
Refinement of the CG NCP model by introducing extensible bonds between the DNA and histone core. (a). Front and side views of the improved NCP model. Extensible DNA-histone core bonds are indicated by cyan (panel to the right shows bond potential profile). Only bonds originating from one D bead are shown to facilitate viewing. Histone tails are not shown. (b). A comparison of the dependence of the radius of gyration (Rg) on salt concentration was determined for the arrays experimentally and in the CG MD simulations. The blue points and curve represent the results of this work; green color represents the experimental data;20 and magenta represents the data of the earlier simulation work.16 (c). Profile of potential of mean force (PMF) of DNA unwrapping calculated for the refined CG NCP model (blue curve). Points are the data of the force-extension experiments.118,119 The estimated error range is smaller than the linewidth. The open state corresponds to the detachment of 30 bp of DNA from either side of the histone core.
Refinement of the CG NCP model by introducing extensible bonds between the DNA and histone core. (a). Front and side views of the improved NCP model. Extensible DNA-histone core bonds are indicated by cyan (panel to the right shows bond potential profile). Only bonds originating from one D bead are shown to facilitate viewing. Histone tails are not shown. (b). A comparison of the dependence of the radius of gyration (Rg) on salt concentration was determined for the arrays experimentally and in the CG MD simulations. The blue points and curve represent the results of this work; green color represents the experimental data;20 and magenta represents the data of the earlier simulation work.16 (c). Profile of potential of mean force (PMF) of DNA unwrapping calculated for the refined CG NCP model (blue curve). Points are the data of the force-extension experiments.118,119 The estimated error range is smaller than the linewidth. The open state corresponds to the detachment of 30 bp of DNA from either side of the histone core.
B. NCP model validation
The new NCP model is tested and validated by examining the radius of gyration and DNA unwrapping free energy for an isolated NCP. We investigate the structural features under equilibrium simulation at 310 K. These simulations were performed in 35 nm cubic boxes for 50 ns under varied monovalent salt concentrations. Hamiltonian dynamics with a 5 fs time step was used to sample the conformational space. Constant temperature is regulated by a velocity rescaling thermostat116 with relaxation time of 10 ps.
The results for the radius of gyration as a function of salt concentration are shown in Fig. 1(b) compared to the results of the previous model104 and experimental data.20 The new model presents not only a more accurate radius of gyration values across the tested monovalent salt concentration range, especially between 50 and 150 mM, but also a better correlation with the experimentally determined values.20 In particular, both the new model and the experiment show the minimum radius of gyration at around 50 mM salt concentration. This is a considerable improvement, as our previous model with harmonic DNA-histone core bonds only presents a monotonically increasing radius of gyration and systematically lower values.
The DNA unwrapping free energy is estimated between close and open states of DNA arms of a single NCP, as shown in Fig. 1(c). We use the end-to-end distance of DNA as the reaction coordinate and compute the potential of mean force (PMF) using the umbrella sampling approach using the SSAGES package.117 The sampling is performed over 32 equidistant windows along the DNA end-to-end vector (supplementary material, Fig. S1) from the center of mass (COM) of the first ten beads (4 bp) to the COM of the last ten beads. The restraint strength is set to 0.02 kCal/mol/Å2. The monovalent salt concentration is kept at 150 mM with equal amounts of Na+ and K+ ions. The resulting PMF is shown in Fig. 1(c). In the open state, there is about 30 bp of double-helical DNA on each side detached from the histone core, and the end-to-end distance corresponds to about 20 nm. The PMF profile shows two relatively low force regions along the reaction coordinate, at ∼15 and ∼19.7 nm, corresponding to the opening of the two DNA arms. The free energy difference between DNA wrapped and unwrapped states was found to be about 10kBT, in good agreement with ∼9kBT measured in force-extension experiments.118,119
With the above-mentioned characterization, we are satisfied with the new NCP model and optimized parameters for the extensible DNA-core bonds in its capacity to reproduce single NCP conformations. In the following, this NCP model is incorporated into a nucleosome array model to study conformations of the nucleosome array and NCP–NCP interaction in the context of a nucleosome array.
C. Building the 12–177 nucleosome array model
The improved NCP model allows us to build a nucleosome array model that incorporates both experimental data and interactions derived from atomistic simulations. Here, we built a CG model of a 12-mer NCP array with 30 bp linker DNA, i.e., a 12–177 nucleosome array. It is constructed by connecting the developed CG NCP model containing 147 bp of DNA with 30 bp CG DNA segments. In some simulations, we used a 167 bp nucleosome array with linker DNA consisting of 20 bp.
In preliminary test simulations, we found that the linker DNA, especially the segments directly connected to nucleosomal DNA, is attracted to the histone core, resulting in a bend of DNA near the dyad. We attribute this artifact to the modeling of Coulombic interaction in our model, which is described with a uniform dielectric constant equal to that of water. We decided to treat the nonbonded interaction between this part of DNA and the histone core similarly to nucleosomal DNA by introducing extensible bonds with the nucleosomal core. Such DNA-core bonds were introduced between 10 bp of linker DNA immediately connected to nucleosomal DNA and the histone core. The connection is determined in the same way for nucleosomal DNA as described in Sec. II A, although the bond strength of all these bonds is set to zero. These bonds serve to modulate only the nonbonded interaction between the linker DNA and histone core.
D. Simulations of the 12–177 nucleosome array
It is known that the nucleosome array undergoes salt-induced folding under strong electrostatic screening of cations. We performed simulations of a 12–177 nucleosome array under various salt conditions to further validate our model and study the conformational characteristics of the nucleosome array. The simulations were carried out with sodium chloride (NaCl) or magnesium chloride (MgCl2) solutions as modeled in our previous studies,104,113 i.e., one bead per ion with IMC-derived potentials. The simulations are performed with a single nucleosome array in cubic boxes, which are big enough to accommodate the nucleosome array without contact between the array and its images. In most simulations, the box size was 100 nm, except for the lowest NaCl concentration case, where a simulation box with 200 nm sides was used to accommodate the highly extended nucleosome array. The detailed compositions of all simulations are listed in the supplementary material, Table S1. As in our previous studies, Hamiltonian dynamics with a 5 fs time step is employed for sampling.
In cases when we observed the folding of the nucleosome array to a compact structure, we carried out additional annealing simulations to provide better sampling of the array conformations. In the annealing simulations, the temperature was repeatedly raised to between 450 and 500 K to achieve optimal sampling efficiency. All equilibrium simulations were performed at 310 K. The high temperature used in simulated annealing is between 450 and 500 K. We follow the same simulated annealing procedure as in our previous study,104,110 where the system is heated from equilibrium temperature to high temperature in 2 ns. The system is then kept at a high temperature for 16 ns before being cooled down to 310 K in 2 ns. The subsequent simulation is conducted for 50–120 ns, completing a simulated annealing cycle. The simulated annealing cycle is repeated as many times as we can afford to make sure that the sampling of NCP configurations is sufficient. All simulations are performed under the canonical ensemble.
E. Calculation of sedimentation coefficients
The partial specific volumes of amino acid residues can be applied directly, as in the HullRad method, with each residue being treated as one particle. We do not have a sequence identity in our DNA model. An averaged partial specific volume is used for DNA particles, i.e., the sum of the volumes of the A, T, G, and C nucleotides divided by their total mass.
We assume that in the presence of Mg2+ ions, the bound ions (within 1 nm) are tightly associated with the nucleosome array. Hence, the total molecular weight of the nucleosome array, as well as the bound Mg2+ ions, is used when determining the sedimentation coefficient.
III. RESULTS AND DISCUSSION
A. Angles between linker DNA
To validate the applicability of our model to an array of connected NCPs, we computed the distribution of the angle between the linker DNA in a 167 bp nucleosome array and compared the result to the experimental data measured in a cryo-tomography study.121 The simulation is carried out with a single 167 bp NCP in a 35 nm box, with a monovalent ion concentration equal to 150 mM with equal numbers of K+ and Na+ ions.
The 10 bp DNA on each end is considered as linker DNA, compared to 147 bp NCP. The principal axis of the linker DNA D beads pointing away from the histone core is taken as the direction of the linker DNA, in which the angle between the linker DNA is calculated. Three types of angles were considered: one direct angle between linker DNA [denoted θ, Fig. 2(a)] and two projections of this angle corresponding to the face view [θF, Fig. 2(b)] and side view [θS, Fig. 2(c)] of it. The distributions of these values under equilibrium simulation are shown in Fig. 2.
Distribution of linker DNA angles calculated for the CG model of 167 bp nucleosome in salt solutions (see text). (a) Simple angle between two arms of the 10 bp linker DNA of the 167 bp nucleosome (θ). (b) and (c) Angles between projections of the linker DNA on the planes corresponding to the top (b, θF) and side (c, θS) views of the nucleosome.
Distribution of linker DNA angles calculated for the CG model of 167 bp nucleosome in salt solutions (see text). (a) Simple angle between two arms of the 10 bp linker DNA of the 167 bp nucleosome (θ). (b) and (c) Angles between projections of the linker DNA on the planes corresponding to the top (b, θF) and side (c, θS) views of the nucleosome.
The probability of the angle between linker DNA, presented by our new model, increases sharply from 0 to 40° and decays smoothly to ∼130°. This probability distribution coincides exceptionally well with the experimentally estimated distribution shown by Zhang et al.121 [see Fig. 2(o) in this work]. The projection component of θ, θF, also shows excellent agreement with the experimental data in its range and slight skewness toward positive values. Another component of θ, θS, shows a somewhat wider distribution compared to the experiment. Considering the difference in linker DNA length between our model and the experiment, we conclude that our model reproduces the equilibrium conformation of linker DNA very well.
B. The 12–177 nucleosome array model reproduces experimental sedimentation results
As described in Sec. II C, our nucleosome array model is constructed by concatenating the improved NCP model with linker DNA. Here, we further validate the nucleosome array model as characterized by its sedimentation coefficient. It is an appropriate criterion as any inaccuracy in the NCP conformation or nonbonded interaction would be reflected in the overall nucleosome array extension and, in turn, sedimentation coefficient.
The simulated sedimentation coefficient agrees well with the experimental values obtained from an analytical ultracentrifugation (AUC) sedimentation velocity study,16 as shown in Fig. 3(a). In addition, our simulation data are in good qualitative agreement with the data of a recent Cryo-ET study75 reporting Mg2+-dependent folding and aggregation of native and reconstituted 12-mer arrays of different nucleosome repeat lengths.
Dependence of sedimentation coefficients of the 12–177 nucleosome array, s20,w, on NaCl (a) and MgCl2 (b) concentrations. The experimental data are from an analytical ultracentrifugation study.16 Illustrative EM images of the extended (a) and globular (b) conformations of the 12–177 arrays123 are shown together with snapshots of the similarly structured arrays generated in the CG MD simulations of this work. EM images were reprinted with permission from Springer Nature.
Dependence of sedimentation coefficients of the 12–177 nucleosome array, s20,w, on NaCl (a) and MgCl2 (b) concentrations. The experimental data are from an analytical ultracentrifugation study.16 Illustrative EM images of the extended (a) and globular (b) conformations of the 12–177 arrays123 are shown together with snapshots of the similarly structured arrays generated in the CG MD simulations of this work. EM images were reprinted with permission from Springer Nature.
Across the tested concentration range of NaCl (10–100 mM), the simulated values coincide particularly well with the experimental values. At the same time, in the presence of magnesium salt, the simulated sedimentation coefficient is slightly lower than in experiments below 1 mM [Fig. 3(b)]. We attribute this to the discrepancy in concentration between experiments and simulations due to the limited volume effect in simulations. Particularly, for simulation data, we report the average ion concentration in the simulation box, which can differ from experiments with a much larger bulk volume. Especially in the case of low ionic concentration, a significant portion of cations are in contact with the array due to adsorption, which results in higher inaccuracy in effective ion concentration. We have discussed this scenario in our previous work.112 In that work, we carried out simulations of 20 NCPs in a smaller box size (70 nm compared to 100 nm here) for a range of average Mg2+ concentrations in the simulation box at 3–17 mM, which we analyzed by computing RDFs and employing a Langmuir model. Other approaches have been suggested to deal with this issue.122 We found the mean simulation box concentrations to be corresponding to estimated “bulk” concentrations in the range of 0.6–11 mM. This “Donnan” effect of depleting the bulk concentration is likely to be lower for the present array system with a smaller number of nucleosomes (twelve) that are connected with linker DNA. In addition, our simulations are made for a larger bulk due to a larger box size (a factor of three larger volume), which significantly reduces the “Donnan” effect. In the limit of very large box size, the bulk and mean box size concentrations, of course, are the same. The present simulations covering average concentrations of 0.2–10 mM are thus expected to cover the experimental range of sedimentation velocity measurements that we compare to (made at bulk concentrations in the range of 0.2–1 mM). This justifies the present qualitative comparison of the effects of explicit salt on sedimentation coefficients in the simulations.
We note that the maximum value of the sedimentation coefficient from the simulation is more significant than that from the experiment for both ions. This is most likely due to the limitation of the AUC method, where the most highly compacted nucleosome array is challenging to characterize since elevated cation concentrations eventually lead to array aggregation (oligomerization) before the most compacted array state is reached.
It is worth noting that the experimental data on angles between linkers DNA and for sedimentation coefficients were not used in the parameterization of the extensible bonds between DNA and histone core, and all other interactions were strictly derived from the atomistic model by the IMC approach. The observed agreement with the experiment for these properties provides additional validation of our model and gives confidence that it can be used for the prediction of other properties of nucleosome arrays.
C. Nucleosome array compaction stages
In our simulations, the discrete nucleosome array exhibits a library of conformations under various ionic conditions. Generally, both the Na+ and Mg2+ ions induce the compaction of the nucleosome array at elevated concentrations, albeit Mg2+, being more potent in electrostatic screening, does so at significantly lower concentrations. Here, we analyze and discuss the array conformations at the nucleosome level in order to provide insights that are meaningful to understanding chromatin at the mesoscale.
We use the inter-nucleosome distances to characterize array conformations. With our 12-nucleosome array, 66 nucleosome pairs exist that are used as features of array conformations. Among all nucleosome pairs, the distances between the pairs i and i + 1, i.e., between adjacent nucleosomes, are primarily determined by the length and flexibility of the linker DNA. Therefore, we use distances between non-adjacent nucleosomes, which is a 55-dimensional vector, to characterize nucleosome array conformation.
With all of the sampled nucleosome array conformations, we first performed principal component analysis (PCA) on the conformation vector (Fig. 4). We found that the first two principal components (PCs) account for about 81% of the conformational variance. In the PC1-PC2 plane, the sampled conformations of the nucleosome arrays are dispersed well, as shown in Figs. 4(a) and 4(b), presenting the results calculated for the systems with various concentrations of Na+ [Fig. 4(a)] and Mg2+ cations [Fig. 4(b)]. It implies that one can analyze nucleosome array conformation relying on just these first two PCs. Merging the data from all equilibrated trajectories, we identified four main conformational clusters corresponding to four typical and distinctly different conformations [Fig. 4(c)]. These conformations of the 12–177 array can be termed as extended, flexible, clutched, and globular. Representative snapshots illustrating the four main structural states of the nucleosome array are shown in Fig. 4(d).
(a) and (b) Results of the PCA analysis in the system with various concentrations of MgCl2 (a) and NaCl (b). (c) Grouped conformations of the nucleosome arrays in all simulations spanned out on the plane of the first two principal components from the PCA analysis based on nucleosome–nucleosome distances. Conformation points are colored according to the K-means clustering result. (d) Snapshots of the 12–177 CG array illustrating its four structural states observed at various salt concentrations. The chosen structures are close to the centroids of the corresponding PCA clusters.
(a) and (b) Results of the PCA analysis in the system with various concentrations of MgCl2 (a) and NaCl (b). (c) Grouped conformations of the nucleosome arrays in all simulations spanned out on the plane of the first two principal components from the PCA analysis based on nucleosome–nucleosome distances. Conformation points are colored according to the K-means clustering result. (d) Snapshots of the 12–177 CG array illustrating its four structural states observed at various salt concentrations. The chosen structures are close to the centroids of the corresponding PCA clusters.
The first PC from the PCA analysis accounts for most of the conformational variations. It is highly correlated to nucleosome array extension, as shown by positive transformation coefficients on all dimensions (shown in the supplementary material, Fig. S2). There is a clear trend that far neighbor nucleosomes contribute more to PC1 than near neighbors, although all distances are positively contributing to PC1. On the other hand, PC2 is characterizing local compaction, as can be seen in the coefficients (supplementary material, Fig. S2) that only long-range neighbors (nucleosome i to ) contribute positively to PC2, whereas near neighbors contribute negatively to PC2.
Along PC1, we have conformation of clusters that are tightly compacted (low PC1 value) and extended (high PC1 value). There are clearly a few outstanding conformational groups shown in the PC1-PC2 plane [Fig. 4(c)]. The K-means clustering algorithm is employed to group the sampled conformations into the four clusters in the PC space. From the snapshots closest to the centroid of each cluster shown in Fig. 4(d), we can distinguish the conformational characteristics of each cluster. From the cluster at the lowest PC1 value [black in Fig. 4(c)], we see the most compacted conformation where all nucleosomes are in direct contact. We refer to this as the “globular cluster” conformation. In the less compacted cluster where PC1 is ∼40 (magenta), partial compaction is observed, where the nucleosomes are compacted into “clutches.” The last two clusters with higher PC1 values both correspond to extended array conformations. However, they differ in the overall length of the nucleosome array. The cluster colored in dark green shown in Fig. 4(c) is made up of conformations from simulations with 0.2 and 0.5 mM Mg2+ ions [Fig. 4(a)], where the DNA charge is more significantly neutralized. A flexible nucleosome array is observed under such ionic conditions. At the low 10 mM NaCl concentration, where charge screening is minimal, the array is fully extended [light green in Fig. 4(c)]. Repulsion between negative charges administers array extension with the highest PC1 value among all clusters.
The divalent Mg2+ is a potent cation inducing compaction of the nucleosome array through a charge-bridging effect. The degree of compaction is highly correlated with ion concentrations. As shown by the radial distribution functions of the cation in the supplementary material, Fig. S3, more cations are in close contact with DNA under higher ion concentrations, participating in charge screening.
The difference in nucleosome array compaction among clusters can be seen in the inter-nucleosome distance distributions shown in Fig. 5. From the nucleosome–nucleosome radial distribution function (RDF) shown in Fig. 5(a), we see a clear compaction of the nucleosomes in the globular clusters and cluster with clutches, illustrated by the high RDF peaks. However, fully extended and flexible nucleosome arrays show the highest peak at ∼21 nm, which is equal to the distance between adjacent nucleosomes, suggesting an extended array conformation. In particular, the globular cluster displays a more compact conformation than those with clutches, manifested by a higher RDF and a higher probability at a shorter distance up to i–i + 3 neighbors [Figs. 5(b)–5(d)]. In the flexible and extended arrays (dark and light green in Fig. 5, respectively), the fully extended array (light green) displays longer nucleosome–nucleosome distances across the board.
Structure of the 12–177 array characterized by nucleosome–nucleosome distribution of the distances within four populations of the arrays with different degrees of folding. (a) General distribution of the core–core distance. (b)–(d) Profiles of the distribution between (b) i and i + 1, (c) i and i + 2, and (d) i − i + 3 nucleosomes.
Structure of the 12–177 array characterized by nucleosome–nucleosome distribution of the distances within four populations of the arrays with different degrees of folding. (a) General distribution of the core–core distance. (b)–(d) Profiles of the distribution between (b) i and i + 1, (c) i and i + 2, and (d) i − i + 3 nucleosomes.
In summary, we sampled nucleosome array conformations, which are manifested in the presence of two species of cations (Na+ and Mg2+ ions), each with variable concentrations. Under all ionic conditions, we see four stages of compaction, from globally compact to fully extended. Each stage exhibits distinctive structural features in terms of nucleosome–nucleosome distance and conformation.
D. Nucleosomes are aligned in chromatin fibers
The order parameter of each compaction stage is shown in Fig. 6. We first notice that at a short range closer than 10 nm, where nucleosomes could have direct contact, the order parameter is high due to alignment enforced by steric hindrance. The peak values of the order parameter are in reverse order of nucleosome compactness, most likely due to stronger dynamic hindrance under high ionic concentration. In the distance range between 10 and 15 nm, we see a negative-order parameter in the reverse order of the first peak. This distance corresponds to the face-to-side, perpendicular orientation of the contacting nucleosome. The order is in reverse order of the first peak, indicating less dynamic hindrance in less tight compaction. In the fully extended conformation, although, there are no sampled data points in the contacting range; nucleosomes tend to align with each other when coming close. Hence, there are no data points for distances below 14 nm. This correlation of orientation is explained by electrostatic interactions at low salt with nucleosomes displaying repulsion. The long-range order parameter converges to zero for all conformations. We propose, however, that the reasons differ in compacted and extended conformations. In compacted conformations, the lack of correlation starts from the distance of ∼18 nm, which is due to steric and dynamic hindrance resulting from nucleosome compaction. In the fully extended conformation, the order parameter presents a plateau between 25 and 38 nm. This can only be explained by the torsional restriction introduced by linker DNA. Hence, the lack of correlation of orientation in extended conformation is the result of the long-range decay of nucleosome array torsional correlation.
Inter-nucleosome order parameter as a function of nucleosome–nucleosome distance for four conformations of the array with different degrees of folding in the range of distances between 6.5 and 22 nm. The order parameter characterizes nucleosome–nucleosome orientation with 1.0 and −0.5 values corresponding to parallel and perpendicular alignments, respectively, of the nucleosome cylinders; Sord(r) = 0 indicates lack of regular order.
Inter-nucleosome order parameter as a function of nucleosome–nucleosome distance for four conformations of the array with different degrees of folding in the range of distances between 6.5 and 22 nm. The order parameter characterizes nucleosome–nucleosome orientation with 1.0 and −0.5 values corresponding to parallel and perpendicular alignments, respectively, of the nucleosome cylinders; Sord(r) = 0 indicates lack of regular order.
IV. CONCLUSIONS
In this study, we investigate the conformational characteristics of chromatin fiber under varied degrees of compaction. The compaction of the nucleosome array is sampled with CG simulations based on a physically sound nucleosome array model and applying varied ionic conditions. Our simulation model includes explicit ions, which is crucial in order to correctly describe and capture the electrostatic interactions in this strongly coupled system. First, we improve our previously developed isolated NCP model, which has been shown to describe the macroscopic salt-dependent phase separation of NCPs.104 This model was used and improved for the application to the nucleosome array to better reproduce inter-nucleosome interactions by incorporating DNA breathing motion. Subsequently, a nucleosome array model is built by connecting the NCPs in this model with linker DNA. This chromatin fiber model is shown to accurately reproduce nucleosome array conformations from low to high ionic concentrations, as characterized by linker DNA angle distributions and sedimentation coefficients. Finally, we show that as a function of ion concentration, the nucleosome array exhibits four distinct but dynamic chromatin fiber compaction stages, from a fully extended array to a globally compacted array. In all conformations, nucleosomes are aligned as a result of the torsional restriction of linker DNA. These conformational stages are observed in the presence of either sodium or magnesium ions, suggesting that the observed conformational change is not dependent on cation species.
The encouraging results of this approach can be extended with a second step of coarse graining to develop a “super-CG” chromatin fiber model similar to the super-CG NCP model successfully used in our previous work.104 This would enable mesoscale simulations of a system of a very large number of arrays to model chromatin phase separation.124
SUPPLEMENTARY MATERIAL
The supplementary material includes data for the simulated systems composition, figures illustrating the potential of mean force of DNA unwrapping, principle component analysis, and cation - DNA phosphates RDFs.
ACKNOWLEDGMENTS
The authors thank the National Supercomputing Center (NSCC) of Singapore for its computing resource support. A.P.L. acknowledges the support from the Swedish Research Council (Vetenskapsrådet) under Grant No. 2021-04474, and the computational resources provided by the Swedish National Infrastructure for Computing (SNIC) through the National Supercomputer Centre, Linköpings Universitet (NSC). L.N acknowledges the support provided for this work by the Ministry of Education (MOE), Singapore Academic Research Fund Tier 3 (Grant No. MOE2019-T3-1-012).
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
T.S., N.K., A.P.L., and L.N. designed the research. T.S. performed the research. T.S., N.K., A.P.L., and L.N. wrote the manuscript.
Tiedong Sun: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Software (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal). Nikolay Korolev: Conceptualization (equal); Data curation (equal); Formal analysis (equal); Investigation (equal); Methodology (equal); Validation (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal). Alexander P. Lyubartsev: Conceptualization (equal); Formal analysis (equal); Methodology (equal); Software (equal); Validation (equal); Writing – review & editing (equal). Lars Nordenskiöld: Conceptualization (equal); Formal analysis (equal); Funding acquisition (equal); Methodology (equal); Project administration (equal); Resources (equal); Supervision (equal); Validation (equal); Writing – review & editing (equal).
DATA AVAILABILITY
An archive of files with a brief manual on how to set up and run simulations within this model uploaded to the Zenode repository is available (DOI: 10.5281/zenodo.13918365). Additional data that support the findings of this study are available from the corresponding authors upon reasonable request.