We analyse the combinatorial aspect of global optimisation for multicomponent systems, which involves searching for the optimal chemical ordering by permuting particles corresponding to different species. The overall composition is presumed fixed, and the geometry is relaxed after each permutation in order to relieve local strain. From ideas used to solve graph partitioning problems we devise a deterministic search scheme that outperforms (by orders of magnitude) conventional and self-guided basin-hopping global optimisation. The search is guided by the energy gain from either swapping particles *i* and *j* (Δ*E*_{ij}) or changing the identity of particles *i* (Δ*E*_{i}). These quantities are derived from the underlying (arbitrary) energy function, hence not constituting external bias, and for site-separable force fields each Δ*E*_{i} can be approximated simply and efficiently. In our self-guided variant of basin-hopping, particles are weighted by an approximate Δ*E*_{i} when randomly selected for an exchange, yielding a significant improvement for segregated multicomponent systems with modest particle size mismatch.

Predicting the structure of multicomponent systems is a general problem that is ubiquitous in materials science. For instance, the capability to predict and manipulate the structure of alloyed metallic nanoparticles (*nanoalloys*) is crucial for applications in catalysis and plasmonics.^{1} Another contemporary example is the rational design of functional mesoscopic structures formed from colloidal pseudoatoms,^{2} which is also founded upon reliable structure prediction. Theory and simulation have been productive in complementing experimental investigations of both nanoalloys and colloidal clusters, but the global optimisation procedures required for the identification of (meta)stable structures in a multicomponent model system are always complicated by the combinatorial number of arrangements that may be possible for each underlying framework. In the present contribution we analyse this complication and demonstrate how it can be dealt with more effectively by applying ideas previously used to solve graph partitioning problems.

Global optimisation for multicomponent systems generally can be separated into two distinct and complementary stages. One stage is focused on global geometry optimisation, and it involves the use of appropriate *displacement* moves—operations that alter the coordinates of at least one constituent particle without affecting the associated labels. Here we focus on the other stage, intended to find an optimal permutational isomer (*homotop*^{3}) for a particular geometrical motif. The search typically relies on *exchange* moves—operations that interchange the labels of two particles without immediately affecting the coordinates, thus performing a function that is complementary to that of displacement moves. A common approach to homotop search is to use basin-hopping (BH),^{4,5} where every *randomly* selected exchange is followed up by a quench that facilitates strain relaxation.^{6} With some prior knowledge of the system, the random exchange trials can be weighted in a way that biases the stochastic search and improves the overall efficiency.^{7} Here we will show that a *deterministic* algorithm based on Kernighan and Lin's (KL)^{8} heuristic for partitioning graphs can be a more effective approach, and it relies solely on the underlying (arbitrary) energy function. The heuristic is similar to the iterated local search (ILS) proposed by Lai *et al.*,^{9} but with a number of subtle and important differences. We will also show how BH (with random exchanges) can be made *self-guided* by weighting individual particles in a way that does not constitute an external bias, and for site-separable potentials the weights can be approximated in a simple and efficient manner. For the sake of brevity we shall restrict the discussion that follows to binary systems, but all the underlying ideas can be applied to ternary and more complicated systems.

For a binary cluster with *N*_{A} particles of type *A* and *N*_{B} = *N* − *N*_{A} of type *B*, each geometry might support up to *N*!/(*N*_{A}! × *N*_{B}!) distinct homotops. This number is often reduced by symmetry, but an exhaustive search for clusters with *N* ≳ 35 is still practically unfeasible. A Markovian random search without any guidance or bias is also not ideal, because it could require even more than *N*!/(*N*_{A}! × *N*_{B}!) iterations due to multiple visits to the same configuration. On the other hand, it is clear that at most min (*N*_{A}, *N*_{B}) exchanges should ever be needed to reach one homotop from another. This observation implies that, if the most favourable homotop were known, it could be obtained from any other permutational isomer by exchanging no more than min (*N*_{A}, *N*_{B}) unlike pairs. Hence, a perfect search strategy would be the one that never requires more than min (*N*_{A}, *N*_{B}) swaps to find the best homotop. Also note that at every instance there are only *N*_{A} × *N*_{B} possible choices for a swap. Determining the “gain” (to be defined) associated with every possibility and using this information to guide the search becomes a feasible strategy. In the ideal case where swaps provide perfect guidance, the maximum number of possibilities that will need to be considered is

Viewing a particle exchange as the basic step in combinatorial optimisation naturally leads to the notion of a *locally optimal* solution: it is one that cannot be improved by any single swap. Determining the number of locally optimal homotops would be a non-trivial task, but our intuition suggests that the number should be significantly smaller than *N*!/(*N*_{A}! × *N*_{B}!). It is also clear that the best (*globally optimal*) homotop must belong to this smaller set. Hence, there is potentially great utility in methods that can efficiently find a local optimum with respect to pair swaps, but not necessarily produce the global optimum. A powerful heuristic procedure due to Kernighan and Lin^{8} (KL) achieves precisely this result.

The KL heuristic was intended as a rapid search for locally optimal partitions of an arbitrary graph, and the procedure involves deterministic swapping of individual nodes between subsets of given size. The optimality of a partition is measured by a specified “cost” function. Here we represent a multicomponent cluster as a generalised graph: the particles correspond to mobile nodes and the energy is the cost function, which does not have to be pairwise additive, and the graph edges need not be well-defined. The probability (*p*) that an optimal solution found by the KL procedure is globally optimal diminishes with system size (*N*), and the empirically inferred scaling is approximately^{8} *p*(*N*) = 2^{−N/30}. Assuming the same scaling applies to heterogeneous materials, for a 30-particle binary cluster at 50/50 composition we would expect the KL procedure to find the global minimum with *p* ∼ 0.5. Here we shall not focus on the scaling with system size, but rather consider all possible compositions for one particular size to demonstrate the effectiveness of the KL procedure.

The KL heuristic starts from an arbitrary partition of particles into two indexed subsets

*a*

_{i}and

*b*

_{j}is a label permanently fixed to a particle, and the corresponding potential energy is

*E*. After initialising

*E*

_{O}←

*E*,

*C*to

*FALSE*, the procedure iterates according to the structure outlined in Algorithm 1. The algorithm is guided by the quantities Δ

*E*

_{ij}, which we refer to as

*swap gains*, each corresponding to the change in energy due to particles

*i*and

*j*trading places. The lists

*pass*(the

**do**loop); whenever two particles are exchanged, their labels are removed from

*E*

_{O}in an entire pass, guaranteeing that the final

while C is FALSE: |

Update C to TRUE. |

Set $\mathcal {A}^{\prime } \leftarrow \mathcal {A}$ $A\u2032\u2190A$ and $\mathcal {B}^{\prime } \leftarrow \mathcal {B}$ $B\u2032\u2190B$. |

do min (N_{A}, N_{B}) times: |

Pick $a_{I}^{\prime } \in \mathcal {A^{\prime }}$ $aI\u2032\u2208A\u2032$ and $b_{J}^{\prime } \in \mathcal {B^{\prime }}$ $bJ\u2032\u2208B\u2032$ with the lowest ΔE_{IJ}. |

Remove $a_{I}^{\prime }$ $aI\u2032$ from $\mathcal {A}^{\prime }$ $A\u2032$ and $b_{J}^{\prime }$ $bJ\u2032$ from $\mathcal {B}^{\prime }$ $B\u2032$. |

Move a_{I} to $\mathcal {B}$ $B$ and b_{J} to $\mathcal {A}$ $A$. |

Update the total energy E. |

if E < E_{O} then: |

Update E_{O} ← E, $\mathcal {A}_{O} \leftarrow \mathcal {A}$ $AO\u2190A$ and $\mathcal {B}_{O} \leftarrow \mathcal {B}$ $BO\u2190B$. |

Reset C to FALSE. |

end if. |

end do loop. |

end while loop. |

while C is FALSE: |

Update C to TRUE. |

Set $\mathcal {A}^{\prime } \leftarrow \mathcal {A}$ $A\u2032\u2190A$ and $\mathcal {B}^{\prime } \leftarrow \mathcal {B}$ $B\u2032\u2190B$. |

do min (N_{A}, N_{B}) times: |

Pick $a_{I}^{\prime } \in \mathcal {A^{\prime }}$ $aI\u2032\u2208A\u2032$ and $b_{J}^{\prime } \in \mathcal {B^{\prime }}$ $bJ\u2032\u2208B\u2032$ with the lowest ΔE_{IJ}. |

Remove $a_{I}^{\prime }$ $aI\u2032$ from $\mathcal {A}^{\prime }$ $A\u2032$ and $b_{J}^{\prime }$ $bJ\u2032$ from $\mathcal {B}^{\prime }$ $B\u2032$. |

Move a_{I} to $\mathcal {B}$ $B$ and b_{J} to $\mathcal {A}$ $A$. |

Update the total energy E. |

if E < E_{O} then: |

Update E_{O} ← E, $\mathcal {A}_{O} \leftarrow \mathcal {A}$ $AO\u2190A$ and $\mathcal {B}_{O} \leftarrow \mathcal {B}$ $BO\u2190B$. |

Reset C to FALSE. |

end if. |

end do loop. |

end while loop. |

Remarkably, for all the weighted graphs considered by Kernighan and Lin (and all the model systems considered below), no more than four passes were ever required to converge on a locally optimal (or near optimal) partition.^{8} This upper bound does not appear to be sensitive to system size, resulting in the total number of scanned exchanges scaling as *N*^{2}log *N*.^{8} Note that it is the removal of particles from contention that improves the scaling from *N*^{3} to *N*^{2}log *N*, because during each pass the number of swap possibilities is not fixed but rapidly diminishes from *N*_{A} × *N*_{B} to |*N*_{A} − *N*_{B}|. It is also important to note that swaps with positive as well as negative Δ*E*_{ij} can occur, which allows the system to escape from relatively shallow minima, thus improving the chances that the final partitioning is globally optimal. These two key features distinguish the KL heuristic from the “greedy” ILS of Lai *et al.*^{9}

The computational bottleneck in the KL procedure is the selection of an exchange pair, which requires scanning all the possibilities to ensure the execution of the best swap every time. For weighted graphs with a pairwise-additive cost function the calculation of every Δ*E*_{ij} is relatively straightforward, and the total cost *E* can be updated very efficiently.^{8} For an *N*-body system with a general energy function the bottleneck may be more severe. Most (semi-)empirical potentials for nanoalloys are not pairwise additive, in which case calculating each Δ*E*_{ij} may necessitate recomputing the total energy, and all of these computations will typically scale as *N*^{2}. Furthermore, exchanging two dissimilar particles is likely to result in undesirable local strain that should be relieved, and relaxing the geometry after every exchange is computationally expensive. To reduce this particular overhead in their ILS algorithm, Lai *et al.*^{9} exploit a correlation between

*before*the relaxation and Δ

*E*

_{ij}

*after*the relaxation, and they elect to rank all candidate pairs solely based on

*E*

_{ij}gets worse (e.g., lattice mismatched systems).

^{10}

We now define the *flip gain* (analogous to the “cell gain” of Fiduccia and Mattheyses^{11}), quantified by Δ*E*_{i}, the energy change due to a single particle *i* switching type (i.e., “flipping”). There are only *N* possibilities for a flip at any given instance: fewer than the number of unlike pairs, unless min (*N*_{A}, *N*_{B}) ⩽ 1. The computational cost of selecting exchange candidates can be reduced by considering all the *N* flip gains instead of all the *N*_{A} × *N*_{B} swap gains. Fiduccia and Mattheyses^{11} exploit this opportunity: they follow the same algorithm as in Algorithm 1, but each choice of

*E*

_{I}and each

*E*

_{J}. Although this strategy improves the scaling with system size, it can compromise the quality of the final solution. The problem is that particles

*I*and

*J*with minimal respective flip gains (Δ

*E*

_{I}and Δ

*E*

_{J}) will not necessarily yield the lowest possible swap gain (Δ

*E*

_{IJ}). In general, one cannot immediately infer the exact energy change resulting from a swap based on flip gains alone, especially if the potential is not pairwise additive and the two particles interact strongly with each other. Furthermore, relying on flip gains for guidance can result in desirable swaps being systematically missed, causing the algorithm to converge on a “good” solution that is not actually locally optimal. Note that one can select swap candidates based on flip gains calculated either in the same instance or in two successive instances. In the latter case,

*N*

_{A}and

*N*

_{B}are not strictly conserved, but are shifted by ±1 midway through every swap, always reverting to the original values at the conclusion of each swap. Sequential updating guarantees precise knowledge of the resulting change in total energy

*before*a flip is executed: the change is always the corresponding flip gain. Of course, one still has to decide which particle type to flip first, and this choice may also affect the quality of the final solution.

If locating the globally optimal homotop is not the ultimate goal, but rather the aim is to find a homotop that is (in some sense) *almost* globally optimal, then relying on flip gains to select swap candidates would be the simplest strategy. By analogy with swap gains, for multicomponent clusters it is useful to distinguish between *approximate* flip gains (

*before*geometry relaxation and

*exact*flip gains (Δ

*E*

_{i})

*after*the relaxation. Since most empirical potentials are site-separable, the total potential energy can be expressed as a sum over particles:

^{12}):

- KL
The heuristic in Algorithm 1 with

$a_{I}^{\prime }$$aI\u2032$ and$b_{J}^{\prime }$$bJ\u2032$ picked sequentially according to the minimum estimated flip gain, without mid-swap relaxation. The particle type with the lowest estimated flip gain was always picked first. When the search converged, it was restarted with another random initial condition. - BH*
Self-guided basin-hopping with particles weighted by

$\exp (- \Delta E_{i}^{*}/k_{B}T)$$exp(\u2212\Delta Ei*/kBT)$ in the random swap selection. - BH
Standard basin-hopping with all particles weighted uniformly (i.e., random swaps with fair selection).

Although the general idea behind BH* is similar to “tailored” moves,^{7} the intention is not to impose any external bias based on chemical intuition, but rather use the underlying energy landscape to make the search self-guided.

Our model system involves 22-particle binary Lennard-Jones clusters. This size was chosen because the global minimum structure^{5} in the homogeneous case has relatively low symmetry (*C*_{s}), and all the homotops for every composition can be enumerated to confirm that the best homotop has indeed been found. The Lennard-Jones potential was chosen for its simple and generic form:

where α, β ∈ {*A*, *B*} specify the type of particles *i* and *j*, and *r*_{ij} is the distance between *i* and *j*. The parameters ε_{αβ} (the well depth) and σ_{αβ} (particle radius) must also be specified, and the values considered in the present work never led to significant departure from the geometry of the homogeneous global minimum.^{5} Starting from this structure, we first specify a composition and perform an extensive combinatorial search (10^{6} swaps for BH and BH*, and 100 restarts for KL) using a particular method, with the geometry relaxed after every swap. Once the search has completed, we set the best encountered structure as a target, and then perform a hundred more searches using the same method, each terminating whenever the target is hit. All of the hundred searches were initialised with a different, randomly chosen homotop. The entire procedure was carried out for all compositions (1 ⩽ *N*_{A} ⩽ 21), and the search was performed using each of the three aforementioned schemes. The approximate flip gains were forecast in a single function call with transformed parameter matrices

and likewise for

*before*geometry relaxation) if it were (alone) to switch type. Subtracting the current per-particle energies

*E*

_{i}(computed with ε

_{αβ}and σ

_{αβ}) will yield approximate flip gains:

The mean first encounter statistics accumulated for three different parameter sets are shown in Fig. 1. The parametrisation considered in Fig. 1(a) is intended to represent a segregated system without lattice mismatch, in which case BH* and KL both require up to two orders of magnitude fewer swaps (and quenches) to find the global minimum. It is important to note that the performance of BH and BH* is sensitive to the temperature parameter in the accept/reject condition, which may require some tuning to achieve optimal performance, whereas KL has no adjustable parameters.

In Fig. 1(b) we again consider a lattice-matched system, but with well-depths set to encourage mixing. For this case KL outperforms both BH* and BH when min (*N*_{A}, *N*_{B}) ≳ 5, i.e., the most challenging cases. Note that for *N*_{A} = 10 and *N*_{A} = 11 not all (99% and 98%) of the KL runs found the global minimum within the specified upper bound of 100 restarts. Still, there is convincing evidence for KL outperforming the two variants of basin-hopping by two orders of magnitude.

The final parameter set considered in Fig. 1(c) represents a system of size-mismatched particles that are otherwise (chemically) equivalent. Note that the level of mismatch was chosen to ensure significant strain yet not alter the geometrical motif of the global minimum.^{13,14} Again, KL performs better than BH and BH*, especially for compositions near 50/50. However, for one data point in Fig. 1(c) (indicated by an arrow) KL failed to find the globally optimal solution in the initial search for a target. This result prompted us to repeat the run with more restarts, and we found that it located the global minimum only 6% of the time. In contrast, using exact flip gains leads to a hit-rate of 70%. This result exemplifies a potential pitfall: lattice mismatch can weaken the correlation between Δ*E*_{i} and

^{10}A detailed analysis of this issue and more benchmark calculations are in progress.

To summarise, we have outlined a deterministic scheme based on the Kernighan-Lin^{8} heuristic and shown that it can outperform stochastic methods in combinatorial optimisation for binary clusters of 22 particles. We have also proposed a self-guided variant of basin-hopping, which improves conventional basin-hopping without relying on any external bias. We anticipate that both these methods will be effective for combinatorial optimisation of a wide range of multicomponent clusters, ranging from the atomic to the mesoscopic length scale. Hence, this approach should expand the capability of theory and simulation to make efficient predictions and establish design principles for heterogeneous materials.

This work was financially supported by the EPSRC Grant No. EP/J010847/1. D.S. gratefully acknowledges Dr. Jacob Stevenson for his valuable comments, which initiated our investigation of graph partitioning.