Anomalous diffusion is characterized by its asymptotic behavior for *t* → ∞. This makes it difficult to detect and describe in particle trajectories from experiments or computer simulations, which are necessarily of finite length. We propose a new approach using Bayesian inference applied directly to the observed trajectories sampled at different time scales. We illustrate the performance of this approach using random trajectories with known statistical properties and then use it for analyzing the motion of lipid molecules in the plane of a lipid bilayer.

Anomalous diffusion has been observed in a wide range of physical systems and has also been the subject of theoretical investigations.^{1–4} The term “anomalous” refers to a non-linear (asymptotic) growth of the mean square displacement (MSD) of the diffusing particles with time, and thus to a deviation from Einstein’s classical diffusion model.^{5} Most anomalous diffusion processes display subdiffusion, where the MSD grows like 2*D*_{α}*t*^{α}, with 0 < *α* < 1 for large *t*, and *D*_{α} is the fractional diffusion constant.

A central problem in the study of anomalous diffusion is the estimation of the parameters *D*_{α} and *α* from experimental or simulated particle trajectories. Since any observed trajectory consists of a finite set of points, it is not even obvious that it makes sense to discuss its asymptotic behavior for *t* → ∞. The most common approach is to compute a time-averaged MSD over all position pairs with identical lag time *t*, complemented by a statistical average over multiple trajectories that are assumed to be equivalent. Since there are fewer position pairs for large values of *t* than for small ones, the statistical quality of the MSD decreases with *t* although it is exactly the large-*t* data that are most relevant for the asymptotic behavior. In the case of normal diffusion, this approach works reasonably well because the characteristic linear growth of the MSD with time is easy to spot visually on a plot, but for anomalous diffusion, the identification of a potential asymptotic regime is not reliable.

Another approach is to assume a concrete model for the joined probability distribution functions of the sampled trajectories and use inference techniques to estimate the model parameters. There are several fundamentally different physical and mathematical models that give rise to the same 2*D*_{α}*t*^{α} form of the MSD for long times, which raises the question of how to choose the most appropriate one for a given dataset. Much recent work on particle trajectory analysis for subdiffusion has been dedicated to this question.^{6} Moreover, none of these models can be expected to be valid on all time scales. Diffusive dynamics gives way to ballistic motion at time scales smaller than the atomic or molecular collision times, meaning that for sufficiently small *t*, the MSD always grows as *t*^{2}.

In this work, we address these issues with a new data analysis technique that is based on two core ideas: (1) a multiscale approach in which the data are analyzed at different sampling time steps and (2) the use of Bayesian inference to obtain model parameters and their uncertainties directly from observed particle positions, without the computation of intermediate quantities such as an MSD. We will limit ourselves to descriptions of subdiffusion in molecular systems, but our method is not limited to this type of diffusion. In the first step, we explore the performance of our method for ideal trajectory ensembles generated randomly from a precisely known model. This allows us to see how the inference process converges as more input data are added, and to understand how it can be used to identify the asymptotic behavior in the presence of distinct short-time dynamics. In the second step, we analyze the lateral diffusion of lipid molecules in a computer simulation of a hydrated lipid bilayer.

The starting point of our approach is an observed particle trajectory *X*(*iδt*), *i* = 0, …, *K*. From this trajectory we extract subsamples of length *L* at sampling time step Δ*t* = *sδt*, *s* = 1, 2, …, i.e., the points *X _{j}* =

*X*(

*j*Δ

*t*),

*j*= 0, …,

*L*. We interpret these trajectories as realizations of an

*L*-step stochastic process that is defined by its probability distribution

where *ϕ*_{1}, …, *ϕ _{M}* are the numerical parameters of the model whose values we wish to estimate. For our intended application to molecular diffusion, we need a model for a continuous diffusion process that is sampled at arbitrarily chosen discrete time values. This excludes models such as the Continuous Time Random Walk (CTRW),

^{7}which describes a hopping process with random waiting times, but also mathematical models such as the autoregressive fractionally integrated moving average (ARFIMA) process,

^{8,9}which assume from the start a discrete time series and thus cannot accommodate different choices of sampling times. The simplest continuous stochastic process that gives rise to anomalous diffusion is

*fractional Brownian Motion*(fBM). Its probability distribution is given by

^{8,9}

where **X** = (*X*_{0}, …, *X _{L}*) and

are the components of the covariance matrix **Σ** of the process. For *α* = 1, fBM reduces to standard Brownian motion, whereas 0 < *α* < 1 describes subdiffusion and 1 < *α* < 2 superdiffusion. Although the choice of the fBM model is motivated by its capacity to describe asymptotic subdiffusion, we use it here as an analysis tool for *all* time scales present in the observed trajectories.

Bayesian inference is based on the idea of introducing a probability distribution not only for the data but also for the parameters of the model describing them.^{10} This distribution is taken to be a description of the knowledge one has about these parameters, in contrast to the more common use of probabilities in statistical physics for describing natural phenomena by random processes. Our data are equidistantly sampled trajectories **X**, which are treated as “physical” random variables, and the model parameters *α* and *D*_{α}, which are treated as “informational” random variables, leading to an interpretation of Eq. (2) as a conditional probability distribution.

The starting point of Bayesian inference is a *prior probability distribution* *P*_{0}(*α*, *D*_{α}), which describes the prior information one has about the parameters, before exploiting any observations. We use a uniform distribution for *α* in the interval (0, 2) and a uniform distribution for *D*_{α} in the interval (0, *D*_{max}) with a large but finite *D*_{max} to ensure normalizability. Each observed *L*-step trajectory **X**^{(j)} (for *j* = 1, …, *N*) adds information about *α* and *D*_{α}. The posterior probability distribution that integrates the information from *N* trajectories is derived via Bayes’ theorem and is given by

assuming that the trajectories are independent realizations of the fBM process. The factor *P*(**X**^{(j)}|*α*, *D*_{α}) is called the likelihood of the observation **X**^{(j)} and is given by Eq. (2). It describes the information about *α* and *D*_{α} contributed by a single trajectory. The denominator in Eq. (4) is not important for our application, it can be regarded as a normalization factor for *P _{N}*(

*α*,

*D*

_{α}).

Bayesian inference has several advantages over the traditional approaches for studying diffusion processes. First of all, it yields probability distributions for the parameters and thus information about their uncertainties, in addition to estimates for their values. Second, it allows a comparative evaluation of the quality of different models through the computation of their Bayes factor,^{11} although we will not discuss this aspect here. Third, it makes all the assumptions that have an impact on the results explicit.

Although we have performed simultaneous estimation of the parameters *α* and *D*_{α}, yielding the two-dimensional posterior distribution from Eq. (4), we report only single-parameter Bayesian inference on *α* in this communication. The two parameters turn out to be only weakly correlated, with 2*D*_{α}Δ*t*^{α} being narrowly distributed around the straightforward estimate,

In the following, we use this estimate for *D*_{α}, allowing us to concentrate on the more difficult and more important inference for *α*.

Before applying Bayesian inference to the estimation of fBM parameters for observed particle trajectories, we must develop an understanding of how it is affected by the quality of the input data. To this end, we analyze synthetic trajectories whose statistical properties are known by construction. In the first step, we generate these trajectories as numerical samples of the distribution given by Eq. (2), choosing 2*D*_{α}Δ*t*^{α} = 1 for simplicity and *α* = 0.6 because that is the order of magnitude we will find for subdiffusion in our lipid bilayer simulations. We thus study an ideal case in which the fBM model is known to be exact.

Fig. 1 shows how the posterior distribution narrows as more trajectories are supplied to the inference algorithm. For short trajectories (10 steps, upper panel), a single trajectory provides little information, the distributions being wide and their maxima often far away from the known value for *α*. Several hundred trajectories are required before the maximum of the distribution stabilizes near the correct value. As is to be expected, longer trajectories (100 steps, lower panel) yield individual parameter distributions that are narrower and closer to the true value. As a consequence, a reliable estimate for *α* can be obtained from about 200 input trajectories. We use trajectory lengths of *L* = 100 everywhere in the rest of this work.

As we explained earlier, trajectories for any real physical system cannot be expected to be exactly described by the fBM model, in particular for short times, when the dynamics of the underlying microscopic processes become visible. In order to study the impact of a different short-time behavior on our inference approach, we analyze another set of synthetic trajectories, generated from a modification of the fBM model that has different short-time properties.

To construct such a model, we consider the increments of the fBM process, defined by Δ*X _{i}* =

*X*

_{i+1}−

*X*. The increments form a Gaussian stochastic process as well, which is often called fractional Brownian noise. Its covariance matrix is given by

_{i} For any Gaussian process, Σ_{i,j} and $ \Sigma i , j ( inc ) $ are related by

Contrary to the fBM process itself, the increment process is stationary, and therefore its autocorrelation function $\u3008\Delta X i \Delta X j \u3009= \Sigma i , j ( inc ) $ depends only on the time lag (*i* − *j*)Δ*t*. Its physical meaning is very similar to that of a velocity autocorrelation function in atomic or molecular liquids. In fact, if we assume the existence of a microscopic velocity *v*(*t*) such that $\Delta X ( t ) = \u222b 0 \Delta t d\tau v ( t + \tau ) $, then 〈Δ*X _{i}*Δ

*X*〉 is given by the microscopic velocity autocorrelation function convoluted with a triangular weight function of width 2Δ

_{j}*t*. The increment autocorrelation function has been used by Jeon

*et al.*

^{12}to characterize subdiffusion in lipid bilayers, and their analysis shows clear deviations from fBM for short lag times.

A straightforward way to modify the short-time behavior of a process is to add a term $ \Sigma i , j ( s t ) $ to the increment covariance matrix that is non-zero only for small |*i* − *j*|. According to Eq. (6), it is sufficient to respect the condition $ \u2211 k \Sigma i , i + k ( s t ) =0$ to ensure that the asymptotic behavior of the process remains unaltered. We choose a form with only two non-zero terms, $ \Sigma i , i + 1 ( s t ) =\u2212 \Sigma i , i + 2 ( s t ) =1/2 ( \Sigma i , i + 1 ( inc ) + \Sigma i , i + 2 ( inc ) ) $, i.e., we replace the points at *k* = 1 and *k* = 2 by their average. This leads to a wider and shallower minimum, as shown in the leftmost panel of Fig. 2, which shows the increment autocorrelation functions for the fBM process and for our modified process. The middle panel of Fig. 2 shows the MSD for the standard and modified fBM processes. There is a pronounced difference in the short-time behavior that persists over a much longer time range than for the increment correlations. However, the asymptotic subdiffusive behavior of the two processes is the same.

We now use the modified process for generating synthetic trajectories of *L* = 100 steps and perform parameter inference on these trajectories using the standard fBM process from Eq. (2) as a model. The inference process converges with no sign of a problem (see figures in the supplementary material), but yields an estimate for *α* of 0.66, which is rather far from the input value of 0.6. This shows that short-time effects cannot simply be neglected. It also illustrates a well-known limitation of uncertainty estimation by Bayesian inference, which captures the uncertainty due to noise in the input data, but not the uncertainty due to a possible mismatch between the data and the chosen model (fBM in our case). The convergence of the inference process to *α* = 0.66 thus does not permit the conclusion that fBM with *α* = 0.66 is a good description of the data. In particular, it does not mean that the asymptotic diffusive behavior of our trajectories corresponds to *α* = 0.66. In fact, we know by construction that its long-time diffusion is described by *α* = 0.6.

However, we can still say that “fBM with *α* = 0.66” is a well-defined characterization of the data, since Bayesian inference yields a posterior probability distribution that is sharply peaked around this value. We then address the question of the asymptotic behavior by applying fBM-based inference to our data at different time scales, sampling the trajectories from the modified fBM process at time steps Δ*t* = *sδt*, for *s* ranging from 1 to 100. For each *s*, we generate trajectories with *L* = 100 points, such that the total amount of data received by the inference procedure is always the same. The length of each trajectory in physical time is *sLδt*. With increasing *s*, we thus include less short-time information and more long-time information. As Fig. 2 shows, this change of sampling time yields estimates for *α* that converge towards the input value of 0.6 around *s* = 100, i.e., for sampling times that are 50 times longer than the short-time perturbation we added to our model.

We now apply our Bayesian inference method to computer simulation trajectories for a hydrated lipid bilayer. The simulation data we use have been described and analyzed earlier.^{13,14} The simulated system consists of 2033 POPC (1-palmitoyl-2-oleoyl-sn-glycero-3-phosphocholine) molecules and 57 952 water beads (equivalent to 231 808 water molecules), using the coarse-grained MARTINI force field. The simulation was performed in an NVT ensemble at *T* = 320 K. Two trajectory files were generated during the simulation run: (1) a short-time trajectory of length 300 ps, sampled every 0.03 ps and (2) a long-time trajectory of length 600 ns, sampled every 18 ps. Extracts of these trajectories containing the center-or-mass trajectories of the lipid molecules have been published^{15,16} and are the basis for the present analysis.

We use only the *x* and *y* components of the center-of-mass trajectories, discarding the *z* coordinate which is perpendicular to the membrane plane. Assuming that the *x* and *y* components for all molecules are statistically equivalent and independent, we thus have 4066 single-coordinate input trajectories for inference. We use subsets consisting of *L* = 100 steps, with different sampling time steps Δ*t* = *sδt*.

Fig. 3 shows that the inference procedure converges much like for the synthetic trajectories. The dependence of the maximum-likelihood estimate for *α* on the sampling step size is shown in Fig. 4. On a very short time scale, we find *α* ≈ 2, i.e., we see the nearly ballistic motion typical for times shorter than the mean time between collisions. Increasing the sampling step size, we see a rapid decrease of *α*, ending in a stable plateau with *α* ≈ 0.55 that extends over three decades. In Fig. 4 we also show the values of 2*D*_{α}Δ*t*^{α} estimated using Eq. (5). Its behavior for large sampling times also indicates a stable asymptotic fBM regime for Δ*t* > ≈ 10 ps. For short times, we see a ballistic regime that remains a good approximation up to Δ*t* < ≈ 0.2 ps. The transition between the two regimes can be used to define a time scale *τ* for the diffusion process through the relation *D*_{2}*τ*^{2} = *D*_{α}*τ*^{α}, yielding a value of *τ* = 0.59 ps for our trajectories. Up to a factor of 2^{1/(2−α)}, this time scale is the same as the time scale *τ*_{V ACF} = (*D*_{α}/〈|**v**|^{2}〉)^{1/(2−α)} introduced earlier by Kneller *et al.*^{17}

For comparison, we have added to Fig. 4 asymptotic behavior of the three fBM model fits from Ref. 14. Their “MSD” fit, which fits the long-time part of a computed MSD to its fBM form, yields parameters that are very similar to those found in this work. The other two approaches (WDFT = Windowed Discrete Fourier Transform, MEE = Maximum Entropy Estimation) fit the frequency spectrum of the velocity autocorrelation function to its known functional form for the fBM process and lead to noticeably smaller values for *α*.

This analysis of computer simulations of lipid dynamics shows that Bayesian inference on multiple time scales is a powerful method to gain insight into subdiffusion processes. In comparison to the most common approach of fitting long-time MSD data to the asymptotic 2*D*_{α}*t*^{α} form, it has several advantages. The application of Bayesian inference directly to the observed particle positions makes it possible to focus on a small and well-defined time scale range. In our application, we chose a time scale range from [Δ*t*…100Δ*t*], for many different values of Δ*t*. The only arbitrary parameter choice is *L* = 100, but different choices lead to similar conclusions (see the supplementary material). Furthermore, monitoring the narrowing of the posterior distribution as more trajectory data were injected into the procedure allowed us to evaluate the statistical quality of our dataset, which is always a problematic aspect in MSD-based fits. We can therefore have confidence in our observation of a stable *α* ≈ 0.55 for time scales ranging from 10 to 10.000 ps. For an analysis of observed trajectories, which are necessarily of finite length, this is as close as one can get to identifying asymptotic long-time behavior. We note, however, that this conclusion could only be reached because our simulation trajectories spanned an exceptionally long time range.

The complete source code of our analysis software and additional figures are available as the supplementary material to this article.