The influence of microscopic force fields on the motion of Brownian particles plays a fundamental role in a broad range of fields, including soft matter, biophysics, and active matter. Often, the experimental calibration of these force fields relies on the analysis of the trajectories of the Brownian particles. However, such an analysis is not always straightforward, especially if the underlying force fields are non-conservative or time-varying, driving the system out of thermodynamic equilibrium. Here, we introduce a toolbox to calibrate microscopic force fields by analyzing the trajectories of a Brownian particle using machine learning, namely, recurrent neural networks. We demonstrate that this machine-learning approach outperforms standard methods when characterizing the force fields generated by harmonic potentials if the available data are limited. More importantly, it provides a tool to calibrate force fields in situations for which there are no standard methods, such as non-conservative and time-varying force fields. In order to make this method readily available for other users, we provide a Python software package named DeepCalib, which can be easily personalized and optimized for specific force fields and applications. This package is ideal to calibrate complex and non-standard force fields from short trajectories, for which advanced specific methods would need to be developed on a case-by-case basis.

Measuring microscopic force fields is of fundamental importance to understanding microscale systems. In experimental soft matter, biophysics, and active matter, microparticles are often used to probe force fields.1–4 This has been done, for example, to measure the elasticity of cells,5,6 inter-particle interactions,7–9 and non-equilibrium fluctuations.10–13 Accurate force calibration is also crucial to study molecular motors14 and microscopic heat engines.15–20 Sometimes the calibration of the force field needs to be done even in real time.21 Disentangling the deterministic force fields from the unavoidable Brownian noise in these systems requires care and has a direct impact on the quality of the experimental results.

If one has access to a large amount of data, the profile of a generic force field can be directly estimated by averaging the particle displacements at different positions and times (see, e.g., Refs. 22 and 23). However, there are many experimental situations where this is not feasible. As a consequence, several methods have been developed for the most common force fields, which have become standard in various research areas.1,4,24

A particularly well-studied case is that of the force field $F_{\rm h}(x) = -kx$ (where k is the stiffness and x the particle position with respect to the equilibrium) generated by a harmonic potential $U_{\rm h}(x) = \frac{1}{2}kx^2$. This case is particularly interesting because it approximates the force field near any stable equilibrium, such as that experienced by microscopic particles held in optical, magnetic, or acoustic traps.1,4 The simplest approach to its calibration exploits the relation between the experimental probability distribution ρ(x) and the potential, i.e., $U_{\rm h}(x) = -k_{\rm B}T\ln[N\rho(x)]$, where N is the normalization factor, kB is the Boltzmann constant, and T is the absolute temperature, from which the force can be derived as $F_{\rm h}(x) = -\partial_x U_{\rm h}(x)$. Beyond this potential method, several additional methods are also available. These methods use the temporal information contained in the particle trajectory, extracted by calculating the autocorrelation function,1,4 the power spectral density,25 or the recently developed algorithm FORMA, a maximum likelihood estimator based on linear regression.26 All these methods work well with long trajectories with a sufficiently high sampling rate, while their performance declines when only short trajectories or low sampling rates are available.

Some of the methods used for the calibration of harmonic traps can be generalized to more complex force fields. For example, the potential method can, in principle, be used to characterize any conservative force field at thermodynamic equilibrium; however, the required amount of data grows exponentially for complex potential landscapes, because the probe particle must be given enough time to explore the entire configuration space. Standard methods for the calibration of even more complex force fields, such as non-conservative or time-varying force fields, are not readily available. The calibration becomes particularly complex when dealing with a limited amount of data, such as when real-time calibration is necessary.12 In fact, developing methods for the calibration of some specific examples of these force fields is a very active field of research.24,26–30

In this article, we demonstrate numerically and experimentally that machine learning can efficiently calibrate the force field experienced by a Brownian particle. Specifically, we employ a recurrent neural network (RNN),31 because RNNs have been proven very successful at tasks requiring the analysis of time series, such as natural language recognition and translation,32–34 event prediction,35 and anomalous diffusion characterization.36 We demonstrate that this RNN-powered method outperforms standard calibration techniques when calibrating a harmonic potential using only a short trajectory. Then, we demonstrate that it can also be used to calibrate force fields for which standard calibration techniques do not exist, namely, bistable, non-conservative, and time-varying force fields. In order to make this approach readily available for other users, we provide a Python software package, called DeepCalib,37 which can be easily adapted to different force fields and, therefore, personalized and optimized for the needs of specific users and applications.

Machine-learning-powered techniques have been particularly successful in data analysis, emerging as an ideal method to study systems for which only limited data or no standard analysis approaches are available.38,39 In particular, artificial neural networks40,41 provide a powerful way to automatically extract information from data. They belong to the class of supervised machine-learning methods. Unlike standard algorithmic approaches that use explicit mathematical recipes in order to obtain the sought-after results, supervised machine-learning methods are trained with large data sets associated with the corresponding ground truth in order to determine the optimal processing to estimate this ground truth from the input data. The learning task is typically a classification (where the ground truth indicates to which class the input belongs, e.g., determining if an image contains a cat or a dog) or a regression (where the ground truth is the numerical value of a quantity, e.g., inferring a parameter from a physical experiment).

Neural networks are composed of artificial neurons connected by adjustable weights. These neurons are often arranged in layers. The neurons in a layer perform a nonlinear transformation of their inputs and feed their results to the neurons of the subsequent layer. The final layer returns an estimate of the ground truth corresponding to the original input. The training process consists of iteratively adjusting the weights of the neural network in order to decrease the difference between the output and the ground truth of the sample so that the network progressively learns to associate the input data with the correct ground truth. This is usually achieved by backpropagating the estimation error through the layers.42 Once the neural network is trained, it can be used to predict the features of data it has never seen before.

Neural networks have recently been shown to be a powerful tool for the classification and parametrization of stochastic phenomena, e.g., to determine anomalous diffusion exponents36,43 (also recently done using random forests44), the arrow of time,45 and the position of particles,46,47 as well as in microscopy48,49 and in the simulations of hydrodynamic interactions50 and optical forces.51

This success of neural networks in analyzing experimental data motivated us to test their performance in reconstructing microscopic force fields. This has led us to develop a force-field calibration method based on the use of RNNs, which are especially well suited to handle time series, because they process the input data sequence iteratively and, therefore, explicitly model its time evolution. We name this method and the corresponding software package DeepCalib.37 Given a force field characterized by a set of parameters (e.g., a harmonic force field characterized by its stiffness k), we train the RNN to infer these parameters from short trajectories of Brownian particles moving in such force fields. Specifically, DeepCalib analyzes an input trajectory of varying length using an RNN with three long short-term memory (LSTM) layers (with 1000, 250, and 50 nodes, respectively, see discussion in the supplementary material and Fig. S1 for details) and outputs the estimated values of the force-field parameters. We choose LSTMs because their architecture manages to retain both short-time and longer-term correlations without making the training procedure excessively unstable.31 Further, LSTMs have been shown to perform well on short stochastic time series.36 We have chosen the number of layers and of nodes within each layer to achieve a good complexity–performance trade-off for all tasks presented in this article (see the supplementary material for a more detailed discussion of this trade-off); however, these parameters can be easily changed in DeepCalib by the user in order to optimize the performance for specific applications.
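As an illustration of the architecture just described, the following minimal Keras sketch builds a three-LSTM-layer regressor with 1000, 250, and 50 units; the loss, optimizer, and input preprocessing are our assumptions for the example, and the actual DeepCalib implementation37 may differ in these details.

```python
import tensorflow as tf

def build_deepcalib_like_rnn(n_outputs=1):
    """LSTM-based regressor mapping a trajectory (one coordinate per time step)
    to force-field parameters."""
    model = tf.keras.Sequential([
        # Variable-length input: (time steps, 1 coordinate per step).
        tf.keras.layers.LSTM(1000, return_sequences=True, input_shape=(None, 1)),
        tf.keras.layers.LSTM(250, return_sequences=True),
        tf.keras.layers.LSTM(50),             # final LSTM returns a single feature vector
        tf.keras.layers.Dense(n_outputs),     # estimated force-field parameter(s)
    ])
    model.compile(optimizer="adam", loss="mae")
    return model

model = build_deepcalib_like_rnn(n_outputs=1)  # e.g., one output: the stiffness k
```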

FIG. 1.

Calibration of harmonic potentials (simulations). (a) Simulated trajectory (gray line) of a Brownian particle in a harmonic trap. DeepCalib employs a recurrent neural network (RNN, 3 LSTM layers of sizes 1000, 250, and 50) to estimate the stiffness parameter (k) from a short section of the trajectory (black line). (b) Distribution (orange density plot) of k estimated by DeepCalib using simulated trajectories; the ground truth value of k is provided by the black dashed line. (c) Relative mean absolute error (MAE) (orange dots) of the DeepCalib estimations as a function of k. (d) Distribution (blue density plot) and (e) relative MAE (blue dots) of k estimated by the variance method. (f) Distribution (green density plot) and (g) relative MAE (green dots) of k estimated by the autocorrelation method. (h) Distribution (cyan density plot) and (i) relative MAE (cyan dots) of k estimated by FORMA. The dashed orange lines in (e), (g), and (i) reproduce the relative MAE for DeepCalib from (c) for ease of comparison. DeepCalib provides smaller errors for low values of k and comparable errors to the best methods for larger k. These results are obtained from a test dataset of $10^5$ trajectories, each sampled 1000 times every 10 ms. Both training and test trajectories are generated with k values uniformly distributed in logarithmic scale. See also example 1a of the DeepCalib software package.37


For the training of the RNN, we use simulated trajectories, for which we know the ground-truth values of the force-field parameters, to iteratively adjust the weights in the nodes of the LSTM layers using the backpropagation training algorithm.42 The possibility of rapidly generating a large amount of data by simulation allows us to employ a relatively large RNN with $5\times10^6$ parameters without overfitting and without the need for an overwhelmingly long training time (a few hours on a GPU-enhanced laptop). In this way, we can keep the specific number of layers and their dimensions constant for all the different calibration tasks we present in this article, easing the comparison of the RNN performance across the different tasks. The details of the effect of the network size on performance can be found in the supplementary material and Fig. S2.

FIG. 2.

Calibration of harmonic potentials (experiments). (a) A fluorescent particle (R = 100 nm) is trapped in a harmonic potential using a thermophoretic feedback trap (see Fig. S11 in the supplementary material for details). By focusing a laser on a nanofabricated chrome film (gray structure) with a disk-shaped hole (black region, diameter 15 μm), one generates a thermal gradient across the particle and, therefore, a thermophoretic force pushing the particle toward the center of the ring. By adjusting the laser position and intensity as a function of the measured position of the particle, it is possible to control the direction and strength of the thermophoretic force. The resulting particle trajectories are measured by digital video microscopy and used by the RNN in order to estimate the trap stiffness k. (b)–(e) Distribution of the k estimated by DeepCalib [orange histogram in (b) analyzed using the same RNN as in Fig. 1], the variance method [blue histogram in (c)], the autocorrelation method [green histogram in (d)], and FORMA [cyan histogram in (e)] for 400 (partly overlapping) 10-s segments of a single 500-s trajectory (each segment corresponds to 1000 samples taken every 10 ms). The black dashed vertical line represents the estimation of k using the full length of the trajectory and the potential method, which we take as ground truth. The orange lines in (c), (d), and (e) reproduce the histogram for DeepCalib from (b) for ease of comparison. DeepCalib outperforms the variance and autocorrelation methods featuring lower variance and lower bias, while performing equally well as FORMA. See also example 1b of the DeepCalib software package.37


Finally, we test the performance of the trained RNN on experimental trajectories of Brownian particles in force fields that we generate using a thermophoretic feedback trap52 (see experimental details below).

In Secs. II A–II F, we demonstrate that DeepCalib can be used to estimate a large variety of force fields from stochastic trajectories. We start by considering the paradigmatic case of a harmonic trap, showing that DeepCalib outperforms standard techniques for short trajectories. Then, we move to more complex scenarios: a double-well potential, a non-conservative force field, and a time-varying force field for which no simple general calibration method exists. We provide the source code of DeepCalib together with example files that reproduce all presented results.37 This code can be easily adapted to other force fields and, therefore, optimized for the needs of specific users and applications.

In order to benchmark the performance of DeepCalib, we start by considering the simple case of a force field generated by a harmonic potential, for which many efficient standard calibration methods already exist. Harmonic traps are widely studied because they represent good approximations to more complex force-field profiles near their stable equilibria, and they are easy to experimentally realize and to analyze. A Brownian particle in a harmonic trap, in the overdamped limit, is described by the Langevin equation53 

\frac{dx}{dt} = -\frac{k}{\gamma}\,x + \sqrt{\frac{2k_{\rm B}T}{\gamma}}\,\xi(t),
(1)

where γ is the friction coefficient and ξ(t) is uncorrelated Gaussian noise with zero mean and unit variance. An example of a simulated trajectory is shown in Fig. 1(a). To calibrate this force field, one needs to estimate the stiffness k.
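As a concrete (hypothetical) example of how such trajectories can be generated, the sketch below simulates Eq. (1) using the exact Ornstein–Uhlenbeck update over each sampling step; this update is exact only for a harmonic trap (a generic force field would require, e.g., Euler–Maruyama integration, as in the sketches further below), and the parameter values are illustrative rather than taken from the article.

```python
import numpy as np

def simulate_harmonic(k, gamma, kBT, dt, n_steps, x0=0.0, rng=None):
    """Simulate Eq. (1) with the exact Ornstein-Uhlenbeck update over each step dt."""
    rng = np.random.default_rng() if rng is None else rng
    tau = gamma / k                              # relaxation time of the trap
    decay = np.exp(-dt / tau)
    sigma = np.sqrt(kBT / k * (1 - decay ** 2))  # std of the stochastic increment
    x = np.empty(n_steps)
    x[0] = x0
    for i in range(1, n_steps):
        x[i] = x[i - 1] * decay + sigma * rng.standard_normal()
    return x

# Example: k = 10 fN/um (1e-8 N/m), 100-nm-radius particle in water, 10-ms steps.
kBT = 4.11e-21                                   # J (room temperature)
gamma = 6 * np.pi * 1e-3 * 100e-9                # kg/s (Stokes drag, viscosity ~1 mPa s)
trajectory = simulate_harmonic(k=1e-8, gamma=gamma, kBT=kBT, dt=1e-2, n_steps=1000)
```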

We train the RNN using simulated trajectories with different k. The friction coefficient γ is randomly varied by 5% around its nominal value in order for the RNN to gain tolerance against small fluctuations in the friction. Since we want to train the RNN to accurately estimate stiffness values that can vary over a few orders of magnitude (from 1 to 100 fN μm$^{-1}$), we draw the values of k from a distribution that is uniform in logarithmic scale (from $10^{-0.5}$ to $10^{3.5}$ fN μm$^{-1}$). This is a challenging task because the range of k is very broad and the trajectory is very short [an example is the black portion of the trajectory in Fig. 1(a)]. Importantly, the training range of k is wider than the desired measurement range in order to ensure that the RNN is properly trained also for the expected edge cases. Overall, we train the RNN using $10^7$ trajectories corresponding to 10 s and sampled 1000 times (time step 10 ms). We continuously generate new trajectories (so that the RNN is never trained twice with the same trajectory, avoiding any risk of overtraining) and split them into batches of increasing size (from 32 to 2048, so that, at the beginning, the RNN optimization process can freely explore a large parameter space and is gradually annealed toward an optimal parameter set54). The training process is efficient and takes about four hours on a GPU-enhanced laptop (Intel Core i7 8750H, Nvidia GeForce GTX 1060). For further details on the model and the training, see also example 1a of the DeepCalib software package.37
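The following sketch illustrates this training strategy (log-uniform stiffness, ~5% friction jitter, continuously regenerated data, and batches of increasing size), reusing simulate_harmonic and model from the previous sketches; regressing log10(k) and the exact batch schedule are our choices for illustration and need not coincide with the DeepCalib implementation.

```python
import numpy as np

def training_batch(batch_size, n_steps=1000, dt=1e-2, kBT=4.11e-21, rng=None):
    """Generate a batch of simulated trajectories and their log-stiffness labels."""
    rng = np.random.default_rng() if rng is None else rng
    gamma0 = 6 * np.pi * 1e-3 * 100e-9                    # nominal friction (kg/s)
    X = np.empty((batch_size, n_steps, 1))
    y = np.empty(batch_size)
    for j in range(batch_size):
        log10_k = rng.uniform(-0.5, 3.5)                  # k log-uniform in fN/um
        k = 10 ** log10_k * 1e-9                          # convert fN/um to N/m
        gamma = gamma0 * (1 + 0.05 * rng.uniform(-1, 1))  # ~5% friction jitter
        X[j, :, 0] = simulate_harmonic(k, gamma, kBT, dt, n_steps, rng=rng)
        y[j] = log10_k                                    # regress log10(k)
    return X, y

# Continuously regenerated data, with the batch size increased during training.
for batch_size in (32, 128, 512, 2048):
    for _ in range(500):                                  # batches per stage (illustrative)
        X, y = training_batch(batch_size)
        model.train_on_batch(X, y)
```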

The estimations done by DeepCalib are shown in Fig. 1(b) (orange distribution) in comparison with the ground truth (black dashed line), while the corresponding relative mean absolute error (MAE) is shown in Fig. 1(c) (orange dots). DeepCalib provides accurate results for the entire range of k, significantly improving its performance at larger k. This is expected, because the time-scales of the fluctuations of the particle position in the trap are inversely proportional to k, so that for larger k the 10-s trajectory is able to explore the trapping potential more efficiently. For very small values of k, one can observe a slight bend in the cloud of predicted values, which tends to slightly overestimate the stiffness. This is a common feature for neural-network-based regression and happens because we are considering values that are close to the smallest values the network has seen in its training.

The most commonly used methods to estimate the stiffness of a harmonic trap are the variance method, the autocorrelation method, the power spectrum analysis, and the recently developed FORMA.1,4,25,26 For the trajectories we are considering, it is extremely difficult to employ the power spectrum analysis, because the trajectories are too short to accurately estimate the power spectral density. We, therefore, compare DeepCalib to the other three methods. The variance method [Figs. 1(d)–(e)] determines k from the measurement of the variance of the particle position in the trap:

k = \frac{k_{\rm B}T}{\langle x^2\rangle}.
(2)

The autocorrelation method [Figs. 1(f)–(g)] determines k by fitting the exponential decay of the position autocorrelation function in the trap:

\langle x(t+\Delta t)\,x(t)\rangle = \frac{k_{\rm B}T}{k}\,e^{-\Delta t/\tau},
(3)

where $\tau=\gamma/k$ is the characteristic time of the trap. In both cases, $\langle\cdot\rangle$ represents averaging over time. FORMA [Figs. 1(h)–(i)] determines k using a maximum likelihood estimator:

k = -\frac{1}{\Delta t}\,\frac{\sum_i \gamma\, x_i\,(x_{i+1}-x_i)}{\sum_i x_i^2},
(4)

where $x_i$ is the ith trajectory sample and $\Delta t$ is the sampling time.
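For reference, the three estimators of Eqs. (2)–(4) can be implemented along the following lines; the autocorrelation fit shown here (a single-exponential fit with scipy.optimize.curve_fit over the first lags) is one of several possible implementations, and the helper names are ours.

```python
import numpy as np
from scipy.optimize import curve_fit

def k_variance(x, kBT):
    """Variance method, Eq. (2)."""
    return kBT / np.var(x)

def k_autocorrelation(x, dt, gamma, max_lag=100):
    """Autocorrelation method, Eq. (3): fit an exponential decay to the position ACF."""
    x = x - x.mean()
    acf = np.array([np.mean(x[: len(x) - m] * x[m:]) for m in range(max_lag)])
    lags = dt * np.arange(max_lag)
    (amplitude, tau), _ = curve_fit(lambda t, a, tau: a * np.exp(-t / tau),
                                    lags, acf, p0=(acf[0], 10 * dt))
    return gamma / tau                               # k = gamma / tau

def k_forma(x, dt, gamma):
    """FORMA, Eq. (4): maximum likelihood (linear regression) estimator."""
    dx = np.diff(x)                                  # forward differences x_{i+1} - x_i
    return -gamma * np.sum(x[:-1] * dx) / (dt * np.sum(x[:-1] ** 2))
```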

The estimations of k obtained with the variance method, with the autocorrelation method, and with FORMA present the distributions shown in Figs. 1(d) (blue density plot), 1(f) (green density plot), and 1(h) (cyan density plot), respectively. The autocorrelation method and FORMA provide slightly more accurate results than the variance method when k is small; however, they become less accurate when k is large, because individual data samples in the trajectory become excessively uncorrelated. This is expected because we are sampling the trajectory with a low frequency of just 100 Hz, which becomes comparable to the characteristic frequency of the trap (e.g., 53 Hz for k = 100 fN/μm). The corresponding relative MAEs are shown in Figs. 1(e) (blue dots), 1(g) (green dots), and 1(i) (cyan dots), together with the comparison with DeepCalib's performance (orange dashed line). DeepCalib clearly outperforms the other methods for small k values, where the measurement is more challenging, and matches the performance of the best methods for the simpler cases with larger stiffnesses.

So far, we have demonstrated how DeepCalib performs on simulated test data that are obtained similarly to the training data set. In order to test DeepCalib in a realistic situation, we now investigate the performance of the same RNN discussed in Sec. II A, trained on simulated data (Fig. 1), on experimental trajectories.

The experimental setup to obtain the trajectories consists of a feedback trapping system that enables us to generate a wide variety of force fields.52,55 We measure the Brownian motion of a single 200-nm diameter polystyrene particle (ThermoFisher Scientific, F8810) in an aqueous environment, confined by dynamic temperature fields to a circular region of a UV-lithographically fabricated nanostructure [Fig. 2(a)].55 The confinement in these temperature fields occurs as a result of thermophoretic drifts of the particle due to temperature-dependent solute–solvent interactions56,57 [red arrow, Fig. 2(a)]. The microscopic origin of these drifts is manifold and is summarized in the thermodiffusion coefficient DT. As is usually the case, in our experiment DT has a positive sign, which means that the corresponding objects move toward colder regions in the temperature landscape. A thermophoretic drift velocity $\mathbf{v}_{\rm T} = -D_{\rm T}\nabla T$, proportional to the temperature gradient $\nabla T$, can be assigned to this directed motion. The relative strength of the thermophoretic motion of particles in liquids is given by the ratio $S_{\rm T} = D_{\rm T}/D$, which is also known as the Soret coefficient. Typical values for the Soret coefficient are in the range of 0.01 to 10 K$^{-1}$.56 In the thermophoretic trapping setup, temperature gradients are generated by the conversion of optical energy of a focused 808-nm laser beam [Pegasus Lasersysteme, PL.MI.808.300, beam waist $\omega_0 \approx 500$ nm, Fig. 2(a)] positioned on the circumference of a circular hole with a diameter of 15 μm in an otherwise continuous chrome film (thickness 30 nm). Temperature differences between the rim and the trapping center are typically on the order of $\Delta T \approx 10$ K. The current position of the particle is obtained from the fluorescence emitted by the particle under homogeneous illumination with an excitation laser (λ = 532 nm, Pusch OptoTech), recorded via an EMCCD camera (Andor iXon 3) at a frequency of 100 Hz, and evaluated in real time using custom-made software. The real-time positioning and intensity control of the heating laser, which is realized with an acousto-optic deflector (Brimrose, 2 DS-75-40-808), can be performed according to any protocol, allowing for the investigation of a wide variety of dynamic temperature fields.52 This technique is, thus, ideally suited for testing DeepCalib on experimental data obtained from a broad range of force fields.

In Sec. II B, we use the thermophoretic trap to generate a restoring force field corresponding to a harmonic potential. We record a 500-s trajectory ($5\times10^4$ samples, time step 10 ms) and determine the ground-truth k by the variance method using the full recorded trajectory [black dashed lines in Figs. 2(b)–2(e)]. We then test the performance of DeepCalib on 400 (partially overlapping) segments of this trajectory (1000 samples each); the resulting estimations are presented by the orange histogram in Fig. 2(b) (see also example 1b of the DeepCalib software package37). The estimations obtained by the variance and autocorrelation methods are presented by the blue and green histograms in Figs. 2(c) and 2(d), respectively, and show that these methods present a bias toward larger and smaller values of k, respectively. Such biases can be explained by the short length of the trajectories. For the variance method, the trajectory is not long enough to explore the full potential well, leading to an underestimation of the variance and, thus, an overestimation of k. For the autocorrelation method, short trajectories exploring only the region near the equilibrium position lead to an overestimation of the correlation time in the trap and, thus, an underestimation of k. The estimations obtained by FORMA [Fig. 2(e)] show better results than the variance and autocorrelation methods. This is expected because FORMA performs best in this regime, where the time step is small compared to the characteristic timescale of the trap (γ/k).26

Although DeepCalib is trained with simulated trajectories, it determines the trap stiffness from experimental trajectories more accurately than the variance and autocorrelation methods and similarly to FORMA: DeepCalib estimations are both closer to the measured truth (lower bias) and less spread out (higher precision). Therefore, thanks to its data-driven training process, the RNN manages to combine the insight provided by the variance and autocorrelation methods, while largely avoiding their pitfalls, resulting in a performance that matches FORMA. In addition, we also show that DeepCalib remains the most robust analysis method in the presence of measurement errors (Fig. S3 in the supplementary material) or inhomogeneous diffusion coefficients (Fig. S4 in the supplementary material), even though it has not been trained with data that include these scenarios. A detailed discussion of these scenarios can be found in the supplementary material.

FIG. 3.

Calibration of bistable potentials (simulations). (a) Potential energy profile (blue solid line) and force field (gray arrows in the bottom) of a double-well trap characterized by the equilibrium distance L and the energy-barrier height ΔU. (b) Distribution of L and (c) ΔU estimated by DeepCalib from simulated trajectories; the black dashed line represents the ground truth. (d)–(e) Corresponding distributions estimated by the potential method and (f)–(g) by the extrema method (see details about these methods in the text). (h) Relative mean absolute error (MAE) of the estimations obtained using DeepCalib (orange dots), the potential method (blue circles), and the extrema method (green triangles) as a function of L and (i) as a function of ΔU. In both cases DeepCalib outperforms the other methods achieving lower MAE. These results are obtained from a test dataset of $10^5$ trajectories, each sampled 1000 times every 50 ms. Both training and test trajectories are generated with L values uniformly distributed in linear scale and ΔU values uniformly distributed in logarithmic scale. See also example 2a of the DeepCalib software package.37


We remark that the RNN is expected to perform well for trap stiffnesses that lie in the range used for its training. For the calibration of stronger harmonic traps (such as optical tweezers1), it is sufficient to modify the range of k values used for the training. We demonstrate the use of DeepCalib with experimental data for colloids trapped by optical tweezers in Fig. S5 in the supplementary material.

Now that we have validated DeepCalib on the fundamental case of a harmonic trap, we move to the more complex case of a bistable potential. Bistable traps represent a model system to study several physical and biological phenomena, such as Kramers transitions,58 Landauer's principle,12 and the folding energies of nucleic acids.59 The simplest analytic form for a double-well potential is given by a quartic polynomial [solid line in Fig. 3(a)]:

U(x) = \Delta U\left[\left(\frac{x}{L}\right)^2 - 1\right]^2,
(5)

where x=±L are the local minima and ΔU is the barrier height. This gives rise to a cubic force field [arrows in Fig. 3(a)]:

F(x) = -\frac{4\Delta U}{L^2}\,x\left[\left(\frac{x}{L}\right)^2 - 1\right],
(6)

clearly showing that the force vanishes at the potential minima x=±L and at the local maximum x = 0. The parameters that characterize this double-well potential are the equilibrium distance L and the energy barrier height ΔU.
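A minimal sketch of the corresponding simulation is given below, assuming Euler–Maruyama integration of Eqs. (5) and (6) with a finer internal time step so that stiff wells remain numerically stable; the function names and the oversampling factor are illustrative.

```python
import numpy as np

def U_double_well(x, dU, L):
    """Quartic double-well potential, Eq. (5)."""
    return dU * ((x / L) ** 2 - 1) ** 2

def F_double_well(x, dU, L):
    """Cubic force field, Eq. (6); it vanishes at x = +/-L and x = 0."""
    return -4 * dU / L ** 2 * x * ((x / L) ** 2 - 1)

def simulate_double_well(dU, L, gamma, kBT, dt, n_steps, oversample=20, rng=None):
    """Integrate at a finer internal step dt/oversample, recording one sample per dt."""
    rng = np.random.default_rng() if rng is None else rng
    h = dt / oversample
    x = np.empty(n_steps)
    x[0] = xi = L                                  # start in the right-hand well
    for i in range(1, n_steps):
        for _ in range(oversample):
            xi += (F_double_well(xi, dU, L) / gamma) * h \
                  + np.sqrt(2 * kBT * h / gamma) * rng.standard_normal()
        x[i] = xi
    return x
```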

The RNN employed by DeepCalib is similar to that for the harmonic trap case, but has two outputs to estimate both L and ΔU. We train this RNN on about $10^7$ trajectories that are simulated with ΔU ranging from 0.1 to 10 $k_{\rm B}T$ (uniformly distributed in logarithmic scale) and L ranging from 1 μm to 3 μm (uniformly distributed in linear scale). Finally, we test its performance on $10^4$ simulated trajectories with 1000 samples (time step 50 ms). DeepCalib provides accurate estimations for both L [orange distribution, Fig. 3(b)] and ΔU [orange distribution, Fig. 3(c)] for a wide range of parameters (the ground truth is plotted by the black dashed lines). More details can be found in example 2a of the DeepCalib software package.37

We now compare the performance of DeepCalib [Figs. 3(b)–3(c)] to standard methods [Figs. 3(d)–3(g)]. The standard methods to calibrate a double-well potential use the relation between equilibrium probability distribution and the potential energy,1 which is given by

\rho(x) = \frac{e^{-U(x)/k_{\rm B}T}}{N},
(7)

where the normalization factor $N = \int e^{-U(x)/k_{\rm B}T}\,dx$ is the partition function. We remark that other standard methods that employ the statistics of the transition times between the wells58 cannot be applied here, because we analyze short trajectories featuring few transitions. Here, we use two concrete approaches. First, we perform a quartic fit to ln ρ(x) to determine the optimal values of L and ΔU ["potential method,"13 Figs. 3(d)–3(e)]. However, we observe that, for short trajectories, the potential method estimates ΔU with a strong bias. Thus, we employ a second method that is more accurate for shorter trajectories: As ρ(x) displays two local maxima at ±L (potential minima) and a local minimum at the origin (potential barrier), we obtain L as the distance between the maximum of ρ(x) and the origin, and ΔU from the logarithm of the ratio between the maximum probability and the probability at the origin, i.e., $\Delta U = k_{\rm B}T\ln[\rho(\pm L)/\rho(0)]$ ["extrema method,"58 Figs. 3(f)–3(g)]. Although the extrema method provides much better estimations than the potential method, it achieves a significantly worse performance than DeepCalib because of the limited length of the trajectories. This is confirmed by the inspection of the relative MAE [Figs. 3(h)–3(i)]: The relative MAE of DeepCalib (orange dots) is much lower than that of the potential method (blue circles) and of the extrema method (green triangles) over the whole range of both L [Fig. 3(h)] and ΔU [Fig. 3(i)].
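The extrema method described above can be implemented, for instance, as follows; this sketch assumes that the trajectory visits both wells and the barrier region, and the binning choice is arbitrary.

```python
import numpy as np

def extrema_method(x, kBT, bins=50):
    """Estimate L and DeltaU from the histogram of positions."""
    counts, edges = np.histogram(x, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    left, right = centers < 0, centers >= 0
    # The two probability maxima sit at the potential minima, x = -L and x = +L.
    x_left = centers[left][np.argmax(counts[left])]
    x_right = centers[right][np.argmax(counts[right])]
    L_est = 0.5 * (x_right - x_left)
    # Barrier height from Eq. (7): DeltaU = kBT * ln[rho(+/-L) / rho(0)].
    rho_max = 0.5 * (counts[left].max() + counts[right].max())
    rho_origin = counts[np.argmin(np.abs(centers))]
    dU_est = kBT * np.log(rho_max / rho_origin)
    return L_est, dU_est
```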

Finally, we test the performance of DeepCalib on experimental trajectories while using the same RNN employed for the analysis of the simulated data. The experimental data are acquired using the same thermophoretic setup employed for the harmonic trap [Fig. 2(a)], but imposing the force field of a double-well trap. We record a 1500-s trajectory (150 000 samples, time step 10 ms). A part of the experimental trajectory is shown in Fig. 4(a). Interestingly, the experimental potential is not exactly a quartic potential {a typical example of the "reality gap" [Fig. 4(b)] between experiments and simulations39}. The experimental potential obtained with the full extent of the trajectory is shown in Fig. 4(b). We determine the ground-truth values for L and ΔU using the extrema method [green line in Fig. 4(b) and black dashed lines in Figs. 4(c)–4(h)]. This reality gap makes it particularly interesting to assess how the various methods perform, because DeepCalib is trained on the idealized quartic potential, and the potential method assumes a quartic potential in its analysis. We test the performance of DeepCalib on 900 (partially overlapping) segments of this trajectory [1000 samples each with time step 50 ms, highlighted black line in Fig. 4(a)], obtaining the estimations of L and ΔU represented by the orange histograms in Figs. 4(c) and 4(d), respectively (see also example 2b of the DeepCalib software package37). The corresponding estimations for the potential method are provided by the blue histograms in Figs. 4(e)–4(f), and those for the extrema method by the green histograms in Figs. 4(g)–4(h). Also in this case, DeepCalib is more accurate and less biased than the standard methods. In particular, we highlight the fact that DeepCalib provides accurate estimations even though the experimental potential differs from the idealized double-well potential employed in the simulations used in its training. This demonstrates that the neural-network approach put forward by DeepCalib can efficiently bridge the reality gap between idealized simulations and actual experiments. We also highlight that the measurements of DeepCalib are robust against asymmetries in the double-well potential. In our tests, the measurement of the equilibrium distance remains almost unaffected even if the potential is strongly asymmetric (see Fig. S6 in the supplementary material), despite the network being trained only with symmetric potentials. We also show that DeepCalib can easily be retrained to measure two different potential barrier heights (see Fig. S7 in the supplementary material). A more detailed discussion about asymmetric double wells is found in the supplementary material.

FIG. 4.

Calibration of bistable potentials (experiments). (a) Example of an experimental trajectory of a Brownian particle in a double-well potential (gray line). The highlighted section (black line) is an example of the trajectory portion used to estimate the potential parameters (see Fig. S12 in the supplementary material for details). (b) The experimental potential energy landscape corresponding to the whole experimental trajectory in (a) (black dots) and the potential fitted using the extrema method (green line). Note the reality gap between the theory and the experiment.39 (c)–(h) Distributions of L and ΔU estimated by DeepCalib [orange histograms in (c) and (d), respectively, analyzed using the same RNN as in Fig. 3], by the potential method [blue histograms in (e) and (f), respectively], and by the extrema method [green histograms in (g) and (h), respectively] for 900 (partly overlapping) 50-s segments of a single 1500-s trajectory (each segment corresponds to 1000 samples taken every 50 ms). The black dashed lines represent the estimations of L [(c), (e), (g)] and ΔU [(d), (f), (h)] using the full length of the trajectory with the extrema method, which we take as ground truth. The orange lines in (e) and (g) [(f) and (h)] reproduce the histogram for DeepCalib from (c) [(d)] for ease of comparison. In all cases DeepCalib provides more accurate and precise estimations. See also example 2b of the DeepCalib software package.37


We now test DeepCalib in a non-equilibrium scenario created by a non-conservative rotational force field. Non-conservative force fields are widely used to investigate the non-equilibrium dynamics and thermodynamics of microscopic systems.60–63 We consider the rotational force field described by the following equation:

\mathbf{F}(\mathbf{r}) = -k\,\mathbf{r} + \gamma\Omega\,(\mathbf{r}\times\hat{\mathbf{z}}),
(8)

where r is the two-dimensional position in the xy-plane of the Brownian particle, which is subjected to a restoring force with stiffness k and a torque with rotational frequency Ω. An example of a rotational force field is shown in Fig. 5(a). This non-equilibrium system relaxes to a steady state, but its distribution is determined only by the restoring force and is independent of Ω [$\rho(x,y)\propto e^{-k(x^2+y^2)/2k_{\rm B}T}$60]. Thus, different from the previous examples, even in principle, it is impossible to use the steady-state probability distribution to calibrate this force field, regardless of the amount of available data. The available methods26,30,60,61 rely essentially on local drifts and, therefore, require high-frequency measurements (i.e., the measurement time step must be at least one order of magnitude smaller than the characteristic times associated with the motion of the Brownian particle in the force field, which in this case are $\tau_c=\gamma/k$ and $\tau_r=\Omega^{-1}$60,61). To explore the potential of DeepCalib in challenging scenarios, we consider trajectories sampled with a relatively low frequency (20 Hz). Thus, we train DeepCalib on simulated two-dimensional trajectories with 1000 samples acquired with a time step of 50 ms. We use about $10^7$ trajectories that are simulated with k ranging from 6 to 150 fN μm$^{-1}$ (uniformly distributed in logarithmic scale) and γΩ ranging from −42 to 42 fN μm$^{-1}$ (uniformly distributed in linear scale).
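A two-dimensional Euler–Maruyama sketch of Eq. (8) that could be used to generate such training trajectories is given below; the sign convention of the rotational term and the oversampling of the integration step are our assumptions.

```python
import numpy as np

def simulate_rotational(k, Omega, gamma, kBT, dt, n_steps, oversample=10, rng=None):
    """Two-dimensional integration of Eq. (8), recording one sample every dt."""
    rng = np.random.default_rng() if rng is None else rng
    h = dt / oversample
    r = np.zeros((n_steps, 2))                     # columns: x, y
    xi = r[0].copy()
    for i in range(1, n_steps):
        for _ in range(oversample):
            x, y = xi
            # Restoring force -k*r plus rotational force gamma*Omega*(r x z_hat).
            force = np.array([-k * x + gamma * Omega * y,
                              -k * y - gamma * Omega * x])
            xi = xi + (force / gamma) * h \
                 + np.sqrt(2 * kBT * h / gamma) * rng.standard_normal(2)
        r[i] = xi
    return r
```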

FIG. 5.

Calibration of a non-conservative force field. (a) Non-conservative force field consisting of a harmonic potential characterized by the stiffness k and a rotational force field characterized by the rotational parameter Ω. (b)–(e) Distributions of k and γΩ estimated by DeepCalib [(b) and (c), respectively] and by FORMA [(d) and (e), respectively] from simulated trajectories; the black dashed lines represent the ground truth. Both training and test trajectories are generated with k values uniformly distributed in logarithmic scale and Ω values uniformly distributed in linear scale. These results are obtained from a test dataset of $10^4$ trajectories, each sampled 1000 times every 50 ms. (f)–(i) Distributions of k and Ω estimated by DeepCalib [orange histograms in (f) and (g), respectively, analyzed using the same RNN as in (b) and (c)] and by FORMA [blue histograms in (h) and (i), respectively] for 100 (partly overlapping) 50-s segments of a single 1000-s trajectory (each segment corresponds to 1000 samples taken every 50 ms). The black dashed lines represent the FORMA-based estimations of k [(f), (h)] and Ω [(g), (i)], using the full length of the trajectory sampled every 10 ms, which we take as ground truth (see Fig. S13 in the supplementary material for details). See also examples 3a and 3b of the DeepCalib software package.37


DeepCalib manages to estimate with good accuracy both k and Ω, as can be seen by comparing the orange distributions and the ground-truth values provided by the black dashed lines in Figs. 5(b) and 5(c), respectively [see also example 3a of the DeepCalib software package37].

Since the time step is comparable to the characteristic time of the system, we expect the standard methods to fail.26,30 In fact, when we apply FORMA26 to calibrate this force field, we obtain much poorer estimations [blue distributions in Figs. 5(d) and 5(e)]. FORMA performs reasonably well for low k (longer characteristic times), but fails for higher values of k (shorter characteristic times), while it performs poorly over the whole range of Ω.

Finally, we test the performance of DeepCalib for an experimental rotational force field, generated using the thermophoretic setup [Fig. 2(a)]. We perform the test on 100 (partially overlapping) segments of a 1000-s experimental trajectory, each with 1000 samples at a time step of 50 ms. The estimation of the force-field parameters is challenging because the 50-ms measurement time step is comparable to the force-field characteristic times ($\tau_c = 145$ ms, $\tau_r = 193$ ms). We determine the ground-truth values of k and Ω [black dashed lines in Figs. 5(f)–(i)] with the FORMA-based estimations using the full length of the trajectory sampled more often (i.e., every 10 ms instead of every 50 ms), so that the sampling time is much shorter than $\tau_c$ and $\tau_r$. Once again, the estimations of k by DeepCalib [orange distribution, Fig. 5(f)] are more accurate than those by FORMA [blue distribution, Fig. 5(h)], which clearly deviate from the measured ground truth (black dashed lines). Likewise, the estimations of Ω by DeepCalib [orange distribution, Fig. 5(g)] are also closer to the measured ground truth (black dashed lines) than those by FORMA [blue distribution, Fig. 5(i)]. For further details, see also example 3b of the DeepCalib software package.37

To further demonstrate the potential of DeepCalib, we set out to calibrate an even more challenging dynamical nonequilibrium system. We consider a Brownian particle subject to an alternating trapping potential that switches between a low stiffness $k_{\rm low}$ and a high stiffness $k_{\rm high}$ with a period τ. Figure 6(a) shows an example trajectory together with the corresponding stiffness protocol. There is no simple standard method for calibrating such a system, as one would have to combine techniques to detect the switching points (see, e.g., Ref. 64) with techniques to estimate stiffnesses (such as the variance and autocorrelation methods that we discussed for the harmonic trap) on shorter segments of the trajectory. In most cases, however, it is quite difficult to estimate these parameters, as the exact switching point becomes very hard to determine when the stiffness values are close [Fig. 6(a) features an example with a large difference between $k_{\rm low}$ and $k_{\rm high}$]. In addition, as the system is continuously kept in a nonequilibrium state, the variance and the autocorrelation methods cannot be used.

FIG. 6.

Calibration of a time-varying force field. (a) Trajectory of a Brownian particle in a harmonic trap whose stiffness switches over time between a lower stiffness value $k_{\rm low}$ and a higher stiffness value $k_{\rm high}$ with a period of τ. (b)–(d) Distribution of $k_{\rm low}$, $k_{\rm high}$, and τ estimated by DeepCalib as a function of their ground truth (black dashed lines) for $2\times10^4$ simulated trajectories with 1000 samples taken every 100 ms. Both training and test trajectories are generated with $k_{\rm low}$, $k_{\rm high}$, and τ values uniformly distributed in logarithmic scale. (e) Experimental trajectory of a Brownian particle in a harmonic trap whose stiffness k switches over time between $k_{\rm low} = 4.1$ fN μm$^{-1}$ and $k_{\rm high} = 52$ fN μm$^{-1}$ with a period of τ = 20 s (see Fig. S14 in the supplementary material for details). (f)–(h) Histograms of $k_{\rm low}$, $k_{\rm high}$, and τ estimated by DeepCalib for 100 (partly overlapping) 100-s segments of a single experimental trajectory (each segment corresponds to 1000 samples taken every 100 ms) analyzed using the same RNN as in (b)–(d). The black dashed lines in (f)–(h) represent the ground truth values of $k_{\rm low}$, $k_{\rm high}$, and τ ($k_{\rm low}$ and $k_{\rm high}$ are measured from recorded trajectories kept at constant stiffness, and τ is the set period of the switching protocol). DeepCalib achieves estimations of these parameters with low variance even using very short trajectory segments. See also examples 4a and 4b of the DeepCalib software package.37


DeepCalib can be straightforwardly applied also to this case. We train DeepCalib on simulated trajectories with 1000 samples acquired with a time step of 100 ms. We train this RNN on about $10^7$ trajectories that are simulated with $k_{\rm low}$ and $k_{\rm high}$ ranging from 2 to 280 fN μm$^{-1}$ (uniformly distributed in logarithmic scale, with the condition that $k_{\rm high} > 2k_{\rm low}$) and τ ranging from 3 to 110 s (uniformly distributed in logarithmic scale). We then test the trained RNN on $2\times10^4$ simulated trajectories, demonstrating that it is able to simultaneously and accurately estimate $k_{\rm low}$ [Fig. 6(b)], $k_{\rm high}$ [Fig. 6(c)], and τ [Fig. 6(d)] (see also example 4a of the DeepCalib software package37). Of course, the accuracy in estimating the switching time depends on how different the stiffnesses of the traps are. We show this by plotting the MAE for the switching time as a function of the ratio $k_{\rm high}/k_{\rm low}$ in Fig. S8 of the supplementary material, where it is evident how traps with similar stiffnesses represent more challenging cases.
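A sketch of a generator for such switching-stiffness trajectories is given below; the symmetric square-wave protocol and the finer internal integration step are assumptions made for illustration.

```python
import numpy as np

def simulate_switching(k_low, k_high, tau, gamma, kBT, dt, n_steps,
                       oversample=50, rng=None):
    """Harmonic trap whose stiffness alternates between k_low and k_high with period tau,
    assuming a symmetric square-wave protocol; integration uses a finer internal step."""
    rng = np.random.default_rng() if rng is None else rng
    h = dt / oversample
    x = np.empty(n_steps)
    x[0] = xi = 0.0
    for i in range(1, n_steps):
        for s in range(oversample):
            t = (i - 1) * dt + s * h
            k = k_low if (t % tau) < tau / 2 else k_high
            xi += -(k / gamma) * xi * h \
                  + np.sqrt(2 * kBT * h / gamma) * rng.standard_normal()
        x[i] = xi
    return x
```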

Experimentally, we realize this protocol using a thermophoretic harmonic trap that alternates between two stiffnesses. We record an experimental trajectory of $10^5$ data samples with 10-ms time steps; a part of this trajectory is shown in Fig. 6(e). We then perform the test on 100 (partially overlapping) segments of the experimental trajectory, each with 1000 samples at a time step of 100 ms [black line, Fig. 6(e)]. The measured ground truth [black dashed lines in Figs. 6(f)–6(h)] for the stiffnesses of the experimental data is obtained from trajectories recorded at constant stiffnesses $k_{\rm low}$ and $k_{\rm high}$, while we know exactly the ground truth for τ, because we control the period of the experimental switching protocol. Using the same RNN trained for Figs. 6(b)–6(d), DeepCalib successfully estimates the parameters of the system, $k_{\rm low}$ [Fig. 6(f)], $k_{\rm high}$ [Fig. 6(g)], and τ [Fig. 6(h)], from the experimental data (see also example 4b of the DeepCalib software package37).

This latter example demonstrates that DeepCalib can be readily applied beyond simple equilibrium or steady-state dynamics to rather generic settings, for which standard techniques are not available and one would have to develop system-specific analysis methods.

Neural-network-based methods often operate as black boxes, and it is, therefore, of great importance to properly characterize their robustness and validity in the specific scenarios where they are to be employed.39 By applying DeepCalib to experimental data in all tasks we have studied, we have already demonstrated its ability to bridge the reality gap and correctly calibrate experiments that may subtly differ from the simulations employed for the training. Here, we further explore how common sources of alteration of the trajectories affect the performance of DeepCalib and how this compares to the other techniques.

First of all, we consider how the presence of measurement noise affects force calibration. For the harmonic trap case, we investigate how increasing levels of noise degrade the performance of DeepCalib and of the other methods. Figure S3 in the supplementary material shows that DeepCalib is less affected by the presence of noise than the standard methods, even when the power of the signal equals that of the noise [signal-to-noise ratio (SNR) equal to 1]. Interestingly, these results are obtained with the very same RNN employed in Figs. 1 and 2, which is trained on trajectories without measurement noise. Even better results can be expected by re-training the RNN with trajectories that include measurement noise.

Then, we consider how inhomogeneities in the diffusion coefficient (i.e., spatial gradients in the friction coefficient65) affect the performance of DeepCalib and of the other methods. Again, DeepCalib is more robust than the other methods, as shown in Fig. S4 in the supplementary material. Also in this case, we use the very same RNN employed in Figs. 1 and 2, which is trained on trajectories without diffusion gradients. Thus, we can expect even better results by re-training the RNN with simulated trajectories that account for the presence of diffusion gradients.

Another important source of variability is the length of the trajectories from which we want to calibrate the force fields. In Secs. II A–II E, for the sake of simplicity, we have trained and tested DeepCalib on trajectories of the same length (always 1000 time steps). However, DeepCalib is capable of handling trajectories of different lengths. Figure S9 in the supplementary material shows how the performance of the RNN trained for the harmonic trap with trajectories containing 1000 measurements changes as a function of the length of the test trajectories, from 500 to 2000 time steps. As can be expected, longer trajectories result in more accurate calibration. However, for 500-time-step trajectories (much shorter than those employed in the training), systematic biases arise, likely because the RNN is optimized to calibrate data of a certain typical length. These biases can be mitigated by training the RNN using trajectories with a distribution of lengths. However, we recommend training DeepCalib on trajectories of lengths similar to those of the actual trajectories the user is interested in characterizing (see the supplementary material for an extended discussion). If very long stationary trajectories are available, an efficient use of DeepCalib is to apply it on a sliding window along the trajectory and average its predictions, similarly to what was done for anomalous diffusion in Ref. 36.
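A possible implementation of this sliding-window strategy, assuming a trained Keras-style model with a predict method, is sketched below.

```python
import numpy as np

def sliding_window_estimates(model, trajectory, window=1000, stride=100):
    """Apply a trained network to overlapping segments of a long trajectory and
    return the average prediction together with all segment-wise predictions."""
    starts = range(0, len(trajectory) - window + 1, stride)
    X = np.stack([trajectory[s:s + window] for s in starts])[..., np.newaxis]
    predictions = model.predict(X)       # shape: (n_segments, n_outputs)
    return predictions.mean(axis=0), predictions
```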

To further assess the ability of DeepCalib to address deviations in the test data from the data employed in the training, we have studied how the RNN trained for a symmetric double-well quartic potential (employed in Figs. 3 and 4) performs when applied to a trap in which the two wells have different depths. As shown by Fig. S6 in the supplementary material, the performance of DeepCalib in determining the distance between the two wells is essentially unaltered by this asymmetry.

Finally, we discuss the possible concern that for neural-network-based predictions one lacks a measure of the confidence of the calibration.39 This problem emerges because a neural network returns an answer even if asked a "trick question," such as measuring a parameter that does not belong to the physical model under measurement. For example, one could employ a neural network trained to characterize the stiffness of a harmonic trap on data from a bistable potential. As another example, one could employ a neural network trained to determine the potential barrier height in a double-well trap on data from a harmonic trap. In both cases, the neural network will return a value, which is largely meaningless. Therefore, it is useful to be able to detect this meaninglessness. As a solution to this problem, it is possible to quantify the reliability of the neural-network prediction by training an ensemble of neural networks and considering how scattered their predictions are. A high variance between predictions signals a low reliability. We test this method by training 40 different networks on data from a harmonic trap and contrasting the variance of their predictions between the case in which they are actually applied to a harmonic potential and the one in which they are "tricked" into estimating a (meaningless) spring constant for a double-well potential. As shown in Fig. S10 in the supplementary material, the RNNs applied to the wrong model display a much higher variance in their predictions. As a final remark, we also note that the standard methods (e.g., the variance method, the autocorrelation method, or FORMA) would suffer from the same issues if applied in the same way.
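In practice, this ensemble-based check can be implemented along the following lines (assuming a list of independently trained Keras-style models):

```python
import numpy as np

def ensemble_predict(models, X):
    """Mean prediction and its spread across an ensemble of independently trained networks."""
    preds = np.stack([m.predict(X) for m in models])   # (n_models, n_samples, n_outputs)
    return preds.mean(axis=0), preds.std(axis=0)

# A spread that is large compared to typical in-distribution values flags an
# unreliable calibration, e.g., a harmonic-trap network fed double-well data.
```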

We provide DeepCalib on GitHub as an open-source Python software package, which can be readily personalized and optimized for the needs of specific users and applications.37 The user can adapt DeepCalib to the analysis of any force field simply by altering the stochastic differential equation describing the motion of the Brownian particle used to simulate the training datasets. This allows users to train their own RNN to calibrate their specific force field without any prior machine-learning knowledge. The trained RNN can also be saved for use on other software platforms (e.g., MATLAB and LabVIEW). This makes it possible to straightforwardly analyze any force field, even when no standard calibration technique is available, greatly enhancing the range of microscopic systems that can be analyzed and studied.
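
For concreteness, the simulation routine that the user would replace is of the following kind: a minimal sketch of an Euler–Maruyama integration of an overdamped Brownian particle, written here for a harmonic restoring force and with illustrative parameter values (the function name and defaults are not DeepCalib's actual code; the user would substitute the force term and parameters of the force field to be calibrated).

    import numpy as np

    def simulate_trajectory(k, D=2.2e-13, dt=1e-3, n_steps=1000, rng=None):
        # Euler-Maruyama integration of an overdamped Brownian particle subject
        # to a harmonic restoring force F(x) = -k*x (replace this term with the
        # force field to be calibrated). D is the diffusion coefficient and
        # gamma the friction coefficient, related by D = kB*T / gamma.
        kB_T = 4.11e-21                 # thermal energy at room temperature [J]
        gamma = kB_T / D                # friction coefficient [kg/s]
        rng = np.random.default_rng() if rng is None else rng
        x = np.zeros(n_steps)
        for i in range(1, n_steps):
            force = -k * x[i - 1]
            x[i] = (x[i - 1]
                    + force / gamma * dt
                    + np.sqrt(2.0 * D * dt) * rng.standard_normal())
        return x

Once trained, a Keras model can be exported with model.save() and then loaded from other environments that support the saved format.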

We have introduced DeepCalib, a data-driven, neural-network approach for the calibration of microscopic force fields acting on a Brownian particle, and reported its performance. By benchmarking it on simple tasks for which standard techniques are available, we have shown that it outperforms standard methods in challenging conditions involving short and/or low-sampling-rate measurements. We have then demonstrated that it can be straightforwardly applied to non-equilibrium, time-varying force fields, for which no simple standard technique exists. We have also demonstrated that DeepCalib, while trained on simulated data, is able to generalize and successfully calibrate force fields from experimental data. Remarkably, even when the model of the force field used for the training did not perfectly match the experimental one, as in the case of the double-well trap, DeepCalib managed to extract the key features, such as the location of the potential minima and the barrier height, better than the standard methods. This demonstrates its capability to bridge the reality gap between the idealized simulations used for training and the experimental reality. We have also shown DeepCalib's robustness to measurement noise, inhomogeneities in diffusivity, variations of trajectory length, and differences between the models used to generate the training data and the properties of the test data.

DeepCalib is, thus, a flexible method that can be used for a wide variety of calibration tasks. This can be clearly appreciated by noting that no single standard technique could have been used to address all the examples we have presented. Indeed, even for the scenarios that admit standard methods, we had to employ a different tool for each case, whereas DeepCalib only needed different training sets and the minor modification of adjusting the number of outputs to match the number of desired calibration parameters. DeepCalib is therefore ideal to calibrate complex and non-standard force fields from short trajectories, for which advanced specific methods would otherwise have to be developed on a case-by-case basis. Potential areas of application include the real-time calibration of bistable potentials used in information-theory experiments,12 the improvement of the analysis of microscopic heat engines,18 and the prediction of the free energies of biomolecules.66

See the supplementary material for a section with details about the neural-network architecture, a section with a series of tests of the robustness of DeepCalib, a section discussing how to evaluate the reliability of the predictions provided by DeepCalib, and a section with the plots of the experimental data used in this article.

All authors contributed equally to this manuscript. All authors reviewed the final manuscript.

The authors thank Harshith Bachimanchi and Martin Selin for critically revising the manuscript and the software. A.A. and G.V. acknowledge support from the H2020 European Research Council (ERC) Starting Grant ComplexSwimmers (Grant No. 677511). F.C. acknowledges financial support from the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG) through the Collaborative Research Center TRR 102 “Polymers under multiple constraints: Restricted and controlled molecular order and mobility” (Project No. 189853844) and from the ESF and the Free State of Saxony (Junior Research Group UniDyn, Project No. SAB 100382164).

The data and software that support the findings of this study are openly available on GitHub.37

1. P. H. Jones, O. M. Maragò, and G. Volpe, Optical Tweezers: Principles and Applications (Cambridge University, 2015).
2. M. C. Wu, Nat. Photon. 5, 322–324 (2011).
3. M. Braun and F. Cichos, ACS Nano 7, 11200–11208 (2013).
4. J. Gieseler, J. R. Gomez-Solano, A. Magazzù, I. P. Castillo, L. P. García, M. Gironella-Torrent, X. Viader-Godoy, F. Ritort, G. Pesce, A. V. Arzola, K. Volke-Sepulveda, and G. Volpe, arXiv:2004.05246 (2020).
5. J. P. Mills, L. Qie, M. Dao, C. T. Lim, and S. Suresh, Mech. Chem. Biosys. 1, 169–180 (2004).
6. J. Sleep, D. Wilson, R. Simmons, and W. Gratzer, Biophys. J. 77, 3085–3095 (1999).
7. K. H. Su, Q. H. Wei, X. Zhang, J. J. Mock, D. R. Smith, and S. Schultz, Nano Lett. 3, 1087–1090 (2003).
8. M. Yada, J. Yamamoto, and H. Yokoyama, Phys. Rev. Lett. 92, 185501 (2004).
9. S. Paladugu, A. Callegari, Y. Tuna, L. Barth, S. Dietrich, A. Gambassi, and G. Volpe, Nat. Commun. 7, 11403 (2016).
10. J. Liphardt, S. Dumont, S. B. Smith, I. Tinoco, Jr., and C. Bustamante, Science 296, 1832–1835 (2002).
11. D. Collin, F. Ritort, C. Jarzynski, S. B. Smith, I. Tinoco, and C. Bustamante, Nature 437, 231–234 (2005).
12. Y. Jun, M. Gavrilov, and J. Bechhoefer, Phys. Rev. Lett. 113, 190601 (2014).
13. A. Bérut, A. Arakelyan, A. Petrosyan, S. Ciliberto, R. Dillenschneider, and E. Lutz, Nature 483, 187 (2012).
14. S. Toyabe, T. Okamoto, T. Watanabe-Nakayama, H. Taketani, S. Kudo, and E. Muneyuki, Phys. Rev. Lett. 104, 198103 (2010).
15. V. Blickle and C. Bechinger, Nat. Phys. 8, 143–146 (2012).
16. P. A. Quinto-Su, Nat. Commun. 5, 5889 (2014).
17. I. A. Martínez, E. Roldán, L. Dinis, D. Petrov, J. M. R. Parrondo, and R. A. Rica, Nat. Phys. 12, 67–70 (2016).
18. I. A. Martínez, E. Roldán, L. Dinis, and R. A. Rica, Soft Matter 13, 22–36 (2017).
19. A. Argun, J. Soni, L. Dabelow, S. Bo, G. Pesce, R. Eichhorn, and G. Volpe, Phys. Rev. E 96, 052106 (2017).
20. F. Schmidt, A. Magazzù, A. Callegari, L. Biancofiore, F. Cichos, and G. Volpe, Phys. Rev. Lett. 120, 068004 (2018).
21. M. Gavrilov, Y. Jun, and J. Bechhoefer, Rev. Sci. Instrum. 85, 095102 (2014).
22. P. Wu, R. Huang, C. Tischer, A. Jonas, and E.-L. Florin, Phys. Rev. Lett. 103, 108101 (2009).
23. R. Friedrich, J. Peinke, M. Sahimi, and M. R. R. Tabar, Phys. Rep. 506, 87–162 (2011).
24. S. Ciliberto, Phys. Rev. X 7, 021051 (2017).
25. K. Berg-Sørensen and H. Flyvbjerg, Rev. Sci. Instrum. 75, 594–612 (2004).
26. L. P. García, J. D. Pérez, G. Volpe, A. V. Arzola, and G. Volpe, Nat. Commun. 9, 5166 (2018).
27. F. Böttcher, J. Peinke, D. Kleinhans, R. Friedrich, P. G. Lind, and M. Haase, Phys. Rev. Lett. 97, 090603 (2006).
28. S. Türkcan, A. Alexandrou, and J. B. Masson, Biophys. J. 102, 2288–2298 (2012).
29. S. Bera, S. Paul, R. Singh, D. Ghosh, A. Kundu, A. Banerjee, and R. Adhikari, Sci. Rep. 7, 41638 (2017).
30. A. Frishman and P. Ronceray, Phys. Rev. X 10, 021009 (2020).
31. Z. C. Lipton, J. Berkowitz, and C. Elkan, arXiv:1506.00019 (2015).
32. A. Graves, N. Jaitly, and A. Mohamed, in 2013 IEEE Workshop on Automatic Speech Recognition and Understanding (IEEE, 2013), pp. 273–278.
33. Y. Wu et al., arXiv:1609.08144 (2016).
34. S. Han, J. Kang, H. Mao, Y. Hu, X. Li, Y. Li, D. Xie, H. Luo, S. Yao, Y. Wang, H. Yang, and W. J. Dally, in Proceedings of the 2017 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays (2017), pp. 75–84.
35. F. A. Gers, J. Schmidhuber, and F. Cummins, in 9th International Conference on Artificial Neural Networks: ICANN '99 (ICANN, 1999), pp. 850–855.
36. S. Bo, F. Schmidt, R. Eichhorn, and G. Volpe, Phys. Rev. E 100, 010102 (2019).
37. A. Argun, T. Thalheim, S. Bo, F. Cichos, and G. Volpe, http://github.com/softmatterlab/DeepCalib (2020).
38. L. Zdeborová, Nat. Phys. 13, 420–421 (2017).
39. F. Cichos, K. Gustavsson, B. Mehlig, and G. Volpe, Nat. Mach. Intell. 2, 94–103 (2020).
40. M. A. Nielsen, Neural Networks and Deep Learning (Determination Press, San Francisco, CA, 2015), Vol. 2018.
41. F. Chollet et al., “Keras: The Python deep learning library,” Astrophysics Source Code Library (2018).
42. J. L. McClelland, D. E. Rumelhart, and the PDP Research Group, Parallel Distributed Processing: Explorations in the Microstructure of Cognition (MIT Press, Cambridge, 1986).
43. N. Granik, L. E. Weiss, E. Nehme, M. Levin, M. Chein, E. Perlson, Y. Roichman, and Y. Shechtman, Biophys. J. 117, 185–192 (2019).
44. G. Muñoz-Gil, M. A. Garcia-March, C. Manzo, J. D. Martín-Guerrero, and M. Lewenstein, New J. Phys. 22, 013010 (2020).
45. A. Seif, M. Hafezi, and C. Jarzynski, arXiv:1909.12380 (2019).
46. M. D. Hannel, A. Abdulali, M. O'Brien, and D. G. Grier, Opt. Express 26, 15221–15231 (2018).
47. S. Helgadottir, A. Argun, and G. Volpe, Optica 6, 506–513 (2019).
48. G. Barbastathis, A. Ozcan, and G. Situ, Optica 6, 921–943 (2019).
49. B. Midtvedt, E. Olsén, F. Eklund, F. Höök, C. B. Adiels, G. Volpe, and D. Midtvedt, arXiv:2006.11154 (2020).
50. L. J. Gibson, S. Zhang, A. B. Stilgoe, T. A. Nieminen, and H. Rubinsztein-Dunlop, Phys. Rev. E 99, 043304 (2019).
51. I. C. D. Lenton, G. Volpe, A. B. Stilgoe, T. A. Nieminen, and H. Rubinsztein-Dunlop, Mach. Learn.: Sci. Technol. 1, 045009 (2020).
52. M. Braun, A. P. Bregulla, K. Günther, M. Mertig, and F. Cichos, Nano Lett. 15, 5499–5505 (2015).
53. G. Volpe and G. Volpe, Am. J. Phys. 81, 224–231 (2013).
54. S. L. Smith, P. J. Kindermans, C. Ying, and Q. V. Le, arXiv:1711.00489 (2017).
55. M. Fränzl, T. Thalheim, J. Adler, D. Huster, J. Posseckardt, M. Mertig, and F. Cichos, Nat. Methods 16, 611–614 (2019).
56. A. Würger, Rep. Prog. Phys. 73, 126601 (2010).
57. A. P. Bregulla, A. Würger, K. Günther, M. Mertig, and F. Cichos, Phys. Rev. Lett. 116, 188303 (2016).
58. L. I. McCann, M. Dykman, and B. Golding, Nature 402, 785–787 (1999).
59. M. T. Woodside, P. C. Anthony, W. M. Behnke-Parks, K. Larizadeh, D. Herschlag, and S. M. Block, Science 314, 1001–1004 (2006).
60. G. Volpe and D. Petrov, Phys. Rev. Lett. 97, 210603 (2006).
61. G. Volpe, G. Volpe, and D. Petrov, Phys. Rev. E 76, 061118 (2007).
62. V. Blickle, T. Speck, C. Lutz, U. Seifert, and C. Bechinger, Phys. Rev. Lett. 98, 210601 (2007).
63. J. R. Gomez-Solano, A. Petrosyan, S. Ciliberto, R. Chetrite, and K. Gawędzki, Phys. Rev. Lett. 103, 040601 (2009).
64. D. Montiel, H. Cang, and H. Yang, J. Phys. Chem. B 110, 19763–19770 (2006).
65. G. Volpe and J. Wehr, Rep. Prog. Phys. 79, 053901 (2016).
66. G. Hummer and A. Szabo, Proc. Natl. Acad. Sci. U.S.A. 107, 21441–21446 (2010).