By Bayesian inference with Metropolis algorithm, we have succeeded highly accurate estimation of a vibrational frequency as well as an initial phase for a damped oscillation contained in coherent phonon signals. Although a rise and damping profile of such vibrating signal impedes high precision estimation in conventional methods based on plane-waves expansion, the Bayesian inference makes it possible to obtain posterior probability distributions of all parameters in an appropriate physical model. On coherent phonon signals with a signal-to-noise ratio of $\u223c16dB$, the probability distribution width of the vibrating frequency becomes two-orders of magnitude smaller than the Fourier spectral width. In addition, we can also estimate the initial phase with an accuracy on the order of 10 milli-radians as well as other parameters.

## I. INTRODUCTION

Dynamics of damped oscillation in coherent phonon (CP) signals^{1} offers key insights into many sort of phonon-correlated transient phenomena in condensed matters, e.g. phonon-plasmon coupling^{2} and the early stage dynamics in photo-induced phase transition phenomena.^{3} Although conventional Fourier and wavelet transform methods have been used to analyze their vibrational frequencies and dynamics, rise and damping properties of the vibrations make resonance peaks broader because such transform methods over-decompose to sine and cosine waves of quasi-continuous frequencies and mother wavelets. For precise determination of their frequency and other parameters, data fitting by the method of least squares is frequently applied. However, the precision (error) of the fitting parameters can be hardly evaluated statistically from only one measured data.

Recently, it has been demonstrated that Bayesian inference has great abilities in complicated spectral deconvolution.^{4,5} The Bayesian inference can get error distributions as well as optimal values of spectral fitting parameters by using Markov chain Monte Carlo (MCMC) methods.^{6}

In this letter, we introduce the Bayesian inference to analyze the damped oscillation observed in the CP signals at the first time and exemplify that their fitting parameters can be determined with much superior precision especially for the vibrating frequency and the initial phase, in which the error width of vibrating frequency becomes much narrower than the band width determined by the uncertainly principle.

## II. COHERENT PHONON SIGNAL OF BISMUTH THIN FILM AND PHYSICAL MODELING

To demonstrate capabilities of the Bayesian inference, we analyzed a typical CP signal of a bismuth thin film evaporated on a Si (100) substrate. The CP signal was measured at room temperature with a conventional fast-scan method.^{7} Exciting light pulses with averaged power of $\u223c1W$ were provided from a mode-locked Ti:sapphire laser, in which a pulse width is $\u223c100fs$.

A solid line in Fig. 1 shows the CP signal, in which we subtracted a decaying background response associated with carrier excitation and relaxation^{8} from a raw data.

We deal with the experimental data in the positive delay time region and the data set $D\u2261{(t1,y1),\cdots ,(tN,yN)}$ can be modeled by $g(ti;\theta )$ in Eq. (1).

where the parameter set *θ* includes a time-integrated intensity *A*, a vibrating frequency *f*, an initial phase *ϕ*, a rising time-constant $\tau r$ and a damping constant $\tau d$.

## III. BAYESIAN INFERENCE WITH METROPOLIS ALGORITHM

We try to express the experimental data *D* by the model function $g(ti;\theta )$. However, a random noise *n*_{i} (independent on *i*) is undoubtedly superposed during measurements. Therefore, the intensities *y*_{i} are expressed as $yi=g(ti;\theta )+ni$. In the method of least squares, values of the parameter set *θ* are estimated by minimizing a mean square error function $E(\theta )=12N\u2211i=1N[yi\u2212g(ti;\theta )]2$. However, we occasionally encounter with difficulties in estimation of the optimal values and their errors when $g(ti;\theta )$ includes the fitting parameters nonlinearly like Eq. (1).

A joint probability $P(D,\theta )$ for the parameter set $\theta $ and the data set *D* is equivalent to the product of a conditional probability $P(D|\theta )$ and a probability $P(\theta )$, where the $P(D|\theta )$ is a conditional probability of *D* under *θ* given. According to the Bayes’ theorem, the $P(D,\theta )$ is also equivalent to $P(\theta |D)P(D)$. The $P(\theta |D)$ is a conditional probability of *θ* under *D* given and is what we want to evaluate. Consequently, on the assumption that the random noise *n*_{i} obeys a normal (Gaussian) distribution with a standard deviation $\sigma n$ in amplitude and a mean value of zero, the noise distribution $P(D|\theta )$ (likelihood) is directly proportional to $exp[\u2212NE(\theta )/\sigma n2]$ because the noise *n*_{i} is the same with the residual of *y*_{i} from $g(ti;\theta )$. As the result, since the data set *D* has been already measured and the *P*(*D*) becomes constant, we can evaluate the posterior probability distribution $P(\theta |D)$ on the basis of $E(\theta )$, $\sigma n$ and prior probability distribution $P(\theta )$ as described in Eq. (2).^{4}

When the $P(\theta )$ is continuous uniform distribution, the $P(\theta |D)$ is proportional to $P(D|\theta )$. Consequently, values of *θ* are distributed in the parameter space of $E(\theta )$ according to the $\sigma n$ of the noise.

For getting distributions of $P(\theta |D)$ for all parameters, we used Metropolis algorithm in MCMC methods.^{4,6} During iterations (*j*) for updating Markov-chained parameters, candidate values $\theta \u2032$ are generated randomly by normal distributions $N(\theta j,\sigma \theta )$ from the previous values $\theta j$ with standard deviations $\sigma \theta $. When a $P(\theta \u2032|D)$ is larger than the $P(\theta j|D)$, the candidate value $\theta \u2032$ is accepted unconditionally as a value in the next step. This updating corresponds to minimization of $E(\theta )$ in the same way as the method of least squares. On the Metropolis algorithm, even when $P(\theta \u2032|D)$ is smaller than $P(\theta j|D)$, the $\theta \u2032$ is also accepted randomly with the acceptance ratio,

in which it is not necessary to do the marginalization of $P(D|\theta )$ in the whole parameter space for getting *P*(*D*) in the denominator of Eq. (2) because, as seen in Eq. (3), the acceptance or rejection of the parameter values is decided by the ratio of the posterior probabilities in the MCMC methods. By using such MCMC method, we can get the distributions of $P(\theta |D)$ as histograms of the values during iteration steps.

Since the $P(\theta |D)$ distributions are considered to become broader with increasing $\sigma n$ in Eq. (2), we determined its value through repeated Bayesian inference with changing $\sigma n$ on a criterion that the root mean squared error of the obtained ${g(ti;\theta )}$ from the data set {*y*_{i}} is balanced with $\sigma n$. As a result, $\sigma n$ is $3.1\xd710\u22125$ in signal amplitude and it corresponds to a signal-to-noise (S/N) ratio of $\u223c16dB$.

## IV. RESULTS AND DISCUSSION

Figures 2 show value histories of all parameters during iteration steps on the Metropolis algorithm with $\sigma n=3.1\xd710\u22125$ in Eq. (2). These values start from unreliable initial values and drift toward the minimum point of the mean square error $E(\theta )$ in the parameter set *θ*. After such burn-in phase, these values fluctuate around the minimum point of $E(\theta )$ through iteration. The distribution of $P(\theta |D)$ is the normalized histogram of fluctuated values of each parameter during MCMC iterations. On the Bayesian inference, precisions of the respective parameters can be evaluated by distribution widths of $P(\theta |D)$ and these widths are decisively governed by the standard deviation $\sigma n$ of the random noise *n*_{i}. We evaluate the $P(\theta |D)$ distributions from sampling of these fluctuated values in the gray area of Figs. 2.

Figures 3(a)∼(e) show the $P(\theta |D)$ for all parameters included in Eq. (1). On the assumption that these distributions are normal distributions, probable errors $\u03f5\theta $ of these parameters can be obtained as the interquartile range and they are indicated with counter-arrows in Fig. 3. The physical model function $g(ti;\theta )$ in Eq. (1) is depicted by a gray shadow in Fig. 1, which is calculated with mean values of the respective parameters obtained from the $P(\theta |D)$ distributions in Fig. 3. This $g(ti;\theta )$ can reproduce well the experimental result as seen in Fig. 1.

By the Bayesian inference, we succeed high-precision estimation of all fitting parameters *θ* as summarized in Table I. It is worth noticing that, even under the circumstance of rather poor S/N ratio ($\u223c16dB$), the distribution width $\u03f5f$ of the vibrating frequency is about two-orders of magnitude smaller than the spectral bandwidth in the conventional Fourier transform (FT) spectrum as seen in Fig. 4(a). Although the dominant peak ($\u223c2.9THz$) in the FT spectrum has a broad bandwidth ($160GHz$) due to the rise and damping properties of this vibrating mode, the *P*(*f*|*D*) has a very sharp profile like a *δ*-function as indicated with a black shadow in Fig. 4(a).

Bayesian inference . | ||
---|---|---|

θ
. | mean value ± $\u03f5\theta $ . | unit . |

A | 8.63 ± 0.10 | |

F | 2930.04 ± 0.93 | GHz |

ϕ | 5.219 ± 0.012 | radian |

$\tau r$ | 80.9 ± 4.8 | fs |

$\tau d$ | 2811 ± 51 | fs |

Bayesian inference . | ||
---|---|---|

θ
. | mean value ± $\u03f5\theta $ . | unit . |

A | 8.63 ± 0.10 | |

F | 2930.04 ± 0.93 | GHz |

ϕ | 5.219 ± 0.012 | radian |

$\tau r$ | 80.9 ± 4.8 | fs |

$\tau d$ | 2811 ± 51 | fs |

In addition, the Bayesian inference serves a considerable advantage for elucidation of the CP phonon dynamics. Open circles in Fig. 4(b) show an initial phase spectrum obtained by the FT. However the initial phase drastically changes at the resonance and its spectrum has no meaning in the case of damped oscillations because the damped oscillation is over-decomposed to improper continuous plane waves by the FT. In contrast, the Bayesian inference can estimate the initial phase *ϕ* with the several milli-radians order as seen in Table I. Since exciting mechanisms of coherent phonons can be classified by initial phase,^{9} such highly accurate estimation of *ϕ* is extremely effective.

Conventional analysis methods such as Fourier transformation is expansion of arbitrary waveform by using orthogonal normalized functions on the basis of completeness principle, in which the causality is not included. In contrast, the Bayesian inference requires an appropriate physical model meaning the causality to explain the experimental data. In other words, the Bayesian inference is to search for the highest posterior probability point in the parameter *θ* space and to evaluate the distribution of the $P(\theta |D)$ around the maximum point. Although the magnitude of $P(\theta |D)$ distribution width is determined on the basis of the noise strength $\sigma n$ in Eq. (2), it strongly depends on the morphology of the error function $E(\theta )$ in the parameter space, i.e., the curvature of $E(\theta )$ at the global minimum point of $E(\theta )$. In our case, since the global minimum point of *E*(*f*) is very deep and has steeply quasi-parabolic curve, the *P*(*f*|*D*) becomes sharp as shown in Fig. 4(a).

## V. CONCLUSION

We have introduced the Bayesian inference to analyze the transient profile of the CP signal at the first time. On the basis of the physical model with a damped oscillation, we can determine all parameters with ultra-high precisions. Especially, the precision of the vibrating frequency improves in two-orders of magnitude compared with the limitation of the conventional Fourier transform method at the S/N ratio of $\u223c16dB$. In addition, we can also estimate the vibrating initial phase with high precision.

## ACKNOWLEDGMENTS

This work was supported by JSPS KAKENHI Grant Numbers 25280090, 16H01552, 16H04002, and 16K13824.