By Bayesian inference with Metropolis algorithm, we have succeeded highly accurate estimation of a vibrational frequency as well as an initial phase for a damped oscillation contained in coherent phonon signals. Although a rise and damping profile of such vibrating signal impedes high precision estimation in conventional methods based on plane-waves expansion, the Bayesian inference makes it possible to obtain posterior probability distributions of all parameters in an appropriate physical model. On coherent phonon signals with a signal-to-noise ratio of 16dB, the probability distribution width of the vibrating frequency becomes two-orders of magnitude smaller than the Fourier spectral width. In addition, we can also estimate the initial phase with an accuracy on the order of 10 milli-radians as well as other parameters.

Dynamics of damped oscillation in coherent phonon (CP) signals1 offers key insights into many sort of phonon-correlated transient phenomena in condensed matters, e.g. phonon-plasmon coupling2 and the early stage dynamics in photo-induced phase transition phenomena.3 Although conventional Fourier and wavelet transform methods have been used to analyze their vibrational frequencies and dynamics, rise and damping properties of the vibrations make resonance peaks broader because such transform methods over-decompose to sine and cosine waves of quasi-continuous frequencies and mother wavelets. For precise determination of their frequency and other parameters, data fitting by the method of least squares is frequently applied. However, the precision (error) of the fitting parameters can be hardly evaluated statistically from only one measured data.

Recently, it has been demonstrated that Bayesian inference has great abilities in complicated spectral deconvolution.4,5 The Bayesian inference can get error distributions as well as optimal values of spectral fitting parameters by using Markov chain Monte Carlo (MCMC) methods.6 

In this letter, we introduce the Bayesian inference to analyze the damped oscillation observed in the CP signals at the first time and exemplify that their fitting parameters can be determined with much superior precision especially for the vibrating frequency and the initial phase, in which the error width of vibrating frequency becomes much narrower than the band width determined by the uncertainly principle.

To demonstrate capabilities of the Bayesian inference, we analyzed a typical CP signal of a bismuth thin film evaporated on a Si (100) substrate. The CP signal was measured at room temperature with a conventional fast-scan method.7 Exciting light pulses with averaged power of 1W were provided from a mode-locked Ti:sapphire laser, in which a pulse width is 100fs.

A solid line in Fig. 1 shows the CP signal, in which we subtracted a decaying background response associated with carrier excitation and relaxation8 from a raw data.

FIG. 1.

A solid line represents a target data for the Bayesian inference, which is a CP signal of a bismuth thin film. A reproduced result is indicated by a gray shadow.

FIG. 1.

A solid line represents a target data for the Bayesian inference, which is a CP signal of a bismuth thin film. A reproduced result is indicated by a gray shadow.

Close modal

We deal with the experimental data in the positive delay time region and the data set D{(t1,y1),,(tN,yN)} can be modeled by g(ti;θ) in Eq. (1).

g(ti;θ)=Aτdτr[exp(tiτd)exp(tiτr)]×sin(2πfti+ϕ),when(ti0)(i=1N),θ{A,f,ϕ,τr,τd},
(1)

where the parameter set θ includes a time-integrated intensity A, a vibrating frequency f, an initial phase ϕ, a rising time-constant τr and a damping constant τd.

We try to express the experimental data D by the model function g(ti;θ). However, a random noise ni (independent on i) is undoubtedly superposed during measurements. Therefore, the intensities yi are expressed as yi=g(ti;θ)+ni. In the method of least squares, values of the parameter set θ are estimated by minimizing a mean square error function E(θ)=12Ni=1N[yig(ti;θ)]2. However, we occasionally encounter with difficulties in estimation of the optimal values and their errors when g(ti;θ) includes the fitting parameters nonlinearly like Eq. (1).

A joint probability P(D,θ) for the parameter set θ and the data set D is equivalent to the product of a conditional probability P(D|θ) and a probability P(θ), where the P(D|θ) is a conditional probability of D under θ given. According to the Bayes’ theorem, the P(D,θ) is also equivalent to P(θ|D)P(D). The P(θ|D) is a conditional probability of θ under D given and is what we want to evaluate. Consequently, on the assumption that the random noise ni obeys a normal (Gaussian) distribution with a standard deviation σn in amplitude and a mean value of zero, the noise distribution P(D|θ) (likelihood) is directly proportional to exp[NE(θ)/σn2] because the noise ni is the same with the residual of yi from g(ti;θ). As the result, since the data set D has been already measured and the P(D) becomes constant, we can evaluate the posterior probability distribution P(θ|D) on the basis of E(θ), σn and prior probability distribution P(θ) as described in Eq. (2).4 

P(θ|D)=P(D|θ)P(θ)P(D)exp[Nσn2E(θ)]P(θ).
(2)

When the P(θ) is continuous uniform distribution, the P(θ|D) is proportional to P(D|θ). Consequently, values of θ are distributed in the parameter space of E(θ) according to the σn of the noise.

For getting distributions of P(θ|D) for all parameters, we used Metropolis algorithm in MCMC methods.4,6 During iterations (j) for updating Markov-chained parameters, candidate values θ are generated randomly by normal distributions N(θj,σθ) from the previous values θj with standard deviations σθ. When a P(θ|D) is larger than the P(θj|D), the candidate value θ is accepted unconditionally as a value in the next step. This updating corresponds to minimization of E(θ) in the same way as the method of least squares. On the Metropolis algorithm, even when P(θ|D) is smaller than P(θj|D), the θ is also accepted randomly with the acceptance ratio,

rP(θ|D)P(θj|D)=exp[Nσn2(E(θ)E(θj))],
(3)

in which it is not necessary to do the marginalization of P(D|θ) in the whole parameter space for getting P(D) in the denominator of Eq. (2) because, as seen in Eq. (3), the acceptance or rejection of the parameter values is decided by the ratio of the posterior probabilities in the MCMC methods. By using such MCMC method, we can get the distributions of P(θ|D) as histograms of the values during iteration steps.

Since the P(θ|D) distributions are considered to become broader with increasing σn in Eq. (2), we determined its value through repeated Bayesian inference with changing σn on a criterion that the root mean squared error of the obtained {g(ti;θ)} from the data set {yi} is balanced with σn. As a result, σn is 3.1×105 in signal amplitude and it corresponds to a signal-to-noise (S/N) ratio of 16dB.

Figures 2 show value histories of all parameters during iteration steps on the Metropolis algorithm with σn=3.1×105 in Eq. (2). These values start from unreliable initial values and drift toward the minimum point of the mean square error E(θ) in the parameter set θ. After such burn-in phase, these values fluctuate around the minimum point of E(θ) through iteration. The distribution of P(θ|D) is the normalized histogram of fluctuated values of each parameter during MCMC iterations. On the Bayesian inference, precisions of the respective parameters can be evaluated by distribution widths of P(θ|D) and these widths are decisively governed by the standard deviation σn of the random noise ni. We evaluate the P(θ|D) distributions from sampling of these fluctuated values in the gray area of Figs. 2.

FIG. 2.

Value histories of the parameters A, f, ϕ, τr and τd during iteration steps j. ▶ at j = 0 indicate initial values before iterations. The P(θ|D) distributions are sampled in the gray shadow areas after a burn-in phase.

FIG. 2.

Value histories of the parameters A, f, ϕ, τr and τd during iteration steps j. ▶ at j = 0 indicate initial values before iterations. The P(θ|D) distributions are sampled in the gray shadow areas after a burn-in phase.

Close modal

Figures 3(a)∼(e) show the P(θ|D) for all parameters included in Eq. (1). On the assumption that these distributions are normal distributions, probable errors ϵθ of these parameters can be obtained as the interquartile range and they are indicated with counter-arrows in Fig. 3. The physical model function g(ti;θ) in Eq. (1) is depicted by a gray shadow in Fig. 1, which is calculated with mean values of the respective parameters obtained from the P(θ|D) distributions in Fig. 3. This g(ti;θ) can reproduce well the experimental result as seen in Fig. 1.

FIG. 3.

(a)∼(e) P(θ|D) distributions of the parameter set θ in Eq. (1). Counter-arrows and values mean the probable errors in the respective parameters, and dotted vertical lines are mean values of the P(θ|D) distributions.

FIG. 3.

(a)∼(e) P(θ|D) distributions of the parameter set θ in Eq. (1). Counter-arrows and values mean the probable errors in the respective parameters, and dotted vertical lines are mean values of the P(θ|D) distributions.

Close modal

By the Bayesian inference, we succeed high-precision estimation of all fitting parameters θ as summarized in Table I. It is worth noticing that, even under the circumstance of rather poor S/N ratio (16dB), the distribution width ϵf of the vibrating frequency is about two-orders of magnitude smaller than the spectral bandwidth in the conventional Fourier transform (FT) spectrum as seen in Fig. 4(a). Although the dominant peak (2.9THz) in the FT spectrum has a broad bandwidth (160GHz) due to the rise and damping properties of this vibrating mode, the P(f|D) has a very sharp profile like a δ-function as indicated with a black shadow in Fig. 4(a).

TABLE I.

Optimal values and distribution widths (probable errors) ϵθ in Eq. (1) obtained by the Bayesian inference on the experimental data in Fig. (1).

Bayesian inference
θmean value ± ϵθunit
A 8.63 ± 0.10  
F 2930.04 ± 0.93 GHz 
ϕ 5.219 ± 0.012 radian 
τr 80.9 ± 4.8 fs 
τd 2811 ± 51 fs 
Bayesian inference
θmean value ± ϵθunit
A 8.63 ± 0.10  
F 2930.04 ± 0.93 GHz 
ϕ 5.219 ± 0.012 radian 
τr 80.9 ± 4.8 fs 
τd 2811 ± 51 fs 
FIG. 4.

(a) A FT magnitude and a P(f|D) distribution in Fig. 3(b)A counter-arrow and a value mean a half width at half maximum (HWHM) of the FT magnitude. (b) An initial phase of a FT spectrum and a mean value of a P(ϕ|D) in Table I.

FIG. 4.

(a) A FT magnitude and a P(f|D) distribution in Fig. 3(b)A counter-arrow and a value mean a half width at half maximum (HWHM) of the FT magnitude. (b) An initial phase of a FT spectrum and a mean value of a P(ϕ|D) in Table I.

Close modal

In addition, the Bayesian inference serves a considerable advantage for elucidation of the CP phonon dynamics. Open circles in Fig. 4(b) show an initial phase spectrum obtained by the FT. However the initial phase drastically changes at the resonance and its spectrum has no meaning in the case of damped oscillations because the damped oscillation is over-decomposed to improper continuous plane waves by the FT. In contrast, the Bayesian inference can estimate the initial phase ϕ with the several milli-radians order as seen in Table I. Since exciting mechanisms of coherent phonons can be classified by initial phase,9 such highly accurate estimation of ϕ is extremely effective.

Conventional analysis methods such as Fourier transformation is expansion of arbitrary waveform by using orthogonal normalized functions on the basis of completeness principle, in which the causality is not included. In contrast, the Bayesian inference requires an appropriate physical model meaning the causality to explain the experimental data. In other words, the Bayesian inference is to search for the highest posterior probability point in the parameter θ space and to evaluate the distribution of the P(θ|D) around the maximum point. Although the magnitude of P(θ|D) distribution width is determined on the basis of the noise strength σn in Eq. (2), it strongly depends on the morphology of the error function E(θ) in the parameter space, i.e., the curvature of E(θ) at the global minimum point of E(θ). In our case, since the global minimum point of E(f) is very deep and has steeply quasi-parabolic curve, the P(f|D) becomes sharp as shown in Fig. 4(a).

We have introduced the Bayesian inference to analyze the transient profile of the CP signal at the first time. On the basis of the physical model with a damped oscillation, we can determine all parameters with ultra-high precisions. Especially, the precision of the vibrating frequency improves in two-orders of magnitude compared with the limitation of the conventional Fourier transform method at the S/N ratio of 16dB. In addition, we can also estimate the vibrating initial phase with high precision.

This work was supported by JSPS KAKENHI Grant Numbers 25280090, 16H01552, 16H04002, and 16K13824.

1.
R.
Merlin
,
Solid State Commun.
102
,
207
(
1997
).
2.
H.
Takeuchi
,
S.
Tsuruta
, and
M.
Nakayama
,
J. Appl. Phys.
110
,
013515
(
2011
).
3.
S.
Iwai
and
H.
Okamoto
,
J. Phys. Soc. Japan
75
,
011007
(
2006
).
4.
K.
Nagata
,
S.
Sugita
, and
M.
Okada
,
Neural Networks
28
,
82
(
2012
).
5.
K.
Iwamitsu
,
S.
Aihara
,
M.
Okada
, and
I.
Akai
,
J. Phys. Soc. Japan
85
,
094716
(
2016
).
6.
Y.
Ogata
,
Ann. Inst. Statist. Math.
42
,
403
(
1990
).
7.
T.
Kawamoto
,
Y.
Yoshizaki
,
K.
Yamashiro
,
K.
Iwamitsu
,
T.
Shimamoto
, and
I.
Akai
, in
Proceedings of the 12th Asia Pacific Physics Conference (APPC12)
, Vol. 012071 (
2014
) p.
5
.
8.
T. K.
Cheng
,
S. D.
Brorson
,
A. S.
Kazeroonian
,
J. S.
Moodera
,
G.
Dresselhaus
,
M. S.
Dresselhaus
, and
E. P.
Ippen
,
Appl. Phys. Lett.
57
,
1004
(
1990
).
9.
H. J.
Zeiger
,
J.
Vidal
,
T. K.
Cheng
,
E. P.
Ippen
,
G.
Dresselhaus
, and
M. S.
Dresselhaus
,
Phys. Rev. B
45
,
768
(
1992
).