Electric field waveforms of light carry rich information about dynamical events on a broad range of timescales. The insight that can be reached from their analysis, however, depends on the accuracy of retrieval from noisy data. In this article, we present a novel approach for waveform retrieval based on supervised deep learning. We demonstrate the performance of our model by comparison with conventional denoising approaches, including the wavelet transform and Wiener filtering. The model leverages the nonlinearity of deep learning to achieve enhanced retrieval precision. The results open a path toward an improved understanding of physical and chemical phenomena in field-resolved spectroscopy.

Direct access to the temporal evolution of few-cycle optical pulses in the early 2000s enabled the investigation of electronic processes in various media1–5 under the emerging field of attosecond science. Few-cycle pulses in the optical spectral range typically have pulse envelope durations of sub-femtosecond to tens of femtoseconds. To directly measure the carrier field of such pulses, a shorter gate is required in the sampling process.6 Attosecond streaking spectroscopy is considered the gold standard for sampling visible light. The approach employs attosecond extreme ultraviolet pulses, produced from high-harmonic generation (HHG), as a gate.4,7,8 The extreme nonlinearity required for HHG, however, is a significant disadvantage of the technique since the process is highly inefficient. Moreover, the technique necessitates working under costly vacuum infrastructure, restricting its general accessibility. These limitations led to the emergence of novel approaches for characterizing few-cycle pulses, such as nonlinear photoconductive sampling (NPS),9–11 linear photoconductive sampling,12,13 TIPTOE,14–16 and electro-optic sampling (EOS) using visible and ultraviolet pulses,17,18 triggering a new era of spectroscopic19,20 and microscopic21,22 techniques. Nevertheless, such measurements trade ease of implementation for a worsening of the signal-to-noise ratio (SNR).11,16,23 This worsened SNR largely arises from weak signals that must be measured with high gain, which makes them particularly susceptible to ambient electronic noise despite circuit grounding, shielding, and the selection of low-noise components.

A traditional denoising technique for waveform retrieval is the wavelet transform (WT).24 This is a linear time-frequency signal processing technique that decomposes a given signal into a set of wavelet coefficients called approximation and detail coefficients. It can be used to denoise data through iterative thresholding,25 where the approximation filters act as averaging filters and the detail filters extract high-frequency information. Here, soft thresholding26 is implemented by setting coefficients with magnitudes below 2σ to zero, suppressing the very high-frequency fluctuations that are typical of noise. The data are then transformed back at the end of the iterative process, producing the WT-denoised data.25,27 Another approach using a genetic algorithm has recently been proposed for denoising terahertz signals.28 The method is based on removing noise-induced spikes and oscillations in the transfer function (i.e., the complex ratio between sample and reference signals in the frequency domain). Machine learning, in contrast, does not rely on assumptions about the transfer function; supervised learning has been shown to enhance time-domain signals in attosecond spectroscopy.29,30
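As a concrete illustration of this baseline, the sketch below performs wavelet denoising by soft thresholding, assuming the PyWavelets package, the Symlets 8 wavelet, and two decomposition levels (the settings used for the comparison later in the text). The median-absolute-deviation noise estimate and the single-pass thresholding of the detail coefficients are illustrative choices rather than the authors' exact implementation.

```python
import numpy as np
import pywt  # PyWavelets


def wavelet_denoise(signal, wavelet="sym8", level=2):
    """Denoise a 1D trace by soft-thresholding its wavelet coefficients."""
    # Decompose into approximation and detail coefficients.
    coeffs = pywt.wavedec(signal, wavelet, level=level)
    # Estimate the noise scale sigma from the finest detail coefficients
    # (median-absolute-deviation estimator, an assumed choice) and
    # soft-threshold at 2*sigma as stated in the text.
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745
    threshold = 2.0 * sigma
    # Soft-threshold the detail coefficients; keep the approximation as-is.
    denoised = [coeffs[0]] + [
        pywt.threshold(c, threshold, mode="soft") for c in coeffs[1:]
    ]
    # Transform back to the time domain.
    return pywt.waverec(denoised, wavelet)
```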

In this article, the extraction of few-cycle waveforms from measurements with low signal-to-noise ratios is demonstrated by employing a denoising algorithm based on a deep neural network architecture. Machine learning-based denoising (MLBD) is a particularly beneficial approach in scenarios where noise is intrinsic to the data acquisition electronics and has found applications ranging from photonics,31–34 astrophysics, and astronomy35–38 to medicine.39–42 Here, a multilayer convolutional neural network (CNN) is trained on a large synthetic dataset (ntrain = 57 600 and ntest = 6400) to learn a mapping from noisy input to clean output. This procedure is akin to conventional denoising of data with pre-determined spectral filters, as in Fourier and wavelet filtering.43 Here, however, the filters are learned directly from the training data rather than being fixed in advance. Following a supervised learning approach, the CNN is presented with paired noisy ("Xtrain") and clean ("Ytrain") waveforms. This permits the model to learn the structure of signal vs noise without foreknowledge of the underlying mathematical relationship between the two.

In this article, a model that is trained entirely on a simulated dataset yet capable of tackling measured data is developed and implemented. The model is compared with traditional signal processing methods, such as wavelet analysis and Wiener filtering. Despite being trained on simulated data, this approach is able to reconstruct ultrashort pulses from low-SNR experimental data.

Figure 1 illustrates the performance of the model on a sample from the test set (ntest = 6400). The coefficient of determination calculated for the model over the full test set is R2 = 0.99. To test the overall performance of the model, it is compared with the wavelet transform (WT) and Wiener filtering (WF) as denoising methods. In the WT, the denoising is performed using two layers of filters and the Symlets 8 family of wavelets. Wiener filtering, on the other hand, takes into account the statistical properties of the noise present in the signal by estimating the power spectral density of the noise relative to that of the signal, where the noise is evaluated as the average of the local variance of the input signal. A comparison of the root-mean-square (RMS) errors of the three denoising methods is presented in Fig. 2. It is evident from the plot that the MLBD approach outperforms both the WT and the WF.
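For reproducibility, both classical baselines can be approximated with standard Python tools for an arbitrary 1D array noisy_waveform; the sketch below assumes SciPy's local-statistics Wiener filter, which estimates the noise power from the average local variance as described above, together with the wavelet routine sketched earlier. The window size is an illustrative assumption.

```python
from scipy.signal import wiener

# Local-statistics Wiener filter: the noise power is estimated internally
# as the average of the local variance of the input (window size assumed).
denoised_wf = wiener(noisy_waveform, mysize=15)

# Wavelet-transform baseline (see the sketch above), for comparison.
denoised_wt = wavelet_denoise(noisy_waveform)
```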

FIG. 1.

A sample of the waveforms extracted by the CNN model, Ypred, is shown in red. The noisy waveforms, Xtest, fed into the model are shown in light blue. The blue line represents the target waveforms, Ytest. The Pearson correlation coefficient r is calculated for each waveform with excellent results.

FIG. 2.

A histogram distribution representing the RMS error obtained by calculating $\sqrt{\overline{(Y_{\mathrm{test}}-Y_{x})^{2}}}$, where $Y_x$ stands for estimator results, showing that the performance of the machine learning model (red) surpasses that of the wavelet transform (blue) and Wiener filtering (gray). Note that the waveforms Ytest and Yx are normalized individually prior to calculating the RMS error to remove any error bias due to the decrease in amplitude when using the WT and WF.


We benchmark our model against laboratory-measured data and compare it to time-frequency Fourier filtering with zero-padding to increase the frequency resolution of the transformed signal. A broadband 4.2 fs white-light waveform spanning 500–1150 nm was obtained from the broadened output of a titanium-sapphire chirped pulse amplifier and is depicted in Fig. 3 (light blue). The waveform is measured by means of NPS, which relies on strong-field interaction (multiphoton absorption) to generate free carriers in a fused silica substrate.11 The free carriers generated by the interaction form a short gating event, which can be used to measure the electric field of a weak test pulse. Generally speaking, the NPS technique relies on the use of transimpedance amplifiers9,11 due to the small signals obtained with the technique. The signal-to-noise ratio of the waveform in Fig. 3 remains somewhat low (SNR = 8.8), which can be attributed to several factors, such as laser intensity fluctuations, timing jitter, gain electronics, and spatial overlap variability. Nevertheless, the MLBD model manages to extract the waveform from the data, as shown in red, with a Pearson correlation coefficient of r = 0.97 calculated between the cleaned laboratory data and the model prediction. Physics-informed learning, e.g., by imposing a limited frequency range commensurate with the incident light's bandwidth, would further improve the SNR of the retrieval.
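The conventional reference processing applied to the measured trace (zero padding plus a super-Gaussian band-pass, cf. Fig. 3) can be sketched as follows; only the frequency-domain part of the filtering is shown, and the sampling step, filter edges, and filter order below are illustrative assumptions rather than the values used by the authors.

```python
import numpy as np


def supergauss_bandpass(signal, dt_fs, f_lo_thz=150.0, f_hi_thz=1000.0,
                        order=6, pad_factor=4):
    """Zero-pad a 1D trace and apply a super-Gaussian band-pass in frequency."""
    n = len(signal)
    n_pad = pad_factor * n  # zero padding refines the frequency grid
    spectrum = np.fft.rfft(signal, n=n_pad)
    freqs_thz = np.fft.rfftfreq(n_pad, d=dt_fs) * 1e3  # fs^-1 -> THz
    f0 = 0.5 * (f_lo_thz + f_hi_thz)   # pass-band center
    bw = 0.5 * (f_hi_thz - f_lo_thz)   # pass-band half-width
    window = np.exp(-(np.abs((freqs_thz - f0) / bw) ** (2 * order)))  # super-Gaussian
    return np.fft.irfft(spectrum * window, n=n_pad)[:n]
```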

FIG. 3.

Model denoising of laboratory-measured data (light blue). The cleaned laboratory data (blue) are processed using a super-Gaussian band-pass filter in both the time and frequency domains, as well as zero padding in the frequency domain. Obtaining the denoised plot (red) in the relevant frequency range required 0.13 ms using an Apple M1 Pro chip. The measured white-light waveform contains frequencies below 375 THz (800 nm), constituting a frequency range previously unexplored in the learning process [see Eq. (1)]. High-frequency noise above 1 PHz (300 nm) is excluded from the assessment.

In this section, we elucidate the inner workings of our MLBD model. Machine learning (ML) typically requires a large and diverse dataset in order to develop a robust algorithm that can generalize well to new and unseen data. The dataset size and variety permit the model to avoid the pitfall of over-fitting, a situation wherein the model becomes adept at performing the required task on the training dataset only while struggling with new data; effectively, it only memorizes the training set. In an ideal scenario, the CNN model would be trained on fully characterized experimental waveforms. However, this scenario is not feasible as it necessitates experimentally generating tens of thousands of traces with different wavelengths, phases, and durations. For these reasons, artificially generated data are used to augment the dataset and rectify the lack of waveform data diversity. A waveform E(t) may be expressed mathematically by
$$E(t) = A \exp\!\left[-\frac{(t - t_R)^2}{2\sigma_R^2}\right] \cos\!\left[\omega_R (t - t_R) + \mathrm{CEP}_R\right]. \qquad (1)$$
Each individual quantity is defined as follows:

Note that A ∈ [0, 1] is selected uniformly at random. The subscript R denotes a randomly generated value drawn from a range: tR ∈ [−65, 65] fs, σR ∈ [5, 35] fs, ωR ∈ [375, 750] THz, and CEPR ∈ [0, 2π). This methodology randomly generates a set of waveforms with different arrival times tR, widths σR, central frequencies ωR, and phases CEPR. Figure 4 depicts a sample of the randomly generated waveforms (blue) following the expression in Eq. (1). Random Gaussian noise is added to the generated waveforms to form the noisy waveforms (light blue). The total sample size generated by this method is n = 64 000 waveforms. In preparation for the learning process, the dataset is randomly split into a training set containing ntrain = 57 600 waveforms and a test set containing ntest = 6400 waveforms. Note that the ntest = 6400 waveforms are "held out" during the learning process and are not re-split from the n = 64 000 set at each learning iteration.
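A minimal sketch of this synthetic-data generation is shown below, assuming the reconstructed form of Eq. (1). The parameter ranges follow the text, while the time grid, the additive-noise level, and the random-seed and split bookkeeping are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
n_total, n_points = 64_000, 1000
t = np.linspace(-100.0, 100.0, n_points)  # time axis in fs (assumed span)


def random_waveform():
    """Gaussian-envelope carrier wave with randomized parameters, cf. Eq. (1)."""
    A = rng.uniform(0.0, 1.0)                     # amplitude
    t_R = rng.uniform(-65.0, 65.0)                # arrival time, fs
    sigma_R = rng.uniform(5.0, 35.0)              # envelope width, fs
    omega_R = 2 * np.pi * rng.uniform(375.0, 750.0) * 1e-3  # rad/fs from THz
    cep_R = rng.uniform(0.0, 2 * np.pi)           # carrier-envelope phase
    envelope = np.exp(-((t - t_R) ** 2) / (2 * sigma_R**2))
    return A * envelope * np.cos(omega_R * (t - t_R) + cep_R)


# Clean targets Y and noisy inputs X (additive Gaussian noise, assumed level).
Y = np.stack([random_waveform() for _ in range(n_total)]).astype(np.float32)
X = Y + rng.normal(scale=0.2, size=Y.shape).astype(np.float32)

# Random split: 57 600 training waveforms and 6400 held-out test waveforms.
idx = rng.permutation(n_total)
train_idx, test_idx = idx[:57_600], idx[57_600:]
X_train, Y_train = X[train_idx], Y[train_idx]
X_test, Y_test = X[test_idx], Y[test_idx]
```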

FIG. 4.

Randomly selected waveforms from the training dataset. Xtrain represents the noisy data, and Ytrain represents the clean data. The dataset comprises a variety of waveforms with different characteristics. Note that the plotted waveforms are normalized.


In ML, a feature represents a quantifiable attribute or characteristic of a given phenomenon.44 In our case of measured waveforms, each sampled point is considered a feature, and an ML model must learn to identify the locations where the field exists in the measurement. During a measurement, some of the sampled points fluctuate only by noise, while others fluctuate by both signal and noise (i.e., where the field exists in time). To mitigate any statistical biasing that may be introduced into the model due to the relative difference between these two types of fluctuations in the dataset, we employ the common preprocessing step known as feature scaling. This step involves normalizing the value ranges assigned to the features within a dataset, enhancing the performance and accuracy of the algorithm.45 In this article, the data are scaled using MaxAbsScaler() such that Xscaled = X/max(|X|), which scales all sampled numerical features (time-domain sample points) to the range [−1, 1). This approach increases the visibility of the noise away from the pulse center but ensures that each sampled point is presented with an equal probability of lying in the range [−1, 1) over the training dataset.
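In practice, this scaling step can be carried out, for example, with scikit-learn's MaxAbsScaler, whose name the text uses; the short sketch below assumes the X_train and X_test arrays from the generation sketch above.

```python
from sklearn.preprocessing import MaxAbsScaler

# Each feature (time-domain sample point) is divided by its maximum absolute
# value over the training set, mapping the values into the range [-1, 1].
scaler = MaxAbsScaler()
X_train_scaled = scaler.fit_transform(X_train)  # fit on the training set only
X_test_scaled = scaler.transform(X_test)        # reuse the training-set scale
```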

The model is a sequential one-dimensional CNN. It is constructed in Python using the Keras library.46 The model consists of convolution and pooling layers followed by a dropout layer, deconvolution layers, and finally a fully connected layer with nonlinear tanh() activation. Each convolution layer takes in an input vector Xin = (x1, x2, …, xn) of length n = 1000 and applies a one-dimensional filter, or kernel, wmi of length m = 3, where i denotes the layer number,
$$\sum_{k=-j}^{j} w_{k}^{i}\, x_{n+k} + b, \qquad (2)$$
where j = (m − 1)/2 and b is a bias term. Note that the weight vector wmi is element-wise multiplied by the vector Xin and the products are summed (a Frobenius inner product). An activation function tanh() is applied to the resultant sum in (2) to generate the result ani, which is given as input to subsequent layers. Note that without vector length correction, the convolution process in each layer creates multiple shortened feature maps containing information from the input vector, as shown in Fig. 5 under layer 2. In this model, vector length is preserved by zero-padding the vectors prior to applying the kernel wmi. Pooling then aggregates the information by downsampling (decimating) the data, passing the maximum value of two adjacent elements to the next layer (max-pooling) using a new set of filters wmi of length m = 2, as illustrated in Fig. 5 under layer 3. A dropout regularization layer with a rate of 0.2 is used to increase the robustness of the model and alleviate over-fitting.47 Finally, deconvolution effectively up-samples the information before passing it to a fully connected Dense layer to ultimately form the prediction Ypred.
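A Keras sketch consistent with this description is given below. The kernel size of 3, pooling size of 2, dropout rate of 0.2, tanh activations, 32 feature maps in the first convolution, and the roughly one million trainable parameters follow the text and Fig. 5; the number of filters in the second stage and the exact arrangement of the transposed-convolution (deconvolution) layers are assumptions, so the authors' architecture may differ in detail.

```python
from tensorflow import keras
from tensorflow.keras import layers

n_points = 1000  # time-domain samples per waveform

model = keras.Sequential([
    keras.Input(shape=(n_points, 1)),
    # Two convolution + max-pooling stages extract local field features.
    layers.Conv1D(32, kernel_size=3, padding="same", activation="tanh"),
    layers.MaxPooling1D(pool_size=2),
    layers.Conv1D(64, kernel_size=3, padding="same", activation="tanh"),
    layers.MaxPooling1D(pool_size=2),
    layers.Dropout(0.2),  # regularization against over-fitting
    # Transposed convolutions up-sample back to the original trace length.
    layers.Conv1DTranspose(32, kernel_size=3, strides=2, padding="same",
                           activation="tanh"),
    layers.Conv1DTranspose(1, kernel_size=3, strides=2, padding="same",
                           activation="tanh"),
    layers.Flatten(),
    # Fully connected output layer forms the 1000-point prediction Ypred.
    layers.Dense(n_points, activation="tanh"),
])

model.compile(optimizer=keras.optimizers.Adam(), loss="mse")
model.summary()  # roughly one million trainable parameters with these choices
```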
FIG. 5.

(a) An illustration of the model structure with a total of 1 million trainable parameters. The input layer contains the preprocessed and normalized data Xtrain. Layer 2 performs a 1D convolution using a kernel w32 to generate 32 feature maps (4 are shown for clarity). Layer 3 performs a 1D max pooling using a kernel w23 to decimate the vectors from layer 2. Two steps of convolution and pooling are performed to extract the features from Xtrain. The data are then deconvolved before passing the information to a fully connected layer (dense) to generate a predicted output Ypred. (b) Model learning curve. The model employs the algorithm Adam to minimize a loss calculated as the mean squared error shown in blue. The validation set loss is plotted in red.


By iterative minimization of the mean squared error, the model learns to map the noisy input onto the desired denoised output. In this model, iterative minimization is achieved by employing the Adam optimization algorithm,48 a variant of stochastic gradient descent. The model is trained locally on a MacBook Pro (Apple M1 Pro chip) in batches of 128 samples for 100 epochs (full passes through the collection of batches), for a total elapsed time of ∼500 s. We note that training time could be significantly improved with the use of training-optimized hardware49,50 but was not necessary for the scope of this work. The hold-out test set (ntest = 6400, 10% of the synthetic data) is used only to evaluate the loss at the end of each epoch; it is not used to adapt the neural network weights nor hyper-parameter values, such as batch size, learning rate, and dropout.
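Under the assumptions of the preceding sketches, the training loop reduces to a single Keras call with the stated settings (Adam optimizer, mean-squared-error loss, batches of 128, 100 epochs); the held-out set enters only as validation data for monitoring the loss.

```python
import numpy as np

# Add a channel axis so the 1D convolutions receive (samples, 1000, 1) input.
history = model.fit(
    X_train_scaled[..., np.newaxis], Y_train,
    batch_size=128,
    epochs=100,
    validation_data=(X_test_scaled[..., np.newaxis], Y_test),  # monitoring only
)
```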

Machine learning-based denoising is a proficient approach for tackling intricate noise patterns that are difficult to address using conventional methods that rely on linear filters, such as Fourier filtering or the wavelet transform. The strength of MLBD lies in its nonlinearity: the model can capture nonlinear relationships in the data and thus adapt to novel noise patterns when generalizing to new, unseen data. This is a result of following an end-to-end learning procedure, which allows the model to learn directly from the noisy data. Although the model in this article is trained using a generic white noise distribution, the size and diversity of the training dataset enabled it to denoise complex laboratory-measured waveforms, which were not explicitly seen during training and whose noise is not strictly white. Consequently, MLBD is a versatile and efficient tool, with benefits across an extensive range of applications where poor SNR is encountered for measured waveforms that are not easily experimentally generated.36,37 Additionally, the advent of various inference acceleration hardware allows for sufficiently large neural networks to produce cleaned results at repetition rates that surpass 3851 to 100 K49 inferences per second, thus keeping abreast of incoming single shots of high repetition rate lasers. These inference accelerators52,53 can be leveraged for MLBD in real-time measurements for signals that may otherwise remain undetected and do so with sufficiently short latency for use in real-time, high-repetition-rate pulse shaping feedback systems. Accelerating MLBD in such a way could inspire new approaches to 3D spatiotemporal real-time adaptive laser pulse shaping in many important fields from medical applications to semiconductor processing.

N.A. acknowledges support from the Max Planck Society via the IMPRS for Advanced Photon Science. N.A. is part of the Max Planck School of Photonics supported by BMBF, Max Planck Society, and Fraunhofer Society. R.N.C.’s work was supported by the U.S. Department of Energy, Office of Science, Basic Energy Sciences, under Field Work Proposal 100643 “Actionable Information from Sensor to Data Center.” M.F.K.’s work at SLAC was supported by the U.S. Department of Energy, Office of Science, Basic Energy Sciences, under Grant No. DE-AC02-76SF00515 and by the Chemical Sciences, Geosciences, and Biosciences Division (CSGB).

The authors have no conflicts to disclose.

Najd Altwaijry: Conceptualization (lead); Data curation (lead); Formal analysis (lead); Investigation (lead); Methodology (lead); Software (lead); Validation (lead); Visualization (lead); Writing – original draft (lead). Ryan Coffee: Validation (supporting); Visualization (supporting); Writing – review & editing (supporting). Matthias F. Kling: Funding acquisition (lead); Resources (lead); Supervision (lead); Validation (supporting); Visualization (supporting); Writing – review & editing (lead).

The data and model that support the findings of this study are available under https://github.com/ananajd/MLBD.

1. M. Y. Ivanov, R. Kienberger et al., "Attosecond physics," J. Phys. B: At., Mol. Opt. Phys. 39, R1–R37 (2006).
2. M. F. Kling, C. Siedschlag, A. J. Verhoef, J. I. Khan, M. Schultze, T. Uphues, Y. Ni, M. Uiberacker, M. Drescher, F. Krausz, and M. J. J. Vrakking, "Control of electron localization in molecular dissociation," Science 312, 246–248 (2006).
3. E. Goulielmakis, V. S. Yakovlev, A. L. Cavalieri, M. Uiberacker, V. Pervak, A. Apolonski, R. Kienberger, U. Kleineberg, and F. Krausz, "Attosecond control and measurement: Lightwave electronics," Science 317, 769–775 (2007).
4. M. Schultze, E. M. Bothschafter, A. Sommer, S. Holzner, W. Schweinberger, M. Fiess, M. Hofstetter, R. Kienberger, V. Apalkov, V. S. Yakovlev, M. I. Stockman, and F. Krausz, "Controlling dielectrics with the electric field of light," Nature 493, 75–78 (2013).
5. M. Schultze, K. Ramasesha, C. Pemmaraju, S. Sato, D. Whitmore, A. Gandman, J. S. Prell, L. J. Borja, D. Prendergast, K. Yabana, D. M. Neumark, and S. R. Leone, "Attosecond band-gap dynamics in silicon," Science 346, 1348–1352 (2014).
6. C. Shannon, "Communication in the presence of noise," Proc. IRE 37, 10–21 (1949).
7. M. Hentschel, R. Kienberger, C. Spielmann, G. A. Reider, N. Milosevic, T. Brabec, P. Corkum, U. Heinzmann, M. Drescher, and F. Krausz, "Attosecond metrology," Nature 414, 509–513 (2001).
8. M. Ossiander, F. Siegrist, V. Shirvanyan, R. Pazourek, A. Sommer, T. Latka, A. Guggenmos, S. Nagele, J. Feist, J. Burgdörfer, R. Kienberger, and M. Schultze, "Attosecond correlation dynamics," Nat. Phys. 13, 280–285 (2017).
9. A. Schiffrin, T. Paasch-Colberg, N. Karpowicz, V. Apalkov, D. Gerster, S. Mühlbrandt, M. Korbman, J. Reichert, M. Schultze, S. Holzner, J. V. Barth, R. Kienberger, R. Ernstorfer, V. S. Yakovlev, M. I. Stockman, and F. Krausz, "Optical-field-induced current in dielectrics," Nature 493, 70–74 (2013).
10. T. Paasch-Colberg, A. Schiffrin, N. Karpowicz, S. Kruchinin, Ö. Sağlam, S. Keiber, O. Razskazovskaya, S. Mühlbrandt, A. Alnaser, M. Kübel, V. Apalkov, D. Gerster, J. Reichert, T. Wittmann, J. V. Barth, M. I. Stockman, R. Ernstorfer, V. S. Yakovlev, R. Kienberger, and F. Krausz, "Solid-state light-phase detector," Nat. Photonics 8, 214–218 (2014).
11. S. Sederberg, D. Zimin, S. Keiber, F. Siegrist, M. S. Wismer, V. S. Yakovlev, I. Floss, C. Lemell, J. Burgdörfer, M. Schultze, F. Krausz, and N. Karpowicz, "Attosecond optoelectronic field measurement in solids," Nat. Commun. 11, 430 (2020).
12. M. Ossiander, K. Golyari, K. Scharl, L. Lehnert, F. Siegrist, J. P. Bürger, D. Zimin, J. A. Gessner, M. Weidman, I. Floss, V. Smejkal, S. Donsa, C. Lemell, F. Libisch, N. Karpowicz, J. Burgdörfer, F. Krausz, and M. Schultze, "The speed limit of optoelectronics," Nat. Commun. 13, 1620 (2022).
13. N. Altwaijry, M. Qasim, M. Mamaikin, J. Schötz, K. Golyari, M. Heynck, E. Ridente, V. S. Yakovlev, N. Karpowicz, and M. F. Kling, "Broadband photoconductive sampling in gallium phosphide," Adv. Opt. Mater. 11, 2202994 (2023).
14. S. B. Park, K. Kim, W. Cho, S. I. Hwang, I. Ivanov, C. H. Nam, and K. T. Kim, "Direct sampling of a light wave in air," Optica 5, 402–408 (2018).
15. M. R. Bionta, F. Ritzkowsky, M. Turchetti, Y. Yang, D. Cattozzo Mor, W. P. Putnam, F. X. Kärtner, K. K. Berggren, and P. D. Keathley, "On-chip sampling of optical fields with attosecond resolution," Nat. Photonics 15, 456–460 (2021).
16. J. Blöchl, J. Schötz, A. Maliakkal, N. Šreibere, Z. Wang, P. Rosenberger, P. Hommelhoff, A. Staudte, P. B. Corkum, B. Bergues, and M. F. Kling, "Spatiotemporal sampling of near-petahertz vortex fields," Optica 9, 755–761 (2022).
17. S. Keiber, S. Sederberg, A. Schwarz, M. Trubetskov, V. Pervak, F. Krausz, and N. Karpowicz, "Electro-optic sampling of near-infrared waveforms," Nat. Photonics 10, 159–162 (2016).
18. E. Ridente, M. Mamaikin, N. Altwaijry, D. Zimin, M. F. Kling, V. Pervak, M. Weidman, F. Krausz, and N. Karpowicz, "Electro-optic characterization of synthesized infrared-visible light fields," Nat. Commun. 13, 1111 (2022).
19. I. Pupeza, M. Huber, M. Trubetskov, W. Schweinberger, S. A. Hussain, C. Hofer, K. Fritsch, M. Poetzlberger, L. Vamos, E. Fill, T. Amotchkina, K. V. Kepesidis, A. Apolonski, N. Karpowicz, V. Pervak, O. Pronin, F. Fleischmann, A. Azzeer, M. Žigman, and F. Krausz, "Field-resolved infrared spectroscopy of biological systems," Nature 577, 52–59 (2020).
20. M. T. Peschel, M. Högner, T. Buberl, D. Keefer, R. de Vivie-Riedle, and I. Pupeza, "Sub-optical-cycle light-matter energy transfer in molecular vibrational spectroscopy," Nat. Commun. 13, 5897 (2022).
21. A. Alismail, H. Wang, G. Barbiero, S. A. Hussain, W. Schweinberger, F. Krausz, and H. Fattahi, "Near-infrared molecular fieldoscopy of water," Multiphoton Microsc. Biomed. Sci. XIX 10882, 1088231 (2019).
22. M. Mamaikin, Y.-L. Li, E. Ridente, W. T. Chen, J.-S. Park, A. Y. Zhu, F. Capasso, M. Weidman, M. Schultze, F. Krausz, and N. Karpowicz, "Electric-field-resolved near-infrared microscopy," Optica 9, 616–622 (2022).
23. D. A. Zimin, N. Karpowicz, M. Qasim, M. Weidman, F. Krausz, and V. S. Yakovlev, "Dynamic optical response of solids following 1-fs-scale photoinjection," Nature 618, 276–280 (2023).
24. D. M. Mittleman, R. Jacobsen, R. Neelamani, R. Baraniuk, and M. Nuss, "Gas sensing using terahertz time-domain spectroscopy," Appl. Phys. B: Lasers Opt. 67, 379–390 (1998).
25. D. Donoho, "De-noising by soft-thresholding," IEEE Trans. Inf. Theory 41, 613–627 (1995).
26. D. L. Donoho and I. M. Johnstone, "Ideal spatial adaptation by wavelet shrinkage," Biometrika 81, 425–455 (1994).
27. A. Graps, "An introduction to wavelets," IEEE Comput. Sci. Eng. 2, 50–61 (1995).
28. X. Chen, Q. Sun, R. I. Stantchev, and E. Pickwell-MacPherson, "Objective and efficient terahertz signal denoising by transfer function reconstruction," APL Photonics 5, 056104 (2020).
29. S. Biswas, B. Förg, L. Ortmann, J. Schötz, W. Schweinberger, T. Zimmermann, L. Pi, D. Baykusheva, H. A. Masood, I. Liontos, A. M. Kamal, N. G. Kling, A. F. Alharbi, M. Alharbi, A. M. Azzeer, G. Hartmann, H. J. Wörner et al., "Probing molecular environment through photoemission delays," Nat. Phys. 16, 778–783 (2020).
30. C. Brunner, A. Duensing, C. Schröder, M. Mittermair, V. Golkov, M. Pollanka, D. Cremers, and R. Kienberger, "Deep learning in attosecond metrology," Opt. Express 30, 15669–15684 (2022).
31. M. A. Krumbügel, C. L. Ladera, K. W. DeLong, D. N. Fittinghoff, J. N. Sweetser, and R. Trebino, "Direct ultrashort-pulse intensity and phase retrieval by frequency-resolved optical gating and a computational neural network," Opt. Lett. 21, 143–145 (1996).
32. T. Zahavy, A. Dikopoltsev, D. Moss, G. I. Haham, O. Cohen, S. Mannor, and M. Segev, "Deep learning reconstruction of ultrashort pulses," Optica 5, 666–673 (2018).
33. R. Ziv, A. Dikopoltsev, T. Zahavy, I. Rubinstein, P. Sidorenko, O. Cohen, and M. Segev, "Deep learning reconstruction of ultrashort pulses from 2D spatial intensity patterns recorded by an all-in-line system in a single-shot," Opt. Express 28, 7528–7538 (2020).
34. G. Genty, L. Salmela, J. M. Dudley, D. Brunner, A. Kokhanovskiy, S. Kobtsev, and S. K. Turitsyn, "Machine learning and applications in ultrafast photonics," Nat. Photonics 15, 91–101 (2021).
35. P. Graff, F. Feroz, M. P. Hobson, and A. Lasenby, "SKYNET: An efficient and robust neural network training tool for machine learning in astronomy," Mon. Not. R. Astron. Soc. 441, 1741–1759 (2014).
36. D. George and E. Huerta, "Deep learning for real-time gravitational wave detection and parameter estimation: Results with advanced LIGO data," Phys. Lett. B 778, 64–70 (2018).
37. E. Cuoco, J. Powell, M. Cavaglià, K. Ackley, M. Bejger, C. Chatterjee, M. Coughlin, S. Coughlin, P. Easter, R. Essick, H. Gabbard, T. Gebhard, S. Ghosh, L. Haegel, A. Iess, D. Keitel, Z. Márka, S. Márka, F. Morawski, T. Nguyen, R. Ormiston, M. Pürrer, M. Razzano, K. Staats, G. Vajente, and D. Williams, "Enhancing gravitational-wave science with machine learning," Mach. Learn.: Sci. Technol. 2, 011002 (2020).
38. C. J. Fluke and C. Jacobs, "Surveying the reach and maturity of machine learning and artificial intelligence in astronomy," WIREs Data Min. Knowl. Discovery 10, e1349 (2020).
39. J. A. M. Sidey-Gibbons and C. J. Sidey-Gibbons, "Machine learning in medicine: A practical introduction," BMC Med. Res. Methodol. 19, 64 (2019).
40. T. J. Fawcett, C. S. Cooper, R. J. Longenecker, and J. P. Walton, "Machine learning, waveform preprocessing and feature extraction methods for classification of acoustic startle waveforms," MethodsX 8, 101166 (2021).
41. W. Hu, Y. Zhang, and L. Li, "Study of the application of deep convolutional neural networks (CNNs) in processing sensor data and biomedical images," Sensors 19, 3584 (2019).
42. Z. Wang, F. Wan, C. M. Wong, and L. Zhang, "Adaptive Fourier decomposition based ECG denoising," Comput. Biol. Med. 77, 195–205 (2016).
43. J. S. Walker, "Fourier analysis and wavelet analysis," Not. AMS 44, 658–670 (1997).
44. M. Kubat, An Introduction to Machine Learning (Springer, Cham, 2016).
45. S. Skansi, Introduction to Deep Learning (Springer, Cham, 2018).
46. F. Chollet et al., Keras (2015), https://keras.io.
47. N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A simple way to prevent neural networks from overfitting," J. Mach. Learn. Res. 15, 1929–1958 (2014).
48. D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," arXiv:1412.6980 (2014).
49. M. Kraus, N. Layad, Z. Liu, and R. Coffee, "EdgeAI: Machine learning via direct attached accelerator for streaming data processing at high shot rate x-ray free-electron lasers," Front. Phys. 10, 957509 (2022).
50. P. J. Milan, H. Rong, C. Michaud, N. Layad, Z. Liu, and R. Coffee, "Enabling real-time adaptation of machine learning models at x-ray free electron laser facilities with high-speed training optimized computational hardware," Front. Phys. 10, 958120 (2022).
51. A. Weigel, P. Jacob, D. Gröters, T. Buberl, M. Huber, M. Trubetskov, J. Heberle, and I. Pupeza, "Ultra-rapid electro-optic sampling of octave-spanning mid-infrared waveforms," Opt. Express 29, 20747–20764 (2021).
52. J. Hirschman, A. Kamalov, R. Obaid, F. H. O'Shea, and R. N. Coffee, "At-the-edge data processing for low latency high throughput machine learning algorithms," in Accelerating Science and Engineering Discoveries through Integrated Research Infrastructure for Experiment, Big Data, Modeling and Simulation, edited by K. Doug, G. Al, S. Pophale, H. Liu, and S. Parete-Koon (Springer Nature Switzerland, Cham, 2022), pp. 101–119.
53. R. Herbst, R. Coffee, N. Fronk, K. Kim, K. Kim, L. Ruckman, and J. J. Russell, "Implementation of a framework for deploying AI inference engines in FPGAs," in Accelerating Science and Engineering Discoveries through Integrated Research Infrastructure for Experiment, Big Data, Modeling and Simulation, edited by K. Doug, G. Al, S. Pophale, H. Liu, and S. Parete-Koon (Springer Nature Switzerland, Cham, 2022), pp. 120–134.