In this paper, we investigate two deep learning approaches to recovering initial temperature profiles from thermographic images in non-destructive material testing. First, we trained a deep neural network (DNN) in an end-to-end fashion by directly feeding the surface temperature measurements to the DNN. Second, we turned the surface temperature measurements into virtual waves (a recently developed concept in thermography), which we then fed to the DNN. To demonstrate the effectiveness of these methods, we implemented a data generator and created a dataset comprising a total of 100 000 simulated temperature measurement images. With the objective of determining a suitable baseline, we investigated several state-of-the-art model-based reconstruction methods, including Abel transformation, curvelet denoising, and time- and frequency-domain synthetic aperture focusing techniques. Additionally, a physical phantom was created to support evaluation on completely unseen real-world data. The results of several experiments suggest that both the end-to-end and the hybrid approach outperformed the baseline in terms of reconstruction accuracy. The end-to-end approach required the least domain knowledge and was the most computationally efficient. The hybrid approach required extensive domain knowledge and was more computationally expensive than the end-to-end approach. However, the virtual waves served as meaningful features that converted the complex end-to-end reconstruction task into a less demanding one. This in turn yielded better reconstructions for the same number of training samples compared to the end-to-end approach. Additionally, it allowed for more compact network architectures and for the use of prior knowledge, such as sparsity and non-negativity. The proposed method is suitable for non-destructive testing (NDT) in 2D, where the amplitudes along the objects are considered to be constant (e.g., for metallic wires). To encourage the development of other deep-learning-based reconstruction techniques, we release both the synthetic and the real-world datasets along with the implementation of the deep learning methods to the research community.

Analysis of structural imperfections of materials, spare parts, or components of a system is important in preventing the malfunction of devices. This can be achieved by active thermography, which has—in addition to non-destructive testing—several other applications, such as structural health monitoring,1 material characterization,2,3 and thermal imaging in medicine.4–6,45 In thermographic imaging, the specimen is heated by flashlamps, lasers, etc., and the corresponding temperature evolution is then measured on the surface. The resulting thermal pattern is used to reconstruct the heat distribution inside the material, which provides the main information for defect detection.

In recent decades, thermographic data evaluation has been dominated by 1D methods, which become inaccurate, for example, when applied to anisotropic heat flow. For more accurate thermographic data evaluation, multidimensional heat flow must be considered. Some research groups have focused on solving multidimensional inverse heat conduction problems (IHCPs) using various thermal stimulation methods.7,8 The virtual wave concept9 is another approach that considers multidimensional heat flow without directly solving an IHCP. Since it adds no additional information for detecting defects, the spatial resolution of a single one-dimensional (1D) reconstruction does not improve compared to direct inversion of the heat diffusion. The advantage of the virtual wave concept in 1D is that more advanced regularization techniques that incorporate a priori information, such as sparsity or positivity, can be utilized10,11 in the reconstruction process. The main benefit in 2D or 3D is that, for the calculated virtual waves, any conventional ultrasound reconstruction method, such as the synthetic aperture focusing technique (SAFT),12,13 can be used in a second step to reconstruct the defects.9,14 The signal-to-noise ratio (SNR) is significantly enhanced because the lateral heat flow, parallel to the surface, is also taken into account in the reconstructions. This is similar to averaging numerous 1D measurements when imaging a layered structure, but with the essential advantage that the virtual wave concept can be used for any 2D or 3D structure to be imaged. Here, for the first time, the acoustic reconstruction performed as the second step after the virtual waves have been calculated is carried out by deep learning, which offers additional advantages: artifacts caused by limited view or by discretization are suppressed, resulting in more accurate reconstructions.

Thermal reconstruction can be modeled mathematically by the heat diffusion equation. Since heat diffusion is an irreversible process, part of the information is inevitably lost, which implies the ill-posed nature of the problem.15 In this case, maximum likelihood solutions typically provide numerically unstable results, and regularization is needed to avoid over-fitting of the data.16 This can be achieved by direct methods, variational image reconstruction, and deep learning. The first approach is based on the approximate inversion of the problem and is fast because it involves only a few matrix-vector multiplications. However, this comes at the price of reconstruction accuracy, which is low for noisy measurements. The variational approach provides a good compromise between over- and under-fitting of the data. In this case, a priori information, such as smoothness, sparsity, group sparsity, and non-negativity, can be incorporated into the model to penalize unfeasible solutions. Although these algorithms are more robust against noise, the penalty function must satisfy certain conditions, such as convexity, which limits the eligible prior knowledge. Another drawback is that the considerably higher computational complexity usually prevents real-time imaging. Furthermore, in many applications, it is not obvious how a suitable regularization parameter is to be chosen.

Deep learning provides a good alternative to direct and variational methods for two main reasons: first, the main computational load of these algorithms is shifted to the offline training phase. This permits real-time imaging in the testing phase, even on embedded hardware. Second, since a priori information is implicitly encoded during the training phase, it need not be identified and explicitly incorporated into the method. Most successful industrial applications of deep learning have used it in a supervised manner. This approach, however, requires large amounts of labeled data, which are not easy to obtain in thermography, as this requires, for instance, the production of physical phantoms with varying material properties, including defects with various positions, sizes, and shapes. We tackled this problem by using synthetic data to train a deep neural network (DNN), which was then applied to real measurement data in the testing phase. In this paper, we present two training strategies. The first is carried out in an end-to-end fashion, which means that the surface temperature data [see, e.g., Figs. 1(a) and 1(b)] are fed directly to the network used to predict the initial temperature profile inside the material [Fig. 1(d)]. The second approach uses the virtual wave concept9 as a feature extraction step [Fig. 1(c)] before the neural network is applied. This hybrid solution incorporates domain knowledge via virtual waves and automatically explores the internal structure of the data, such as sparsity.

FIG. 1.

Two-stage reconstruction process of thermographic images: (a) and (b) surface temperature measurements of specimens with different thermal diffusivities but the same defects; (c) corresponding virtual waves; and (d) defect locations.


We demonstrate empirically that this pre-processing method removes irrelevant information from the surface temperature data. Compared to the end-to-end approach, we achieved the same performance with either fewer training samples or a more compact architecture with fewer trainable parameters. We extended our recent work17 on this topic by investigating the end-to-end approach and various ways of extracting virtual waves and by testing the proposed methods on both synthetic and real-world measurement data. For the latter, we performed a qualitative analysis of the reconstructed temperature profiles and compared them to those of other model-based approaches. Furthermore, we provide a quantitative assessment of the baseline, hybrid, and end-to-end approaches in 2D thermographic reconstruction.

The paper is organized as follows. In Sec. II, we review the two-stage reconstruction process and investigate various model-based methods that can be applied in each step. The data, neural network architectures, and training regime are presented in Secs. III A–III C, respectively. We then describe the end-to-end and the hybrid deep learning approaches in Secs. III C 1 and III C 2, where we also investigate the effect of various training data sizes. The results of the experiments on unseen synthetic and real-world data are presented in Secs. IV and V, respectively. This is followed by a discussion of the results and of the validity of the proposed methods in Sec. VI. Finally, Sec. VII concludes the paper and outlines future research directions.

Thermal reconstruction can be modeled mathematically by the heat diffusion equation,

$$
\left(\frac{\partial}{\partial t} - \alpha \nabla^2\right) T(\mathbf{r},t) = T_0(\mathbf{r})\,\delta(t), \tag{1}
$$

where α stands for the thermal diffusivity, T is the temperature as a function of space r and time t, and T0 denotes the initial temperature profile at t=0, which is given by the temporal Dirac delta function δ on the right-hand side. Except for some special cases, such as adiabatic boundary conditions (see Sec. III A), there is no exact solution to Eq. (1), and numerical approximations must be applied. Therefore, rather than solving Eq. (1) directly, we consider the problem of undamped acoustic wave propagation described by the wave equation,

$$
\nabla^2 p(\mathbf{r},t) - \frac{1}{c^2}\,\frac{\partial^2 p(\mathbf{r},t)}{\partial t^2} = -\frac{1}{c^2}\,p_0(\mathbf{r})\,\frac{\mathrm{d}\delta(t)}{\mathrm{d}t}, \tag{2}
$$

where c is the speed of sound, p describes the acoustic pressure in the medium under investigation, and p0 is the initial pressure distribution at t = 0 just after the Dirac-like excitation impulse. According to Burgholzer et al.,9 the thermographic reconstruction problem in Eq. (1) can be converted into an ultrasound reconstruction problem by substituting $p_0 = T_0$ in Eq. (2). The solution to this equation, $p \equiv T_{\mathrm{virt}}$, is called the virtual wave, and its relation to the original temperature distribution T can be formulated as a Fredholm integral equation of the first kind,

$$
T(\mathbf{r},t) = \int_{0}^{\infty} K(t,t')\,T_{\mathrm{virt}}(\mathbf{r},t')\,\mathrm{d}t', \tag{3}
$$

where the local transformation kernel K can be written as

$$
K(t,t') = \frac{c}{\sqrt{\pi\,\alpha\,t}}\,\exp\!\left(-\frac{c^2\,t'^2}{4\,\alpha\,t}\right). \tag{4}
$$

A two-stage reconstruction process can now be defined in which the first step is to solve Eq. (3), followed by an ultrasonic evaluation method for reconstructing T0 from Tvirt in the second step.
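For illustration, the following sketch (Python/NumPy; the function name, the uniform time grid, and the scaling conventions are our assumptions, not the released implementation) tabulates the discretized kernel of Eq. (4) that is used in the first reconstruction step:

```python
import numpy as np

def kernel_matrix(Nt, dt, alpha, c=1.0):
    """Tabulate the local transformation kernel of Eq. (4),
    K(t, t') = c / sqrt(pi * alpha * t) * exp(-c**2 * t'**2 / (4 * alpha * t)),
    on a uniform grid t_i = i * dt; the singular row at t = 0 is left at zero."""
    t = np.arange(Nt) * dt        # physical (heat diffusion) time axis
    t_virt = np.arange(Nt) * dt   # virtual (wave) time axis, c is dimensionless
    K = np.zeros((Nt, Nt))
    tt, tv = np.meshgrid(t[1:], t_virt, indexing="ij")
    K[1:, :] = c / np.sqrt(np.pi * alpha * tt) * np.exp(-((c * tv) ** 2) / (4 * alpha * tt))
    return K
```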

In this paper, we consider 2D reconstruction problems (see, e.g., Fig. 1), where the temperature evolution $T(\mathbf{r},t) = T(y,z,t)$ in time $t$ is measured on the surface $z=0$ by an infrared camera at different locations along the $y$ axis. The corresponding measurement data are a 2D image that is stored in a matrix $T \in \mathbb{R}^{N_t \times N_y}$ for estimating the initial temperature profile $T_0 \in \mathbb{R}^{N_z \times N_y}$, where $N_t$, $N_y$, and $N_z$ denote the number of discretization points along the temporal, lateral, and depth dimensions, respectively. The discrete analog of the two-stage reconstruction process can be defined by the following regularized linear inverse problems:

$$
\tilde{\mathbf{v}} = \operatorname*{arg\,min}_{\mathbf{v}} \left\{ \|\mathbf{K}\mathbf{v} - \mathbf{d}\|_2^2 + \lambda\,\Omega(\mathbf{v}) \right\}, \tag{5}
$$
$$
\tilde{\mathbf{u}} = \operatorname*{arg\,min}_{\mathbf{u}} \left\{ \|\mathbf{M}\mathbf{u} - \tilde{\mathbf{v}}\|_2^2 + \mu\,\Omega(\mathbf{u}) \right\}, \tag{6}
$$

where $\Omega(\cdot)$ stands for the penalty function and $\lambda, \mu \geq 0$ are regularization parameters. Here, the bold-face notation indicates that we are working in the dimension of vectorized 2D measurements, that is, $\mathbf{d} = \mathrm{vec}(T) \in \mathbb{R}^{N_t N_y \times 1}$ and $\mathbf{K} \in \mathbb{R}^{N_t N_y \times N_t N_y}$. $\mathbf{K} = \mathrm{diag}(K, K, \ldots, K)$ is a block-diagonal matrix, where the kernel in Eq. (4) is evaluated at discrete time instances, which means that $K_{i,j} = K(t_i, t_j)$ for $i, j = 0, \ldots, N_t - 1$. The virtual wave vector $\tilde{\mathbf{v}} \approx \mathrm{vec}(T_{\mathrm{virt}})$ represents the approximated solution to Eq. (2), with $p_0 = T_0$ and an arbitrarily chosen dimensionless virtual wave speed $c$ that we set to 1. In Eq. (6), $\mathbf{M} \in \mathbb{R}^{N_t N_y \times N_z N_y}$ represents the matrix form of conventional ultrasound reconstruction methods, such as the SAFT techniques, which are used to estimate the vectorized initial temperature profile $\tilde{\mathbf{u}} \approx \mathrm{vec}(T_0) \in \mathbb{R}^{N_z N_y \times 1}$.

The solutions to Eqs. (5) and (6) depend on many factors, such as the numerical solver, the penalty function, the optimization constraints, and the regularization parameters. In Secs. II A and II B, we investigate these aspects in order to find the most suitable method for the two-stage thermographic reconstruction process in 2D.

In the first stage, the virtual waves v~ are to be estimated via Eq. (5), which is derived from the discretization of the Fredholm integral equation in Eq. (3). This leads to a discrete ill-posed inverse problem.18,19 Regularization including side constraints, such as sparsity and non-negativity, is, therefore, inevitable in order to obtain feasible solutions.

Since the virtual waves represent pressure signals, they can have both positive and negative values. We thus need to find an invertible linear transformation that maps $\tilde{\mathbf{v}}$ onto a space where non-negativity constraints apply. According to Thummerer et al.,10 this can be done by spherical projections, which correspond to time integration in 3D and to the Abel transformation in 2D. Let us denote this transformation by $\mathbf{R}$ and define $\hat{\mathbf{K}} = \mathbf{K}\mathbf{R}^{-1}$. The non-negative and sparse variant of Eq. (5) can then be written in the form,

$$
\hat{\mathbf{v}} = \operatorname*{arg\,min}_{\mathbf{v} \geq \mathbf{0}} \left\{ \|\hat{\mathbf{K}}\mathbf{v} - \mathbf{d}\|_2^2 + \lambda\,\|\mathbf{v}\|_1 \right\}. \tag{7}
$$

Once the minimizer $\hat{\mathbf{v}}$ has been found, the original virtual waves $\mathrm{vec}(T_{\mathrm{virt}})$ can be approximated by $\mathbf{R}^{-1}\hat{\mathbf{v}}$.
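To make the role of the two constraints concrete, a minimal sketch is given below. The paper solves Eq. (7) with ADMM and an L-curve choice of λ (see below); here, a simpler projected ISTA iteration stands in for it, and all names are ours:

```python
import numpy as np

def nonneg_sparse_recover(K_hat, d, lam, n_iter=300):
    """Projected ISTA sketch for Eq. (7):
    min_{v >= 0} ||K_hat @ v - d||_2^2 + lam * ||v||_1.
    (Illustrative stand-in for the ADMM solver used in the paper.)"""
    Lf = 2.0 * np.linalg.norm(K_hat, 2) ** 2        # Lipschitz constant of the gradient
    v = np.zeros(K_hat.shape[1])
    for _ in range(n_iter):
        grad = 2.0 * K_hat.T @ (K_hat @ v - d)      # gradient of the data-fidelity term
        v = np.maximum(v - (grad + lam) / Lf, 0.0)  # soft-thresholding merged with v >= 0
    return v                                        # virtual waves follow as R^{-1} @ v
```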

Another way to sparsify the virtual wave vector is to use multiscale and multidirectional representations, such as curvelets. Unlike wavelets, this transformation applies a polar tiling pattern in the frequency domain, where each wedge represents the frequency support of the corresponding curvelet element. In the spatial domain, the basis functions look like elongated ridges, which are smooth in one direction and oscillatory in the other. Due to their similar structure, acoustic waves become sparse in the curvelet domain, which is illustrated in Fig. 2.

FIG. 2.

Sparse curvelet approximation of virtual waves. (a) Simulated virtual waves with additive white Gaussian noise; (b) virtual waves approximated by keeping less than 1% of the curvelet coefficients.


Motivated by their usefulness in seismology,20 we utilize curvelets to recover the virtual wave field as follows:

$$
\tilde{\mathbf{v}} = \operatorname*{arg\,min}_{\mathbf{v}} \left\{ \|\mathbf{K}\mathbf{v} - \mathbf{d}\|_2^2 + \lambda\,\|\mathbf{C}\mathbf{v}\|_1 \right\}. \tag{8}
$$

Here, the curvelet transformation matrix $\mathbf{C}$ is not explicitly formed; instead, we apply the fast discrete curvelet transformation (FDCT),21 which requires $O(n^2 \log n)$ flops for images of size $n \times n$. Therefore, this approach is also suitable for large-scale problems in which the virtual waves are reconstructed from high-resolution thermal images.
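As a self-contained illustration of the sparsity exploited here (cf. Fig. 2, where less than 1% of the coefficients are kept), the sketch below hard-thresholds transform-domain coefficients of a virtual-wave image. A 2D DCT stands in for the FDCT purely to keep the example dependency-free; it is not the transform used in Eq. (8):

```python
import numpy as np
from scipy.fft import dctn, idctn

def sparse_approximation(v_img, keep=0.01):
    """Keep only the largest-magnitude transform coefficients of a
    virtual-wave image and transform back (cf. Fig. 2(b)). The 2D DCT is a
    stand-in for the fast discrete curvelet transform (FDCT, Ref. 21)."""
    coeffs = dctn(v_img, norm="ortho")
    cutoff = np.quantile(np.abs(coeffs), 1.0 - keep)  # magnitude threshold
    coeffs[np.abs(coeffs) < cutoff] = 0.0             # hard thresholding
    return idctn(coeffs, norm="ortho")
```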

We analyzed the performance of the virtual wave reconstruction methods utilizing the Abel and curvelet transformations. To this end, we used a test set from our previous work,17 which consists of 1000 samples in ten versions for various SNRs, thus comprising 10 000 samples in total. These data simulate thermally isolated specimens in 2D and can be modeled by assuming adiabatic boundary conditions in Eq. (1). Both the virtual waves Tvirt and the initial temperature distribution T0 can thus be calculated analytically. For instance, the thermal and virtual wave images in Fig. 1 were simulated this way. Based on recent results,10 we used the alternating direction method of multipliers (ADMM)22 in conjunction with the L-curve method23 to solve the corresponding sparse approximation problems and to estimate λ in Eqs. (7) and (8), respectively. The reconstruction error between the ground truth $T_{\mathrm{virt}}$ and the reconstructed virtual waves $\tilde{T}_{\mathrm{virt}}$ (obtained from $\hat{\mathbf{v}}$ for the Abel transformation or from $\tilde{\mathbf{v}}$ for the curvelet transformation) was measured in terms of the mean squared error (MSE),

$$
\mathrm{MSE}\big(T_{\mathrm{virt}}, \tilde{T}_{\mathrm{virt}}\big) = \frac{1}{N_t N_y}\,\big\|T_{\mathrm{virt}} - \tilde{T}_{\mathrm{virt}}\big\|_F^2. \tag{9}
$$

Figure 3(a) shows the MSEs averaged over 1000 sample images for each method and SNR. Except in the worst-case scenario with −20 dB SNR, the sparse virtual wave reconstruction with the Abel transformation outperformed the curvelet-based approach. We therefore conclude that, in this case, non-negativity is a much tighter constraint than the higher level of sparsity assumed in the curvelet domain.

FIG. 3.

Comparing various model-based approaches for reconstructing (a) virtual waves and (b) initial temperature profiles.


The outcome of the first reconstruction step is an approximation to Tvirt, which represents undamped acoustic waves that would be measured on the surface right after temporal excitation. Therefore, we must deal with an ultrasound reconstruction problem in the second stage, which is described by Eq. (6).

In ultrasonography, a point source has a hyperbolic response that depends on both the depth and the speed of sound in the specimen. Therefore, the forward operator $\mathbf{M} \in \mathbb{R}^{N_t N_y \times N_z N_y}$ is designed such that these diffraction hyperbolas are assigned to the corresponding point sources, while the inverse operator $\mathbf{M}^{+}$ collapses these hyperbolas back to point sources. In this work, $\mathbf{M}^{+}$ is provided by the well-known Stolt's f-k migration24 and the T-SAFT25 methods, which allow reconstruction in the frequency and the time domain, respectively. The initial temperature profile can then be estimated directly by $\tilde{\mathbf{u}} \approx \mathbf{M}^{+}\tilde{\mathbf{v}}$. We hereafter refer to these algorithms as fkmig and tsaft.
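A minimal delay-and-sum sketch shows how such an operator collapses the diffraction hyperbolas; the apodization, interpolation, and aperture handling of real T-SAFT implementations (Ref. 25) are omitted, and the function name is ours:

```python
import numpy as np

def tsaft_delay_and_sum(v_virt, dt, dy, dz, Nz, c=1.0):
    """For each image point (z, y), sum the virtual-wave A-scans (Nt x Ny)
    along the diffraction hyperbola t = sqrt(z**2 + (y - y')**2) / c."""
    Nt, Ny = v_virt.shape
    image = np.zeros((Nz, Ny))
    y_pos = np.arange(Ny) * dy
    cols = np.arange(Ny)
    for iz in range(Nz):
        z = iz * dz
        for iy in range(Ny):
            dist = np.sqrt(z ** 2 + (y_pos - y_pos[iy]) ** 2)  # distances to all sensors
            it = np.rint(dist / (c * dt)).astype(int)          # travel-time sample indices
            valid = it < Nt
            image[iz, iy] = v_virt[it[valid], cols[valid]].sum()
    return image
```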

One difficulty in applying variational methods here is the lack of a proper inversion of $\mathbf{M}^{+}$. However, it has been shown that the adjoint operator can be used for this purpose.16,26 Therefore, in Eq. (6), we chose $\mathbf{M}$ to be equal to the adjoint of the T-SAFT matrix.25 Recall that the block-diagonal structure of $\mathbf{K}$ allows applying the singular value decomposition (SVD) to the kernel $K$ in Eq. (5), which enables estimation of the optimal λ via the L-curve method23 in the first stage. In the second stage, however, calculating the SVD is unfeasible, because the matrix $\mathbf{M}$ is either too large, as in the case of T-SAFT, or not explicitly formed, as in the case of f-k migration. Hence, we use the IRfista numerical solver, which has recently been developed for large-scale linear inverse problems27 and in which regularization is achieved via the semi-convergence of the iterations. Due to physical constraints, we assume that the initial temperature is non-negative (i.e., $\mathbf{u} \geq \mathbf{0}$), and we set $\Omega(\mathbf{u}) = \|\mathbf{u}\|_2^2$ in Eq. (6). We hereafter refer to this setup as reg tsaft.

Note that the nonzero elements of $\mathbf{u}$ represent either noise or defects, but that the latter show up in groups. Assuming that the volume of the defects is negligible compared to the full volume of the material, sparsity and group sparsity can also be imposed on $\mathbf{u}$. This naturally raises the question of why not to use a penalty in Eq. (6), such as $\Omega(\mathbf{u}) = \|\mathbf{u}\|_1$, that utilizes this knowledge. The answer is twofold. First, the approximation of the virtual wave vector $\tilde{\mathbf{v}}$ is influenced by both the measurement noise and the regularization error of the first step in Eq. (5). As a consequence, estimating the optimal regularization parameter for each measurement is an intractable task. Second, group sparse optimizers28,29 usually require a priori knowledge of the group boundaries, and their performance is also limited by the number of groups.30 None of these variational approaches provides the necessary level of freedom for detecting defects with arbitrary shapes, locations, and overlaps in large-scale linear inverse problems.

Figure 4 illustrates the previously mentioned reconstruction techniques: we first approximated the ground truth virtual waves Tvirt by applying ADMM in conjunction with the Abel transformation as described in Sec. II A. This was followed by reconstruction of the initial temperature distribution T0 in the second stage using tsaft, reg tsaft, and the group sparse grp. tsaft, as shown in Figs. 4(b)–4(d), respectively. Among these methods, the group sparse approximation seemed to be the best, but this approach required more than 150 iterations using the SPGL1 sparse numerical solver.31,32 Note that for this single example, we tried several parameter setups to find the optimal values for the regularization parameter λ and for the group sizes. Since this procedure would be intractable for large datasets, we omitted this approach from our extensive comparative study.

FIG. 4.

Second stage of the reconstruction process. (a) Virtual wave reconstruction by ADMM with Abel trf.; initial temperature distribution by (b) tsaft, (c) reg tsaft, and (d) group sparse grp. tsaft, where groups of size 10×10 were used as indicated by the black grid.


We tested and compared the performances of the previously mentioned model-based approaches on the dataset17 described in Sec. II A. Since the Abel-transformed ADMM showed the best performance in the first stage, we used this method to extract the virtual waves. We then applied fkmig, tsaft, and reg tsaft to approximate T0 in the case of varying noise levels. Figure 3(b) shows that fkmig achieved the lowest reconstruction error in terms of MSE. Therefore, we chose this model-based reconstruction procedure as the baseline for the proposed deep learning approach.

In this section, we describe two approaches to tackling thermal reconstruction that build on the same architectures to allow direct comparison (see Fig. 5). First, we trained deep neural networks in an end-to-end fashion. That is, we directly fed the surface temperature data to the network. Second, we utilized the virtual wave concept9 as a feature extraction step. In this case, we fed the resulting mid-level representation to the neural networks.

FIG. 5.

Deep learning approaches to thermographic image reconstruction. Rescaling was applied only to the real-world measurement data in order to match the resolution of the training images.


Deep learning approaches require vast amounts of data to learn the target distribution. This also applies to thermography, where the data depend on many factors, such as thermal diffusivity, parameters of the defects, and measurement setup. As covering all possible variations is impossible, we created the training set by using simulated data only. In this work, we considered Eq. (1) assuming adiabatic boundary conditions, since there is an analytic solution to this particular case,

$$
\hat{T}(k_y, k_z, t) = \hat{T}_0(k_y, k_z)\,e^{-\alpha\,(k_y^2 + k_z^2)\,t}, \tag{10}
$$

where $\hat{T}$ and $\hat{T}_0$ denote the Fourier cosine transforms of T and T0 in the yz-plane, and $k_y$ and $k_z$ are the corresponding spatial frequencies. In our experiments, the amplitude along the defects was considered to be constant. Therefore, to simulate the surface temperature data, we first generated T0 as a binary image of size $N_z \times N_y$. This was used to calculate $\hat{T}$ by Eq. (10), and T was then obtained by taking the inverse Fourier cosine transform of $\hat{T}$. Evaluating T(y, z, t) at z = 0 gave the simulated surface temperature data.
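A sketch of this generator is given below (Python/SciPy). The orthonormal DCT-II standing in for the discrete Fourier cosine transform and the frequency grid $k = \pi n/(N\Delta)$ are assumptions of this sketch, not necessarily the conventions of the released data generator:

```python
import numpy as np
from scipy.fft import dctn, idctn

def simulate_surface_temperature(T0, alpha, dy, dz, dt, Nt):
    """Closed-form solution of Eq. (1) with adiabatic boundaries via Eq. (10):
    T0 is the binary Nz x Ny initial profile; returns the Nt x Ny surface
    record T(y, z=0, t)."""
    Nz, Ny = T0.shape
    T0_hat = dctn(T0, norm="ortho")                    # Fourier cosine transform
    kz = np.pi * np.arange(Nz) / (Nz * dz)             # spatial frequencies (assumed grid)
    ky = np.pi * np.arange(Ny) / (Ny * dy)
    k2 = kz[:, None] ** 2 + ky[None, :] ** 2
    surface = np.empty((Nt, Ny))
    for n in range(Nt):
        T_hat = T0_hat * np.exp(-alpha * k2 * n * dt)  # decay of each mode, Eq. (10)
        surface[n] = idctn(T_hat, norm="ortho")[0]     # evaluate at z = 0
    return surface
```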

Our complete data are based on 10 000 different samples with up to five square-shaped defects with side lengths between two and six pixels.

First, we generated a binary target mask by randomly positioning the defects for each sample. Then, the corresponding surface temperature measurements were simulated by assuming adiabatic boundary conditions (i.e., no heat can flow into or out of the specimen; see, e.g., Sec. III in Ref. 9). The discrete Fourier number $\Delta F_o = \alpha \Delta t / \Delta z^2$ was chosen to be 0.45, where $\Delta t$ is the temporal resolution and $\Delta y = \Delta z$ is the spatial resolution along y and z. These surface temperature measurements were later used as input to the end-to-end approach. Note that generating new training data for different thermal diffusivities was not necessary, because we could easily rescale the temporal and spatial resolution to meet the discrete Fourier number of the training data. For our hybrid method, we computed the virtual waves from the temperature measurements as described in Sec. II A (i.e., by using ADMM with the Abel transformation10). We highlight that the estimated virtual wave vector $\tilde{\mathbf{v}}$ is independent of the thermal diffusivity α but depends on the dimensionless speed of sound c, which was chosen to be 1.

The end result for each sample comprised three single-channel images 256 by 64 pixels in size: the temperature measurements, the virtual waves, and the target mask.

Additionally, we used ten different versions of each sample, representing SNRs from −20 dB to 70 dB in 10 dB steps. On the one hand, this is a form of data augmentation intended to increase robustness against varying SNR levels during training. On the other hand, having multiple versions of a sample allows a more detailed performance evaluation and baseline comparison.
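Generating one noisy version at a target SNR amounts to adding white Gaussian noise whose variance is scaled to the mean signal power; a minimal sketch (function name ours):

```python
import numpy as np

def add_awgn(signal, snr_db, rng=None):
    """Return `signal` plus white Gaussian noise such that the resulting SNR
    is approximately `snr_db` (the dataset uses -20 dB to 70 dB)."""
    rng = np.random.default_rng() if rng is None else rng
    noise_var = np.mean(signal ** 2) / 10.0 ** (snr_db / 10.0)
    return signal + rng.normal(0.0, np.sqrt(noise_var), signal.shape)
```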

We divided these data into three disjoint subsets as follows: our training data consisted of 8000 samples. We normalized the samples to have zero mean and unit standard deviation using statistics computed on the training data only. Given that we had ten different versions of each sample, representing the different SNR levels, we ended up with 80 000 training samples.

For development and validation purposes, such as architecture engineering, hyperparameter tuning, and model selection, we always used the same 1000 samples. In total, we had 10 000 samples in our validation data.

In order to estimate the generalization capabilities and to ensure a fair comparison to the baseline, we evaluated on an additional dataset. This test dataset was unseen in every regard and also normalized based solely on the training data. It also consisted of 1000 samples in ten versions for different SNRs, thus comprising a total of 10 000 samples.

We propose two carefully designed u-net architectures, for which we used the implementation from Ref. 33 as a starting point. The first architecture was developed to be very compact and computationally inexpensive. The resulting model is best suited to embedded and/or real-time settings. The second architecture was developed with the goal of maximizing performance. While the computational cost was ignored in this case, it is still relatively small by modern standards.

The u-net architecture attracted considerable attention by winning the International Symposium on Biomedical Imaging (ISBI) cell tracking challenge in 2015. Initially, it was proposed for various medical image segmentation tasks.34 One major advantage of this architecture is that it can also produce adequate results with moderate amounts of training data. The underlying basic principle is that of a fully convolutional (i.e., without any fully connected layers) auto-encoder with a contracting path (i.e., encoder) and an expansive path (i.e., decoder). In the contracting path, the feature map resolution is successively reduced due to the effect of pooling. In the expansive path, pooling operators are replaced by upsampling operators, which successively increases the resolution.

The ability to localize is realized via skip-connections from the higher-resolution features of the contracting path to the upsampled output of the expansive path. For a more detailed description, we refer to the original paper.34 

The compact architecture has a depth of three in both the contracting and the expansive path. It has just 16 filters in the first (single-channel) layer, which results in about 109 000 weights. The second, larger architecture is very similar, except that it has an increased depth of five, which amounts to about $1.8 \times 10^{6}$ weights. Further increasing the number of filters and/or the depth of the network did not lead to improved performance in terms of validation loss.
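The sketch below (PyTorch) reproduces this overall layout under our assumptions; it is not the exact implementation of Ref. 33, and details such as filter growth and padding are our choices. With depth=3 and base=16, it roughly corresponds to the compact model; with depth=5, to the larger one:

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    # two 3x3 convolutions with ReLU, the standard u-net building block
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(c_out, c_out, 3, padding=1), nn.ReLU(inplace=True))

class UNet(nn.Module):
    def __init__(self, depth=3, base=16):
        super().__init__()
        chans = [base * 2 ** i for i in range(depth)]  # e.g., 16, 32, 64
        self.down, c_prev = nn.ModuleList(), 1         # single-channel input
        for c in chans:
            self.down.append(conv_block(c_prev, c))
            c_prev = c
        self.pool = nn.MaxPool2d(2)
        self.up, self.dec = nn.ModuleList(), nn.ModuleList()
        for c in reversed(chans[:-1]):
            self.up.append(nn.ConvTranspose2d(c_prev, c, 2, stride=2))
            self.dec.append(conv_block(2 * c, c))      # skip connection doubles channels
            c_prev = c
        self.head = nn.Conv2d(c_prev, 1, 1)            # per-pixel defect logit

    def forward(self, x):
        skips = []
        for i, down in enumerate(self.down):
            x = down(x)
            if i < len(self.down) - 1:                 # no pooling after the bottleneck
                skips.append(x)
                x = self.pool(x)
        for up, dec, skip in zip(self.up, self.dec, reversed(skips)):
            x = dec(torch.cat([up(x), skip], dim=1))
        return self.head(x)
```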

To train the weights, we used the same procedure for both models to minimize the binary cross-entropy (BCE) loss as follows. We used the Adam optimizer35 with a learning rate of $1 \times 10^{-3}$ and no weight decay. We trained for 100 epochs; fast convergence rendered a learning-rate schedule unnecessary. Furthermore, we did not apply batch normalization, as several experiments had shown that it was not useful.

The only data augmentation we applied during training was a simple left-right flip of both the input image and the target mask with a probability of 50%. Note that many augmentations known to be helpful in other image-related tasks are not useful here. For instance, stretching or shearing applied to an input and target image would not result in a physically consistent input-target pair.
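A training-loop sketch matching this regime is given below (BCE on logits, Adam at $1 \times 10^{-3}$, 100 epochs, joint left-right flips applied per batch for brevity); the data loader yielding (input, mask) float batches of shape (B, 1, 256, 64) is an assumption:

```python
import torch
import torch.nn.functional as F

def train(model, loader, epochs=100, lr=1e-3, device="cuda"):
    model.to(device)
    opt = torch.optim.Adam(model.parameters(), lr=lr)  # no weight decay
    for _ in range(epochs):
        for x, y in loader:
            if torch.rand(1).item() < 0.5:             # 50% left-right flip
                x, y = torch.flip(x, dims=[-1]), torch.flip(y, dims=[-1])
            x, y = x.to(device), y.to(device)
            loss = F.binary_cross_entropy_with_logits(model(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```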

1. Training end-to-end

In this section, we investigate the performance of an end-to-end approach. We fed the temperature measurements directly to the neural networks. This approach has the least computational load, but at the same time the highest task complexity.

For additional insights, we trained both architectures with four different amounts of training data, starting with 10 000 samples. We then increased the size of the training dataset repeatedly by a factor of two until we ended up with the full set comprising 80 000 samples. In order to increase the meaningfulness of the results, we repeated each training five times and report the means and standard deviations. Note that we always used the exact same validation data as described in Sec. III A.

The outcome of this experiment is summarized in Fig. 6, which shows a side-by-side comparison of the two architectures. We refer to the compact and to the larger architecture as cmp and lrg, respectively. As can be seen, increasing the number of training samples resulted in models with lower loss and a smaller generalization gap (i.e., the difference between training and validation loss). Additionally, the compact model fit the training and validation data less well than the larger model but also showed less overfitting. This is due to the low number of learnable weights, which acts as a regularizer. The compact architecture gave consistent results, as the standard deviation of both the training and validation results was relatively small (see cmp train/cmp val). The generalization gap vanished when 40 000 and 80 000 training samples were used.

FIG. 6.

End-to-end approach results with increasing training dataset size for both architectures. cmp train/cmp val: training and validation data loss for the compact model (109 000 weights); lrg train/lrg val: training and validation data loss for the larger model (1.8×106 weights). Error bars correspond to the standard deviations of the results after training five times.


The results of the larger architecture are similar, except the loss was generally lower than for the compact architecture. Since the generalization gap also remained when training with 80 000 samples, increasing the training set size even further might be useful.

2. Training with virtual waves

In this section, we investigate the performance of a hybrid approach. Here, the goal was to learn the reconstruction process from $\tilde{\mathbf{v}}$ to $\tilde{\mathbf{u}}$, which also incorporates the previously mentioned a priori knowledge. The input of the network was provided by the first regularization step in Eq. (5). The original end-to-end reconstruction problem from $\mathbf{d}$ to $\tilde{\mathbf{u}}$ was thus converted into a much easier task, namely, the recognition of hyperbolas in $\tilde{\mathbf{v}}$.

The setup of the experiment was the same as with the end-to-end approach, except that we used virtual waves instead of temperature measurements as input. Note that, to achieve the most meaningful comparison possible, all the samples used for training and validation matched exactly those used in the end-to-end approach. Again, we repeated each training five times and report the means and standard deviations.

Figure 7 shows a side-by-side comparison of the two architectures. In general, the results reflect model behavior similar to that of the end-to-end approach, except that the loss was consistently lower. This was expected, as the hybrid approach takes some of the computational burden away from the neural network compared to the end-to-end approach. Therefore, virtual waves seem to be a useful mid-level representation for learning the reconstruction process.

FIG. 7.

Hybrid approach results with increasing training dataset size for both architectures. cmp train/cmp val: training and validation data loss for the compact model (109 000 weights); lrg train/lrg val: training and validation data loss for the larger model (1.8×106 weights).


After training, we selected the best models for each architecture and both the hybrid and the end-to-end approach according to the validation data results. The resulting four models were then evaluated on the unseen test data against the baseline.

In this section, we discuss the results of the end-to-end and the hybrid approaches on the unseen test set and compare them to the baseline (see Sec. II B).

We start by providing intuitive insights by comparing the reconstructions of our approaches with the baseline method. For this, we take one of the hardest examples from the 0 dB SNR results. In addition to the low SNR, two other points make this example challenging. First, several defects are in close proximity and risk becoming merged together. Second, there is a defect located in between and underneath two other defects, which makes it difficult to detect.

Figure 8 shows the target (mask) in the top-left and the baseline reconstruction fkmig in the top-right subplot. Compared to the baseline, all four of our proposed approaches produced reconstructions that were much closer to the desired result, even e2e cmp, the computationally cheapest one. Furthermore, only the baseline method seemed to merge several defects together. Detecting the “shadowed” defect near the lower right corner was difficult for all methods. In general, only the larger models were able to detect the defects deeper in the material (the lower, the deeper). Furthermore, our deep-learning-based methods produced almost no artifacts, making it easier to find a good threshold for the final binary defect/no defect decision.

FIG. 8.

A challenging 0 dB SNR example reconstructed by the baseline fkmig and our proposed methods (MSE in brackets).


Notice that e2e lrg identifies the “shadowed” defect closer to the expected area without any false positive artifacts, although the MSE is larger than that of hybrid lrg. This might seem counterintuitive at first, but can be attributed to the fact that a visual—hence subjective—inspection of the results is prone to missing smaller differences from the target that still contribute to the objective error metric. For instance, in hybrid lrg of Fig. 8, three out of five defects have almost perfect reconstruction. In the case of e2e lrg, the same defects have blurred edges and slightly different amplitudes, which increase the overall MSE of the reconstructed image.

Depending on the specific application, it might be the case that either the absence of artifacts, or the correct position, shape, or size of the defect is the most important criterion for evaluation. Therefore, a different evaluation metric might be required. Combined with other techniques, such as uncertainty estimates of the predictions,36 a practically useful and efficacious system can then be engineered.

Next, we present the results of a more objective evaluation. Figure 9 shows the results of the baseline fkmig and of our proposed methods in terms of the MSE for various SNRs. Unsurprisingly, for all methods lower SNRs led to worse reconstructions and thus to higher losses. As can be seen, even the computationally cheapest method e2e cmp provided a substantial improvement over the baseline. Overall, the results confirm two findings from the validation results. First, the larger models hybrid lrg, e2e lrg performed consistently better in terms of MSE compared to their compact counterparts hybrid cmp, e2e cmp. Second, the hybrid models hybrid cmp, hybrid lrg performed consistently better in terms of MSE compared to their end-to-end counterparts e2e cmp and e2e lrg.

FIG. 9.

Comparison of baseline fkmig and our proposed methods for various SNRs.


Interestingly, between 30 dB and 70 dB SNR, both hybrid models exhibited improving performance, whereas both end-to-end models stayed at approximately the same level of loss. This is another indicator that the virtual waves represent the information of the temperature measurements in a way that is easier to process for the neural networks. In order to explain this phenomenon in more detail, we present Fig. 10, which depicts measurements (left column) and their corresponding virtual waves (right column) from the same sample under different SNR conditions. The plots at the bottom represent the corresponding squared differences. In order to make the squared differences visible to the naked eye, we had to upscale their magnitudes by factors of 10 000 and 10 for the measurements and the virtual waves, respectively.

FIG. 10.

Comparison of measurements (left column) and virtual waves (right column) for different SNRs from the same sample. The squared difference (upscaled for clarity, true MSE in brackets) between the two virtual waves seems to contain more meaningful information than the squared difference between the measurements, allowing the neural network to further improve above 30 dB SNR when fed with virtual waves.


Clearly, the virtual waves differ, with the 70 dB representation looking less blurry than the 30 dB representation, especially toward the bottom of the image. Additionally, the squared difference shows a meaningful pattern—information which potentially enables the neural network to produce ever better results as the SNR conditions improve.

On the other hand, the squared difference of the measurements does not show such an obvious pattern and is lower in magnitude by a factor of approximately 1000 (see the MSE in brackets in Fig. 10). As the inputs to the end-to-end approach are extremely similar, the neural network also produces similar outputs, which is why the end-to-end results no longer improve above 30 dB SNR. This further demonstrates the effectiveness of the hybrid approach, which incorporates domain knowledge.

The experiments described so far were based on synthetic data only. However, we cannot simply assume that these results translate to the physical world, especially when we also use synthetic data for training, as was the case with the deep-learning-based methods. The generative process for the synthetic data might not incorporate all relevant aspects, rendering the results less relevant to practical applications.

Additionally, confounding factors could mean that the seemingly good performance of a machine learning system is not the consequence of the system actually understanding the underlying concepts that determine the target.37–40 Therefore, we considered it extremely important to provide additional results using real-world specimens.

However, for the task at hand, real-world data are hard to come by, mainly for two reasons. First, we are often blind to the objective ground truth of the real-world specimen, as obtaining it would require destroying the specimen itself. Thus, we do not report the MSE as we did with the synthetic data, but demonstrate the differences between the methods by showing their outputs along with an estimate of the ground truth. Second, creating data in the physical world is more complex than, for instance, collecting and annotating samples for an image classification task. Therefore, the following results are based on a single specimen, and we took measurements from various angles to increase their meaningfulness.

First, a physical epoxy resin phantom with two embedded steel rods was created as our real-world specimen. The rods were heated by eddy current induction for 2 s using an induction generator that provided 3 kW of power at a frequency of 200 kHz. The resulting temperature evolution was then measured on the surface by means of an infrared camera. The thermal diffusivity of the material was $\alpha = 8.14 \times 10^{-8}\,\mathrm{m}^2/\mathrm{s}$ [see Eq. (4)]. Additional physical parameters of the specimen and the measurement setup are shown in Figs. 11(a) and 11(b), respectively. Note that this setup emulates internal heat sources within the test sample; since the epoxy resin is opaque (i.e., not transparent) in the infrared spectral range, we can only measure the surface temperature and must reconstruct the internal heat sources from it.

FIG. 11.

Real-world experiment. (a) Physical properties of the specimen and (b) measurement setup.


Figure 12(a) shows the measured temperature data at the sample surface with the highest contrast. It is clearly visible that the lower regions of the rods cannot be detected due to the low SNR. Figure 12(b) presents a snapshot of the measurement along the vertical diameter of the specimen at a spatial resolution of $\Delta x = 0.04/238\,\mathrm{m}$ and a temporal resolution of $\Delta t = 0.04\,\mathrm{s}$. As can be seen, some vertical and horizontal lines, which are caused by the pixel crosstalk of the infrared camera,41 distort the original image. This pattern noise was removed by eliminating the characteristic frequencies indicated by the red lines in Fig. 12(c). The filtered measurement data were then used to extract the virtual waves and to reconstruct the initial temperature profile T0 in each 2D cross section of the specimen. This was followed by compositing all 2D estimates into a 3D reconstruction. For the end-to-end approach, the measurement data were rescaled to obtain a discrete Fourier number of 0.45, which enabled us to apply the generated training data.

FIG. 12.

Real-measurement data. (a) A frame recorded at the top of the specimen; (b) temperature evolution over time measured along the green dashed line in (a); (c) log spectra of the temperature data in (b), where the frequencies along the red lines were filtered out.


As in Secs. III–V, we chose the reconstruction processes that achieved the lowest MSE, that is, the hybrid and end-to-end approaches with the large architectures (hybrid lrg and e2e lrg), and applied them to the real-world data. We considered four variations of the measurement data. In the first case, the cross sections were perpendicular to the orientation of the steel rods, while in the other examples we rotated the specimen by 10°, 25°, and 45°.

The results of the model-based fkmig approach can be seen in Figs. 13–16. We reconstructed the volume of the steel rods (red) and also show the estimated ground truth (blue) with the orientation of the cross sections (green). As can be seen, the two steel rods deeper inside the specimen no longer appear as separated defects. This effect becomes more pronounced in the rotated examples, and artifacts begin to appear at the edge of the specimen; this is especially visible in Fig. 16, which shows the results for the specimen rotated by 45°. Note that the results of the fkmig approach were so noisy that postprocessing was necessary: we zeroed out those pixels in the reconstructed image whose values were below a certain threshold. Tuning this threshold was a difficult task for the fkmig approach. For the sake of simplicity, we set a uniform threshold of 200 for all reconstructions in Figs. 13–24.

FIG. 13.

Using the model-based fkmig approach for 3D reconstruction of the specimen without rotation.

FIG. 14.

Using the model-based fkmig approach for 3D reconstruction of the specimen with a rotation of 10°.

FIG. 15.

Using the model-based fkmig approach for 3D reconstruction of the specimen with a rotation of 25°.

FIG. 16.

Using the model-based fkmig approach for 3D reconstruction of the specimen with a rotation of 45°.

FIG. 17.

Using the large end-to-end e2e lrg approach for 3D reconstruction of the specimen without rotation.

FIG. 18.

Using the large end-to-end e2e lrg approach for 3D reconstruction of the specimen with a rotation of 10°.

FIG. 19.

Using the large end-to-end e2e lrg approach for 3D reconstruction of the specimen with a rotation of 25°.

FIG. 20.

Using the large end-to-end e2e lrg approach for 3D reconstruction of the specimen with a rotation of 45°.

FIG. 21.

Using the large hybrid lrg approach for 3D reconstruction of the specimen without rotation.

FIG. 22.

Using the large hybrid lrg approach for 3D reconstruction of the specimen with a rotation of 10°.

FIG. 23.

Using the large hybrid lrg approach for 3D reconstruction of the specimen with a rotation of 25°.

FIG. 24.

Using the large hybrid lrg approach for 3D reconstruction of the specimen with a rotation of 45°.


The end-to-end approach e2e lrg yielded meaningful 3D reconstructions, as can be seen in Figs. 17–20, where the two defects stay clearly separated except at the edges of the specimen. Again, thresholding was necessary to eliminate some artifacts from the figures, but the end-to-end method turned out to be relatively insensitive to the value of the threshold. Compared to the model-based fkmig approach, the results seem to be affected less by rotating the specimen, and fewer artifacts appear. However, the volume of the steel rods was underestimated. Interestingly, this approach also provided good results for the test case with the largest rotation angle (45°; see Fig. 20).

The results of the hybrid approach can be seen in Figs. 21–24. As with the end-to-end approach, thresholding was necessary to eliminate some artifacts from the figures. Overall, the results seem to be similar to those from the end-to-end approach, and both steel rods appear as separate defects. However, the volume of the steel rods seems to be closer to the estimated ground truth. Note that the results for the specimen with the greatest rotation (45°) show signs of deterioration, and thresholding did not help to improve the quality of the reconstruction. We discuss these limitations and the validity of the proposed deep learning approaches in Sec. VI.

Both deep-learning-based methods outperformed the model-based baseline substantially on synthetic data. They also seem to perform very well on real-world data, even though no real-world measurements were used in the training process. The networks successfully generalized the learned reconstruction algorithm to real-world data even under previously unseen conditions (rotated specimen). Compared to the model-based fkmig approach, the results are more consistent with the estimated ground truth, and thresholding was effortless.

Which method is preferable depends on the particular use case. If computational complexity must be kept as low as possible, the end-to-end approach is likely to be the better choice, although the results might not reflect the actual size and/or shape of a defect. If the quality of the results is more important than low complexity, the hybrid approach seems to be the better choice. The reason for the results of the hybrid approach deteriorating for the 45° specimen is given in Sec. VI A.

Although the first stage of the reconstruction process (i.e., the virtual wave method) is valid in all dimensions, the second stage is applicable in 2D only. In fact, the hybrid deep learning approach was trained on synthetic 2D data assuming adiabatic boundary conditions. This means that the proposed real-world example worked well as long as the 2D slices were perpendicular or nearly perpendicular to the orientation of the steel rods. The deviation from the perpendicular orientation should not be greater than the angle between the steel rods and the measurement surface, which was about 10°. Note that the trained u-net nevertheless successfully generalized the results to rotated specimens. Interestingly, the reconstruction was very accurate even for a 25° rotation.

Due to the ill-posed nature of the thermal reconstruction problem, certain noise assumptions must be fulfilled in order to obtain feasible solutions.42 In NDT applications, the temperature change is small for short integration times, and thus the noise, which otherwise follows a Poisson distribution, is considered to be additive white Gaussian (AWGN).41 In this work, we assumed AWGN with a range of variances such that the training set SNRs matched real-world experiments. In order to check this assumption, we estimated the SNRs and the peak signal-to-noise ratios (PSNRs) of each cross section d in our test specimen [cf. Fig. 12(b)] as follows:

$$
\mathrm{SNR} = 10\,\log_{10}\frac{\|\mathbf{d}\|_2^2 / N}{\epsilon^2}, \qquad \mathrm{PSNR} = 10\,\log_{10}\frac{\max(\mathbf{d})^2}{\epsilon^2},
$$

where N denotes the overall number of elements in d, and $\epsilon = 0.025\,\mathrm{K}$ is the noise-equivalent differential temperature (NEDT) of our infrared camera. Figure 25 shows the estimated SNR and PSNR values for each vertical cross-section image of the measurement data. The range of SNRs (−20 dB to 70 dB) of the training set covered the estimated SNRs of the measurement data.
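A sketch of this estimate under the definitions above, with the NEDT as the noise level (function name ours):

```python
import numpy as np

def estimate_snr_psnr(d, nedt=0.025):
    """SNR and PSNR (in dB) of one measurement cross section d, taking the
    camera's noise-equivalent differential temperature (in K) as noise level."""
    snr = 10.0 * np.log10(np.mean(d ** 2) / nedt ** 2)
    psnr = 10.0 * np.log10(np.max(d) ** 2 / nedt ** 2)
    return snr, psnr
```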

FIG. 25.

Estimated SNR and PSNR values of the measurement frames.


From a computational point of view, training deep neural networks is an expensive task. However, once training is completed and the weights are fixed, inference on unseen data is comparatively cheap. Inspired by Sovrasov's work,43 we computed the number of multiply-accumulate operations (MACs) in our u-net architectures: e2e cmp required 0.4 GMAC, while e2e lrg needed 0.76 GMAC to process one input image of size 256 × 64. In the case of the hybrid and the model-based approaches, the computational complexity of the virtual wave extraction, which was 65.5 GMAC, must be added to the overall number of MACs. Note that the virtual wave extraction works on raw data, which explains its high computational load compared to the other three algorithms in Table I. This first stage can be sped up by processing the measurement frames in parallel.
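MAC counting along these lines can be reproduced, for instance, with the ptflops package (Sovrasov's flops counter, Ref. 43); the sketch below applies it to the hypothetical UNet from the architecture sketch above, so the exact numbers will differ from those of our released models:

```python
from ptflops import get_model_complexity_info

# Count MACs and parameters for one single-channel 256 x 64 input image.
macs, params = get_model_complexity_info(
    UNet(depth=3, base=16),   # compact-model stand-in from the earlier sketch
    (1, 256, 64),             # (channels, height, width); batch dim is implied
    as_strings=True,
    print_per_layer_stat=False,
)
print(macs, params)
```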

TABLE I.

Computational complexity of the proposed algorithms for processing a single input image.

Method    cmp        lrg        fkmig      vwave extraction
Input     256 × 64   256 × 64   256 × 64   256 × 2000
GMAC^a    0.4        0.76       0.0077     65.5

^a Estimated number of overall giga multiply-accumulate operations.

In order to give an impression of the reconstruction speed, we measured the execution time on an Intel(R) Core(TM) i9-9900K CPU system at 3.60 GHz equipped with an NVIDIA GeForce GTX TITAN X GPU. According to our experiments, the 3D reconstruction of the real-world test specimen took approximately 2 s for e2e lrg, while it was around 800 s for the large hybrid and the model-based fkmig approaches, including the virtual wave extraction. Note that in our setup, the difference in inference time between the compact and the large u-net model was negligible due to our GPU's ability to parallelize matrix multiplications. However, this difference will be more pronounced in settings where no GPU is available, for instance, on an embedded device.

We have proposed an end-to-end and a hybrid deep learning approach for thermographic image reconstruction. The latter uses the recently developed virtual wave concept,9 which proved to be an efficient feature extraction technique for our hybrid deep learning approach.17 We studied each step of the reconstruction chain by means of quantitative and qualitative tests. In doing so, we developed a framework44 for generating data samples, which can be used to train, validate, and test machine learning algorithms for thermographic imaging. In addition to these synthetic examples, we made a physical phantom from epoxy and used it for experiments on real-world data.45 

Our experiments showed that, in terms of MSE, the hybrid method performs better on synthetic data than the model-based and the end-to-end approaches. Both our deep learning methods performed well on real-world data compared to their model-based counterparts. The hybrid approach produced only a few artifacts and achieved the best reconstruction for test cases in which the 2D model was valid (i.e., with 0° and 10° of rotation), while the end-to-end method gave meaningful results on the real-world measurement data up to the highest rotation angle (45°). Overall, we conclude that the proposed hybrid method outperforms the model-based and the end-to-end approaches in terms of reconstruction error, while the end-to-end approach might perform better in online applications, where execution speed is crucial. As a first approximation, the end-to-end approach provides a fast reconstruction, which can be replaced by the hybrid method if the results are not satisfactory.

Further, we found that the virtual wave concept serves as an efficient feature extraction technique by which prior knowledge, such as non-negativity and sparsity, can be incorporated into the training process to improve the generalization properties of the neural network solutions and to reduce the size of the training set. In our simulations, the amplitude along the objects/defects was considered to be constant, mimicking, for instance, a metallic wire with a constant cross section inside an insulating mass. In accordance with this approach, we designed a real-world experiment that fits well with the assumptions we made for the training set. In fact, the amplitude along the steel rods did not change, the infrared camera we used fulfilled the AWGN assumption,42 and the SNR range of the measurements was covered by the training set. The applicability of the proposed deep learning approaches can be extended by augmenting the training set with additional data and assuming weaker constraints, such as inhomogeneous defects and non-Gaussian noise. This will be part of our future work.

P. Kovács and B. Lehner contributed equally to this work and both should be considered first authors of this manuscript.

This work was supported by Silicon Austria Labs (SAL), owned by the Republic of Austria, the Styrian Business Promotion Agency (SFG), the federal state of Carinthia, the Upper Austrian Research (UAR), and the Austrian Association for the Electric and Electronics Industry (FEEI); and by the COMET-K2 “Center for Symbiotic Mechatronics” of the Linz Center of Mechatronics (LCM), funded by the Austrian Federal Government and the Federal State of Upper Austria.

Financial support by the Austrian Federal Ministry for Digital and Economic Affairs, the National Foundation for Research, Technology and Development, and the Christian Doppler Research Association is gratefully acknowledged. Financial support was also provided by the Austrian Research Funding Association (FFG) within the scope of the COMET programme within the research project “Photonic Sensing for Smarter Processes (PSSP)” (Contract No. 871974). This programme is promoted by BMK, BMDW, the federal state of Upper Austria, and the federal state of Styria, represented by SFG.

Additionally, parts of this work were supported by the Austrian Science Fund (FWF) (Project Nos. P 30747-N32 and P 33019-N).

The data and code that support the findings of this study are available at Ref. 46.

1. C. Antolis and N. Rajic, "Optical lock-in thermography for structural health monitoring—A study into infrared detector performance," Procedia Eng. 188, 471–478 (2017).
2. O. Lang, P. Kovács, C. Motz, M. Huemer, T. Berer, and P. Burgholzer, "A linear state space model for photoacoustic imaging in an acoustic attenuating media," Inverse Probl. 35(1), 1–29 (2018).
3. G. Mayr, G. Stockner, H. Plasser, G. Hendorfer, and P. Burgholzer, "Parameter estimation from pulsed thermography data using the virtual wave concept," NDT E Int. 100, 101–107 (2018).
4. K. Sreekumar and A. Mandelis, "Ultra-deep bone diagnostics with fat-skin overlayers using new pulsed photothermal radar," Int. J. Thermophys. 34, 1481–1488 (2013).
5. P. T. Tavakolian and A. Mandelis, "Perspective: Principles and specifications of photothermal imaging methodologies and their applications to non-invasive biomedical and non-destructive materials imaging," J. Appl. Phys. 124, 160903 (2018).
6. N. Verdel, J. Tanevski, S. Džeroski, and B. Majaron, "Predictive model for the quantitative analysis of human skin using photothermal radiometry and diffuse reflectance spectroscopy," Biomed. Opt. Express 11, 1679–1696 (2020).
7. A. Mendioroz, K. Martínez, R. Celorrio, and A. Salazar, "Characterizing the shape and heat production of open vertical cracks in burst vibrothermography experiments," NDT E Int. 102, 234–243 (2019).
8. M.-M. Groz, E. Abisset-Chavanne, A. Meziane, A. Sommier, and C. Pradère, "Three-dimensional reconstruction of thermal volumetric sources from surface temperature fields measured by infrared thermography," Appl. Sci. 9, 5464 (2019).
9. P. Burgholzer, M. Thor, J. Gruber, and G. Mayr, "Three-dimensional thermographic imaging using a virtual wave concept," J. Appl. Phys. 121, 105102 (2017).
10. G. Thummerer, G. Mayr, M. Haltmeier, and P. Burgholzer, "Photoacoustic reconstruction from photothermal measurements including prior information," Photoacoustics 19, 100175 (2020).
11. G. Thummerer, G. Mayr, P. D. Hirsch, M. Ziegler, and P. Burgholzer, "Photothermal image reconstruction in opaque media with virtual wave backpropagation," NDT E Int. 112, 102239 (2020).
12. L. J. Busse, "Three-dimensional imaging using a frequency-domain synthetic aperture focusing technique," IEEE Trans. Ultrason. Ferroelectr. Freq. Control 39, 174–179 (1992).
13. D. Lévesque, A. Blouin, C. Néron, and J.-P. Monchalin, "Performance of laser-ultrasonic F-SAFT imaging," Ultrasonics 40, 1057–1063 (2002).
14. P. Burgholzer, G. Stockner, and G. Mayr, "Acoustic reconstruction for photothermal imaging," Bioengineering 5, 1–9 (2018).
15. P. Burgholzer, "Thermodynamic limits of spatial resolution in active thermography," Int. J. Thermophys. 36, 2328–2341 (2015).
16. J. Adler and O. Öktem, "Solving ill-posed inverse problems using iterative deep neural networks," Inverse Probl. 33, 124007 (2017).
17. P. Kovács, B. Lehner, G. Thummerer, M. Günther, P. Burgholzer, and M. Huemer, "A hybrid approach for thermographic imaging with deep learning," in Proceedings of the 45th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (IEEE, 2020), pp. 4277–4281.
18. P. C. Hansen, Rank-Deficient and Discrete Ill-Posed Problems: Numerical Aspects of Linear Inversion (SIAM Monographs on Mathematical Modeling and Computation, Philadelphia, PA, 1998).
19. C. W. Groetsch, "Integral equations of the first kind, inverse problems and regularization: A crash course," J. Phys. Conf. Ser. 73, 012001 (2007).
20. F. J. Herrmann and G. Hennenfent, "Non-parametric seismic data recovery with curvelet frames," Geophys. J. Int. 173, 233–248 (2008).
21. E. Candès, L. Demanet, D. Donoho, and L. Ying, "Fast discrete curvelet transforms," Multiscale Model. Simul. 5, 861–899 (2006).
22. J. Eckstein and D. P. Bertsekas, "On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators," Math. Program. 55, 293–318 (1992).
23. P. C. Hansen, "Analysis of discrete ill-posed problems by means of the L-curve," SIAM Rev. 34, 561–580 (1992).
24. D. Garcia, L. Le Tarnec, S. Muth, E. Montagnon, J. Porée, and G. Cloutier, "Stolt's f-k migration for plane wave ultrasound imaging," IEEE Trans. Ultrason. Ferroelectr. Freq. Control 60, 1853–1867 (2013).
25. F. Lingvall, T. Olofsson, and T. Stepinski, "Synthetic aperture imaging using sources with finite aperture: Deconvolution of the spatial impulse response," J. Acoust. Soc. Am. 114, 225–234 (2003).
26. J. F. Claerbout, Earth Soundings Analysis: Processing versus Inversion (Blackwell Scientific Publications, Cambridge, MA, 2004).
27. S. Gazzola, P. C. Hansen, and J. G. Nagy, "IR tools: A MATLAB package of iterative regularization methods and large-scale test problems," Numer. Algorithms 81, 1–39 (2018).
28. M. Yuan and Y. Lin, "Model selection and estimation in regression with grouped variables," J. R. Stat. Soc. Ser. B 68, 49–67 (2006).
29. E. van den Berg and M. P. Friedlander, "Sparse optimization with least-squares constraints," SIAM J. Optim. 21, 1201–1229 (2011).
30. Y. Zhang, J. Yang, and W. Yin, "Alternating direction algorithms for l1-problems in compressive sensing," SIAM J. Sci. Comput. 33, 1873–1896 (2011).
31. E. van den Berg and M. P. Friedlander, "Probing the Pareto frontier for basis pursuit solutions," SIAM J. Sci. Comput. 31, 890–912 (2008).
32. E. van den Berg and M. P. Friedlander, "SPGL1: A solver for large-scale sparse reconstruction" (2019), see https://friedlander.io/spgl1.
33. J. van Vugt, "Tunable u-net implementation in PyTorch" (2019), see https://github.com/jvanvugt/pytorch-unet.
34. O. Ronneberger, P. Fischer, and T. Brox, "U-net: Convolutional networks for biomedical image segmentation," in International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer, 2015), pp. 234–241.
35. D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," in Proceedings of the 3rd International Conference on Learning Representations (ICLR) (OpenReview.net, 2015), pp. 1–15.
36. B. Lehner and T. Gallien, "Uncertainty estimation for non-destructive detection of material defects with u-nets," in Proceedings of the 2nd International Conference on Advances in Signal Processing and Artificial Intelligence (ASPAI) (IFSA Publishing, 2020).
37. I. J. Goodfellow, J. Shlens, and C. Szegedy, "Explaining and harnessing adversarial examples," in Proceedings of the 3rd International Conference on Learning Representations (ICLR), edited by Y. Bengio and Y. LeCun (OpenReview.net, 2015).
38. M. T. Ribeiro, S. Singh, and C. Guestrin, "Why should I trust you?: Explaining the predictions of any classifier," in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (ACM, 2016), pp. 1135–1144.
39. B. Lehner, J. Schlüter, and G. Widmer, "Online, loudness-invariant vocal detection in mixed music signals," IEEE/ACM Trans. Audio Speech Lang. Process. 26, 1369–1380 (2018).
40. B. L. Sturm, "A simple method to determine if a music information retrieval system is a 'horse'," IEEE Trans. Multimedia 16, 1636–1644 (2014).
41. S. Breitwieser, G. Zauner, and G. Mayr, "Characterization of mid-wavelength quantum infrared cameras using the photon transfer technique," Infrared Phys. Technol. 106, 103283 (2020).
42. M. N. Özisik and H. Orlande, Inverse Heat Transfer: Fundamentals and Applications (Taylor & Francis, New York, NY, 1999).
43. V. Sovrasov, "Flops counter for convolutional networks in PyTorch framework" (2020), see https://github.com/sovrasov/flops-counter.pytorch.
44. P. Kovács, B. Lehner, G. Thummerer, G. Mayr, P. Burgholzer, and M. Huemer, "A hybrid approach for thermographic imaging with deep learning" (2019), see https://codeocean.com.
45. N. Verdel, J. Tanevski, S. Džeroski, and B. Majaron, "A machine-learning model for quantitative characterization of human skin using photothermal radiometry and diffuse reflectance spectroscopy," in Proceedings of SPIE 10851, Photonics in Dermatology and Plastic Surgery (SPIE, 2019).
46. B. Lehner, ThermUNet - Deep Learning Approaches for Thermographic Imaging (Online, 2020); available at https://git.silicon-austria.com/pub/confine/ThermUNet.