As a promising preclinical imaging technique, optical molecular tomography (OMT) shows great potential in early detection and diagnosis of tumor diseases. However, its widespread application has been hindered by the limitations of traditional reconstruction methods, specifically the accuracy of optical transmission models and the ill-posed nature of inverse reconstruction. The development of deep learning has offered novel solutions for OMT, effectively alleviating the ill-posedness of reconstruction. However, existing deep learning approaches employ conventional neural networks and objective functions, leaving significant scope for improving the accuracy of image reconstruction. In this paper, we propose a source distribution correlation enabled self-attention residual network (DCeSR network) to address the need for accurate OMT reconstruction. The DCeSR network leverages a residual learning strategy and a self-attention mechanism to effectively integrate the deep and shallow features, subsequently extracting highly informative surface measurements to accurately predict the three-dimensional distribution of light sources within tissues. The efficacy of the DCeSR network was validated through training and testing with two distinct numerical simulation datasets, each encompassing both single and dual source configurations. Both qualitative and quantitative analyses demonstrate the superior performance of the DCeSR network in achieving accurate OMT reconstructions.
I. INTRODUCTION
Optical molecular imaging (OMI) is a highly promising molecular imaging technology that translates cellular and molecular behavior into visual patterns using specialized optical imaging equipment. It can obtain in vivo information about lesions and enables the dynamic monitoring of biological processes in living organisms.1,2 OMI has garnered significant attention in pre-clinical small animal research due to its non-invasiveness, high sensitivity, and strong specificity.3,4 However, as a two-dimensional imaging technique, it cannot provide the three-dimensional (3D) distribution of lesions within the imaging body. To address this limitation, optical molecular tomography (OMT) has attracted the attention of researchers, offering the capability to generate 3D reconstructions of lesions.5–7 OMT relies on surface light transmission data obtained through optical detectors and structural data of the imaging body to reconstruct the 3D distribution of the target within the body. The accuracy of the reconstruction algorithm is critical, as it directly impacts the imaging quality of OMT. Nonetheless, reconstructing accurate 3D images remains a challenging task due to the limited photon information collected from the surface and the highly scattering nature of biological tissues.
Traditionally, common strategies to address the ill-posed nature of 3D reconstruction include various types of prior information and regularization methods, such as constraining feasible regions8–10 and employing multispectral measurement strategies.11,12 OMT reconstruction is also highly sensitive to noise. To mitigate the inevitable noise interference, the reconstruction process is typically formulated as a minimization problem with regularization terms, making it more tractable in practical applications.13–15 The development of hybrid models has further improved the accuracy and efficiency of optical transmission models, enhancing both imaging precision and computational performance. However, despite these advances, limitations persist, including the mismatch between the optical transmission model and the actual transmission process, as well as inefficient reconstruction due to the iterative nature of solving OMT problems.
Deep learning has demonstrated considerable potential and advantages in addressing OMT reconstruction challenges.16–18 By directly fitting the nonlinear relationship between the measured surface photon energy and the internal source distribution, deep learning methods circumvent the errors introduced by optical transport models and the computational cost of model construction. Moreover, once the network is trained, OMT reconstruction becomes very efficient. Meng et al. developed a K-nearest neighbor based local connectivity (KNN_LC) network, which utilizes the K-nearest neighbor strategy to enable rapid morphological reconstruction in fluorescence molecular tomography (FMT).19 Zhang et al. proposed an adaptive adversarial learning strategy (3D-UR-WGAN), which combines the adversarial loss of a generative adversarial network (GAN) with an L1 loss to achieve robust FMT reconstruction.20 Zhang et al. developed an attention mechanism based locally connected (AMLC) network to reduce barycenter error and improve morphological restorability.21 Deng et al. developed a reconstruction model named FDU-Net, which comprises a fully connected subnet, a convolutional encoder–decoder subnet, and a U-Net, for fast end-to-end 3D diffuse optical tomography (DOT) image reconstruction.22 The results from digital phantoms and simulation data showed that FDU-Net significantly outperforms traditional DOT reconstruction methods, offering a computational speedup of over four orders of magnitude. While these deep learning approaches help alleviate the ill-posedness of OMT reconstruction and enhance both speed and quality, they do not consider similarity relationships within the data distribution and provide insufficient validation of network module effectiveness, which may limit the overall accuracy and reliability of OMT reconstruction.
In this paper, inspired by DC2-NET,23 we propose a source distribution correlation enabled self-attention residual (DCeSR) network, designed to enhance localization and morphological accuracy in OMT reconstruction. The DCeSR network leverages residual learning strategies and self-attention mechanisms to effectively extract features and accurately predict the 3D distribution of light sources in biological tissues. Additionally, a combination of the distribution correlation (DC) loss function and the mean square error (MSE) loss function is introduced to improve the OMT reconstruction. We evaluated the DCeSR network through two different numerical simulation experiments, where it produced more precise and robust reconstruction results than the AMLC and KNN_LC methods.
II. METHODS
A. DCeSR network architecture
The DCeSR network emphasizes localization accuracy in reconstruction. By employing a residual learning strategy and a self-attention mechanism, it integrates deep and shallow feature information from the measured surface photon energy to effectively extract features and accurately predict the 3D distribution of light sources in biological tissues. The objective function combines the mean squared error (MSE) loss function, which focuses on data distribution similarity, with the distribution correlation (DC) loss function, which emphasizes data distribution correlation. This combination aims to continuously reduce the difference between the reconstructed and the actual light source distributions. Additionally, an L1 loss was incorporated to enhance the robustness of the reconstruction results. The architecture of the DCeSR network, shown in Fig. 1, comprises two fully connected layers, a residual learning block, a self-attention block, and an objective function module. First, the acquired surface photon energy measurements are fed into the network as the input vector. Features from the first fully connected layer are gradually extracted through the residual and self-attention blocks. Finally, the feature vector passes through another fully connected layer to output the final reconstructed source distribution. The residual block includes a batch normalization layer to enhance network training. Additionally, a down-sampling layer is employed to efficiently extract features and reduce the number of parameters, minimizing computational cost. An up-sampling layer is also used to reshape and rescale the features for the subsequent multiplication operations.
Attention mechanisms have enabled deep learning models to achieve strong performance in various tasks. In the context of OMT reconstruction, surface photon node information directly linked to the source is paramount in predicting the 3D distribution of the source. To exploit this, we employ a self-attention block. Its input is the feature vector generated by the residual learning block, and its output is an attention map that emphasizes the most informative surface photon energy measurements by assigning them higher weights. Furthermore, the down-sampling and up-sampling layers within the residual learning block support effective feature scaling and reshaping. Through these measures, greater weights are assigned to the most effective extracted features, which enhances their informativeness and benefits OMT reconstruction. Unlike the AMLC network, this model employs a self-attention mechanism to strengthen the focus on significant correlations among data within the same dimension. Additionally, the entire DCeSR network operates end to end, allowing OMT reconstruction to be achieved without introducing additional parameters for fine-tuning.
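The weighting idea described above can be illustrated with a minimal sketch of dot-product self-attention followed by a residual (skip) connection. This is not the DCeSR implementation: each feature is treated as a one-dimensional token, and the scalar projection weights `w_q`, `w_k`, and `w_v` are hypothetical stand-ins for the projection matrices a trained network would learn.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(features, w_q=1.0, w_k=1.0, w_v=1.0):
    """Dot-product self-attention over scalar tokens.

    w_q, w_k, w_v are hypothetical scalar projections (assumption);
    a real network learns full projection matrices.
    Returns (attended features, attention map).
    """
    queries = [w_q * x for x in features]
    keys = [w_k * x for x in features]
    values = [w_v * x for x in features]
    # Row i holds the weights that token i assigns to every token j.
    attention_map = [softmax([q * k for k in keys]) for q in queries]
    attended = [sum(a * v for a, v in zip(row, values)) for row in attention_map]
    return attended, attention_map


def residual_attention(features):
    """Add the attended features back onto the input (skip connection)."""
    attended, attention_map = self_attention(features)
    return [x + a for x, a in zip(features, attended)], attention_map


# Toy feature vector: the third entry mimics a strongly informative measurement.
features = [0.1, 0.2, 2.0, 0.15]
output, attention_map = residual_attention(features)
```

Each row of `attention_map` sums to one, and the strongest feature receives the largest weight in its own row, which mirrors the notion of emphasizing the most informative measurements.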
B. Loss function
In this section, multiple reconstruction objectives are achieved simultaneously through shared representation information, aiming for more precise reconstruction results. Beyond focusing solely on the data errors, this work takes into account the similarity and correlation between data distributions. We design multiple objective function losses to enable simultaneous learning, ensuring accurate reconstruction of the target distributions.
1. Mean Square Error (MSE) Loss
2. Distribution Correlation (DC) Loss
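As a rough illustration of how the two objectives could be combined, the sketch below pairs a standard MSE term with a correlation term. The Discussion states that the DC loss is based on Pearson's correlation coefficient; the specific form `1 - r`, the weighting factor `dc_weight`, and the function names are assumptions made for this example, and the L1 term mentioned earlier is omitted.

```python
import math


def mse_loss(pred, target):
    """Mean squared error between predicted and true source vectors."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def pearson_correlation(pred, target, eps=1e-12):
    """Pearson correlation coefficient; eps guards against zero variance."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in target)
    return cov / (math.sqrt(var_p * var_t) + eps)


def dc_loss(pred, target):
    """Assumed DC loss: 1 - r, zero when the two distributions are perfectly correlated."""
    return 1.0 - pearson_correlation(pred, target)


def combined_loss(pred, target, dc_weight=1.0):
    """MSE plus weighted DC term; dc_weight is a hypothetical hyperparameter."""
    return mse_loss(pred, target) + dc_weight * dc_loss(pred, target)


target = [0.0, 0.0, 1.0, 0.5, 0.0]
scaled = [0.5 * t for t in target]  # correct shape, wrong intensity
```

The `scaled` example shows why the two terms complement each other: the correlation term is close to zero because the shape matches, while the MSE term still penalizes the intensity error.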
C. Evaluation metrics and implementation details
To quantitatively evaluate the quality of OMT reconstruction, we used the metrics of CLE, Dice, and CNR.
where μ and σ denote the average intensity and the standard deviation, respectively, and w is the weight coefficient, which is equal to the ratio of the number of nodes in the corresponding area to the total number of nodes. The subscripts R and B represent the reconstruction results and the background, respectively. A higher CNR means fewer reconstruction artifacts.
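The three metrics can be sketched in code as follows. The definitions used here are assumptions based on their common usage: CLE is taken as the Euclidean distance between the centers of the reconstructed and true sources, Dice as twice the overlap divided by the summed region sizes, and CNR as the absolute difference of the region means divided by the square root of the weighted variances. All function names are illustrative.

```python
import math


def center_localization_error(recon_center, true_center):
    """Assumed CLE: Euclidean distance (mm) between the two centers."""
    return math.sqrt(sum((r - t) ** 2 for r, t in zip(recon_center, true_center)))


def dice_coefficient(recon_nodes, true_nodes):
    """Dice = 2|X ∩ Y| / (|X| + |Y|) over sets of mesh node indices."""
    recon_nodes, true_nodes = set(recon_nodes), set(true_nodes)
    if not recon_nodes and not true_nodes:
        return 1.0
    return 2.0 * len(recon_nodes & true_nodes) / (len(recon_nodes) + len(true_nodes))


def _mean_and_variance(values):
    mean = sum(values) / len(values)
    return mean, sum((v - mean) ** 2 for v in values) / len(values)


def contrast_to_noise_ratio(values, region_nodes):
    """Assumed CNR: |mu_R - mu_B| / sqrt(w_R * var_R + w_B * var_B).

    w_R and w_B are the fractions of nodes in the region and the background.
    """
    region_nodes = set(region_nodes)
    region = [v for i, v in enumerate(values) if i in region_nodes]
    background = [v for i, v in enumerate(values) if i not in region_nodes]
    w_r = len(region) / len(values)
    w_b = len(background) / len(values)
    mu_r, var_r = _mean_and_variance(region)
    mu_b, var_b = _mean_and_variance(background)
    return abs(mu_r - mu_b) / math.sqrt(w_r * var_r + w_b * var_b)


# Toy example on six nodes, with nodes 2 and 3 forming the source region.
node_values = [0.0, 0.1, 0.9, 1.1, 0.0, 0.1]
cle = center_localization_error((0.0, 0.0, 0.0), (3.0, 4.0, 0.0))
dice = dice_coefficient({1, 2, 3}, {2, 3, 4})
cnr = contrast_to_noise_ratio(node_values, {2, 3})
```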
We implemented the training and testing of all deep-learning networks using PyTorch and Python 3.7. The Adam optimizer was used to train the networks,25 with β1 = 0.9 and β2 = 0.99. Our DCeSR network is trained for 300 epochs with a batch size of 32 and a learning rate of 1 × 10⁻⁴.
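To make the role of these hyperparameters concrete, the following sketch implements a single Adam update for one scalar parameter in plain Python. It is an illustration of the standard Adam rule with the values quoted above, not the training code used in this work; the `state` dictionary and the `eps` value are assumptions.

```python
import math


def adam_step(param, grad, state, lr=1e-4, beta1=0.9, beta2=0.99, eps=1e-8):
    """One Adam update for a scalar parameter.

    state holds the step count t and the running moments m and v.
    """
    state["t"] += 1
    # Exponential moving averages of the gradient and the squared gradient.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    # Bias correction compensates for the zero initialization of m and v.
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return param - lr * m_hat / (math.sqrt(v_hat) + eps)


state = {"t": 0, "m": 0.0, "v": 0.0}
weight = 0.5
updated = adam_step(weight, grad=2.0, state=state)
```

On the first step the bias-corrected update has a magnitude of roughly the learning rate regardless of the gradient scale. In PyTorch, the same settings would typically be passed as `torch.optim.Adam(model.parameters(), lr=1e-4, betas=(0.9, 0.99))`, where `model` stands for the network being trained.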
D. Dataset collection
Training data are essential for the successful training of deep learning networks. However, obtaining training samples through in vivo experiments, which would require acquiring the surface photon measurements and then determining the true distribution of the targets, is impractical. To address this challenge, we employed the Monte Carlo (MC) method26,27 to generate the necessary training samples. As the gold standard for transmission models, the MC method provides both surface photon measurements and the ground truth distribution of the source. All datasets utilized in this study were obtained through standard grid discretization from the numerical models of a cylinder and a mouse.
To generate numerical samples, spherical source targets with a 1 mm radius were randomly positioned within the experimental domains. In the numerical cylinder experiments, 1030 single-source samples were generated, of which 206 were randomly selected for the test set. Similarly, the numerical mouse experiments utilized 890 single-source samples, with 95 randomly selected samples reserved for the test set.
For the numerical cylinder experiments, we collected a total of 6370 dual-source samples. During the experiment, 1260 samples were randomly selected as the test set, while the remaining 5110 samples were used to train the DCeSR network. Similarly, for the numerical mouse experiments, we generated 2690 dual-source samples, of which 540 were randomly selected as the test set.
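The random train/test partitioning described above can be sketched as follows. The sample and test-set counts are taken from the text; the helper name `split_dataset`, the dictionary keys, and the fixed seed are assumptions added for reproducibility of the example.

```python
import random


def split_dataset(num_samples, num_test, seed=0):
    """Randomly pick num_test sample indices for testing; the rest are for training."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)
    return sorted(indices[num_test:]), sorted(indices[:num_test])


# (total samples, test samples) as reported in the text.
SPLITS = {
    "cylinder_single": (1030, 206),
    "mouse_single": (890, 95),
    "cylinder_dual": (6370, 1260),
    "mouse_dual": (2690, 540),
}

split_sizes = {}
for name, (total, num_test) in SPLITS.items():
    train_idx, test_idx = split_dataset(total, num_test)
    split_sizes[name] = (len(train_idx), len(test_idx))
```

For the cylinder dual-source set this yields 5110 training and 1260 test samples, matching the numbers stated above.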
III. RESULTS
This section presents a detailed analysis of the DCeSR network results from various numerical simulation experiments, covering both quantitative and qualitative aspects. The comparison methods used are AMLC and KNN_LC. Note that all transverse and vertical views in this section are slices taken at the corresponding z and y coordinates of the source, respectively, and the red circle in each figure indicates the actual spherical source region.
A. Single-source reconstruction on cylindrical phantom simulations
The standard cylinder mesh used in this experiment contains 9786 nodes and 55 128 tetrahedrons. The true source of the representative sample is located at (−1, −3, 17) mm. Additionally, we applied 30% Gaussian noise to the test samples to evaluate the robustness of the DCeSR network.
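The noise model is not specified further in the text. One plausible reading, shown in the sketch below, is multiplicative zero-mean Gaussian noise whose standard deviation is 30% of each measured value. Both this interpretation and the helper name `add_gaussian_noise` are assumptions.

```python
import random


def add_gaussian_noise(measurements, level=0.30, seed=None):
    """Perturb each surface measurement with zero-mean Gaussian noise.

    Assumption: the noise standard deviation is `level` times the
    measurement value, so level=0.30 corresponds to "30% noise".
    """
    rng = random.Random(seed)
    return [m * (1.0 + level * rng.gauss(0.0, 1.0)) for m in measurements]


surface_energy = [0.0, 0.2, 1.5, 0.8, 0.05]
noisy_energy = add_gaussian_noise(surface_energy, level=0.30, seed=42)
```

Under this model, nodes with zero measured energy stay at zero, and passing a seed makes the perturbed test set reproducible.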
1. Experimental results
Figure 2(a) presents the sources reconstructed by the three methods, including 3D views as well as 2D transverse and vertical images. The 2D transverse view shows that the source reconstructed by DCeSR is slightly larger than the real source but exhibits the highest energy intensity. In comparison, the reconstruction results of the AMLC and KNN_LC methods show lower energy intensity. Moreover, the source reconstructed by KNN_LC deviates from the true source's position. In the 2D vertical view, the size and shape of the source reconstructed by the DCeSR network are closest to the actual source area, whereas the AMLC network produces a reconstructed source that is too large and the KNN_LC method exhibits a significant deviation between the reconstructed target's position and the actual source. To further verify the stability of the network, we conducted a noise experiment, and Fig. 2(b) shows the results of all three methods with 30% Gaussian noise applied. Under noise, DCeSR maintains the best distribution and reconstruction performance among the three methods, showing the highest overlap with the real source area. In contrast, both AMLC and KNN_LC continue to produce reconstructed targets that are too large. Additionally, the regions of highest energy in the AMLC and KNN_LC results are visibly misaligned with the true source. Overall, these experimental results confirm that DCeSR outperforms the other two methods in terms of localization accuracy, morphology consistency, and robustness.
2. Quantitative analysis
To present the experimental results more intuitively and demonstrate the effectiveness of the network, we provide the mean and standard deviation (SD) of each metric for the three methods in Table I. Based on the CLE and Dice statistics of the test set, DCeSR achieves the best localization accuracy and morphological fidelity, followed by the AMLC network, while the KNN_LC network performs worst on both metrics. The DCeSR network achieves the minimum CLE of 0.344 ± 0.163 mm, the highest Dice of 0.595 ± 0.114, and the highest CNR of 14.43 ± 1.25. The mean values highlight the network's accuracy, and the low standard deviation of its CLE further indicates the stability of the DCeSR method. With the addition of Gaussian noise, the CLE and Dice of all three methods degrade slightly. However, the DCeSR network still achieves the lowest CLE of 0.377 ± 0.196 mm, the highest Dice of 0.581 ± 0.118, and the highest CNR of 14.45 ± 1.38. These results demonstrate that the DCeSR network consistently delivers good reconstruction accuracy and stability, both with and without noise.
Noise level | Metrics | DCeSR | AMLC | KNN_LC |
---|---|---|---|---|
No noise | CLE (mm) | 0.344 ± 0.163 | 0.431 ± 0.206 | 0.478 ± 0.225 |
No noise | Dice | 0.595 ± 0.114 | 0.550 ± 0.118 | 0.525 ± 0.114 |
No noise | CNR | 14.43 ± 1.25 | 13.55 ± 1.12 | 14.04 ± 0.98 |
30% noise | CLE (mm) | 0.377 ± 0.196 | 0.435 ± 0.205 | 0.486 ± 0.225 |
30% noise | Dice | 0.581 ± 0.118 | 0.549 ± 0.118 | 0.523 ± 0.113 |
30% noise | CNR | 14.45 ± 1.38 | 13.21 ± 1.23 | 14.00 ± 1.01 |
B. Dual-source reconstruction on cylindrical phantom simulations
We collected various samples with different edge-to-edge distances (EEDs) to validate the DCeSR network's capability to reconstruct dual sources. To visually demonstrate the network's resolution ability, three test samples with EEDs of 1, 2, and 3 mm were used for demonstration. The source coordinates of these samples are (−4, −3, 20) and (−1, −3, 20) mm; (−5, 2, 19) and (−1, 2, 19) mm; and (0, −3, 11) and (5, −3, 11) mm, respectively.
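The relation between the listed coordinates and the stated EEDs can be checked with a short calculation. The sketch assumes that the EED is the center-to-center distance minus twice the 1 mm source radius given in the dataset description; the function name is illustrative.

```python
import math

SOURCE_RADIUS_MM = 1.0  # spherical source radius from the dataset description


def edge_to_edge_distance(center_a, center_b, radius=SOURCE_RADIUS_MM):
    """EED of two equal spheres: center distance minus two radii (assumed definition)."""
    center_distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(center_a, center_b)))
    return center_distance - 2.0 * radius


# Source centers (mm) of the three cylinder test samples listed above.
cylinder_pairs = [
    ((-4, -3, 20), (-1, -3, 20)),
    ((-5, 2, 19), (-1, 2, 19)),
    ((0, -3, 11), (5, -3, 11)),
]
eeds = [edge_to_edge_distance(a, b) for a, b in cylinder_pairs]
print(eeds)  # [1.0, 2.0, 3.0]
```

Under this definition the three samples reproduce the stated EEDs of 1, 2, and 3 mm.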
1. Experimental results
Figure 3 presents the simulation results of dual-source reconstruction using the DCeSR, AMLC, and KNN_LC networks at varying EEDs. When the EED is 1 mm, both DCeSR and AMLC effectively distinguish the two targets even though the light sources are close together. In contrast, KNN_LC exhibits adhesion issues, resulting in ineffective target reconstruction. Notably, DCeSR demonstrates the highest energy intensity, and the dual sources reconstructed by DCeSR exhibit the greatest overlap with the actual dual sources. When the EED is 2 mm, the DCeSR network reconstructs the dual sources with higher energy intensity than both AMLC and KNN_LC. The reconstruction results from the KNN_LC network show a significant deviation from the true dual sources in terms of location. At an EED of 3 mm, the reconstruction results of all three networks tend to be larger in extent. However, the localization accuracy of the targets reconstructed by the DCeSR network remains superior to that of the other two methods, indicating the advantages of DCeSR in dual-source reconstruction.
2. Quantitative analysis
The quantitative comparison results are listed in Table II. Both the per-source metrics and the metrics averaged over the two sources indicate that DCeSR achieved the best overall reconstruction performance, with a mean CLE of 0.371 mm, a mean Dice coefficient of 0.762, and a CNR of 14.38. Compared to the other two methods, DCeSR also showed the smallest standard deviations, indicating better stability and a smaller gap between the two reconstructed sources. Furthermore, the mean CLE of DCeSR is only 72.9% and 82.3% of that of KNN_LC and AMLC, respectively. These results confirm that DCeSR is capable of accurately reconstructing the position and morphology of dual sources.
Type | Metrics | DCeSR | AMLC | KNN_LC |
---|---|---|---|---|
Dual-source | CLE1 (mm) | 0.330 ± 0.188 | 0.380 ± 0.213 | 0.439 ± 0.239 |
Dual-source | CLE2 (mm) | 0.412 ± 0.241 | 0.521 ± 0.344 | 0.579 ± 0.369 |
Dual-source | Mean CLE (mm) | 0.371 ± 0.215 | 0.451 ± 0.279 | 0.509 ± 0.304 |
Dual-source | Dice 1 | 0.791 ± 0.172 | 0.713 ± 0.180 | 0.659 ± 0.193 |
Dual-source | Dice 2 | 0.733 ± 0.228 | 0.639 ± 0.261 | 0.584 ± 0.263 |
Dual-source | Mean Dice | 0.762 ± 0.200 | 0.676 ± 0.221 | 0.622 ± 0.228 |
Dual-source | CNR | 14.38 ± 1.82 | 14.04 ± 1.96 | 13.93 ± 1.87 |
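The relative localization error quoted in the quantitative analysis follows directly from the mean CLE values in Table II, as the short calculation below shows. The variable names are illustrative.

```python
# Mean CLE values (mm) for the cylinder dual-source experiment, from Table II.
mean_cle_mm = {"DCeSR": 0.371, "AMLC": 0.451, "KNN_LC": 0.509}

# DCeSR error expressed as a percentage of each comparison method's error.
relative_cle_percent = {
    method: round(100.0 * mean_cle_mm["DCeSR"] / value, 1)
    for method, value in mean_cle_mm.items()
    if method != "DCeSR"
}
print(relative_cle_percent)  # {'AMLC': 82.3, 'KNN_LC': 72.9}
```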
C. Single-source reconstruction on digital mouse simulations
The standard digital mouse mesh used in this experiment consists of 8588 nodes and 47 407 tetrahedrons, with the representative sample located at (18, 16, 13) mm. Additionally, 30% Gaussian noise was added to the test samples to assess the robustness of the DCeSR network.
1. Experimental results
Figure 4(a) presents the 3D view and 2D sectional images of single-source reconstruction using the DCeSR, AMLC, and KNN_LC networks. The reconstructed targets of all three methods are larger in volume than the actual light source. In terms of reconstruction accuracy, the spatial positioning of the targets reconstructed by the DCeSR and AMLC networks is relatively precise, while the KNN_LC network shows a significant deviation between the reconstructed target position and the true source. Furthermore, compared with the other two methods, DCeSR yields the highest energy intensity. Figure 4(b) illustrates the 3D view and 2D sectional images of single-source reconstructions with 30% Gaussian noise by the DCeSR, AMLC, and KNN_LC networks. Under this noise, the energy distribution of the reconstructed results from all three methods is affected to varying degrees. Notably, the energy intensity of the AMLC and KNN_LC reconstructions decreased significantly, whereas the energy of the DCeSR reconstruction decreased only slightly, indicating a relatively small influence of noise. Additionally, compared to the reconstruction results of DCeSR, the reconstructed sources of AMLC and KNN_LC appear overly sparse. In particular, the position of the source reconstructed by the KNN_LC network deviates considerably from the true source. DCeSR achieves more accurate source localization and the highest overlap with the true source. These results demonstrate the superiority of the DCeSR network in terms of both reconstruction accuracy and robustness in single-source scenarios.
2. Quantitative analysis
To quantitatively assess the performance of the three methods, Table III presents the mean and standard deviation (SD) of the CLE, Dice, and CNR metrics for the reconstructed targets in the test set. As shown in the table, the DCeSR network outperforms the other two methods in both positional accuracy and morphology reconstruction, achieving the minimum CLE of 0.442 ± 0.241 mm, the highest Dice of 0.560 ± 0.108, and the highest CNR of 13.46 ± 2.12. When 30% Gaussian noise is introduced, a slight decrease in reconstruction accuracy and morphological similarity is observed across all three methods. However, the DCeSR network still maintains the minimum CLE and the highest Dice coefficient, demonstrating its superior robustness.
Noise level | Metrics | DCeSR | AMLC | KNN_LC |
---|---|---|---|---|
No noise | CLE (mm) | 0.442 ± 0.241 | 0.474 ± 0.237 | 0.480 ± 0.264 |
No noise | Dice | 0.560 ± 0.108 | 0.500 ± 0.119 | 0.462 ± 0.101 |
No noise | CNR | 13.46 ± 2.12 | 11.34 ± 2.36 | 11.19 ± 2.86 |
30% noise | CLE (mm) | 0.452 ± 0.239 | 0.484 ± 0.239 | 0.487 ± 0.265 |
30% noise | Dice | 0.554 ± 0.115 | 0.491 ± 0.123 | 0.459 ± 0.102 |
30% noise | CNR | 11.82 ± 1.74 | 11.66 ± 1.82 | 11.12 ± 1.64 |
D. Dual-source reconstruction on digital mouse simulations
In this section, we further evaluate the capability of the DCeSR network in reconstructing dual sources by collecting samples with varying edge-to-edge distances (EEDs) for training. For demonstration purposes, we selected three dual-source test samples with EEDs of 1, 2, and 3 mm. The source coordinates of these samples are (15, 10, 13) and (18, 10, 13) mm; (15, 11, 19) and (19, 11, 19) mm; and (15, 10, 13) and (20, 10, 13) mm, respectively.
1. Experimental results
Figure 5 presents the 3D view and 2D sectional images of dual-source reconstructions using the three methods. When the EED is 1 mm, only the DCeSR network separates the two sources, and its reconstruction has the highest overlap with the true dual-source region. Both AMLC and KNN_LC fail to accurately reconstruct the dual sources, exhibiting adhesion issues. Additionally, the energy intensity of the dual sources reconstructed by DCeSR is significantly higher than that of the other two methods. At an EED of 2 mm, DCeSR accurately reconstructs the dual-source positions and achieves the highest degree of overlap with the real sources. In contrast, one of the two sources reconstructed by AMLC shows low overlap with the real source, while the dual sources reconstructed by the KNN_LC network exhibit a significant deviation. At an EED of 3 mm, DCeSR continues to outperform the other methods in terms of both source position and morphology reconstruction. The dual sources reconstructed by AMLC and KNN_LC appear overly sparse and less accurate.
2. Quantitative analysis
The quantitative results of the previously mentioned metrics are presented in Table IV. The DCeSR network outperforms the other two methods in dual-source reconstruction, achieving the minimum mean CLE of 0.460 ± 0.304 mm and the maximum mean Dice of 0.803 ± 0.265. Notably, the mean Dice coefficient of DCeSR is 1.16 times that of AMLC (0.695 ± 0.308) and 1.28 times that of KNN_LC (0.627 ± 0.315). Additionally, the DCeSR network attains the highest CNR value of 12.31 ± 1.96, further demonstrating its efficacy in reconstruction.
Type | Metrics | DCeSR | AMLC | KNN_LC |
---|---|---|---|---|
Dual-source | CLE1 (mm) | 0.502 ± 0.337 | 0.500 ± 0.353 | 0.535 ± 0.378 |
Dual-source | CLE2 (mm) | 0.418 ± 0.270 | 0.519 ± 0.321 | 0.585 ± 0.337 |
Dual-source | Mean CLE (mm) | 0.460 ± 0.304 | 0.510 ± 0.337 | 0.560 ± 0.358 |
Dual-source | Dice 1 | 0.738 ± 0.309 | 0.693 ± 0.332 | 0.663 ± 0.346 |
Dual-source | Dice 2 | 0.867 ± 0.221 | 0.697 ± 0.284 | 0.590 ± 0.283 |
Dual-source | Mean Dice | 0.803 ± 0.265 | 0.695 ± 0.308 | 0.627 ± 0.315 |
Dual-source | CNR | 12.31 ± 1.96 | 12.07 ± 2.14 | 11.94 ± 2.11 |
3. Ablation experiments
To validate the effectiveness of the distribution-correlation (DC) loss function, ablation experiments were conducted using both cylindrical and digital mouse simulations. A comparison between DCeSR and DCeSR trained without DC (SR_without_DC) was performed, focusing on the average CLE and Dice metrics. Figure 6 provides a detailed comparison of the results. When the DC loss function is removed, the average CLE of the reconstructed dual-source increases by 21.8% in the numerical cylinder simulations and by 5.4% in the digital mouse simulations. Concurrently, the average Dice coefficient of the reconstructed dual-source decreases by 9.1% in the numerical cylinder simulations and 4.4% in the digital mouse simulations. These results clearly demonstrate the effectiveness of the DC loss function module.
IV. DISCUSSION
Deep learning strategies have proved highly effective in OMT reconstruction. In this study, we develop the DCeSR network, which comprises two fully connected layers, one residual learning block, and a self-attention block to enhance OMT reconstruction. Multiscale features are progressively extracted through the residual learning and self-attention blocks. Within the residual learning block, down-sampling and up-sampling layers facilitate efficient feature scaling. The feature vectors generated by the residual learning block are subsequently input into the self-attention block. The attention mechanism produces a map that assigns greater weights to the most informative surface photon energy measurements to enable precise OMT reconstruction. Given that the MSE loss function only considers the distance similarity between data points, we introduce a distribution correlation (DC) loss function based on Pearson's correlation coefficient to jointly facilitate the morphological reconstruction. The DC loss function serves as a measure of the similarity of data distributions, and the strategy that integrates both loss functions reduces the discrepancy between the real and predicted source distributions.
We conducted single-source and dual-source reconstruction experiments using two distinct numerical simulation datasets to evaluate the reconstruction performance of the DCeSR network. The assessment of the DCeSR network was performed from three aspects: qualitative results, quantitative results, and robustness. The comparative methods employed in this study are the AMLC and KNN_LC networks. The experimental results indicate that the DCeSR network outperforms the other two methods in localization accuracy, morphology, and robustness, achieving the lowest CLE and highest Dice coefficient. Dual-source experiments were conducted with EEDs of 1, 2, and 3 mm. The DCeSR network consistently demonstrated superior localization accuracy in dual-source reconstruction compared to the other methods. Overall, these results illustrate that the DCeSR network achieves the best reconstruction outcomes. Furthermore, the results from the ablation experiments underscore the effectiveness of the distribution correlation (DC) loss function within the DCeSR network.
There are still some limitations to this study. First, the rough approximation in meshing affects the reconstruction quality of OMT. Second, the ground-truth sources were not meshed in the experiments; instead, a red circle with a radius of 1 mm was used to represent the real source regions, which leads to a large visual difference between the morphology of the reconstructed sources and the real sources in the displayed figures. In the future, we plan to address these limitations to further improve accuracy.
V. CONCLUSION
In conclusion, we propose a source distribution correlation enabled self-attention residual network (DCeSR network) to achieve accurate OMT reconstruction. The DCeSR network utilizes a residual learning strategy and a self-attention mechanism to combine the deep and shallow features, enabling the extraction of highly informative surface measurements for precise prediction of the three-dimensional distribution of light sources in tissues. Additionally, a distribution correlation (DC) loss function is introduced and combined with the MSE loss function to ensure morphological reconstruction accuracy while maintaining similarity and correlation. The DCeSR network was tested on two distinct numerical simulation models. The experimental results demonstrate that the DCeSR network has significant advantages in reconstruction precision. Furthermore, the noise experiment confirms the stability of the network.
ACKNOWLEDGMENTS
This study was supported by the National Natural Science Foundation of China (62101439) and Key Research and Development Program of Shaanxi (2023-YBSF-289).
AUTHOR DECLARATIONS
Conflict of interest
The authors declare no conflicts of interest.
Author Contributions
Lin Wang and Yahui Xiao contributed equally to this work.
Lin Wang: Conceptualization (equal); Data curation (equal); Funding acquisition (equal); Methodology (equal); Writing – original draft (equal); Writing – review & editing (equal). Yahui Xiao: Data curation (equal); Formal analysis (equal); Methodology (equal); Software (equal); Validation (equal); Writing – original draft (equal). Chenrui Pan: Data curation (equal); Formal analysis (equal); Validation (equal). Xin Cao: Investigation (equal); Methodology (equal); Resources (equal). Minghua Zhao: Project administration (equal); Supervision (equal); Writing – review & editing (equal).
DATA AVAILABILITY
The data sets and the raw code are available from the corresponding author upon request.