Transcranial magnetic stimulation (TMS) is a non-invasive method for treating neurological and psychiatric disorders and is being tested as an experimental treatment for patients with mild to moderate traumatic brain injury (mTBI). Due to the complex, heterogeneous composition of the brain, it is difficult to determine whether targeted brain regions receive the correct electric field (E-field) induced by the TMS coil. E-field distributions can be calculated by running time-consuming finite element analysis (FEA) simulations of TMS on patient head models; with machine learning, the E-field can instead be predicted in real time. Our prior work used a deep convolutional neural network (DCNN) to predict the E-field in healthy patients. This study applies the same DCNN to mTBI patients and investigates how model depth and the color space of the E-field images affect model performance. Nine DCNNs were created using combinations of 3, 4, or 5 encoder and decoder blocks with the RGB, LAB, and YCbCr color spaces. As depth increased, training and testing peak signal-to-noise ratios (PSNRs) increased and mean squared errors (MSEs) decreased. The depth 5 YCbCr model had the highest training and testing PSNRs of 34.77 and 29.08 dB and the lowest training and testing MSEs of 3.335 × 10⁻⁴ and 1.237 × 10⁻³, respectively. Compared to the model in our prior work, the depth 5 models achieved higher testing PSNRs and lower MSEs, with the LAB and YCbCr models outperforming RGB. Thus, despite the information lost in color space conversion, depth 5 DCNNs trained on alternative color spaces yielded higher PSNRs and lower MSEs.
Abbreviations
- DCNN: deep convolutional neural network
- E-field: electric field
- FEA: finite element analysis
- LAB: light-alpha-beta
- MRI: magnetic resonance imaging
- MSE: mean squared error
- mTBI: mild to moderate traumatic brain injury
- PSNR: peak signal-to-noise ratio
- RGB: red-green-blue
- TMS: transcranial magnetic stimulation
- YCbCr: luminance-blue difference chroma-red difference chroma
I. INTRODUCTION
Transcranial Magnetic Stimulation (TMS) is a noninvasive neuromodulation technique that has been approved by the Food and Drug Administration to treat conditions such as depression, obsessive compulsive disorder, and smoking addiction.1–4 TMS uses time-varying magnetic fields to electrically stimulate targeted regions of the brain using a specific threshold electric field (E-field) of around 50-150 V/m depending on which condition is being treated.5–9 Due to the complex and varied structure of the brain, it is difficult to accurately determine if the targeted region has received the correct amount of stimulation,10 which may create uncertainty in cases of weak or non-response to treatment. The TMS dosage is a specific percentage of the maximum stimulator output called the resting motor threshold (RMT), which is measured when the motor evoked potential (MEP) in the first dorsal interosseous muscle of the thumb is 50 µV or more in 5 out of 10 consecutive stimuli.11–13 This typically occurs when the induced E-field in the targeted region of the brain is at a threshold of 50–150 V/m. Measuring the RMT or calculating the E-field induced by TMS through finite element analysis (FEA) simulations is time and resource consuming; however, using a neural network would overcome the time limitations associated with running the FEA for each case and enable prediction of the induced E-field in real time.14–17
Some studies have utilized artificial neural networks to predict the E-field induced by the TMS coil.18,19 Deep convolutional neural networks (DCNNs) offer another approach that requires less time and computational power than FEA simulations. Other studies have used variations of DCNNs to predict the TMS-induced E-field, but they relied on publicly available data and augmented it by rotating the TMS coil, which does not reflect clinical conditions with varying brain sizes and shapes.20–23
In our previous work, we used a DCNN to predict the E-field induced by TMS in healthy patients.24 To reflect varying brain sizes, the brain models were scaled to generate 230 head models for training and validation. In this study, we used the same DCNN to predict the E-field induced by TMS in 26 mild to moderate traumatic brain injury (mTBI) patients, again scaling the head models, here to generate 286 models reflecting varying sizes. The novelty of this work lies in modifying the DCNN architecture by varying the number of encoder and decoder blocks, and in changing the format of the training data by converting the images to different color spaces before training. The goals of this study were to accurately predict the E-field induced by TMS in mTBI patients and to improve the performance of the DCNN by changing the model parameters and the form of the training data.
II. METHODOLOGY
A. Head models
Twenty-six veterans (age 45.96 ± 9.81 years) participated in the study and were scanned with magnetic resonance imaging (MRI) according to the procedures described in Franke et al., using a Philips Ingenia 3.0 T scanner to produce T1-weighted MRI images.25 The T1-weighted images were processed through the SimNIBS mri2mesh pipeline (SimNIBS Developers 2019, v2.0.1) to create segmented head models (Fig. 1) with seven segments (white matter, gray matter, cerebrospinal fluid, skin, skull, ventricles, and cerebellum).26 Each segment was assigned its own density, conductivity, relative electric permittivity, and relative magnetic permeability based on the IT’IS LF database (IT’IS Foundation, v4.0).26 Any abnormalities in the segments of the head model were smoothed using Meshmixer (Autodesk, Inc., v11.2.37).26,27
Anatomically accurate head models developed from MRIs of 26 mTBI patients, ordered from left to right then top to bottom.
B. Head model scaling
Each head model segment for each participant was scaled to 11 different sizes, with scale factors ranging from 0.90 to 1.10 in increments of 0.02. This augmented the dataset to 286 distinct head models, treating each scaled brain as a unique brain and accounting for differing head sizes.
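The augmentation arithmetic above can be checked directly: 11 scale factors applied to each of the 26 head models yields 286 models.

```python
# Scale factors from 0.90 to 1.10 in increments of 0.02
scales = [round(0.90 + 0.02 * i, 2) for i in range(11)]
print(scales[0], scales[-1], len(scales))  # 0.9 1.1 11
print(26 * len(scales))                    # 286
```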
C. TMS finite element analysis simulation
The head model segments were imported into Sim4Life (Zurich Med Tech, v6.2.1.4972), an FEA software package, and a TMS figure-of-eight coil was placed over the precentral gyrus target area of the brain, identified by the “inverted omega” landmark.26,28 Simulations were run with the coil set at a 45° angle (Fig. 2) with a current of 4750 A at 3571.14 Hz. Sim4Life used the Biot-Savart law to calculate the magnetic field generated by the current in the TMS coil, and a magneto-quasi-static low-frequency solver to calculate the E-field induced by that magnetic field in the brain.24,26 After the simulations were run, the E-field values were scaled from 0 to 500 V/m, based on the peak E-field on the skin, to accommodate all E-field values. The E-field values were represented using 32 colors on a gradient from dark blue to light yellow as the E-field increased. Screenshots with a resolution of 1024 by 1024 pixels were then captured of the coronal plane containing the greatest E-field value and of an anatomical slice containing all seven segmented parts of the brain.
D. Color spaces
Typically, images are represented in the red-green-blue (RGB) color space. In RGB, colors are represented in rectangular coordinates with red, green, and blue axes, each ranging from 0 to 255.
Images can also be represented in other color spaces (Fig. 3). In light-alpha-beta (LAB), colors are again represented in rectangular coordinates, but with different axes. The light axis has values from 0 to 100 representing how dark or light a pixel is. The alpha and beta axes represent spectra from green to red and from blue to yellow, respectively, with values from −128 to 127.
A third color space is luminance-blue difference chroma-red difference chroma (YCbCr), also represented in rectangular coordinates. The luminance axis has values from 16 to 235 representing how dark or light a pixel is, similar to the light axis in LAB. The remaining axes, color difference blue and color difference red, represent spectra from yellow-green to light purple and from teal to red, both with values from 16 to 240.
When images are converted between color spaces, they lose a small amount of information because the color values are stored as 8-bit numbers. This loss was quantified by converting images from RGB to LAB or YCbCr, converting them back to RGB, and computing the mean squared error (MSE) and peak signal-to-noise ratio (PSNR). For RGB, the information loss was zero, as no conversion takes place. For LAB, the MSE loss was 3.27 × 10⁻¹⁵ with a PSNR of 144.86 dB, and for YCbCr, the MSE loss was 2.08 × 10⁻¹⁶ with a PSNR of 156.82 dB; in both cases the information loss was minimal.
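The round-trip loss can be illustrated with a self-contained sketch. This uses the standard ITU-R BT.601 YCbCr transform as an assumption; the study does not state which conversion library it used, so the exact MSE and PSNR figures reported above will differ from this demonstration, which quantizes the intermediate YCbCr values to 8-bit integers.

```python
import math

def rgb_to_ycbcr(r, g, b):
    # ITU-R BT.601 limited-range transform; r, g, b in [0, 255]
    yp = 0.299 * r + 0.587 * g + 0.114 * b        # luma on the 0-255 scale
    y  = 16 + 219 / 255 * yp                      # luminance: 16-235
    cb = 128 + 224 / 255 * (b - yp) / 1.772       # blue-difference chroma: 16-240
    cr = 128 + 224 / 255 * (r - yp) / 1.402       # red-difference chroma: 16-240
    return y, cb, cr

def ycbcr_to_rgb(y, cb, cr):
    # Exact algebraic inverse of the forward transform above
    yp = (y - 16) * 255 / 219
    r = yp + 1.402 * (cr - 128) * 255 / 224
    b = yp + 1.772 * (cb - 128) * 255 / 224
    g = (yp - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b

def roundtrip_mse(colors):
    # Round the intermediate YCbCr values to 8-bit integers, as stored images do,
    # then convert back and measure the per-channel MSE on the normalized [0, 1] scale.
    total, n = 0.0, 0
    for rgb in colors:
        ycc = [round(v) for v in rgb_to_ycbcr(*rgb)]  # 8-bit storage loses information
        back = ycbcr_to_rgb(*ycc)
        for orig, rec in zip(rgb, back):
            total += ((orig - rec) / 255) ** 2
            n += 1
    return total / n

# Sample the RGB cube on a coarse grid (6 x 6 x 6 = 216 colors)
colors = [(r, g, b) for r in range(0, 256, 51)
                    for g in range(0, 256, 51)
                    for b in range(0, 256, 51)]
mse = roundtrip_mse(colors)
psnr = 10 * math.log10(1.0 / mse)  # PSNR in dB for images normalized to [0, 1]
print(f"round-trip MSE = {mse:.3e}, PSNR = {psnr:.2f} dB")
```

Without the 8-bit rounding step, the inverse transform recovers the original RGB values exactly, which is why the measured loss is attributable to quantization alone.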
E. Data preprocessing
Depending on the color space a DCNN was trained on, the images in its training dataset either remained in RGB or were converted to LAB or YCbCr. The images were padded with zeros, and min-max normalization was applied to scale the data before training the neural networks. The dataset of 286 anatomical images and their corresponding E-field images was randomly split 90/10 into training and testing sets.
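These preprocessing steps can be sketched as follows. This is a minimal illustration, not the authors' code: the target padded size and the exact padding scheme are assumptions, since the text only states that images were zero-padded and min-max normalized before a random 90/10 split.

```python
import numpy as np

rng = np.random.default_rng(0)

def preprocess(img, target=1024):
    """Zero-pad an H x W x C image up to target x target, then min-max normalize to [0, 1]."""
    h, w, c = img.shape
    padded = np.zeros((target, target, c), dtype=np.float64)
    padded[:h, :w, :] = img
    lo, hi = padded.min(), padded.max()
    return (padded - lo) / (hi - lo) if hi > lo else padded

def train_test_split(pairs, test_frac=0.10):
    """Randomly split (anatomical, E-field) image pairs 90/10 into training and testing sets."""
    idx = rng.permutation(len(pairs))
    n_test = max(1, int(round(test_frac * len(pairs))))
    test = [pairs[i] for i in idx[:n_test]]
    train = [pairs[i] for i in idx[n_test:]]
    return train, test

# Small dummy images stand in for the 286 anatomical/E-field image pairs
pairs = [(rng.random((100, 120, 3)), rng.random((100, 120, 3))) for _ in range(20)]
train, test = train_test_split(pairs)
print(len(train), len(test))  # 18 2
```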
F. Model
The model is a DCNN-based encoder-decoder network called a U-Net, composed of an encoder on the left half and a decoder on the right half, each made up of 3, 4, or 5 blocks (Fig. 4). The encoder, a series of convolution, batch normalization, and nonlinear activation layers with max pooling layers, works similarly to a traditional convolutional neural network by extracting complex features from the image data. The decoder, a series of convolution, batch normalization, and nonlinear activation layers with upsampling layers, reconstructs the E-field image. In addition, skip connections transfer feature information from the encoder blocks to the decoder blocks for a more accurate prediction.
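Since each encoder block ends in 2×2 max pooling (halving spatial resolution) and each decoder block upsamples (doubling it), the depth determines the bottleneck resolution of the 1024 × 1024 inputs. A simple sketch of this size arithmetic, not the authors' code:

```python
def bottleneck_size(input_size, depth):
    # Each of `depth` encoder blocks halves the spatial resolution via 2x2 max pooling;
    # the matching decoder blocks upsample back to the original input size.
    size = input_size
    for _ in range(depth):
        size //= 2
    return size

for depth in (3, 4, 5):
    print(depth, bottleneck_size(1024, depth))
# 3 128
# 4 64
# 5 32
```

Deeper models thus compress the image to a coarser bottleneck, forcing the network to learn more abstract features before reconstruction.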
Each model was trained for 200 epochs on an NVIDIA A100-SXM4-40GB GPU through Google Colab. Over the 200 epochs, the kernel weights and biases were updated to minimize the error between the predicted and actual E-field images, as quantified by the loss function, mean squared error (MSE). The LAB and YCbCr models minimized this error after the actual E-field images were converted to LAB and YCbCr, respectively.
Training MSE was calculated by having each model predict the E-field image for each image pair in the training set. Because models trained on LAB or YCbCr images predict images encoded in those color spaces, their predictions were first converted to RGB so that all MSEs were computed between images in the same color space. Training MSE was then the MSE between the predicted and actual E-field images in RGB, and testing MSE was the corresponding MSE on the testing dataset. Training PSNR was calculated from the training MSE, and testing PSNR from the testing MSE.
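On min-max normalized images (peak value 1), PSNR follows directly from MSE via PSNR = 10 log₁₀(MAX² / MSE); the study's reported PSNRs are consistent with its reported MSEs under this formula.

```python
import math

def psnr(mse, max_val=1.0):
    # Peak signal-to-noise ratio in dB; max_val = 1.0 for images normalized to [0, 1]
    return 10 * math.log10(max_val ** 2 / mse)

# The best model's training and testing MSEs from this study
print(round(psnr(3.335e-4), 2))  # 34.77
print(round(psnr(1.237e-3), 2))  # 29.08
```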
III. RESULTS
Training and testing loss decreased exponentially over the course of training, although the testing loss exhibited arbitrary fluctuations (Fig. 6). Deeper models tended to achieve higher training and testing PSNRs and lower MSEs. The model with the highest training and testing PSNRs and lowest MSEs was the depth 5 YCbCr model, with a training PSNR of 34.77 dB and a testing PSNR of 29.08 dB (Table I) as well as a training MSE of 3.335 × 10⁻⁴ and a testing MSE of 1.237 × 10⁻³ (Table II). As such, this model shows good accuracy in predicting the induced E-field for traumatic brain injury patients. Among the DCNNs of depth 5, the models trained on YCbCr and on LAB images both had higher training and testing PSNRs than the model trained on RGB images. Across all depths, the largest prediction errors occur at borders between different E-field intensities; however, spurious hotspots of high E-field values where none should exist occurred mainly in models with a depth of 3 (Fig. 5).
Training and testing loss in each epoch of training. Over the epochs, training and testing loss decay exponentially, but testing loss shows arbitrary fluctuations. At the 200th epoch, for all combinations of model depth and color space, the training and testing losses are less than 0.005.
TABLE I. Peak signal-to-noise ratio (dB) for training and testing sets by model depth and color space.

| Color space | Training, depth 3 | Training, depth 4 | Training, depth 5 | Testing, depth 3 | Testing, depth 4 | Testing, depth 5 |
| --- | --- | --- | --- | --- | --- | --- |
| RGB | 25.84 | 31.27 | 32.77 | 24.51 | 28.07 | 28.90 |
| LAB | 26.86 | 29.69 | 33.48 | 25.26 | 27.82 | 29.05 |
| YCbCr | 25.79 | 29.53 | 34.77 | 24.59 | 27.72 | 29.08 |

TABLE II. Mean squared error (on images normalized to [0, 1]) for training and testing sets by model depth and color space.

| Color space | Training, depth 3 | Training, depth 4 | Training, depth 5 | Testing, depth 3 | Testing, depth 4 | Testing, depth 5 |
| --- | --- | --- | --- | --- | --- | --- |
| RGB | 2.604 × 10⁻³ | 7.465 × 10⁻⁴ | 5.286 × 10⁻⁴ | 3.539 × 10⁻³ | 1.561 × 10⁻³ | 1.288 × 10⁻³ |
| LAB | 2.062 × 10⁻³ | 1.074 × 10⁻³ | 4.487 × 10⁻⁴ | 2.981 × 10⁻³ | 1.653 × 10⁻³ | 1.245 × 10⁻³ |
| YCbCr | 2.636 × 10⁻³ | 1.114 × 10⁻³ | 3.335 × 10⁻⁴ | 3.474 × 10⁻³ | 1.691 × 10⁻³ | 1.237 × 10⁻³ |
Comparison between the induced E-field predicted by each model and the ground truth.
IV. DISCUSSION
In this study, we obtained accurate predictions of the induced E-field for mTBI patients and improved upon the DCNN proposed in our previous work, which achieved PSNRs of 32.83 and 27.4 dB for training and testing, respectively, on a dataset of healthy patients.24 Here, our best performing model, a depth 5 DCNN trained on images in YCbCr, achieved PSNRs of 34.77 and 29.08 dB and MSEs of 3.335 × 10⁻⁴ and 1.237 × 10⁻³ for training and testing, respectively, on a dataset of mTBI patients.
As the depth of the DCNNs increased, training and validation loss became more stable over the training interval, and fewer spurious areas of high E-field appeared where they should not. In addition, the borders between the colors in the induced E-field images became more defined, and model performance increased, consistent with Yokota et al.'s study.20 Moreover, despite the information loss from color space conversion, the depth 5 DCNN trained on YCbCr images performed better than the depth 5 DCNN trained on RGB images, suggesting that alternative color spaces may represent the color information more effectively.
Previous investigations have shown the effectiveness of TMS protocols in improving TBI symptoms, including depression and impaired executive function, and studies have also shown the influence of neuroanatomy on TMS outcomes.25–29 These methods, however, have not produced a viable biomarker for accurately predicting improvements in depression, executive function, and other neurological conditions from TMS paradigms. Since TMS outcomes are known to vary, alternative approaches are needed to understand this variance and design better treatment protocols.25–30 Other studies that have utilized machine learning for TMS predictions have been limited to healthy patients, who do not represent treatment conditions. In this study, we created an accurate machine learning model that predicts the E-field profile induced by TMS from anatomical and E-field image data for mTBI patients, building upon our previous work by investigating different color spaces and model depths.24
This study is limited by its small sample size; including more data, in the form of more images per head model or more subjects' head models, would increase the diversity of the training data and improve model predictability. Future studies should investigate more deeply which color spaces and conditions are most suitable for improving model performance. This can be done by modifying the E-field image data directly: by using more or fewer than 32 colors, by changing the colors that represent the E-field magnitudes (dark blue for the lowest values and white for the highest), or by changing the scaling from linear to another kind, such as logarithmic or polynomial. In addition, future studies should investigate whether the quality of images produced by generative adversarial networks is significantly affected by the color space of their training images. A limitation of the modified DCNNs is that they can only be used with 2-dimensional images; this method may therefore not be suitable for predicting a single figure of merit of TMS response, such as the RMT or MEP, although future models using 3-dimensional data may be able to predict these and other patient measures. This study is also limited to the figure-of-eight TMS coil and does not cover novel coils such as the Quadruple Butterfly Coil30 or Halo Coil.31,32
V. CONCLUSIONS
These DCNNs predicted accurate E-field images in real time for mTBI patients, with the best performing model achieving training and testing PSNRs of 34.77 and 29.08 dB and training and testing MSEs of 3.335 × 10⁻⁴ and 1.237 × 10⁻³, respectively. DCNNs are therefore viable alternatives to time-consuming FEA simulations. In addition, among the DCNNs of depth 5, the models trained on YCbCr and on LAB images both had higher training and testing PSNRs and lower MSEs than the model trained on RGB images. Thus, despite the information loss caused by converting images between color spaces, training on alternative color spaces produced models with higher PSNRs and lower MSEs, which we attribute to the image information being better represented and processed in those color spaces.
ACKNOWLEDGMENTS
This work is supported by the Commonwealth Cyber Initiative (Proposal ID #: FP00010500), VCU Breakthrough Grant (AP00001868), and U.S. Department of Veterans Affairs Award No. I01 CX002097. The U.S. Army Medical Research Acquisition Activity, 839 Chandler Street, Fort Detrick, MD 21702, is the awarding and administering acquisition office. Opinions, interpretations, conclusions, and recommendations are those of the authors and are not necessarily endorsed by the Department of Defense.
The authors would like to thank Dr. George Gitchel from Hunter Holmes McGuire Veterans Affairs Medical Center, Richmond, for useful discussions and subject recruitment and administration of the original study, conducted under Virginia Department of Aging and Rehabilitative Services Commonwealth Neurotrauma Initiative Fund, award # A262-76756.
Dr. Hadimani has two granted patents on TMS coils (US10792508B2 and US11547867B2), one granted patent on anatomically accurate brain phantom (US11373552B2), and one patent published and pending on TMS coil, US Patent Application (US20220241605A1).
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Ethics Approval
All procedures were approved by the Institutional Review Boards of the McGuire VA Medical Center and Virginia Commonwealth University, and the trial was registered at clinicaltrials.gov with identifier NCT03642158. All participants gave informed consent for the present study.
Author Contributions
Yash R. Saxena: Conceptualization (equal); Methodology (equal); Software (equal); Visualization (equal); Writing – original draft (equal); Writing – review & editing (equal). Connor J. Lewis: Data curation (equal); Software (equal); Writing – review & editing (equal). Joseph V. Lee: Data curation (equal). Laura M. Franke: Data curation (equal). Muhammad Sabbir Alam: Methodology (equal); Software (equal); Supervision (equal); Writing – review & editing (equal). Mohannad Tashli: Methodology (equal); Software (equal); Writing – review & editing (equal). Jayasimha Atulasimha: Data curation (equal); Software (equal); Writing – review & editing (equal). Ravi L. Hadimani: Data curation (equal); Funding acquisition (equal); Resources (equal); Software (equal); Supervision (equal); Writing – review & editing (equal).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.