Gastrointestinal disorders are among the most prevalent disorders worldwide. Capsule endoscopy is considered an effective modality for diagnosing such disorders, especially in the small intestine. The aim of this work is to leverage deep convolutional neural networks for the automated classification of gastrointestinal abnormalities from capsule endoscopy images. We developed a deep learning architecture, GastroNetV1, an automated classifier that detects abnormalities in capsule endoscopy images. The gastrointestinal abnormalities considered are ulcerative colitis, polyps, and esophagitis. The curated dataset consists of 6000 images with ground-truth labeling. A web-based application built around the trained algorithm automatically classifies an input image as ulcerative colitis, a polyp, esophagitis, or a normal condition. The classifier produced 99.2% validation accuracy, 99.3% specificity, 99.3% sensitivity, and 0.991 AUC. These results exceed those of state-of-the-art systems. Hence, GastroNetV1 could be used to identify different gastrointestinal abnormalities in capsule endoscopy images, which will, in turn, improve healthcare quality.
I. INTRODUCTION
Disorders related to the digestive system are called gastrointestinal (GI) disorders. These disorders have an economic and social impact on society. The most common intestinal disorders are irritable bowel syndrome (IBS), gastroesophageal reflux disease (GERD), ulcers, polyps, colorectal cancer (CRC), gastroenteritis, diverticular disease, celiac disease (CD), inflammatory bowel disease (IBD), gastric cancer, hemorrhoids, Crohn's disease, and pancreatitis.1,2 Ulcerative colitis (UC) is an IBD that causes inflammation in the digestive tract. The condition starts in the rectum and progresses proximally and continuously along the colon. Among the world's developing regions, India has the highest prevalence, with 9.3 cases of IBD and 5.4 cases of ulcerative colitis per 100 000 persons. Reports indicate that the occurrence rate of UC remains consistent between males and females from childhood through adulthood. Around 10% of patients with UC have a first-degree relative with the same disease, and UC is more predominant than Crohn's disease.3,4 Patients with longstanding or extensive UC have an increased chance of progressing to CRC.5 A gastric polyp is an abnormal growth of tissue projecting from the gastric mucosal layer. In general, the rate of polyps appears to have increased, as indicated by a higher prevalence in large series.6 Removing polyps reduces the incidence of and mortality from colorectal cancer. Colonoscopy is the favored screening tool because it permits direct examination of the colorectal mucosa and removal of polyps with malignant potential.7 Esophagitis refers to inflammation of or injury to the esophageal mucosa. The most common cause is gastroesophageal reflux, which leads to erosive esophagitis. One-third of patients with esophagitis may have a normal-appearing esophagus. Other findings include esophageal furrows, strictures, and mucosal rings (trachealization). Medication-induced esophagitis has an estimated incidence of 3.9 per 100 000 persons per year, with a mean age of 41.5 years at diagnosis.8
Traditional imaging and endoscopic procedures have aided clinicians in examining and diagnosing the human GI tract. Basic radiological methods, such as the barium swallow test and x-ray fluoroscopy, visualize and examine the upper gastrointestinal tract.9 Invasive, non-surgical procedures, such as upper endoscopy, gastroscopy, colonoscopy, and ultra-high magnification endocytoscopy, allow identification of bleeding, assessment of ulcer severity, biopsy collection, and tumor identification.10 One problem with these methods is that they can cause a duodenal hematoma, infection, abdominal pain, or a tear in the colon wall during diagnosis. Balloon enteroscopy examines colon polyps or areas of bleeding in the GI tract; some complications associated with this method are sedation-related events, perforation, and a potential risk of ileus (transient slowing of the bowel).11 A disadvantage of conventional endoscopy methods, such as colonoscopy or esophagogastroduodenoscopy, is the inability to image and examine the small intestine. This can be avoided by employing capsule endoscopy, in which the patient swallows a capsule that takes pictures of the whole GI tract. Capsule endoscopy requires limited preparation and no anesthesia, is painless, and is the better imaging modality for diagnosing GI abnormalities.12
Artificial intelligence has also greatly impacted the field of health care, where it serves as a tool for automated diagnosis. Machine learning and deep learning algorithms use large volumes of medical data obtained via capsule endoscopy to train a network, and the trained network automatically detects clinical abnormalities in the GI tract. The objectives of the proposed work are (i) to develop a deep learning model for the automated classification of GI abnormalities from wireless capsule endoscopy (WCE) images, leveraging public domain databases, and (ii) to design a web-based application that aids gastroenterologists by visualizing algorithm outputs and recommendations.
The remainder of this paper is organized as follows: Sec. II reviews the state-of-the-art research in this field. Section III details the materials and methods of the proposed work. Results of the proposed work are presented in Sec. IV, and a detailed discussion comparing the results with those of similar studies is presented in Sec. V.
II. RELATED WORK
Several studies in this area have identified pathologies from WCE images using state-of-the-art image processing, machine learning, and deep learning methods. This section describes the various methods studied. Wang et al.13 proposed a systematic evaluation and optimization method for automatically detecting ulcers from wireless capsule endoscopy images, using deep convolutional neural networks on a vast dataset collected from more than 30 hospitals and 100 medical examination centers. Their Second Glance (SecG) detection system diagnoses ulcers automatically. Barash et al.14 proposed a method to grade the severity of ulcers in video-capsule images of Crohn's disease patients using an ordinal neural network. A deep learning algorithm automates the grading, and the PillCam Crohn's Capsule (PCC) Medtronic dataset was used. Saito et al.15 developed and tested a convolutional neural network to automatically detect protruding lesions of various types from wireless capsule endoscopy images. The data were collected using PillCam SB2 and analyzed using STATA. They constructed an AI system using a single-shot detector (SSD), with the stochastic gradient descent method fine-tuning all layers of the CNN. Ghosh and Chakareski16 developed a computer-aided diagnostic (CAD) tool for automated analysis of small intestinal abnormalities, such as bleeding. The convolutional layers used the VGG16 network, and a softmax classifier used the decoder output feature maps for pixel-wise classification. Yuan et al.17 proposed a two-stage, fully automated computer-aided detection system to detect ulcers from WCE images using a saliency max-pooling method along with locality-constrained linear coding (LLC). In this method, a support vector machine (SVM) classified 170 ulcer images and 170 normal images.
Raut et al.18 developed a segmentation and classification algorithm for the automated diagnosis of gastrointestinal tract disorders using capsule endoscopy images. The DeepLabV3+ algorithm segments the gastrointestinal tract, and the LeNet algorithm classifies abnormalities in the segmented image. Ribeiro et al.19 developed a deep learning model for classifying small bowel disorders using capsule endoscopy images. The task is to classify the capsule endoscopy image as excellently visible (>90%), satisfactory (50%–80%), or unsatisfactory (<50%). The model was trained using 12 950 images obtained from clinical centers in Portugal.
Chung et al.20 used a no-code platform to develop a deep learning model for classifying gastrointestinal abnormalities using capsule endoscopy images. The abnormalities considered were blood, inflamed, vascular, and polypoid conditions. The no-code classifier was developed on the Neuro-T platform, and around 37 307 images from 24 capsule endoscopy videos were used to train the model. Mascarenhas et al.21 developed a convolutional neural network for detecting blood and mucosal lesions in the colon using capsule endoscopy images. The CNN model, a pre-trained Xception network adapted through transfer learning, was trained on 9005 images obtained from 124 examinations, of which 3075 were normal, 3115 showed blood, and 2815 showed lesions. Relevant works on images other than gastrointestinal ones are also considered in our study to correlate the techniques employed. Wang et al.22 proposed an EfficientNet-based U-Net for the segmentation of fundus images, and the contributions of Gopatoti and Vijayalakshmi23 on chest x-ray image classification are also relevant.
III. MATERIALS AND METHODS
A. Designated workflow
The proposed workflow for the automated diagnosis of GI abnormalities from capsule endoscopy images using a CNN is shown in Fig. 1. The image provided by the user undergoes appropriate image processing. The CNN trained on capsule endoscopy images then automatically classifies the processed image into one of four classes (normal, ulcerative colitis, polyps, or esophagitis).
B. Architecture for classifier model
In this work, we propose a deep learning model, GastroNetV1, based on the EfficientNetV2B2 architecture. Figure 2 shows the detailed architecture of the proposed model.
The EfficientNet scaling method uses fixed scaling coefficients to adjust the network's width, depth, and resolution. The output of one convolution layer is fed as input to another convolution layer appearing later in the network.24 Such connections mitigate the loss of information through the network, enable deeper CNNs, and reduce the expense of the convolution operation: it consumes less time than standard convolution and is efficient, with no nonlinearities introduced after the convolution operation. EfficientNetV2B2 is a convolutional neural network architecture belonging to the EfficientNet family. In addition to EfficientNetV2B2, several standard CNN architectures, such as VGG16, VGG19, and ResNet101, are considered for comparison. The proposed architecture (GastroNetV1) combines the existing EfficientNetV2B2 backbone with additional layers that enhance its performance. These additional layers consist of convolution layers for generating heat maps (with the last convolution layer used to compute the gradients), global average pooling (a superior alternative to a flatten layer), a dense layer with 256 neurons using the ReLU activation function, batch normalization, a Gaussian noise layer with a standard deviation of 0.15, and a dropout layer with a drop rate of 0.15. These layers provide robustness and help avoid overfitting.
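For illustration, the following is a minimal Keras sketch of the GastroNetV1 head described above. The input resolution, filter count of the heat-map convolution, and layer names are assumptions made for the example, not specifications from this work:

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.applications import EfficientNetV2B2

# Pre-trained backbone without its classification top (224x224 input assumed).
backbone = EfficientNetV2B2(include_top=False, weights="imagenet",
                            input_shape=(224, 224, 3))

x = backbone.output
# Convolution later used for Grad-CAM heat maps (filter count assumed).
x = layers.Conv2D(256, 3, padding="same", activation="relu",
                  name="heatmap_conv")(x)
x = layers.GlobalAveragePooling2D()(x)   # replaces a flatten layer
x = layers.Dense(256, activation="relu")(x)
x = layers.BatchNormalization()(x)
x = layers.GaussianNoise(0.15)(x)        # standard deviation stated in the text
x = layers.Dropout(0.15)(x)              # drop rate stated in the text
outputs = layers.Dense(4, activation="softmax")(x)  # four classes

model = models.Model(backbone.input, outputs, name="GastroNetV1")
```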
The goal of the filtering action is to cancel noise while preserving the integrity of edge and detail information; nonlinear approaches generally provide more satisfactory results than linear techniques.25 Batch normalization (BN) and its variants have successfully combatted the covariate shift induced during the training of deep learning methods. Although these techniques normalize feature distributions by standardizing with batch statistics, they do not correct the influence on features from extraneous variables or multiple distributions.26 The dropout approach randomly shuts down feature detectors during the training phase. It enables the network to become invariant to "noisy" spectral components by randomly selecting only the most important basis vectors for signal reconstruction during spectral dropout regularization.27
C. Dataset description and data pre-processing
The proposed work uses the Kvasir public dataset of wireless capsule endoscopy images curated from the source.28 The dataset covers polyps, ulcerative colitis, esophagitis, and normal conditions. It consists of 6000 ground-truth-labeled images (Table I), showing anatomical landmarks, pathological findings, or endoscopic procedures in the GI tract.29,30 The anatomical landmarks are the Z-line, pylorus, and cecum, while the pathological findings include esophagitis, polyps, and ulcerative colitis.
TABLE I. Class-wise image counts before and after augmentation.

| Class No. | Class name | Total images (before augmentation) | Total images (after augmentation) |
|---|---|---|---|
| 0 | Normal | 1500 | 3000 |
| 1 | Ulcerative colitis | 1500 | 3000 |
| 2 | Polyps | 1500 | 3000 |
| 3 | Esophagitis | 1500 | 3000 |
| Total | | 6000 | 12 000 |
Data augmentation is an important operation in developing an image classifier. It is especially helpful in the medical field, where collecting large volumes of data is tedious. Data augmentation strengthens a dataset by applying specified transformations to the existing data to generate new samples. In this work, horizontal flipping was applied to the capsule endoscopy images, resulting in 12 000 images.
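As an illustration, a minimal sketch of such a horizontal-flip pipeline in Keras is given below; the directory layout, rescaling, and target size are assumptions for the example:

```python
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Horizontal flipping as the augmentation transform (rescaling assumed).
datagen = ImageDataGenerator(rescale=1.0 / 255, horizontal_flip=True)

train_gen = datagen.flow_from_directory(
    "data/train",             # hypothetical directory with one folder per class
    target_size=(224, 224),   # assumed input resolution
    batch_size=16,            # batch size used in this work
    class_mode="categorical",
)
```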
D. Classification process
In this work, we used the cross-entropy loss function and the Adam optimizer. The cross-entropy loss function, also known as log loss, relates the probabilities of the ground truth to those of the model prediction.31 Adam, which stands for adaptive moment estimation, is one of the most popular optimizers among deep learning researchers and can greatly aid the classification process.32 The model used early stopping to detect convergence during training: training stops when there is no further improvement in the performance metrics of the trained model. All models were trained for 25 epochs with 225 steps per epoch and a batch size of 16.
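A minimal sketch of this training configuration in Keras is shown below, reusing the generator from the earlier sketch; the monitored quantity, patience, and the validation generator (`val_gen`) are assumptions for the example:

```python
import tensorflow as tf

# Adam optimizer with its default learning rate and cross-entropy loss.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
    loss="categorical_crossentropy",
    metrics=["accuracy", tf.keras.metrics.AUC(name="auc"),
             tf.keras.metrics.Precision(name="precision"),
             tf.keras.metrics.Recall(name="recall")],
)

# Early stopping halts training when the monitored metric stops improving.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)

history = model.fit(train_gen, validation_data=val_gen,
                    epochs=25, steps_per_epoch=225,
                    callbacks=[early_stop])
```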
E. Evaluation metrics
The classification process is evaluated using sensitivity, specificity, accuracy, AUC, and loss. The analysis uses the following quantities: true positives (TP), true negatives (TN), false positives (FP), and false negatives (FN). TP is the count of cases where both the ground truth and the model prediction are positive; FP is the count of cases where the ground truth is negative but the prediction is positive; TN is the count of cases where both the ground truth and the prediction are negative; and FN is the count of cases where the ground truth is positive but the prediction is negative.
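In terms of these counts, the metrics reported in this work follow the standard definitions:

$$
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad
\text{Sensitivity (recall)} = \frac{TP}{TP + FN},
$$
$$
\text{Specificity} = \frac{TN}{TN + FP}, \qquad
\text{Precision} = \frac{TP}{TP + FP}.
$$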
F. Design of web-based user interface and development environment
A physician cannot be expected to work inside an integrated development environment (IDE) for diagnosis; consequently, a tool that assists the physician is required. We have created a web-based application that automatically classifies capsule endoscopy images into one of the following classes: normal, ulcerative colitis, polyps, or esophagitis. The application is built with Python's Streamlit library. It classifies input capsule endoscopy images automatically, assisting the physician in diagnosing the illness. Figure 3 shows the user interface design of the developed web application.
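A minimal Streamlit sketch of such a front end is shown below; the widget labels, model file name, and preprocessing are assumptions for the example, not the deployed application:

```python
import numpy as np
import streamlit as st
import tensorflow as tf
from PIL import Image

CLASS_NAMES = ["Normal", "Ulcerative colitis", "Polyps", "Esophagitis"]

# Hypothetical path to the trained GastroNetV1 weights; a caching
# decorator could be added to avoid reloading on every interaction.
model = tf.keras.models.load_model("gastronetv1.h5")

st.title("GastroNetV1: Capsule Endoscopy Classifier")
uploaded = st.file_uploader("Upload a capsule endoscopy image",
                            type=["jpg", "jpeg", "png"])
if uploaded is not None:
    image = Image.open(uploaded).convert("RGB").resize((224, 224))
    st.image(image, caption="Input image")
    batch = np.expand_dims(np.asarray(image, dtype="float32") / 255.0, axis=0)
    probs = model.predict(batch)[0]
    st.write(f"Predicted class: {CLASS_NAMES[int(np.argmax(probs))]}")
```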
The implementation used Python (version 3.6.7) with the Keras module of TensorFlow (version 2.8.0) as the backend. It ran on the Google Colab IDE with 8 GB of RAM and a 12 GB NVIDIA K80 GPU. Pre-trained versions of the models were obtained from the TensorFlow Keras applications package, and additional layers were appended to these pre-trained models to obtain the final model.
IV. RESULTS
This section details the implementation and results of the proposed GastroNetV1, a CNN-based system for detecting abnormalities in the GI tract from WCE images.
A. Building GastroNetV1 model
The first step of the proposed work is training the CNN-based GastroNetV1 model using the WCE data described in Table I. The dataset contains four classes (normal, ulcerative colitis, polyps, and esophagitis), each consisting of 1500 WCE images before augmentation. The image augmentation technique described in Sec. III was then applied to enlarge the dataset. A training–validation split ratio of 0.86 was used for model development (Table II). The model was trained on the Google Colab platform, with the training and validation image sets organized into separate directories and a batch size of 16.
TABLE II. Number of images used for training and validation.

| Class | Total number of images used for training | Total number of images used for validation |
|---|---|---|
| Normal (0) | 2600 | 400 |
| Ulcerative colitis (1) | 2600 | 400 |
| Polyps (2) | 2600 | 400 |
| Esophagitis (3) | 2600 | 400 |
The Adam optimizer's default learning rate was set to 1 × 10−3. The model was trained for 25 epochs using the training images and evaluated on the training and validation image datasets. Performance plots for the different classification metrics were generated and hosted in the front end of the web application. Table III presents a comparative analysis of the performance metrics of the several CNN backbones trained on the capsule endoscopy images, for training and validation, respectively; the proposed GastroNetV1 achieved the best accuracy, precision, and recall in both cases.
TABLE III. Performance metrics of the CNN backbones on the training and validation sets.

| CNN backbone | Accuracy | AUC | Precision | Recall |
|---|---|---|---|---|
| *Training* | | | | |
| VGG19 | 0.967 | 0.998 | 0.969 | 0.966 |
| ResNet50 | 0.983 | 0.999 | 0.983 | 0.983 |
| ResNet101 | 0.980 | 0.999 | 0.981 | 0.980 |
| EfficientNetV2B1 | 0.984 | 0.999 | 0.984 | 0.983 |
| Proposed GastroNetV1 | 0.987 | 0.999 | 0.987 | 0.987 |
| *Validation* | | | | |
| VGG19 | 0.963 | 0.999 | 0.964 | 0.961 |
| ResNet50 | 0.980 | 0.999 | 0.981 | 0.980 |
| ResNet101 | 0.976 | 0.999 | 0.978 | 0.976 |
| EfficientNetV2B1 | 0.985 | 0.999 | 0.985 | 0.985 |
| Proposed GastroNetV1 | 0.992 | 0.991 | 0.993 | 0.993 |
GastroNetV1 was trained for 45 epochs using the Adam optimizer and the cross-entropy loss function and was determined to be the best model based on its performance on the training and validation datasets. Figure 4 shows the performance plots for the proposed model (GastroNetV1). Table IV compares the proposed model's accuracy with that of existing models.
TABLE IV. Accuracy comparison of the proposed model with existing models.

| Similar research studies | Architecture used | Accuracy (%) |
|---|---|---|
| Li and Meng33 | MLP | 88.0 |
| Wang et al.34 | CNN (ResNet34) | 90.1 |
| Raut et al.18 | DeepLabV3+ and LeNet | 99.1 |
| Ghosh et al.16 | CNN (AlexNet and SegNet) | 94.4 |
| Tsuboi et al.35 | CNN | 95.0 |
| Hassan et al.36 | SVM | 99.0 |
| Ribeiro et al.19 | CNN | 92.1 |
| Chung et al.20 | CNN | 98.0 |
| Proposed GastroNetV1 | CNN (hybrid) | 99.2 |
B. Web based user interface
In the subsequent step, we developed a web-based system with a user-friendly interface that allows the user to browse and select WCE images from the local system or any other storage media, either locally or on a cloud server. The system then runs the developed model (GastroNetV1) on the input image and displays the predicted class. An option is also available for patients to obtain a basic understanding of the pathology predicted by the system.
Figure 5 shows an input WCE image and the corresponding Grad-CAM (gradient-weighted class activation mapping) heat map generated by the proposed GastroNetV1 for the respective intestinal disorder, along with the classified outcome. The heat map varies from blue to red, where blue means "lowest focus" and red means "highest focus." The developed web-based system also provides descriptive suggestions about the predicted pathology.
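A minimal sketch of how such a Grad-CAM heat map can be computed for the model is given below, assuming the last convolution layer is named "heatmap_conv" as in the earlier architecture sketch; the layer name and the normalization are assumptions for the example:

```python
import numpy as np
import tensorflow as tf

def grad_cam(model, image_batch, conv_layer_name="heatmap_conv"):
    # Model that exposes both the feature maps and the predictions.
    grad_model = tf.keras.models.Model(
        model.input,
        [model.get_layer(conv_layer_name).output, model.output])
    with tf.GradientTape() as tape:
        conv_out, preds = grad_model(image_batch)
        top = tf.argmax(preds[0])
        score = preds[:, top]                     # score of the predicted class
    grads = tape.gradient(score, conv_out)        # gradients w.r.t. feature maps
    weights = tf.reduce_mean(grads, axis=(1, 2))  # global-average-pool the gradients
    cam = tf.einsum("bhwc,bc->bhw", conv_out, weights)  # weighted sum of maps
    cam = tf.nn.relu(cam)[0].numpy()
    return cam / (cam.max() + 1e-8)               # normalize to [0, 1] for display
```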
V. DISCUSSION
The WCE procedure is one of the important modalities for screening the small intestinal region of the digestive tract. A single procedure may produce more than 100 000 images, which requires significant effort from the GI specialist to analyze manually and identify pathologies. Hence, the proposed research aimed to develop a CNN-based model (GastroNetV1) for identifying problematic images and presenting them to the GI specialist for quick decisions.
The proposed work demonstrates successful training on WCE images, and the model performed well on the training and validation sets without signs of overfitting (Table III). The proposed method surpasses the performance of current state-of-the-art algorithms, producing an accuracy, precision, and recall of 99.2%, 99.3%, and 99.3%, respectively (Table III). Similar results have been achieved by other researchers: Li and Meng developed an MLP neural network for detecting abnormal regions in WCE images that produced a specificity of 87.8%,33 and Raut et al. developed a DeepLabV3+ algorithm that segments the gastrointestinal tract, with the LeNet algorithm classifying abnormalities from the segmented image, producing 99.1% accuracy, 98.8% precision, and 99.1% recall.18
Ribeiro et al. developed a deep learning model to classify WCE images as excellently visible (>90%), satisfactory (50%–80%), or unsatisfactory (<50%), trained on 12 950 images obtained from clinical centers in Portugal.19 Chung et al. used a no-code platform to develop a deep learning model for classifying GI abnormalities, such as blood, inflamed, vascular, and polypoid conditions; the model was trained on 37 307 images from 24 capsule endoscopy videos and produced an accuracy of 98%, a precision of 89%, and a recall of 97%.20 Several CNN-based models have been developed by various research groups for detecting abnormalities in the GI tract from WCE images: Mascarenhas et al. proposed an algorithm for detecting blood and mucosal lesions in the colon,21 and Tsuboi et al. developed a model for extracting specific features and quantities to correctly distinguish the location of angioectasia.35 The segmentation model proposed by Li et al. for CT images used the depth path module, an approach relevant to our study.45 Alaskar et al. adapted the pre-trained GoogLeNet network to recognize ulcer images by adding four new layers to the standard architecture, namely, a dropout layer with a dropout probability of 0.5, a fully connected layer, a softmax layer, and a classification-output layer; a total of 144 layers were used to build GoogLeNet in their experiments, and the developed model achieved an accuracy of 85%.46 The accuracy of our model on the validation dataset (99.15%) is better than that of their work.
Hassan and Haque developed a texture-feature-descriptor-based algorithm that operates on the normalized gray level co-occurrence matrix (NGLCM) of the magnitude spectrum of the images to detect bleeding and non-bleeding areas. The NGLCM was constructed to extract features from each log-transformed magnitude spectrum, and SVM classifiers with linear, polynomial, and radial basis function (RBF) kernels were used. Among these, the linear SVM produced the highest specificity of 98.95%.36 The specificity produced by the proposed work (99.25%) is better than that of Hassan and Haque.
Ghosh et al. developed computer-aided diagnostic (CAD) tools for the automated detection of small intestinal abnormalities, such as bleeding. In their approach, the first 22 layers of a pre-trained AlexNet architecture are used, with the last three layers designed for the application: one fully connected layer with two neurons for the two categories (bleeding and non-bleeding), one softmax layer, and one classification-output layer. In their SegNet architecture, a VGG16 network provides the first 13 convolutional layers, followed by five max-pooling and five upsampling layers, and a softmax classifier is fed by the decoder output feature maps for pixel-wise classification. The network was trained and tested on a single NVIDIA GeForce GTX 1070 and produced a 98.49% F1-score and a 97.51% sensitivity.16 Similarly, Padmavathi et al. proposed a classification method based on LeNet-5, which achieved 99.12% accuracy on the publicly available Kvasir-V2 dataset.38 The proposed work outperformed these results, achieving a sensitivity of 99.25%.
Wang et al. proposed the Second Glance (SecG) detection framework for automatic ulcer detection using a deep convolutional network. The architecture is based on RetinaNet with a second-glance refinement stage built on ResNet34 and ResNet50 backbones. By combining the two CNN backbones with two anchor settings (A1 and A2), they obtained four distinct primary detectors, where RT34 and RT50 denote RetinaNet with ResNet34 and ResNet50, respectively. This method produced a specificity of 90.48%.34 Overall, compared with the performance reported by other researchers, the proposed GastroNetV1 model demonstrated superior performance in detecting GI abnormalities (Table IV). Similarly, in an earlier study, we developed a U-Net-based polyp segmentation model for colonoscopy images, which produced an accuracy of 97.1%.47
An interactive web-based user interface has been developed to deploy the proposed model (GastroNetV1), as shown in Fig. 3. The model runs inference on the selected image, and the predicted result is presented on the display. This feature makes it easier for the GI specialist to find the problematic frames in the WCE video of a given subject.
The proposed model was trained only on severe gastrointestinal abnormalities, namely, esophagitis, polyps, and ulcerative colitis, along with normal class data. The user interface offers additional features: viewports displaying the input image and the predicted pseudo-color map overlaid on the input image, a text box showing the predicted pathology, and another text box showing a basic description of the pathology. These help the GI specialist analyze the data much faster and create reports effectively, enabling cost-effective automated detection of diseases or disorders in the GI tract. The results produced for the test data are shown in Fig. 5, and Tables III and IV present the results obtained while building and validating the GastroNetV1 model. However, the work has a limitation: the number of pathologies considered for model building is limited to three. To create a more effective model, more pathologies and more varied datasets need to be considered; furthermore, more information could be extracted from the WCE images to improve accuracy. Future research directions include combining the proposed model with image enhancement algorithms and convolutional neural networks to improve accuracy and reduce the misclassification rate, yielding a better classification tool.
VI. CONCLUSION
A CNN-based model, GastroNetV1, was successfully developed to classify four classes of GI tract conditions (normal, ulcerative colitis, polyps, and esophagitis) using WCE data obtained from public sources, and an accuracy of more than 99% was obtained. Furthermore, an interactive web-based user interface has been developed so that the GastroNetV1 model can be run on selected input data, presenting the predicted outcome in visual form with the associated details in a consolidated manner. Hence, the trained GastroNetV1 model will help gastroenterologists make diagnostic decisions faster and in a cost-effective manner.
ACKNOWLEDGMENTS
The authors extend their appreciation to the Deanship of Research and Graduate Studies at King Khalid University for funding this work through Large Research Project under Grant No. RGP2/179/45.
AUTHOR DECLARATIONS
Conflict of Interest
The authors have no conflicts to disclose.
Author Contributions
S. Rajkumar: Investigation (equal); Methodology (equal); Writing – original draft (equal); Writing – review & editing (equal). C.S. Harini: Resources (equal); Software (equal); Validation (equal); Visualization (equal); Writing – review & editing (equal). Jayant Giri: Project administration (equal); Resources (equal); Supervision (equal); Validation (equal); Writing – review & editing (equal). V.A. Sairam: Data curation (equal); Formal analysis (equal); Methodology (equal); Resources (equal); Writing – review & editing (equal). Naim Ahmad: Data curation (equal); Formal analysis (equal); Resources (equal); Software (equal); Supervision (equal); Writing – review & editing (equal). Ahmed Said Badawy: Resources (equal); Software (equal); Supervision (equal); Validation (equal); Writing – review & editing (equal). G.K. Krithika: Project administration (equal); Resources (equal); Software (equal); Supervision (equal). P. Dhanusha: Project administration (equal); Resources (equal); Software (equal); Supervision (equal); Validation (equal). G.E Chandrasekar: Formal analysis (equal); Project administration (equal); Validation (equal); Visualization (equal). V. Sapthagirivasan: Resources (equal); Software (equal); Supervision (equal); Validation (equal).
DATA AVAILABILITY
The data that support the findings of this study are available from the corresponding author upon reasonable request.