<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="2.3" xml:lang="EN">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Oncol.</journal-id>
<journal-title>Frontiers in Oncology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Oncol.</abbrev-journal-title>
<issn pub-type="epub">2234-943X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fonc.2025.1508326</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Oncology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>A deep learning approach for brain tumour classification and detection in MRI images using YOLOv7</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Nimmagadda</surname>
<given-names>Ramya</given-names>
</name>
<xref ref-type="author-notes" rid="fn001">
<sup>*</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/2863462/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/conceptualization/"/>
<role content-type="https://credit.niso.org/contributor-roles/methodology/"/>
<role content-type="https://credit.niso.org/contributor-roles/validation/"/>
<role content-type="https://credit.niso.org/contributor-roles/visualization/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Devi</surname>
<given-names>P. Kalpana</given-names>
</name>
<role content-type="https://credit.niso.org/contributor-roles/project-administration/"/>
<role content-type="https://credit.niso.org/contributor-roles/supervision/"/>
<role content-type="https://credit.niso.org/contributor-roles/validation/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
</contrib>
</contrib-group>
<aff id="aff1">
<institution>Electronics and Communication Engineering (ECE), Vel Tech Rangarajan Dr. Sagunthala R&amp;D Institute of Science and Technology</institution>, <addr-line>Chennai</addr-line>, <country>India</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>Edited by: Sharon R. Pine, University of Colorado Anschutz Medical Campus, United States</p>
</fn>
<fn fn-type="edited-by">
<p>Reviewed by: Dulani Meedeniya, University of Moratuwa, Sri Lanka</p>
<p>Bogdan Iliescu, Grigore T. Popa University of Medicine and Pharmacy, Romania</p>
<p>Rahul Joshi, Symbiosis International University, India</p>
</fn>
<fn fn-type="corresp" id="fn001">
<p>*Correspondence: Ramya Nimmagadda, <email xlink:href="mailto:ramyanimmagadda29@gmail.com">ramyanimmagadda29@gmail.com</email>
</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>17</day>
<month>09</month>
<year>2025</year>
</pub-date>
<pub-date pub-type="collection">
<year>2025</year>
</pub-date>
<volume>15</volume>
<elocation-id>1508326</elocation-id>
<history>
<date date-type="received">
<day>18</day>
<month>10</month>
<year>2024</year>
</date>
<date date-type="accepted">
<day>11</day>
<month>07</month>
<year>2025</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2025 Nimmagadda and Devi.</copyright-statement>
<copyright-year>2025</copyright-year>
<copyright-holder>Nimmagadda and Devi</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>The medical imaging field has grown tremendously owing to recent advances in digital imaging and artificial intelligence (AI). These advances have improved the accuracy, speed, and cost efficiency of tumour classification. Radiologists rely on MRI scans because of their exceptional capacity to identify even the most minor alterations in brain activity. This research uses YOLOv7, a deep learning (DL) model, to classify and detect brain tumours and conducts a detailed analysis of the architectures frequently used for tumour identification. The study uses a brain MRI dataset from Roboflow with 2,870 labelled images divided into four classes: pituitary tumour, glioma, meningioma, and no tumour. This preprocessed dataset was used to assess the performance of deep learning models in identifying and classifying brain tumours. During preprocessing, aspect ratio normalization and resizing are applied to improve tumour localization for bounding box-based detection. YOLOv7 achieved a recall of 0.813 and a box detection accuracy of 0.837. The mAP at the 0.5 IoU threshold was 0.879, and the mAP over the extended IoU range of 0.5 to 0.95 was 0.442.</p>
</abstract>
<kwd-group>
<kwd>YOLOv7</kwd>
<kwd>brain tumour</kwd>
<kwd>MRI</kwd>
<kwd>classification</kwd>
<kwd>object detection</kwd>
<kwd>deep learning</kwd>
</kwd-group>
<counts>
<fig-count count="7"/>
<table-count count="4"/>
<equation-count count="7"/>
<ref-count count="24"/>
<page-count count="15"/>
<word-count count="8169"/>
</counts>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-in-acceptance</meta-name>
<meta-value>Cancer Imaging and Image-directed Interventions</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec id="s1" sec-type="intro">
<label>1</label>
<title>Introduction</title>
<p>According to the World Health Organization (WHO) classification of brain tumors, there are over 120 types of brain tumors, distinguished by their origin, location, size, and the characteristics of the tissues that constitute the tumor. These tumors can be divided into two types: malignant (cancerous) and benign (non-cancerous). Some tumors are aggressive, and others are inactive. However, once a tumor grows large enough, it compresses the adjacent nerves and blood vessels, impairs normal brain function, and can kill brain cells (<xref ref-type="bibr" rid="B1">1</xref>). Tumors stem from cell clusters that arise within the brain as a result of excessive cell multiplication. These clusters can affect normal brain functions and destroy healthy brain cells. Several of the body&#x2019;s processes, such as integrating, organizing, evaluating, and decision making, are regulated by the central nervous system (CNS), which consists of the spinal cord and the brain. The astonishing capability of an individual&#x2019;s brain arises from its complex structure (<xref ref-type="bibr" rid="B2">2</xref>). A range of illnesses can affect the CNS, including brain tumors, migraines, infections, and strokes, which pose substantial challenges in screening, evaluation, and effective treatment development (<xref ref-type="bibr" rid="B3">3</xref>). The primary difficulty confronting radiologists and neuropathologists is the early detection of brain tumors, which result from the abnormal and rapid development of brain cells.
One of the most common imaging methods used for the diagnosis of brain tumors is magnetic resonance imaging (MRI), although its interpretation can prove inaccurate and unreliable, particularly for these sensitive tumors. An uncharacteristic proliferation of nerve tissue forming a mass is usually a constitutive trait of malignancies in the nervous system. Some tumor types are highly unusual, but most are fairly common. They can arise from neurons, oligodendrocytes, and other supportive cells that encase the adjacent nerve cells. The most important form of malignant brain tumor is the so-called metastatic or secondary brain tumor. Benign tumors do not typically spread to other parts of the body, but they can still cause significant health problems as they grow (<xref ref-type="bibr" rid="B4">4</xref>).</p>
<p>Gliomas, meningiomas, and pituitary tumors represent almost all of the primary tumors diagnosed. Most meningiomas arise from the outermost layers enveloping the central nervous system. The brain tumor type with the fastest progression to death is glioma, which starts in the glial tissue that supports neuronal activity. Gliomas account for approximately one-third of all brain tumor cases. Benign pituitary tumors grow inside the pituitary gland. The prognosis and available treatments for brain tumors depend heavily on a reliable evaluation. Traditional biopsy methods are uncomfortable, time-consuming, and prone to inaccurate sampling. Histopathological tumor grading (biopsy) also suffers from issues such as intra-tumor heterogeneity and variation in expert opinion. These factors make tumor diagnosis difficult and limited.</p>
<p>Identifying the tumor accurately and in a timely fashion is critical for planning therapy and achieving the required clinical outcome. For brain tumors in particular, much of this work falls to radiologists at the interpretation stage. Because radiologists must identify and diagnose tumors from images, they are limited by their subjective assessments.</p>
<p>Because image complexity intersects with the variable skill levels of clinicians, an accurate diagnosis of brain tumors through individual visual judgment alone is exceptionally difficult. In neurology, MRI is a preferred method because it allows for the detailed evaluation of the brain and skull. It produces sagittal, coronal, and axial images for a comprehensive review. MRI produces highly detailed, contrast-rich images and does not expose the patient to ionizing radiation, which makes it ideal. For these reasons, MRI is highly recommended as a screening method in the diagnosis of a range of brain tumors.</p>
<p>The limited accuracy of manual MRI interpretation in tumor detection has led to a need for automated methods that combine image processing and machine learning (ML). Soft tumors and extremely challenging tumors require different therapeutic approaches. The proposed You Only Look Once (YOLO) model is in keeping with the increasing integration of artificial intelligence into healthcare, and it offers potential advantages in efficiency and accuracy. The approach, findings, and outcomes of the present research are covered in the following sections, which also highlight its significance to the fields of medicine and computing. Current procedures take considerable time and may not be entirely accurate because of human variance. To close this gap and meet the demand for accurate and timely brain tumor evaluation, this study proposes a brain tumor categorization method based on the YOLO approach.</p>
<p>This study aims to maximize tumor identification efficiency using artificial intelligence frameworks, specifically the YOLO model. The objectives include adapting YOLO for tumor detection and optimizing its parameters for training time and sensitivity. The research aims to enhance the accuracy of tumor identification and categorization, which will benefit both the computing and medical fields. This introduction establishes the background, emphasizes the importance of automated tumor identification, and presents YOLO as a possible remedy.</p>
<p>Due to the challenges inherent in the traditional methods of diagnosing brain tumors and the automated techniques published in the literature, this study was framed around the following critical questions: How flexible is the YOLOv7 architecture for the accurate detection and classification of various types of brain tumors in MRI scans? How does its efficiency measure in comparison to other ML and deep learning (DL) models in terms of accuracy? Moreover, what modifications can be made to the architecture or other parameters to enhance the effectiveness of YOLOv7 for medical imaging tasks such as brain tumor segmentation? All answers are provided in Section 6.</p>
<p>In Section 2, the discussion highlights the previous research works conducted on brain tumors while incorporating various machine learning techniques. Section 3 provides a detailed overview of the study including the methodology, proposed model, and architecture best suited for our research. The deep learning frameworks and efficiency indicators employed for the research are also assessed. Section 4 discusses the results of our analysis on the performance of the deep learning algorithms. An extensive and detailed analysis is presented in Section 5.</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Related works</title>
<p>Advances in artificial intelligence (AI) technologies, especially deep learning, show promising capabilities for automating the identification and classification of brain tumors in medical imaging such as MRI scans. The literature offers a variety of approaches, from models specialized in segmentation and classification-centric architectures to more recent ones that attempt to merge both tasks. Some studies have also explored explainable, hybrid, and optimization-based models. This section reviews that literature under thematic subheadings.</p>
<sec id="s2_1">
<label>2.1</label>
<title>Tumor segmentation</title>
<p>The early diagnosis and timely treatment of brain tumors rely heavily on the precise segmentation of tumors from MRI scans. Manual segmentation is labor-intensive and varies between raters (limited inter-rater reliability). For these reasons, automated deep learning models are gaining popularity.</p>
<p>SegNet, a fully convolutional network, has been used for the automated segmentation of tumor substructures (necrosis, edema, and enhancing tumor regions) from multi-modal MRI sequences: T1, T1ce (T1-weighted contrast-enhanced), T2, and FLAIR (fluid-attenuated inversion recovery). The researchers reported F-measures of 0.85, 0.81, and 0.79 for whole, core, and enhancing tumors, respectively (<xref ref-type="bibr" rid="B5">5</xref>).</p>
<p>In the same way, a Mask R-CNN with DenseNet-41 backbone was created to perform the segmentation and categorization of tumors simultaneously, thus achieving an improved tumor boundary precision through transfer learning (<xref ref-type="bibr" rid="B6">6</xref>).</p>
<p>The more recent segmentation work conducted by Kamnitsas et&#xa0;al. (2023) introduced a dual-pathway 3D convolutional neural network (CNN) ensemble for high-resolution multi-view segmentation, which was integrated with CRF (Conditional Random Fields) post-processing for spatial coherence, thus achieving excellent results on the BraTS dataset.</p>
</sec>
<sec id="s2_2">
<label>2.2</label>
<title>Tumor classification</title>
<p>Some studies have focused exclusively on classifying the tumors based on the specific features given. A 23-layer CNN was trained on a multi-class MRI dataset with 3,064 and 152 images for &#x201c;case-based&#x201d; and &#x201c;control&#x201d; groups, respectively, thus demonstrating the power of CNNs on large datasets and their weaknesses on smaller ones (<xref ref-type="bibr" rid="B7">7</xref>). To enhance the performance on smaller datasets, the model applied transfer learning using VGG16.</p>
<p>Another study integrated min&#x2013;max normalization and dropout layers into EfficientNet for the multi-class classification of pituitary tumor, meningioma, glioma, and no tumor to enhance performance and mitigate overfitting (<xref ref-type="bibr" rid="B8">8</xref>).</p>
<p>YOLO-based architectures have also been adopted for classification due to their efficiency in real-time object detection. Studies using YOLOv5 and YOLOv7 have reported over 99% accuracy in classifying meningioma, glioma, and pituitary tumors using MRI datasets from King Khaled University Hospital (<xref ref-type="bibr" rid="B9">9</xref>, <xref ref-type="bibr" rid="B10">10</xref>).</p>
<p>A recent benchmark study by Karthik et&#xa0;al. (<xref ref-type="bibr" rid="B11">11</xref>) (see <xref ref-type="table" rid="T1">
<bold>Table&#xa0;1</bold>
</xref>) compared YOLOv5, YOLOv6, and YOLOv7 and reported 87.9% classification accuracy with YOLOv7, which outperformed the earlier versions and even classical methods such as Faster R-CNN with VGG16.</p>
<table-wrap id="T1" position="float">
<label>Table&#xa0;1</label>
<caption>
<p>Comparison of various studies and models and their results.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="middle" align="center">Study</th>
<th valign="middle" align="center">Model</th>
<th valign="middle" align="center">Accuracy</th>
<th valign="middle" align="center">Interpretability</th>
<th valign="middle" align="center">Limitation</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="middle" align="center">Rao et&#xa0;al. (<xref ref-type="bibr" rid="B12">12</xref>)</td>
<td valign="middle" align="center">CNN + RNN (LSTM)</td>
<td valign="middle" align="center">~96%</td>
<td valign="middle" align="center">Moderate</td>
<td valign="middle" align="center">Slow, not real-time</td>
</tr>
<tr>
<td valign="middle" align="center">Karthik et&#xa0;al. (<xref ref-type="bibr" rid="B11">11</xref>)</td>
<td valign="middle" align="center">YOLOv7</td>
<td valign="middle" align="center">87.90%</td>
<td valign="middle" align="center">None</td>
<td valign="middle" align="center">No attention module</td>
</tr>
<tr>
<td valign="middle" align="center">Proposed</td>
<td valign="middle" align="center">YOLOv7 + CBAM + SPPF+</td>
<td valign="middle" align="center">
<bold>99.50%</bold>
</td>
<td valign="middle" align="center">
<bold>High (Grad-CAM)</bold>
</td>
<td valign="middle" align="center">Best performance, real-time</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>Bold values in <xref ref-type="table" rid="T1">
<bold>Table&#xa0;1</bold>
</xref> highlight the accuracy (99.50%) and interpretability (high) of the proposed model, which are the highest among the compared models and were achieved in real time.</p>
</fn>
</table-wrap-foot>
</table-wrap>
</sec>
<sec id="s2_3">
<label>2.3</label>
<title>Segmentation and classification</title>
<p>Integrated frameworks that perform segmentation and classification in tandem have been proposed by multiple researchers. Tumors were segmented and classified using the FAHS-SVM (Fully Automatic Heterogeneous Segmentation using Support vector machine) method, which applied deep learning as well as structural and morphological information (<xref ref-type="bibr" rid="B13">13</xref>).</p>
<p>In a different approach, YOLOv5 was applied in a two-step pipeline for real-time brain tumor segmentation and classification, achieving an 85.95% detection rate (<xref ref-type="bibr" rid="B14">14</xref>).</p>
<p>The study by Bhanothu et&#xa0;al. (2020) used Faster R-CNN and VGG16 for detection and classification in parallel and obtained classification accuracy of 89.45% for meningioma, 75.18% for glioma, and 68.18% for pituitary tumors (<xref ref-type="bibr" rid="B15">15</xref>).</p>
<p>More recent work by Isensee et&#xa0;al. (2024) utilized nnU-Net with dynamic adaptation for both segmentation and classification across multiple datasets and offered improved generalizability across tumor types and institutions.</p>
</sec>
<sec id="s2_4">
<label>2.4</label>
<title>Explainability and interpretability</title>
<p>Despite their high accuracy, deep learning models have been widely criticized as &#x201c;black boxes.&#x201d; Very few research studies have focused on this issue.</p>
<p>One notable attempt used Grad-CAM-based visualization to highlight the regions of MRI scans that the models attend to during tumor classification. This method enhances clinical trust and provides some level of interpretability of AI decisions.</p>
<p>In another study, attention-based CNNs were used to increase transparency by assigning importance weights to different MRI regions, thereby enabling clinical validation and aiding radiologists in identifying diagnostic features.</p>
</sec>
<sec id="s2_5">
<label>2.5</label>
<title>Hybrid and optimization approaches</title>
<p>To improve the model&#x2019;s accuracy and efficiency, hybrid models coupled with refinement methods have been introduced. In one of the studies, the parameters of a CNN were optimized with an Adaptive Dynamic Sine-Cosine Grey Wolf Optimizer alongside Inception-ResNetV2 for improved feature extraction and convergence rate (<xref ref-type="bibr" rid="B16">16</xref>).</p>
<p>Another work proposed a hybrid brain tumor classification (HBTC) model that combined handcrafted features from MGLCM (Modified Gray Level Co-occurrence Matrix) with deep learning for improved classification performance using a tree-based classifier for final voxel-level labeling (<xref ref-type="bibr" rid="B17">17</xref>, <xref ref-type="bibr" rid="B18">18</xref>).</p>
<p>Transfer learning approaches were also explored extensively using pre-trained weights from the COCO dataset to train YOLOv4-Tiny models on the RSNA-MICCAI BraTS 2021 dataset (<xref ref-type="bibr" rid="B19">19</xref>), thus resulting in faster convergence and better generalization.</p>
</sec>
<sec id="s2_6">
<label>2.6</label>
<title>Limitations in related studies</title>
<p>Despite the attempts made to apply DL models such as YOLO and CNNs or even hybrid models for the detection and classification of brain tumors, there still remain significant issues and gaps in these models, as detailed below.</p>
<sec id="s2_6_1">
<label>2.6.1</label>
<title>Data imbalance and limited dataset diversity</title>
<p>Many studies use openly accessible datasets such as BraTS, which suffer from class imbalance, limited diversity, and uniform imaging protocols. A model trained on such homogeneous data can overfit and generalize poorly to real-life clinical data. Most studies do not address the under-representation of less common tumor varieties or the population-based heterogeneity in patient demographics.</p>
</sec>
<sec id="s2_6_2">
<label>2.6.2</label>
<title>Poor performance for small or asymmetric tumors</title>
<p>As highlighted in several investigations, models like YOLOv3&#x2013;v7 and even newer versions such as YOLOv8 tend to struggle with detecting very small or irregularly shaped tumors, particularly when tumors blend with surrounding tissues. These detection issues are critical, especially in early diagnosis, where small tumors carry high clinical importance.</p>
</sec>
<sec id="s2_6_3">
<label>2.6.3</label>
<title>Reliance on high-quality annotated data</title>
<p>Radiology machine learning models require a large amount of precisely annotated data. This is problematic because manual annotation by radiologists is tedious and time-consuming, and automated tagging tools are often prone to inaccuracies. In models that are trained on sparse annotations (for example, Mask R-CNN with DenseNet), poor ground truth annotations will invariably weaken the performance of the model.</p>
</sec>
<sec id="s2_6_4">
<label>2.6.4</label>
<title>Lack of standardized evaluation metrics and validation protocols</title>
<p>Different studies have applied varying reporting standards and evaluation criteria (using metrics such as accuracy, precision, and F1 score), which hampers the fair benchmarking of models. Moreover, in many instances, there is no cross-validation or external validation, which reduces the trust in the claimed robustness of the model.</p>
</sec>
<sec id="s2_6_5">
<label>2.6.5</label>
<title>High computational demands and latency concerns</title>
<p>Healthcare facilities may not possess the advanced graphics processing units (GPUs) required for the computationally intensive training of models like Inception-ResNetV2, EfficientNet, and deeper CNN architectures (for example, a 23-layer CNN), making it challenging to use these AI systems in practice. Although systems like YOLOv5 and YOLOv7 are somewhat optimized, their ability to run in real time is still severely limited in resource-constrained environments.</p>
</sec>
<sec id="s2_6_6">
<label>2.6.6</label>
<title>Limited real-time clinical integration</title>
<p>Some models, such as YOLOv5 in Dipu et&#xa0;al., show potential in real-time settings, but most works do not evaluate these models within operational clinical workflows. This remains an open area not only in terms of automation but also from regulatory, privacy, and workflow standpoints.</p>
</sec>
<sec id="s2_6_7">
<label>2.6.7</label>
<title>Generalization across imaging modalities and institutions</title>
<p>Many models are developed and validated with very limited sets of imaging modalities, such as T1-weighted or T1ce. However, different brands of MRI scanners, acquisition protocols, and even patient positioning can create domain shifts, which may degrade the model&#x2019;s performance in deployment.</p>
</sec>
<sec id="s2_6_8">
<label>2.6.8</label>
<title>Lack of explainability and trust at clinical level</title>
<p>Even though accuracy metrics are emphasized, very few studies address explainability or provide interpretability tools such as Grad-CAM or saliency maps, which would enable clinicians to understand the reasoning behind the model.</p>
</sec>
<sec id="s2_6_9">
<label>2.6.9</label>
<title>Inconsistent performance on multi-class classification</title>
<p>Multi-class differentiation of glioma, meningioma, and pituitary tumors is usually more complex than binary classification. Accuracy is not uniform across tumor types, as shown in studies such as that of Bhanothu et&#xa0;al. This significant divergence suggests unreliable performance or bias toward more common classes.</p>
</sec>
<sec id="s2_6_10">
<label>2.6.10</label>
<title>Neglect of segmentation refinement and post-processing</title>
<p>Certain works make use of the segmentation models like SegNet or FAHS-SVM but do not incorporate any boundary refinement procedures that would reduce segmentation errors or improve boundary delineation. The precision of tumor boundary detection is critical for planning the treatment, and clinical utility is diminished if boundary uncertainty is not resolved. When choosing a deep learning model for brain tumor segmentation using MRI images, many factors should be considered, such as its performance, architectural design, precision, and ease of modification according to the specific requirements of the project.</p>
</sec>
</sec>
<sec id="s2_7">
<label>2.7</label>
<title>Rationale for choosing YOLOv7</title>
<p>Commonly used models for segmentation in medical imaging include U-Net, DeepLabV3+, and Attention U-Net. These models target pixel-level identification and perform well on medical imaging segmentation tasks. However, they can be slower than YOLOv7 in achieving real-time results because of the heavy computation required for pixel-level segmentation.</p>
<p>Transformer-based models such as Swin Transformer or ViT are becoming more common for segmentation tasks. However, they are slower to train and more demanding in terms of memory utilization, thus making them less than ideal in limited-resource settings.</p>
<p>Therefore, while the newer versions of YOLO or other architectures may offer some incremental improvements, YOLOv7 was selected for our specific MRI scan dataset for its ability to offer a good balance between real-time performance, high detection accuracy with fast inference speed, and ease of its adaptation to our specific needs. YOLOv7 provided better accuracy in detecting and classifying brain tumors, especially small or irregularly shaped ones, as compared to other models like YOLOv5, YOLOv8, U-Net, and Faster R-CNN.</p>
</sec>
</sec>
<sec id="s3">
<label>3</label>
<title>Proposed approach</title>
<sec id="s3_1">
<label>3.1</label>
<title>Overall architecture of brain tumor detection</title>
<p>Evaluating images of brain tumors is challenging because of variation in the size, shape, and position of the lesions. Researchers have proposed different ways to identify such anomalies, each with its own advantages and disadvantages. Different scanners produce images of brain tumors with differing levels of contrast, sharpness, number of slices, and spatial resolution. Here, we describe the technical specifications and architectural design of the algorithmic framework for the efficient and accurate image-based detection of brain tumors. With the recommended approach, we aim to differentiate malignant tumors in MRI scans with precision. <xref ref-type="fig" rid="f1">
<bold>Figure&#xa0;1</bold>
</xref> shows the preprocessing, training, and evaluation steps applied to images containing tumors. Given YOLOv7&#x2019;s proven performance in&#xa0;detecting brain tumors, this study selected it as the primary framework.</p>
<fig id="f1" position="float">
<label>Figure&#xa0;1</label>
<caption>
<p>The overall workflow of YOLO-based brain tumor segmentation and classification. YOLO, You Only Look Once.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fonc-15-1508326-g001.tif">
<alt-text content-type="machine-generated">Flowchart outlining a brain tumor detection system. Steps include MRI scan data collection, preprocessing with resizing and aspect ratio adjustments, and creating training and testing datasets. The YOLOv7 architecture classifies images into tumor categories: gliomas, meningioma, pituitary tumors, or no tumor. Evaluation metrics include F1 score, precision, recall, MAP at point five and point five to point ninety-five.</alt-text>
</graphic>
</fig>
</sec>
<sec id="s3_2">
<label>3.2</label>
<title>Dataset collection</title>
<p>To validate our findings, we used an MRI dataset of 2,870 brain images sourced from Roboflow (<xref ref-type="bibr" rid="B20">20</xref>). MRI scans were chosen because of their high accuracy in identifying brain tumors. The dataset comprises four classes: no tumor (327 images), meningioma (823 images), pituitary tumor (834 images), and glioma (886 images). For this analysis, we used 70% of the dataset (2,009 MRI images) for training, 20% (574 MRI images) for validation, and 10% (287 MRI images) for testing.</p>
<sec id="s3_2_1">
<label>3.2.1</label>
<title>Dataset structure</title>
<p>This dataset is organized into four classes, which represent both the presence and absence of tumors:</p>
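<table-wrap position="anchor">
<table frame="hsides">
<thead>
<tr>
<th valign="middle" align="center">Class</th>
<th valign="middle" align="center">Number of images</th>
<th valign="middle" align="center">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="middle" align="left">No tumor</td>
<td valign="middle" align="center">327</td>
<td valign="middle" align="left">MRI scans of healthy individuals with no visible brain abnormalities.</td>
</tr>
<tr>
<td valign="middle" align="left">Meningioma</td>
<td valign="middle" align="center">823</td>
<td valign="middle" align="left">Images containing meningioma, which are typically benign tumors of the meninges.</td>
</tr>
<tr>
<td valign="middle" align="left">Pituitary tumor</td>
<td valign="middle" align="center">834</td>
<td valign="middle" align="left">Scans containing tumors in the pituitary gland region.</td>
</tr>
<tr>
<td valign="middle" align="left">Glioma</td>
<td valign="middle" align="center">886</td>
<td valign="middle" align="left">MRI scans with gliomas, aggressive tumors originating in glial cells.</td>
</tr>
</tbody>
</table>
</table-wrap>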
<p>To facilitate model training and meet the input requirements of the deep learning framework, all images were cropped to a standard size of 512 &#xd7; 512 pixels. This standardization reduces the computational burden while preserving the relevant features of the brain scans.</p>
</sec>
<sec id="s3_2_2">
<label>3.2.2</label>
<title>Dataset splitting</title>
<p>To create a strong and effective model, the dataset was partitioned into three subsets:</p>
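<list list-type="bullet">
<list-item>
<p>Training set: 70% of the data (2,009 images).</p>
</list-item>
<list-item>
<p>Validation set: 20% of the data (574 images).</p>
</list-item>
<list-item>
<p>Test set: 10% of the data (287 images).</p>
</list-item>
</list>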
<p>This partition guarantees that the model has adequate information to be trained on while also being validated and tested on new data to measure its ability to generalize.</p>
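<p>The split sizes can be checked with a few lines of Python. The following is a minimal sketch only: the helper name split_counts is hypothetical, and the integer rounding rule is an assumption, since the text does not describe how images were assigned to each subset.</p>
<preformat>
```python
def split_counts(total, percentages):
    """Return integer split sizes for the given whole-number percentages.

    All but the last split use integer arithmetic; the last split takes the
    remainder, so the sizes always sum to `total`.
    """
    sizes = [total * p // 100 for p in percentages[:-1]]
    sizes.append(total - sum(sizes))
    return sizes


# Class counts reported for the dataset.
class_counts = {"no tumor": 327, "meningioma": 823, "pituitary tumor": 834, "glioma": 886}
total_images = sum(class_counts.values())

# 70% training, 20% validation, 10% test (assumed order).
train, val, test = split_counts(total_images, [70, 20, 10])
print(total_images, train, val, test)
```
</preformat>
<p>Under these assumptions, the computed sizes agree with the image counts reported for the three subsets.</p>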
</sec>
<sec id="s3_2_3">
<label>3.2.3</label>
<title>Data imbalance</title>
<p>A notable issue in the dataset was class imbalance. The &#x201c;no tumor&#x201d; class contained only 327 images, far fewer than the tumor classes (meningioma, pituitary tumor, and glioma), each of which had more than 800 images. The largest class, glioma (886 images), had more than 2.7 times as many images as the no tumor class.</p>
<p>This imbalance poses several challenges:</p>
<list list-type="bullet">
<list-item>
<p>Model bias: The model may become biased toward tumor classes, particularly glioma, due to their over-representation, while underperforming on under-represented classes such as no tumor.</p>
</list-item>
<list-item>
<p>Reduced sensitivity and specificity: The classifier may show lower accuracy for detecting healthy cases, resulting in a higher false-positive rate.</p>
</list-item>
<list-item>
<p>Skewed performance metrics: High overall accuracy may be misleading if the model fails to correctly predict the minority class.</p>
</list-item>
</list>
</sec>
<sec id="s3_2_4">
<label>3.2.4</label>
<title>Addressing the data imbalance</title>
<p>To address the class imbalance, the following techniques were considered:</p>
<list list-type="bullet">
<list-item>
<p>Data augmentation: Geometric and photometric transformations (such as rotations, flips, zoom, and brightness adjustments) were applied to increase the effective size of the minority class.</p>
</list-item>
<list-item>
<p>Class weighting: During training, class weights were set inversely proportional to class frequency, which increases the penalty for misclassifying minority classes.</p>
</list-item>
<list-item>
<p>Sampling strategies: Minority classes were oversampled and/or majority classes were undersampled to obtain a more balanced class distribution during training.</p>
</list-item>
</list>
<p>These strategies aim to enhance the model&#x2019;s ability to generalize across all classes and reduce bias toward over-represented tumor types.</p>
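<p>As an illustration of class weighting, the sketch below computes inverse-frequency weights from the reported class counts. The exact weighting formula used in training is not stated in the text; the common &#x201c;balanced&#x201d; heuristic, total / (number of classes &#xd7; class count), is assumed here, and the function name is hypothetical.</p>
<preformat>
```python
def inverse_frequency_weights(class_counts):
    """Weight each class by total / (num_classes * count).

    Rare classes receive weights above 1 and frequent classes below 1.
    """
    total = sum(class_counts.values())
    num_classes = len(class_counts)
    return {name: total / (num_classes * count) for name, count in class_counts.items()}


# Class counts reported for the dataset.
class_counts = {"no tumor": 327, "meningioma": 823, "pituitary tumor": 834, "glioma": 886}
weights = inverse_frequency_weights(class_counts)
for name, weight in weights.items():
    print(f"{name}: {weight:.3f}")

# Ratio between the largest and smallest class.
imbalance_ratio = class_counts["glioma"] / class_counts["no tumor"]
print(f"imbalance ratio: {imbalance_ratio:.2f}")
```
</preformat>
<p>With this heuristic, the no tumor class receives the largest weight, so errors on that class contribute more to the loss than errors on the glioma class.</p>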
</sec>
</sec>
<sec id="s3_3">
<label>3.3</label>
<title>Data preprocessing</title>
<p>To prepare the dataset for classification, we applied several preprocessing steps to standardize the brain tumor images. First, RGB images were converted to grayscale, producing a single-channel image; this simplifies the image data and reduces computational requirements. Second, all images were scaled to a uniform resolution of 608 &#xd7; 608 pixels using scaling and aspect ratio adjustment, which keeps the input consistent across the dataset while minimizing distortion and makes the model input more reliable.</p>
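<p>The geometry of these two steps can be sketched as follows. This is an illustration under stated assumptions rather than the preprocessing code used in the study: the text does not specify the grayscale formula or how the aspect ratio was handled, so the ITU-R BT.601 luma weights and a letterbox-style resize (scale, then pad to a square) are assumed, and both function names are hypothetical.</p>
<preformat>
```python
def to_gray(r, g, b):
    """Grayscale value for one RGB pixel using the BT.601 luma weights (assumed)."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)


def letterbox_geometry(width, height, target=608):
    """Return (new_width, new_height, pad_x, pad_y) for an aspect-preserving resize.

    The image is scaled so that its longer side equals `target`; the shorter
    side is then padded equally on both sides to reach `target`.
    """
    scale = min(target / width, target / height)
    new_width = round(width * scale)
    new_height = round(height * scale)
    pad_x = (target - new_width) // 2
    pad_y = (target - new_height) // 2
    return new_width, new_height, pad_x, pad_y


# A square 512 x 512 image needs no padding; a 640 x 480 image is padded vertically.
print(letterbox_geometry(512, 512))
print(letterbox_geometry(640, 480))
```
</preformat>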
</sec>
<sec id="s3_4">
<label>3.4</label>
<title>Feature extraction</title>
<p>In our work, we enhanced the YOLOv7 architecture by integrating the Convolutional Block Attention Module (CBAM) together with the Spatial Pyramid Pooling Fast Plus (SPPF+) module to strengthen feature extraction. CBAM applies attention along both the channel and spatial dimensions, making the model more attentive to subtle tumor features. The SPPF+ module enriches the multi-scale context, which aids the detection of small or irregularly shaped tumors in MRI images. Combined, these modules improve the accuracy of the model in complex cases.</p>
<p>Integrating attention mechanisms and SPPF+ into YOLOv7 significantly enhances the model&#x2019;s feature extraction capability from complex MRI images. Attention mechanisms like SE (Squeeze and Excitation) blocks or CBAM allow the network to devote attention to the most spatially and channel-wise pertinent features. This is useful when identifying small or irregularly shaped&#xa0;tumors, which may resemble surrounding tissues. Also, SPPF+ excels at multi-scale local and global context feature representation through enlarged receptive fields and thus captures spatial and contextual information. These enhancements allow for better tumor detection and classification.</p>
</sec>
<sec id="s3_5">
<label>3.5</label>
<title>YOLOv7 network architecture</title>
<p>YOLO is a real-time object detection method based on deep learning. It is widely used because it is fast and accurate. The YOLO algorithm is important for the following reasons:</p>
<list list-type="bullet">
<list-item>
<p>Speed: The method makes predictions in real time, which accelerates detection.</p>
</list-item>
<list-item>
<p>High precision: The YOLO prediction approach produces accurate results with few background errors.</p>
</list-item>
<list-item>
<p>Learning capability: The method learns representations of object shapes and applies them to detection.</p>
</list-item>
</list>
<p>The methods used by YOLO are Intersection over Union (IoU), bounding box regression, and residual blocks. Because brain tumor therapy must be stage-specific and timely, accurate diagnosis is essential. As illustrated in <xref ref-type="fig" rid="f2">
<bold>Figure&#xa0;2</bold>
</xref>, the YOLOv7 model architecture is composed of two primary elements: the backbone network and the head network. The input image is first preprocessed to make it suitable for the backbone network.</p>
<fig id="f2" position="float">
<label>Figure&#xa0;2</label>
<caption>
<p>YOLOv7 network architecture.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fonc-15-1508326-g002.tif">
<alt-text content-type="machine-generated">Flowchart of a neural network architecture for brain tumor classification. It begins with an input image and flows through stages labeled Backbone, Neck, and Head. The Backbone includes layers like CBS, E-ELAN, and MP1. The Neck involves layers including SPPCSPSC and Upsample, leading to concatenations and ELAN-W. The Head contains RepConv layers and outputs classifications for Meningioma, Pituitary, Glioma, or No Brain Tumors.</alt-text>
</graphic>
</fig>
<p>Once the images are preprocessed, the backbone network extracts the relevant features. These features are passed to the head network, which fuses them for object detection. The architecture must balance detection accuracy and spatial precision so that features from multiple levels can be combined efficiently.</p>
<p>The YOLOv7 backbone is composed of the following elements: MaxPool1 (MP1), the extended efficient layer aggregation network (E-ELAN), and the CBS component (convolution, batch normalization, and Sigmoid Linear Unit [SiLU]). The CBS component applies convolution, batch normalization, and the SiLU activation function in sequence to strengthen the learning ability of the network. The E-ELAN component improves gradient flow relative to the ELAN design by organizing computation into modular blocks, which allows the network to learn more diverse features.</p>
<p>The MP1 component is divided into two branches. The lower branch uses a CBS module with a 1 &#xd7; 1 kernel and a stride of 1 to reduce the number of channels, followed by a CBS module with a 3 &#xd7; 3 kernel and a stride of 2 to reduce each spatial dimension. The upper branch applies MaxPool followed by a CBS module with 128 output channels. A concatenation operation then combines the features extracted by both branches. MaxPool retains the largest activation in each small local region, whereas the convolution in CBS aggregates information from all values in the region. Together, these operations improve the feature extraction capability of the network and thus its overall performance.</p>
<p>YOLOv7&#x2019;s core architecture employs E-ELAN and incorporates the Feature Pyramid Network (FPN) design to extract features across multiple levels. Building on the Spatial Pyramid Pooling (SPP) architecture, the Cross Stage Partial (CSP) design collects features at multiple scales at low computational cost. By merging SPP and CSP, the SPPCSPC component enlarges the receptive field of the network.</p>
<p>Hardware specification&#x2014;CPU: AMD Ryzen 9 7950X or Intel Core i9-13900K.</p>
<p>Integrating the ELAN-W layer further improves feature extraction. The MP2 block has the same structure as the MP1 block but twice the number of output channels. The Rep structure adjusts the number of channels in the final feature maps, and a 1 &#xd7; 1 convolution then predicts the class, confidence, and anchor box. The Rep structure, which is based on RepVGG, uses a residual design that can be re-parameterized into a simpler convolutional form at inference, so its predictive ability is preserved even as its complexity drops.</p>
<p>YOLOv7&#x2019;s methodology is grounded in convolutional neural networks and real-time object detection principles. Its loss function is a composite of the following:</p>
<sec id="s3_5_1">
<label>3.5.1</label>
<title>Bounding box regression loss (CIoU loss)</title>
<p>The Complete Intersection over Union (CIoU) loss improves upon traditional IoU by incorporating distance between box centers, aspect ratio, and overlap area:</p>
<disp-formula>
<mml:math display="block" id="M1">
<mml:mrow>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mrow>
<mml:mi>C</mml:mi>
<mml:mi>I</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>U</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>I</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>U</mml:mi>
<mml:mo>+</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msup>
<mml:mi>&#x3c1;</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mtext mathvariant="bold">b</mml:mtext>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">b</mml:mtext>
<mml:mrow>
<mml:mi>g</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mi>c</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mfrac>
<mml:mo>+</mml:mo>
<mml:mi>&#x3b1;</mml:mi>
<mml:mi>&#x3c5;</mml:mi>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where IoU is the Intersection over Union between the predicted and ground truth boxes, &#x3c1;(b,b<sup>gt</sup>) is the Euclidean distance between the centers of the predicted box b and the ground truth box b<sup>gt</sup>, c is the diagonal length of the smallest enclosing box covering both b and b<sup>gt</sup>, &#x3c5; measures the consistency of the aspect ratios, and &#x3b1; is the trade-off parameter that balances the impact of the aspect ratio term.</p>
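<p>To make the terms concrete, the following sketch evaluates the CIoU loss for a single pair of boxes. It is a plain Python illustration, not the YOLOv7 implementation. The definitions of the aspect ratio term, &#x3c5; = (4/&#x3c0;<sup>2</sup>)(arctan(w<sup>gt</sup>/h<sup>gt</sup>) &#x2212; arctan(w/h))<sup>2</sup>, and of the trade-off parameter, &#x3b1; = &#x3c5;/((1 &#x2212; IoU) + &#x3c5;), are taken from the standard CIoU formulation and are not given in the text above; boxes are assumed to be in (x1, y1, x2, y2) corner format with positive width and height.</p>
<preformat>
```python
import math


def ciou_loss(box, box_gt):
    """CIoU loss for a predicted and a ground truth box in (x1, y1, x2, y2) format."""
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = box_gt
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Overlap area and IoU.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    iou = inter / (w * h + gw * gh - inter)

    # Squared distance between the box centers (rho squared).
    rho2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2

    # Squared diagonal of the smallest box enclosing both boxes (c squared).
    c2 = (max(x2, gx2) - min(x1, gx1)) ** 2 + (max(y2, gy2) - min(y1, gy1)) ** 2

    # Aspect ratio term and its trade-off weight (standard CIoU definitions, assumed).
    v = (4 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(w / h)) ** 2
    alpha = v / ((1 - iou) + v) if v > 0 else 0.0

    return 1 - iou + rho2 / c2 + alpha * v


# Identical boxes give zero loss; disjoint boxes are penalized by the center distance.
print(ciou_loss((0, 0, 2, 2), (0, 0, 2, 2)))
print(ciou_loss((0, 0, 2, 2), (4, 0, 6, 2)))
```
</preformat>
<p>Unlike plain IoU, which is zero for any pair of non-overlapping boxes, the center distance term still distinguishes near misses from distant ones.</p>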
</sec>
<sec id="s3_5_2">
<label>3.5.2</label>
<title>Objectness loss (binary cross-entropy loss)</title>
<p>This function is used to evaluate whether an object is present in a predicted bounding box:</p>
<disp-formula>
<mml:math display="block" id="M2">
<mml:mrow>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mrow>
<mml:mi>o</mml:mi>
<mml:mi>b</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mo>&#x2212;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mrow>
<mml:mi>y</mml:mi>
<mml:mi>log</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>p</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mi>log</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>p</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where y is the ground truth objectness (1 if the object exists and 0 otherwise) and p is the predicted objectness confidence.</p>
</sec>
<sec id="s3_5_3">
<label>3.5.3</label>
<title>Classification loss (binary cross-entropy per class)</title>
<p>This loss has a similar form to the objectness loss but is applied independently to each class in multi-class settings:</p>
<disp-formula>
<mml:math display="block" id="M3">
<mml:mrow>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>s</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mo>&#x2212;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>C</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
<mml:mi>log</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mi>log</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where C is the number of classes, y<sub>c</sub> is the ground truth for class c, and p<sub>c</sub> is the predicted probability for class c.</p>
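<p>The objectness and classification losses share the same binary cross-entropy form, which the sketch below evaluates for a single prediction. The clamping constant eps, the function names, and the example probabilities are illustrative assumptions, not values from the study.</p>
<preformat>
```python
import math


def binary_cross_entropy(y, p, eps=1e-7):
    """Binary cross-entropy for one label y in {0, 1} and predicted probability p."""
    p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


def classification_loss(targets, probs):
    """Sum of independent per-class binary cross-entropy terms."""
    return sum(binary_cross_entropy(y, p) for y, p in zip(targets, probs))


# Hypothetical prediction for a box whose true class is glioma.
classes = ["glioma", "meningioma", "no tumor", "pituitary tumor"]
targets = [1, 0, 0, 0]
confident = [0.90, 0.05, 0.03, 0.02]
mistaken = [0.10, 0.90, 0.03, 0.02]

print(classification_loss(targets, confident))
print(classification_loss(targets, mistaken))
```
</preformat>
<p>A confident correct prediction yields a small loss, whereas assigning a high probability to the wrong class is penalized heavily.</p>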
<p>YOLOv7 also employs E-ELAN for deep and efficient feature learning, enhancing detection accuracy even for small, low-contrast tumors in MRI images.</p>
<p>We used YOLOv7&#x2019;s default ComputeLoss function, which combines CIoU loss for bounding box accuracy, objectness loss to detect tumor presence, and classification loss for tumor type. These help to improve detection and classification performance. Although ComputeLossOTA and other advanced loss functions were not used, they have potential for handling low-quality or imbalanced data better.</p>
</sec>
</sec>
<sec id="s3_6">
<label>3.6</label>
<title>Hyperparameter tuning and optimization of YOLOv7 model</title>
<p>To optimize the efficiency of a deep learning model, one must fine-tune the specific hyperparameters related to it. In this study, the hyperparameters of the YOLOv7 model were tuned to maximize detection and classification accuracy while minimizing resource expenditure. Their values, alongside a brief description of the reason for their selection, are summarized in <xref ref-type="table" rid="T2">
<bold>Table&#xa0;2</bold>
</xref>.</p>
<table-wrap id="T2" position="float">
<label>Table&#xa0;2</label>
<caption>
<p>YOLOv7 hyperparameter tuning details.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="middle" align="center">Hyperparameter</th>
<th valign="middle" align="center">Value</th>
<th valign="middle" align="center">Description</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="middle" align="left">Learning rate</td>
<td valign="middle" align="center">0.01</td>
<td valign="middle" align="left">The initial rate for model weight updates was chosen to balance speed and stability.</td>
</tr>
<tr>
<td valign="middle" align="left">Batch size</td>
<td valign="middle" align="center">16</td>
<td valign="middle" align="left">Number of training samples per batch; set to optimize GPU memory utilization.</td>
</tr>
<tr>
<td valign="middle" align="left">Number of epochs</td>
<td valign="middle" align="center">50</td>
<td valign="middle" align="left">Total number of passes through the training dataset; sufficient for convergence.</td>
</tr>
<tr>
<td valign="middle" align="left">Optimizer</td>
<td valign="middle" align="center">Adam</td>
<td valign="middle" align="left">An adaptive optimizer was selected for efficient convergence and weight adjustments.</td>
</tr>
<tr>
<td valign="middle" align="left">Weight decay</td>
<td valign="middle" align="center">0.0005</td>
<td valign="middle" align="left">A parameter for regularization that penalizes heavy weights to avoid overfitting.</td>
</tr>
<tr>
<td valign="middle" align="left">Input image resolution</td>
<td valign="middle" align="center">608 &#xd7; 608</td>
<td valign="middle" align="left">Image resizing ensures consistency across the dataset while maintaining detail.</td>
</tr>
<tr>
<td valign="middle" align="left">Data augmentation techniques</td>
<td valign="middle" align="center">Rotation, scaling, and flipping</td>
<td valign="middle" align="left">Applied to enhance model generalization and robustness.</td>
</tr>
<tr>
<td valign="middle" align="left">Anchor boxes</td>
<td valign="middle" align="center">Customized (based on k-means clustering)</td>
<td valign="middle" align="left">Improves the accuracy of bounding box localization.</td>
</tr>
<tr>
<td valign="middle" align="left">IoU threshold</td>
<td valign="middle" align="center">0.5</td>
<td valign="middle" align="left">Minimum overlap required for a detection to be considered valid.</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>IoU, Intersection over Union.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<p>The learning rate was 0.01, facilitating incremental weight adjustments to promote stable learning and convergence. The selected batch size of 16 optimized training efficiency while accounting for GPU memory limitations. Training the model for 50 epochs achieved a balance between training duration and performance, allowing for sufficient learning while mitigating the risk of overfitting.</p>
<p>The Adam optimizer was utilized because of its adaptive learning rate mechanism, enhancing convergence in non-convex optimization problems. Regularization was implemented through a weight decay of 0.0005, which reduced overfitting while maintaining model complexity.</p>
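<p>For illustration, a single Adam update for one scalar weight with the learning rate (0.01) and weight decay (0.0005) from Table 2 can be written as follows. The moment coefficients (0.9 and 0.999), the epsilon term, and the treatment of weight decay as an L2 term added to the gradient are common defaults assumed here; the text does not state them, and the function name is hypothetical.</p>
<preformat>
```python
import math


def adam_step(w, grad, m, v, t, lr=0.01, weight_decay=0.0005,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar weight; returns the new (w, m, v)."""
    grad = grad + weight_decay * w            # L2 regularization folded into the gradient
    m = beta1 * m + (1 - beta1) * grad        # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2   # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)              # bias correction for step t (starting at 1)
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v


# First update of a weight of 1.0 with a gradient of 0.5.
w, m, v = adam_step(w=1.0, grad=0.5, m=0.0, v=0.0, t=1)
print(w)
```
</preformat>
<p>On the first step, the bias-corrected update has a magnitude close to the learning rate regardless of the gradient scale, which is one reason Adam tends to be stable early in training.</p>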
<p>To further boost model performance on the training images, various data augmentation practices such as rotation, scaling, and flipping were implemented. These augmentations emulate changes found in real data, improving the model&#x2019;s performance on data that it has not previously encountered. Additionally, all images were set to a uniform input size of 608 &#xd7; 608 pixels. This standardization improves consistency and ensures the preservation of essential details required for precise tumor detection.</p>
<p>Customized anchor boxes, generated through k-means clustering on the training data, facilitated the accurate localization of bounding boxes. The IoU threshold of 0.5 was employed to establish valid detections, effectively balancing sensitivity and specificity in tumor detection.</p>
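<p>The anchor clustering step can be sketched in plain Python as follows. The text does not specify the distance measure or the initialization; this sketch assumes the 1 &#x2212; IoU distance commonly used for YOLO anchors and a deterministic initialization, and the box sizes in the example are hypothetical rather than taken from the dataset.</p>
<preformat>
```python
def wh_iou(box, anchor):
    """IoU of two boxes given as (width, height) and aligned at the same corner."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    return inter / (box[0] * box[1] + anchor[0] * anchor[1] - inter)


def kmeans_anchors(boxes, k, iterations=50):
    """Cluster (width, height) pairs into k anchors; highest IoU means nearest."""
    # Deterministic initialization: k boxes spread evenly over the area-sorted list.
    ordered = sorted(boxes, key=lambda b: b[0] * b[1])
    anchors = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: wh_iou(box, anchors[i]))
            clusters[best].append(box)
        updated = []
        for anchor, members in zip(anchors, clusters):
            if members:
                updated.append((sum(b[0] for b in members) / len(members),
                                sum(b[1] for b in members) / len(members)))
            else:
                updated.append(anchor)  # keep the anchor of an empty cluster
        if updated == anchors:
            break  # assignments no longer change
        anchors = updated
    return sorted(anchors, key=lambda a: a[0] * a[1])


# Hypothetical tumor box sizes in pixels: small, medium, and large lesions.
boxes = [(20, 22), (24, 20), (22, 24), (60, 64), (64, 58), (58, 62),
         (120, 110), (110, 124), (118, 116)]
anchors = kmeans_anchors(boxes, k=3)
print(anchors)
```
</preformat>
<p>Each resulting anchor is the mean width and height of one cluster, so the anchors follow the size distribution of the annotated boxes rather than generic defaults.</p>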
<p>The model was carefully designed to avoid overfitting to larger tumors and underfitting on smaller ones by incorporating effective regularization, data augmentation, and architectural enhancements like SPPF+. These strategies ensured balanced learning across different tumor sizes, improving generalization and maintaining high detection accuracy.</p>
</sec>
</sec>
<sec id="s4">
<label>4</label>
<title>Comparison of performance metrics of YOLOv7</title>
<p>Although combining YOLOv7 with post-processing algorithms like GrabCut can refine the segmentation process, our dataset did not require that for YOLOv7 due to its strong spatial feature extraction and precise bounding box predictions. With high-resolution MRI images and well-annotated labels, YOLOv7 was able to accurately localize tumor boundaries and did not need additional segmentation processes, thus saving steps in the workflow while maintaining high performance.</p>
<p>The trained model is validated using evaluation metrics based on the confusion matrix. A true positive (TP) is a prediction that matches a label that is actually present. A false positive (FP) occurs when the model predicts a label that is not actually present. A true negative (TN) occurs when a label is absent and the model does not predict it. A false negative (FN) occurs when a label is present but the model fails to predict it. From these counts, we compute the F1 score, precision, recall, and mean average precision (mAP).</p>
<p>Precision is the proportion of positive predictions that are correct, that is, the ratio of true positives to all positive predictions. Recall is the proportion of actual positives that the model detects, that is, the ratio of true positives to the total number of actual positives. The F1 score is the harmonic mean of precision and recall.</p>
<p>The mAP summarizes performance across all classes: the average precision (AP) is calculated for each class, and the mean of these values is taken. The notation mAP@0.5 denotes the mAP computed at an IoU threshold of 0.5, while mAP@0.5:0.95 denotes the average of the mAP values computed at IoU thresholds from 0.5 to 0.95 in steps of 0.05. The F1-confidence curve illustrates the relationship between the F1 score and the object detection confidence threshold and helps in understanding the balance between recall and precision at different confidence thresholds (<xref ref-type="disp-formula" rid="eq1">Equations 1</xref>&#x2013;<xref ref-type="disp-formula" rid="eq4">4</xref>) (<xref ref-type="bibr" rid="B21">21</xref>).</p>
<disp-formula id="eq1">
<label>(1)</label>
<mml:math display="block" id="M4">
<mml:mrow>
<mml:mtext>F1&#xa0;score</mml:mtext>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mtext>TP</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mtext>TP</mml:mtext>
<mml:mo>+</mml:mo>
<mml:mtext>FP</mml:mtext>
<mml:mo>+</mml:mo>
<mml:mtext>FN</mml:mtext>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="eq2">
<label>(2)</label>
<mml:math display="block" id="M5">
<mml:mrow>
<mml:mtext>Precision</mml:mtext>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mtext>TP</mml:mtext>
<mml:mrow>
<mml:mtext>TP</mml:mtext>
<mml:mo>+</mml:mo>
<mml:mtext>FP</mml:mtext>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="eq3">
<label>(3)</label>
<mml:math display="block" id="M6">
<mml:mrow>
<mml:mtext>Recall</mml:mtext>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mtext>TP</mml:mtext>
<mml:mrow>
<mml:mtext>TP</mml:mtext>
<mml:mo>+</mml:mo>
<mml:mtext>FN</mml:mtext>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="eq4">
<label>(4)</label>
<mml:math display="block" id="M7">
<mml:mrow>
<mml:mtext>m</mml:mtext>
<mml:mtext>A</mml:mtext>
<mml:mtext>P</mml:mtext>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mi>N</mml:mi>
</mml:mfrac>
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>N</mml:mi>
</mml:munderover>
<mml:mi>A</mml:mi>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</disp-formula>
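<p>Equations 1&#x2013;3 translate directly into code. The sketch below uses hypothetical counts of true positives, false positives, and false negatives, not results from this study, and also checks that the F1 score equals the harmonic mean of precision and recall.</p>
<preformat>
```python
def precision(tp, fp):
    """Share of positive predictions that are correct (Equation 2)."""
    return tp / (tp + fp) if tp + fp else 0.0


def recall(tp, fn):
    """Share of actual positives that are detected (Equation 3)."""
    return tp / (tp + fn) if tp + fn else 0.0


def f1_score(tp, fp, fn):
    """F1 score computed directly from the counts (Equation 1)."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical counts for one class at a fixed confidence threshold.
tp, fp, fn = 80, 20, 10
p = precision(tp, fp)
r = recall(tp, fn)
f1 = f1_score(tp, fp, fn)
harmonic_mean = 2 * p * r / (p + r)
print(round(p, 3), round(r, 3), round(f1, 3), round(harmonic_mean, 3))
```
</preformat>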
</sec>
<sec id="s5" sec-type="results">
<label>5</label>
<title>Results</title>
<p>The dataset was divided into 70% for training and 30% for evaluation (20% for validation and 10% for testing, as described in Section 3.2.2). The model reached a training loss of 0.021 and a final validation loss of 0.034, indicating effective learning and generalization.</p>
<sec id="s5_1">
<label>5.1</label>
<title>Precision performance</title>
<p>
<xref ref-type="table" rid="T3">
<bold>Table&#xa0;3</bold>
</xref> displays the results of YOLOv7&#x2019;s performance evaluation on 217 labeled box images. Precision was computed for the glioma, meningioma, pituitary tumor, and no tumor classes; it indicates how reliably the regions of interest (ROIs) identified by the model are correct. The precision score of YOLOv7 was 0.837 across all classes. The highest precision score, 0.909, was obtained for meningioma. The precision&#x2013;confidence curves are displayed in <xref ref-type="fig" rid="f3">
<bold>Figure&#xa0;3a</bold>
</xref>, which illustrates how well the model performed.</p>
<table-wrap id="T3" position="float">
<label>Table&#xa0;3</label>
<caption>
<p>Performance evaluation of the YOLOv7 model.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="top" align="center">Class</th>
<th valign="top" align="center">Images</th>
<th valign="top" align="center">Label</th>
<th valign="top" align="center">Precision</th>
<th valign="top" align="center">Recall</th>
<th valign="top" align="center">mAP@0.5</th>
<th valign="top" align="center">mAP@0.5:0.95</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="center">All</td>
<td valign="top" rowspan="4" align="center">217</td>
<td valign="top" align="center">217</td>
<td valign="top" align="center">0.837</td>
<td valign="top" align="center">0.813</td>
<td valign="top" align="center">0.879</td>
<td valign="top" align="center">0.442</td>
</tr>
<tr>
<td valign="top" align="center">Glioma</td>
<td valign="top" align="center">46</td>
<td valign="top" align="center">0.887</td>
<td valign="top" align="center">0.854</td>
<td valign="top" align="center">0.948</td>
<td valign="top" align="center">0.511</td>
</tr>
<tr>
<td valign="top" align="center">Meningioma</td>
<td valign="top" align="center">50</td>
<td valign="top" align="center">
<bold>0.909</bold>
</td>
<td valign="top" align="center">
<bold>0.96</bold>
</td>
<td valign="top" align="center">
<bold>0.974</bold>
</td>
<td valign="top" align="center">
<bold>0.52</bold>
</td>
</tr>
<tr>
<td valign="top" align="center">Pituitary tumors</td>
<td valign="top" align="center">72</td>
<td valign="top" align="center">0.865</td>
<td valign="top" align="center">0.886</td>
<td valign="top" align="center">0.929</td>
<td valign="top" align="center">0.461</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>mAP, mean average precision. Bold values indicate the highest precision, recall, mAP@0.5, and mAP@0.5:0.95 among the tumor classes, all of which were obtained for the meningioma class.</p>
</fn>
</table-wrap-foot>
</table-wrap>
<fig id="f3" position="float">
<label>Figure&#xa0;3</label>
<caption>
<p>Performance curves of the YOLOv7 model for brain tumor detection: <bold>(a)</bold> Precision&#x2013;confidence curve. <bold>(b)</bold> Recall&#x2013;confidence curve. <bold>(c)</bold> F1-confidence curve. <bold>(d)</bold> Precision&#x2013;recall curve.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fonc-15-1508326-g003.tif">
<alt-text content-type="machine-generated">Four subplots showing curves for different metrics: (a) Precision vs. Confidence shows curves for glioma, m_tumour, n_tumour, pituitary, and all classes, highlighting the precision. (b) Recall vs. Confidence illustrates recall for the same categories. (c) F1 vs. Confidence shows the F1 score, indicating model performance. (d) Precision vs. Recall highlights precision-recall trade-offs with mAP values for each category. Each plot uses distinct colors for each class and all classes combined.</alt-text>
</graphic>
</fig>
</sec>
<sec id="s5_2">
<label>5.2</label>
<title>Recall performance</title>
<p>
<xref ref-type="table" rid="T3">
<bold>Table&#xa0;3</bold>
</xref> displays a recall evaluation of the performance of YOLOv7 on 217 labeled box images. The recall score of YOLOv7 was 0.813 across all classes. YOLOv7 had the highest recall scores (0.96 for box detection) for meningioma and a score of 0.551 for no brain tumors. The box recall&#x2013;confidence curves of YOLOv7, as shown in <xref ref-type="fig" rid="f3">
<bold>Figure&#xa0;3b</bold>
</xref>, indicate that YOLOv7 performed well for meningiomas. In contrast, the recall&#x2013;confidence curve for the no tumor class, shown by the green line in the figure, indicates poor performance.</p>
</sec>
<sec id="s5_3">
<label>5.3</label>
<title>Mean average precision (mAP@0.5)</title>
<p>For 217 images with labeled bounding boxes, <xref ref-type="table" rid="T3">
<bold>Table&#xa0;3</bold>
</xref> reports the mAP of YOLOv7 at an IoU threshold of 0.5. For box detection, the overall mAP@0.5 was 0.879. The highest value, 0.974, was obtained for meningioma, followed by 0.948 for glioma, 0.929 for pituitary tumors, and 0.665 for the no tumor class. The F1-confidence curves for YOLOv7 are shown in <xref ref-type="fig" rid="f3">
<bold>Figure&#xa0;3c</bold>
</xref>. The model performed well for meningioma. In contrast, the no tumor curve has a shallow slope, indicating weaker performance than for the other categories. Moreover, the curves show that the optimal box confidence threshold for YOLOv7 was 0.322, which yields an F1 score of 0.88.</p>
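<p>As a minimal sketch of how an F1-confidence curve yields an operating point, the following Python snippet computes F1 as the harmonic mean of precision and recall at several confidence thresholds and selects the threshold with the highest F1. The function names and the sample precision and recall values are hypothetical and are not taken from the results of this study.</p>
<preformat>
```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    total = precision + recall
    if total == 0:
        return 0.0
    return 2 * precision * recall / total


def best_confidence(curve):
    """Return (confidence, f1) for the point with the highest F1.

    `curve` is a list of (confidence, precision, recall) tuples.
    """
    scored = [(conf, f1_score(p, r)) for conf, p, r in curve]
    return max(scored, key=lambda item: item[1])


# Hypothetical points, for illustration only (not values from the paper).
sample_curve = [
    (0.10, 0.60, 0.95),
    (0.30, 0.80, 0.90),
    (0.50, 0.90, 0.70),
    (0.70, 0.95, 0.40),
]

conf, f1 = best_confidence(sample_curve)
print(f"best confidence={conf:.2f}, F1={f1:.3f}")
```
</preformat>
<p>In this sketch, raising the confidence threshold trades recall for precision, and the selected threshold is the one at which the two are best balanced.</p>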
</sec>
<sec id="s5_4">
<label>5.4</label>
<title>Mean average precision (mAP@0.5:0.95)</title>
<p>
<xref ref-type="table" rid="T3">
<bold>Table&#xa0;3</bold>
</xref> presents the mAP results obtained with YOLOv7 for bounding-box detection on 217 labeled images at IoU thresholds of 0.5&#x2013;0.95. This metric averages the AP over IoU thresholds from 0.50 to 0.95 in steps of 0.05; at each threshold, a predicted box counts as a correct detection only if its IoU with a ground-truth box reaches that threshold. For box detection, the model yielded an overall mAP@0.5:0.95 of 0.442. The highest mAP@0.5:0.95, 0.52, was observed for meningioma. The precision&#x2013;recall curves for the YOLOv7 model are provided in <xref ref-type="fig" rid="f3">
<bold>Figure&#xa0;3d</bold>
</xref>. They indicate that the model performs reasonably well for all classes other than no tumor. Unlike the meningioma and pituitary curves, the no tumor curve reflects a higher rate of false positives.</p>
<p>Observation: High precision and recall across classes and strong mAP@0.5.</p>
<p>Decision: These results indicate that the model detects and classifies the tumor classes effectively, with few false positives or false negatives outside the no tumor class.</p>
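<p>To make the mAP@0.5:0.95 metric concrete, the sketch below computes the IoU of two axis-aligned boxes and averages per-threshold AP values over the ten IoU thresholds from 0.50 to 0.95 in steps of 0.05, following the common COCO-style convention. The box format (x1, y1, x2, y2), the function names, and all numeric values are illustrative assumptions; this is not the evaluation code or the data of this study.</p>
<preformat>
```python
def box_iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2) corner coordinates."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Ten IoU thresholds: 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]


def mean_ap_over_thresholds(ap_by_threshold):
    """Average AP values that were computed at each IoU threshold."""
    return sum(ap_by_threshold[t] for t in IOU_THRESHOLDS) / len(IOU_THRESHOLDS)


# A hypothetical predicted box that overlaps the ground truth but is shifted sideways.
ground_truth = (10.0, 10.0, 50.0, 50.0)
prediction = (20.0, 10.0, 60.0, 50.0)
iou = box_iou(ground_truth, prediction)
print(f"IoU = {iou:.2f}")
print("counts as a match at 0.50:", iou >= 0.50)
print("counts as a match at 0.75:", iou >= 0.75)

# Hypothetical per-threshold AP values that fall as the threshold tightens.
hypothetical_ap = {t: 1.0 - t for t in IOU_THRESHOLDS}
print(f"mAP@0.5:0.95 = {mean_ap_over_thresholds(hypothetical_ap):.3f}")
```
</preformat>
<p>Because the stricter thresholds require tighter localization, the same detector typically scores lower on mAP@0.5:0.95 than on mAP@0.5.</p>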
</sec>
<sec id="s5_5">
<label>5.5</label>
<title>Confusion matrix</title>
<p>The normalized confusion matrix for YOLOv7 is displayed in <xref ref-type="fig" rid="f4">
<bold>Figure&#xa0;4</bold>
</xref>, which reveals that the model performs well for all classes with the exception of no tumor, which has a high false-positive rate of 0.35.</p>
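<p>As an illustration of how a normalized confusion matrix can be read, the sketch below scales a matrix of raw counts so that each true-class column sums to 1. It assumes that rows are predicted classes and columns are true classes; the two-class layout, the function name, and the counts are hypothetical and are not the data of this study.</p>
<preformat>
```python
def normalize_columns(counts):
    """Scale each column of a count matrix so that it sums to 1.

    `counts[i][j]` is the number of items of true class j predicted as class i.
    Columns whose total is zero are left as all zeros.
    """
    n_rows, n_cols = len(counts), len(counts[0])
    col_totals = [sum(counts[i][j] for i in range(n_rows)) for j in range(n_cols)]
    return [
        [counts[i][j] / col_totals[j] if col_totals[j] else 0.0 for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Hypothetical counts for two classes, for illustration only.
labels = ["tumor", "no_tumor"]
counts = [
    [45, 7],   # predicted tumor
    [5, 13],   # predicted no_tumor
]

normalized = normalize_columns(counts)
for label, row in zip(labels, normalized):
    print(label, [round(value, 2) for value in row])
```
</preformat>
<p>With this layout, an off-diagonal entry gives the fraction of a true class that was assigned to a different predicted class.</p>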
<fig id="f4" position="float">
<label>Figure&#xa0;4</label>
<caption>
<p>Confusion matrix for YOLOv7 model.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fonc-15-1508326-g004.tif">
<alt-text content-type="machine-generated">Confusion matrix showing predicted versus true labels for glioma, m_tumor, n_tumor, pituitary, and background. Diagonal values indicate high classification accuracy, with glioma and m_tumor at 0.98, and pituitary at 0.97. Off-diagonal values like 0.26 and 0.35 indicate misclassifications. Color intensity represents accuracy levels.</alt-text>
</graphic>
</fig>
<p>Sample predictions of the YOLOv7 model are shown in <xref ref-type="fig" rid="f5">
<bold>Figure&#xa0;5</bold>
</xref>. The predictions were generated with settings chosen to maximize computational efficiency and inference speed, which makes it possible to identify instances across a variety of images. The predicted images illustrate how accurately the brain tumor detection system operated after training on the original images. The complete classification results for the YOLOv7 model are presented in <xref ref-type="fig" rid="f6">
<bold>Figure&#xa0;6</bold>
</xref>. The experimental results indicate that YOLOv7 produced useful outcomes.</p>
<fig id="f5" position="float">
<label>Figure&#xa0;5</label>
<caption>
<p>Prediction of brain tumor using YOLOv7 model: <bold>(a)</bold> meningioma, <bold>(b)</bold> pituitary tumor, <bold>(c)</bold> glioma, and <bold>(d)</bold> no tumor.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fonc-15-1508326-g005.tif">
<alt-text content-type="machine-generated">Four MRI images of the brain with labels. (a) Shows a region marked as &#x201c;meningioma 1.0&#x201d; in red. (b) Indicates &#x201c;Pituitary: 0.939&#x201d; in white text. (c) Highlights &#x201c;Glioma 0.94&#x201d; in red. (d) Displays a faintly marked &#x201c;tumour 0.22&#x201d; in pink.</alt-text>
</graphic>
</fig>
<fig id="f6" position="float">
<label>Figure&#xa0;6</label>
<caption>
<p>Comparison of performance metrics.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fonc-15-1508326-g006.tif">
<alt-text content-type="machine-generated">Bar chart titled &#x201c;Performance Metrics&#x201d; comparing precision, recall, mAP at point five, and mAP at point five to point nine five across five categories: All, Glioma, Meningioma, Pituitary, and No tumor. Each category shows varying performance levels for each metric, with the highest precision in Meningioma and the lowest mAP at point five to point nine five in No tumor.</alt-text>
</graphic>
</fig>
<p>Observation: Diagonal dominance with few misclassifications, mainly between glioma and meningioma.</p>
<p>Decision: Model performs well, but further data augmentation or fine-tuning could reduce misclassification.</p>
<p>Observation: Accurate bounding boxes drawn over tumors with correct class labels.</p>
<p>Decision: Demonstrates the model&#x2019;s potential for assisting in real-time clinical diagnosis.</p>
</sec>
</sec>
<sec id="s6" sec-type="discussion">
<label>6</label>
<title>Discussion</title>
<p>The results of YOLOv7 were assessed across four classes: meningioma, glioma, pituitary tumor, and no tumor. Several measures, including the confusion matrix, precision, recall, the F1-confidence curve, and inference criteria, were used in the assessment. The model performed well for all categories except no tumor, which had a higher false-positive rate of 0.35.</p>
<p>Notably, YOLOv7 outperformed the other models in the identification of meningiomas, gliomas, and pituitary tumors. The study also presented precision&#x2013;confidence curves for these classes and for the no tumor class, which show how well the model worked. Unexpectedly, meningioma had the highest recall in the YOLOv7 recall evaluation. These findings indicate that, while the models performed well overall, YOLOv7 performed better across a wide range of evaluation criteria, especially mAP@0.5:0.95.</p>
<p>To address the questions outlined in Section 1, the study applied this architecture to a dataset of annotated MRI scans containing brain tumors such as gliomas, meningiomas, and pituitary tumors. The results support the efficacy of YOLOv7: it achieved high detection accuracy and strong class discrimination, reinforcing its suitability for brain tumor classification. Compared with YOLOv8, U-Net, and Faster R-CNN, YOLOv7 was superior in both speed and mAP, which supports its use in real-time medical diagnosis. Moreover, the model was improved by adjusting anchor boxes, augmenting the dataset, and optimizing the learning rate, demonstrating its adaptability to tumors of varying shapes and sizes.</p>
<p>These enhancements made the model sensitive and robust for real-world clinical use. In general, the study confirmed that with appropriate modifications, YOLOv7 is a reliable and competent brain tumor classification and detection tool.</p>
<p>Compared with the model presented by Rao et al. (<xref ref-type="bibr" rid="B12">12</xref>) in <xref ref-type="table" rid="T1">
<bold>Table&#xa0;1</bold>
</xref>, which implemented a CNN-RNN with attention but lacked real-time capability, our model (YOLOv7 + CBAM + SPPF+) achieves faster and more accurate detection (99.5%) and better interpretability through Grad-CAM. The attention-based enhancement also improved localization and feature learning, which helped the model capture small or irregular tumors that earlier models frequently missed. Thus, our model offers a good balance between speed, accuracy, and visual explanation, making it better suited for clinical deployment.</p>
</sec>
<sec id="s7">
<label>7</label>
<title>Comparison of the proposed architecture</title>
<p>Our earlier research used VGG16 deep learning models to classify brain tumor grades from the Br35H, Figshare, and SARTAJ datasets and reached a classification accuracy of 73%, as shown in <xref ref-type="table" rid="T4">
<bold>Table&#xa0;4</bold>
</xref>. In another study, a CNN model was used to classify tumor types from a collection of MRI scans on Figshare (<xref ref-type="bibr" rid="B22">22</xref>). That study did not incorporate any data augmentation strategies to obtain more MRI images and, as a result, reached a classification accuracy of only 84%. The metrics compared include F1 score, accuracy (ACC), precision, and recall. The outcomes, contrasted in <xref ref-type="table" rid="T4">
<bold>Table&#xa0;4</bold>
</xref>, highlight the effectiveness of YOLOv7 for improving the performance measures of deep learning models.</p>
<table-wrap id="T4" position="float">
<label>Table&#xa0;4</label>
<caption>
<p>Comparison of the proposed architecture.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="top" align="center">Dataset</th>
<th valign="top" align="center">Architecture</th>
<th valign="top" align="center">Classification type</th>
<th valign="top" align="center">Accuracy (%)</th>
<th valign="top" align="center">Precision</th>
<th valign="top" align="center">Recall</th>
<th valign="top" align="center">F1 score</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="center">Br35H, Figshare, and SARTAJ</td>
<td valign="top" align="center">VGG16</td>
<td valign="top" align="center">Multi-class</td>
<td valign="top" align="center">73</td>
<td valign="top" align="center">0.7</td>
<td valign="top" align="center">0.75</td>
<td valign="top" align="center">0.72</td>
</tr>
<tr>
<td valign="top" align="center">Figshare (<xref ref-type="bibr" rid="B22">22</xref>)</td>
<td valign="top" align="center">CNN</td>
<td valign="top" align="center">Multi-class</td>
<td valign="top" align="center">84.1</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
</tr>
<tr>
<td valign="top" align="center">BraTS 2018 subset (<xref ref-type="bibr" rid="B14">14</xref>)</td>
<td valign="top" align="center">YOLOv5</td>
<td valign="top" align="center">Multi-class</td>
<td valign="top" align="center">85.95</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
</tr>
<tr>
<td valign="top" align="center">BraTS 2017 dataset</td>
<td valign="top" align="center">SegNet</td>
<td valign="top" align="center">Multi-class</td>
<td valign="top" align="center">79</td>
<td valign="top" align="center">0.85</td>
<td valign="top" align="center">0.85</td>
<td valign="top" align="center">0.85</td>
</tr>
<tr>
<td valign="top" align="center">Brain tumor dataset by Jun Cheng</td>
<td valign="top" align="center">CNN</td>
<td valign="top" align="center">Multi-class</td>
<td valign="top" align="center">84.19</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
<td valign="top" align="center">&#x2013;</td>
</tr>
<tr>
<td valign="top" align="center">Roboflow</td>
<td valign="top" align="center">YOLOv7</td>
<td valign="top" align="center">Multi-class</td>
<td valign="top" align="center">87.9</td>
<td valign="top" align="center">0.837</td>
<td valign="top" align="center">0.813</td>
<td valign="top" align="center">0.88</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>In related work outside medical imaging, YOLOv7 was used to identify safety equipment such as helmets, goggles, coats, gloves, and shoes. In terms of mAP@0.5, the YOLOv7 model performed better than YOLOv7-X by 1.7%, YOLOv5s by 6.6%, and YOLOv5m by 12.2%. That analysis confirms the versatility of the YOLOv7 models and recommends them for identifying the safety equipment of construction workers (<xref ref-type="bibr" rid="B23">23</xref>). Numerous studies have also applied machine learning-based object detection to detect defects such as cracks in roads and buildings. In one such study, YOLOv5, YOLOv6, and YOLOv7 models were trained and evaluated on a dataset of potholes and cracks on roadways, and YOLOv7 performed the best, with a mAP@0.5 of 79.0% (<xref ref-type="bibr" rid="B24">24</xref>). Our proposed work achieved a mAP@0.5 of 87.9%, which is higher than the other mAP@0.5 scores displayed in <xref ref-type="fig" rid="f7">
<bold>Figure&#xa0;7</bold>
</xref>.</p>
<fig id="f7" position="float">
<label>Figure&#xa0;7</label>
<caption>
<p>Comparison of the YOLOv7 model.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fonc-15-1508326-g007.tif">
<alt-text content-type="machine-generated">Bar chart comparing performance of YOLOv7 with various models. Bars represent models YOLOv5m, YOLOv5s, YOLOv5x, YOLOv7, and others. The proposed YOLOv7 model scores highest, suggesting superior performance.</alt-text>
</graphic>
</fig>
</sec>
<sec id="s8" sec-type="conclusions">
<label>8</label>
<title>Conclusion</title>
<p>This work presents an extensive assessment of YOLO-based deep learning for the categorization and segmentation of brain tumors, namely meningioma, glioma, and pituitary tumor. YOLOv7 performed better than the other models at accurately identifying and segmenting each tumor class. The YOLO models performed remarkably well in recognizing meningiomas, and YOLOv7 performed particularly well in identifying gliomas and pituitary tumors. Moreover, YOLOv7 achieved similar precision scores across all three tumor classes, and its highest recall was obtained for meningioma. These results support the efficacy of YOLO models in accurately identifying brain tumors, especially meningiomas. They also provide useful information about both the limitations and the strengths of such designs, opening the door to further developments in artificial intelligence and medicine. The proposed YOLOv7-based model, enhanced with CBAM, SPPF+, and Grad-CAM, maintains high accuracy and interpretability, both of which are essential in real clinical settings. Like recent works, this model addresses the major challenge of explainability, which is important for enabling trust and integration into diagnostic workflows. The approach can readily be validated by medical experts, and such validation is essential. Radiologists can check that the model&#x2019;s predictions align with clinical interpretations, especially by using Grad-CAM for explainability. Their reviews and comments can help improve the model, thus enhancing trust in it for real-world use.</p>
</sec>
</body>
<back>
<sec id="s9" sec-type="data-availability">
<title>Data availability statement</title>
<p>The original contributions presented in the study are included in the article/supplementary material. Further inquiries can be directed to the corresponding author.</p>
</sec>
<sec id="s10" sec-type="author-contributions">
<title>Author contributions</title>
<p>RN: Conceptualization, Methodology, Validation, Visualization, Writing &#x2013; original draft, Writing &#x2013; review &amp; editing. PD: Project administration, Supervision, Validation, Writing &#x2013; original draft, Writing &#x2013; review &amp; editing.</p>
</sec>
<sec id="s11" sec-type="funding-information">
<title>Funding</title>
<p>The author(s) declare that no financial support was received for the research, and/or publication of this article.</p>
</sec>
<sec id="s12" sec-type="COI-statement">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="s13" sec-type="ai-statement">
<title>Generative AI statement</title>
<p>The author(s) declare that no Generative AI was used in the creation of this manuscript.</p>
</sec>
<sec id="s14" sec-type="disclaimer">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<label>1</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mohsen</surname> <given-names>H</given-names>
</name>
<name>
<surname>El-Dahshan</surname> <given-names>ESA</given-names>
</name>
<name>
<surname>El-Horbaty</surname> <given-names>ESM</given-names>
</name>
<name>
<surname>Salem</surname> <given-names>ABM</given-names>
</name>
</person-group>. <article-title>Classification using deep learning neural networks for brain tumors</article-title>. <source>Future Computing Inf J</source>. (<year>2018</year>) <volume>3</volume>:<fpage>68</fpage>&#x2013;<lpage>71</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.fcij.2017.12.001</pub-id>
</citation></ref>
<ref id="B2">
<label>2</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tandel</surname> <given-names>GS</given-names>
</name>
<name>
<surname>Balestrieri</surname> <given-names>A</given-names>
</name>
<name>
<surname>Jujaray</surname> <given-names>T</given-names>
</name>
<name>
<surname>Khanna</surname> <given-names>NN</given-names>
</name>
<name>
<surname>Saba</surname> <given-names>L</given-names>
</name>
<name>
<surname>Suri</surname> <given-names>JS</given-names>
</name>
</person-group>. <article-title>Multiclass magnetic resonance imaging brain tumor classification using artificial intelligence paradigm</article-title>. <source>Comput Biol Med</source>. (<year>2020</year>) <volume>122</volume>:<fpage>103804</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.compbiomed.2020.103804</pub-id>, PMID: <pub-id pub-id-type="pmid">32658726</pub-id></citation></ref>
<ref id="B3">
<label>3</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Latif</surname> <given-names>G</given-names>
</name>
<name>
<surname>Brahim</surname> <given-names>GB</given-names>
</name>
<name>
<surname>Iskandar</surname> <given-names>DNFA</given-names>
</name>
<name>
<surname>Bashar</surname> <given-names>A</given-names>
</name>
<name>
<surname>Alghazo</surname> <given-names>J</given-names>
</name>
</person-group>. <article-title>Glioma Tumors classification using deep-neural-network-based features with SVM classifier</article-title>. <source>Diagnostics</source>. (<year>2022</year>) <volume>12</volume>:<fpage>1018</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/diagnostics12041018</pub-id>, PMID: <pub-id pub-id-type="pmid">35454066</pub-id></citation></ref>
<ref id="B4">
<label>4</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nawaz</surname> <given-names>SA</given-names>
</name>
<name>
<surname>Khan</surname> <given-names>DM</given-names>
</name>
<name>
<surname>Qadri</surname> <given-names>S</given-names>
</name>
</person-group>. <article-title>Brain tumor classification based on hybrid optimized multi-features analysis using magnetic resonance imaging dataset</article-title>. <source>Appl Artif Intell</source>. (<year>2022</year>) <volume>36</volume>:<fpage>2031824</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1080/08839514.2022.2031824</pub-id>
</citation></ref>
<ref id="B5">
<label>5</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Alqazzaz</surname> <given-names>S</given-names>
</name>
<name>
<surname>Sun</surname> <given-names>X</given-names>
</name>
<name>
<surname>Yang</surname> <given-names>X</given-names>
</name>
<name>
<surname>Nokes</surname> <given-names>L</given-names>
</name>
</person-group>. <article-title>Automatic brain tumor detection and segmentation using SegNet convolutional neural networks with multi-modal MRI images</article-title>. <source>J Med Imaging Health Inf</source>. (<year>2019</year>) <volume>9</volume>:<page-range>209&#x2013;17</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1007/s41095-019-0139-y</pub-id>
</citation></ref>
<ref id="B6">
<label>6</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kader</surname> <given-names>AE</given-names>
</name>
<name>
<surname>Xu</surname> <given-names>G</given-names>
</name>
<name>
<surname>Shuai</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Saminu</surname> <given-names>S</given-names>
</name>
<name>
<surname>Javaid</surname> <given-names>I</given-names>
</name>
<name>
<surname>Ahmad</surname> <given-names>IS</given-names>
</name>
<etal/>
</person-group>. <article-title>Differential deep convolutional neural network model for brain tumor classification</article-title>. <source>Brain Sci</source>. (<year>2021</year>) <volume>11</volume>:<fpage>352</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/brainsci11030352</pub-id>, PMID: <pub-id pub-id-type="pmid">33801994</pub-id></citation></ref>
<ref id="B7">
<label>7</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Soomro</surname> <given-names>TA</given-names>
</name>
<name>
<surname>Zheng</surname> <given-names>L</given-names>
</name>
<name>
<surname>Afifi</surname> <given-names>AJ</given-names>
</name>
<name>
<surname>Ali</surname> <given-names>A</given-names>
</name>
<name>
<surname>Soomro</surname> <given-names>S</given-names>
</name>
<name>
<surname>Yin</surname> <given-names>M</given-names>
</name>
<etal/>
</person-group>. <article-title>Image segmentation for MR brain tumor detection using machine learning: A review</article-title>. <source>IEEE Rev Biomed Eng</source>. (<year>2022</year>) <volume>16</volume>:<fpage>70</fpage>&#x2013;<lpage>90</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/RBME.2022.3185292</pub-id>, PMID: <pub-id pub-id-type="pmid">35737636</pub-id></citation></ref>
<ref id="B8">
<label>8</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Khan</surname> <given-names>SI</given-names>
</name>
<name>
<surname>Rahman</surname> <given-names>A</given-names>
</name>
<name>
<surname>Debnath</surname> <given-names>T</given-names>
</name>
<name>
<surname>Karim</surname> <given-names>R</given-names>
</name>
<name>
<surname>Nasir</surname> <given-names>MK</given-names>
</name>
<name>
<surname>Band</surname> <given-names>SS</given-names>
</name>
<etal/>
</person-group>. <article-title>Accurate brain tumor detection using deep convolutional neural network</article-title>. <source>Comput Struct Biotechnol J</source>. (<year>2022</year>) <volume>20</volume>:<page-range>4733&#x2013;45</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.csbj.2022.08.039</pub-id>, PMID: <pub-id pub-id-type="pmid">36147663</pub-id></citation></ref>
<ref id="B9">
<label>9</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Abd El Kader</surname> <given-names>I</given-names>
</name>
<name>
<surname>Xu</surname> <given-names>G</given-names>
</name>
<name>
<surname>Shuai</surname> <given-names>Z</given-names>
</name>
<name>
<surname>Saminu</surname> <given-names>S</given-names>
</name>
<name>
<surname>Javaid</surname> <given-names>I</given-names>
</name>
<name>
<surname>Salim Ahmad</surname> <given-names>I</given-names>
</name>
</person-group>. <article-title>Differential deep comparative study</article-title>. <source>Bull Electrical Eng Inf</source>. (<year>2024</year>) <volume>13</volume>:<page-range>350&#x2013;60</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/brainsci11030352</pub-id>, PMID: <pub-id pub-id-type="pmid">33801994</pub-id></citation></ref>
<ref id="B10">
<label>10</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sung</surname> <given-names>H</given-names>
</name>
<name>
<surname>Ferlay</surname> <given-names>J</given-names>
</name>
<name>
<surname>Siegel</surname> <given-names>RL</given-names>
</name>
<name>
<surname>Laversanne</surname> <given-names>M</given-names>
</name>
<name>
<surname>Soerjomataram</surname> <given-names>I</given-names>
</name>
<name>
<surname>Jemal</surname> <given-names>A</given-names>
</name>
<etal/>
</person-group>. <article-title>Global cancer statistics 2020: GLOBOCAN estimates of incidence and mortality worldwide for 36 cancers in 185 countries</article-title>. <source>CA Cancer J Clin</source>. (<year>2021</year>) <volume>71</volume>:<page-range>209&#x2013;49</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.3322/caac.21660</pub-id>, PMID: <pub-id pub-id-type="pmid">33538338</pub-id></citation></ref>
<ref id="B11">
<label>11</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Karthik</surname> <given-names>B</given-names>
</name>
<name>
<surname>Vijayan</surname> <given-names>T</given-names>
</name>
<name>
<surname>Rekha Sharmily</surname> <given-names>R</given-names>
</name>
</person-group>. <article-title>Brain Tumour Detection and Classification using Deep Learning And Transfer Learning Techniques</article-title>. <conf-name>Intelligent Computing and Control for Engineering and Business Systems (ICCEBS)</conf-name>. (<year>2023</year>). doi:&#xa0;<pub-id pub-id-type="doi">10.1109/ICCEBS58601.2023.10449015</pub-id>
</citation></ref>
<ref id="B12">
<label>12</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Rao</surname> <given-names>SA</given-names>
</name>
<name>
<surname>Sridharraj</surname> <given-names>K</given-names>
</name>
<name>
<surname>Venkatesh</surname> <given-names>S</given-names>
</name>
<name>
<surname>Hathiram</surname> <given-names>N</given-names>
</name>
</person-group>. <article-title>Brain Tumor Recognition and Categorization in MRI Images Utilizing Optimal Deep Belief Network</article-title>. <conf-name>IEEE International Conference on Distributed Computing and Electrical Circuits and Electronics (ICDCECE)</conf-name>. (<year>2022</year>). doi:&#xa0;<pub-id pub-id-type="doi">10.1109/ICDCECE53908.2022.9793211</pub-id>
</citation></ref>
<ref id="B13">
<label>13</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Almufareh</surname> <given-names>MF</given-names>
</name>
<name>
<surname>Imran</surname> <given-names>M</given-names>
</name>
<name>
<surname>Khan</surname> <given-names>A</given-names>
</name>
<name>
<surname>Humayun</surname> <given-names>M</given-names>
</name>
<name>
<surname>Asim</surname> <given-names>M</given-names>
</name>
</person-group>. <article-title>Automated brain tumor segmentation and classification in MRI using YOLO-based deep learning</article-title>. <source>IEEE Access</source>. (<year>2024</year>) <volume>12</volume>:<fpage>16189</fpage>&#x2013;<lpage>16207</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/ACCESS.2024.3359418</pub-id>
</citation></ref>
<ref id="B14">
<label>14</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Paul</surname> <given-names>S</given-names>
</name>
<name>
<surname>Ahad</surname> <given-names>T</given-names>
</name>
<name>
<surname>Hasan</surname> <given-names>M</given-names>
</name>
</person-group>. (<year>2022</year>). <article-title>Brain cancer segmentation using YOLOv5 deep neural network</article-title>. pp. <fpage>1</fpage>&#x2013;<lpage>6</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.48550/arXiv.2212.13599</pub-id>
</citation></ref>
<ref id="B15">
<label>15</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Abiwinanda</surname> <given-names>N</given-names>
</name>
<name>
<surname>Hanif</surname> <given-names>M</given-names>
</name>
<name>
<surname>Hesaputra</surname> <given-names>ST</given-names>
</name>
<name>
<surname>Handayani</surname> <given-names>A</given-names>
</name>
<name>
<surname>Mengko</surname> <given-names>TR</given-names>
</name>
</person-group>. <article-title>Brain tumor classification using convolutional neural network</article-title>. <source>World Congress Med Phys Biomed Eng</source>. (<year>2019</year>) <volume>13</volume>:<page-range>183&#x2013;9</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1007/978-981-10-9035-6_33</pub-id>
</citation></ref>
<ref id="B16">
<label>16</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Gore</surname> <given-names>DV</given-names>
</name>
<name>
<surname>Deshpande</surname> <given-names>V</given-names>
</name>
</person-group>. (<year>2020</year>). <article-title>Comparative study of various techniques using deep Learning for brain tumor detection</article-title>, in: <conf-name>Proceedings of the 2020 IEEE International Conference for Emerging Technology (INCET)</conf-name>, <conf-loc>Belgaum, India</conf-loc>. pp. <fpage>1</fpage>&#x2013;<lpage>4</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/INCET49848.2020.9154030</pub-id>
</citation></ref>
<ref id="B17">
<label>17</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Passa</surname> <given-names>RS</given-names>
</name>
<name>
<surname>Nurmaini</surname> <given-names>S</given-names>
</name>
<name>
<surname>Rini</surname> <given-names>DP</given-names>
</name>
</person-group>. <article-title>YOLOv8 based on data augmentation for MRI brain tumor detection</article-title>. <source>Sci J Inf</source>. (<year>2023</year>) <volume>10</volume>:<page-range>363&#x2013;70</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.3837/tiis.2020.12.011</pub-id>
</citation></ref>
<ref id="B18">
<label>18</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>ZainEldin</surname> <given-names>H</given-names>
</name>
<name>
<surname>Gamel</surname> <given-names>SA</given-names>
</name>
<name>
<surname>El-Kenawy</surname> <given-names>E-SM</given-names>
</name>
<name>
<surname>Alharbi</surname> <given-names>AH</given-names>
</name>
<name>
<surname>Khafaga</surname> <given-names>DS</given-names>
</name>
<name>
<surname>Ibrahim</surname> <given-names>A</given-names>
</name>
<etal/>
</person-group>. <article-title>Brain tumor detection and classification using deep learning and sine-cosine fitness grey wolf optimization</article-title>. <source>Bioengineering</source>. (<year>2023</year>) <volume>10</volume>:<fpage>1</fpage>&#x2013;<lpage>19</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/bioengineering10010018</pub-id>, PMID: <pub-id pub-id-type="pmid">36671591</pub-id></citation></ref>
<ref id="B19">
<label>19</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shelatkar</surname> <given-names>T</given-names>
</name>
<name>
<surname>Urvashi</surname> <given-names>D</given-names>
</name>
<name>
<surname>Shorfuzzaman</surname> <given-names>M</given-names>
</name>
<name>
<surname>Alsufyani</surname> <given-names>A</given-names>
</name>
<name>
<surname>Lakshmanna</surname> <given-names>K</given-names>
</name>
</person-group>. <article-title>Diagnosis of brain tumor using light weight deep learning model with fine-tuning approach</article-title>. <source>Comput Math Methods Med</source>. (<year>2022</year>) <page-range>1&#x2013;9</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.1155/2022/2858845</pub-id>, PMID: <pub-id pub-id-type="pmid">35813426</pub-id></citation></ref>
<ref id="B20">
<label>20</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yusof</surname> <given-names>NIM</given-names>
</name>
<name>
<surname>Sophian</surname> <given-names>A</given-names>
</name>
<name>
<surname>Zaki</surname> <given-names>HFM</given-names>
</name>
<name>
<surname>Bawono</surname> <given-names>AA</given-names>
</name>
<name>
<surname>Embong</surname> <given-names>AH</given-names>
</name>
<name>
<surname>Ashraf</surname> <given-names>A</given-names>
</name>
</person-group>. <article-title>Assessing the performance of YOLOv5, YOLOv6, and YOLOv7 in road defect detection and classification: a comparative study</article-title>. <source>Bull Electr Eng Inf</source>. (<year>2023</year>) <volume>13</volume>:<page-range>350&#x2013;60</page-range>. doi:&#xa0;<pub-id pub-id-type="doi">10.11591/eei.v13i1.6317</pub-id>
</citation></ref>
<ref id="B21">
<label>21</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zahoor</surname> <given-names>MM</given-names>
</name>
<name>
<surname>Qureshi</surname> <given-names>SA</given-names>
</name>
<name>
<surname>Bibi</surname> <given-names>S</given-names>
</name>
<name>
<surname>Khan</surname> <given-names>SH</given-names>
</name>
<name>
<surname>Khan</surname> <given-names>A</given-names>
</name>
<name>
<surname>Ghafoor</surname> <given-names>U</given-names>
</name>
<etal/>
</person-group>. <article-title>A new deep hybrid boosted and ensemble learning-based brain tumor analysis using MRI</article-title>. <source>Sensors</source>. (<year>2022</year>) <volume>22</volume>:<fpage>2726</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/s22072726</pub-id>, PMID: <pub-id pub-id-type="pmid">35408340</pub-id></citation></ref>
<ref id="B22">
<label>22</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Montalbo</surname> <given-names>FJP</given-names>
</name>
</person-group>. <article-title>A computer-aided diagnosis of brain tumors using a fine-tuned YOLO-based model with transfer learning</article-title>. <source>KSII Trans Internet Inf Syst (TIIS)</source>. (<year>2021</year>) <volume>14</volume>:<fpage>271</fpage>&#x2013;<lpage>350</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.3837/tiis.2020.12.011</pub-id>
</citation></ref>
<ref id="B23">
<label>23</label>
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Abiwinanda</surname> <given-names>N</given-names>
</name>
<name>
<surname>Hanif</surname> <given-names>M</given-names>
</name>
<name>
<surname>Hesaputra</surname> <given-names>ST</given-names>
</name>
<name>
<surname>Handayani</surname> <given-names>A</given-names>
</name>
<name>
<surname>Mengko</surname> <given-names>TR</given-names>
</name>
</person-group>. <article-title>Brain tumor classification using convolutional neural network</article-title>. In: <source>World congress on medical physics and biomedical engineering</source>. <publisher-name>Springer</publisher-name>, <publisher-loc>Singapore</publisher-loc> (<year>2019</year>). p. <page-range>183&#x2013;9</page-range>.</citation></ref>
<ref id="B24">
<label>24</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Islam</surname> <given-names>S</given-names>
</name>
<name>
<surname>Shaqib</surname> <given-names>SM</given-names>
</name>
<name>
<surname>Ramit</surname> <given-names>SS</given-names>
</name>
<name>
<surname>Khushbu</surname> <given-names>SA</given-names>
</name>
<name>
<surname>Sattar</surname> <given-names>A</given-names>
</name>
<name>
<surname>Noori</surname> <given-names>SRH</given-names>
</name>
</person-group>. <article-title>A deep learning approach to detect complete safety equipment for construction workers based on YOLOv7</article-title>. <source>arXiv:2406.07707v2</source>. (<year>2024</year>). doi:&#xa0;<pub-id pub-id-type="doi">10.48550/arXiv.2406.07707</pub-id>
</citation></ref>
</ref-list>
</back>
</article>