<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Archiving and Interchange DTD v2.3 20070202//EN" "archivearticle.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="systematic-review" dtd-version="2.3" xml:lang="EN">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Artif. Intell.</journal-id>
<journal-title>Frontiers in Artificial Intelligence</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Artif. Intell.</abbrev-journal-title>
<issn pub-type="epub">2624-8212</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/frai.2025.1496580</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Artificial Intelligence</subject>
<subj-group>
<subject>Systematic Review</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>AI in phishing detection: a bibliometric review</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Popescul</surname>
<given-names>Daniela</given-names>
</name>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
<uri xlink:href="https://loop.frontiersin.org/people/1044591/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/conceptualization/"/>
<role content-type="https://credit.niso.org/contributor-roles/investigation/"/>
<role content-type="https://credit.niso.org/contributor-roles/methodology/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Radu</surname>
<given-names>Laura Diana</given-names>
</name>
<uri xlink:href="https://loop.frontiersin.org/people/1166627/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/conceptualization/"/>
<role content-type="https://credit.niso.org/contributor-roles/investigation/"/>
<role content-type="https://credit.niso.org/contributor-roles/methodology/"/>
<role content-type="https://credit.niso.org/contributor-roles/software/"/>
<role content-type="https://credit.niso.org/contributor-roles/validation/"/>
<role content-type="https://credit.niso.org/contributor-roles/visualization/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
</contrib>
</contrib-group>
<aff><institution>Department of Accounting, Business Information Systems and Statistics, Faculty of Economics and Business Administration, &#x201C;Alexandru Ioan Cuza&#x201D; University</institution>, <addr-line>Ia&#x0219;i</addr-line>, <country>Romania</country></aff>
<author-notes>
<fn fn-type="edited-by" id="fn0001">
<p>Edited by: <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1578228/overview">Piotr Sulikowski</ext-link>, West Pomeranian University of Technology, Poland</p>
</fn>
<fn fn-type="edited-by" id="fn0002">
<p>Reviewed by: <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/652891/overview">Evren &#x015E;adi &#x015E;eker</ext-link>, Istanbul University, T&#x00FC;rkiye</p>
<p><ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1803477/overview">Ramesh Subramanian</ext-link>, Quinnipiac University, United States</p>
<p><ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/2779992/overview">Muhammed Basheer Jasser</ext-link>, Sunway University, Malaysia</p>
</fn>
<corresp id="c001">&#x002A;Correspondence: Daniela Popescul, <email>rdaniela@uaic.ro</email></corresp>
</author-notes>
<pub-date pub-type="epub">
<day>23</day>
<month>10</month>
<year>2025</year>
</pub-date>
<pub-date pub-type="collection">
<year>2025</year>
</pub-date>
<volume>8</volume>
<elocation-id>1496580</elocation-id>
<history>
<date date-type="received">
<day>14</day>
<month>09</month>
<year>2024</year>
</date>
<date date-type="accepted">
<day>30</day>
<month>09</month>
<year>2025</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2025 Popescul and Radu.</copyright-statement>
<copyright-year>2025</copyright-year>
<copyright-holder>Popescul and Radu</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<sec id="sec1">
<title>Background</title>
<p>Phishing represents a category of cyber-attacks based on social engineering, with a significant impact on individuals and organizations, and a high capacity for reinvention by adapting its modus operandi according to technological advancements. With a relatively simple scenario and without using sophisticated technologies, phishing attacks exploit users&#x2019; vulnerabilities, convincing them to disclose sensitive personal or organizational data. Within anti-phishing solutions, an important role is played by the detection of spoofed URLs, counterfeit websites, and email or other types of messages that lure users into entering their data in a form. Against this backdrop, artificial intelligence (AI) technologies, particularly Machine Learning (ML), have been successfully employed in phishing detection, with a rich body of literature in this field.</p>
</sec>
<sec id="sec2">
<title>Objective</title>
<p>This study reviews the existing literature on phishing detection using AI. It aims to provide a comprehensive bibliometric analysis that complements existing surveys in the field, focusing on the role of AI in phishing detection.</p>
</sec>
<sec id="sec3">
<title>Methods</title>
<p>A total of 1,096 documents focusing on AI, ML, Deep Learning (DL), or Natural Language Processing (NLP) in phishing detection were extracted from the Web of Science (WoS) scientific database. The information from these documents was subsequently loaded into the Biblioshiny (Bibliometrix package) and VOSviewer software.</p>
</sec>
<sec id="sec4">
<title>Results</title>
<p>The dataset allowed for the identification of publication trends, influential documents and publications, patterns of author collaboration, and key topics of interest within the main author clusters. A thematic analysis of the field highlighted driving themes, niche themes, emerging and declining themes, and basic themes. Furthermore, thematic evolution over time was examined based on authors&#x2019; keywords. A thorough review of the most relevant articles identified through bibliometric analysis was conducted to discuss the primary methods of phishing detection using AI.</p>
</sec>
<sec id="sec5">
<title>Conclusion</title>
<p>The research field of AI in phishing detection has evolved significantly since 2016, with a focus on using ML algorithms to identify phishing websites by extracting discriminative features, and experienced consistent growth in 2024. Recent work emphasizes a shift from classical ML to DL, the importance of feature selection and engineering, and the use of hybrid models and classifier stacking.</p>
</sec>
</abstract>
<kwd-group>
<kwd>phishing</kwd>
<kwd>social engineering</kwd>
<kwd>machine learning</kwd>
<kwd>artificial intelligence</kwd>
<kwd>deep learning</kwd>
<kwd>bibliometric review</kwd>
</kwd-group>
<counts>
<fig-count count="12"/>
<table-count count="9"/>
<equation-count count="0"/>
<ref-count count="94"/>
<page-count count="25"/>
<word-count count="18127"/>
</counts>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-at-acceptance</meta-name>
<meta-value>AI in Business</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="sec6">
<label>1</label>
<title>Introduction</title>
<p>In today&#x2019;s digital age, the success of any organization&#x2014;regardless of its size or sector&#x2014;largely depends on how it understands and manages information. Information is not only a key operational resource, but also a strategic asset that influences decision-making, competitiveness, and organizational performance. When used effectively, information can improve the cost-efficiency of various internal processes and support the achievement of broader organizational goals.</p>
<p>Much like financial, human, or physical resources, information plays a central role in planning, coordination, and control. However, as organizations continue to generate and process increasing volumes of data, the management of this information becomes more complex. Without a clear understanding of the value and role of information, organizations risk exposing themselves to security threats, operational inefficiencies, and strategic blind spots.</p>
<p>An important step in ensuring the security of organizational information is the management of information assets. These assets, whether tangible or intangible, comprise collections of information that are worth protecting (<xref ref-type="bibr" rid="ref28">European Parliament and Council, 2022</xref>). Information assets are managed as distinct units, enabling them to be understood, shared, protected, and utilized effectively (<xref ref-type="bibr" rid="ref89">The National Archives, 2017</xref>). Their value is tied not only to the advantages they bring to the organization, but also to the potential harm caused by loss, alteration, or unauthorized disclosure. Trust from stakeholders and business partners also depends on the integrity and availability of critical information (<xref ref-type="bibr" rid="ref76">Popescul, 2014</xref>). Information, however, is not managed in isolation. Information assets are handled by people across the organization (employees, managers, and contractors), each with varying levels of access, awareness, and technical competence. This diversity introduces a significant variable: human behavior. Individuals may unintentionally become sources of risk through negligence, lack of training, or failure to follow established procedures. In some cases, poor judgment or susceptibility to manipulation can lead to serious breaches. The human factor, therefore, represents both a critical asset and a potential vulnerability.</p>
<p>The security of information assets involves minimizing vulnerabilities and addressing threats that could compromise confidentiality, integrity, or availability. In recent years, threat landscapes have become increasingly dynamic and diverse. Reports from the European Union Agency for Cybersecurity (ENISA) highlight top cybersecurity threats such as ransomware, malware, social engineering, data breaches, distributed denial-of-service (DDoS) attacks, information manipulation, and supply chain attacks. Although, in response to these evolving threats, data security has significantly improved through advanced technical solutions, the human element remains an essential and often vulnerable link in the information lifecycle, as knowledge and information are frequently handled, processed, or interpreted by people rather than machines. In effect, protecting organizational information is as much a human challenge as it is a technical one (<xref ref-type="bibr" rid="ref70">Oprea, 2007</xref>). Social engineering exploits this human factor by manipulating individuals to disclose sensitive information, making it a particularly dangerous and evolving threat in the digital era.</p>
<p>In cybersecurity, the term <bold>social engineering</bold> refers to a wide range of activities that attempt to exploit user behavior in order to gain access to information or services that the attacker is not authorized to use. The users are lured &#x201C;into opening documents, files or emails, visiting websites or granting unauthorized persons access to systems or services&#x201D; (<xref ref-type="bibr" rid="ref27">ENISA, 2023b</xref>). Within social engineering, <bold>phishing</bold> is a form of criminal activity in which the attacker obtains sensitive data, such as login credentials for banking applications or e-commerce platforms, credit card information, bank account details, Personal Identification Numbers (PINs), as well as other personal and/or confidential information, by using techniques to manipulate the identity of a person or organization (<xref ref-type="bibr" rid="ref97">Xiang et al., 2011</xref>; <xref ref-type="bibr" rid="ref2">Adebowale et al., 2019</xref>; <xref ref-type="bibr" rid="ref6">Alsariera et al., 2020</xref>; <xref ref-type="bibr" rid="ref17">Capuano et al., 2022</xref>). In <xref ref-type="bibr" rid="ref13">Basit et al. (2021)</xref>, phishing websites are considered frequent gateways for online social engineering attacks. Phishing is particularly dangerous because it has a direct impact on the physical world (<xref ref-type="bibr" rid="ref6">Alsariera et al., 2020</xref>), with consequences such as drained bank accounts, compromised security systems, or even threats to personal safety. By deceiving individuals into sharing private information, phishing blurs the line between the digital and physical realms, making its effects far-reaching and tangible. In phishing, attackers often use psychological manipulation techniques. 
For example, an individual whose profile has been accurately determined through data collection from various social media platforms is sent an unusual request, outside the norms of internal procedures, as if coming from an official or superior with whom the targeted person does not usually have direct contact. The victims are led to believe that the matter is urgent; they are flattered, promised rewards, and asked to maintain confidentiality, and are ultimately guided into performing actions they would not normally take. Karim et al. also state that by using &#x201C;social engineering tricks&#x201D; the message can deceive the recipient into acting in the attacker&#x2019;s favor, even without the need for malicious links or attachments sent digitally (<xref ref-type="bibr" rid="ref48">Karim et al., 2019</xref>).</p>
<p>The trend of &#x201C;escalating frequency, severity, and impact&#x201D; associated with phishing, noted by <xref ref-type="bibr" rid="ref2">Adebowale et al. (2019)</xref>, has continued in the following years, with the diversification of methods targeting victims and the increasing quality of attacks. According to ENISA, Europol and the FBI report that phishing and social engineering remain the main vectors for payment fraud, growing over time both in volume and sophistication (<xref ref-type="bibr" rid="ref26">ENISA, 2023a</xref>). Beyond the costs incurred by organizations and individuals in managing it, phishing is used alongside identity theft (<xref ref-type="bibr" rid="ref31">Gangavarapu et al., 2020</xref>) and ransomware. Currently, healthcare is one of the sectors most targeted by phishing (<xref ref-type="bibr" rid="ref85">Sharma et al., 2023</xref>). In a specialized ENISA report on the health sector, the scenario for an initial attack is described as starting with a phishing campaign, followed by a ransomware attack with negative effects on patient data (<xref ref-type="bibr" rid="ref27">ENISA, 2023b</xref>).</p>
<p>The forms of &#x201C;bait&#x201D; for users have also evolved over time. Alongside technological advancements, email messages with numerous spelling mistakes were joined by SMS messages, social media posts, voice calls made by humans and later by synthetic voices, deepfake videos, QR codes, and tampered mobile apps. Depending on how the user is deceived, there are various types of phishing. The &#x201C;classic&#x201D; <italic>phishing</italic> variant involves creating a website as a replica of a legitimate one and luring the victim to access it through email messages that contain a hyperlink with a URL similar to that of the original site. In the case of <italic>pharming</italic>, the victim is automatically redirected to the duplicate website, directly through DNS manipulation or execution of malicious code, making further &#x201C;deception&#x201D; unnecessary. <italic>Smishing</italic> refers to the type of phishing in which victims&#x2019; financial or personal information is collected through SMS messages. <italic>Vishing</italic> is a combination of phishing and voice, where information is provided over the phone by victims deceived through social engineering techniques. In <italic>quishing</italic>, the victim scans a QR code that leads to a malicious site or file. <italic>Covert redirect</italic> is a type of phishing attack that exploits vulnerabilities in third-party authentication systems to redirect users to malicious websites without their knowledge. Unlike traditional phishing attacks, covert redirect does not require users to enter their credentials directly into a fake login page; instead, it tricks them into granting permissions to a malicious app or site, which then gains unauthorized access to their data or accounts. The attack is difficult to detect because it appears to be part of the legitimate authentication process. 
<italic>Clone phishing</italic> is a type of phishing attack where the attacker copies or &#x201C;clones&#x201D; a legitimate, previously sent email, typically one that contains a link or attachment. The attacker then alters the email by replacing the original link or attachment with a malicious version and sends it from an email address that appears to be from the original sender. Since the recipient is already familiar with the email content, they are more likely to trust and click the malicious link or open the attachment, leading to credential theft or malware installation.</p>
<p>The accuracy with which victims are targeted has also increased over time. <italic>Spear-phishing</italic> is a more sophisticated version of phishing that targets specific organizations or individuals, about whom the attackers gather information in advance. This type of phishing often evades automatic anti-phishing filters, as the approach, appearance, and content of the messages are much more personalized. Spear-phishing can be used to generate Advanced Persistent Threats (<xref ref-type="bibr" rid="ref48">Karim et al., 2019</xref>). <italic>Whaling</italic> is a sub-type of spear-phishing that targets senior executives with high-level access by impersonating a trusted entity, such as the company&#x2019;s CEO or a legitimate business partner. These attacks often present an urgent issue affecting the entire company or a critical customer complaint, pressuring the executive to act quickly (<xref ref-type="bibr" rid="ref48">Karim et al., 2019</xref>). <italic>Social phishing</italic> and <italic>context-aware phishing</italic> are two techniques that use publicly available personal information to make the attacks more effective (<xref ref-type="bibr" rid="ref22">De La Torre Parra et al., 2020</xref>).</p>
</sec>
<sec id="sec7">
<label>2</label>
<title>Related works</title>
<p>As presented above, phishing attacks have evolved alongside technological advancements. Initially targeting computers, these attacks have progressively shifted toward mobile devices and IoT systems, leveraging social media, e-commerce platforms, and other online environments that attract large user populations. In <xref ref-type="bibr" rid="ref24">Dwivedi et al. (2023)</xref>, it is shown that criminals can exploit even the metaverse for phishing attacks by creating fake versions of real-world brands, tricking users into sharing personal information or sending cryptocurrency to counterfeit entities. In the ENISA report published in 2023, it is stated that current innovations in social engineering are primarily driven by AI, especially considering the release of ChatGPT during the reporting period. AI is used to create more convincing phishing emails and messages that closely mimic legitimate sources, while deepfakes are mainly employed for voice cloning. Deepfakes target the integrity and availability of data, introducing substantial risks to decisions based entirely on unverified data. For instance, a deepfake voice call led to a fraudulent bank transfer of nearly $35 million (<xref ref-type="bibr" rid="ref25">ENISA, 2022</xref>). In <xref ref-type="bibr" rid="ref37">Gupta et al. (2023)</xref>, the authors describe how ChatGPT can be used to generate messages for spear-phishing. Its ability to learn communication patterns associated with, for example, a specific website or individual increases the likelihood that the generated messages will be credible and convincing, leading to the attacker obtaining the desired information in response.</p>
<p>In combating phishing, two major categories of solutions can be identified. The first is educating users to increase their awareness of the value of information, both as an intangible asset of their employer and as their own personal, financial, and medical data, and to enhance their understanding of how internet technologies work. The second is implementing technical solutions, such as anti-phishing browser plug-ins or toolbars, anti-malware software, visual similarity/content-based filtering, blacklist/whitelist-based methods, heuristics, Machine Learning (ML) and hybrid approaches (<xref ref-type="bibr" rid="ref81">Sahingoz et al., 2019</xref>; <xref ref-type="bibr" rid="ref2">Adebowale et al., 2019</xref>; <xref ref-type="bibr" rid="ref4">Ali and Ahmed, 2019</xref>; <xref ref-type="bibr" rid="ref31">Gangavarapu et al., 2020</xref>). Traditional technical solutions reflect a common trend in cybersecurity: attackers frequently exhibit greater innovation and technological expertise compared to their victims, as well as compared to law enforcement, researchers, and other professionals. For example, in <xref ref-type="bibr" rid="ref4">Ali and Ahmed (2019)</xref>, the authors present the inefficiency of blacklist-based methods, as these methods are outpaced by the speed, often measured in seconds, at which attackers create new websites. Also, list-based detection mechanisms involve frequent updates of URLs/IPs and significant system resources (<xref ref-type="bibr" rid="ref81">Sahingoz et al., 2019</xref>). The great variety of phishing forms makes detection difficult.</p>
<p>As in many other fields, AI has begun to be utilized in cybersecurity, with the combination of human efforts and AI applications being considered by some authors as the only solution to address the escalation of attacks (<xref ref-type="bibr" rid="ref17">Capuano et al., 2022</xref>). As <xref ref-type="bibr" rid="ref35">Grover et al. (2023)</xref> highlight, the rise in manipulation attacks&#x2014;enabled by advances in generating deceptive texts, images, audio, and even video (e.g., deepfakes)&#x2014;requires equally advanced, automated detection methods. The increasing availability of datasets related to such attacks, along with improvements in computational capabilities, has accelerated the development of AI-based solutions for identifying and responding to phishing attempts more effectively and at scale. Security professionals are now considering AI not just as a technological trend, but as a necessary component in building resilient, adaptive defense systems in the face of a rapidly evolving threat landscape. In the greater AI sphere, ML is defined as the development of algorithms and statistical models that enable computers to perform tasks without explicit instructions, Deep Learning (DL) is a subset of ML that uses neural networks with many layers (known as deep neural networks) to model complex patterns in data, and Natural Language Processing (NLP) is the field in which machines are able to understand, interpret, and generate human language in a meaningful way (<xref ref-type="bibr" rid="ref83">Santos et al., 2024</xref>). AI can identify spam, phishing, spear-phishing, and various other types of attacks by leveraging prior knowledge from datasets (<xref ref-type="bibr" rid="ref13">Basit et al., 2021</xref>). 
Solutions based on AI have proven to be highly promising (<xref ref-type="bibr" rid="ref48">Karim et al., 2019</xref>; <xref ref-type="bibr" rid="ref6">Alsariera et al., 2020</xref>; <xref ref-type="bibr" rid="ref31">Gangavarapu et al., 2020</xref>), but not infallible. Among their limitations, <xref ref-type="bibr" rid="ref6">Alsariera et al. (2020)</xref> mention the &#x201C;high false alarm rate, low detection rate, and the inability of single classifiers and some hybridized methods to produce highly effective and efficient phishing website detection solutions.&#x201D; In contrast to opaque, black-box solutions, eXplainable AI (XAI) applications have been employed. Clarifying why a message is flagged as phishing is highly valuable: XAI in this area helps people recognize and avoid an ever-present threat (<xref ref-type="bibr" rid="ref17">Capuano et al., 2022</xref>).</p>
<p>In response to these evolving threats and the increasing interest in AI-based security solutions, a growing body of research has emerged focusing on the use of ML and AI to detect phishing attacks. However, despite this surge in publications, there is a lack of comprehensive overviews that map the structure, development, and key contributions within this research domain. Our study therefore fills this gap by systematically analyzing the field through bibliometric techniques.</p>
<p>Previous bibliometric analyses and literature reviews on AI for phishing detection have yielded significant insights into the field. To contextualize the contribution of our research, we present the most relevant papers and their impact on advancing knowledge in this domain in <xref ref-type="table" rid="tab1">Table 1</xref>.</p>
<table-wrap position="float" id="tab1">
<label>Table 1</label>
<caption>
<p>Previous findings in the field.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Study</th>
<th align="left" valign="top">Aim</th>
<th align="left" valign="top">Methods</th>
<th align="left" valign="top">Data</th>
<th align="left" valign="top">Main results</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top"><italic>Email classification research trends: review and open issues</italic><break/>(<xref ref-type="bibr" rid="ref64">Mujtaba et al., 2017</xref>)</td>
<td align="left" valign="top">Review of e-mail classification methods from 2006 to 2016, analyzing five aspects: application areas, datasets, feature spaces, classification techniques, and performance measures</td>
<td align="left" valign="top">Comprehensive review and analysis</td>
<td align="left" valign="top">98 articles (56 articles from Web of Science core collection databases and 42 articles from Scopus database)</td>
<td align="left" valign="top">The authors identify five techniques&#x2014;supervised, semi-supervised, unsupervised, content-based, and statistical learning&#x2014;with supervised ML being the most common and Support Vector Machine showing the best performance, followed by Decision Trees and Naive Bayes</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A recent review of conventional vs. automated cybersecurity anti-phishing techniques</italic><break/>(<xref ref-type="bibr" rid="ref77">Qabajeh et al., 2018</xref>)</td>
<td align="left" valign="top">Examination of the effectiveness of traditional anti-phishing approaches, such as awareness campaigns, user education, and periodic training sessions, in comparison to computerized anti-phishing techniques</td>
<td align="left" valign="top">Classification of anti-phishing approaches in the analyzed literature into 3 main categories: &#x201C;education and legal, computerized using human-crafted methods, and intelligent ML methods&#x201D;</td>
<td align="left" valign="top">75 studies</td>
<td align="left" valign="top">ML and rule induction are particularly effective in phishing prevention, offering high detection accuracy and easily interpretable results. The tendency to use ML and DL algorithms for website classification to identify phishing sites was considered promising by the authors for reasons of cost and accuracy</td>
</tr>
<tr>
<td align="left" valign="top"><italic>Evaluation of phishing techniques based on machine learning</italic><break/>(<xref ref-type="bibr" rid="ref52">Kunju et al., 2019</xref>)</td>
<td align="left" valign="top">Survey of phishing attacks and their detection methods, with the intention to raise user awareness about the associated risks, and present various machine learning techniques (kNN, Na&#x00EF;ve Bayes, Decision Tree, SVM, Neural Network, Random Forest) used for predicting and preventing phishing websites</td>
<td align="left" valign="top">Overview of ML algorithms for detecting phishing websites, including k-Nearest Neighbors (kNN), Na&#x00EF;ve Bayes, Decision Trees, Support Vector Machines (SVM), Neural Networks, and Random Forest</td>
<td align="left" valign="top">14 studies</td>
<td align="left" valign="top">The necessity of employing multiple techniques to enhance phishing detection effectiveness is highlighted</td>
</tr>
<tr>
<td align="left" valign="top"><italic>Toward the detection of phishing attacks</italic><break/>(<xref ref-type="bibr" rid="ref10">Athulya and Praveen, 2020</xref>)</td>
<td align="left" valign="top">The paper aims to raise user awareness about phishing strategies and present a hybrid detection method that offers fast response time and high accuracy</td>
<td align="left" valign="top">Review of various phishing attacks, evasion techniques, and anti-phishing approaches</td>
<td align="left" valign="top">9 research articles</td>
<td align="left" valign="top">The most effective approach to mitigating phishing attacks is raising user awareness and selecting the most appropriate anti-phishing security software</td>
</tr>
<tr>
<td align="left" valign="top"><italic>Toward a systematic description of the field using bibliometric analysis: Malware evolution</italic><break/>(<xref ref-type="bibr" rid="ref58">Mat et al., 2021</xref>)</td>
<td align="left" valign="top">A bibliometric analysis of a decade of evolution in malware research, considering it as an umbrella term for all malicious software, with a specific focus on Android malware due to a significant rise in occurrences in 2019</td>
<td align="left" valign="top">Bibliometric review</td>
<td align="left" valign="top">1,278 articles</td>
<td align="left" valign="top">The article does not explicitly address phishing</td>
</tr>
<tr>
<td align="left" valign="top"><italic>Applications of deep learning for phishing detection: A systematic literature review</italic><break/>(<xref ref-type="bibr" rid="ref18">Catal et al., 2022</xref>)</td>
<td align="left" valign="top">Analysis of the use of DL for phishing detection</td>
<td align="left" valign="top">Systematic literature review</td>
<td align="left" valign="top">43 studies</td>
<td align="left" valign="top">The most commonly used algorithm is the Deep Neural Network (DNN), followed by Convolutional Neural Networks (CNN) and Recurrent Neural Networks (RNN)/Long Short-Term Memory Networks (LSTM). The study also indicates that DNN and Hybrid DL algorithms achieved the best performance in phishing detection</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A bibliometric analysis of phishing in the Big Data era: high focus on algorithms and low focus on people</italic><break/>(<xref ref-type="bibr" rid="ref73">Peji&#x0107;-Bach et al., 2023</xref>)</td>
<td align="left" valign="top">A co-occurrence analysis using VOSviewer on a set of 136 articles focused on big data and phishing</td>
<td align="left" valign="top">Bibliometric review</td>
<td align="left" valign="top">NA (WoS database)</td>
<td align="left" valign="top">Predominantly technical research (computer science, engineering, telecommunications); big data ML cluster emphasizes ML/DL benefits for real-time anti-phishing; approaches include models, voting frameworks, consensus clustering, URL analysis; Gray Wolf Optimizer outperforms other algorithms via feature analysis (e.g., URL length, HTTP response)</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A systematic literature review on phishing website detection techniques</italic><break/>(<xref ref-type="bibr" rid="ref80">Safi and Singh, 2023</xref>)</td>
<td align="left" valign="top">An update of previous systematic literature surveys, with a greater focus on the latest trends in phishing detection techniques</td>
<td align="left" valign="top">Systematic literature review</td>
<td align="left" valign="top">80 scientific papers published between 2017 and 2021</td>
<td align="left" valign="top">ML techniques dominated phishing detection (71.25%), followed by heuristic (66.25%), visual similarity (43.75%), DL-based (17.5%), and list-based methods (12.5%); PhishTank was the main data source, while Random Forest, SVM, and Decision Tree were the most used ML algorithms, with CNN achieving the highest accuracy (99.98%)</td>
</tr>
<tr>
<td align="left" valign="top"><italic>Mapping the phishing attacks research landscape: a bibliometric analysis and taxonomy</italic><break/>(<xref ref-type="bibr" rid="ref65">Mutluturk and Metin, 2023</xref>)</td>
<td align="left" valign="top">A holistic approach to the topic, presenting an in-depth analysis of phishing research from 2004 to 2023 and emphasizing the field&#x2019;s steady growth, emerging trends, and collaborative networks</td>
<td align="left" valign="top">Bibliometric review</td>
<td align="left" valign="top">3,139 phishing-related articles indexed in the Web of Science database</td>
<td align="left" valign="top">ML-based techniques play a central role in phishing research, with CANTINA+ (<xref ref-type="bibr" rid="ref97">Xiang et al., 2011</xref>) ranking 3rd among the Top 10 Most Cited Publications (2004&#x2013;2013), after studies by <xref ref-type="bibr" rid="ref43">Jagatic et al. (2007)</xref> and one on the economics of information security (<italic>Science</italic>). In 2014&#x2013;2023, <xref ref-type="bibr" rid="ref81">Sahingoz et al. (2019)</xref> ranks 2nd, followed by <xref ref-type="bibr" rid="ref19">Chiew et al. (2019)</xref> and <xref ref-type="bibr" rid="ref56">Mahdavifar and Ghorbani (2019)</xref>, highlighting the growing impact of AI-related approaches (<xref ref-type="bibr" rid="ref65">Mutluturk and Metin, 2023</xref>). Zhang and Xiang emerge as key co-citation nodes, while Chiew leads another cluster; Cranor and Hong are the most cited authors. Among keywords, <italic>Machine Learning</italic> ranks 3rd and <italic>Neural Networks</italic> 12th</td>
</tr>
<tr>
<td align="left" valign="top"><italic>Enhancing spear phishing defense with AI: a comprehensive review and future directions</italic><break/>(<xref ref-type="bibr" rid="ref60">Mohamed et al., 2024</xref>)</td>
<td align="left" valign="top">A critical analysis of AI techniques, including ML, NLP, and behavioral analytics, for mitigating spear phishing attacks</td>
<td align="left" valign="top">Comprehensive review</td>
<td align="left" valign="top">30 seminal papers</td>
<td align="left" valign="top">ML models are effective for pattern recognition but require extensive training data, whereas NLP techniques enhance contextual and semantic understanding, improving detection of sophisticated phishing attempts</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The usefulness of applying ML and DL techniques in phishing detection was recognized as early as 2017&#x2013;2018, when several review studies were published to analyze their role in preventing email (<xref ref-type="bibr" rid="ref64">Mujtaba et al., 2017</xref>) and website phishing (<xref ref-type="bibr" rid="ref77">Qabajeh et al., 2018</xref>). In the following years, available algorithms were compared by various authors (<xref ref-type="bibr" rid="ref52">Kunju et al., 2019</xref>; <xref ref-type="bibr" rid="ref10">Athulya and Praveen, 2020</xref>), basing their analyses on a relatively small number of articles. Previous bibliometric studies have approached broader areas, such as malware in general (<xref ref-type="bibr" rid="ref58">Mat et al., 2021</xref>), phishing in general (<xref ref-type="bibr" rid="ref65">Mutluturk and Metin, 2023</xref>), and the relationship between phishing and Big Data (<xref ref-type="bibr" rid="ref73">Peji&#x0107;-Bach et al., 2023</xref>), without focusing on the use of AI in phishing detection.</p>
<p>Against this background, the present work has the following main objectives:</p>
<list list-type="bullet">
<list-item>
<p>To analyze publication trends, influential documents, and leading sources within the field of AI-driven phishing detection;</p>
</list-item>
<list-item>
<p>To identify collaboration patterns and author clusters, thereby uncovering the structure of the research community;</p>
</list-item>
<list-item>
<p>To perform a thematic mapping and evolution analysis using authors&#x2019; keywords to detect driving, emerging, declining, niche, and foundational themes over time;</p>
</list-item>
<list-item>
<p>To provide a critical discussion of the most relevant articles, offering insights into the primary AI-based techniques employed for phishing detection;</p>
</list-item>
<list-item>
<p>To identify and propose pathways for integrating AI-powered phishing detection into organizations through Security Information and Event Management platforms, endpoint protection, secure email gateways, and cloud-based defense systems.</p>
</list-item>
</list>
<p>By addressing these objectives, the study aims to offer researchers, practitioners, and policymakers a clearer understanding of the field&#x2019;s intellectual landscape, key developments, and future directions.</p>
<p>The remainder of this paper is organized as follows. Section 3 outlines the research methodology, including data sources, tools, and bibliometric techniques. Section 4 presents the results of the analysis, covering publication trends, key documents and sources (4.1), collaboration networks and co-citation patterns (4.2), and the thematic development of the field (4.3). Section 5 discusses the main findings, with a focus on the AI techniques used in phishing detection. Section 6 concludes the paper by summarizing key contributions and suggesting directions for future research.</p>
</sec>
<sec id="sec8">
<label>3</label>
<title>Research methodology</title>
<p>To conduct the study, we considered bibliometric analysis to be the most appropriate method. According to <xref ref-type="bibr" rid="ref23">Donthu et al. (2021)</xref>, this approach condenses a vast amount of bibliometric data to illustrate the current intellectual landscape and highlight emerging trends within a specific topic or field. It is particularly suitable when the scope of review is broad, and the data set is too extensive for manual review. In recent years, a significant number of researchers have utilized knowledge-mapping tools to examine developmental trends and the evolution of various disciplines and research fields. Due to the availability of advanced computational tools, this process has become far more accessible. These tools enable the statistical and quantitative analysis of a large number of publications and academic articles, facilitating the generation of descriptive statistics, the creation of keyword networks, and the establishment of connections between articles, publications, citations, authors, institutions, and countries (<xref ref-type="bibr" rid="ref33">G&#x00F3;mez-Caicedo et al., 2022</xref>).</p>
<p>The aim of this paper is to identify, evaluate, and synthesize relevant studies on the use of AI in mitigating phishing. To understand the landscape of the field, the paper seeks to answer the following research questions:</p>
<disp-quote>
<p><italic>RQ1</italic>. How has the publication landscape in AI for phishing detection research evolved over time?</p>
</disp-quote>
<disp-quote>
<p><italic>RQ2</italic>. What are the core thematic clusters within the AI-based phishing detection research field, and how do these clusters interact and evolve over time?</p>
</disp-quote>
<disp-quote>
<p><italic>RQ3</italic>. How are the AI technologies identified in the study utilized within organizations?</p>
</disp-quote>
<p>The research approach is structured and follows the PRISMA Reporting Guidelines for systematic reviews (<xref ref-type="bibr" rid="ref62">Moher et al., 2009</xref>), with the aim of ensuring a rigorous evaluation of the literature published in the field. A comprehensive literature search was conducted using the Web of Science (WoS) database, chosen for its multidisciplinary nature and reputation.</p>
<p>The search phrase included the following terms: &#x201C;artificial intelligence,&#x201D; &#x201C;AI,&#x201D; &#x201C;natural language processing,&#x201D; &#x201C;machine learning,&#x201D; &#x201C;deep learning,&#x201D; &#x201C;phishing,&#x201D; and &#x201C;detection.&#x201D; The search was restricted to articles written in English, with no limitation on the time frame. The document types included in the analysis were journal articles, conference proceedings, and book chapters. Using the <italic>Refine results</italic> option on the platform, we excluded documents from the following categories: early access papers, review articles, retracted publications, and data papers. Review articles were excluded to avoid duplicating information: they summarize the findings of original articles, so their inclusion could distort the bibliometric analysis. Early access articles were excluded because they have not yet been formally assigned to a specific volume or issue and may be modified before final publication. Including them could introduce inconsistencies, as metadata such as the number of pages, citations, and affiliations may change. Furthermore, early access papers may occasionally appear as duplicates in databases, listed both as early access and as the final published article. Because the dataset was checked for duplicates, this issue does not affect the results of the research; however, we consider it worth mentioning. The search criteria are detailed in <xref ref-type="table" rid="tab2">Table 2</xref>.</p>
<table-wrap position="float" id="tab2">
<label>Table 2</label>
<caption>
<p>Search criteria used to extract scientific articles from the Web of Science database.</p>
</caption>
<table frame="hsides" rules="groups">
<tbody>
<tr>
<td align="left" valign="top">Keywords</td>
<td align="left" valign="top">((ALL&#x202F;=&#x202F;((&#x201C;artificial intelligence&#x201D; OR &#x201C;AI&#x201D; OR &#x201C;natural language processing&#x201D; OR &#x201C;machine learning&#x201D; OR &#x201C;deep learning&#x201D;) AND phishing AND detection)))</td>
</tr>
<tr>
<td align="left" valign="top">Database</td>
<td align="left" valign="top">Web of Science</td>
</tr>
<tr>
<td align="left" valign="top">Exclusion criteria</td>
<td align="left" valign="top">Early Access or Data Paper or Retracted Publication or Review Article or Editorial Material (Exclude &#x2013; Document Types) and Turkish and Spanish (Exclude &#x2013; Languages)</td>
</tr>
<tr>
<td align="left" valign="top">Period</td>
<td align="left" valign="top">Unrestricted</td>
</tr>
<tr>
<td align="left" valign="top">Language</td>
<td align="left" valign="top">English</td>
</tr>
<tr>
<td align="left" valign="top">Search date</td>
<td align="left" valign="top">27 August 2025</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The search yielded 1,096 documents. The relationships among research area, author(s), citations, and journal impact were analyzed. Due to the export limitations of WoS, the export was conducted in three stages: first for records 1&#x2013;500, then for records 501&#x2013;1,000, and finally for records 1,001&#x2013;1,096. The exported record content included full records and cited references, and the selected export format was plain text. We manually addressed incomplete or missing data and standardized the names of authors, journals, conferences, and publishers where they were inconsistent. No duplicates were identified. All records were then merged into a single file and imported into the Biblioshiny (Bibliometrix package) and VOSviewer software packages. <xref ref-type="fig" rid="fig1">Figure 1</xref> summarizes the study review protocol.</p>
<fig position="float" id="fig1">
<label>Figure 1</label>
<caption>
<p>PRISMA 2009 flow diagram (Source: adapted from <xref ref-type="bibr" rid="ref62">Moher et al., 2009</xref>).</p>
</caption>
<graphic xlink:href="frai-08-1496580-g001.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Flowchart depicting the process of selecting studies for qualitative synthesis. Identification: 1096 records were found in the Web of Science database. Screening: all 1096 records were screened with none excluded for irrelevance or duplication. Assessment: all 1096 records were assessed for eligibility with none excluded. Inclusion: all 1096 studies were included in the qualitative synthesis.</alt-text>
</graphic>
</fig>
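<p>The staged export of the 1,096 records can be sketched as a simple range calculation. This is only an illustration: the batch size of 500 records is an assumption inferred from the three export stages reported in this section, and the function name <monospace>export_batches</monospace> is hypothetical, not part of WoS, Bibliometrix, or VOSviewer.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
def export_batches(total_records, batch_size=500):
    """Return inclusive (first, last) record ranges covering 1..total_records.

    batch_size=500 is an assumed per-export limit, inferred from the
    three stages described in the text; adjust it if the limit differs.
    """
    if total_records < 0 or batch_size <= 0:
        raise ValueError("total_records must be >= 0 and batch_size > 0")
    batches = []
    first = 1
    while first <= total_records:
        # The last batch may be shorter than batch_size.
        last = min(first + batch_size - 1, total_records)
        batches.append((first, last))
        first = last + 1
    return batches


if __name__ == "__main__":
    for first, last in export_batches(1096):
        print(f"export records {first}-{last}")
```
]]></preformat>
<p>For 1,096 records and the assumed limit of 500, the sketch yields the same three ranges used in the study; the resulting plain-text files would then be merged before import.</p>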
<p>Biblioshiny and VOSviewer are two widely used tools for the bibliometric analysis of scientific output, each with its own strengths: Biblioshiny is more effective in visualizing trends, generating clearer and more easily interpretable graphics, while VOSviewer offers greater precision in clustering algorithms (<xref ref-type="bibr" rid="ref47">Jia et al., 2022</xref>). They enable the creation of various networks, such as co-authorship, co-citation, and keyword co-occurrence, as well as the identification of the most influential publications and research. Using this information, it is possible to analyse the evolution of themes related to the use of AI in phishing detection, explore the connections between discussed topics, and identify emerging trends and areas for further development in the field.</p>
<p>The Bibliometrix package offers the necessary options for quantitative analysis of articles within a dataset, proving to be particularly useful when dealing with large volumes of data, where a comprehensive analysis would be impossible or, at the very least, highly challenging to perform. The package enables a wide range of analyses, such as co-citation analysis, examination of collaborations among authors, institutions, and countries, and the exploration of relationships between various keywords declared by authors or identified by algorithms implemented in bibliographic databases, among others.</p>
<p>The VOSviewer software includes advanced techniques for network layout and clustering and provides functionalities for analyzing author collaborations, co-occurrence, citations, co-citations, bibliographic coupling, and, notably, concepts that are used together. The application employs NLP to create term co-occurrence networks, automatically distinguishing between relevant and irrelevant concepts.</p>
<p>In this research, Biblioshiny (Bibliometrix package) was used to analyze the main data, identify the top ten most influential authors and journals, and determine the primary research directions and their evolution, while VOSviewer was used to examine author collaborations and co-citations. The descriptive insights generated through Biblioshiny were combined with the network visualizations of VOSviewer to give a more complete picture of the field.</p>
<p><xref ref-type="table" rid="tab3">Table 3</xref> presents the main information about the dataset, as generated by Biblioshiny from the records extracted from the WoS database. The dataset includes 621 articles, 4 book chapters, and 471 conference proceedings, drawn from 644 sources and published between 2005 and 2025 by 3,327 authors. Out of the 1,096 documents, only 38 are single-authored (~3.47%), with the remainder being collaborative works. The average number of authors per document is 3.84. The documents in the dataset received a total of 13,435 citations, an average of 12.26 citations per document. Additionally, 57% of the citations were concentrated on 77 articles, representing approximately 7% of the documents analyzed. This indicates that a small proportion of articles have had a significant impact on the field. The total number of references in the dataset is 25,231. The annual growth rate during the examined period was 27.51%, with the growth particularly concentrated in recent years.</p>
<table-wrap position="float" id="tab3">
<label>Table 3</label>
<caption>
<p>Main information about the dataset.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Description</th>
<th align="center" valign="top">Results</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top" colspan="2">Main information about data</td>
</tr>
<tr>
<td align="left" valign="top">Timespan</td>
<td align="center" valign="top">2005:2025</td>
</tr>
<tr>
<td align="left" valign="top">Sources (journals, books, etc.)</td>
<td align="center" valign="top">644</td>
</tr>
<tr>
<td align="left" valign="top">Documents</td>
<td align="center" valign="top">1,096</td>
</tr>
<tr>
<td align="left" valign="top">Annual growth rate %</td>
<td align="center" valign="top">27.51</td>
</tr>
<tr>
<td align="left" valign="top">Document average age</td>
<td align="center" valign="top">3.45</td>
</tr>
<tr>
<td align="left" valign="top">Average citations per doc</td>
<td align="center" valign="top">12.26</td>
</tr>
<tr>
<td align="left" valign="top">References</td>
<td align="center" valign="top">25,231</td>
</tr>
<tr>
<td align="left" valign="top" colspan="2">Document contents</td>
</tr>
<tr>
<td align="left" valign="top">Keywords plus (ID)</td>
<td align="center" valign="top">296</td>
</tr>
<tr>
<td align="left" valign="top">Author&#x2019;s keywords (DE)</td>
<td align="center" valign="top">2,378</td>
</tr>
<tr>
<td align="left" valign="top" colspan="2">Authors</td>
</tr>
<tr>
<td align="left" valign="top">Authors</td>
<td align="center" valign="top">3,327</td>
</tr>
<tr>
<td align="left" valign="top">Authors of single-authored docs</td>
<td align="center" valign="top">37</td>
</tr>
<tr>
<td align="left" valign="top" colspan="2">Authors collaboration</td>
</tr>
<tr>
<td align="left" valign="top">Single-authored docs</td>
<td align="center" valign="top">38</td>
</tr>
<tr>
<td align="left" valign="top">Co-authors per doc</td>
<td align="center" valign="top">3.84</td>
</tr>
<tr>
<td align="left" valign="top">International co-authorships %</td>
<td align="center" valign="top">28.1</td>
</tr>
<tr>
<td align="left" valign="top" colspan="2">Document types</td>
</tr>
<tr>
<td align="left" valign="top">Article</td>
<td align="center" valign="top">621</td>
</tr>
<tr>
<td align="left" valign="top">Book chapter</td>
<td align="center" valign="top">4</td>
</tr>
<tr>
<td align="left" valign="top">Proceedings paper</td>
<td align="center" valign="top">471</td>
</tr>
</tbody>
</table>
</table-wrap>
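<p>As a quick arithmetic check of two of the descriptive figures reported above (a reader&#x2019;s sketch based only on the stated totals, not Biblioshiny output), the share of single-authored documents and the average number of citations per document follow directly from the counts of 38 single-authored documents, 13,435 citations, and 1,096 documents:</p>
<disp-formula><tex-math><![CDATA[
\frac{38}{1096} \times 100 \approx 3.47\%, \qquad \frac{13435}{1096} \approx 12.26
]]></tex-math></disp-formula>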
<p>To achieve the aim of this study, we analyzed the most influential articles and publications in the field of AI-based phishing detection, as well as the key research directions, by identifying thematic areas and their evolution over time.</p>
</sec>
<sec sec-type="results" id="sec9">
<label>4</label>
<title>Results</title>
<p>The bibliometric analysis is divided into two components, following recommendations from the literature (<xref ref-type="bibr" rid="ref23">Donthu et al., 2021</xref>): performance analysis and science mapping. Performance analysis involves examining the contributions of dataset constituents, such as authors, journals, institutions, and countries, and is descriptive in nature. It measures productivity through the number of articles published within a specific time frame, the impact through the number of citations, and the influence of research components by tracking citations per year, per article, and per journal (<xref ref-type="bibr" rid="ref20">Chiroma et al., 2024</xref>). Science mapping focuses on analyzing the relationships among the elements in the dataset through techniques such as citation analysis, co-citation analysis, bibliographic coupling, co-word analysis, and co-authorship analysis (<xref ref-type="bibr" rid="ref23">Donthu et al., 2021</xref>).</p>
<sec id="sec10">
<label>4.1</label>
<title>Publication trends, impactful documents and publications</title>
<p>Research on the use of AI in phishing detection has grown significantly in recent years, driven by the increase in computational power and innovations across all branches of AI. This trend confirms that phishing remains a persistent, adaptive, and challenging global threat and that researchers continue to seek effective solutions against it. In the first 10 years included in the analysis, the number of published documents on this topic was relatively small (49 documents). However, starting in 2016, the number of published documents began to increase at an accelerated pace, reaching a peak in 2024, when the number of published articles nearly doubled compared to the previous year (<xref ref-type="fig" rid="fig2">Figure 2</xref>).</p>
<fig position="float" id="fig2">
<label>Figure 2</label>
<caption>
<p>Annual quantitative distribution of publications.</p>
</caption>
<graphic xlink:href="frai-08-1496580-g002.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Bar chart displaying annual data from 2005 to 2025 with red bars. A trend line with the equation y = 9.3208x - 50.338 and R&#x00B2; = 0.7216 indicates an upward trend. Data shows a marked increase from 2014 onwards, peaking in 2024.</alt-text>
</graphic>
</fig>
<p>Because the most significant advancements in the field have occurred in recent years, the documents with the greatest impact, as measured by citation count, have generally been published in the past few years. Among the ten studies listed, the majority (8 out of 10) were published within the last 8 years (<xref ref-type="table" rid="tab4">Table 4</xref>). This analysis was undertaken using the Bibliometrix package.</p>
<table-wrap position="float" id="tab4">
<label>Table 4</label>
<caption>
<p>Top ten most impactful articles (papers sorted by local citation ratio).</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Title</th>
<th align="center" valign="top">Research area</th>
<th align="center" valign="top">LC</th>
<th align="center" valign="top">GC</th>
<th align="center" valign="top">LCR (%)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">Toward detection of phishing websites on client-side using machine learning based approach (<xref ref-type="bibr" rid="ref45">Jain and Gupta, 2018b</xref>)</td>
<td align="center" valign="top">ML</td>
<td align="center" valign="top">67</td>
<td align="center" valign="top">96</td>
<td align="center" valign="top">69.79</td>
</tr>
<tr>
<td align="left" valign="top">Detection of phishing websites using an efficient feature-based machine learning framework (<xref ref-type="bibr" rid="ref79">Rao and Pais, 2019</xref>)</td>
<td align="center" valign="top">ML</td>
<td align="center" valign="top">78</td>
<td align="center" valign="top">133</td>
<td align="center" valign="top">58.65</td>
</tr>
<tr>
<td align="left" valign="top">Phishing website detection based on multidimensional features driven by deep learning (<xref ref-type="bibr" rid="ref100">Yang et al., 2019</xref>)</td>
<td align="center" valign="top">DL</td>
<td align="center" valign="top">84</td>
<td align="center" valign="top">146</td>
<td align="center" valign="top">57.53</td>
</tr>
<tr>
<td align="left" valign="top">Machine learning based phishing detection from URLs (<xref ref-type="bibr" rid="ref81">Sahingoz et al., 2019</xref>)</td>
<td align="center" valign="top">ML</td>
<td align="center" valign="top">180</td>
<td align="center" valign="top">323</td>
<td align="center" valign="top">55.73</td>
</tr>
<tr>
<td align="left" valign="top">PhishStorm: detecting phishing with streaming analytics (<xref ref-type="bibr" rid="ref57">Marchal et al., 2014</xref>)</td>
<td align="center" valign="top">ML</td>
<td align="center" valign="top">70</td>
<td align="center" valign="top">132</td>
<td align="center" valign="top">53.03</td>
</tr>
<tr>
<td align="left" valign="top">A machine learning based approach for phishing detection using hyperlinks information (<xref ref-type="bibr" rid="ref46">Jain and Gupta, 2019</xref>)</td>
<td align="center" valign="top">ML</td>
<td align="center" valign="top">57</td>
<td align="center" valign="top">114</td>
<td align="center" valign="top">50.00</td>
</tr>
<tr>
<td align="left" valign="top">A new hybrid ensemble feature selection framework for machine learning-based phishing detection system (<xref ref-type="bibr" rid="ref19">Chiew et al., 2019</xref>)</td>
<td align="center" valign="top">ML</td>
<td align="center" valign="top">89</td>
<td align="center" valign="top">194</td>
<td align="center" valign="top">45.88</td>
</tr>
<tr>
<td align="left" valign="top">A stacking model using URL and HTML features for phishing webpage detection (<xref ref-type="bibr" rid="ref55">Li et al., 2019</xref>)</td>
<td align="center" valign="top">ML</td>
<td align="center" valign="top">61</td>
<td align="center" valign="top">134</td>
<td align="center" valign="top">45.52</td>
</tr>
<tr>
<td align="left" valign="top">CANTINA+: a feature-rich machine learning framework for detecting phishing web sites (<xref ref-type="bibr" rid="ref97">Xiang et al., 2011</xref>)</td>
<td align="center" valign="top">ML</td>
<td align="center" valign="top">135</td>
<td align="center" valign="top">324</td>
<td align="center" valign="top">41.67</td>
</tr>
<tr>
<td align="left" valign="top">A comprehensive survey of AI-enabled phishing attacks detection techniques (<xref ref-type="bibr" rid="ref13">Basit et al., 2021</xref>)</td>
<td align="center" valign="top">&#x2013;</td>
<td align="center" valign="top">61</td>
<td align="center" valign="top">163</td>
<td align="center" valign="top">37.42</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>LC, local citations; GC, global citations; LCR, local citations/global citations ratio (%).</p>
</table-wrap-foot>
</table-wrap>
<p>Local citations count the number of citations a document receives from other articles within the dataset, reflecting its influence within the analyzed field. Global citations count the number of citations received by a work across the entire WoS database, indicating its broader impact across various disciplines. The ratio between local citations and global citations reflects the level of specialization of each article in the dataset (<xref ref-type="bibr" rid="ref14">Batista-Canino et al., 2023</xref>). A higher ratio of local citations suggests that the article is more specialized and highly relevant to the specific research area under investigation.</p>
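<p>As a minimal sketch of this ratio, the snippet below recomputes LCR as LC divided by GC, expressed as a percentage, from the LC and GC values listed in <xref ref-type="table" rid="tab4">Table 4</xref>. The function name <monospace>local_citation_ratio</monospace> and the rounding to two decimals are assumptions made for illustration; this is not the code used by Bibliometrix.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
# (citation label, local citations, global citations), copied from Table 4.
TABLE_4 = [
    ("Jain and Gupta, 2018b", 67, 96),
    ("Rao and Pais, 2019", 78, 133),
    ("Yang et al., 2019", 84, 146),
    ("Sahingoz et al., 2019", 180, 323),
    ("Marchal et al., 2014", 70, 132),
    ("Jain and Gupta, 2019", 57, 114),
    ("Chiew et al., 2019", 89, 194),
    ("Li et al., 2019", 61, 134),
    ("Xiang et al., 2011", 135, 324),
    ("Basit et al., 2021", 61, 163),
]


def local_citation_ratio(local, global_):
    """Return local citations as a percentage of global citations."""
    if global_ <= 0:
        raise ValueError("global citations must be positive")
    return round(100 * local / global_, 2)


# Compute the ratio for each row, then rank from most to least specialized.
ratios = [(label, local_citation_ratio(lc, gc)) for label, lc, gc in TABLE_4]
ranked = sorted(ratios, key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    for label, lcr in ranked:
        print(f"{label}: {lcr:.2f}%")
```
]]></preformat>
<p>Under these assumptions, the recomputed values match the LCR column of the table, and sorting by the ratio reproduces the order in which the rows are listed.</p>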
<p>ML is the most frequently proposed approach for automated phishing detection, as evidenced by the top 10 most-cited articles in the field, presented in <xref ref-type="table" rid="tab4">Table 4</xref>. Among these articles, &#x201C;Toward Detection of Phishing Websites on Client-Side Using Machine Learning-Based Approach&#x201D; (<xref ref-type="bibr" rid="ref45">Jain and Gupta, 2018b</xref>) ranks first with a 69.79% local citation ratio, indicating a very high level of specialization in the field. The second position is occupied by &#x201C;Detection of Phishing Websites Using an Efficient Feature-Based Machine Learning Framework&#x201D; (<xref ref-type="bibr" rid="ref79">Rao and Pais, 2019</xref>) with a 58.65% ratio. Both papers propose phishing detection models based on ML algorithms. The first paper utilizes features extracted from URL functions, hyperlinks, CSS, authentication forms, and identity, while the second paper uses features extracted from URLs, website content, and third-party services. In third place is the paper entitled &#x201C;Phishing Website Detection Based on Multidimensional Features Driven by Deep Learning&#x201D; (<xref ref-type="bibr" rid="ref100">Yang et al., 2019</xref>) with a 57.53% local citation ratio. The authors propose using DL for phishing detection. Their approach is multidimensional and involves two stages: in the first stage, features from the character sequence of the URL are extracted and used for rapid classification through DL; in the second stage, statistical features of the URL, page code features, web page text features, and results from the rapid DL classification are combined into a multidimensional feature set in order to increase detection accuracy.</p>
<p>Regarding the productivity and popularity of publications, approximately half of the articles (584) were disseminated through 132 journals, volumes, and books, whereas the remainder were published across an additional 512 distinct publications. This analysis was undertaken using the Bibliometrix package. <xref ref-type="table" rid="tab5">Table 5</xref> presents the top 10 publications ranked by the number of citations. It also includes relevant information about these journals: h-index, g-index, m-index, the number of published documents (with their ranking based on the number of articles), the impact factor for 2024, JCR category, and quartiles based on WoS classification. Among the top ten journals, three are classified as Quartile 1 (Q1) and five as Quartile 2 (Q2). Quartiles (Q1 to Q4) represent the ranking tiers of journals within a given subdiscipline, with Q1 indicating the highest ranking. Since the topic is closely related to the ICT field, all journals are primarily categorized under computer science. <italic>IEEE Access</italic> published the most articles (78) and also received the highest number of citations (1,354). In second place for citations is <italic>Computers &#x0026; Security</italic> with 662 citations, followed by <italic>Expert Systems with Applications</italic> with 627 citations. In terms of productivity, <italic>Computers &#x0026; Security</italic> ranks second with 27 papers, followed by <italic>Electronics</italic> with 24 papers, although the latter is ranked 5th in terms of citations (168). <italic>ACM Transactions on Information and System Security</italic> stands out with a significant impact, having published just one article but receiving 324 citations. Other journals with a smaller number of articles but significant impact include <italic>Telecommunication Systems</italic> (4 articles and 294 citations), <italic>Information Sciences</italic> (4 articles and 262 citations), <italic>Journal of Network and Computer Applications</italic> (7 articles and 282 citations), and <italic>Journal of Ambient Intelligence and Humanized Computing</italic> (4 articles and 192 citations). In terms of research areas, the majority of publications (641) are from the field of Computer Science, followed by Engineering (267) and Telecommunications (179).</p>
<table-wrap position="float" id="tab5">
<label>Table 5</label>
<caption>
<p>Top ten journals ranked by total citations.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Publications</th>
<th align="center" valign="top">h-index</th>
<th align="center" valign="top">g-index</th>
<th align="center" valign="top">m-index</th>
<th align="center" valign="top">TC</th>
<th align="center" valign="top">TC (%)</th>
<th align="center" valign="top">NP</th>
<th align="center" valign="top">RA</th>
<th align="center" valign="top">IF</th>
<th align="center" valign="top">Q</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">IEEE Access</td>
<td align="center" valign="top">21</td>
<td align="center" valign="top">35</td>
<td align="center" valign="top">2.33</td>
<td align="center" valign="top">1,354</td>
<td align="center" valign="top">10.08</td>
<td align="center" valign="top">78 (1)</td>
<td align="center" valign="top">CS, E, T</td>
<td align="center" valign="top">3.6</td>
<td align="center" valign="top">Q2</td>
</tr>
<tr>
<td align="left" valign="top">Computers and Security</td>
<td align="center" valign="top">14</td>
<td align="center" valign="top">25</td>
<td align="center" valign="top">1.08</td>
<td align="center" valign="top">662</td>
<td align="center" valign="top">4.93</td>
<td align="center" valign="top">27 (2)</td>
<td align="center" valign="top">CS</td>
<td align="center" valign="top">5.4</td>
<td align="center" valign="top">Q1</td>
</tr>
<tr>
<td align="left" valign="top">Expert Systems with Applications</td>
<td align="center" valign="top">8</td>
<td align="center" valign="top">13</td>
<td align="center" valign="top">0.57</td>
<td align="center" valign="top">627</td>
<td align="center" valign="top">4.67</td>
<td align="center" valign="top">13 (5)</td>
<td align="center" valign="top">CS, E, ORMS</td>
<td align="center" valign="top">7.5</td>
<td align="center" valign="top">Q1</td>
</tr>
<tr>
<td align="left" valign="top">Neural Computing and Applications</td>
<td align="center" valign="top">7</td>
<td align="center" valign="top">9</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">343</td>
<td align="center" valign="top">2.55</td>
<td align="center" valign="top">8 (9)</td>
<td align="center" valign="top">CS</td>
<td align="center" valign="top">4.5</td>
<td align="center" valign="top">Q2</td>
</tr>
<tr>
<td align="left" valign="top">Electronics</td>
<td align="center" valign="top">10</td>
<td align="center" valign="top">17</td>
<td align="center" valign="top">1.67</td>
<td align="center" valign="top">325</td>
<td align="center" valign="top">2.42</td>
<td align="center" valign="top">24 (3)</td>
<td align="center" valign="top">CS, E, P</td>
<td align="center" valign="top">2.6</td>
<td align="center" valign="top">Q2</td>
</tr>
<tr>
<td align="left" valign="top">ACM Transactions on Information and System Security</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">0.067</td>
<td align="center" valign="top">324</td>
<td align="center" valign="top">2.41</td>
<td align="center" valign="top">1 (120)</td>
<td align="center" valign="top">CS</td>
<td align="center" valign="top">2.6</td>
<td align="center" valign="top">Q2</td>
</tr>
<tr>
<td align="left" valign="top">Telecommunication Systems</td>
<td align="center" valign="top">3</td>
<td align="center" valign="top">4</td>
<td align="center" valign="top">0.375</td>
<td align="center" valign="top">294</td>
<td align="center" valign="top">2.19</td>
<td align="center" valign="top">4 (26)</td>
<td align="center" valign="top">T</td>
<td align="center" valign="top">2.3</td>
<td align="center" valign="top">Q3</td>
</tr>
<tr>
<td align="left" valign="top">Journal of Network and Computer Applications</td>
<td align="center" valign="top">3</td>
<td align="center" valign="top">7</td>
<td align="center" valign="top">0.200</td>
<td align="center" valign="top">282</td>
<td align="center" valign="top">2.10</td>
<td align="center" valign="top">7 (14)</td>
<td align="center" valign="top">CS</td>
<td align="center" valign="top">8</td>
<td align="center" valign="top">Q1</td>
</tr>
<tr>
<td align="left" valign="top">Information Sciences</td>
<td align="center" valign="top">4</td>
<td align="center" valign="top">4</td>
<td align="center" valign="top">0.333</td>
<td align="center" valign="top">262</td>
<td align="center" valign="top">1.95</td>
<td align="center" valign="top">4 (26)</td>
<td align="center" valign="top">CS</td>
<td align="center" valign="top">6.8</td>
<td align="center" valign="top">Q1</td>
</tr>
<tr>
<td align="left" valign="top">Journal of Ambient Intelligence and Humanized Computing</td>
<td align="center" valign="top">4</td>
<td align="center" valign="top">4</td>
<td align="center" valign="top">0.500</td>
<td align="center" valign="top">192</td>
<td align="center" valign="top">1.43</td>
<td align="center" valign="top">4 (26)</td>
<td align="center" valign="top">CS, T</td>
<td align="center" valign="top">3.6</td>
<td align="center" valign="top">Q2</td>
</tr>
<tr>
<td align="left" valign="top">Totals</td>
<td align="center" valign="top">&#x2013;</td>
<td align="center" valign="top">&#x2013;</td>
<td align="center" valign="top">&#x2013;</td>
<td align="center" valign="top">5,349</td>
<td align="center" valign="top">&#x2013;</td>
<td align="center" valign="top">170</td>
<td align="center" valign="top">&#x2013;</td>
<td align="center" valign="top">&#x2013;</td>
<td align="center" valign="top">&#x2013;</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
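<p>CS, Computer Science; E, Engineering; ORMS, Operations Research and Management Science; P, Physics; T, Telecommunications; TC, Total citations; NP, Number of publications (rank by number of articles in parentheses); RA, Research area; IF, Impact factor; Q, Quartile.</p>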
</table-wrap-foot>
</table-wrap>
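<p>The h-, g- and m-indices reported in Table 5 can be illustrated with a minimal sketch of their conventional definitions. This is not the Bibliometrix implementation: the citation list is hypothetical, and dividing the h-index by the number of years since the first publication is an assumption about the m-index convention.</p>
<preformat><![CDATA[
```python
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    ranked = sorted(citations, reverse=True)
    # In a descending list the condition holds for exactly the first h ranks.
    return sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)


def g_index(citations):
    """Largest g such that the g most-cited papers total at least g*g citations."""
    ranked = sorted(citations, reverse=True)
    running_total, g = 0, 0
    for rank, cites in enumerate(ranked, start=1):
        running_total += cites
        if running_total >= rank * rank:
            g = rank
    return g


def m_index(citations, years_active):
    """h-index divided by the years since the first publication (assumed convention)."""
    return h_index(citations) / years_active


# Hypothetical citation counts for the papers of one journal (not from the dataset).
sample = [10, 8, 5, 4, 3, 0, 0]
print(h_index(sample), g_index(sample), m_index(sample, 8))  # prints: 4 5 0.5
```
]]></preformat>
<p>Under these definitions, a source with a single paper cited 324 times has an h-index and a g-index of 1, as in the <italic>ACM Transactions on Information and System Security</italic> row of Table 5.</p>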
</sec>
<sec id="sec11">
<label>4.2</label>
<title>Author collaboration network and co-citation analysis</title>
<p>In this study, various bibliometric techniques were employed to analyse the research landscape related to the use of AI in phishing detection. The selection of indicators was guided by their relevance in capturing the impact, collaboration patterns, and thematic structure of the field. The scope and complexity of the field limit the feasibility of individual research efforts, and numerous research groups have proposed AI-based anti-phishing solutions. Collaboration networks provide insights into the structure of research communities, key contributors, and interdisciplinary trends, and analysing co-authorship dynamics helps identify influential research groups and partnerships. To this end, we used VOSviewer with the following criteria: a minimum of 3 documents per author, a minimum of 5 citations per document, and the full counting method. The threshold of 3 documents per author focuses the analysis on researchers with a relevant level of influence and minimizes the risk of including collaborations with marginal impact. The minimum of 5 citations per document filters out weakly connected nodes and retains only works with a reasonable influence in the field. Thresholds that are too low or too high can negatively affect the results, either by including occasional collaborations and authors with insignificant impact or by excluding relevant contributions. We considered these thresholds reasonable in relation to the topic and the size of the dataset. For comparison, other authors have set the minimum thresholds at 5 documents per author and 10 citations per document for a dataset of 4,875 papers (<xref ref-type="bibr" rid="ref30">Ezugwu et al., 2021</xref>). To determine the thresholds, we conducted empirical tests on the dataset to obtain a set of stable and interpretable clusters.</p>
<p>In the VOSviewer network, each node represents an author, and the links between nodes reflect the intensity of collaboration, measured by the number of articles the authors have co-authored. The network comprises four clusters of five authors each, three clusters of four authors, three clusters of three authors, and ten clusters of two authors (<xref ref-type="fig" rid="fig3">Figure 3</xref>).</p>
<fig position="float" id="fig3">
<label>Figure 3</label>
<caption>
<p>Co-authorship network of authors based on number of publications.</p>
</caption>
<graphic xlink:href="frai-08-1496580-g003.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Visualization of a network graph showing clusters of interconnected nodes representing individuals. Each cluster is distinguished by color, with names labeled next to the nodes. The layout is spread out with varying densities, indicating differing levels of connectivity.</alt-text>
</graphic>
</fig>
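<p>The filtering and link counting described above can be sketched as follows. This is an illustration of the stated criteria, not VOSviewer&#x2019;s internal procedure: the records and author labels are hypothetical, and applying the citation filter before counting documents per author is an assumption.</p>
<preformat><![CDATA[
```python
from collections import Counter
from itertools import combinations


def coauthor_links(papers, min_docs=3, min_citations=5):
    """Count co-authored papers per author pair (full counting) after thresholding."""
    # Keep only documents that reach the citation threshold.
    kept = [p for p in papers if p["citations"] >= min_citations]
    # Keep only authors with enough retained documents.
    doc_counts = Counter(a for p in kept for a in set(p["authors"]))
    eligible = {a for a, n in doc_counts.items() if n >= min_docs}
    links = Counter()
    for p in kept:
        authors = sorted(set(p["authors"]) & eligible)
        for pair in combinations(authors, 2):
            links[pair] += 1  # full counting: every shared paper adds a weight of 1
    return links


# Hypothetical records; "A", "B" and "C" are placeholder author labels.
papers = [
    {"authors": ["A", "B"], "citations": 12},
    {"authors": ["A", "B"], "citations": 7},
    {"authors": ["A", "B", "C"], "citations": 30},
    {"authors": ["A", "C"], "citations": 2},  # dropped: fewer than 5 citations
    {"authors": ["C"], "citations": 9},
]
print(dict(coauthor_links(papers)))  # prints: {('A', 'B'): 3}
```
]]></preformat>
<p>In this toy example, author C is excluded because only two of C&#x2019;s documents reach the citation threshold, so the network reduces to a single link of weight 3.</p>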
<p>In terms of productivity, Akshat Gaurav (10) and Ali Selamat (10) have published the most papers in the dataset, followed by Varsha Arya (9), Ankit Kumar Jain (9), Routhu Srinivasa Rao (9) and Indrakshi Ray (9). Akshat Gaurav and Varsha Arya are in the same cluster. Regarding the topics covered, Gaurav researches the integration of semantic web and AI technologies (DL recurrent neural networks, CNNs) for robust phishing detection (<xref ref-type="bibr" rid="ref32">Gaurav et al., 2024</xref>; <xref ref-type="bibr" rid="ref39">Gupta et al., 2024b</xref>). Ali Selamat focuses on the use of ML techniques in phishing detection, analysing the performance of various algorithms based on DL and NLP (<xref ref-type="bibr" rid="ref68">Nguyet et al., 2021</xref>; <xref ref-type="bibr" rid="ref78">Quang et al., 2021</xref>). Jain and Gupta have developed ML algorithms for phishing detection, with two of their papers appearing in the top 10 most-cited documents. The group with the most substantial connections consists of Akshat Gaurav, Varsha Arya, Kwok Tai Chui, Ahmed Alhomoud, Razaz Waheeb Attar, Shavi Bansal and Brij Bhooshan Gupta. They have published articles on the use of ML for phishing detection that seek to identify the most efficient model (<xref ref-type="bibr" rid="ref32">Gaurav et al., 2024</xref>; <xref ref-type="bibr" rid="ref38">Gupta et al., 2024a</xref>; <xref ref-type="bibr" rid="ref78">Quang et al., 2021</xref>) and have proposed optimized DL models for attack detection using feature selection techniques and hyperparameter optimization algorithms (Brown-Bear Optimization or Cuckoo Search), achieving high performance in detecting malicious URLs and attacks in web ecosystems.
Another group, consisting of Igor Santos, Borja Sanz, and Xabier Ugarte-Pedrero, has published articles on spam filtering methods based on anomaly detection using ML algorithms (<xref ref-type="bibr" rid="ref53">Laorden et al., 2014</xref>; <xref ref-type="bibr" rid="ref82">Santos et al., 2012</xref>). This topic is significant because spam messages are a common vector for spreading computer viruses, worms, and phishing attempts, with statistics indicating that 46.8% of email traffic consists of spam (<xref ref-type="bibr" rid="ref75">Petrosyan, 2024</xref>). Another research group, comprising Kutti Padanyl Soman, Ravi Vinayakumar, Prabaharan Poornachandran, Mamoun Alazab, and Xiaosong Zhang, has explored the advantages of DL in phishing detection and developed a framework for cyber threat situational awareness based on email and URL data analysis (<xref ref-type="bibr" rid="ref94">Vinayakumar et al., 2019</xref>; <xref ref-type="bibr" rid="ref95">Vinayakumar et al., 2019</xref>). Sultan Asiri, Yang Xiao, Saleh Alzahrani and Tieshan Li have also investigated the use of DL for phishing detection and created an anti-phishing system capable of identifying both regular phishing attacks and more specific threats such as Tiny Uniform Resource Locators (TinyURLs) and Browser in the Browser (BiTB; <xref ref-type="bibr" rid="ref9">Asiri et al., 2023</xref>; <xref ref-type="bibr" rid="ref7">Asiri et al., 2024</xref>; <xref ref-type="bibr" rid="ref8">Asiri et al., 2024</xref>).</p>
<p>Another important technique for the analyzed field is co-citation analysis. Citation metrics serve as a proxy for research impact, highlighting influential papers, authors, and journals, and they help identify seminal works that have shaped the field over time. According to <xref ref-type="bibr" rid="ref23">Donthu et al. (2021)</xref>, publications that are frequently co-cited often exhibit semantic similarities, and co-citation analysis can lead to a better understanding of the fundamental themes within the field. <xref ref-type="fig" rid="fig4">Figure 4</xref> presents a co-citation map, generated with VOSviewer using the full counting method, for references with at least 40 citations. The threshold of 40 citations per document was selected to highlight the articles with the greatest impact on the analyzed field, so that only works with a defining influence on the topic are represented. As in the collaboration analysis, we conducted empirical experiments to identify a suitable minimum threshold. The co-citation network was mapped to visualize knowledge flows and intellectual foundations within the domain. Two publications are connected when they appear together in the reference list of another publication, with each such joint appearance counting as a co-citation. The result includes 34 references grouped into 3 clusters, with 1,076 connections. Node size indicates the number of citations, a link between two nodes shows that the references were cited together, and line thickness indicates the frequency of these co-citations.</p>
<fig position="float" id="fig4">
<label>Figure 4</label>
<caption>
<p>Co-citation network of references.</p>
</caption>
<graphic xlink:href="frai-08-1496580-g004.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Network visualization depicting interconnected nodes with labels, showcasing relationships between various academic papers. Nodes are color-coded: green, red, and blue, each representing different clusters or fields such as expert systems and communications. Lines illustrate citations or collaborations, indicating the density and interaction among the nodes.</alt-text>
</graphic>
</fig>
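<p>The notion of co-citation used above can be made concrete with a short sketch. It assumes that each citing document is reduced to a list of reference identifiers; the identifiers R1 to R4 are placeholders, and the threshold of 2 stands in for the threshold of 40 applied to the real dataset.</p>
<preformat><![CDATA[
```python
from collections import Counter
from itertools import combinations


def cocitation_counts(reference_lists, min_citations=2):
    """Return per-reference citation counts and co-citation counts per reference pair."""
    # A reference is counted once per citing document.
    citations = Counter(ref for refs in reference_lists for ref in set(refs))
    kept = {ref for ref, n in citations.items() if n >= min_citations}
    pairs = Counter()
    for refs in reference_lists:
        # Two references are co-cited when the same document cites both.
        for pair in combinations(sorted(set(refs) & kept), 2):
            pairs[pair] += 1
    return citations, pairs


# Hypothetical reference lists of three citing documents.
reference_lists = [
    ["R1", "R2", "R3"],
    ["R1", "R2"],
    ["R2", "R3", "R4"],
]
citations, pairs = cocitation_counts(reference_lists)
print(dict(pairs))  # prints: {('R1', 'R2'): 2, ('R1', 'R3'): 1, ('R2', 'R3'): 2}
```
]]></preformat>
<p>In a map built from such counts, the citation count would set the node size and the pair count would set the line thickness; R4 is dropped because it is cited only once.</p>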
<p>In the first cluster, the paper &#x201C;Machine Learning Based Phishing Detection from URLs&#x201D; (<xref ref-type="bibr" rid="ref81">Sahingoz et al., 2019</xref>) ranks first with 180 citations and 33 links, followed by &#x201C;A New Hybrid Ensemble Feature Selection Framework for Machine Learning-Based Phishing Detection System&#x201D; (<xref ref-type="bibr" rid="ref19">Chiew et al., 2019</xref>) with 89 citations and the same number of links. Both studies use ML to develop anti-phishing systems.</p>
<p>In the second cluster, the most cited article is &#x201C;CANTINA+: A Feature-Rich Machine Learning Framework for Detecting Phishing Websites&#x201D; (<xref ref-type="bibr" rid="ref97">Xiang et al., 2011</xref>) with 135 citations and 33 links, followed by &#x201C;Cantina: A Content-Based Approach to Detecting Phishing Websites&#x201D; (<xref ref-type="bibr" rid="ref102">Zhang et al., 2007</xref>) with 109 citations and the same number of links. The authors of these papers developed and subsequently enhanced a layered solution for phishing webpage detection using ML algorithms.</p>
<p>Finally, in the third cluster, the most cited articles are &#x201C;Phishing detection based Associative Classification data mining&#x201D; (<xref ref-type="bibr" rid="ref1">Abdelhamid et al., 2014</xref>) with 71 citations and 32 links and &#x201C;Predicting Phishing Websites Based on Self-Structuring Neural Network&#x201D; (<xref ref-type="bibr" rid="ref61">Mohammad et al., 2014</xref>) with 68 citations and 32 links. In the first article, the authors investigate the applicability of the Multi-label Classifier based Associative Classification method in detecting phishing websites, highlighting both its performance compared to other intelligent algorithms and its ability to generate new knowledge in the form of associative rules with high predictive value. In the second article, the authors propose an intelligent model for predicting phishing attacks based on self-structuring neural networks.</p>
</sec>
<sec id="sec12">
<label>4.3</label>
<title>Thematic analysis and evolution</title>
<p>Thematic analysis and evolution are important bibliometric approaches used to track how research topics emerge, develop, and evolve over time. By analysing keyword co-occurrence networks and the evolution of concepts, valuable insights are gained into shifts in research focus, the emergence of new subfields, and the continuity of key themes over time.</p>
<p>The thematic analysis highlights the topics associated with the use of AI in phishing detection. Topic modeling is a suite of content analysis methods that originates from ML (<xref ref-type="bibr" rid="ref16">Blei, 2012</xref>). The intellectual structure of the topic was defined based on author keyword co-occurrence analysis and visualized as a strategic map created with the Bibliometrix package in R. Keywords reflect the main topics of a research domain, and analysing the co-occurrence of authors&#x2019; keywords helps identify thematic structures and conceptual relationships between topics. Keyword frequency helps identify dominant research themes, while co-occurrence strength reflects the relationships between these themes. We identified 15 clusters from these keywords by applying the Walktrap algorithm with Min Cluster Frequency (per thousand docs) set to 10, Number of Words set to 250, Number of Labels set to 3, and Label size set to 0.3. The Walktrap algorithm is a community detection method used to identify groups of nodes, or communities, within a network. A random walk starts at a random node and moves to one of its neighbors at each step. Nodes within a community tend to be more tightly connected, which increases the likelihood that a random walk will remain within that community rather than move to a less connected region of the network (<xref ref-type="bibr" rid="ref93">Van Poucke et al., 2018</xref>). This property makes the algorithm suitable for the thematic analysis of a research field. Min Cluster Frequency (per thousand docs) sets a minimum threshold for the frequency of a theme within the dataset. We set the threshold at 10 to ensure that clusters represent recurring and meaningful topics rather than isolated occurrences, allowing us to include relevant themes while eliminating rare or insignificant ones.
The word limit was set at 250 to provide the clustering algorithm with a sufficiently rich vocabulary for meaningful topic differentiation. Choosing three labels per cluster helps maintain interpretability by concisely summarizing the main themes, while the label size was adjusted to enhance readability in visualizations without overwhelming the graphical representation.</p>
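<p>The random-walk intuition behind Walktrap can be illustrated with the node distance on which the algorithm relies: two nodes are close when short random walks started from each of them reach the rest of the network with similar probabilities. The sketch below is a simplified illustration, not the Bibliometrix or igraph implementation. The toy graph (two triangles joined by one edge), the walk length of 4 and the omission of self-loops are assumptions made for brevity.</p>
<preformat><![CDATA[
```python
from math import sqrt


def transition_matrix(adjacency):
    """Row-normalise the adjacency matrix: P[i][j] = A[i][j] / degree(i)."""
    return [[a / sum(row) for a in row] for row in adjacency]


def matmul(x, y):
    """Plain matrix product of two square lists of lists."""
    size = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def walk_probabilities(adjacency, steps):
    """Probability of being at each node after `steps` random-walk steps."""
    p = transition_matrix(adjacency)
    result = p
    for _ in range(steps - 1):
        result = matmul(result, p)
    return result


def walktrap_distance(adjacency, i, j, steps=4):
    """Degree-weighted distance between the walk distributions started at i and j."""
    probs = walk_probabilities(adjacency, steps)
    degrees = [sum(row) for row in adjacency]
    return sqrt(sum((probs[i][k] - probs[j][k]) ** 2 / degrees[k]
                    for k in range(len(adjacency))))


# Toy graph: nodes 0-2 and 3-5 form two triangles, joined by the edge 2-3.
ADJ = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(walktrap_distance(ADJ, 0, 1))  # same triangle: small distance
print(walktrap_distance(ADJ, 0, 5))  # different triangles: larger distance
```
]]></preformat>
<p>For this toy graph the distance between nodes 0 and 1 is roughly 0.06, against roughly 0.26 for nodes 0 and 5. The full algorithm repeatedly merges the communities that are closest under this kind of distance.</p>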
<p>For cluster generation, we excluded expressions explicitly containing the terms &#x201C;phishing&#x201D; and &#x201C;AI&#x201D; (e.g., phishing, phishing detection, phishing attacks, phishing website detection, artificial intelligence, AI). Additionally, we compiled a list of synonyms to standardize related terms (e.g., blacklist, blacklisting, blacklists; blockchain, blockchains; bot, botnet, botnet applications, botnet detections; deep neural network, deep neural network (dl), deep neural network (dnn), deep neural networks; machine-learning, machine learning, machine learning algorithms, machine learning classifiers, machine learning models, machine learning techniques; malicious, malicious url, malicious url detection, malicious urls, malicious website, malicious websites; malware, malware detection, malware analysis). This approach supports a more nuanced analysis by focusing on secondary themes and related concepts. Based on these hyperparameters and restrictions, the application generated 15 clusters. <xref ref-type="table" rid="tab6">Table 6</xref> presents these clusters and their corresponding indicators: Callon&#x2019;s centrality and density, rank centrality and rank density, and cluster frequency.</p>
<table-wrap position="float" id="tab6">
<label>Table 6</label>
<caption>
<p>Clusters resulting from the thematic analysis.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Cluster</th>
<th align="center" valign="top">Callon centrality</th>
<th align="center" valign="top">Callon density</th>
<th align="center" valign="top">Rank centrality</th>
<th align="center" valign="top">Rank density</th>
<th align="center" valign="top">Cluster frequency</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">Web security</td>
<td align="center" valign="top">0.01</td>
<td align="center" valign="top">5.88</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">5</td>
<td align="center" valign="top">17</td>
</tr>
<tr>
<td align="left" valign="top">Classifier</td>
<td align="center" valign="top">0.06</td>
<td align="center" valign="top">6.59</td>
<td align="center" valign="top">8</td>
<td align="center" valign="top">7</td>
<td align="center" valign="top">91</td>
</tr>
<tr>
<td align="left" valign="top">BERT</td>
<td align="center" valign="top">0.02</td>
<td align="center" valign="top">7.69</td>
<td align="center" valign="top">3</td>
<td align="center" valign="top">13</td>
<td align="center" valign="top">13</td>
</tr>
<tr>
<td align="left" valign="top">Fraud</td>
<td align="center" valign="top">0.16</td>
<td align="center" valign="top">7.37</td>
<td align="center" valign="top">13</td>
<td align="center" valign="top">12</td>
<td align="center" valign="top">81</td>
</tr>
<tr>
<td align="left" valign="top">Anomaly detection</td>
<td align="center" valign="top">0.08</td>
<td align="center" valign="top">6.90</td>
<td align="center" valign="top">9</td>
<td align="center" valign="top">8</td>
<td align="center" valign="top">52</td>
</tr>
<tr>
<td align="left" valign="top">Detection</td>
<td align="center" valign="top">0.09</td>
<td align="center" valign="top">6.19</td>
<td align="center" valign="top">10</td>
<td align="center" valign="top">6</td>
<td align="center" valign="top">53</td>
</tr>
<tr>
<td align="left" valign="top">Machine-learning</td>
<td align="center" valign="top">0.63</td>
<td align="center" valign="top">4.84</td>
<td align="center" valign="top">15</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">1,438</td>
</tr>
<tr>
<td align="left" valign="top">Convolution neural network</td>
<td align="center" valign="top">0.05</td>
<td align="center" valign="top">5.47</td>
<td align="center" valign="top">7</td>
<td align="center" valign="top">3</td>
<td align="center" valign="top">60</td>
</tr>
<tr>
<td align="left" valign="top">Security</td>
<td align="center" valign="top">0.15</td>
<td align="center" valign="top">7.21</td>
<td align="center" valign="top">12</td>
<td align="center" valign="top">10</td>
<td align="center" valign="top">159</td>
</tr>
<tr>
<td align="left" valign="top">LSTM</td>
<td align="center" valign="top">0.04</td>
<td align="center" valign="top">7.26</td>
<td align="center" valign="top">6</td>
<td align="center" valign="top">11</td>
<td align="center" valign="top">31</td>
</tr>
<tr>
<td align="left" valign="top">Social engineering</td>
<td align="center" valign="top">0.39</td>
<td align="center" valign="top">8.44</td>
<td align="center" valign="top">14</td>
<td align="center" valign="top">14</td>
<td align="center" valign="top">187</td>
</tr>
<tr>
<td align="left" valign="top">Blockchains</td>
<td align="center" valign="top">0.11</td>
<td align="center" valign="top">4.20</td>
<td align="center" valign="top">11</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">61</td>
</tr>
<tr>
<td align="left" valign="top">XGBoost</td>
<td align="center" valign="top">0.00</td>
<td align="center" valign="top">10.00</td>
<td align="center" valign="top">1</td>
<td align="center" valign="top">15</td>
<td align="center" valign="top">10</td>
</tr>
<tr>
<td align="left" valign="top">Decision tree</td>
<td align="center" valign="top">0.02</td>
<td align="center" valign="top">5.56</td>
<td align="center" valign="top">4</td>
<td align="center" valign="top">4</td>
<td align="center" valign="top">18</td>
</tr>
<tr>
<td align="left" valign="top">Large language models</td>
<td align="center" valign="top">0.02</td>
<td align="center" valign="top">7.14</td>
<td align="center" valign="top">5</td>
<td align="center" valign="top">9</td>
<td align="center" valign="top">14</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Thematic clusters are positioned in a two-dimensional space, allowing for the identification of core and peripheral themes. Centrality reflects the relevance of a theme, distinguishing dominant themes from peripheral ones. Density measures the internal cohesion of a theme and indicates its level of development. Callon centrality measures the degree of interaction of a cluster with the other clusters in the network. A high centrality signals that a cluster has many connections with other clusters, reflecting its potential influence in spreading ideas and information within the network. In the present analysis, <italic>ML</italic>, <italic>social engineering</italic>, and <italic>fraud</italic> are the topics with the highest centrality, playing a crucial role in research within this field. Callon density measures the cohesion among the keywords of a cluster. A high density reflects strong internal connections and a coherent structure. In this case, eXtreme Gradient Boosting (XGBoost) has the highest Callon density. Rank centrality and rank density give the position of each cluster when the clusters are ordered by Callon centrality and Callon density, respectively, while cluster frequency represents the number of keyword occurrences in the dataset. A higher frequency indicates themes that appear more often in the dataset articles. Here, ML has a markedly higher frequency than the other clusters. <xref ref-type="table" rid="tab6">Table 6</xref> is the basis for <xref ref-type="fig" rid="fig5">Figure 5</xref>. Each bubble represents a network cluster. The words that name each cluster are those with the highest occurrence, and the size of the bubble is proportional to the frequency of those words. The position of the bubble reflects the centrality and density of the cluster according to Callon&#x2019;s measures. The thematic map provides a structured visualization of research topics, categorizing them based on their relevance and development within a field.
This helps in understanding the intellectual structure of a research domain, identifying well-established themes, and detecting emerging trends.</p>
<fig position="float" id="fig5">
<label>Figure 5</label>
<caption>
<p>Thematic map.</p>
</caption>
<graphic xlink:href="frai-08-1496580-g005.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Graph depicting various technology themes based on development degree (density) and relevance degree (centrality). It categorizes themes into four quadrants: Niche, Motor, Emerging or Declining, and Basic. Themes like "social engineering" and "data mining" are Motor, while "blockchains" and "deep learning" are Basic. "BERT" and "XGBoost" fall into Niche, with "web security" and "decision tree" as Emerging or Declining.</alt-text>
</graphic>
</fig>
<p>In the first quadrant, motor themes are characterized by a high degree of relevance and development, being essential for organizing the study topic. In this context, these themes are associated with: <italic>social engineering</italic> (<xref ref-type="bibr" rid="ref41">Innab et al., 2024</xref>; <xref ref-type="bibr" rid="ref74">Perera and Grob, 2024</xref>), <italic>network security</italic>, particularly in the context of the development of cloud computing (<xref ref-type="bibr" rid="ref21">Dawood et al., 2023</xref>), the use of <italic>data mining</italic> (<xref ref-type="bibr" rid="ref9003">Alshahrani et al., 2022</xref>; <xref ref-type="bibr" rid="ref15">Bejandi et al., 2022</xref>) and <italic>combating phishing fraud</italic>, a concern driven by the significant number of financial frauds in recent years (<xref ref-type="bibr" rid="ref42">Iscan et al., 2023</xref>). Other topics included in this quadrant are general concepts or tools related to phishing attacks such as <italic>electronic email</italic>, <italic>uniform resource locator</italic>, <italic>real-time</italic>, <italic>malware</italic> and <italic>intrusion detection</italic>.</p>
<p>In the second quadrant, niche themes exhibit a high level of development but a lower degree of relevance. For the analyzed dataset, this category encompasses various technologies used in cyber-attack detection and mitigation, including <italic>XGBoost</italic> (<xref ref-type="bibr" rid="ref9004">Agagu et al., 2024</xref>; <xref ref-type="bibr" rid="ref36">Gualberto et al., 2020</xref>; <xref ref-type="bibr" rid="ref69">Omari, 2023</xref>), <italic>large language models</italic> (<xref ref-type="bibr" rid="ref40">Heiding et al., 2024</xref>; <xref ref-type="bibr" rid="ref91">Trad and Chehab, 2024</xref>), <italic>Bidirectional Encoder Representations from Transformers (BERT)</italic> (<xref ref-type="bibr" rid="ref88">Thapa et al., 2023</xref>) and <italic>Long Short-Term Memory (LSTM)</italic> (<xref ref-type="bibr" rid="ref34">Gopali et al., 2024</xref>; <xref ref-type="bibr" rid="ref71">Orunsolu et al., 2022</xref>). Their position in this quadrant indicates that, while these topics are well developed, they have not yet attained central importance within the broader scientific field.</p>
<p>In the third quadrant, emerging or declining themes exhibit low density and low centrality, indicating that they are either still developing or losing relevance. In the analyzed context, the themes <italic>decision tree</italic>, <italic>convolution neural network</italic> and <italic>deep neural network</italic> are less developed and have low centrality. The quadrant also includes a more generic topic related to phishing attacks, namely <italic>web security</italic>. However, this theme could become more prominent given the potential of ML for identifying cyber-security attacks such as smishing (<xref ref-type="bibr" rid="ref44">Jain and Gupta, 2018a</xref>).</p>
<p>The last quadrant, basic themes, includes topics with high relevance but low development, typically serving as foundational elements for understanding the field. In the analyzed dataset, this quadrant encompasses <italic>ML</italic> and <italic>DL</italic>, <italic>blockchain</italic> and the <italic>mitigation of phishing attacks on the Ethereum platform</italic>, the second-largest blockchain platform (<xref ref-type="bibr" rid="ref54">Li et al., 2022</xref>), and the use of <italic>social media</italic> for phishing attacks (<xref ref-type="bibr" rid="ref11">Aun et al., 2023</xref>; <xref ref-type="bibr" rid="ref50">Khan and Unhelkar, 2024</xref>). These themes are considered basic because of their foundational role in the study area, despite their currently lower level of development.</p>
<p>Some topics lie at the intersection of different quadrants. The use of <italic>bots for detecting anomalies</italic> is positioned on the boundary between the first and fourth quadrants. This suggests a subject that is central to the field, with high relevance but an intermediate level of development. Future research may consolidate work on the use of bots in detecting phishing attacks, or the topic may remain only a reference point for other research directions. Similarly, concerns regarding the creation of <italic>blacklists</italic> and the use of <italic>ML and DL classifiers</italic> are located at the intersection of the third and fourth quadrants. This positioning suggests that these directions are recognized as relevant and connected to the core themes of the field but are still at a relatively early stage of development.</p>
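<p>The quadrant placement discussed above can be reproduced approximately from the rank values in Table 6. The sketch below assumes that the map is split at the middle rank (8 of 15) on both axes. This is an illustrative assumption, and Bibliometrix may place the axes differently.</p>
<preformat><![CDATA[
```python
# (rank centrality, rank density) for each cluster, copied from Table 6.
RANKS = {
    "Web security": (2, 5),
    "Classifier": (8, 7),
    "BERT": (3, 13),
    "Fraud": (13, 12),
    "Anomaly detection": (9, 8),
    "Detection": (10, 6),
    "Machine-learning": (15, 2),
    "Convolution neural network": (7, 3),
    "Security": (12, 10),
    "LSTM": (6, 11),
    "Social engineering": (14, 14),
    "Blockchains": (11, 1),
    "XGBoost": (1, 15),
    "Decision tree": (4, 4),
    "Large language models": (5, 9),
}


def quadrant(rank_centrality, rank_density, midpoint):
    """Name the thematic-map quadrant of a cluster, given an assumed midpoint rank."""
    if rank_centrality == midpoint or rank_density == midpoint:
        return "boundary"
    central = rank_centrality > midpoint
    dense = rank_density > midpoint
    if central and dense:
        return "motor"
    if dense:
        return "niche"
    if central:
        return "basic"
    return "emerging or declining"


MIDPOINT = (len(RANKS) + 1) / 2  # 8.0 for the 15 clusters (assumed split point)
PLACEMENT = {name: quadrant(c, d, MIDPOINT) for name, (c, d) in RANKS.items()}
for name, label in PLACEMENT.items():
    print(f"{name}: {label}")
```
]]></preformat>
<p>Under this assumption, social engineering, fraud and security fall in the motor quadrant, XGBoost, BERT, LSTM and large language models in the niche quadrant, and machine learning and blockchains among the basic themes. Anomaly detection and classifier land on a boundary, which is in line with the description above.</p>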
<p>The thematic evolution of the research field reflects the changes in its core topics over time, revealing trends, emerging areas, and the persistence or decline of specific themes. Throughout the analyzed period, the theme associated with the use of AI in phishing detection has evolved due to both the expansion of the phenomenon and technological advancements. This analysis was undertaken using the Bibliometrix package, which supports a longitudinal analysis by segmenting the data into time periods. We divided the analyzed period into three time frames of different lengths to account for variations in the field&#x2019;s evolution. The first period spans roughly a decade (2005&#x2013;2015), as research in this area was more limited and topics were less diverse. The remaining years were split into two intervals (2016&#x2013;2020 and 2021&#x2013;2025), reflecting the increasing diversification of topics related to the use of AI for phishing detection. <xref ref-type="table" rid="tab7">Table 7</xref> provides details on these transformations based on authors&#x2019; keywords. The same list of excluded terms and synonyms used in the thematic analysis was applied in the case of thematic evolution. Among the hyperparameters, only the Min Cluster Frequency (per thousand docs) value was modified, to 5. The remaining parameters (Number of Words, Weight Index, Min Weight Index, Label, Number of Labels (for each cluster), and Clustering Algorithm) remained unchanged. Specifically, Number of Words was set to 250, Min Weight Index was set to 0.1, Label was set to 0.3, Number of Labels (for each cluster) was set to 3, and Walktrap was selected as the Clustering Algorithm. The Weight Index chosen for thematic evolution was the Inclusion Index weighted by Word Occurrences. To establish the parameters, we applied the same reasoning as in the thematic analysis.</p>
<table-wrap position="float" id="tab7">
<label>Table 7</label>
<caption>
<p>Thematic evolution.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">From</th>
<th align="left" valign="top">To</th>
<th align="center" valign="top">Weighted inclusion index</th>
<th align="center" valign="top">Inclusion index</th>
<th align="center" valign="top">Occurrences</th>
<th align="center" valign="top">Stability index</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">Anomaly detection&#x2014;2005&#x2013;2015</td>
<td align="left" valign="top">Anomaly detection&#x2014;2016&#x2013;2020</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">5</td>
<td align="center" valign="top">0.11</td>
</tr>
<tr>
<td align="left" valign="top">Anomaly detection&#x2014;2005&#x2013;2015</td>
<td align="left" valign="top">Computer crime&#x2014;2016&#x2013;2020</td>
<td align="center" valign="top">0.17</td>
<td align="center" valign="top">0.17</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">0.07</td>
</tr>
<tr>
<td align="left" valign="top">Anomaly detection&#x2014;2005&#x2013;2015</td>
<td align="left" valign="top">Deep learning&#x2014;2016&#x2013;2020</td>
<td align="center" valign="top">0.10</td>
<td align="center" valign="top">0.11</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">0.04</td>
</tr>
<tr>
<td align="left" valign="top">Boosting&#x2014;2005&#x2013;2015</td>
<td align="left" valign="top">Machine-learning&#x2014;2016&#x2013;2020</td>
<td align="center" valign="top">0.33</td>
<td align="center" valign="top">0.33</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">0.05</td>
</tr>
<tr>
<td align="left" valign="top">Decision tree&#x2014;2005&#x2013;2015</td>
<td align="left" valign="top">Random forest&#x2014;2016&#x2013;2020</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">0.50</td>
</tr>
<tr>
<td align="left" valign="top">Machine-learning&#x2014;2005&#x2013;2015</td>
<td align="left" valign="top">Computer security&#x2014;2016&#x2013;2020</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">4</td>
<td align="center" valign="top">0.08</td>
</tr>
<tr>
<td align="left" valign="top">Machine-learning&#x2014;2005&#x2013;2015</td>
<td align="left" valign="top">Machine-learning&#x2014;2016&#x2013;2020</td>
<td align="center" valign="top">0.70</td>
<td align="center" valign="top">0.08</td>
<td align="center" valign="top">23</td>
<td align="center" valign="top">0.04</td>
</tr>
<tr>
<td align="left" valign="top">Machine-learning&#x2014;2005&#x2013;2015</td>
<td align="left" valign="top">Social engineering&#x2014;2016&#x2013;2020</td>
<td align="center" valign="top">0.13</td>
<td align="center" valign="top">0.33</td>
<td align="center" valign="top">3</td>
<td align="center" valign="top">0.07</td>
</tr>
<tr>
<td align="left" valign="top">Accuracy&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Security&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">0.67</td>
<td align="center" valign="top">0.50</td>
<td align="center" valign="top">4</td>
<td align="center" valign="top">0.06</td>
</tr>
<tr>
<td align="left" valign="top">Anomaly detection&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Anomaly detection&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">0.20</td>
</tr>
<tr>
<td align="left" valign="top">Bot&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Bot&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">8</td>
<td align="center" valign="top">1.00</td>
</tr>
<tr>
<td align="left" valign="top">Computer crime&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Convolution neural network&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">0.17</td>
<td align="center" valign="top">0.25</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">0.11</td>
</tr>
<tr>
<td align="left" valign="top">Computer crime&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Deep neural network&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">0.17</td>
<td align="center" valign="top">0.50</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">0.14</td>
</tr>
<tr>
<td align="left" valign="top">Computer security&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Security&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">0.06</td>
</tr>
<tr>
<td align="left" valign="top">Convolution neural network&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Convolution neural network&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">0.64</td>
<td align="center" valign="top">0.33</td>
<td align="center" valign="top">9</td>
<td align="center" valign="top">0.17</td>
</tr>
<tr>
<td align="left" valign="top">Cyberattack&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Machine-learning&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">3</td>
<td align="center" valign="top">0.03</td>
</tr>
<tr>
<td align="left" valign="top">Cybercrime&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Machine-learning&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">5</td>
<td align="center" valign="top">0.03</td>
</tr>
<tr>
<td align="left" valign="top">Deep learning&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Anomaly detection&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">0.14</td>
<td align="center" valign="top">0.20</td>
<td align="center" valign="top">7</td>
<td align="center" valign="top">0.04</td>
</tr>
<tr>
<td align="left" valign="top">Deep learning&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">LSTM&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">0.09</td>
<td align="center" valign="top">0.33</td>
<td align="center" valign="top">3</td>
<td align="center" valign="top">0.05</td>
</tr>
<tr>
<td align="left" valign="top">Deep learning&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Machine-learning&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">0.73</td>
<td align="center" valign="top">0.05</td>
<td align="center" valign="top">32</td>
<td align="center" valign="top">0.02</td>
</tr>
<tr>
<td align="left" valign="top">Deep learning&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Security&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">0.02</td>
<td align="center" valign="top">0.06</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">0.03</td>
</tr>
<tr>
<td align="left" valign="top">Detection&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Machine-learning&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">5</td>
<td align="center" valign="top">0.03</td>
</tr>
<tr>
<td align="left" valign="top">Ensemble&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Machine-learning&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">0.57</td>
<td align="center" valign="top">0.50</td>
<td align="center" valign="top">4</td>
<td align="center" valign="top">0.03</td>
</tr>
<tr>
<td align="left" valign="top">LSTM&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">LSTM&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">0.33</td>
</tr>
<tr>
<td align="left" valign="top">Machine-learning&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Blockchains&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">0.07</td>
<td align="center" valign="top">0.25</td>
<td align="center" valign="top">6</td>
<td align="center" valign="top">0.05</td>
</tr>
<tr>
<td align="left" valign="top">Machine-learning&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Machine-learning&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">0.84</td>
<td align="center" valign="top">0.06</td>
<td align="center" valign="top">120</td>
<td align="center" valign="top">0.02</td>
</tr>
<tr>
<td align="left" valign="top">Machine-learning&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Security&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">0.05</td>
<td align="center" valign="top">0.06</td>
<td align="center" valign="top">9</td>
<td align="center" valign="top">0.03</td>
</tr>
<tr>
<td align="left" valign="top">Mobile phishing&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Machine-learning&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">0.18</td>
<td align="center" valign="top">0.25</td>
<td align="center" valign="top">2</td>
<td align="center" valign="top">0.03</td>
</tr>
<tr>
<td align="left" valign="top">Random forest&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Convolution neural network&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">0.46</td>
<td align="center" valign="top">0.50</td>
<td align="center" valign="top">6</td>
<td align="center" valign="top">0.20</td>
</tr>
<tr>
<td align="left" valign="top">Random forest&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Machine-learning&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">0.54</td>
<td align="center" valign="top">0.50</td>
<td align="center" valign="top">7</td>
<td align="center" valign="top">0.03</td>
</tr>
<tr>
<td align="left" valign="top">Social engineering&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Security&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">0.74</td>
<td align="center" valign="top">0.33</td>
<td align="center" valign="top">14</td>
<td align="center" valign="top">0.06</td>
</tr>
<tr>
<td align="left" valign="top">Social media&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Machine-learning&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">0.71</td>
<td align="center" valign="top">0.50</td>
<td align="center" valign="top">5</td>
<td align="center" valign="top">0.03</td>
</tr>
<tr>
<td align="left" valign="top">Web security&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Machine-learning&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">0.78</td>
<td align="center" valign="top">0.50</td>
<td align="center" valign="top">7</td>
<td align="center" valign="top">0.03</td>
</tr>
<tr>
<td align="left" valign="top">XGBoost&#x2014;2016&#x2013;2020</td>
<td align="left" valign="top">Machine-learning&#x2014;2021&#x2013;2025</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">1.00</td>
<td align="center" valign="top">3</td>
<td align="center" valign="top">0.03</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The metrics used to quantify the transition or stability of the analyzed themes are the weighted inclusion index, the inclusion index, the number of occurrences, and the stability index. These metrics offer an overview of the trends over time and the importance of specific topics based on the articles in the dataset. The weighted inclusion index and the inclusion index are normalized metrics of relevance and overlap of themes, with values ranging from 0 to 1. A value of 1 indicates maximum overlap or relevance. Occurrences reflect the number of documents supporting the transition of themes. The stability index indicates the stability of a research theme between consecutive periods. A value closer to 1 indicates a higher number of studies supporting the transition of the theme. Together, these metrics provide a comprehensive view of how themes related to AI in phishing detection have developed and shifted over time, reflecting their growing or diminishing importance in the research landscape.</p>
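<p>As an illustration of how such overlap metrics behave, the following minimal Python sketch computes an inclusion index, an occurrence-weighted variant, and a stability index for two keyword clusters. It assumes the set-based definitions common in science mapping (shared keywords divided by the size of the smaller cluster for inclusion, and shared keywords divided by the size of the union for stability); the exact formulas implemented in Bibliometrix may differ and should be checked against its documentation. The function names and the toy keyword clusters are hypothetical and are not taken from the analyzed dataset.</p>
<preformat>
```python
# Illustrative sketch only: assumed set-based definitions, hypothetical names and data.

def inclusion_index(a: set, b: set) -> float:
    """Shared keywords divided by the size of the smaller cluster."""
    if not a or not b:
        return 0.0
    return len(a.intersection(b)) / min(len(a), len(b))


def weighted_inclusion_index(a_counts: dict, b: set) -> float:
    """Share of the earlier cluster's keyword occurrences that reappear in b."""
    total = sum(a_counts.values())
    if total == 0:
        return 0.0
    shared = sum(count for word, count in a_counts.items() if word in b)
    return shared / total


def stability_index(a: set, b: set) -> float:
    """Shared keywords divided by all distinct keywords of both clusters."""
    union = a.union(b)
    if not union:
        return 0.0
    return len(a.intersection(b)) / len(union)


# Hypothetical keyword occurrences for a cluster in an earlier period.
earlier_counts = {"decision tree": 3, "classification": 2}
earlier = set(earlier_counts)
# Hypothetical keyword set for a cluster in the following period.
later = {"random forest", "decision tree", "classification", "ensemble"}

print(inclusion_index(earlier, later))
print(weighted_inclusion_index(earlier_counts, later))
print(stability_index(earlier, later))
```
</preformat>
<p>With these toy clusters, both inclusion values equal 1.0 because every keyword of the earlier cluster reappears in the later one, whereas the stability index is 0.5 because the later cluster adds two keywords that the earlier one lacks. This shows how a transition can combine full inclusion with a much lower stability value.</p>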
<p>Each row in the table represents a change or continuation in research on a specific topic. A perfect transition is indicated by a value of 1 for the weighted inclusion index, inclusion index, and stability index. The weighted inclusion index and inclusion index both have a value of 1 for the following themes: Decision Tree to Random Forest (2005&#x2013;2015 to 2016&#x2013;2020), an expected transition given that Random Forest is an extension of Decision Tree; cyberattack and cybercrime to ML (2016&#x2013;2020 to 2021&#x2013;2025); and XGBoost to ML (2016&#x2013;2020 to 2021&#x2013;2025). This value indicates a maximum overlap or relevance between the mentioned themes.</p>
<p>The bot and LSTM themes both continue from 2016&#x2013;2020 into 2021&#x2013;2025, reflecting a consistent interest. However, LSTM has a smaller number of occurrences (2) and lower stability (0.33), whereas bot has a higher number of occurrences (8) and strong stability (1). This indicates that while bots maintain a strong presence with extensive research, LSTM has a more limited and less stable presence. Furthermore, ML (2005&#x2013;2015 to 2016&#x2013;2020 and 2016&#x2013;2020 to 2021&#x2013;2025) shows continuity, reflecting the stability of the topic over the entire period analyzed, with a high weighted inclusion index in both cases. The lower inclusion index indicates a constant but slightly reduced interest, which may suggest both progress and saturation in the research efforts related to ML for anti-phishing. On the other hand, the very high number of occurrences of the concept (120) in the latter period might be an indicator of significant progress in the field.</p>
<p>The analysis period was divided into three slices: from 2005 to 2015, from 2016 to 2020, and from 2021 to 2025 (<xref ref-type="fig" rid="fig6">Figure 6</xref>). This decision was influenced by the evolution of the number of published works on the analyzed topic across these periods and by the diversity of topics. In the first slice, the number of published papers was very small, with authors&#x2019; concerns focused on four research directions: ML, anomaly detection, boosting, and decision trees. In the subsequent period, the number of published papers on the subject increased significantly, with upward trends until 2022 and again from 2023 to 2024. Themes that appeared at least ten times per thousand documents (Min Cluster Frequency (per thousand docs)) were considered relevant for inclusion in the graphic. We set the threshold at 10 to ensure that clusters represent recurring and meaningful topics, not isolated ones.</p>
<fig position="float" id="fig6">
<label>Figure 6</label>
<caption>
<p>Thematic evolution based on authors&#x2019; keywords.</p>
</caption>
<graphic xlink:href="frai-08-1496580-g006.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Sankey diagram depicting the evolution of keywords in three time periods: 2005-2015, 2016-2020, and 2021-2025. Initial terms like "machine learning," "anomaly detection," and "decision tree" in 2005-2015 transition into others like "computer security," "deep learning," and "cybercrime" in 2016-2020. By 2021-2025, main terms include "machine learning," "security," and "convolution neural network." Flow thickness illustrates the transition intensity between periods.</alt-text>
</graphic>
</fig>
<p>In the second period, initial themes evolved and led to the emergence of new topics. ML was carried over into subsequent periods in its original form but also led to a new direction, focusing on computer security for phishing identification. Decision Tree evolved into Random Forest, which in turn fed into ML and CNN in the period 2021&#x2013;2025. During the 2016&#x2013;2020 period, new themes related to AI in phishing detection emerged, including bot, LSTM, XGBoost, CNN, cyberattack, cybercrime, deep learning, web security, social engineering, and social media. Themes such as Random Forest, ML, deep learning, cybercrime, accuracy detection, and cyberattack from 2016 to 2020 merged and reformed into the theme of ML for 2021&#x2013;2025.</p>
</sec>
</sec>
<sec sec-type="discussion" id="sec13">
<label>5</label>
<title>Discussion</title>
<sec id="sec14">
<label>5.1</label>
<title>RQ1. How has the publication landscape in AI for phishing detection research evolved over time?</title>
<p>Phishing attacks are widespread across the globe, and the methods to counter them are also of global interest (<xref ref-type="bibr" rid="ref65">Mutluturk and Metin, 2023</xref>). The results presented in the previous section highlight a <italic>significant growth</italic> and <italic>specialization in research</italic> on the use of AI for detecting and mitigating phishing attacks, particularly in recent years. This trend can be attributed to a combination of factors. On one hand, the frequency of phishing attacks has increased markedly, while on the other hand, AI has become substantially more advanced, continuously enhancing its ability to understand complex behaviors, detect patterns within large datasets, and adapt to identify progressively sophisticated phishing techniques. The authors of the analyzed papers focused on subdomains of AI, such as ML, DL, and NLP, to identify the most effective algorithms and methods to improve the results obtained in phishing prevention and detection.</p>
<p>The top 10 most cited studies identified through our bibliometric analysis focus on phishing detection using ML methods and, more recently, DL techniques. The types of data analyzed by these algorithms include URLs and their components, HTML content and hyperlinks, JavaScript behavior, network indicators and third-party services, as well as metadata from search engines, among others. A taxonomy of the types of data used by the most cited articles is presented in <xref ref-type="fig" rid="fig7">Figure 7</xref>, and the classification of the used datasets is visible in <xref ref-type="fig" rid="fig8">Figure 8</xref>.</p>
<fig position="float" id="fig7">
<label>Figure 7</label>
<caption>
<p>Types of data used in phishing detection.</p>
</caption>
<graphic xlink:href="frai-08-1496580-g007.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Hierarchical diagram showing types of data used in Machine Learning (ML) and Deep Learning (DL). Under ML: URL and components, HTML content and hyperlinks, network indicators and third-party services, and search engine metadata. Under DL: URL and components, visual data (screenshots). Various studies cited.</alt-text>
</graphic>
</fig>
<fig position="float" id="fig8">
<label>Figure 8</label>
<caption>
<p>Datasets used in phishing detection.</p>
</caption>
<graphic xlink:href="frai-08-1496580-g008.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Flowchart displaying datasets categorized into ML and DL. Under ML: Public datasets from Alexa, PhishTank, OpenPhish, DMOZ; mixed sources; built by authors; domain-specific for payment gateways and banking websites. Under DL: Large URL collections, approximately 2 million. Each category cites references.</alt-text>
</graphic>
</fig>
<p>The main objectives pursued by the authors are to identify discriminative features between legitimate and phishing websites, develop ML/DL models for website classification, reduce false alarm rates and response times, and design scalable real-time solutions. The results reported by these authors are summarized in <xref ref-type="table" rid="tab8">Table 8</xref> and <xref ref-type="fig" rid="fig9">Figure 9</xref>.</p>
<table-wrap position="float" id="tab8">
<label>Table 8</label>
<caption>
<p>Summary of the 10 most cited articles in our dataset.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Title</th>
<th align="left" valign="top">Method/model</th>
<th align="left" valign="top">Used data</th>
<th align="left" valign="top">Algorithms/techniques</th>
<th align="left" valign="top">Main findings</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top"><italic>Toward Detection of Phishing Websites on Client-Side Using Machine Learning Based Approach</italic> (<xref ref-type="bibr" rid="ref45">Jain and Gupta, 2018b</xref>)</td>
<td align="left" valign="top">ML on multiple datasets</td>
<td align="left" valign="top">PhishTank, OpenPhish, Alexa, payment gateways, banks</td>
<td align="left" valign="top">RF, SVM, Neural Nets, Logistic Regression, Naive Bayes</td>
<td align="left" valign="top">Improved accuracy using client-side data extraction</td>
</tr>
<tr>
<td align="left" valign="top"><italic>Detection of Phishing Websites Using an Efficient Feature-Based Machine Learning Framework</italic> (<xref ref-type="bibr" rid="ref79">Rao and Pais, 2019</xref>)</td>
<td align="left" valign="top">Feature extraction from URL&#x202F;+&#x202F;source code + 3rd parties</td>
<td align="left" valign="top">Diverse datasets</td>
<td align="left" valign="top">8 ML algorithms</td>
<td align="left" valign="top">Better than CANTINA/CANTINA+, detects zero-day phishing</td>
</tr>
<tr>
<td align="left" valign="top"><italic>Phishing Website Detection Based on Multidimensional Features Driven by Deep Learning</italic> (<xref ref-type="bibr" rid="ref100">Yang et al., 2019</xref>)</td>
<td align="left" valign="top">CNN for phishing detection</td>
<td align="left" valign="top">~2M URLs (1,021,758 phishing + 989,021 legitimate)</td>
<td align="left" valign="top">CNN</td>
<td align="left" valign="top">High performance and fast processing speed</td>
</tr>
<tr>
<td align="left" valign="top"><italic>Machine Learning Based Phishing Detection from URLs</italic> (<xref ref-type="bibr" rid="ref81">Sahingoz et al., 2019</xref>)</td>
<td align="left" valign="top">Custom dataset + NLP</td>
<td align="left" valign="top">73,575 URLs (36,400 legitimate, 37,175 phishing)</td>
<td align="left" valign="top">DT, AdaBoost, K-star, kNN, RF, SMO, Naive Bayes</td>
<td align="left" valign="top">Scalable, real-time, detects new phishing attempts</td>
</tr>
<tr>
<td align="left" valign="top"><italic>PhishStorm: Detecting Phishing with Streaming Analytics</italic> (<xref ref-type="bibr" rid="ref57">Marchal et al., 2014</xref>)</td>
<td align="left" valign="top">PhishStorm &#x2013; real-time detection</td>
<td align="left" valign="top">PhishTank, DMOZ: URLs + search engine queries</td>
<td align="left" valign="top">Classical ML on URL components</td>
<td align="left" valign="top">94.91% accuracy, 1.44% false positives (FP)</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A Machine Learning Based Approach for Phishing Detection Using Hyperlinks Information</italic> (<xref ref-type="bibr" rid="ref46">Jain and Gupta, 2019</xref>)</td>
<td align="left" valign="top">HTML hyperlinks analysis</td>
<td align="left" valign="top">PhishTank, OpenPhish, Alexa: Hyperlinks from source code</td>
<td align="left" valign="top">Logistic Regression + 12 hyperlink features</td>
<td align="left" valign="top">Achieved 98.4% accuracy, language-independent</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A New Hybrid Ensemble Feature Selection Framework for Machine Learning-Based Phishing Detection System</italic> (<xref ref-type="bibr" rid="ref19">Chiew et al., 2019</xref>)</td>
<td align="left" valign="top">HEFS + CDF-g for optimal feature selection</td>
<td align="left" valign="top">Multiple sources</td>
<td align="left" valign="top">Ensemble framework</td>
<td align="left" valign="top">Improves accuracy through optimal feature selection</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A Stacking Model Using URL and HTML Features for Phishing Webpage Detection</italic> (<xref ref-type="bibr" rid="ref55">Li et al., 2019</xref>)</td>
<td align="left" valign="top">Stacking model on URL&#x202F;+&#x202F;HTML features</td>
<td align="left" valign="top">PhishTank (2k webpages)&#x202F;+&#x202F;Alexa (49,947 webpages)</td>
<td align="left" valign="top">Combined SVM, NN, DT, RF</td>
<td align="left" valign="top">High accuracy, stacking outperforms individual models</td>
</tr>
<tr>
<td align="left" valign="top"><italic>CANTINA+: A Feature-Rich Machine Learning Framework for Detecting Phishing Web Sites</italic> (<xref ref-type="bibr" rid="ref97">Xiang et al., 2011</xref>)</td>
<td align="left" valign="top">Extraction of 15 high-level webpage characteristics from URLs, HTML DOM, 3rd party services, search engines</td>
<td align="left" valign="top">Diverse Web resources</td>
<td align="left" valign="top">SVM, Logistic Regression, Bayesian Network, J48, Random Forest, AdaBoost</td>
<td align="left" valign="top">Good TP/FP rate, competitive solution</td>
</tr>
<tr>
<td align="left" valign="top"><italic>A Comprehensive Survey of AI-Enabled Phishing Attacks Detection Techniques</italic> (<xref ref-type="bibr" rid="ref13">Basit et al., 2021</xref>)</td>
<td align="left" valign="top">Review on phishing</td>
<td align="left" valign="top">Diverse datasets</td>
<td align="left" valign="top">RF, SVM, kNN</td>
<td align="left" valign="top">ML and DL have up to 99% accuracy, much better than heuristics and data mining approaches</td>
</tr>
</tbody>
</table>
</table-wrap>
<fig position="float" id="fig9">
<label>Figure 9</label>
<caption>
<p>Main contributions of the most cited studies.</p>
</caption>
<graphic xlink:href="frai-08-1496580-g009.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Flowchart with a yellow box titled "Contributions" connected to four white boxes. These list: "Improved accuracy in phishing detection" with citations, "Real-time detection" citing Marchal et al., 2014, "Zero-day phishing detection" citing Rao &#x0026; Pais, 2019, and "Scalability and fast processing speed" with citations.</alt-text>
</graphic>
</fig>
<p>The key trends extracted from an in-depth content analysis of the top 10 most cited articles, ranked by Local Citation Rate (LCR), a metric reflecting their relevance within the domain, are:</p>
<list list-type="order">
<list-item>
<p><italic>Growing adoption of ML methods</italic>. ML techniques are increasingly used for phishing detection, with frequently employed algorithms including Random Forest, Support Vector Machines, Neural Networks, Logistic Regression, Naive Bayes, k-Nearest Neighbors, Decision Trees, and AdaBoost. These approaches use an extended number of features of different types (e.g., URL configurations, source code, third-party services data) and are applied to large and diverse datasets (Alexa, PhishTank, OpenPhish, payment gateways data, or data collected or generated by the authors themselves). They achieve high detection rates compared with traditional methods: for instance, <xref ref-type="bibr" rid="ref13">Basit et al. (2021)</xref> report accuracies of up to 99% for ML and DL, while <xref ref-type="bibr" rid="ref96">Wang et al. (2019)</xref> calculated a 97% accuracy for an algorithm based on CNN.</p>
</list-item>
<list-item>
<p><italic>Shift from classical ML to DL</italic>. While classical ML algorithms dominated research until 2018 and achieved strong performance, they required manually curated feature sets and exhibited limitations in detecting previously unseen phishing attacks (e.g., tiny URLs, BiB &#x2013; false authentication windows). After 2018, DL techniques, such as CNNs, DNNs, and stacking-based architectures, gained significant traction, offering higher accuracy, better scalability, and improved generalization. In particular, the adoption of CNNs has grown due to their ability to capture local correlations in data, especially in Big Data environments (<xref ref-type="bibr" rid="ref100">Yang et al., 2019</xref>). To mitigate the long training times associated with DL, <xref ref-type="bibr" rid="ref81">Sahingoz et al. (2019)</xref> recommend leveraging parallel processing techniques. <xref ref-type="fig" rid="fig10">Figure 10</xref> presents the main differences between ML and DL in the phishing detection process.</p>
</list-item>
<list-item>
<p><italic>Increasing importance of feature selection and engineering</italic>. Feature engineering plays a critical role in enhancing detection performance. For example, <xref ref-type="bibr" rid="ref19">Chiew et al. (2019)</xref> introduce the Hybrid Ensemble Feature Selection (HEFS) framework and the CDF-g algorithm for optimal feature selection. Modern approaches typically combine URL-based features (e.g., length, entropy, number of subdomains), HTML-based features (e.g., suspicious links, login forms), and third-party data (e.g., Alexa rank, domain reputation).</p>
</list-item>
<list-item>
<p><italic>Hybrid models and classifier stacking</italic>. The analyzed studies demonstrate that combining ML and DL models with stacking-based approaches (<xref ref-type="bibr" rid="ref55">Li et al., 2019</xref>) significantly improves detection performance. Furthermore, integrating feature selection as a preprocessing step enhances accuracy by focusing on the most relevant attributes within the dataset.</p>
</list-item>
<list-item>
<p><italic>Real-time phishing detection</italic>. Real-time detection systems are considered a priority. For instance, PhishStorm (<xref ref-type="bibr" rid="ref57">Marchal et al., 2014</xref>) introduces an automated, Big Data-supported URL analysis framework capable of achieving 94.91% accuracy with a 1.44% false positive rate.</p>
</list-item>
<list-item>
<p><italic>Faster, scalable, and language-independent systems</italic>. Modern solutions are evolving toward client-side architectures that are fast, lightweight, and less dependent on third-party databases. This independence enhances scalability and robustness. Additionally, continuous model retraining on large, up-to-date datasets is increasingly adopted to maintain high adaptability against evolving phishing strategies.</p>
</list-item>
</list>
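<p>As an illustration of the URL-based features mentioned above (length, entropy, number of subdomains), the following minimal Python sketch computes them for a single URL. It is not taken from any of the reviewed studies: the function names <monospace>shannon_entropy</monospace> and <monospace>url_features</monospace> are hypothetical, and the subdomain count relies on the simplifying assumption that the registered domain consists of the last two labels of the host name.</p>
<preformat>
```python
import math
from collections import Counter
from urllib.parse import urlparse


def shannon_entropy(text: str) -> float:
    """Shannon entropy (bits per character) of a string; 0.0 if empty."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def url_features(url: str) -> dict:
    """Return three lexical features of a URL (illustrative only)."""
    host = urlparse(url).hostname or ""
    labels = [part for part in host.split(".") if part]
    # Simplifying assumption: the last two labels form the registered
    # domain, which is not true for suffixes such as "co.uk".
    subdomains = max(len(labels) - 2, 0)
    return {
        "length": len(url),
        "entropy": shannon_entropy(url),
        "num_subdomains": subdomains,
    }


if __name__ == "__main__":
    print(url_features("http://login.secure.example.com/verify"))
```
</preformat>
<p>In practice, such lexical features would be only one part of the input to a classifier, alongside HTML-based and third-party features, and a public suffix list would likely be needed to count subdomains reliably.</p>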
<fig position="float" id="fig10">
<label>Figure 10</label>
<caption>
<p>ML versus DL: differences in the phishing detection process.</p>
</caption>
<graphic xlink:href="frai-08-1496580-g010.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Comparison chart of Machine Learning (ML) and Deep Learning (DL). ML uses text data, small datasets, manual feature engineering, models like RF and SVM, and low-resource training. DL uses large text and visual datasets, automatic feature extraction, CNN models, and GPU-intensive training.</alt-text>
</graphic>
</fig>
<p>Overall, the field is witnessing a paradigm shift&#x2014;from traditional ML-based approaches toward hybrid, DL-driven, real-time detection systems that integrate advanced feature engineering and scalable architecture. These developments position AI-powered solutions as the cornerstone of next-generation phishing defense mechanisms.</p>
</sec>
<sec id="sec15">
<label>5.2</label>
<title>RQ2. What are the core thematic clusters within the AI-based phishing detection research field, and how do these clusters interact and evolve over time?</title>
<p>The field is anchored around social engineering, network security, and ML/DL-based detection, while advanced methods like LLMs, BERT, CNNs, and blockchain are gaining traction. Future research is likely to consolidate work on bot-based anomaly detection and expand AI techniques for emerging phishing vectors such as smishing and social media attacks.</p>
<p>Research on AI-based phishing detection has evolved rapidly over the past two decades. In its initial stage (2005&#x2013;2015), studies were limited and relied on basic ML techniques applied to URL and webpage classification using small, manually engineered datasets. The expansion phase (2016&#x2013;2020) brought a surge in publications and the adoption of more advanced ML and DL methods, including Random Forests, CNNs, LSTMs, and XGBoost, alongside emerging themes such as bot detection, social engineering, and web security. In the consolidation phase (2021&#x2013;2025), research has shifted toward integrated ML&#x2013;DL frameworks, with CNN-based architectures dominating large-scale phishing detection, ensemble methods becoming standard, and bot detection maturing as a stable area. While ML remains central, there is a clear trend toward hybrid, AI-driven solutions that enhance accuracy, scalability, and zero-day detection capabilities (<xref ref-type="fig" rid="fig11">Figure 11</xref>).</p>
<fig position="float" id="fig11">
<label>Figure 11</label>
<caption>
<p>Main trends over time.</p>
</caption>
<graphic xlink:href="frai-08-1496580-g011.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Timeline illustrating the progression of machine learning in phishing detection. From 2005 to 2015, classic ML and anomaly detection with small datasets emerge. 2016 to 2021 sees diversification into complex models like Random Forest, CNN, addressing cyberattacks. 2021 to 2025 focuses on advanced AI with CNN, deep learning, emphasizing botnet detection and zero-day attack mitigation.</alt-text>
</graphic>
</fig>
<p>Regarding the thematic evolution of the field, the analysis conducted using Bibliometrix reveals a gradual shift from simple approaches to increasingly complex methodologies based on ML and DL, accompanied by the emergence of new research directions:</p>
<list list-type="bullet">
<list-item>
<p>ML remains the foundation of phishing detection but is increasingly complemented by DL and hybrid models;</p>
</list-item>
<list-item>
<p>CNNs are becoming the de facto standard for handling large-scale datasets and detecting complex patterns;</p>
</list-item>
<list-item>
<p>XGBoost and Random Forest remain core algorithms but are now frequently integrated into ensemble-based detection systems;</p>
</list-item>
<list-item>
<p>Bot detection and social media analytics are gaining prominence in the context of large-scale phishing campaigns;</p>
</list-item>
<list-item>
<p>Scalability, real-time detection, and the ability to identify zero-day phishing attacks have become critical priorities;</p>
</list-item>
<list-item>
<p>As classical ML approaches reach a saturation point, the focus is shifting toward integrating advanced AI techniques to achieve higher accuracy and greater adaptability.</p>
</list-item>
</list>
</sec>
<sec id="sec16">
<label>5.3</label>
<title>RQ3. How are the AI technologies identified in the study utilized within organizations?</title>
<p>An important aspect highlighted by recent research is that the most effective solutions for protecting potential phishing victims need to be implemented at the organizational level through the application of technology security governance, following a strict taxonomy classification (<xref ref-type="bibr" rid="ref73">Peji&#x0107;-Bach et al., 2023</xref>). These observations reiterate the potential of AI, given the significant resources required for the implementation of these technologies. <xref ref-type="fig" rid="fig12">Figure 12</xref> presents a taxonomy of anti-phishing solutions within organizations, covering both traditional and AI-based methods.</p>
<fig position="float" id="fig12">
<label>Figure 12</label>
<caption>
<p>Practical implications&#x2014;a taxonomy.</p>
</caption>
<graphic xlink:href="frai-08-1496580-g012.tif" mimetype="image" mime-subtype="tiff">
<alt-text content-type="machine-generated">Flowchart illustrating anti-phishing solutions. It divides into Technical and Administrative branches. Technical includes methods: Traditional (rule-based systems, keyword matching, blacklists) and AI-based (DL, ML, NLP with tools like Barracuda, PhishTitan, Trustifi). Integration involves endpoint protection, security platforms, secure email gateways, and cloud-based systems. Source analyzed includes suspicious URLs, messages, anomalies, and brand impersonation. Administrative covers employee training and security policy implementation.</alt-text>
</graphic>
</fig>
<p>Developing effective systems for phishing attack detection remains a persistent challenge for <bold>cybersecurity experts</bold>. While traditional methods use mostly rule-based systems, keyword matching, blacklists, and heuristic techniques, current approaches predominantly leverage ML and DL algorithms; however, these techniques often exhibit high false-positive rates and demand substantial computational resources (<xref ref-type="bibr" rid="ref13">Basit et al., 2021</xref>) and access to relevant and updated large datasets. The significance of AI in detecting phishing attacks is primarily attributed to the capacity of ML and DL models to learn from new data and enhance detection accuracy over time. The current analysis of the most relevant studies identified through bibliometric tools highlights a sustained focus on improving the algorithms employed, particularly in response to challenges such as the shift of phishing attacks to mobile platforms, the targeting of multilingual websites, and the evolving nature of phishing tactics. AI&#x2019;s ability to integrate diverse data sources (e.g., images) and adapt to new attack patterns positions it as an important tool in addressing the existing gap in detection systems, enhancing both their robustness and adaptability.</p>
<p><bold>The interest of cybersecurity companies in integrating AI into phishing detection</bold> mechanisms is increasingly evident. Besides the AI-powered anti-phishing techniques used by giants like Microsoft and Google to provide safe browsing for their users, diverse measures are adopted by smaller players in the cybersecurity market. For example, Barracuda employs AI to enhance phishing detection by analyzing email content in real time, identifying anomalies, and automating remediation while continuously adapting to emerging threats. The system refines its detection models through ML techniques that assess behavioral deviations, minimizing false positives. Additionally, AI-powered anomaly detection in Barracuda&#x2019;s Managed XDR (eXtended Detection and Response) establishes security baselines, identifies suspicious activities, and enables proactive threat mitigation across diverse environments (<xref ref-type="bibr" rid="ref12">Barracuda, 2024</xref>). PhishTitan applies ML algorithms to analyze email content, inspect headers, and identify potential phishing attempts. Curated threat intelligence feeds assist in detecting malicious URLs, while real-time URL analysis and rewriting help mitigate risks (<xref ref-type="bibr" rid="ref90">TitanHQ, 2024</xref>).</p>
<p>In both cases, integration with Microsoft 365 enables enhanced security measures against phishing threats. Trustifi&#x2019;s cloud-based security solution employs text-based AI to detect impersonation, spoofing, spear-phishing, and business email compromise while scanning URLs and attachments for malicious content. AI-powered filtering mechanisms ensure inbox hygiene by eliminating spam and graymail, thereby mitigating phishing risks (<xref ref-type="bibr" rid="ref92">Trustifi, 2024</xref>). Abnormal Security employs AI and ML through computer vision and NLP to analyze email content, benchmark behaviors, and assess risk in account activity. Multiple AI models, including identity and behavioral mapping, BERT large language models, and risk profiling, are integrated to detect anomalies and correlate threats based on user identity fluctuations, event context, and risk assessment. AI automates email security management by understanding user behavior, remediating attacks, correlating malicious reports, and leveraging conversational AI for real-time security training, while continuously refining detection capabilities with large language models (<xref ref-type="bibr" rid="ref86">Shiebler, 2023</xref>). Against this background, developers of cybersecurity solutions can leverage the findings of this study to identify the most effective technologies and algorithms for phishing detection and subsequently integrate them into their own products, provided that sufficiently large datasets are available to support ML and DL models. Moreover, the identification of the most relevant authors, their geographic distribution, and their collaboration patterns presented in this study can facilitate partnerships between academia and industry, accelerating the transfer of theoretical advancements into practical cybersecurity solutions. Additionally, the identification of key topics within author clusters enables organizations to align their research and development strategies with cutting-edge innovations in phishing detection.</p>
<p>The <bold>integration of AI-based solutions for detecting suspicious URLs and messages into an organization&#x2019;s security infrastructure</bold> can be implemented in various ways. AI-generated phishing alerts can be incorporated as a functionality within enterprise security dashboards and security information and event management (SIEM) systems (<xref ref-type="bibr" rid="ref72">PaloAltoNetworks, 2025</xref>). AI phishing detection works alongside antivirus software and firewalls in endpoint security solutions (<xref ref-type="bibr" rid="ref84">SentinelOne, 2025</xref>), and is embedded by services like Microsoft Defender, Google Workspace Security, and third-party cybersecurity platforms in cloud email protection solutions (<xref ref-type="bibr" rid="ref66">Nathanson and Yamunan, 2025</xref>). These integrations strengthen an organization&#x2019;s defense against phishing by providing comprehensive monitoring, real-time threat detection, and automated responses across various platforms and devices. Cybersecurity officers can contribute by selecting solutions such as email security gateways, web filters, or behavioral analysis applications, implementing, testing, and comparatively analyzing them in terms of performance and in relation to traditional applications without AI integration. Additionally, they can support the process by collecting user feedback on AI-enhanced anti-phishing software from the systems they manage and reporting it to application developers. Legitimate and malicious emails, URLs accessed by employees, and log data can be collected and shared with dataset creators for training detection algorithms. This crowdsourcing approach aims to mitigate the impact of phishing on companies by increasing both the number and diversity of cases used in algorithm training, enhancing their ability to distinguish between harmful and safe messages.</p>
<p>While AI plays an important role in automated detection, human awareness remains an essential component of phishing mitigation. AI-based phishing detection systems not only neutralize threats but can also educate users about the risks associated with phishing attempts. When a phishing attempt is detected, AI-generated alerts provide users with contextual information about the nature of the threat. These alerts may include explanations of why an email is suspicious, potential consequences of interacting with the message, and recommended actions, such as reporting the phishing attempt or avoiding engagement. Security awareness training platforms leverage AI-generated phishing simulations to test users&#x2019; responses to deceptive emails, helping organizations assess employee susceptibility to phishing attacks. Studies have shown that periodic phishing awareness training, combined with real-time AI-generated warnings, significantly reduces user engagement with phishing attempts (<xref ref-type="bibr" rid="ref84">SentinelOne, 2025</xref>).</p>
<p>Regarding the <bold>formulation of relevant policies</bold>, this study highlights several potentially valuable insights for policymakers. Identifying the most influential authors can facilitate their recognition as experts and their involvement in the development of knowledge networks at national or global levels. Mapping research topics into categories such as basic, driving, niche, and emerging/declining can guide funding decisions toward either well-established areas with demonstrated potential or promising emerging fields. Additionally, key research topics can be disseminated to the general public through awareness-raising campaigns, enhancing cybersecurity literacy and preparedness.</p>
</sec>
</sec>
<sec id="sec17">
<label>6</label>
<title>Contributions, implications and conclusions</title>
<p>As synthesized in <xref ref-type="table" rid="tab1">Table 1</xref>, previous studies provide an overview of the use of AI, ML, and DL in phishing detection, in the form of systematic or comprehensive reviews and bibliometric analyses. The main trends identified indicate that ML dominates phishing detection methods, while DL (DNN, CNN, RNN/LSTM) is gaining ground, achieving the highest accuracy rates. NLP is becoming essential for detecting sophisticated phishing attacks, particularly spear-phishing. Distributed architectures enable Big Data analysis and real-time phishing detection. Standardized datasets (PhishTank, Alexa, UCI) are the most commonly used, supporting model comparisons. A recent bibliometric analysis (<xref ref-type="bibr" rid="ref65">Mutluturk and Metin, 2023</xref>) reveals the constant growth of research, the strengthening of collaborations, and the central role of AI. CANTINA+ (<xref ref-type="bibr" rid="ref97">Xiang et al., 2011</xref>) and subsequent studies on URL-based detection (<xref ref-type="bibr" rid="ref81">Sahingoz et al., 2019</xref>) are considered foundational references in the field.</p>
<p>The contribution of this study in comparison with previous research is presented in <xref ref-type="table" rid="tab9">Table 9</xref>. The present study offers a focused bibliometric analysis dedicated exclusively to the application of AI in phishing detection, filling a gap left by previous research that addressed phishing either broadly or within the wider context of malware. By updating the temporal landscape, it highlights a significant growth in publications, with 2024 emerging as the most productive year and 2025 maintaining the upward trend. Unlike earlier studies that only mentioned DL as a promising direction, this research documents the full technological transition from classical ML to DL and hybrid models, explaining its drivers and performance advantages. Furthermore, it extends the discussion beyond academic models by incorporating practical insights from real-world AI-powered anti-phishing systems and proposes integration pathways into enterprise security infrastructures, thus bridging theoretical advancements with applied cybersecurity practices.</p>
<table-wrap position="float" id="tab9">
<label>Table 9</label>
<caption>
<p>Contributions.</p>
</caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left" valign="top">Dimension</th>
<th align="left" valign="top">Previous studies</th>
<th align="left" valign="top">Contributions of the present study</th>
<th align="left" valign="top">Added value compared to previous research</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left" valign="top">Scope and focus</td>
<td align="left" valign="top">Most studies are either general reviews (<xref ref-type="bibr" rid="ref64">Mujtaba et al., 2017</xref>; <xref ref-type="bibr" rid="ref77">Qabajeh et al., 2018</xref>; <xref ref-type="bibr" rid="ref52">Kunju et al., 2019</xref>; <xref ref-type="bibr" rid="ref10">Athulya and Praveen, 2020</xref>) or broad bibliometric analyses on malware, phishing in general, or phishing and Big Data (<xref ref-type="bibr" rid="ref58">Mat et al., 2021</xref>; <xref ref-type="bibr" rid="ref65">Mutluturk and Metin, 2023</xref>; <xref ref-type="bibr" rid="ref73">Peji&#x0107;-Bach et al., 2023</xref>)</td>
<td align="left" valign="top">Bibliometric analysis focused exclusively on AI for phishing detection, complemented by a content analysis of the 10 most cited papers in the dataset</td>
<td align="left" valign="top">ML and DL-centric overview of phishing detection, filling a gap in previous reviews</td>
</tr>
<tr>
<td align="left" valign="top">Temporal coverage</td>
<td align="left" valign="top">2021&#x2013;2025</td>
<td align="left" valign="top">Updates the temporal landscape by showing significant growth: 2024 is the most productive year (&#x2248;2&#x202F;&#x00D7;&#x202F;2023), with 2025 continuing the upward trend</td>
<td align="left" valign="top">Up-to-date perspective</td>
</tr>
<tr>
<td align="left" valign="top">Technological transition to DL</td>
<td align="left" valign="top">DL is mentioned as promising</td>
<td align="left" valign="top">Documents the transition from classical ML to DL and hybrid/stacking models, explaining the drivers: scalability, zero-day detection, FP reduction, and improved accuracy</td>
<td align="left" valign="top">Updated technological evolution</td>
</tr>
<tr>
<td align="left" valign="top">Practical implications</td>
<td align="left" valign="top">Mostly focused on academic models</td>
<td align="left" valign="top">Integrates practical examples: Microsoft Defender, Google Workspace, Barracuda, TitanHQ, Trustifi, Abnormal Security</td>
<td align="left" valign="top">Presentation of the implementation layer: AI-powered, real-time, behavioral, NLP, and computer vision-based detection of phishing</td>
</tr>
<tr>
<td align="left" valign="top">Integration into enterprise security</td>
<td align="left" valign="top">Not identified</td>
<td align="left" valign="top">Proposes integration of AI-powered phishing detection into SIEM, endpoint protection, secure email gateways, and cloud-based defense systems</td>
<td align="left" valign="top">Bridges between theoretical research and applied cybersecurity</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The findings reveal several critical research gaps that require further investigation. The most visible studies predominantly focus on traditional phishing attacks and on methods based primarily on URL analysis. The success of ML and DL in detecting phishing is still to be demonstrated in the case of less conventional phishing, targeting, for example, IoT sensors or voice assistants. Moreover, few studies have yet addressed explainable AI (XAI), an approach aimed at increasing transparency and trust in the functioning of the algorithms used. Understanding the reasoning behind AI-based phishing detection can enhance trust among professionals and the general public, leading to higher adoption rates. For cybersecurity professionals, XAI provides interpretable insights into model operations, allowing analysts to validate findings and adjust detection parameters as needed. This interpretability is particularly valuable in complex cases requiring nuanced judgment to differentiate between sophisticated phishing attempts and benign anomalies. By ensuring transparency in decision-making, XAI empowers analysts to make informed choices, ultimately improving the effectiveness of phishing detection systems (<xref ref-type="bibr" rid="ref67">Nguyen et al., 2024</xref>). Regulatory frameworks increasingly require transparency and accountability in AI-driven decision-making. The European Union&#x2019;s Artificial Intelligence (AI) Act establishes a harmonized legal framework for AI usage, emphasizing transparency and explainability, both key aspects of XAI. Targeting high-risk systems, the legislation mandates transparency, explainability, rigorous compliance assessments, and continuous monitoring throughout the system&#x2019;s lifecycle (<xref ref-type="bibr" rid="ref29">European Union, 2024</xref>).
Although anti-phishing filters are not inherently classified as high-risk systems, their integration into critical infrastructures (e.g., banking, healthcare) elevates their risk level and compliance requirements. This impacts the use of opaque models and increases solution costs, as providers must invest in certifications and evaluations to meet regulatory standards. Furthermore, scalability remains a significant concern.</p>
<p>Despite these issues, AI-based solutions have undeniably advanced the defense mechanisms against phishing attacks, with ML methods yielding the best results. The integration of AI-based phishing detection with user-centered strategies remains underdeveloped. The analyzed research predominantly emphasizes technical solutions, often neglecting the role of human behavior in phishing susceptibility. Effective cybersecurity strategies require a combination of automated detection and adaptive user training, yet studies addressing this intersection are scarce. The lack of user awareness and humans&#x2019; inherent curiosity in responding to tempting messages continue to represent critical challenges, fostering conditions conducive to such attacks. Consequently, organizations must prioritize comprehensive training programs that educate users on how to avoid interacting with suspicious websites and links, or, where necessary, limit their exposure to critical organizational processes. Proposed solutions should integrate automated reporting mechanisms for phishing incidents by employees, in addition to browser plugins capable of autonomously detecting potential threats before they inflict damage. AI-driven technologies with self-improving capabilities hold considerable promise in this context. Future research should prioritize the full automation of phishing attack prevention by intercepting threats before malicious links reach the end user.</p>
<p>In conclusion, the research field investigated, AI in phishing detection, has shown an evolutionary trend beginning in 2016. The topic is of particular interest to researchers from technical fields, such as computer science, engineering, and telecommunications. The papers were extracted from the Web of Science database and analyzed using the Bibliometrix package (through its Biblioshiny interface) and VOSviewer. The results indicate that the first paper was published in 2005, and the number of publications increased almost continuously until 2022, with only minor exceptions. In 2023, the number of publications declined, likely due to the emergence of other security threats, but a remarkable growth was observed in 2024. For 2025, the data are not conclusive, as only articles published up to August 2025 were included. Most articles were published in specialized journals within the field, followed by conference proceedings. The most cited articles have been published in recent years, focusing on the use of ML algorithms to identify phishing URLs and websites based on features capable of distinguishing them from the original, authentic ones. Researchers tend to work in relatively large teams, reflecting the complexity of the subject matter. The bibliometric analysis reveals a significant trend toward ML-based phishing detection solutions. These solutions typically involve extracting discriminative features from websites and training ML models to classify them as phishing or legitimate. While ML algorithms like Support Vector Machines, Random Forest, and Decision Trees have shown promising results, researchers are exploring new approaches, including DL and hybrid methods.</p>
<p>The findings highlight the importance of feature selection and the use of diverse datasets for effective phishing detection. As phishing attacks evolve, ongoing research is essential to develop robust and adaptive detection systems.</p>
</sec>
</body>
<back>
<sec sec-type="data-availability" id="sec18">
<title>Data availability statement</title>
<p>The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.</p>
</sec>
<sec sec-type="author-contributions" id="sec19">
<title>Author contributions</title>
<p>DP: Conceptualization, Investigation, Methodology, Writing &#x2013; original draft, Writing &#x2013; review &#x0026; editing. LR: Conceptualization, Investigation, Methodology, Software, Validation, Visualization, Writing &#x2013; original draft, Writing &#x2013; review &#x0026; editing.</p>
</sec>
<sec sec-type="funding-information" id="sec20">
<title>Funding</title>
<p>The author(s) declare that financial support was received for the research and/or publication of this article. This work was financially supported by Alexandru Ioan Cuza University of Iasi, Romania.</p>
</sec>
<sec sec-type="COI-statement" id="sec21">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="ai-statement" id="sec181">
<title>Generative AI statement</title>
<p>The authors declare that Generative AI was used in the creation of this manuscript.</p>
<p>During the preparation of this manuscript, the authors used ChatGPT (OpenAI) to translate the text from Romanian into English, as well as to enhance clarity and readability of the language. All output was carefully reviewed, edited, and adjusted by the authors. The authors take full responsibility for the content and integrity of the paper.</p>
</sec>
<sec sec-type="disclaimer" id="sec22">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="ref1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Abdelhamid</surname><given-names>N.</given-names></name> <name><surname>Ayesh</surname><given-names>A.</given-names></name> <name><surname>Thabtah</surname><given-names>F.</given-names></name></person-group> (<year>2014</year>). <article-title>Phishing detection based associative classification data mining</article-title>. <source>Expert Syst. Appl.</source> <volume>41</volume>, <fpage>5948</fpage>&#x2013;<lpage>5959</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.eswa.2014.03.019</pub-id></citation></ref>
<ref id="ref2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Adebowale</surname><given-names>M. A.</given-names></name> <name><surname>Lwin</surname><given-names>K. T.</given-names></name> <name><surname>S&#x00E1;nchez</surname><given-names>E.</given-names></name> <name><surname>Hossain</surname><given-names>M. A.</given-names></name></person-group> (<year>2019</year>). <article-title>Intelligent web-phishing detection and protection scheme using integrated features of images, frames and text</article-title>. <source>Expert Syst. Appl.</source> <volume>115</volume>, <fpage>300</fpage>&#x2013;<lpage>313</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.eswa.2018.07.067</pub-id></citation></ref>
<ref id="ref4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ali</surname><given-names>W.</given-names></name> <name><surname>Ahmed</surname><given-names>A. A.</given-names></name></person-group> (<year>2019</year>). <article-title>Hybrid intelligent phishing website prediction using deep neural networks with genetic algorithm-based feature selection and weighting</article-title>. <source>IET Inf. Secur.</source> <volume>13</volume>, <fpage>659</fpage>&#x2013;<lpage>669</lpage>. doi: <pub-id pub-id-type="doi">10.1049/iet-ifs.2019.0006</pub-id></citation></ref>
<ref id="ref9004"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Agagu</surname><given-names>M.</given-names></name> <name><surname>Ogunbiyi</surname><given-names>I. A.</given-names></name> <name><surname>Lasisi</surname><given-names>A.</given-names></name> <name><surname>Omorogiuwa</surname><given-names>O.</given-names></name></person-group> (<year>2024</year>). <article-title>Detection of phishing websites from URLs using hybrid ensemble-based machine learning technique</article-title>. In <person-group person-group-type="editor"><name><surname>Ghazali</surname><given-names>R.</given-names></name> <name><surname>Nawi</surname><given-names>N. M.</given-names></name> <name><surname>Deris</surname><given-names>M. M.</given-names></name> <name><surname>Abawajy</surname><given-names>J. H.</given-names></name> <name><surname>Arbaiy</surname><given-names>N.</given-names></name></person-group> (Eds.), <source>Recent Advances on Soft Computing and Data Mining (Lecture Notes in Networks and Systems</source>, <volume>1078</volume>, pp. <fpage>11</fpage>&#x2013;<lpage>22</lpage>). <publisher-loc>Cham</publisher-loc>: <publisher-name>Springer</publisher-name>.  doi: <pub-id pub-id-type="doi">10.1007/978-3-031-66965-1_2</pub-id></citation></ref>
<ref id="ref6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Alsariera</surname><given-names>Y. A.</given-names></name> <name><surname>Adeyemo</surname><given-names>V. E.</given-names></name> <name><surname>Balogun</surname><given-names>A. O.</given-names></name> <name><surname>Alazzawi</surname><given-names>A. K.</given-names></name></person-group> (<year>2020</year>). <article-title>AI meta-learners and extra-trees algorithm for the detection of phishing websites</article-title>. <source>IEEE Access</source> <volume>8</volume>, <fpage>142532</fpage>&#x2013;<lpage>142542</lpage>. doi: <pub-id pub-id-type="doi">10.1109/ACCESS.2020.3013699</pub-id></citation></ref>
<ref id="ref9003"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Alshahrani</surname><given-names>S. M.</given-names></name> <name><surname>Khan</surname><given-names>N. A.</given-names></name> <name><surname>Almalki</surname><given-names>J.</given-names></name> <name><surname>Shehri</surname><given-names>W.</given-names></name></person-group> (<year>2022</year>). <article-title>URL phishing detection using particle swarm optimization and data mining</article-title>. <source>Comput. Mater. Contin.</source> <volume>73</volume>, <fpage>5625</fpage>&#x2013;<lpage>5640</lpage>. doi: <pub-id pub-id-type="doi">10.32604/cmc.2022.030982</pub-id></citation></ref>
<ref id="ref7"><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Asiri</surname><given-names>S.</given-names></name> <name><surname>Xiao</surname><given-names>Y.</given-names></name> <name><surname>Alzahrani</surname><given-names>S.</given-names></name></person-group> (<year>2024</year>). <article-title>Towards improving phishing detection system using human in the loop deep learning model</article-title>. <conf-name><italic>Proceedings of the 2024 ACM Southeast Conference, Marietta, GA, USA</italic></conf-name>, pp. <fpage>77</fpage>&#x2013;<lpage>85</lpage>. <publisher-name>Association for Computing Machinery</publisher-name>, <publisher-loc>New York, NY</publisher-loc>.</citation></ref>
<ref id="ref8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Asiri</surname><given-names>S.</given-names></name> <name><surname>Xiao</surname><given-names>Y.</given-names></name> <name><surname>Alzahrani</surname><given-names>S.</given-names></name> <name><surname>Li</surname><given-names>T.</given-names></name></person-group> (<year>2024</year>). <article-title>PhishingRTDS: a real-time detection system for phishing attacks using a deep learning model</article-title>. <source>Comput. Secur.</source> <volume>141</volume>:<fpage>103843</fpage>. doi: <pub-id pub-id-type="doi">10.1016/j.cose.2024.103843</pub-id></citation></ref>
<ref id="ref9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Asiri</surname><given-names>S.</given-names></name> <name><surname>Xiao</surname><given-names>Y.</given-names></name> <name><surname>Li</surname><given-names>T.</given-names></name></person-group> (<year>2023</year>). <article-title>PhishTransformer: a novel approach to detect phishing attacks using URL collection and transformer</article-title>. <source>Electronics</source> <volume>13</volume>:<fpage>30</fpage>. doi: <pub-id pub-id-type="doi">10.3390/electronics13010030</pub-id></citation></ref>
<ref id="ref10"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Athulya</surname><given-names>A. A.</given-names></name> <name><surname>Praveen</surname><given-names>K.</given-names></name></person-group> (<year>2020</year>). &#x201C;<article-title>Towards the detection of phishing attacks</article-title>&#x201D; in <source>2020 4th International conference on trends in electronics and informatics (ICOEI)(48184), Tirunelveli, India</source>, <fpage>337</fpage>&#x2013;<lpage>343</lpage>.</citation></ref>
<ref id="ref11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Aun</surname><given-names>Y.</given-names></name> <name><surname>Gan</surname><given-names>M.-L.</given-names></name> <name><surname>Haliza Binti Abdul Wahab</surname><given-names>N.</given-names></name> <name><surname>Hock Guan</surname><given-names>G.</given-names></name></person-group> (<year>2023</year>). <article-title>Social engineering attack classifications on social media using deep learning</article-title>. <source>Comput. Mater. Contin.</source> <volume>74</volume>, <fpage>4917</fpage>&#x2013;<lpage>4931</lpage>. doi: <pub-id pub-id-type="doi">10.32604/cmc.2023.032373</pub-id></citation></ref>
<ref id="ref12"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Barracuda</surname></name></person-group> (<year>2024</year>). <article-title>Securing tomorrow: A guide to the role of AI in cybersecurity</article-title>. Available online at: <ext-link xlink:href="https://assets.barracuda.com/assets/docs/dms/ciso-guide-ai-cybersecurity-ebook.pdf" ext-link-type="uri">https://assets.barracuda.com/assets/docs/dms/ciso-guide-ai-cybersecurity-ebook.pdf</ext-link></citation></ref>
<ref id="ref13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Basit</surname><given-names>A.</given-names></name> <name><surname>Zafar</surname><given-names>M.</given-names></name> <name><surname>Liu</surname><given-names>X.</given-names></name> <name><surname>Javed</surname><given-names>A. R.</given-names></name> <name><surname>Jalil</surname><given-names>Z.</given-names></name> <name><surname>Kifayat</surname><given-names>K.</given-names></name></person-group> (<year>2021</year>). <article-title>A comprehensive survey of AI-enabled phishing attacks detection techniques</article-title>. <source>Telecommun. Syst.</source> <volume>76</volume>, <fpage>139</fpage>&#x2013;<lpage>154</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s11235-020-00733-2</pub-id>, PMID: <pub-id pub-id-type="pmid">33110340</pub-id></citation></ref>
<ref id="ref14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Batista-Canino</surname><given-names>R. M.</given-names></name> <name><surname>Santana-Hern&#x00E1;ndez</surname><given-names>L.</given-names></name> <name><surname>Medina-Brito</surname><given-names>P.</given-names></name></person-group> (<year>2023</year>). <article-title>A scientometric analysis on entrepreneurial intention literature: delving deeper into local citation</article-title>. <source>Heliyon</source> <volume>9</volume>:<fpage>e13046</fpage>. doi: <pub-id pub-id-type="doi">10.1016/j.heliyon.2023.e13046</pub-id>, PMID: <pub-id pub-id-type="pmid">36755622</pub-id></citation></ref>
<ref id="ref15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bejandi</surname><given-names>S. M.</given-names></name> <name><surname>Taghva</surname><given-names>M. R.</given-names></name> <name><surname>Hanafizadeh</surname><given-names>P.</given-names></name></person-group> (<year>2022</year>). <article-title>Applying swarm intelligence and data mining approach in detecting online and digital theft</article-title>. <source>Int. J. Inf. Comput. Secur.</source> <volume>19</volume>:<fpage>142</fpage>. doi: <pub-id pub-id-type="doi">10.1504/IJICS.2022.126758</pub-id></citation></ref>
<ref id="ref16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Blei</surname><given-names>D. M.</given-names></name></person-group> (<year>2012</year>). <article-title>Probabilistic topic models</article-title>. <source>Commun. ACM</source> <volume>55</volume>, <fpage>77</fpage>&#x2013;<lpage>84</lpage>. doi: <pub-id pub-id-type="doi">10.1145/2133806.2133826</pub-id></citation></ref>
<ref id="ref17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Capuano</surname><given-names>N.</given-names></name> <name><surname>Fenza</surname><given-names>G.</given-names></name> <name><surname>Loia</surname><given-names>V.</given-names></name> <name><surname>Stanzione</surname><given-names>C.</given-names></name></person-group> (<year>2022</year>). <article-title>Explainable artificial intelligence in cybersecurity: a survey</article-title>. <source>IEEE Access</source> <volume>10</volume>, <fpage>93575</fpage>&#x2013;<lpage>93600</lpage>. doi: <pub-id pub-id-type="doi">10.1109/ACCESS.2022.3204171</pub-id></citation></ref>
<ref id="ref18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Catal</surname><given-names>C.</given-names></name> <name><surname>Giray</surname><given-names>G.</given-names></name> <name><surname>Tekinerdogan</surname><given-names>B.</given-names></name> <name><surname>Kumar</surname><given-names>S.</given-names></name> <name><surname>Shukla</surname><given-names>S.</given-names></name></person-group> (<year>2022</year>). <article-title>Applications of deep learning for phishing detection: a systematic literature review</article-title>. <source>Knowl. Inf. Syst.</source> <volume>64</volume>, <fpage>1457</fpage>&#x2013;<lpage>1500</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s10115-022-01672-x</pub-id>, PMID: <pub-id pub-id-type="pmid">35645443</pub-id></citation></ref>
<ref id="ref19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chiew</surname><given-names>K. L.</given-names></name> <name><surname>Tan</surname><given-names>C. L.</given-names></name> <name><surname>Wong</surname><given-names>K.</given-names></name> <name><surname>Yong</surname><given-names>K. S. C.</given-names></name> <name><surname>Tiong</surname><given-names>W. K.</given-names></name></person-group> (<year>2019</year>). <article-title>A new hybrid ensemble feature selection framework for machine learning-based phishing detection system</article-title>. <source>Inf. Sci.</source> <volume>484</volume>, <fpage>153</fpage>&#x2013;<lpage>166</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.ins.2019.01.064</pub-id></citation></ref>
<ref id="ref20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chiroma</surname><given-names>H.</given-names></name> <name><surname>Hashem</surname><given-names>I. A. T.</given-names></name> <name><surname>Maray</surname><given-names>M.</given-names></name></person-group> (<year>2024</year>). <article-title>Bibliometric analysis for artificial intelligence in the internet of medical things: mapping and performance analysis</article-title>. <source>Front. Artif. Intell.</source> <volume>7</volume>:<fpage>1347815</fpage>. doi: <pub-id pub-id-type="doi">10.3389/frai.2024.1347815</pub-id>, PMID: <pub-id pub-id-type="pmid">39188356</pub-id></citation></ref>
<ref id="ref21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dawood</surname><given-names>M.</given-names></name> <name><surname>Tu</surname><given-names>S.</given-names></name> <name><surname>Xiao</surname><given-names>C.</given-names></name> <name><surname>Alasmary</surname><given-names>H.</given-names></name> <name><surname>Waqas</surname><given-names>M.</given-names></name> <name><surname>Rehman</surname><given-names>S. U.</given-names></name></person-group> (<year>2023</year>). <article-title>Cyberattacks and security of cloud computing: a complete guideline</article-title>. <source>Symmetry</source> <volume>15</volume>:<fpage>1981</fpage>. doi: <pub-id pub-id-type="doi">10.3390/sym15111981</pub-id></citation></ref>
<ref id="ref22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>De La Torre Parra</surname><given-names>G.</given-names></name> <name><surname>Rad</surname><given-names>P.</given-names></name> <name><surname>Choo</surname><given-names>K.-K. R.</given-names></name> <name><surname>Beebe</surname><given-names>N.</given-names></name></person-group> (<year>2020</year>). <article-title>Detecting internet of things attacks using distributed deep learning</article-title>. <source>J. Netw. Comput. Appl.</source> <volume>163</volume>:<fpage>102662</fpage>. doi: <pub-id pub-id-type="doi">10.1016/j.jnca.2020.102662</pub-id>, PMID: <pub-id pub-id-type="pmid">41035922</pub-id></citation></ref>
<ref id="ref23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Donthu</surname><given-names>N.</given-names></name> <name><surname>Kumar</surname><given-names>S.</given-names></name> <name><surname>Mukherjee</surname><given-names>D.</given-names></name> <name><surname>Pandey</surname><given-names>N.</given-names></name> <name><surname>Lim</surname><given-names>W. M.</given-names></name></person-group> (<year>2021</year>). <article-title>How to conduct a bibliometric analysis: an overview and guidelines</article-title>. <source>J. Bus. Res.</source> <volume>133</volume>, <fpage>285</fpage>&#x2013;<lpage>296</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.jbusres.2021.04.070</pub-id></citation></ref>
<ref id="ref24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dwivedi</surname><given-names>Y. K.</given-names></name> <name><surname>Kshetri</surname><given-names>N.</given-names></name> <name><surname>Hughes</surname><given-names>L.</given-names></name> <name><surname>Rana</surname><given-names>N. P.</given-names></name> <name><surname>Baabdullah</surname><given-names>A. M.</given-names></name> <name><surname>Kar</surname><given-names>A. K.</given-names></name> <etal/></person-group>. (<year>2023</year>). <article-title>Exploring the darkverse: a multi-perspective analysis of the negative societal impacts of the metaverse</article-title>. <source>Inf. Syst. Front.</source> <volume>25</volume>, <fpage>2071</fpage>&#x2013;<lpage>2114</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s10796-023-10400-x</pub-id>, PMID: <pub-id pub-id-type="pmid">37361890</pub-id></citation></ref>
<ref id="ref25"><citation citation-type="other"><person-group person-group-type="author"><collab id="coll1">ENISA</collab></person-group>. (<year>2022</year>). <article-title>ENISA Threat Landscape 2022</article-title>. Available online at: <ext-link xlink:href="https://www.enisa.europa.eu/publications/enisa-threat-landscape-2022?v2=1" ext-link-type="uri">https://www.enisa.europa.eu/publications/enisa-threat-landscape-2022?v2=1</ext-link></citation></ref>
<ref id="ref26"><citation citation-type="other"><person-group person-group-type="author"><collab id="coll2">ENISA</collab></person-group>. (<year>2023a</year>). <article-title>ENISA Threat Landscape 2023</article-title>. Available online at: <ext-link xlink:href="https://www.enisa.europa.eu/publications/enisa-threat-landscape-2023" ext-link-type="uri">https://www.enisa.europa.eu/publications/enisa-threat-landscape-2023</ext-link></citation></ref>
<ref id="ref27"><citation citation-type="other"><person-group person-group-type="author"><collab id="coll3">ENISA</collab></person-group>. (<year>2023b</year>). <article-title>Health Threat Landscape</article-title>. Available online at: <ext-link xlink:href="https://www.enisa.europa.eu/publications/health-threat-landscape" ext-link-type="uri">https://www.enisa.europa.eu/publications/health-threat-landscape</ext-link></citation></ref>
<ref id="ref28"><citation citation-type="other"><person-group person-group-type="author"><collab id="coll4">European Parliament and Council</collab></person-group> (<year>2022</year>). <article-title>Regulation (EU) 2022/2554 of the European Parliament and of the Council of 14 December 2022 on digital operational resilience for the financial sector and amending Regulations (EC) No 1060/2009, (EU) No 648/2012, (EU) No 600/2014, (EU) No 909/2014</article-title>. Available online at: <ext-link xlink:href="https://www.digital-operational-resilience-act.com/DORA_Articles.html" ext-link-type="uri">https://www.digital-operational-resilience-act.com/DORA_Articles.html</ext-link></citation></ref>
<ref id="ref29"><citation citation-type="other"><person-group person-group-type="author"><collab id="coll5">European Union</collab></person-group>. <source>Regulation (EU) 2024/1689 of the European Parliament and of the council of 13 June 2024 laying down harmonised rules on artificial intelligence and amending regulations (EC) no 300/2008, (EU) no 167/2013, (EU) no 168/2013, (EU) 2018/858, (EU) 2018/1139 and (EU) 2019/2144 and directives 2014/90/EU, (EU) 2016/797 and (EU) 2020/1828 (artificial intelligence act) (text with EEA relevance)</source> (<year>2024</year>). Available online at: <ext-link xlink:href="http://data.europa.eu/eli/reg/2024/1689/oj" ext-link-type="uri">http://data.europa.eu/eli/reg/2024/1689/oj</ext-link></citation></ref>
<ref id="ref30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ezugwu</surname><given-names>A. E.</given-names></name> <name><surname>Shukla</surname><given-names>A. K.</given-names></name> <name><surname>Agbaje</surname><given-names>M. B.</given-names></name> <name><surname>Oyelade</surname><given-names>O. N.</given-names></name> <name><surname>Jos&#x00E9;-Garc&#x00ED;a</surname><given-names>A.</given-names></name> <name><surname>Agushaka</surname><given-names>J. O.</given-names></name></person-group> (<year>2021</year>). <article-title>Automatic clustering algorithms: a systematic review and bibliometric analysis of relevant literature</article-title>. <source>Neural Comput. &#x0026; Applic.</source> <volume>33</volume>, <fpage>6247</fpage>&#x2013;<lpage>6306</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s00521-020-05395-4</pub-id></citation></ref>
<ref id="ref31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gangavarapu</surname><given-names>T.</given-names></name> <name><surname>Jaidhar</surname><given-names>C. D.</given-names></name> <name><surname>Chanduka</surname><given-names>B.</given-names></name></person-group> (<year>2020</year>). <article-title>Applicability of machine learning in spam and phishing email filtering: review and approaches</article-title>. <source>Artif. Intell. Rev.</source> <volume>53</volume>, <fpage>5019</fpage>&#x2013;<lpage>5081</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s10462-020-09814-9</pub-id></citation></ref>
<ref id="ref32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gaurav</surname><given-names>A.</given-names></name> <name><surname>Chui</surname><given-names>K. T.</given-names></name> <name><surname>Arya</surname><given-names>V.</given-names></name> <name><surname>Attar</surname><given-names>R. W.</given-names></name> <name><surname>Bansal</surname><given-names>S.</given-names></name> <name><surname>Alhomoud</surname><given-names>A.</given-names></name> <etal/></person-group>. (<year>2024</year>). <article-title>Optimized AI-driven semantic web approach for enhancing phishing detection in E-commerce platforms</article-title>. <source>Int. J. Semant. Web Inf. Syst.</source> <volume>20</volume>, <fpage>1</fpage>&#x2013;<lpage>13</lpage>. doi: <pub-id pub-id-type="doi">10.4018/IJSWIS.359767</pub-id></citation></ref>
<ref id="ref33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>G&#x00F3;mez-Caicedo</surname><given-names>M. I.</given-names></name> <name><surname>Gait&#x00E1;n-Angulo</surname><given-names>M.</given-names></name> <name><surname>Bacca-Acosta</surname><given-names>J.</given-names></name> <name><surname>Bri&#x00F1;ez Torres</surname><given-names>C. Y.</given-names></name> <name><surname>Cubillos D&#x00ED;az</surname><given-names>J.</given-names></name></person-group> (<year>2022</year>). <article-title>Business analytics approach to artificial intelligence</article-title>. <source>Front. Artif. Intell.</source> <volume>5</volume>:<fpage>974180</fpage>. doi: <pub-id pub-id-type="doi">10.3389/frai.2022.974180</pub-id>, PMID: <pub-id pub-id-type="pmid">36248621</pub-id></citation></ref>
<ref id="ref34"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Gopali</surname><given-names>S.</given-names></name> <name><surname>Namin</surname><given-names>A. S.</given-names></name> <name><surname>Abri</surname><given-names>F.</given-names></name> <name><surname>Jones</surname><given-names>K. S.</given-names></name></person-group> (<year>2024</year>). &#x201C;<article-title>The performance of sequential deep learning models in detecting phishing websites using contextual features of URLs</article-title>&#x201D; in <source>Proceedings of the 39th ACM/SIGAPP Symposium on applied computing, Avila, Spain</source> (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name>), <fpage>1064</fpage>&#x2013;<lpage>1066</lpage>.</citation></ref>
<ref id="ref35"><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Grover</surname><given-names>S.</given-names></name> <name><surname>Broll</surname><given-names>B.</given-names></name> <name><surname>Babb</surname><given-names>D.</given-names></name></person-group> (<year>2023</year>). <article-title>Cybersecurity education in the age of AI: integrating AI learning into cybersecurity high school curricula</article-title>. <conf-name><italic>Proceedings of the 54th ACM Technical Symposium on Computer Science Education V.1</italic></conf-name>, pp. <fpage>980</fpage>&#x2013;<lpage>986</lpage>. <publisher-name>ACM</publisher-name>, <publisher-loc>Toronto</publisher-loc>.</citation></ref>
<ref id="ref36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gualberto</surname><given-names>E. S.</given-names></name> <name><surname>De Sousa</surname><given-names>R. T.</given-names></name> <name><surname>Vieira</surname><given-names>T. P. D. B.</given-names></name> <name><surname>Da Costa</surname><given-names>J. P. C. L.</given-names></name> <name><surname>Duque</surname><given-names>C. G.</given-names></name></person-group> (<year>2020</year>). <article-title>From feature engineering and topics models to enhanced prediction rates in phishing detection</article-title>. <source>IEEE Access</source> <volume>8</volume>, <fpage>76368</fpage>&#x2013;<lpage>76385</lpage>. doi: <pub-id pub-id-type="doi">10.1109/ACCESS.2020.2989126</pub-id></citation></ref>
<ref id="ref37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gupta</surname><given-names>M.</given-names></name> <name><surname>Akiri</surname><given-names>C.</given-names></name> <name><surname>Aryal</surname><given-names>K.</given-names></name> <name><surname>Parker</surname><given-names>E.</given-names></name> <name><surname>Praharaj</surname><given-names>L.</given-names></name></person-group> (<year>2023</year>). <article-title>From ChatGPT to ThreatGPT: impact of generative AI in cybersecurity and privacy</article-title>. <source>IEEE Access</source> <volume>11</volume>, <fpage>80218</fpage>&#x2013;<lpage>80245</lpage>. doi: <pub-id pub-id-type="doi">10.1109/ACCESS.2023.3300381</pub-id></citation></ref>
<ref id="ref38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gupta</surname><given-names>B. B.</given-names></name> <name><surname>Gaurav</surname><given-names>A.</given-names></name> <name><surname>Arya</surname><given-names>V.</given-names></name> <name><surname>Attar</surname><given-names>R. W.</given-names></name> <name><surname>Bansal</surname><given-names>S.</given-names></name> <name><surname>Alhomoud</surname><given-names>A.</given-names></name> <etal/></person-group>. (<year>2024a</year>). <article-title>Advanced BERT and CNN-based computational model for phishing detection in enterprise systems</article-title>. <source>Comput. Model. Eng. Sci.</source> <volume>141</volume>, <fpage>2165</fpage>&#x2013;<lpage>2183</lpage>. doi: <pub-id pub-id-type="doi">10.32604/cmes.2024.056473</pub-id></citation></ref>
<ref id="ref39"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Gupta</surname><given-names>B. B.</given-names></name> <name><surname>Gaurav</surname><given-names>A.</given-names></name> <name><surname>Wu</surname><given-names>J.</given-names></name> <name><surname>Arya</surname><given-names>V.</given-names></name> <name><surname>Chui</surname><given-names>K. T.</given-names></name></person-group> (<year>2024b</year>). &#x201C;<article-title>Deep learning and big data integration with cuckoo search optimization for robust phishing attack detection</article-title>&#x201D; in <source>ICC 2024&#x2014;IEEE international conference on communications, Denver, CO</source>. eds. <person-group person-group-type="editor"><name><surname>Valenti</surname><given-names>M.</given-names></name> <name><surname>Reed</surname><given-names>D.</given-names></name> <name><surname>Torres</surname><given-names>M.</given-names></name></person-group>. (<publisher-loc>Piscataway, NJ</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>1322</fpage>&#x2013;<lpage>1327</lpage>.</citation></ref>
<ref id="ref40"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Heiding</surname><given-names>F.</given-names></name> <name><surname>Schneier</surname><given-names>B.</given-names></name> <name><surname>Vishwanath</surname><given-names>A.</given-names></name> <name><surname>Bernstein</surname><given-names>J.</given-names></name> <name><surname>Park</surname><given-names>P. S.</given-names></name></person-group> (<year>2024</year>). <article-title>Devising and detecting phishing emails using large language models</article-title>. <source>IEEE Access</source> <volume>12</volume>, <fpage>42131</fpage>&#x2013;<lpage>42146</lpage>. doi: <pub-id pub-id-type="doi">10.1109/ACCESS.2024.3375882</pub-id></citation></ref>
<ref id="ref41"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Innab</surname><given-names>N.</given-names></name> <name><surname>Osman</surname><given-names>A. A. F.</given-names></name> <name><surname>Ataelfadiel</surname><given-names>M. A. M.</given-names></name> <name><surname>Abu-Zanona</surname><given-names>M.</given-names></name> <name><surname>Elzaghmouri</surname><given-names>B. M.</given-names></name> <name><surname>Zawaideh</surname><given-names>F. H.</given-names></name> <etal/></person-group>. (<year>2024</year>). <article-title>Phishing attacks detection using ensemble machine learning algorithms</article-title>. <source>Comput. Mater. Contin.</source> <volume>80</volume>, <fpage>1325</fpage>&#x2013;<lpage>1345</lpage>. doi: <pub-id pub-id-type="doi">10.32604/cmc.2024.051778</pub-id></citation></ref>
<ref id="ref42"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Iscan</surname><given-names>C.</given-names></name> <name><surname>Kumas</surname><given-names>O.</given-names></name> <name><surname>Akbulut</surname><given-names>F. P.</given-names></name> <name><surname>Akbulut</surname><given-names>A.</given-names></name></person-group> (<year>2023</year>). <article-title>Wallet-based transaction fraud prevention through LightGBM with the focus on minimizing false alarms</article-title>. <source>IEEE Access</source> <volume>11</volume>, <fpage>131465</fpage>&#x2013;<lpage>131474</lpage>. doi: <pub-id pub-id-type="doi">10.1109/ACCESS.2023.3321666</pub-id></citation></ref>
<ref id="ref43"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jagatic</surname><given-names>T. N.</given-names></name> <name><surname>Johnson</surname><given-names>N. A.</given-names></name> <name><surname>Jakobsson</surname><given-names>M.</given-names></name> <name><surname>Menczer</surname><given-names>F.</given-names></name></person-group> (<year>2007</year>). <article-title>Social phishing</article-title>. <source>Commun. ACM</source> <volume>50</volume>, <fpage>94</fpage>&#x2013;<lpage>100</lpage>. doi: <pub-id pub-id-type="doi">10.1145/1290958.1290968</pub-id></citation></ref>
<ref id="ref44"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jain</surname><given-names>A. K.</given-names></name> <name><surname>Gupta</surname><given-names>B. B.</given-names></name></person-group> (<year>2018a</year>). <article-title>Rule-based framework for detection of smishing messages in mobile environment</article-title>. <source>Proc. Comput. Sci.</source> <volume>125</volume>, <fpage>617</fpage>&#x2013;<lpage>623</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.procs.2017.12.079</pub-id></citation></ref>
<ref id="ref45"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jain</surname><given-names>A. K.</given-names></name> <name><surname>Gupta</surname><given-names>B. B.</given-names></name></person-group> (<year>2018b</year>). <article-title>Towards detection of phishing websites on client-side using machine learning based approach</article-title>. <source>Telecommun. Syst.</source> <volume>68</volume>, <fpage>687</fpage>&#x2013;<lpage>700</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s11235-017-0414-0</pub-id></citation></ref>
<ref id="ref46"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jain</surname><given-names>A. K.</given-names></name> <name><surname>Gupta</surname><given-names>B. B.</given-names></name></person-group> (<year>2019</year>). <article-title>A machine learning based approach for phishing detection using hyperlinks information</article-title>. <source>J. Ambient. Intell. Humaniz. Comput.</source> <volume>10</volume>, <fpage>2015</fpage>&#x2013;<lpage>2028</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s12652-018-0798-z</pub-id></citation></ref>
<ref id="ref47"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jia</surname><given-names>K.</given-names></name> <name><surname>Wang</surname><given-names>P.</given-names></name> <name><surname>Li</surname><given-names>Y.</given-names></name> <name><surname>Chen</surname><given-names>Z.</given-names></name> <name><surname>Jiang</surname><given-names>X.</given-names></name> <name><surname>Lin</surname><given-names>C.-L.</given-names></name> <etal/></person-group>. (<year>2022</year>). <article-title>Research landscape of artificial intelligence and e-learning: a bibliometric research</article-title>. <source>Front. Psychol.</source> <volume>13</volume>:<fpage>795039</fpage>. doi: <pub-id pub-id-type="doi">10.3389/fpsyg.2022.795039</pub-id>, PMID: <pub-id pub-id-type="pmid">35250730</pub-id></citation></ref>
<ref id="ref48"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Karim</surname><given-names>A.</given-names></name> <name><surname>Azam</surname><given-names>S.</given-names></name> <name><surname>Shanmugam</surname><given-names>B.</given-names></name> <name><surname>Kannoorpatti</surname><given-names>K.</given-names></name> <name><surname>Alazab</surname><given-names>M.</given-names></name></person-group> (<year>2019</year>). <article-title>A comprehensive survey for intelligent spam email detection</article-title>. <source>IEEE Access</source> <volume>7</volume>, <fpage>168261</fpage>&#x2013;<lpage>168295</lpage>. doi: <pub-id pub-id-type="doi">10.1109/ACCESS.2019.2954791</pub-id></citation></ref>
<ref id="ref50"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Khan</surname><given-names>A. I.</given-names></name> <name><surname>Unhelkar</surname><given-names>B.</given-names></name></person-group> (<year>2024</year>). <article-title>An enhanced anti-phishing technique for social media users: a multilayer Q-learning approach</article-title>. <source>Int. J. Adv. Comput. Sci. Appl.</source> <volume>15</volume>, <fpage>18</fpage>&#x2013;<lpage>28</lpage>. doi: <pub-id pub-id-type="doi">10.14569/IJACSA.2024.0150103</pub-id></citation></ref>
<ref id="ref52"><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Kunju</surname><given-names>M. V.</given-names></name> <name><surname>Dainel</surname><given-names>E.</given-names></name> <name><surname>Anthony</surname><given-names>H. C.</given-names></name> <name><surname>Bhelwa</surname><given-names>S.</given-names></name></person-group> (<year>2019</year>). <article-title>Evaluation of phishing techniques based on machine learning</article-title>. <conf-name><italic>2019 International Conference on Intelligent Computing and Control Systems (ICCS)</italic></conf-name>, <fpage>963</fpage>&#x2013;<lpage>968</lpage>.</citation></ref>
<ref id="ref53"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Laorden</surname><given-names>C.</given-names></name> <name><surname>Ugarte-Pedrero</surname><given-names>X.</given-names></name> <name><surname>Santos</surname><given-names>I.</given-names></name> <name><surname>Sanz</surname><given-names>B.</given-names></name> <name><surname>Nieves</surname><given-names>J.</given-names></name> <name><surname>Bringas</surname><given-names>P. G.</given-names></name></person-group> (<year>2014</year>). <article-title>Study on the effectiveness of anomaly detection for spam filtering</article-title>. <source>Inf. Sci.</source> <volume>277</volume>, <fpage>421</fpage>&#x2013;<lpage>444</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.ins.2014.02.114</pub-id></citation></ref>
<ref id="ref54"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname><given-names>S.</given-names></name> <name><surname>Gou</surname><given-names>G.</given-names></name> <name><surname>Liu</surname><given-names>C.</given-names></name> <name><surname>Hou</surname><given-names>C.</given-names></name> <name><surname>Li</surname><given-names>Z.</given-names></name> <name><surname>Xiong</surname><given-names>G.</given-names></name></person-group> (<year>2022</year>). <article-title>TTAGN: temporal transaction aggregation graph network for Ethereum phishing scams detection</article-title>. <source>Proc. ACM Web Conf.</source> <volume>2022</volume>, <fpage>661</fpage>&#x2013;<lpage>669</lpage>. doi: <pub-id pub-id-type="doi">10.1145/3485447.3512226</pub-id></citation></ref>
<ref id="ref55"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname><given-names>Y.</given-names></name> <name><surname>Yang</surname><given-names>Z.</given-names></name> <name><surname>Chen</surname><given-names>X.</given-names></name> <name><surname>Yuan</surname><given-names>H.</given-names></name> <name><surname>Liu</surname><given-names>W.</given-names></name></person-group> (<year>2019</year>). <article-title>A stacking model using URL and HTML features for phishing webpage detection</article-title>. <source>Futur. Gener. Comput. Syst.</source> <volume>94</volume>, <fpage>27</fpage>&#x2013;<lpage>39</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.future.2018.11.004</pub-id></citation></ref>
<ref id="ref56"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mahdavifar</surname><given-names>S.</given-names></name> <name><surname>Ghorbani</surname><given-names>A. A.</given-names></name></person-group> (<year>2019</year>). <article-title>Application of deep learning to cybersecurity: a survey</article-title>. <source>Neurocomputing</source> <volume>347</volume>, <fpage>149</fpage>&#x2013;<lpage>176</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.neucom.2019.02.056</pub-id></citation></ref>
<ref id="ref57"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Marchal</surname><given-names>S.</given-names></name> <name><surname>Francois</surname><given-names>J.</given-names></name> <name><surname>State</surname><given-names>R.</given-names></name> <name><surname>Engel</surname><given-names>T.</given-names></name></person-group> (<year>2014</year>). <article-title>PhishStorm: detecting phishing with streaming analytics</article-title>. <source>IEEE Trans. Network Serv. Manag.</source> <volume>11</volume>, <fpage>458</fpage>&#x2013;<lpage>471</lpage>. doi: <pub-id pub-id-type="doi">10.1109/TNSM.2014.2377295</pub-id></citation></ref>
<ref id="ref58"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mat</surname><given-names>S. R. T.</given-names></name> <name><surname>Ab Razak</surname><given-names>M. F.</given-names></name> <name><surname>Kahar</surname><given-names>M. N. M.</given-names></name> <name><surname>Arif</surname><given-names>J. M.</given-names></name> <name><surname>Mohamad</surname><given-names>S.</given-names></name> <name><surname>Firdaus</surname><given-names>A.</given-names></name></person-group> (<year>2021</year>). <article-title>Towards a systematic description of the field using bibliometric analysis: malware evolution</article-title>. <source>Scientometrics</source> <volume>126</volume>, <fpage>2013</fpage>&#x2013;<lpage>2055</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s11192-020-03834-6</pub-id>, PMID: <pub-id pub-id-type="pmid">33583978</pub-id></citation></ref>
<ref id="ref60"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mohamed</surname><given-names>N.</given-names></name> <name><surname>Taherdoost</surname><given-names>H.</given-names></name> <name><surname>Madanchian</surname><given-names>M.</given-names></name></person-group> (<year>2024</year>). <article-title>Enhancing spear phishing defense with AI: a comprehensive review and future directions</article-title>. <source>ICST Trans. Scalable Inform. Syst.</source> <volume>11</volume>. doi: <pub-id pub-id-type="doi">10.4108/eetsis.6109</pub-id></citation></ref>
<ref id="ref61"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mohammad</surname><given-names>R. M.</given-names></name> <name><surname>Thabtah</surname><given-names>F.</given-names></name> <name><surname>McCluskey</surname><given-names>L.</given-names></name></person-group> (<year>2014</year>). <article-title>Predicting phishing websites based on self-structuring neural network</article-title>. <source>Neural Comput. &#x0026; Applic.</source> <volume>25</volume>, <fpage>443</fpage>&#x2013;<lpage>458</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s00521-013-1490-z</pub-id></citation></ref>
<ref id="ref62"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Moher</surname><given-names>D.</given-names></name> <name><surname>Liberati</surname><given-names>A.</given-names></name> <name><surname>Tetzlaff</surname><given-names>J.</given-names></name> <name><surname>Altman</surname><given-names>D. G.</given-names></name> <collab id="coll6">The PRISMA Group</collab></person-group> (<year>2009</year>). <article-title>Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement</article-title>. <source>PLoS Med.</source> <volume>6</volume>:<fpage>e1000097</fpage>. doi: <pub-id pub-id-type="doi">10.1371/journal.pmed.1000097</pub-id>, PMID: <pub-id pub-id-type="pmid">19621072</pub-id></citation></ref>
<ref id="ref64"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mujtaba</surname><given-names>G.</given-names></name> <name><surname>Shuib</surname><given-names>L.</given-names></name> <name><surname>Raj</surname><given-names>R. G.</given-names></name> <name><surname>Majeed</surname><given-names>N.</given-names></name> <name><surname>Al-Garadi</surname><given-names>M. A.</given-names></name></person-group> (<year>2017</year>). <article-title>Email classification research trends: review and open issues</article-title>. <source>IEEE Access</source> <volume>5</volume>, <fpage>9044</fpage>&#x2013;<lpage>9064</lpage>. doi: <pub-id pub-id-type="doi">10.1109/ACCESS.2017.2702187</pub-id></citation></ref>
<ref id="ref65"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mutluturk</surname><given-names>M.</given-names></name> <name><surname>Metin</surname><given-names>B.</given-names></name></person-group> (<year>2023</year>). <article-title>Mapping the phishing attacks research landscape: a bibliometric analysis and taxonomy</article-title>. <source>J. Theor. Appl. Inf. Technol.</source> <volume>101</volume>, <fpage>6758</fpage>&#x2013;<lpage>6780</lpage>.</citation></ref>
<ref id="ref66"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Nathanson</surname><given-names>J.</given-names></name> <name><surname>Yamunan</surname><given-names>A.</given-names></name></person-group> (<year>2025</year>). <article-title>How-to guide: Defending against malware and phishing attacks</article-title>. <italic>Identity and Security</italic>. Available online at: <ext-link xlink:href="https://workspace.google.com/blog/identity-and-security/how-guide-defending-against-malware-and-phishing-attacks" ext-link-type="uri">https://workspace.google.com/blog/identity-and-security/how-guide-defending-against-malware-and-phishing-attacks</ext-link></citation></ref>
<ref id="ref67"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nguyen</surname><given-names>V.</given-names></name> <name><surname>Wu</surname><given-names>T.</given-names></name> <name><surname>Yuan</surname><given-names>X.</given-names></name> <name><surname>Grobler</surname><given-names>M.</given-names></name> <name><surname>Nepal</surname><given-names>S.</given-names></name> <name><surname>Rudolph</surname><given-names>C.</given-names></name></person-group> (<year>2024</year>). <article-title>An innovative information theory-based approach to tackle and enhance the transparency in phishing detection (version 2)</article-title>. <source>arXiv</source>. doi: <pub-id pub-id-type="doi">10.48550/ARXIV.2402.17092</pub-id></citation></ref>
<ref id="ref68"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nguyet</surname><given-names>Q. D.</given-names></name> <name><surname>Selamat</surname><given-names>A.</given-names></name> <name><surname>Krejcar</surname><given-names>O.</given-names></name> <name><surname>Yokoi</surname><given-names>T.</given-names></name> <name><surname>Fujita</surname><given-names>H.</given-names></name></person-group> (<year>2021</year>). <article-title>Phishing webpage classification via deep learning-based algorithms: an empirical study</article-title>. <source>Appl. Sci. Basel</source> <volume>11</volume>:<fpage>9210</fpage>. doi: <pub-id pub-id-type="doi">10.3390/app11199210</pub-id></citation></ref>
<ref id="ref69"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Omari</surname><given-names>K.</given-names></name></person-group> (<year>2023</year>). <article-title>Comparative study of machine learning algorithms for phishing website detection</article-title>. <source>Int. J. Adv. Comput. Sci. Appl.</source> <volume>14</volume>, <fpage>417</fpage>&#x2013;<lpage>425</lpage>. doi: <pub-id pub-id-type="doi">10.14569/IJACSA.2023.0140945</pub-id></citation></ref>
<ref id="ref70"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Oprea</surname><given-names>D.</given-names></name></person-group> (<year>2007</year>). <source>Protectia si securitatea informatiilor</source>. <publisher-loc>Iasi</publisher-loc>: <publisher-name>Polirom</publisher-name>.</citation></ref>
<ref id="ref71"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Orunsolu</surname><given-names>A. A.</given-names></name> <name><surname>Sodiya</surname><given-names>A. S.</given-names></name> <name><surname>Akinwale</surname><given-names>A. T.</given-names></name></person-group> (<year>2022</year>). <article-title>A predictive model for phishing detection</article-title>. <source>J. King Saud Univ. Comput. Inf. Sci.</source> <volume>34</volume>, <fpage>232</fpage>&#x2013;<lpage>247</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.jksuci.2019.12.005</pub-id></citation></ref>
<ref id="ref72"><citation citation-type="other"><person-group person-group-type="author"><collab id="coll7">PaloAltoNetworks</collab></person-group>. (<year>2025</year>). <article-title>What Is the Role of AI and ML in Modern SIEM Solutions?</article-title> Available online at: <ext-link xlink:href="https://www.paloaltonetworks.com/cyberpedia/role-of-artificial-intelligence-ai-and-machine-learning-ml-in-siem" ext-link-type="uri">https://www.paloaltonetworks.com/cyberpedia/role-of-artificial-intelligence-ai-and-machine-learning-ml-in-siem</ext-link></citation></ref>
<ref id="ref73"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Peji&#x0107;-Bach</surname><given-names>M.</given-names></name> <name><surname>Jaji&#x0107;</surname><given-names>I.</given-names></name> <name><surname>Kamenjarska</surname><given-names>T.</given-names></name></person-group> (<year>2023</year>). <article-title>A bibliometric analysis of phishing in the big data era: high focus on algorithms and low focus on people</article-title>. <source>Proc. Comput. Sci.</source> <volume>219</volume>, <fpage>91</fpage>&#x2013;<lpage>98</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.procs.2023.01.268</pub-id></citation></ref>
<ref id="ref74"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Perera</surname><given-names>O.</given-names></name> <name><surname>Grob</surname><given-names>J.</given-names></name></person-group> (<year>2024</year>). &#x201C;<article-title>Generative AI in phishing detection: insights and research opportunities</article-title>&#x201D; in <source>2024 Cyber Awareness and Research Symposium, CARS 2024</source> (<publisher-loc>New York</publisher-loc>: <publisher-name>2024 Cyber Awareness and Research Symposium</publisher-name>).</citation></ref>
<ref id="ref75"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Petrosyan</surname><given-names>A.</given-names></name></person-group> (<year>2024</year>). <source>Spam: Share of global e-mail traffic monthly 2014&#x2013;2023</source> (<publisher-name>Statista</publisher-name>). Available online at: <ext-link xlink:href="https://www.statista.com/statistics/420391/spam-email-traffic-sh" ext-link-type="uri">https://www.statista.com/statistics/420391/spam-email-traffic-sh</ext-link></citation></ref>
<ref id="ref76"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Popescul</surname><given-names>D.</given-names></name></person-group> (<year>2014</year>). <source>Securitatea informatiilor - instrumente &#x0219;i metode de lucru</source>. <publisher-loc>Iasi</publisher-loc>: <publisher-name>Tehnopress</publisher-name>.</citation></ref>
<ref id="ref77"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Qabajeh</surname><given-names>I.</given-names></name> <name><surname>Thabtah</surname><given-names>F.</given-names></name> <name><surname>Chiclana</surname><given-names>F.</given-names></name></person-group> (<year>2018</year>). <article-title>A recent review of conventional vs. automated cybersecurity anti-phishing techniques</article-title>. <source>Comput. Sci. Rev.</source> <volume>29</volume>, <fpage>44</fpage>&#x2013;<lpage>55</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.cosrev.2018.05.003</pub-id></citation></ref>
<ref id="ref78"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Quang</surname><given-names>D. N.</given-names></name> <name><surname>Selamat</surname><given-names>A.</given-names></name> <name><surname>Krejcar</surname><given-names>O.</given-names></name></person-group> (<year>2021</year>). &#x201C;<article-title>Recent research on phishing detection through machine learning algorithm</article-title>&#x201D; in <source>Advances and trends in artificial intelligence. Artificial intelligence practices</source>. eds. <person-group person-group-type="editor"><name><surname>Fujita</surname><given-names>H.</given-names></name> <name><surname>Selamat</surname><given-names>A.</given-names></name> <name><surname>Lin</surname><given-names>J. C.-W.</given-names></name> <name><surname>Ali</surname><given-names>M.</given-names></name></person-group>, vol. <volume>12798</volume>. (<publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>), <fpage>495</fpage>&#x2013;<lpage>508</lpage>.</citation></ref>
<ref id="ref79"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rao</surname><given-names>R. S.</given-names></name> <name><surname>Pais</surname><given-names>A. R.</given-names></name></person-group> (<year>2019</year>). <article-title>Detection of phishing websites using an efficient feature-based machine learning framework</article-title>. <source>Neural Comput. &#x0026; Applic.</source> <volume>31</volume>, <fpage>3851</fpage>&#x2013;<lpage>3873</lpage>. doi: <pub-id pub-id-type="doi">10.1007/s00521-017-3305-0</pub-id></citation></ref>
<ref id="ref80"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Safi</surname><given-names>A.</given-names></name> <name><surname>Singh</surname><given-names>S.</given-names></name></person-group> (<year>2023</year>). <article-title>A systematic literature review on phishing website detection techniques</article-title>. <source>J. King Saud Univ. Comput. Inf. Sci.</source> <volume>35</volume>, <fpage>590</fpage>&#x2013;<lpage>611</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.jksuci.2023.01.004</pub-id></citation></ref>
<ref id="ref81"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sahingoz</surname><given-names>O. K.</given-names></name> <name><surname>Buber</surname><given-names>E.</given-names></name> <name><surname>Demir</surname><given-names>O.</given-names></name> <name><surname>Diri</surname><given-names>B.</given-names></name></person-group> (<year>2019</year>). <article-title>Machine learning based phishing detection from URLs</article-title>. <source>Expert Syst. Appl.</source> <volume>117</volume>, <fpage>345</fpage>&#x2013;<lpage>357</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.eswa.2018.09.029</pub-id></citation></ref>
<ref id="ref82"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Santos</surname><given-names>I.</given-names></name> <name><surname>Laorden</surname><given-names>C.</given-names></name> <name><surname>Sanz</surname><given-names>B.</given-names></name> <name><surname>Bringas</surname><given-names>P. G.</given-names></name></person-group> (<year>2012</year>). <article-title>Enhanced topic-based vector space model for semantics-aware spam filtering</article-title>. <source>Expert Syst. Appl.</source> <volume>39</volume>, <fpage>437</fpage>&#x2013;<lpage>444</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.eswa.2011.07.034</pub-id></citation></ref>
<ref id="ref83"><citation citation-type="confproc"><person-group person-group-type="author"><name><surname>Santos</surname><given-names>S.</given-names></name> <name><surname>Vilela</surname><given-names>J.</given-names></name> <name><surname>Carvalho</surname><given-names>T.</given-names></name> <name><surname>Rocha</surname><given-names>T.</given-names></name> <name><surname>Candido</surname><given-names>T.</given-names></name> <name><surname>Bezerra</surname><given-names>V.</given-names></name> <etal/></person-group>. (<year>2024</year>). <article-title>Artificial intelligence in sustainable smart cities: a systematic study on applications, benefits, challenges, and solutions</article-title>. In: <conf-name>Proceedings of the 26th International Conference on Enterprise Information Systems</conf-name>. <publisher-loc>Angers, France</publisher-loc>. <fpage>644</fpage>&#x2013;<lpage>655</lpage>.</citation></ref>
<ref id="ref84"><citation citation-type="other"><person-group person-group-type="author"><collab id="coll8">SentinelOne</collab></person-group>. (<year>2025</year>). <article-title>AI Threat Detection: Leverage AI to Detect Security Threats</article-title>. Available online at: <ext-link xlink:href="https://www.sentinelone.com/cybersecurity-101/data-and-ai/ai-threat-detection/" ext-link-type="uri">https://www.sentinelone.com/cybersecurity-101/data-and-ai/ai-threat-detection/</ext-link></citation></ref>
<ref id="ref85"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sharma</surname><given-names>P.</given-names></name> <name><surname>Namasudra</surname><given-names>S.</given-names></name> <name><surname>Gonzalez Crespo</surname><given-names>R.</given-names></name> <name><surname>Parra-Fuente</surname><given-names>J.</given-names></name> <name><surname>Chandra Trivedi</surname><given-names>M.</given-names></name></person-group> (<year>2023</year>). <article-title>EHDHE: enhancing security of healthcare documents in IoT-enabled digital healthcare ecosystems using blockchain</article-title>. <source>Inf. Sci.</source> <volume>629</volume>, <fpage>703</fpage>&#x2013;<lpage>718</lpage>. doi: <pub-id pub-id-type="doi">10.1016/j.ins.2023.01.148</pub-id></citation></ref>
<ref id="ref86"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Shiebler</surname><given-names>D.</given-names></name></person-group> (<year>2023</year>). <article-title>An Abnormal Approach to Machine Learning: Feature Systems and Language Models</article-title>. Available online at: <ext-link xlink:href="https://abnormalsecurity.com/blog/machine-learning-feature-systems-models" ext-link-type="uri">https://abnormalsecurity.com/blog/machine-learning-feature-systems-models</ext-link></citation></ref>
<ref id="ref88"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Thapa</surname><given-names>C.</given-names></name> <name><surname>Tang</surname><given-names>J. W.</given-names></name> <name><surname>Abuadbba</surname><given-names>A.</given-names></name> <name><surname>Gao</surname><given-names>Y.</given-names></name> <name><surname>Camtepe</surname><given-names>S.</given-names></name> <name><surname>Nepal</surname><given-names>S.</given-names></name> <etal/></person-group>. (<year>2023</year>). <article-title>Evaluation of federated learning in phishing email detection</article-title>. <source>Sensors</source> <volume>23</volume>:<fpage>4346</fpage>. doi: <pub-id pub-id-type="doi">10.3390/s23094346</pub-id>, PMID: <pub-id pub-id-type="pmid">37177549</pub-id></citation></ref>
<ref id="ref89"><citation citation-type="other"><person-group person-group-type="author"><collab id="coll9">The National Archives</collab></person-group> (<year>2017</year>). &#x201C;<article-title>Identifying Information Assets and Business Requirements</article-title>.&#x201D; The National Archives. Available online at: <ext-link xlink:href="https://www.nationalarchives.gov.uk/information-management/manage-information/policy-process/digital-continuity/step-by-step-guidance/step-2/" ext-link-type="uri">https://www.nationalarchives.gov.uk/information-management/manage-information/policy-process/digital-continuity/step-by-step-guidance/step-2/</ext-link>.</citation></ref>
<ref id="ref90"><citation citation-type="other"><person-group person-group-type="author"><collab id="coll10">TitanHQ</collab></person-group>. (<year>2024</year>). <article-title>PhishTitan</article-title>. Available online at: <ext-link xlink:href="https://www.titanhq.com/phishing-protection/" ext-link-type="uri">https://www.titanhq.com/phishing-protection/</ext-link></citation></ref>
<ref id="ref91"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Trad</surname><given-names>F.</given-names></name> <name><surname>Chehab</surname><given-names>A.</given-names></name></person-group> (<year>2024</year>). <article-title>Prompt engineering or fine-tuning? A case study on phishing detection with large language models</article-title>. <source>Mach. Learn. Knowl. Extr.</source> <volume>6</volume>, <fpage>367</fpage>&#x2013;<lpage>384</lpage>. doi: <pub-id pub-id-type="doi">10.3390/make6010018</pub-id></citation></ref>
<ref id="ref92"><citation citation-type="other"><person-group person-group-type="author"><collab id="coll11">Trustifi</collab></person-group>. (<year>2024</year>). <article-title>AI Powered Email Security &#x0026; Compliance</article-title>. Available online at: <ext-link xlink:href="https://trustifi.com/" ext-link-type="uri">https://trustifi.com/</ext-link></citation></ref>
<ref id="ref93"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Van Poucke</surname><given-names>S.</given-names></name> <name><surname>Goyal</surname><given-names>H.</given-names></name> <name><surname>Rowley</surname><given-names>D. D.</given-names></name> <name><surname>Zhong</surname><given-names>M.</given-names></name> <name><surname>Liu</surname><given-names>N.</given-names></name></person-group> (<year>2018</year>). <article-title>The top 2,000 cited articles in critical care medicine: a bibliometric analysis</article-title>. <source>J. Thorac. Dis.</source> <volume>10</volume>, <fpage>2437</fpage>&#x2013;<lpage>2447</lpage>. doi: <pub-id pub-id-type="doi">10.21037/jtd.2018.03.178</pub-id>, PMID: <pub-id pub-id-type="pmid">29850150</pub-id></citation></ref>
<ref id="ref94"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Vinayakumar</surname><given-names>R.</given-names></name> <name><surname>Soman</surname><given-names>K. P.</given-names></name> <name><surname>Prabaharan</surname><given-names>P.</given-names></name> <name><surname>Akarsh</surname><given-names>S.</given-names></name></person-group> (<year>2019</year>). &#x201C;<article-title>Application of deep learning architectures for cyber security</article-title>&#x201D; in <source>Cybersecurity and secure information systems</source>. eds. <person-group person-group-type="editor"><name><surname>Hassanien</surname><given-names>A. E.</given-names></name> <name><surname>Elhoseny</surname><given-names>M.</given-names></name></person-group>. (<publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>), <fpage>125</fpage>&#x2013;<lpage>160</lpage>.</citation></ref>
<ref id="ref95"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Vinayakumar</surname><given-names>R.</given-names></name> <name><surname>Soman</surname><given-names>K. P.</given-names></name> <name><surname>Prabaharan</surname><given-names>P.</given-names></name> <name><surname>Akarsh</surname><given-names>S.</given-names></name> <name><surname>Elhoseny</surname><given-names>M.</given-names></name></person-group> (<year>2019</year>). &#x201C;<article-title>Deep learning framework for cyber threat situational awareness based on email and URL data analysis</article-title>&#x201D; in <source>Cybersecurity and secure information systems</source>. eds. <person-group person-group-type="editor"><name><surname>Hassanien</surname><given-names>A. E.</given-names></name> <name><surname>Elhoseny</surname><given-names>M.</given-names></name></person-group>. (<publisher-loc>Cham, Switzerland</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>), <fpage>87</fpage>&#x2013;<lpage>124</lpage>.</citation></ref>
<ref id="ref96"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname><given-names>W.</given-names></name> <name><surname>Zhang</surname><given-names>F.</given-names></name> <name><surname>Luo</surname><given-names>X.</given-names></name> <name><surname>Zhang</surname><given-names>S.</given-names></name></person-group> (<year>2019</year>). <article-title>PDRCNN: precise phishing detection with recurrent convolutional neural networks</article-title>. <source>Secur. Commun. Netw.</source> <volume>2019</volume>, <fpage>1</fpage>&#x2013;<lpage>15</lpage>. doi: <pub-id pub-id-type="doi">10.1155/2019/2595794</pub-id>, PMID: <pub-id pub-id-type="pmid">41031239</pub-id></citation></ref>
<ref id="ref97"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xiang</surname><given-names>G.</given-names></name> <name><surname>Hong</surname><given-names>J.</given-names></name> <name><surname>Rose</surname><given-names>C. P.</given-names></name> <name><surname>Cranor</surname><given-names>L.</given-names></name></person-group> (<year>2011</year>). <article-title>Cantina+: a feature-rich machine learning framework for detecting phishing web sites</article-title>. <source>ACM Trans. Inf. Syst. Secur.</source> <volume>14</volume>, <fpage>1</fpage>&#x2013;<lpage>28</lpage>. doi: <pub-id pub-id-type="doi">10.1145/2019599.2019606</pub-id></citation></ref>
<ref id="ref100"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname><given-names>P.</given-names></name> <name><surname>Zhao</surname><given-names>G.</given-names></name> <name><surname>Zeng</surname><given-names>P.</given-names></name></person-group> (<year>2019</year>). <article-title>Phishing website detection based on multidimensional features driven by deep learning</article-title>. <source>IEEE Access</source> <volume>7</volume>, <fpage>15196</fpage>&#x2013;<lpage>15209</lpage>. doi: <pub-id pub-id-type="doi">10.1109/ACCESS.2019.2892066</pub-id></citation></ref>
<ref id="ref102"><citation citation-type="other"><person-group person-group-type="author"><name><surname>Zhang</surname><given-names>Y.</given-names></name> <name><surname>Hong</surname><given-names>J. I.</given-names></name> <name><surname>Cranor</surname><given-names>L. F.</given-names></name></person-group> (<year>2007</year>). &#x201C;<article-title>Cantina: a content-based approach to detecting phishing web sites</article-title>&#x201D; in <source>Proceedings of the 16th International Conference on World Wide Web, held in Banff, Alberta, Canada</source>. <publisher-loc>New York, NY</publisher-loc>: <publisher-name>ACM</publisher-name>. <fpage>639</fpage>&#x2013;<lpage>648</lpage>.</citation></ref>
</ref-list>
</back>
</article>