<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="brief-report">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Digit. Health</journal-id>
<journal-title>Frontiers in Digital Health</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Digit. Health</abbrev-journal-title>
<issn pub-type="epub">2673-253X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fdgth.2022.878369</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Digital Health</subject>
<subj-group>
<subject>Brief Research Report</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Validating GAN-BioBERT: A Methodology for Assessing Reporting Trends in Clinical Trials</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Myszewski</surname> <given-names>Joshua J.</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1610833/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Klossowski</surname> <given-names>Emily</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Meyer</surname> <given-names>Patrick</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Bevil</surname> <given-names>Kristin</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Klesius</surname> <given-names>Lisa</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Schroeder</surname> <given-names>Kristopher M.</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>School of Medicine and Public Health, University of Wisconsin</institution>, <addr-line>Madison, WI</addr-line>, <country>United States</country></aff>
<aff id="aff2"><sup>2</sup><institution>University of Wisconsin-Milwaukee</institution>, <addr-line>Milwaukee, WI</addr-line>, <country>United States</country></aff>
<aff id="aff3"><sup>3</sup><institution>Department of Anesthesiology, School of Medicine and Public Health, University of Wisconsin</institution>, <addr-line>Madison, WI</addr-line>, <country>United States</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Angus Roberts, King&#x00027;s College London, United Kingdom</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Terri Elizabeth Workman, George Washington University, United States; Vasiliki Foufi, Consultant, Geneva, Switzerland</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Joshua J. Myszewski <email>jmyszewski&#x00040;wisc.edu</email></corresp>
<fn fn-type="other" id="fn001"><p>This article was submitted to Health Informatics, a section of the journal Frontiers in Digital Health</p></fn></author-notes>
<pub-date pub-type="epub">
<day>24</day>
<month>05</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>4</volume>
<elocation-id>878369</elocation-id>
<history>
<date date-type="received">
<day>17</day>
<month>02</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>05</day>
<month>05</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2022 Myszewski, Klossowski, Meyer, Bevil, Klesius and Schroeder.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Myszewski, Klossowski, Meyer, Bevil, Klesius and Schroeder</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license> </permissions>
<abstract>
<sec>
<title>Background</title>
<p>The aim of this study was to validate a three-class sentiment classification model for clinical trial abstracts combining adversarial learning and the BioBERT language processing model as a tool to assess trends in biomedical literature in a clearly reproducible manner. We then assessed the model&#x00027;s performance for this application and compared it to previous models used for this task.</p>
</sec>
<sec>
<title>Methods</title>
<p>Using 108 expert-annotated clinical trial abstracts and 2,000 unlabeled abstracts, this study develops a three-class sentiment classification algorithm for clinical trial abstracts. The algorithm is a semi-supervised model based on the Bidirectional Encoder Representations from Transformers (BERT) model, a much more advanced and accurate method compared to previously used models based upon traditional machine learning methods. The prediction performance was compared to that reported in those previous studies.</p>
</sec>
<sec>
<title>Results</title>
<p>The algorithm was found to have a classification accuracy of 91.3%, with a macro F1-Score of 0.92, significantly outperforming the models previous studies used to classify sentiment in clinical trial literature, while also making the sentiment classification finer grained with greater reproducibility.</p>
</sec>
<sec>
<title>Conclusion</title>
<p>We demonstrate an easily applied sentiment classification model for clinical trial abstracts that significantly outperforms previous models with greater reproducibility and applicability to large-scale study of reporting trends.</p>
</sec></abstract>
<kwd-group>
<kwd>sentiment analysis</kwd>
<kwd>publication bias</kwd>
<kwd>natural language processing</kwd>
<kwd>clinical trial</kwd>
<kwd>meta-analyses</kwd>
</kwd-group>
<counts>
<fig-count count="2"/>
<table-count count="2"/>
<equation-count count="1"/>
<ref-count count="26"/>
<page-count count="6"/>
<word-count count="4484"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>Introduction</title>
<p>Publication bias is a systematic phenomenon of under- or overreporting of research findings dependent on the direction of the results found (<xref ref-type="bibr" rid="B1">1</xref>). As a result of this phenomenon, systematic reviews of clinical guidelines may reach incorrect conclusions (<xref ref-type="bibr" rid="B2">2</xref>), and subsequently lead to harm to patients caused by treatments that have an otherwise poor evidence base. Despite this potential for harm and its widespread presence within clinical literature (<xref ref-type="bibr" rid="B1">1</xref>), there have been limited efforts to develop and utilize methods to characterize publication bias, particularly on a systematic scale. In 2016, Hedin et al. found that only 55 percent of meta-analyses in anesthesiology journals discussed publication bias, and only 43 percent actually used tools to assess the phenomenon (<xref ref-type="bibr" rid="B3">3</xref>). Furthermore, the methods currently used for assessing publication bias, such as funnel-plot based methods and selection models (<xref ref-type="bibr" rid="B4">4</xref>, <xref ref-type="bibr" rid="B5">5</xref>), are criticized as unintuitive to interpret within the literature&#x00027;s context (<xref ref-type="bibr" rid="B4">4</xref>). These methods also focus on the quantitative findings expressed in the studies in question in the form of effect sizes and <italic>p</italic>-values and are therefore limited to those studies that express these types of findings.</p>
<p>The current gold standard for systematic assessment of the qualitative interpretation of the findings has been rating systems performed by human raters. However, this method of assessment is time and resource intensive and has inherently poor reproducibility due to variability between the raters used (<xref ref-type="bibr" rid="B6">6</xref>&#x02013;<xref ref-type="bibr" rid="B8">8</xref>). Fortunately, this is changing with the development of sentiment analysis and natural language processing as a toolset capable of understanding the qualitative statements made in a body of text with consistency and accuracy, creating a promising avenue to address these shortcomings.</p>
<p>In recent years, several studies have explored the assessment of citation sentiment analysis in academic literature (<xref ref-type="bibr" rid="B9">9</xref>&#x02013;<xref ref-type="bibr" rid="B12">12</xref>) with the goal of examining the sentiment toward papers cited in the body of another article as an assessment of article impact. Sentiment analysis has also been applied to the analysis of clinical notes in the electronic health record with the goal of prognostication (<xref ref-type="bibr" rid="B13">13</xref>, <xref ref-type="bibr" rid="B14">14</xref>). However, attempts to use sentiment analysis to characterize the qualitative findings authors express toward their own clinical publication&#x00027;s findings have been minimal, with only two studies being published at the time of writing of this manuscript (<xref ref-type="bibr" rid="B15">15</xref>, <xref ref-type="bibr" rid="B16">16</xref>). Importantly, the model accuracy in both cited studies was limited by the technology available at the time and by the availability of labeled abstract data for training the algorithms developed. Similarly, the restriction of these algorithms to the two-class tasks of positive/neutral (<xref ref-type="bibr" rid="B15">15</xref>) or positive/not positive (<xref ref-type="bibr" rid="B16">16</xref>), respectively, limited their practical use. The methods used in these studies did not take advantage of newer natural language processing architectures such as the context-sensitive Bidirectional Encoder Representations from Transformers (BERT) model (<xref ref-type="bibr" rid="B17">17</xref>), or similar newer models built for the analysis of biomedical text (<xref ref-type="bibr" rid="B18">18</xref>), instead opting for the use of a support vector machine (<xref ref-type="bibr" rid="B15">15</xref>) and a sequential neural network (<xref ref-type="bibr" rid="B16">16</xref>), respectively.</p>
<p>With the limitations of these previous studies in mind, this study&#x00027;s goal was to develop and validate a sentiment analysis model for clinical trial abstracts that can be practically applied to large-scale assessment of clinical literature using the more modern GAN-BERT architecture (<xref ref-type="bibr" rid="B19">19</xref>). This model is a semi-supervised approach to fine-tuning a BERT model, taking advantage of both the decreased sample size required due to a semi-supervised approach, and the increased accuracy that the BERT architecture has become known for (<xref ref-type="bibr" rid="B17">17</xref>). Developing a tool with reproducible results for systematic large-scale assessment of reporting trends in clinical literature in this manner is a large step forward in actually addressing the issue of biased reporting in clinical literature and its subsequent harm to patients.</p>
</sec>
<sec sec-type="methods" id="s2">
<title>Methods</title>
<sec>
<title>Creation of the Labeled and Unlabeled Training Sets</title>
<p>There are no publicly available annotated datasets specific to sentiment analysis of clinical trials, so for the purposes of this study an appropriate annotated dataset had to be created. Given that the best raters for the sentiment rating of clinical trials are trained clinician experts, creation of a fully annotated dataset for this study was determined to be particularly resource intensive, a problem inherent to clinically related natural language processing (NLP) tasks (<xref ref-type="bibr" rid="B20">20</xref>). As such, this study elected to use a semi-supervised approach, combining expert-annotated clinical trial abstracts and large amounts of unlabeled data to minimize the resources required to create the final algorithm.</p>
</sec>
<sec>
<title>Data Gathering and Annotation</title>
<p>All abstracts gathered for this study were from the National Library of Medicine&#x00027;s (NLM) PubMed database, filtered specifically to publications classified as clinical trials. The collection of these abstracts was automated using the NCBI&#x00027;s Entrez search and retrieval system with a data mining tool built by the authors using the BioPython toolkit (<xref ref-type="bibr" rid="B21">21</xref>). This tool can gather all MEDLINE data reported for a particular PubMed query, and can restrict a search to a specific medical field by cross-referencing journal ID numbers with an NLM catalog query.</p>
<p>For the creation of the labeled dataset, 12 abstracts each from clinical trials in the fields of Obstetrics &#x00026; Gynecology, Orthopedics, Pediatrics, Anesthesiology, General Surgery, Internal Medicine, Thoracic Surgery, Critical Care, and Cardiology were randomly selected, for a total of 108 labeled abstracts. These abstracts were then stripped of all information other than the abstract text and provided to a panel of three clinicians with a range of 9&#x02013;19 years of experience, who independently labeled the abstracts as having positive, negative, or neutral sentiment, with examples shown in <xref ref-type="table" rid="T1">Table 1</xref>. The ground truth class of each abstract was then defined as the most common rating assigned by the three clinicians, with a fourth clinician reviewing abstracts for which the other three did not reach a majority decision. The smallest class of the labeled data was then oversampled and the largest class undersampled to equal the number of samples in the median class, in order to create an algorithm with a maximally balanced accuracy between classes (<xref ref-type="bibr" rid="B22">22</xref>). The unlabeled dataset was a collection of 2,000 clinical trial abstracts selected from PubMed in the same manner described above, excluding those used in the labeled dataset. The unlabeled data were then given a label of UNK UNK so that the label is appropriately masked when the data are used to train the classification algorithm.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Examples of positive, negative, and neutral text in abstracts.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th valign="top" align="left"><bold>Example</bold></th>
<th valign="top" align="left"><bold>Classification</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">This study showed promising results regarding treatment A</td>
<td valign="top" align="left">Positive</td>
</tr>
<tr>
<td valign="top" align="left">This study showed no significant difference between Treatment A and Treatment B</td>
<td valign="top" align="left">Neutral</td>
</tr>
<tr>
<td valign="top" align="left">This study showed that treatment A is inappropriate for common use</td>
<td valign="top" align="left">Negative</td>
</tr>
</tbody>
</table>
</table-wrap>
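<p>The labeling and balancing steps described above can be illustrated with a minimal Python sketch. It is not the code used in this study: the function names, the random seed, and the choice to oversample with replacement are assumptions made for illustration only.</p>
<preformat>
```python
import random
from collections import Counter


def consensus_label(ratings, tiebreak):
    """Return the majority label of the three raters.

    If no label receives at least two votes, fall back to the label
    given by the fourth (tie-breaking) rater.
    """
    label, votes = Counter(ratings).most_common(1)[0]
    return label if votes >= 2 else tiebreak


def balance_to_median(samples_by_class, seed=0):
    """Resample every class to the size of the median class.

    Larger classes are undersampled without replacement; smaller classes
    keep all of their items and are topped up by sampling with replacement.
    Assumes an odd number of classes (three in this study).
    """
    rng = random.Random(seed)  # hypothetical fixed seed, for repeatability
    sizes = sorted(len(items) for items in samples_by_class.values())
    target = sizes[len(sizes) // 2]
    balanced = {}
    for label, items in samples_by_class.items():
        if len(items) > target:
            balanced[label] = rng.sample(items, target)
        else:
            extra = [rng.choice(items) for _ in range(target - len(items))]
            balanced[label] = list(items) + extra
    return balanced
```
</preformat>
<p>With class sizes of 26, 69, and 13, such a procedure would yield 26 samples per class, or 78 samples in total.</p>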
</sec>
<sec>
<title>Data Preprocessing</title>
<p>The conclusion sentences of the labeled and unlabeled abstracts to be used for training and validation were then extracted. This was done because previous studies found that using solely the concluding sentences led to an increase in classification accuracy (<xref ref-type="bibr" rid="B15">15</xref>, <xref ref-type="bibr" rid="B16">16</xref>). Using the Natural Language Tool Kit (NLTK) Python toolkit (<xref ref-type="bibr" rid="B23">23</xref>), concluding sentences were identified as those following the conclusion heading for structured abstracts. For unstructured abstracts, the conclusion sentences were determined to be the last <italic>n</italic> sentences of an abstract, with <italic>n</italic> calculated from the number of sentences in the abstract using equation 1 below, where <italic>S</italic><sub><italic>t</italic></sub> is the total number of sentences.</p>
<disp-formula id="E1"><label>(1)</label><mml:math id="M1"><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mi>ceil</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>S</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo>&#x000D7;</mml:mo><mml:mn>0.125</mml:mn></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>The relative value of 0.125 was determined empirically in a previous study based on the analysis of 2,000 structured abstracts (<xref ref-type="bibr" rid="B15">15</xref>).</p>
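<p>Equation 1 can be sketched in a few lines of Python. The sketch assumes the abstract has already been split into a list of sentences (for example with a sentence tokenizer); the function names are hypothetical and are not taken from the code used in this study.</p>
<preformat>
```python
import math


def conclusion_sentence_count(total_sentences, fraction=0.125):
    """Number of trailing sentences treated as the conclusion (Equation 1)."""
    return math.ceil(total_sentences * fraction)


def extract_conclusion(sentences, fraction=0.125):
    """Return the last n sentences of an unstructured abstract."""
    n = conclusion_sentence_count(len(sentences), fraction)
    # Guard against n == 0: sentences[-0:] would return the whole list.
    return sentences[-n:] if n else []
```
</preformat>
<p>Under this rule, an abstract of up to 8 sentences contributes 1 conclusion sentence, one of 9 to 16 sentences contributes 2, and so on.</p>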
<p>Tokenization: Following extraction of the conclusion sentences, all sentences were tokenized using the BERT tokenizer available as part of the HuggingFace Transformers toolkit (<xref ref-type="bibr" rid="B24">24</xref>), which splits each sentence into word and sub-word (WordPiece) tokens. The BERT tokenizer begins by prepending the token [CLS] to each sentence, then converts each token to its corresponding ID as defined in the pre-trained BERT model. The end of each sentence is then padded with the tag [PAD] to a fixed sentence length, as the BERT model requires a fixed-length sentence as an input (<xref ref-type="bibr" rid="B17">17</xref>).</p>
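<p>The encoding steps described above can be illustrated with a toy Python sketch. It does not call the HuggingFace tokenizer: the vocabulary and its IDs are hypothetical, the input is assumed to be already split into tokens, and the sub-word splitting and the [SEP] token that the real BERT tokenizer also applies are omitted for brevity.</p>
<preformat>
```python
# Hypothetical toy vocabulary; real BERT vocabularies hold tens of thousands of entries.
TOY_VOCAB = {
    "[PAD]": 0,
    "[CLS]": 1,
    "[UNK]": 2,
    "treatment": 3,
    "was": 4,
    "effective": 5,
}


def encode(tokens, vocab, max_len):
    """Prepend [CLS], map tokens to IDs, then truncate or pad to max_len."""
    unknown = vocab["[UNK]"]
    ids = [vocab["[CLS]"]] + [vocab.get(tok, unknown) for tok in tokens]
    ids = ids[:max_len]  # truncate sentences that are too long
    ids += [vocab["[PAD]"]] * (max_len - len(ids))  # pad short ones
    return ids
```
</preformat>
<p>Every encoded sentence then has the same length, which is what allows sentences to be batched as fixed-size model inputs.</p>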
</sec>
<sec>
<title>GAN-BioBERT Workflow</title>
<p>Generally, the GAN-BERT architecture consists of a generator function G based on the Semi-Supervised generative adversarial network (GAN) architecture that generates fake samples F using a noise vector as input (<xref ref-type="bibr" rid="B25">25</xref>), the pre-trained BERT model, which is given the labeled data, and a discriminator function D that is a BERT-based k-class classifier that is fine-tuned to the classification task (<xref ref-type="bibr" rid="B19">19</xref>). This workflow is shown graphically in <xref ref-type="fig" rid="F1">Figure 1</xref>, with further discussion of each element to follow. The GAN-BERT architecture as written by its original creators uses HuggingFace Transformers as the basis for its implementation in Python, which is also what was used in this study (<xref ref-type="bibr" rid="B19">19</xref>).</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>A visual representation of the GAN-BERT algorithm as described by the original developers, where G, Generator; D, Discriminator; F, Fake Sample (<xref ref-type="bibr" rid="B19">19</xref>).</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdgth-04-878369-g0001.tif"/>
</fig>
</sec>
<sec>
<title>BERT Architecture</title>
<p>Before discussing the details of the algorithm used in this study it is key to first discuss the general BERT architecture. The Bidirectional Encoder Representations from Transformers (BERT) model is a method for language processing first described in 2018 by Devlin et al. that achieved state-of-the-art performance on a variety of natural language processing tasks and has since become a heavily used tool in natural language processing research (<xref ref-type="bibr" rid="B17">17</xref>). BERT functions using two sequential workflows: a semi-supervised language modeling task that develops a general language model, then a supervised learning step specific to the language processing task the model is being applied to, such as text classification. For developing the pre-trained language model, BERT is provided with a very large corpus from a particular domain, such as publications in PubMed (<xref ref-type="bibr" rid="B18">18</xref>), documents from a particular language (<xref ref-type="bibr" rid="B26">26</xref>), or English Wikipedia and BooksCorpus as in the original BERT model (<xref ref-type="bibr" rid="B17">17</xref>). BERT then develops a complete language model from the provided corpus using both masked language modeling, which determines the meaning of individual words within the sentence&#x00027;s context, and next sentence prediction, which works to understand the relationship between sentences. The result of this process is a trained context-sensitive general language model for the specific domain being studied that can then be disseminated for a wide variety of applications. The pretrained language model from the semi-supervised stage of BERT is then fine-tuned for a specific language task by providing task-specific inputs and outputs and then adjusting the parameters of the model accordingly to create the complete task-specific algorithm (<xref ref-type="bibr" rid="B17">17</xref>).</p>
</sec>
<sec>
<title>BERT Pretrained Model Selection</title>
<p>Given the important role of the pretrained model in the BERT architecture, and the relative complexity of biomedical literature, general language models are likely to encounter lower accuracy when applied to a biomedical application such as the one in this study due to a change in the word distributions between general and biomedical corpora (<xref ref-type="bibr" rid="B18">18</xref>). As such, in this study the pretrained BioBERT model was used as the general language model to be fine-tuned for sentiment classification (<xref ref-type="bibr" rid="B18">18</xref>). BioBERT is a 2020 pretrained BERT model by Lee et al. that is specific to the biomedical domain that was trained on PubMed abstracts and PubMed Central full-text articles, as well as English Wikipedia and BooksCorpus as was done in the original BERT model (<xref ref-type="bibr" rid="B17">17</xref>, <xref ref-type="bibr" rid="B18">18</xref>). As a result of this domain specific training, BioBERT has shown improved performance on a variety of biomedical NLP tasks when compared to the standard BERT models (<xref ref-type="bibr" rid="B18">18</xref>).</p>
</sec>
<sec>
<title>GAN-BERT</title>
<p>While BERT and its derivatives have been able to achieve state-of-the-art performance on a variety of tasks, one major limitation of the model is that fully trained models typically require thousands of annotated examples to achieve these results (<xref ref-type="bibr" rid="B19">19</xref>). Significant drops in performance have been observed when &#x0003C;200 annotated examples are used (<xref ref-type="bibr" rid="B19">19</xref>). In order to address this limitation, Croce, Castellucci, and Basili developed the GAN-BERT model in 2020 as a semi-supervised approach to fine-tuning BERT models that achieves performance competitive with fully supervised settings (<xref ref-type="bibr" rid="B19">19</xref>). Specifically, GAN-BERT expands upon the BERT architecture by introducing a Semi-Supervised Generative Adversarial Network (SS-GAN) to the fine-tuning step of the BERT architecture (<xref ref-type="bibr" rid="B25">25</xref>). In an SS-GAN, a &#x0201C;generator&#x0201D; is trained to produce samples resembling the data distribution of the training data, i.e., the labeled abstracts in this study. This process is dependent on a &#x0201C;discriminator,&#x0201D; a BERT-based classifier in the case of this study, which in an SS-GAN is trained to classify the data into their true classes, in addition to identifying whether the sample was created by the generator or not. When trained in this manner, the labeled abstract data are used to train the discriminator, while both the unlabeled abstracts and the generated data are used to improve the model&#x00027;s inner representations of the classes, which subsequently increases the model&#x00027;s generalizability to new data (<xref ref-type="bibr" rid="B19">19</xref>). As a result of this approach, the minimum number of annotated samples needed to train a BERT model is reduced from thousands to a few dozen (<xref ref-type="bibr" rid="B19">19</xref>).
Because of this effect, this study uses GAN-BERT to minimize the resource-intensive process of creating an expert-annotated corpus of clinical trial abstracts. A detailed mathematical description of this algorithm and its processes, including the determination of its loss functions, can be found elsewhere (<xref ref-type="bibr" rid="B19">19</xref>).</p>
<p>In summary, in this study GAN-BioBERT takes the BERT architecture pretrained on biomedical text using BioBERT (<xref ref-type="bibr" rid="B18">18</xref>) and fine-tunes it for sentiment classification of clinical trial abstracts in a semi-supervised manner by using adversarial learning in the form of an SS-GAN architecture known as GAN-BERT (<xref ref-type="bibr" rid="B19">19</xref>, <xref ref-type="bibr" rid="B25">25</xref>). The training data consisted of a set of clinical trial abstracts annotated by three expert raters as positive, negative, or neutral, where the least common class was upsampled and the most common class was downsampled to create a balanced training set, as well as 2,000 unlabeled clinical trial abstracts (121,856 tokens). The validation accuracy and F1-scores of the resulting algorithm were then determined and compared to previous attempts at applying sentiment analysis to the findings in clinical trial abstracts, to the performance of a fourth expert rater on the same labeled data used to train and validate the algorithm, and to the original GAN-BERT algorithm without BioBERT (i.e., with the standard BERT pretrained model).</p>
</sec>
</sec>
<sec sec-type="results" id="s3">
<title>Results</title>
<p>Of the 108 abstracts (4,674 tokens) labeled by the expert raters, 26 were classified as positive, 69 were classified as neutral, and 13 were classified as negative by the raters. As such, the negative samples were up-sampled, and the neutral samples were down-sampled so that each class contained 26 examples, for a final labeled dataset of 78 abstracts for training purposes. In order to have a test set with a distribution similar to what is present in application of the algorithm, 23 of the samples were held out as the test set for determining the performance of the algorithm prior to balancing of the training dataset.</p>
<p>After completion of training, the final GAN-BioBERT algorithm was found to have an accuracy of 91.3% and a macro F1-Score of 0.92. Training of the algorithm took 45 min in the Google Colaboratory environment with 35 GB of RAM and TPU hardware acceleration. The confusion matrix associated with these results is shown in <xref ref-type="fig" rid="F2">Figure 2</xref>.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Confusion matrix for GAN-BioBERT.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fdgth-04-878369-g0002.tif"/>
</fig>
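<p>For readers unfamiliar with these metrics, the following Python sketch shows how accuracy and macro F1-score are derived from a confusion matrix whose rows are true classes and whose columns are predicted classes. The function names and the example matrix are hypothetical; the matrix does not reproduce the values in Figure 2 and was chosen only so that 21 of 23 samples are correct, which is the count consistent with an accuracy of 91.3% on a 23-sample test set (21/23 &#x02248; 0.913).</p>
<preformat>
```python
def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


def macro_f1(cm):
    """Unweighted mean of the per-class F1-scores."""
    k = len(cm)
    scores = []
    for i in range(k):
        true_pos = cm[i][i]
        predicted = sum(cm[r][i] for r in range(k))  # column sum
        actual = sum(cm[i])  # row sum
        precision = true_pos / predicted if predicted else 0.0
        recall = true_pos / actual if actual else 0.0
        denom = precision + recall
        scores.append(2 * precision * recall / denom if denom else 0.0)
    return sum(scores) / k


# Hypothetical 3-class matrix (positive, neutral, negative), 23 samples in total.
EXAMPLE_CM = [
    [5, 1, 0],
    [0, 13, 1],
    [0, 0, 3],
]
```
</preformat>
<p>Because the macro F1-score weights every class equally, it is more sensitive than accuracy to errors in the smaller positive and negative classes.</p>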
<p>GAN-BERT using the base uncased BERT pretrained model was found to have an accuracy of 82.6% and a macro F1-score of 0.824. These results, alongside the results of the two previous studies investigating sentiment analysis of clinical trial abstracts, are summarized in <xref ref-type="table" rid="T2">Table 2</xref>.</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Performance metric results for both this study and previous studies.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Study</bold></th>
<th valign="top" align="left"><bold>Classification Method</bold></th>
<th valign="top" align="left"><bold>Classification Type (&#x00023; of classes)</bold></th>
<th valign="top" align="center"><bold>Accuracy</bold></th>
<th valign="top" align="center"><bold>F1-Score</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Fischer and Steiger (<xref ref-type="bibr" rid="B16">16</xref>)</td>
<td valign="top" align="left">Word Frequency &#x0002B; Sequential Neural Network</td>
<td valign="top" align="left">Positive, Not Positive (2)</td>
<td valign="top" align="center">73%</td>
<td valign="top" align="center">N/A</td>
</tr>
<tr>
<td valign="top" align="left">Zlabinger et al. (<xref ref-type="bibr" rid="B15">15</xref>)</td>
<td valign="top" align="left">Uni-gram Features &#x0002B; Support Vector Machine (SVM)</td>
<td valign="top" align="left">Positive, Neutral (2)</td>
<td valign="top" align="center">76%</td>
<td valign="top" align="center">0.72</td>
</tr>
<tr>
<td valign="top" align="left">This study, 2021</td>
<td valign="top" align="left">GAN-BERT</td>
<td valign="top" align="left">Positive, Negative, Neutral (3)</td>
<td valign="top" align="center">82.6%</td>
<td valign="top" align="center">0.824</td>
</tr>
<tr>
<td valign="top" align="left">This study, 2021</td>
<td valign="top" align="left">GAN-BioBERT</td>
<td valign="top" align="left">Positive, Negative, Neutral (3)</td>
<td valign="top" align="center">91.3%</td>
<td valign="top" align="center">0.92</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec sec-type="discussion" id="s4">
<title>Discussion</title>
<p>From a technical perspective, these results show that GAN-BioBERT is a significant step forward for assessing the sentiment in clinical trial literature, with an 8.7 percentage point improvement in accuracy over GAN-BERT for the same classification task. This improvement in domain-specific classification performance with the creation of a domain-specific algorithm is reasonable to expect, but it is important for assuring the algorithm&#x00027;s viability in an application where highly specific and technical language is commonplace. Beyond this, a reliable, rapid assessment method is a large step forward in the process of assessing trends in clinical literature, as this assessment has traditionally been performed manually, which creates significant resource and time restrictions on larger literature reviews. It also provides a reliable method of assessing potential biases in the literature by operationalizing some amount of subjective assessment of the literature using artificial intelligence.</p>
<p>When technically compared to previous studies&#x00027; attempts at classifying sentiment in clinical trial abstracts (<xref ref-type="bibr" rid="B15">15</xref>, <xref ref-type="bibr" rid="B16">16</xref>), this improvement is even more significant, as there is an absolute accuracy improvement of 15.3%, while also expanding the classification task to the three classes positive, negative, and neutral, as opposed to the two-class positive/not positive (<xref ref-type="bibr" rid="B16">16</xref>), or positive/neutral (<xref ref-type="bibr" rid="B15">15</xref>). This significant improvement in accuracy and expansion of the number of classes make GAN-BioBERT much more suitable for large-scale assessment of the sentiment in clinical trial literature with improved accuracy and data resolution. With the already high classification accuracy of the algorithm in mind, further technical development of this algorithm may include the introduction of finer-grained sentiment classification, as well as the use of a larger set of labeled training data with more expert raters contributing, to improve inference performance given the subjectivity of the task.</p>
</sec>
<sec sec-type="conclusions" id="s5">
<title>Conclusion</title>
<p>This study presents GAN-BioBERT, a sentiment analysis classifier for assessing the sentiment expressed in clinical trial abstracts. GAN-BioBERT was shown to significantly outperform previous attempts to classify sentiment in clinical trial abstracts with regard to both accuracy and number of sentiment classes. Given this high multi-class accuracy and the reproducible results GAN-BioBERT generates, this study posits GAN-BioBERT as a viable tool for large-scale assessment of the findings expressed in clinical trial literature in a way that was not previously possible. This is a needed step forward in the methods used to address the important and patient-impacting issue of reporting bias in clinical literature. With a tool such as GAN-BioBERT, large-scale assessment of qualitative reporting trends in clinical trial literature becomes significantly more feasible, with more reproducible findings than the past practice of manual assessment of reporting bias.</p>
</sec>
<sec sec-type="data-availability" id="s6">
<title>Data Availability Statement</title>
<p>The model generated for this study can be found in the Zenodo repository at: <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5281/zenodo.5699018">https://doi.org/10.5281/zenodo.5699018</ext-link>.</p>
</sec>
<sec id="s7">
<title>Author Contributions</title>
<p>JM, EK, and KS were responsible for conception and design. KS was responsible for administrative support. PM, KB, LK, and KS were responsible for provision of study materials. JM and EK were responsible for collection and assembly of data. JM was responsible for data analysis and interpretation. All authors were responsible for manuscript writing as well as final approval of the manuscript.</p>
</sec>
<sec sec-type="funding-information" id="s8">
<title>Funding</title>
<p>This study was funded by the University of Wisconsin School of Medicine and Public Health&#x00027;s Shapiro Summer Research Program.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s9">
<title>Publisher&#x00027;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
</body>
<back>
<ref-list>
<title>References</title>
<ref id="B1">
<label>1.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>McGauran</surname> <given-names>N</given-names></name> <name><surname>Wieseler</surname> <given-names>B</given-names></name> <name><surname>Kreis</surname> <given-names>J</given-names></name> <name><surname>Sch&#x000FC;ler</surname> <given-names>YB</given-names></name> <name><surname>K&#x000F6;lsch</surname> <given-names>H</given-names></name> <name><surname>Kaiser</surname> <given-names>T</given-names></name></person-group>. <article-title>Reporting bias in medical research - a narrative review</article-title>. <source>Trials.</source> (<year>2010</year>) <volume>11</volume>:<fpage>37</fpage>. <pub-id pub-id-type="doi">10.1186/1745-6215-11-37</pub-id><pub-id pub-id-type="pmid">20388211</pub-id></citation></ref>
<ref id="B2">
<label>2.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sutton</surname> <given-names>AJ</given-names></name> <name><surname>Duval</surname> <given-names>SJ</given-names></name> <name><surname>Tweedie</surname> <given-names>RL</given-names></name> <name><surname>Abrams</surname> <given-names>KR</given-names></name> <name><surname>Jones</surname> <given-names>DR</given-names></name></person-group>. <article-title>Empirical assessment of effect of publication bias on meta-analyses</article-title>. <source>BMJ.</source> (<year>2000</year>) <volume>320</volume>:<fpage>1574</fpage>&#x02013;<lpage>7</lpage>. <pub-id pub-id-type="doi">10.1136/bmj.320.7249.1574</pub-id><pub-id pub-id-type="pmid">10845965</pub-id></citation></ref>
<ref id="B3">
<label>3.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hedin</surname> <given-names>RJ</given-names></name> <name><surname>Umberham</surname> <given-names>BA</given-names></name> <name><surname>Detweiler</surname> <given-names>BN</given-names></name> <name><surname>Kollmorgen</surname> <given-names>L</given-names></name> <name><surname>Vassar</surname> <given-names>M</given-names></name></person-group>. <article-title>Publication bias and nonreporting found in majority of systematic reviews and meta-analyses in anesthesiology journals</article-title>. <source>Anesth Analg.</source> (<year>2016</year>) <volume>123</volume>:<fpage>1018</fpage>&#x02013;<lpage>25</lpage>. <pub-id pub-id-type="doi">10.1213/ANE.0000000000001452</pub-id><pub-id pub-id-type="pmid">27537925</pub-id></citation></ref>
<ref id="B4">
<label>4.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lin</surname> <given-names>L</given-names></name> <name><surname>Chu</surname> <given-names>H</given-names></name></person-group>. <article-title>Quantifying publication bias in meta-analysis</article-title>. <source>Biometrics.</source> (<year>2018</year>) <volume>74</volume>:<fpage>785</fpage>&#x02013;<lpage>94</lpage>. <pub-id pub-id-type="doi">10.1111/biom.12817</pub-id><pub-id pub-id-type="pmid">29141096</pub-id></citation></ref>
<ref id="B5">
<label>5.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Egger</surname> <given-names>M</given-names></name> <name><surname>Smith</surname> <given-names>GD</given-names></name> <name><surname>Schneider</surname> <given-names>M</given-names></name> <name><surname>Minder</surname> <given-names>C</given-names></name></person-group>. <article-title>Bias in meta-analysis detected by a simple, graphical test</article-title>. <source>BMJ.</source> (<year>1997</year>) <volume>315</volume>:<fpage>629</fpage>&#x02013;<lpage>34</lpage>. <pub-id pub-id-type="doi">10.1136/bmj.315.7109.629</pub-id><pub-id pub-id-type="pmid">9310563</pub-id></citation></ref>
<ref id="B6">
<label>6.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>De Oliveira Jr</surname> <given-names>GS</given-names></name> <name><surname>Chang</surname> <given-names>R</given-names></name> <name><surname>Kendall</surname> <given-names>MC</given-names></name> <name><surname>Fitzgerald</surname> <given-names>PC</given-names></name> <name><surname>McCarthy</surname> <given-names>RJ</given-names></name></person-group>. <article-title>Publication bias in the anesthesiology literature</article-title>. <source>Anesth Analg.</source> (<year>2012</year>) <volume>114</volume>:<fpage>1042</fpage>&#x02013;<lpage>8</lpage>. <pub-id pub-id-type="doi">10.1213/ANE.0b013e3182468fc6</pub-id><pub-id pub-id-type="pmid">22344237</pub-id></citation></ref>
<ref id="B7">
<label>7.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chia-Chun Yuan</surname> <given-names>J</given-names></name> <name><surname>Shyamsunder</surname> <given-names>N</given-names></name> <name><surname>Adelino Ricardo Bar&#x000E3;o</surname> <given-names>V</given-names></name> <name><surname>Lee</surname> <given-names>DJ</given-names></name> <name><surname>Sukotjo</surname> <given-names>C</given-names></name></person-group>. <article-title>Publication bias in five dental implant journals: an observation from 2005 to 2009</article-title>. <source>Int J Oral Maxillofacial Implants.</source> (<year>2011</year>) <volume>26</volume>:<fpage>1024</fpage>&#x02013;<lpage>32</lpage>.<pub-id pub-id-type="pmid">22010086</pub-id></citation></ref>
<ref id="B8">
<label>8.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vecchi</surname> <given-names>S</given-names></name> <name><surname>Belleudi</surname> <given-names>V</given-names></name> <name><surname>Amato</surname> <given-names>L</given-names></name> <name><surname>Davoli</surname> <given-names>M</given-names></name> <name><surname>Perucci</surname> <given-names>CA</given-names></name></person-group>. <article-title>Does direction of results of abstracts submitted to scientific conferences on drug addiction predict full publication?</article-title>. <source>BMC Med Res Methodol.</source> (<year>2009</year>) <volume>9</volume>:<fpage>1</fpage>&#x02013;<lpage>5</lpage>. <pub-id pub-id-type="doi">10.1186/1471-2288-9-23</pub-id><pub-id pub-id-type="pmid">19356245</pub-id></citation></ref>
<ref id="B9">
<label>9.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Xu</surname> <given-names>J</given-names></name> <name><surname>Zhang</surname> <given-names>Y</given-names></name> <name><surname>Wu</surname> <given-names>Y</given-names></name> <name><surname>Wang</surname> <given-names>J</given-names></name> <name><surname>Dong</surname> <given-names>X</given-names></name> <name><surname>Xu</surname> <given-names>H</given-names></name></person-group>. <article-title>Citation sentiment analysis in clinical trial papers</article-title>. In: <source>AMIA Annual Symposium Proceedings.</source> <publisher-loc>San Francisco, CA</publisher-loc>: <publisher-name>American Medical Informatics Association</publisher-name> (<year>2015</year>). p. <fpage>1334</fpage>.<pub-id pub-id-type="pmid">26958274</pub-id></citation></ref>
<ref id="B10">
<label>10.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Aljuaid</surname> <given-names>H</given-names></name> <name><surname>Iftikhar</surname> <given-names>R</given-names></name> <name><surname>Ahmad</surname> <given-names>S</given-names></name> <name><surname>Asif</surname> <given-names>M</given-names></name> <name><surname>Afzal</surname> <given-names>MT</given-names></name></person-group>. <article-title>Important citation identification using sentiment analysis of in-text citations</article-title>. <source>Telemat Inform.</source> (<year>2021</year>) <volume>56</volume>:<fpage>101492</fpage>. <pub-id pub-id-type="doi">10.1016/j.tele.2020.101492</pub-id></citation>
</ref>
<ref id="B11">
<label>11.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yousif</surname> <given-names>A</given-names></name> <name><surname>Niu</surname> <given-names>Z</given-names></name> <name><surname>Tarus</surname> <given-names>JK</given-names></name> <name><surname>Ahmad</surname> <given-names>A</given-names></name></person-group>. <article-title>A survey on sentiment analysis of scientific citations</article-title>. <source>Artificial Intellig Rev.</source> (<year>2019</year>) <volume>52</volume>:<fpage>1805</fpage>&#x02013;<lpage>38</lpage>. <pub-id pub-id-type="doi">10.1007/s10462-017-9597-8</pub-id></citation>
</ref>
<ref id="B12">
<label>12.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kilicoglu</surname> <given-names>H</given-names></name> <name><surname>Peng</surname> <given-names>Z</given-names></name> <name><surname>Tafreshi</surname> <given-names>S</given-names></name> <name><surname>Tran</surname> <given-names>T</given-names></name> <name><surname>Rosemblat</surname> <given-names>G</given-names></name> <name><surname>Schneider</surname> <given-names>J</given-names></name></person-group>. <article-title>Confirm or refute?: A comparative study on citation sentiment classification in clinical research publications</article-title>. <source>J Biomed Inform.</source> (<year>2019</year>) <volume>91</volume>:<fpage>103123</fpage>. <pub-id pub-id-type="doi">10.1016/j.jbi.2019.103123</pub-id><pub-id pub-id-type="pmid">30753947</pub-id></citation></ref>
<ref id="B13">
<label>13.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Weissman</surname> <given-names>GE</given-names></name> <name><surname>Ungar</surname> <given-names>LH</given-names></name> <name><surname>Harhay</surname> <given-names>MO</given-names></name> <name><surname>Courtright</surname> <given-names>KR</given-names></name> <name><surname>Halpern</surname> <given-names>SD</given-names></name></person-group>. <article-title>Construct validity of six sentiment analysis methods in the text of encounter notes of patients with critical illness</article-title>. <source>J Biomed Inform.</source> (<year>2019</year>) <volume>89</volume>:<fpage>114</fpage>&#x02013;<lpage>21</lpage>. <pub-id pub-id-type="doi">10.1016/j.jbi.2018.12.001</pub-id><pub-id pub-id-type="pmid">30557683</pub-id></citation></ref>
<ref id="B14">
<label>14.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Ghassemi</surname> <given-names>MM</given-names></name> <name><surname>Mark</surname> <given-names>RG</given-names></name> <name><surname>Nemati</surname> <given-names>S</given-names></name></person-group>. <article-title>A visualization of evolving clinical sentiment using vector representations of clinical notes</article-title>. In: <source>2015 Computing in Cardiology Conference (CinC).</source> <publisher-loc>Nice</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2015</year>). pp. <fpage>629</fpage>&#x02013;<lpage>32</lpage>.<pub-id pub-id-type="pmid">27774487</pub-id></citation></ref>
<ref id="B15">
<label>15.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zlabinger</surname> <given-names>M</given-names></name> <name><surname>Andersson</surname> <given-names>L</given-names></name> <name><surname>Brassey</surname> <given-names>J</given-names></name> <name><surname>Hanbury</surname> <given-names>A</given-names></name></person-group>. <article-title>Extracting the population, intervention, comparison and sentiment from randomized controlled trials</article-title>. In: <source>Building Continents of Knowledge in Oceans of Data: The Future of Co-Created eHealth.</source> <publisher-loc>Gothenburg</publisher-loc>: <publisher-name>IOS Press</publisher-name> (<year>2018</year>). pp. <fpage>146</fpage>&#x02013;<lpage>50</lpage>.<pub-id pub-id-type="pmid">29677940</pub-id></citation></ref>
<ref id="B16">
<label>16.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fischer</surname> <given-names>I</given-names></name> <name><surname>Steiger</surname> <given-names>HJ</given-names></name></person-group>. <article-title>Toward automatic evaluation of medical abstracts: the current value of sentiment analysis and machine learning for classification of the importance of PubMed abstracts of randomized trials for stroke</article-title>. <source>J Stroke Cerebrovasc Dis.</source> (<year>2020</year>) <volume>29</volume>:<fpage>105042</fpage>. <pub-id pub-id-type="doi">10.1016/j.jstrokecerebrovasdis.2020.105042</pub-id><pub-id pub-id-type="pmid">32807454</pub-id></citation></ref>
<ref id="B17">
<label>17.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Devlin</surname> <given-names>J</given-names></name> <name><surname>Chang</surname> <given-names>MW</given-names></name> <name><surname>Lee</surname> <given-names>K</given-names></name> <name><surname>Toutanova</surname> <given-names>K</given-names></name></person-group>. <article-title>BERT: pre-training of deep bidirectional transformers for language understanding</article-title>. <source>arXiv preprint arXiv:1810.04805.</source> (<year>2018</year>).</citation>
</ref>
<ref id="B18">
<label>18.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lee</surname> <given-names>J</given-names></name> <name><surname>Yoon</surname> <given-names>W</given-names></name> <name><surname>Kim</surname> <given-names>S</given-names></name> <name><surname>Kim</surname> <given-names>D</given-names></name> <name><surname>Kim</surname> <given-names>S</given-names></name> <name><surname>So</surname> <given-names>CH</given-names></name> <etal/></person-group>. <article-title>BioBERT: a pre-trained biomedical language representation model for biomedical text mining</article-title>. <source>Bioinformatics.</source> (<year>2020</year>) <volume>36</volume>:<fpage>1234</fpage>&#x02013;<lpage>40</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btz682</pub-id><pub-id pub-id-type="pmid">31501885</pub-id></citation></ref>
<ref id="B19">
<label>19.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Croce</surname> <given-names>D</given-names></name> <name><surname>Castellucci</surname> <given-names>G</given-names></name> <name><surname>Basili</surname> <given-names>R</given-names></name></person-group>. <article-title>GAN-BERT: generative adversarial learning for robust text classification with a bunch of labeled examples</article-title>. In: <source>Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics.</source> (<year>2020</year>). pp. <fpage>2114</fpage>&#x02013;<lpage>19</lpage>.</citation>
</ref>
<ref id="B20">
<label>20.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Xia</surname> <given-names>F</given-names></name> <name><surname>Yetisgen-Yildiz</surname> <given-names>M</given-names></name></person-group>. <article-title>Clinical corpus annotation: challenges and strategies</article-title>. In: <source>Proceedings of the Third Workshop on Building and Evaluating Resources for Biomedical Text Mining (BioTxtM&#x00027;2012) in conjunction with the International Conference on Language Resources and Evaluation (LREC).</source> <publisher-loc>Istanbul</publisher-loc> (<year>2012</year>). p. 67.</citation>
</ref>
<ref id="B21">
<label>21.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cock</surname> <given-names>PJ</given-names></name> <name><surname>Antao</surname> <given-names>T</given-names></name> <name><surname>Chang</surname> <given-names>JT</given-names></name> <name><surname>Chapman</surname> <given-names>BA</given-names></name> <name><surname>Cox</surname> <given-names>CJ</given-names></name> <name><surname>Dalke</surname> <given-names>A</given-names></name> <etal/></person-group>. <article-title>Biopython: freely available Python tools for computational molecular biology and bioinformatics</article-title>. <source>Bioinformatics.</source> (<year>2009</year>) <volume>25</volume>:<fpage>1422</fpage>&#x02013;<lpage>3</lpage>. <pub-id pub-id-type="doi">10.1093/bioinformatics/btp163</pub-id><pub-id pub-id-type="pmid">19304878</pub-id></citation></ref>
<ref id="B22">
<label>22.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wei</surname> <given-names>Q</given-names></name> <name><surname>Dunbrack Jr</surname> <given-names>RL</given-names></name></person-group>. <article-title>The role of balanced training and testing data sets for binary classifiers in bioinformatics</article-title>. <source>PLoS ONE.</source> (<year>2013</year>) <volume>8</volume>:<fpage>e67863</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0067863</pub-id><pub-id pub-id-type="pmid">23874456</pub-id></citation></ref>
<ref id="B23">
<label>23.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Bird</surname> <given-names>S</given-names></name> <name><surname>Klein</surname> <given-names>E</given-names></name> <name><surname>Loper</surname> <given-names>E</given-names></name></person-group>. <source>Natural Language Processing With Python: Analyzing Text With the Natural Language Toolkit</source>. <publisher-loc>Sebastopol, CA</publisher-loc>: <publisher-name>O&#x00027;Reilly Media, Inc</publisher-name>. (<year>2009</year>).<pub-id pub-id-type="pmid">34709599</pub-id></citation></ref>
<ref id="B24">
<label>24.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wolf</surname> <given-names>T</given-names></name> <name><surname>Chaumond</surname> <given-names>J</given-names></name> <name><surname>Debut</surname> <given-names>L</given-names></name> <name><surname>Sanh</surname> <given-names>V</given-names></name> <name><surname>Delangue</surname> <given-names>C</given-names></name> <name><surname>Moi</surname> <given-names>A</given-names></name> <etal/></person-group>. <article-title>Transformers: state-of-the-art natural language processing</article-title>. In: <source>Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations.</source> (<year>2020</year>). pp. <fpage>38</fpage>&#x02013;<lpage>45</lpage>.</citation>
</ref>
<ref id="B25">
<label>25.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Salimans</surname> <given-names>T</given-names></name> <name><surname>Goodfellow</surname> <given-names>I</given-names></name> <name><surname>Zaremba</surname> <given-names>W</given-names></name> <name><surname>Cheung</surname> <given-names>V</given-names></name> <name><surname>Radford</surname> <given-names>A</given-names></name> <name><surname>Chen</surname> <given-names>X</given-names></name></person-group>. <article-title>Improved techniques for training GANs</article-title>. <source>Adv Neural Inform Proc Syst.</source> (<year>2016</year>) <volume>29</volume>:<fpage>2234</fpage>&#x02013;<lpage>42</lpage>.</citation>
</ref>
<ref id="B26">
<label>26.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Arkhipov</surname> <given-names>M</given-names></name> <name><surname>Trofimova</surname> <given-names>M</given-names></name> <name><surname>Kuratov</surname> <given-names>Y</given-names></name> <name><surname>Sorokin</surname> <given-names>A</given-names></name></person-group>. <article-title>Tuning multilingual transformers for language-specific named entity recognition</article-title>. In: <source>Proceedings of the 7th Workshop on Balto-Slavic Natural Language Processing.</source> (<year>2019</year>). pp. <fpage>89</fpage>&#x02013;<lpage>93</lpage>.</citation>
</ref>
</ref-list> 
</back>
</article>