<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<?covid-19-tdm?>
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Public Health</journal-id>
<journal-title>Frontiers in Public Health</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Public Health</abbrev-journal-title>
<issn pub-type="epub">2296-2565</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpubh.2022.1070870</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Public Health</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>An NLP tool for data extraction from electronic health records: COVID-19 mortalities and comorbidities</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>BuHamra</surname> <given-names>Sana S.</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2048988/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Almutairi</surname> <given-names>Abdullah N.</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Buhamrah</surname> <given-names>Abdullah K.</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2080543/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Almadani</surname> <given-names>Sabah H.</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Alibrahim</surname> <given-names>Yusuf A.</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2107447/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Department of Information Science, Kuwait University</institution>, <addr-line>Kuwait City</addr-line>, <country>Kuwait</country></aff>
<aff id="aff2"><sup>2</sup><institution>Surgery Department, Al-Adan Hospital</institution>, <addr-line>Al Ahmadi</addr-line>, <country>Kuwait</country></aff>
<aff id="aff3"><sup>3</sup><institution>Department of Medical Imaging, Faculty of Medicine, University of Toronto</institution>, <addr-line>Toronto, ON</addr-line>, <country>Canada</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Jacques Demongeot, Universit&#x000E9; Grenoble Alpes, France</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Olumide Babatope Longe, Academic City University College, Ghana; Mustapha Rachdi, Universit&#x000E9; Grenoble Alpes, France</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Sana S. BuHamra <email>sana.buhamra&#x00040;ku.edu.kw</email></corresp>
<fn fn-type="other" id="fn001"><p>This article was submitted to Infectious Diseases: Epidemiology and Prevention, a section of the journal Frontiers in Public Health</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>01</day>
<month>12</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>10</volume>
<elocation-id>1070870</elocation-id>
<history>
<date date-type="received">
<day>15</day>
<month>10</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>14</day>
<month>11</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2022 BuHamra, Almutairi, Buhamrah, Almadani and Alibrahim.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>BuHamra, Almutairi, Buhamrah, Almadani and Alibrahim</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<sec>
<title>Background</title>
<p>The high infection rate, severe symptoms, and evolving aspects of the COVID-19 pandemic provide challenges for a variety of medical systems around the world. Automatic information retrieval from unstructured text is greatly aided by Natural Language Processing (NLP), the primary approach taken in this field. This study addresses COVID-19 mortality data from the intensive care unit (ICU) in Kuwait during the first 18 months of the pandemic. A key goal is to extract and classify the primary and intermediate causes of death from electronic health records (EHRs) in a timely way. In addition, comorbid conditions or concurrent diseases were retrieved and analyzed in relation to a variety of causes of mortality.</p>
</sec>
<sec>
<title>Method</title>
<p>An NLP system was built in the Python programming language to automate the extraction of primary and secondary causes of death, as well as comorbidities. The system can handle inaccurate and messy data, including inadequate formats, spelling mistakes, and mispositioned information. A decision tree machine learning method is used to classify the causes of death.</p>
</sec>
<sec>
<title>Results</title>
<p>Septic shock or sepsis-related multiorgan failure was the leading cause of mortality, accounting for 54.8% of the 1,691 ICU patients we studied. About three-quarters of patients died from acute respiratory distress syndrome (ARDS), a common intermediate cause of death. Arrhythmia (AF) was determined to be the strongest predictor of the intermediate cause of death, whether ARDS or another cause.</p>
</sec>
<sec>
<title>Conclusion</title>
<p>We created an NLP system to automate the extraction of causes of death and comorbidities from EHRs. Our method processes messy and erroneous data and classifies the primary and intermediate causes of death of COVID-19 patients. We advocate arranging the EHR with well-defined sections and menu-driven options to reduce incorrect forms.</p>
</sec>
</abstract>
<kwd-group>
<kwd>natural language processing</kwd>
<kwd>text mining</kwd>
<kwd>information extraction</kwd>
<kwd>SARS-CoV-2</kwd>
<kwd>mortality</kwd>
<kwd>decision tree</kwd>
<kwd>prediction</kwd>
</kwd-group>
<counts>
<fig-count count="8"/>
<table-count count="7"/>
<equation-count count="0"/>
<ref-count count="33"/>
<page-count count="14"/>
<word-count count="7519"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>Introduction</title>
<p>The COVID-19 pandemic has had a significant impact on how and where healthcare is delivered effectively and efficiently. During the pandemic, the need arose for novel technologies that can help predict clinical outcomes in a timely manner despite the heavy influx of patients. Clinical (text) notes constitute a major source of medical data and are rarely used to their full capacity, even though they include a wealth of subjective information. Prior to electronic health records (EHRs), practitioners had to manually collect data from clinical notes, which was costly and difficult to scale up. Despite the expanding volumes of healthcare data, Kong (<xref ref-type="bibr" rid="B1">1</xref>) claims that over 80% of text, image, signal, and other medical data collections remain unstructured and unused. One main goal in medical research is to use EHRs to extract and analyze well-structured data. Many methods have been devised and evaluated using EHRs for detecting patients with known risk factors for consequences such as stroke and significant bleeding (<xref ref-type="bibr" rid="B2">2</xref>), as well as for investigating the difficulties of decoding and comprehending clinical narratives (<xref ref-type="bibr" rid="B3">3</xref>). Natural language processing (NLP) can expedite diagnosis and care for the patients who are most vulnerable during pandemics by using textual data from medical records. According to Zhou et al. (<xref ref-type="bibr" rid="B4">4</xref>), only NLP can extract information about a patient&#x00027;s family history from free-text clinical notes. Researchers have also employed word embeddings and a Convolutional Neural Network (CNN) to recognize International Classification of Diseases (ICD-10) diagnostic codes in discharge notes, outperforming existing methods with little data preparation (<xref ref-type="bibr" rid="B5">5</xref>).</p>
<p>Artificial Intelligence (AI) and Machine Learning (ML) technologies including NLP can be used to aid in the diagnosis and treatment of individuals suffering from acute and chronic diseases during the COVID-19 pandemic. DeCapprio et al. (<xref ref-type="bibr" rid="B6">6</xref>) used medical records that had already been made public as COVID-19 proxies (pneumonia, influenza, acute bronchitis, and upper respiratory illnesses). Zoabi et al. (<xref ref-type="bibr" rid="B7">7</xref>) came up with a machine learning decision tree model that predicts a positive COVID-19 infection in an RT-PCR test during the first month of the pandemic. Izquierdo et al. (<xref ref-type="bibr" rid="B8">8</xref>) used a mix of traditional epidemiological methods, NLP, and ML predictive modeling to find out what symptoms COVID-19 patients have that make them likely to be admitted to the ICU. Guan et al. (<xref ref-type="bibr" rid="B9">9</xref>) employed simple-tree XGBoost to identify high-risk COVID-19 cases and assessed how much faster causes of death may be identified using minimally preprocessed notes.</p>
<p>This study aims to construct an NLP system to automate the extraction of primary and secondary causes of death, as well as comorbidities, from the mortality EHRs of COVID-19 patients admitted to the ICU in Kuwait during the pandemic. Since many of the free-text notes were inadequately formatted, contained spelling mistakes, and were placed in the wrong field, acquiring sufficient and reliable data was the largest hurdle. In fact, in most records in our data the causes of death were neither expressed precisely nor placed in the correct field, even though the EHR file is mortality specific.</p>
<p>Other work in the literature used available clean EHRs for analysis. However, EHRs may be inaccurate and noisy when they are compiled under extreme time and staffing pressure caused by a large influx of critical cases, as happened during the pandemic. EHRs must first be corrected and cleaned before they can be analyzed properly or used in medical systems such as the Unified Medical Language System (UMLS) and SNOMED CT. Otherwise, a significant amount of information will be lost.</p>
<p>To correct the EHRs, we relied on physicians as domain experts to identify the common mistakes made in the EHRs by their fellow physicians. Their knowledge and findings were converted into Python code that automates cleaning and fixing the data in the EHRs. The Python code also used this expert knowledge to distinguish between acute diseases and causes of death in some circumstances. In addition, each cause of death was classified as either a direct cause or a related one. Comorbidities were used as an important factor in analyzing the cause of death. This offers precise information on the causality and spectrum of comorbidities in fatal cases, allowing for an accurate evaluation of COVID-19&#x00027;s hazardous nature. Finally, we used a decision tree-based model to predict death due to ARDS or other complications. These findings can help healthcare systems plan for the spread of future pandemics and identify groups at risk.</p>
</sec>
<sec sec-type="methods" id="s2">
<title>Methods</title>
<sec>
<title>The data</title>
<p>Data on COVID-19 mortalities were retrieved from Jaber Hospital&#x00027;s mortality Electronic Health Records (EHR) for all patients admitted to the ICU between March 7, 2020, and August 19, 2021, with deaths reported between March 7, 2020, and August 27, 2021. The data set contains 1,691 cases after excluding 12 children (&#x0003C;17 years old) and 46 records with no data entries. The monthly total death rate in Kuwait is reported on the Worldometer site (<xref ref-type="bibr" rid="B10">10</xref>). On the final day of data collection for this study, the total number of COVID-19 deaths was reported to be 2,415; thus, our sample covers 70% (1,691/2,415) of the COVID-19 mortality population. It also covers all death peaks and main pandemic waves during this period.</p>
<p>Initially, the data were extracted as a PDF file and then converted to an Excel spreadsheet. Each record includes the patient&#x00027;s demographics (age, gender, and residency), date of ICU admission, date of death, reasons for admission, admission diagnosis, final diagnosis, cause of death, brief history, brief summary, and contributing factors. To ensure confidentiality, all data were anonymized and all patient identifiers were removed. Additional data cleansing was also performed to ensure data accuracy.</p>
</sec>
<sec>
<title>Creating corpus of terminologies</title>
<p>The data sheet obtained from the EHR has eleven main columns, including date of admission to ICU, date of death, age at death, admission diagnosis, reason for admission, final diagnosis, cause of death (COD), brief history, brief summary, and contributing factors. Except for the first three, all remaining columns are text features.</p>
<p>The underlying cause of death, such as &#x0201C;COVID-19&#x0201D; or &#x0201C;COVID 19 pneumonia,&#x0201D; was listed in the COD column in many records, whereas the primary/intermediate causes were found explicitly or indirectly in the brief summary or brief history columns. Furthermore, there were two major flaws in the free-text notes in the mortality EHR. The first is that many terms are misspelled or written in improper forms. For example, multiorgan failure is referred to as &#x0201C;Multi-Organs&#x0201D; or &#x0201C;Multiorgan Failure.&#x0201D; The second is inconsistency in the reporting of text notes. The causes of death are not always listed in the expected data columns. Likewise, comorbidities are not consistently included in the list of contributing factors. As a result, we were unable to use the existing NLP tools or the Unified Medical Language System (UMLS). We had to develop our own system to extract concepts, knowledge, and relationships from the mortality EHR at hand.</p>
<sec>
<title>CODs and comorbidity glossary tables</title>
<p>Our strategy is to extract the causes of death (COD) and comorbidities/diseases by using NLP techniques such as a bag-of-words (BoW) model. The BoW model is applied to each column to extract all terms and phrases that represent the CODs and comorbidities for each patient. The model achieves this by tokenizing all text columns in the data sheet and creating a case/term occurrence matrix in which each row represents a patient&#x00027;s case and each column represents a medical term relating to a cause of death or comorbidity; all other word tokens are omitted. Each cell of the matrix contains a 1 or a 0, representing the presence or absence of the term in the case. The terms related to cause of death are categorized into three stages, in the same fashion as death certificates. These stages are the primary cause, the intermediate cause, and the underlying cause (which led to the intermediate cause).</p>
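<p>The case/term occurrence matrix described above can be illustrated with a minimal Python sketch. It is not the authors&#x00027; code: the glossary is a hypothetical three-entry subset of the published tables, and the helper names (normalize, occurrence_matrix) and the two sample notes are assumptions made only for illustration.</p>

```python
import re

# Hypothetical glossary: COD abbreviation -> alternative terms, written in
# normalized form (lowercase, no punctuation). Only a tiny illustrative subset.
GLOSSARY = {
    "MOF": ["multiorgan failure", "multi organ failure", "mods"],
    "SS": ["septic shock"],
    "ARDS": ["ards", "acute respiratory distress syndrome"],
}


def normalize(text):
    """Lowercase the text and turn every run of non-alphanumerics into one space."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())


def occurrence_matrix(cases, glossary):
    """Return one row (dict) per case, holding a 1/0 flag per glossary entry."""
    matrix = []
    for note in cases:
        # Pad with spaces so that only whole words or phrases can match.
        padded = f" {normalize(note)} "
        row = {
            code: int(any(f" {term} " in padded for term in terms))
            for code, terms in glossary.items()
        }
        matrix.append(row)
    return matrix


# Invented example notes, not taken from the study data.
cases = [
    "Septic shock with Multi-Organ failure.",
    "COVID-19 pneumonia, ARDS",
]
matrix = occurrence_matrix(cases, GLOSSARY)
for row in matrix:
    print(row)
```

<p>Each printed row corresponds to one case, and every key is a column of the occurrence matrix. Tokens that do not belong to any glossary entry simply never produce a column, which mirrors the omission of all other word tokens.</p>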
<p>The list of primary causes of death, according to WHO guidelines, denotes the condition (injury, complication, or disease) that directly preceded death. WHO issued an updated International Classification of Diseases (ICD) and health-related problems to accommodate COVID-19-related death complications (<xref ref-type="bibr" rid="B11">11</xref>). The condition(s) that led to the primary COD are reflected in the intermediate COD. Multiple complications contributing to the intermediate COD were identified in the majority of COVID-19 decedents in the ICU in this study. Additionally, COVID-19 pneumonia was the most frequently encountered underlying cause in those ICU cases, resulting in an intermediate stage of complication.</p>
<p>The COD and comorbidity terms needed to build the BoW were extracted from the EHR in several steps. We started with a preliminary text analysis using the text mining package (<italic>tm</italic>) and the word cloud generator package (<italic>wordcloud</italic>) in R to extract the most common terms (<xref ref-type="fig" rid="F1">Figure 1</xref>). To create glossary tables, our medical experts validated the extracted terms by reviewing 50&#x02013;100 EHRs at random. The process was repeated four times to ensure that the majority of the terminologies were covered. This helped identify alternative terminologies and misspelled terms. <xref ref-type="table" rid="T1">Tables 1</xref>, <xref ref-type="table" rid="T2">2</xref> show the refined list of primary COD and intermediate COD. In accordance with the International Classification of Diseases (<xref ref-type="bibr" rid="B11">11</xref>), <xref ref-type="table" rid="T3">Table 3</xref> provides twelve general disease categories (GDC), 34 distinct comorbidities, and a risk factor associated with our data. Detailed versions of <xref ref-type="table" rid="T1">Tables 1</xref>&#x02013;<xref ref-type="table" rid="T3">3</xref>, including all potential alternate terms and/or incorrect forms, may be requested from the corresponding author.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Word cloud plot of death causes and contributing factors.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpubh-10-1070870-g0001.tif"/>
</fig>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Primary causes of death.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th valign="top" align="left"><bold>Primary COD (Abbrev.)</bold></th>
<th valign="top" align="left"><bold>Alternative terms</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Cardiopulmonary arrest (CPA)</td>
<td valign="top" align="left">Cardiopulmonary collapse, cardiorespiratory arrest, cardiorespiratory failure, cardiorespiratory collapse, circulatory collapse, asystole</td>
</tr>
<tr>
<td valign="top" align="left">Cardiac arrest (CA)</td>
<td valign="top" align="left">Cardiogenic shock, cardiovascular collapse, cardiac event, bradycardic arrest, STEMI</td>
</tr>
<tr>
<td valign="top" align="left">Respiratory failure (HRF)</td>
<td valign="top" align="left">Pulmonary failure, pulmonary arrest, pulmonary dysfunction, hypoxia, hypoxic, hypoxemia, hypoxemic, desaturate</td>
</tr>
<tr>
<td valign="top" align="left">Multiorgan failure (MOF)</td>
<td valign="top" align="left">MODS, multiple organ dysfunction syndrome, multi organ failure, multiple organ failure, multisystem failure</td>
</tr>
<tr>
<td valign="top" align="left">Hepatic failure (LF)</td>
<td valign="top" align="left">Liver failure, worsening liver function, hepatic failure</td>
</tr>
<tr>
<td valign="top" align="left">Renal failure (RF)</td>
<td valign="top" align="left">kidney failure, dialysis, CRRT</td>
</tr>
<tr>
<td valign="top" align="left">Septic shock (SS)</td>
<td/>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Intermediate causes of death.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th valign="top" align="left"><bold>Intermediate COD (Abbrev.)</bold></th>
<th valign="top" align="left"><bold>Alternative terms</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Acute respiratory distress syndrome (ARDS)</td>
<td valign="top" align="left">Mechanical ventilation, acute respiratory failure, hypoxic respiratory failure, HRF</td>
</tr>
<tr>
<td valign="top" align="left">Acute kidney failure (AKI)</td>
<td valign="top" align="left">Acute kidney injury, renal impairment, anuric, hyperkalemia, dialysis</td>
</tr>
<tr>
<td valign="top" align="left">Pulmonary embolism (PE)</td>
<td valign="top" align="left">DVT collapse, thrombosis</td>
</tr>
<tr>
<td valign="top" align="left">Heart failure (HF)</td>
<td valign="top" align="left">Rescue PCI, cardiomyopathy, myocarditis</td>
</tr>
<tr>
<td valign="top" align="left">Stroke (ST)</td>
<td valign="top" align="left">CVA, cerebrovascular accident, failed thrombolysis, hemorrhagic cerebral, subdural, subarachnoid hemorrhage, hge</td>
</tr>
<tr>
<td valign="top" align="left">Pneumothorax (PN)</td>
<td valign="top" align="left">Tension pneumothorax, hemothorax, hemopneumothorax, hydropneumothorax, pneumoperitoneum, bilateral chest tubes, chest tube</td>
</tr>
<tr>
<td valign="top" align="left">Myocardial infarction (MI)</td>
<td valign="top" align="left">STEMI, PCI, CCU, ischemic changes, cardiac strain, st elevation, troponin elevated, NStemi</td>
</tr>
<tr>
<td valign="top" align="left">Arrhythmia (AR)</td>
<td valign="top" align="left">Ventricular fibrillation, VFib, ventricular tachycardia, vtach, rhythm, atrial fibrillation, AF, PAF</td>
</tr>
<tr>
<td valign="top" align="left">Bleeding (BL)</td>
<td valign="top" align="left">ICH, hematoma, AVM, intracerebral hemorrhage, epistaxis, PRBC, transfusion, melena, upper GI bleeds</td>
</tr>
<tr>
<td valign="top" align="left">Disseminated intravascular coagulation (DI)</td>
<td valign="top" align="left">DIC</td>
</tr>
<tr>
<td valign="top" align="left">Urinary tract infection (UT)</td>
<td valign="top" align="left">UTI, urinary tract infection, urosepsis, E.col</td>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>General disease categories (GDC), comorbidities and other risk factors.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th valign="top" align="left"><bold>GDC (Abbrev.)</bold></th>
<th valign="top" align="left"><bold>Comorbidity/risk factor (Abbrev.)</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Endocrine, nutritional, and metabolic diseases (ENMs)</td>
<td valign="top" align="left">Diabetes mellitus (DM), thyroid disease (THY), dyslipidemia (DLP), obesity (OB), Addison disease (ADs)</td>
</tr>
<tr>
<td valign="top" align="left">Diseases of the nervous system (DNS)</td>
<td valign="top" align="left">Stroke (CVA), Parkinson&#x00027;s disease (PD), dementia (DEM), multiple sclerosis (MS), epilepsy (EP), psychiatric disorders (OCD)</td>
</tr>
<tr>
<td valign="top" align="left">Diseases of the circulatory system (DCS)</td>
<td valign="top" align="left">Hypertension (HTN), anemia (IDA), pulmonary embolism (PE), peripheral vascular disease (PVD), bleeding disorders (BDs)</td>
</tr>
<tr>
<td valign="top" align="left">Cardiovascular system diseases (CVD)</td>
<td valign="top" align="left">Coronary artery disease (CAD), cardiomyopathy (HCM), valvular heart disease (AVR), heart failure (HF), arrhythmia (AF)</td>
</tr>
<tr>
<td valign="top" align="left">Respiratory diseases (RDs)</td>
<td valign="top" align="left">Asthma (BA), chronic obstructive pulmonary disease (COPD), lung disease (LD)</td>
</tr>
<tr>
<td valign="top" align="left">GI disorders (GIDs)</td>
<td valign="top" align="left">Inflammatory bowel disease (IBD), gastroesophageal reflux (GERD), liver disease (LD)</td>
</tr>
<tr>
<td valign="top" align="left">Diseases of the genitourinary (DGS)</td>
<td valign="top" align="left">Chronic kidney disease (CKD), benign prostatic hyperplasia (BPH)</td>
</tr>
<tr>
<td valign="top" align="left">Autoimmune disorders (ADs)</td>
<td valign="top" align="left">Rheumatoid arthritis (RA), Immunocompromised (IC) (<italic>a risk factor</italic>)</td>
</tr>
<tr>
<td valign="top" align="left">Ortho disorders (ODs)</td>
<td valign="top" align="left">Bone disorders (OA)</td>
</tr>
<tr>
<td valign="top" align="left">Infectious diseases (IDs)</td>
<td valign="top" align="left">HIV-infection (HIV)</td>
</tr>
<tr>
<td valign="top" align="left">Neoplasms (CRC)</td>
<td valign="top" align="left">Cancer (CA) of any kind</td>
</tr>
<tr>
<td valign="top" align="left">Congenital disorders (CDs)</td>
<td valign="top" align="left">Down syndrome (DS)</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
</sec>
<sec>
<title>Developing and applying NLP methods</title>
<p>We created an NLP method to identify, extract, and automatically encode natural language from mortality EHRs into structured clinical data. The terms in <xref ref-type="table" rid="T1">Tables 1</xref>, <xref ref-type="table" rid="T2">2</xref> are used as keywords to extract primary and intermediate CODs, while <xref ref-type="table" rid="T3">Table 3</xref> provides the keywords used to extract comorbidities. The method was implemented in Python. <xref ref-type="fig" rid="F2">Figure 2</xref> shows our algorithm.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Primary and Intermediate CODs encoding flowchart.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpubh-10-1070870-g0002.tif"/>
</fig>
<p>In this method, the text is preprocessed by removing punctuation, special characters, and stop words, converting it to lowercase, and tokenizing it. The EHR variables used are cause of death, final diagnosis, brief history, and brief summary. To create a case/COD term occurrence matrix, a binary variable is created for each primary/intermediate COD listed in <xref ref-type="table" rid="T1">Tables 1</xref>, <xref ref-type="table" rid="T2">2</xref>. The occurrence matrix is initialized to zero. The tokens are then compared against each COD and its alternative terms, and the corresponding cell of the case/term occurrence matrix is set to 1 upon a match. This is repeated for every case (row). Care was taken that a COD abbreviation is not matched inside a longer word; for example, PE must not be detected in &#x0201C;hypertensive&#x0201D; or &#x0201C;hyperthyroid.&#x0201D; Negation was also carefully handled: if a term is preceded by a negative or conditional word, it is not matched. The exclusion words are: no, not, no sign of, non, no history of, no active, no previous medical, not known to have, no indications of, previous condition, old condition. The final primary and intermediate CODs are listed in text format. The pseudocode used to extract the final primary COD is depicted in <xref ref-type="fig" rid="F3">Figure 3</xref>.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Pseudocode for Primary COD.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpubh-10-1070870-g0003.tif"/>
</fig>
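<p>The two matching safeguards described in this section, whole-word matching of abbreviations and skipping negated mentions, can be sketched in Python as follows. This is an illustrative approximation rather than the pseudocode of Figure 3: the function name term_present and the rule that an exclusion phrase must immediately precede the term are assumptions, while the exclusion phrases themselves are the ones listed in the text.</p>

```python
import re

# Exclusion phrases as listed in the article text.
EXCLUSIONS = [
    "no", "not", "no sign of", "non", "no history of", "no active",
    "no previous medical", "not known to have", "no indications of",
    "previous condition", "old condition",
]

# Matches when the text seen so far ends with one of the exclusion phrases.
NEGATED_END = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in EXCLUSIONS) + r")$"
)


def term_present(text, term):
    """Return True if `term` occurs as a whole word and is not negated.

    Assumption: a mention counts as negated only when an exclusion phrase
    sits directly before it.
    """
    lowered = text.lower()
    pattern = re.compile(r"\b" + re.escape(term.lower()) + r"\b")
    for match in pattern.finditer(lowered):
        preceding = lowered[:match.start()].rstrip()
        if not NEGATED_END.search(preceding):
            return True
    return False


# Invented example notes, not taken from the study data.
print(term_present("Patient is hypertensive", "PE"))
print(term_present("Massive PE on day 3", "PE"))
print(term_present("No sign of PE", "PE"))
print(term_present("No PE, but septic shock", "septic shock"))
```

<p>The word-boundary anchors keep PE from matching inside &#x0201C;hypertensive,&#x0201D; and the check on the preceding text rejects &#x0201C;no sign of PE&#x0201D; while still accepting a later, non-negated term in the same note. A production system would likely need a wider negation window and clinical review of edge cases.</p>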
<p>Determining the actual intermediate CODs is handled differently. Multiple intermediate CODs are reported as a group. Our clinicians manually validated and separated the correct outcome to determine which disorders were terminal. A counter of the extracted causes is also computed; ranking the most common causes helps identify the terminal cause and cross-check the accuracy of the findings.</p>
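<p>A frequency counter of this kind can be sketched with the Python standard library. The sketch is hypothetical: the per-case lists of intermediate COD abbreviations are invented for illustration, and the study&#x00027;s actual counter may be organized differently.</p>

```python
from collections import Counter

# Invented example: intermediate COD abbreviations extracted for three cases.
extracted = [
    ["ARDS", "AKI"],
    ["ARDS"],
    ["PE", "ARDS"],
]

# Count how often each intermediate cause was extracted across all cases.
counts = Counter(code for case in extracted for code in case)

# Rank causes from most to least frequent to support the clinicians' review.
ranked = [code for code, _ in counts.most_common()]
print(counts)
print(ranked)
```

<p>The ranking only supports the manual validation step; it does not replace the clinicians&#x00027; judgment about which disorder was terminal.</p>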
<p>The comorbidities for each case are identified using <xref ref-type="table" rid="T3">Table 3</xref> in the same manner that CODs are identified. Preprocessed word tokens are extracted from the EHR reason for admission, contributing factors, admission diagnosis and brief summary.</p>
</sec>
<sec>
<title>Data manipulation and analysis</title>
<p>The original EHR mortality data had two sets of variables. The first set included seven categorical and quantitative variables. The second set included eight free-text variables. The PDF data sheet was converted to an Excel sheet for data manipulation and cleaning. The second set was used to generate 70 variables in Python to determine death causes and comorbidities. During exploratory data analysis, we generated appropriate graphs (bar charts, pie charts, boxplots) and summary statistics (mean, median, SD, IQR). Hypothesis tests included Chi-square, TURF, ANOVA, and Kruskal-Wallis. Finally, we built our prediction model with a decision tree. SPSS V23 and R were used for the statistical analysis.</p>
</sec>
</sec>
<sec sec-type="results" id="s3">
<title>Results</title>
<sec>
<title>Overall findings</title>
<p>The majority of the 1,691 anonymized COVID-19 decedents were male (963, 56.9%). Age at death ranged from 19.8 to 103.2 years, with a mean of 63.8 years (SD 14.4). On average, the ICU stay prior to death lasted 18.5 days (SD 12.8). Two or more comorbidities were typically present (mean 2.5, SD 1.9), with hypertension and diabetes mellitus shared among more than half of the patients (<xref ref-type="table" rid="T4">Table 4</xref>). Since these patients died in the intensive care unit, COVID-19 pneumonia was the main underlying cause of death, leading to the intermediate and then the primary causes of death. COVID-19 pneumonia was detected in 94 percent of cases (1,592/1,691).</p>
<table-wrap position="float" id="T4">
<label>Table 4</label>
<caption><p>Demographic and clinical characteristics.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th valign="top" align="left"><bold>Variable</bold></th>
<th valign="top" align="left"><bold>Summary</bold></th>
<th valign="top" align="left"><bold>Count (%)</bold></th>
<th valign="top" align="left"><bold>Graph</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Age</td>
<td valign="top" align="left">Mean (sd): 63.8 (14.4)<break/> min &#x02264; med &#x02264; max: 19.8 &#x02264; 64.5 &#x02264; 103.2<break/> IQR (CV): 20.7 (0.2)</td>
<td valign="top" align="left">662 distinct values</td>
<td valign="top" align="left"><inline-graphic xlink:href="fpubh-10-1070870-i0001.tif"/></td>
</tr>
<tr>
<td valign="top" align="left">Age group</td>
<td valign="top" align="left">1. (&#x0003C; 50) years<break/> 2. (50&#x02013;64) years<break/> 3. (65&#x0002B;) years</td>
<td valign="top" align="left">303 (17.9%)<break/> 573 (33.9%)<break/> 815 (48.2%)</td>
<td valign="top" align="left"><inline-graphic xlink:href="fpubh-10-1070870-i0002.tif"/></td>
</tr>
<tr>
<td valign="top" align="left">Gender</td>
<td valign="top" align="left">1. Female<break/> 2. Male</td>
<td valign="top" align="left">728 (43.1%)<break/> 963 (56.9%)</td>
<td valign="top" align="left"><inline-graphic xlink:href="fpubh-10-1070870-i0003.tif"/></td>
</tr>
<tr>
<td valign="top" align="left">ICU (days)</td>
<td valign="top" align="left">Mean (sd): 18.5 (12.8)<break/> min &#x02264; med &#x02264; max: 0 &#x02264; 16 &#x02264; 86<break/> IQR (CV): 14 (0.7)</td>
<td valign="top" align="left">74 distinct values</td>
<td valign="top" align="left"><inline-graphic xlink:href="fpubh-10-1070870-i0004.tif"/></td>
</tr>
<tr>
<td valign="top" align="left">Total comorbidities</td>
<td valign="top" align="left">Mean (sd): 2.7 (1.9)<break/> min &#x02264; med &#x02264; max: 0 &#x02264; 3 &#x02264; 11<break/> IQR (CV): 3 (0.7)</td>
<td valign="top" align="left">12 distinct values</td>
<td valign="top" align="left"><inline-graphic xlink:href="fpubh-10-1070870-i0005.tif"/></td>
</tr>
<tr>
<td valign="top" align="left">Total comorbidities group</td>
<td valign="top" align="left">Mean (sd): 2.6 (1.7)<break/> min &#x02264; med &#x02264; max: 0 &#x02264; 3 &#x02264; 6<break/> IQR (CV): 3 (0.6)</td>
<td valign="top" align="left">0 : 172 (10.8%)<break/> 1 : 288 (18.1%)<break/> 2 : 333 (20.9%)<break/> 3 : 304 (19.1%)<break/> 4 : 245 (15.4%)<break/> 5 : 143 (9.0%)<break/> 6 : 110 (6.9%)</td>
<td valign="top" align="left"><inline-graphic xlink:href="fpubh-10-1070870-i0006.tif"/></td>
</tr>
</tbody>
</table>
</table-wrap>
<p>When the mean ICU stay was compared across the three age groups (&#x0003C;50, 50&#x02013;64, and 65 or more), no significant difference was found (<xref ref-type="table" rid="T5">Table 5</xref>) using the ANOVA F-test (<italic>p</italic>-value = 0.903). In contrast, mean total comorbidities differed significantly across these three age groups (<italic>p</italic>-value &#x0003C;0.0001), and the Tukey B multiple comparison test placed the groups in three distinct homogeneous subsets, with mean total comorbidities of 1.25, 2.19, and 3.28, respectively.</p>
<table-wrap position="float" id="T5">
<label>Table 5</label>
<caption><p>Demographic and clinical characteristics by age group.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th/>
<th/>
<th/>
<th valign="top" align="center" colspan="3" style="border-bottom: thin solid #000000;"><bold>Age group (yrs.)</bold></th>
<th/>
</tr>
<tr>
<th valign="top" align="left"><bold>Variable</bold></th>
<th valign="top" align="center"><bold>N</bold></th>
<th valign="top" align="center"><bold>Overall</bold></th>
<th valign="top" align="center"><bold>Age &#x0003C; 50</bold></th>
<th valign="top" align="center"><bold>Age [50-64]</bold></th>
<th valign="top" align="center"><bold>Age 65&#x0002B;</bold></th>
<th valign="top" align="center"><italic><bold>p</bold></italic><bold>-value<italic><xref ref-type="table-fn" rid="TN2"><sup>b</sup></xref></italic></bold></th>
</tr>
<tr>
<th/>
<th/>
<th valign="top" align="center"><italic><bold>N</bold></italic> <bold>= 1,691<italic><xref ref-type="table-fn" rid="TN1"><sup>a</sup></xref></italic></bold></th>
<th valign="top" align="center"><italic><bold>N</bold></italic> <bold>= 303<italic><xref ref-type="table-fn" rid="TN1"><sup>a</sup></xref></italic></bold></th>
<th valign="top" align="center"><italic><bold>N</bold></italic> <bold>= 573<italic><xref ref-type="table-fn" rid="TN1"><sup>a</sup></xref></italic></bold></th>
<th valign="top" align="center"><italic><bold>N</bold></italic> <bold>= 815<italic><xref ref-type="table-fn" rid="TN1"><sup>a</sup></xref></italic></bold></th>
<th/>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left"><bold>Age</bold></td>
<td valign="top" align="center">1,691</td>
<td valign="top" align="center">64 (54, 74)</td>
<td valign="top" align="center">43 (39, 47)</td>
<td valign="top" align="center">58 (54, 62)</td>
<td valign="top" align="center">75 (70, 81)</td>
<td valign="top" align="center">&#x0003C;0.001</td>
</tr>
<tr>
<td valign="top" align="left"><bold>Gender</bold></td>
<td valign="top" align="center">1,691</td>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="center">0.003</td>
</tr>
<tr>
<td valign="top" align="left">Female</td>
<td/>
<td valign="top" align="center">728 (43%)</td>
<td valign="top" align="center">114 (38%)</td>
<td valign="top" align="center">229 (40%)</td>
<td valign="top" align="center">385 (47%)</td>
<td/>
</tr>
<tr>
<td valign="top" align="left">Male</td>
<td/>
<td valign="top" align="center">963 (57%)</td>
<td valign="top" align="center">189 (62%)</td>
<td valign="top" align="center">344 (60%)</td>
<td valign="top" align="center">430 (53%)</td>
<td/>
</tr>
<tr>
<td valign="top" align="left"><bold>ICU (days)</bold></td>
<td valign="top" align="center">1,691</td>
<td valign="top" align="center">16 (10, 24)</td>
<td valign="top" align="center">14 (9, 23)</td>
<td valign="top" align="center">16 (10, 24)</td>
<td valign="top" align="center">16 (10, 24)</td>
<td valign="top" align="center">0.19</td>
</tr>
<tr>
<td valign="top" align="left"><bold>Total comorbidities</bold></td>
<td valign="top" align="center">1,596</td>
<td valign="top" align="center">3 (1, 4)</td>
<td valign="top" align="center">1 (0, 2)</td>
<td valign="top" align="center">2 (1, 3)</td>
<td valign="top" align="center">3 (2, 5)</td>
<td valign="top" align="center">&#x0003C;0.001</td>
</tr>
<tr>
<td valign="top" align="left">Unknown</td>
<td/>
<td valign="top" align="center">95</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">33</td>
<td valign="top" align="center">62</td>
<td/>
</tr>
<tr>
<td valign="top" align="left"><bold>Total group comorbidities</bold></td>
<td valign="top" align="center">1,595</td>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="center">&#x0003C;0.001</td>
</tr>
<tr>
<td valign="top" align="left">0</td>
<td/>
<td valign="top" align="center">172 (11%)</td>
<td valign="top" align="center">111 (37%)</td>
<td valign="top" align="center">61 (11%)</td>
<td valign="top" align="center">0 (0%)</td>
<td/>
</tr>
<tr>
<td valign="top" align="left">1</td>
<td/>
<td valign="top" align="center">288 (18%)</td>
<td valign="top" align="center">85 (28%)</td>
<td valign="top" align="center">122 (23%)</td>
<td valign="top" align="center">81 (11%)</td>
<td/>
</tr>
<tr>
<td valign="top" align="left">2</td>
<td/>
<td valign="top" align="center">333 (21%)</td>
<td valign="top" align="center">59 (19%)</td>
<td valign="top" align="center">136 (25%)</td>
<td valign="top" align="center">138 (18%)</td>
<td/>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td/>
<td valign="top" align="center">304 (19%)</td>
<td valign="top" align="center">30 (9.9%)</td>
<td valign="top" align="center">99 (18%)</td>
<td valign="top" align="center">175 (23%)</td>
<td/>
</tr>
<tr>
<td valign="top" align="left">4</td>
<td/>
<td valign="top" align="center">245 (15%)</td>
<td valign="top" align="center">12 (4.0%)</td>
<td valign="top" align="center">72 (13%)</td>
<td valign="top" align="center">161 (21%)</td>
<td/>
</tr>
<tr>
<td valign="top" align="left">5</td>
<td/>
<td valign="top" align="center">143 (9.0%)</td>
<td valign="top" align="center">2 (0.7%)</td>
<td valign="top" align="center">32 (5.9%)</td>
<td valign="top" align="center">109 (14%)</td>
<td/>
</tr>
<tr>
<td valign="top" align="left">6</td>
<td/>
<td valign="top" align="center">110 (6.9%)</td>
<td valign="top" align="center">4 (1.3%)</td>
<td valign="top" align="center">18 (3.3%)</td>
<td valign="top" align="center">88 (12%)</td>
<td/>
</tr>
<tr>
<td valign="top" align="left">Unknown</td>
<td/>
<td valign="top" align="center">96</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">33</td>
<td valign="top" align="center">63</td>
<td/>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn id="TN1"><label>a</label><p>Median (IQR) or Frequency (%).</p></fn>
<fn id="TN2"><label>b</label><p>Kruskal-Wallis rank sum test; Pearson&#x00027;s Chi-squared test.</p></fn>
</table-wrap-foot>
</table-wrap>
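<p>As an illustrative check, the Pearson chi-squared test for gender by age group can be recomputed from the counts in Table 5. The following is a minimal pure-Python sketch and not the software used in this study; the function name pearson_chi2 is hypothetical, and the closed-form p-value exp(-x/2) applies only because a 2 by 3 table has two degrees of freedom.</p>
<preformat>
```python
import math

def pearson_chi2(table):
    """Pearson chi-squared statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat

# Gender by age group counts from Table 5 (columns: under 50, 50-64, 65+)
gender_by_age = [
    [114, 229, 385],  # Female
    [189, 344, 430],  # Male
]

stat = pearson_chi2(gender_by_age)
# With (2 - 1) * (3 - 1) = 2 degrees of freedom, the upper-tail
# probability of the chi-squared distribution is exactly exp(-x / 2).
p_value = math.exp(-stat / 2)
print(round(stat, 1), round(p_value, 3))
```
</preformat>
<p>With these counts, the statistic is about 11.7 and the p-value about 0.003, in line with the value reported for gender in Table 5.</p>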
</sec>
<sec>
<title>Clinical characteristics and common causes of death among COVID-19 patients</title>
<p>We identified primary and secondary causes of death. Septic shock was the primary COD in 667 patients (39.4%), followed by cardiopulmonary arrest in 304 (18.0%), respiratory failure in 235 (13.9%), and cardiac arrest in 180 (10.6%). Septic shock combined with MOF, MOF alone, and renal failure accounted for 135 (8.0%), 125 (7.4%), and 44 (2.6%) cases, respectively. Hepatic failure occurred in only one case and was therefore excluded from further analysis. ARDS, on the other hand, was one of the main reasons for ICU admission and was reported in all deaths. In numerous cases, a combination of intermediate death complications was reported. Our physicians examined these cases thoroughly to determine which terminal complication was most likely to be the intermediate COD. Around 75% of these decedents had ARDS as the intermediate COD, while the remaining 25% had an intermediate COD other than ARDS. These other causes are AKI, AR, BL, DI, HF, MI, PE, PN, ST, and UT. The frequency distribution of the combined intermediate complications and that of the terminal complication leading to the intermediate COD are shown in <xref ref-type="fig" rid="F4">Figure 4</xref>. <xref ref-type="table" rid="T6">Table 6</xref> shows the counts of intermediate causes by primary cause, together with the column percentages within each primary cause. While ARDS is the most prevalent intermediate COD regardless of primary cause, AR and MI were significantly linked with cardiac arrest and MOF (7.2&#x02013;14.4% of cases).</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Combined intermediate complications and terminal Intermediate COD.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpubh-10-1070870-g0004.tif"/>
</fig>
<table-wrap position="float" id="T6">
<label>Table 6</label>
<caption><p>Primary by intermediate causes of death.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th/>
<th valign="top" align="center" colspan="8" style="border-bottom: thin solid #000000;"><bold>Primary COD count (%)</bold></th>
</tr>
<tr>
<th valign="top" align="left"><bold>Intermediate COD</bold></th>
<th valign="top" align="center"><bold>SS</bold></th>
<th valign="top" align="center"><bold>CPA</bold></th>
<th valign="top" align="center"><bold>HRF</bold></th>
<th valign="top" align="center"><bold>CA</bold></th>
<th valign="top" align="center"><bold>SS &#x0002B; MOF</bold></th>
<th valign="top" align="center"><bold>MOF</bold></th>
<th valign="top" align="center"><bold>RF</bold></th>
<th valign="top" align="center"><bold>Total</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left"><bold>AR</bold></td>
<td valign="top" align="center">53 (7.9)</td>
<td valign="top" align="center">26 (8.6)</td>
<td valign="top" align="center">11 (4.7)</td>
<td valign="top" align="center">26 (14.4)</td>
<td valign="top" align="center">14 (10.4)</td>
<td valign="top" align="center">12 (9.6)</td>
<td valign="top" align="center">3 (6.8)</td>
<td valign="top" align="center">145 (8.6)</td>
</tr>
<tr>
<td valign="top" align="left"><bold>ARDS</bold></td>
<td valign="top" align="center">523 (78.4)</td>
<td valign="top" align="center">220 (72.4)</td>
<td valign="top" align="center">194 (82.6)</td>
<td valign="top" align="center">117 (65)</td>
<td valign="top" align="center">94 (69.6)</td>
<td valign="top" align="center">8 (64.8)</td>
<td valign="top" align="center">35 (79.5)</td>
<td valign="top" align="center">1,265 (74.8)</td>
</tr>
<tr>
<td valign="top" align="left"><bold>BL</bold></td>
<td valign="top" align="center">12 (1.8)</td>
<td valign="top" align="center">3 (1)</td>
<td valign="top" align="center">5 (2.1)</td>
<td valign="top" align="center">3 (1.7)</td>
<td valign="top" align="center">1 (0.7)</td>
<td valign="top" align="center">5 (4)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">29 (1.7)</td>
</tr>
<tr>
<td valign="top" align="left"><bold>DI</bold></td>
<td valign="top" align="center">14 (2.1)</td>
<td valign="top" align="center">2 (0.7)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">1 (0.6)</td>
<td valign="top" align="center">11 (8.1)</td>
<td valign="top" align="center">4 (3.2)</td>
<td valign="top" align="center">1 (2.3)</td>
<td valign="top" align="center">33 (2)</td>
</tr>
<tr>
<td valign="top" align="left"><bold>HF</bold></td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">2 (0.7)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">1 (0.6)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">1 (0.8)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">4 (0.2)</td>
</tr>
<tr>
<td valign="top" align="left"><bold>MI</bold></td>
<td valign="top" align="center">23 (3.4)</td>
<td valign="top" align="center">26 (8.6)</td>
<td valign="top" align="center">4 (1.7)</td>
<td valign="top" align="center">21 (11.7)</td>
<td valign="top" align="center">5 (3.7)</td>
<td valign="top" align="center">9 (7.2)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">88 (5.2)</td>
</tr>
<tr>
<td valign="top" align="left"><bold>PE</bold></td>
<td valign="top" align="center">14 (2.1)</td>
<td valign="top" align="center">16 (5.3)</td>
<td valign="top" align="center">12 (5.1)</td>
<td valign="top" align="center">7 (3.9)</td>
<td valign="top" align="center">5 (3.7)</td>
<td valign="top" align="center">5 (4)</td>
<td valign="top" align="center">1 (2.3)</td>
<td valign="top" align="center">60 (3.5)</td>
</tr>
<tr>
<td valign="top" align="left"><bold>PN</bold></td>
<td valign="top" align="center">22 (3.3)</td>
<td valign="top" align="center">8 (2.6)</td>
<td valign="top" align="center">7 (3)</td>
<td valign="top" align="center">3 (1.7)</td>
<td valign="top" align="center">2 (1.5)</td>
<td valign="top" align="center">6 (4.8)</td>
<td valign="top" align="center">3 (6.8)</td>
<td valign="top" align="center">51 (3)</td>
</tr>
<tr>
<td valign="top" align="left"><bold>ST</bold></td>
<td valign="top" align="center">2 (0.3)</td>
<td valign="top" align="center">1 (0.3)</td>
<td valign="top" align="center">1 (0.4)</td>
<td valign="top" align="center">1 (0.6)</td>
<td valign="top" align="center">1 (0.7)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">6 (0.4)</td>
</tr>
<tr>
<td valign="top" align="left"><bold>UT</bold></td>
<td valign="top" align="center">2 (0.3)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">1 (0.4)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">1 (0.7)</td>
<td valign="top" align="center">1 (0.8)</td>
<td valign="top" align="center">0 (0)</td>
<td valign="top" align="center">5 (0.3)</td>
</tr>
<tr>
<td valign="top" align="left">Total</td>
<td valign="top" align="center">667 (100)</td>
<td valign="top" align="center">304 (100)</td>
<td valign="top" align="center">235 (100)</td>
<td valign="top" align="center">180 (100)</td>
<td valign="top" align="center">135 (100)</td>
<td valign="top" align="center">125 (100)</td>
<td valign="top" align="center">44 (100)</td>
<td valign="top" align="center">1,691 (100)</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Age distribution appears to be similar across primary CODs, with a median age at death of 64.5 years and an interquartile range (IQR) of 20.7 years. However, a few young patients, approximately 20 years of age, died of MOF or renal failure (<xref ref-type="fig" rid="F5">Figure 5</xref>). The median length of stay in the ICU prior to death was approximately 16 days overall but was significantly longer (&#x0007E;20 days) for those who died of septic shock or (septic shock &#x0002B; MOF). Patients who died of MOF had an average of three or more comorbidities. Those who died of renal failure or (septic shock &#x0002B; MOF) showed a similar pattern.</p>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>Primary and Intermediate CODs distribution by age and clinical characteristics.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpubh-10-1070870-g0005.tif"/>
</fig>
<p>Those who died of AR, DI, or MI had the highest average age (70 years) and the most comorbidities (three or more), as well as the shortest average stay in the ICU. Patients who died of HF were younger (average age 50 years) and had more than two comorbidities, with an ICU stay of &#x0003C;20 days. We also noticed that the sequences (MI &#x02192; septic shock) and (PE &#x02192; respiratory failure) were associated with four or more comorbidities on average.</p>
</sec>
<sec>
<title>Exploring the relationship between comorbidities and causes of death</title>
<p>The comorbidities of the decedents in this study were distributed as follows. Hypertension (57%) was the most common condition, followed by diabetes (52%), coronary artery disease (23%), chronic renal disease (14%), arrhythmia and dyslipidemia (12% each), cancer (11%), rheumatoid arthritis (10%), obesity (9%), thyroid disease (8%), stroke (7%), pulmonary embolism (5%), asthma (4%), valvular heart disease (4%), bleeding disorders (4%), and COPD and dementia (3% each). The remaining comorbidities, each with &#x0003C;3% reported incidence, include anemia, heart failure, prostate hyperplasia, liver disease, epilepsy, cardiomyopathy, peripheral vascular disease, lung disease, psychiatric disorders, osteoporosis, multiple sclerosis, Down syndrome, Parkinson&#x00027;s disease, inflammatory bowel disease, gastroesophageal reflux disease, Addison disease, and HIV infection.</p>
<p>Next, we present the results of the Total Unduplicated Reach and Frequency (TURF) method. TURF is a popular statistical technique in market research that ranks product combinations according to the number of customers who favor them (<xref ref-type="bibr" rid="B12">12</xref>). In this study, we applied the method in a clinical setting, treating comorbidities as products and patients as customers. The goal is to determine the most likely disease combinations that these patients share. The analysis traverses all possible combinations of comorbidities and records two statistics for each: reach and frequency. The reach is the percentage of individuals who exhibit at least one comorbidity in a given combination, and the frequency is the total number of times the comorbidities in that combination are exhibited. We tested the method for all comorbidities listed in <xref ref-type="table" rid="T3">Table 3</xref> and a range of reach values. <xref ref-type="table" rid="T7">Table 7</xref> summarizes the best combination for each number of diseases (Size). For instance, the optimal combination of four comorbidities (RA, cancer, DM, and HTN) has a reach of 73%; that is, 73% of the patients had at least one of these conditions (rheumatological disorders, cancer, DM, HTN). When diabetes and hypertension were excluded from the analysis because of their high prevalence, the combination with the highest reach was (obesity, CAD, cancer, RA), at 43.6%.</p>
<table-wrap position="float" id="T7">
<label>Table 7</label>
<caption><p>Best reach and frequency by group size.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th/>
<th valign="top" align="center"><bold>Size</bold></th>
<th valign="top" align="center"><bold>Reach</bold></th>
<th valign="top" align="center"><bold>Cases %</bold></th>
<th valign="top" align="center"><bold>Count</bold></th>
<th valign="top" align="center"><bold>Responses %</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">ADDED: HTN</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">962</td>
<td valign="top" align="center">56.9</td>
<td valign="top" align="center">962</td>
<td valign="top" align="center">27.7</td>
</tr>
<tr>
<td valign="top" align="left">ADDED: DM</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">1,139</td>
<td valign="top" align="center">67.4</td>
<td valign="top" align="center">1,834</td>
<td valign="top" align="center">52.9</td>
</tr>
<tr>
<td valign="top" align="left" colspan="6">KEPT: HTN</td>
</tr>
<tr>
<td valign="top" align="left">ADDED: Cancer</td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">1,195</td>
<td valign="top" align="center">70.7</td>
<td valign="top" align="center">2,026</td>
<td valign="top" align="center">58.4</td>
</tr>
<tr>
<td valign="top" align="left" colspan="6">KEPT: DM, HTN</td>
</tr>
<tr>
<td valign="top" align="left">ADDED: RA</td>
<td valign="top" align="center">4</td>
<td valign="top" align="center">1,234</td>
<td valign="top" align="center">73.0</td>
<td valign="top" align="center">2,196</td>
<td valign="top" align="center">63.3</td>
</tr>
<tr>
<td valign="top" align="left" colspan="6">KEPT: Cancer, DM, HTN</td>
</tr>
<tr>
<td valign="top" align="left">ADDED: AF</td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">1,266</td>
<td valign="top" align="center">74.9</td>
<td valign="top" align="center">2,402</td>
<td valign="top" align="center">69.3</td>
</tr>
<tr>
<td valign="top" align="left" colspan="6">KEPT: Cancer, DM, HTN, RA</td>
</tr>
<tr>
<td valign="top" align="left">ADDED: Obesity</td>
<td valign="top" align="center">6</td>
<td valign="top" align="center">1,296</td>
<td valign="top" align="center">76.6</td>
<td valign="top" align="center">2,550</td>
<td valign="top" align="center">73.5</td>
</tr>
<tr>
<td valign="top" align="left" colspan="6">KEPT: AF, Cancer, DM, HTN, RA</td>
</tr>
</tbody>
</table>
</table-wrap>
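<p>The reach and frequency statistics described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this study: the function name turf, the exhaustive search over combinations, the tie-breaking by frequency, and the toy patient records are all hypothetical.</p>
<preformat>
```python
from itertools import combinations

def turf(patients, items, size):
    """Return (best_combo, reach, frequency) among combos of the given size.

    patients: list of sets, each holding one patient's comorbidities.
    reach: number of patients with at least one item of the combo.
    frequency: total number of combo items present across all patients.
    Ties on reach are broken by frequency (an assumption of this sketch).
    """
    best = None
    for combo in combinations(sorted(items), size):
        chosen = set(combo)
        reach = sum(1 for p in patients if p.intersection(chosen))
        freq = sum(len(p.intersection(chosen)) for p in patients)
        key = (reach, freq)
        if best is None or key > best[0]:
            best = (key, combo)
    (reach, freq), combo = best
    return combo, reach, freq

# Toy records invented for illustration only (not study data)
toy_patients = [
    {"HTN", "DM"},
    {"HTN"},
    {"Cancer"},
    {"DM", "RA"},
    set(),
]
toy_items = {"HTN", "DM", "Cancer", "RA"}

combo, reach, freq = turf(toy_patients, toy_items, 2)
print(combo, reach, 100 * reach / len(toy_patients), freq)
```
</preformat>
<p>On the toy records, the best pair is (DM, HTN), which reaches three of the five patients (60%) with a frequency of four. An exhaustive search grows quickly with the number of comorbidities, so a stepwise search that keeps earlier choices and adds one item at a time, as the ADDED and KEPT rows of Table 7 suggest, is a common alternative.</p>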
<p>When we looked at the general disease classification frequencies, we found that over 60% of the patients had circulatory (DCS) and endocrine (ENMS) disorders, one-third had cardiovascular diseases (CVD), and the remaining categories (RDs, CRC, DNS, ADs, DGS) varied from 8 to 15%. Approximately 65% of patients who died of MI or AR had cardiovascular diseases, a higher proportion than among patients who died of ARDS/PE/Other (<xref ref-type="fig" rid="F6">Figure 6</xref>). Those who died from ARDS, on the other hand, usually had endocrine or circulatory system problems. Nervous system diseases were the least common among those who died of PE. With chi-square test findings of (175.5, <italic>p</italic>-value 0.001) and (12.2, <italic>p</italic>-value = 0.016), the circulatory and nervous systems had the most significant association with intermediate COD.</p>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p>Proportions of general disease categories by Intermediate COD.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpubh-10-1070870-g0006.tif"/>
</fig>
</sec>
<sec>
<title>Predicting death due to ARDS or other causes</title>
<p>The distribution of total comorbidities by age group for COVID-19 deaths due to ARDS or other causes is displayed in <xref ref-type="fig" rid="F7">Figure 7</xref>. Patients under the age of 50 have a similar comorbidity distribution in both groups, with an average of one disease. Patients aged 50&#x02013;64 had two comorbidities on average, with more variation among those who died from causes other than ARDS. In contrast, older patients (age &#x0003E;65) who died from causes other than ARDS had an average of four comorbidities, compared to three for those who died mainly from ARDS.</p>
<fig id="F7" position="float">
<label>Figure 7</label>
<caption><p>Total comorbidities by age group due to ARDS or other causes of death.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpubh-10-1070870-g0007.tif"/>
</fig>
<p>In this section, we used a decision tree (DT) to determine the most parsimonious predictors of intermediate COD among COVID-19 patients in intensive care units. Decision trees learn to divide data into smaller and smaller subsets to predict the target. Each test is represented by a node, and its possible outcomes are represented by edges. The splitting process is repeated until no further gains can be obtained or a preset stopping rule is reached. Three common decision tree techniques are classification and regression tree (CART), chi-squared automatic interaction detection (CHAID), and quick unbiased efficient statistical tree (QUEST). For mathematical explanations and performance comparisons of these DT approaches, see Lin et al. (<xref ref-type="bibr" rid="B13">13</xref>). <xref ref-type="fig" rid="F8">Figure 8</xref> illustrates the results of the QUEST model, which show that the presence of an arrhythmia (AF) was the best indicator of the intermediate cause (ARDS/Other). Patients with AF are more likely to have a cause other than ARDS (54.9%). Node 1 is considered a terminal node for predicting a cause of death other than ARDS since no child nodes were found below it. In patients without AF, on the other hand, CAD was the second-best predictor of (ARDS/Other). In patients without AF but with CAD, the terminal Node 3 predicted 66.9% ARDS vs. 33.1% other causes. PE is an additional predictor in the model for patients who have neither AF nor CAD. ARDS is the main intermediate COD in this group: over 83% of patients without PE and 58% of patients with PE died from ARDS. The risk and classification tables allow for a quick evaluation of the model&#x00027;s performance. The risk of misclassifying the cause of death is estimated to be 0.272 (or 27.2%), which is broadly in line with the classification table, which shows that 76% of causes of death are correctly classified.</p>
<fig id="F8" position="float">
<label>Figure 8</label>
<caption><p>Decision Tree prediction model of ARDS/Other causes of death.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpubh-10-1070870-g0008.tif"/>
</fig>
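<p>The terminal-node rules described above can be written out as a small classifier, together with the misclassification risk as the share of wrong predictions. This is a sketch under stated assumptions rather than the fitted QUEST model: the function names and the toy records are hypothetical, and each node simply predicts its majority class as reported in the text.</p>
<preformat>
```python
def predict_intermediate_cod(af, cad, pe):
    """Majority-class prediction at each terminal node described in the text."""
    if af:
        # Node 1: 54.9% of patients with AF had a cause other than ARDS
        return "Other"
    # Without AF, ARDS is the majority class in every remaining node:
    # 66.9% with CAD; over 83% without PE and 58% with PE when CAD is absent.
    return "ARDS"

def misclassification_risk(records):
    """Share of records whose predicted class differs from the actual class."""
    wrong = sum(
        1
        for af, cad, pe, actual in records
        if predict_intermediate_cod(af, cad, pe) != actual
    )
    return wrong / len(records)

# Toy records invented for illustration: (AF, CAD, PE, actual intermediate COD)
toy_records = [
    (True, False, False, "Other"),
    (False, True, False, "ARDS"),
    (False, False, True, "ARDS"),
    (False, False, False, "Other"),
]

risk = misclassification_risk(toy_records)
print(risk, 1 - risk)
```
</preformat>
<p>On the toy records, one of four predictions is wrong, so the risk is 0.25 and the share correctly classified is 0.75. Because every node without AF predicts ARDS, CAD and PE change the estimated class proportions but not the predicted class.</p>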
</sec>
</sec>
<sec sec-type="discussion" id="s4">
<title>Discussion</title>
<p>We used machine-learning-based NLP to extract clinical data and causes of death from EHRs for COVID-19 patients at Jaber Hospital in Kuwait. Consistency and completeness issues with the text data in these records made extraction difficult. During the pandemic, Jaber Hospital was restricted to COVID-19 admissions, with most critical cases transferred from other hospitals. Many patient records were incomplete because patients were transferred from district hospitals where their original medical records were kept. Machine learning and big data analytics have been used to investigate disease-related prognostic factors (<xref ref-type="bibr" rid="B14">14</xref>).</p>
<p>Several clinical characteristics have been linked to COVID-19 mortality, including age, gender, comorbidities, ICU stay, and disease severity. Higher proportions of patients aged 65 or older led to a significant age-mortality association (<xref ref-type="bibr" rid="B15">15</xref>, <xref ref-type="bibr" rid="B16">16</xref>). Males were more likely to die from COVID-19 (<xref ref-type="bibr" rid="B17">17</xref>, <xref ref-type="bibr" rid="B18">18</xref>). According to Ayed et al. (<xref ref-type="bibr" rid="B17">17</xref>), more than twice as many deceased patients had two or more comorbidities. The combination of old age and comorbidities was also a factor in death (<xref ref-type="bibr" rid="B19">19</xref>) and survival time (<xref ref-type="bibr" rid="B20">20</xref>). Zhou et al. (<xref ref-type="bibr" rid="B21">21</xref>) reported a median (IQR) time of 18.5 (15&#x02013;22) days from onset of symptoms to death. In our study, 815 (48%) of 1,691 deceased ICU COVID-19 patients were over 65, men were more prevalent than women (56.9 vs. 43.1%), patients with two or more comorbidities accounted for 52% of cases, and the mean (SD) survival time to death was 18.5 (12.8) days. Hypertension and diabetes were each present in more than half of all cases in this study, which confirms prior research (<xref ref-type="bibr" rid="B17">17</xref>, <xref ref-type="bibr" rid="B22">22</xref>&#x02013;<xref ref-type="bibr" rid="B24">24</xref>). In COVID-19 patients, cardiovascular disease and secondary infections increase disease severity and mortality (<xref ref-type="bibr" rid="B15">15</xref>, <xref ref-type="bibr" rid="B25">25</xref>, <xref ref-type="bibr" rid="B26">26</xref>). Circulatory and cardiovascular diseases were present in 61.6 and 32.5% of these patients, respectively; HIV infections were rare. COVID-19 patients had a higher incidence of kidney and heart disease, and myocardial damage reduced survival (<xref ref-type="bibr" rid="B16">16</xref>, <xref ref-type="bibr" rid="B27">27</xref>, <xref ref-type="bibr" rid="B28">28</xref>).</p>
<p>Previous research on comorbidities and death causes has linked dysfunction to mortality (<xref ref-type="bibr" rid="B17">17</xref>, <xref ref-type="bibr" rid="B29">29</xref>). In this study, decedents with MOF and renal failure averaged three or more comorbidities. Septic shock was the leading primary cause, accounting for 667 deaths (39.4%), followed by cardiopulmonary arrest (304 deaths, 18%), respiratory failure (235 deaths, 13.9%), and cardiac arrest (180 deaths, 10.6%). The most common intermediate COD, on the other hand, was ARDS (1,265 deaths, 74.8%). We also found 849 (50.2%) cases of sepsis. Other findings (<xref ref-type="bibr" rid="B21">21</xref>) revealed that sepsis was the leading cause of death (59%) among the 54 pandemic deaths, followed by respiratory failure (54%), ARDS (31%), heart failure (23%), and septic shock (20%).</p>
<p>Acute respiratory distress syndrome (ARDS) is a severe COVID-19 consequence. Patients with moderate-to-severe ARDS require invasive mechanical ventilation and intensive medical therapy (<xref ref-type="bibr" rid="B30">30</xref>, <xref ref-type="bibr" rid="B31">31</xref>). ARDS was one of the most common reasons for ICU admission, as it was recorded in 81.8% of ICU survivors and all fatalities (<xref ref-type="bibr" rid="B32">32</xref>). This is also demonstrated in our data, as all patients were admitted to the intensive care unit, and ARDS was a common morbid consequence. However, complications other than ARDS were deemed the predominant intermediate COD in 25% of the cases (<xref ref-type="fig" rid="F4">Figure 4</xref>). As a result, we employed decision trees to identify the most significant contributing factors to intermediate COD, namely ARDS or Other cause. &#x0201C;Other&#x0201D; denotes a complication associated with AKI, AR, BL, DI, HF, MI, PE, PN, ST, or UT. We found only three significant predictors, namely arrhythmia (AF), coronary artery disease (CAD), and pulmonary embolism (PE). Patients with AF were more likely to have an etiology other than ARDS. According to Elezkurtaj et al. (<xref ref-type="bibr" rid="B33">33</xref>), the majority of decedents died from COVID-19, with preexisting health conditions and comorbidities only contributing to the mechanism of death. We agree because, among the many variables examined in this study, only a few contributing factors were found to be significantly associated with intermediate COD.</p>
<sec>
<title>Strengths, limitations, and future work</title>
<p>The dynamic nature of the method, its usability, and its potential to maintain self-control all contribute to its strength. In addition, the sampled data span both significant pandemic waves and death peaks, accounting for 70% of the total reported COVID-19 fatality cases in Kuwait. The death rate decreased drastically thereafter. Therefore, our sample represents the population under consideration to a high degree of accuracy. Nevertheless, our study has several limitations. First, there is a chance of selection or referral bias, as the research was conducted at a single location, i.e., Jaber Hospital. Second, inadequate documentation in the patient records limited the information that could be extracted. The absence of a recorded condition (such as obesity or smoking) does not necessarily mean that the patient did not have it. Third, patients were typically transferred late in the course of their disease, and their medical records lacked vital medical history information. Such discrepancies in clinical data may result in information bias that contributes to a decrease in model precision.</p>
<p>Future studies could investigate the impact of vaccines on the time to death, provide survival time estimates by cause of death, and perform spatiotemporal analyses of transferred patients. Knowing the COVID-19 death rate and patient survival rate can help risk management experts devise strategies to prevent COVID-19 and its evolving variants or to slow their spread.</p>
</sec>
</sec>
<sec sec-type="conclusions" id="s5">
<title>Conclusion</title>
<p>In this study, we employed a self-developed natural language processing (NLP) tool to automate the extraction of causes of death and comorbidities from the EHRs of COVID-19 decedents, covering the period from the beginning of the pandemic through all major pandemic waves. We structured the extracted text data and used it for further analysis.</p>
<p>We analyzed the demographic, clinical, and cause-of-death data for 1,691 ICU patients. The most common primary causes of death, documented in 54.8% of cases, were infection-related and included septic shock and sepsis-related multi-organ failure. The second most common cause of death was respiratory failure or cardiopulmonary arrest, documented in 32.2% of cases. Cardiac arrest and renal failure accounted for 10.6% and 2.6% of all deaths, respectively. ARDS, on the other hand, was the most common intermediate cause of death. Decision tree analysis identified arrhythmia (AF) as the strongest predictor of the intermediate cause (ARDS/Other).</p>
<p>We recommend structuring the EHR with well-defined sections and providing menu-driven options for reporting causes of death and comorbidities to minimize misspellings or incorrect forms. Comprehensive assessment and user guidance are required for standards to be effectively integrated into EHR systems.</p>
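<p>Where free-text entry cannot be avoided, misspelled terms could be mapped to the same controlled vocabulary that a menu would offer. The sketch below shows one possible approach using approximate string matching from the Python standard library. It is not the extraction tool described in this article; the vocabulary, the function name, and the similarity cutoff are all hypothetical.</p>
<preformat>
```python
import difflib

# Hypothetical controlled vocabulary; a real system would supply its own term list.
VOCABULARY = [
    "septic shock",
    "respiratory failure",
    "cardiac arrest",
    "renal failure",
]


def normalize_term(raw, vocabulary=VOCABULARY, cutoff=0.8):
    """Map a free-text entry to the closest vocabulary term, or None if no term is close enough."""
    # Lowercase and collapse repeated whitespace before matching.
    cleaned = " ".join(raw.lower().split())
    matches = difflib.get_close_matches(cleaned, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(normalize_term("Septic  shok"))
print(normalize_term("xyz"))
```
</preformat>
<p>Entries that return no match would still need manual review, which is one reason menu-driven reporting is preferable to correcting free text after the fact.</p>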
</sec>
<sec sec-type="data-availability" id="s6">
<title>Data availability statement</title>
<p>The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.</p>
</sec>
<sec id="s7">
<title>Ethics statement</title>
<p>The studies involving human participants were reviewed and approved by the Ethical Review Committee (ERC) at the Kuwait Ministry of Health (No. 1529/2020). Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.</p>
</sec>
<sec id="s8">
<title>Author contributions</title>
<p>SB conceived the project and obtained ethical approval. SB, AB, and YA participated in the retrieval, processing, and cleaning of data. AB and YA developed the clinical concepts and played a key role in establishing the clinical criteria for data extraction and in validation. SB and AA developed the method and the programming. SB and SA carried out the statistical analysis and produced the visuals. SB, AA, and SA contributed to drafting the paper. All authors reviewed, commented on, and approved the submitted work.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s9">
<title>Publisher&#x00027;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
</body>
<back>
<ack>
<p>We would like to express our gratitude to the administration of Jaber Al-Ahmad Hospital for their cooperation and support. We would like to thank Eng. Naser Alibrahim for his support with Python coding.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="B1">
<label>1.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kong</surname> <given-names>HJ</given-names></name></person-group>. <article-title>Managing unstructured big data in healthcare system</article-title>. <source>Healthc Inform Res</source>. (<year>2019</year>) <volume>25</volume>:<fpage>1</fpage>. <pub-id pub-id-type="doi">10.4258/hir.2019.25.1.1</pub-id><pub-id pub-id-type="pmid">30788175</pub-id></citation></ref>
<ref id="B2">
<label>2.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>SV</given-names></name> <name><surname>Rogers</surname> <given-names>JR</given-names></name> <name><surname>Jin</surname> <given-names>Y</given-names></name> <name><surname>Bates</surname> <given-names>DW</given-names></name> <name><surname>Fischer</surname> <given-names>MA</given-names></name></person-group>. <article-title>Use of electronic healthcare records to identify complex patients with atrial fibrillation for targeted intervention</article-title>. <source>J Am Med Inform Assoc JAMIA.</source> (<year>2017</year>) <volume>24</volume>:<fpage>339</fpage>&#x02013;<lpage>44</lpage>. <pub-id pub-id-type="doi">10.1093/jamia/ocw082</pub-id><pub-id pub-id-type="pmid">27375290</pub-id></citation></ref>
<ref id="B3">
<label>3.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sheikhalishahi</surname> <given-names>S</given-names></name> <name><surname>Miotto</surname> <given-names>R</given-names></name> <name><surname>Dudley</surname> <given-names>JT</given-names></name> <name><surname>Lavelli</surname> <given-names>A</given-names></name> <name><surname>Rinaldi</surname> <given-names>F</given-names></name> <name><surname>Osmani</surname> <given-names>V</given-names></name></person-group>. <article-title>Natural language processing of clinical notes on chronic diseases: systematic review</article-title>. <source>JMIR Med Inform</source>. (<year>2019</year>) <volume>7</volume>:<fpage>e12239</fpage>. <pub-id pub-id-type="doi">10.2196/12239</pub-id><pub-id pub-id-type="pmid">31066697</pub-id></citation></ref>
<ref id="B4">
<label>4.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>L</given-names></name> <name><surname>Lu</surname> <given-names>Y</given-names></name> <name><surname>Vitale</surname> <given-names>CJ</given-names></name> <name><surname>Mar</surname> <given-names>PL</given-names></name> <name><surname>Chang</surname> <given-names>F</given-names></name> <name><surname>Dhopeshwarkar</surname> <given-names>N</given-names></name> <etal/></person-group>. <article-title>Representation of information about family relatives as structured data in electronic health records</article-title>. <source>Appl Clin Inform.</source> (<year>2014</year>) <volume>5</volume>:<fpage>349</fpage>&#x02013;<lpage>67</lpage>. <pub-id pub-id-type="doi">10.4338/ACI-2013-10-RA-0080</pub-id><pub-id pub-id-type="pmid">25024754</pub-id></citation></ref>
<ref id="B5">
<label>5.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>John Lin</surname> <given-names>CC</given-names></name> <name><surname>Yu</surname> <given-names>K</given-names></name> <name><surname>Hatcher</surname> <given-names>A</given-names></name> <name><surname>Huang</surname> <given-names>TW</given-names></name> <name><surname>Lee</surname> <given-names>HK</given-names></name> <name><surname>Carlson</surname> <given-names>J</given-names></name> <etal/></person-group>. <article-title>Identification of diverse astrocyte populations and their malignant analogs</article-title>. <source>Nat Neurosci.</source> (<year>2017</year>) <volume>20</volume>:<fpage>396</fpage>&#x02013;<lpage>405</lpage>. <pub-id pub-id-type="doi">10.1038/nn.4493</pub-id><pub-id pub-id-type="pmid">28659761</pub-id></citation></ref>
<ref id="B6">
<label>6.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>DeCapprio</surname> <given-names>D</given-names></name> <name><surname>Gartner</surname> <given-names>J</given-names></name> <name><surname>McCall</surname> <given-names>CJ</given-names></name> <name><surname>Burgess</surname> <given-names>T</given-names></name> <name><surname>Kothari</surname> <given-names>S</given-names></name> <name><surname>Sayed</surname> <given-names>S</given-names></name></person-group>. <article-title>Building a COVID-19 Vulnerability Index</article-title>. <source>MedRxiv</source>. (<year>2020</year>). <pub-id pub-id-type="doi">10.1101/2020.03.16.20036723</pub-id></citation>
</ref>
<ref id="B7">
<label>7.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zoabi</surname> <given-names>Y</given-names></name> <name><surname>Deri-Rozov</surname> <given-names>S</given-names></name> <name><surname>Shomron</surname> <given-names>N</given-names></name></person-group>. <article-title>Machine learning-based prediction of COVID-19 diagnosis based on symptoms</article-title>. <source>NPJ Digit Med</source>. (<year>2021</year>) <volume>4</volume>:<fpage>3</fpage>. <pub-id pub-id-type="doi">10.1038/s41746-020-00372-6</pub-id><pub-id pub-id-type="pmid">33398013</pub-id></citation></ref>
<ref id="B8">
<label>8.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Izquierdo</surname> <given-names>JL</given-names></name> <name><surname>Ancochea</surname> <given-names>J</given-names></name> <name><surname>Soriano</surname> <given-names>JB</given-names></name></person-group>. <article-title>Clinical characteristics and prognostic factors for intensive care unit admission of patients with COVID-19: retrospective study using machine learning and natural language processing</article-title>. <source>J Med Internet Res.</source> (<year>2020</year>) <volume>22</volume>:<fpage>e21801</fpage>. <pub-id pub-id-type="doi">10.2196/21801</pub-id><pub-id pub-id-type="pmid">33989164</pub-id></citation></ref>
<ref id="B9">
<label>9.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guan</surname> <given-names>X</given-names></name> <name><surname>Zhang</surname> <given-names>B</given-names></name> <name><surname>Fu</surname> <given-names>M</given-names></name> <name><surname>Li</surname> <given-names>M</given-names></name> <name><surname>Yuan</surname> <given-names>X</given-names></name> <name><surname>Zhu</surname> <given-names>Y</given-names></name> <etal/></person-group>. <article-title>Clinical and inflammatory features based machine learning model for fatal risk prediction of hospitalized COVID-19 patients: results from a retrospective cohort study</article-title>. <source>Ann Med.</source> (<year>2021</year>) <volume>53</volume>:<fpage>257</fpage>&#x02013;<lpage>66</lpage>. <pub-id pub-id-type="doi">10.1080/07853890.2020.1868564</pub-id><pub-id pub-id-type="pmid">33410720</pub-id></citation></ref>
<ref id="B10">
<label>10.</label>
<citation citation-type="web"><person-group person-group-type="author"><collab>Coronavirus</collab></person-group>. <source>Worldometer</source>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.worldometers.info/coronavirus/country/kuwait/">https://www.worldometers.info/coronavirus/country/kuwait/</ext-link> (accessed April 12, 2022).</citation>
</ref>
<ref id="B11">
<label>11.</label>
<citation citation-type="web"><person-group person-group-type="author"><collab>(ICD-10)</collab></person-group>. <source>International Classification of Diseases, Tenth Revision (ICD-10).</source> (<year>2021</year>). Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.cdc.gov/nchs/icd/icd10.htm">https://www.cdc.gov/nchs/icd/icd10.htm</ext-link> (accessed April 11, 2022).</citation>
</ref>
<ref id="B12">
<label>12.</label>
<citation citation-type="web"><person-group person-group-type="author"><collab>Data Scientist</collab></person-group>. <source>Reflections of a Data Scientist.</source> (<year>2018</year>). Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.reflectionsofadatascientist.com/2018/05/r-turf-analysis-spss.html">https://www.reflectionsofadatascientist.com/2018/05/r-turf-analysis-spss.html</ext-link> (accessed November 11, 2022).</citation>
</ref>
<ref id="B13">
<label>13.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lin</surname> <given-names>CL</given-names></name> <name><surname>Fan</surname> <given-names>CL</given-names></name></person-group>. <article-title>Evaluation of CART, CHAID, and QUEST algorithms: a case study of construction defects in Taiwan</article-title>. <source>J Asian Archit Build Eng</source>. (<year>2019</year>) <volume>18</volume>:<fpage>539</fpage>&#x02013;<lpage>53</lpage>. <pub-id pub-id-type="doi">10.1080/13467581.2019.1696203</pub-id></citation>
</ref>
<ref id="B14">
<label>14.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Darabi</surname> <given-names>H</given-names></name> <name><surname>Tsinis</surname> <given-names>D</given-names></name> <name><surname>Zecchini</surname> <given-names>K</given-names></name> <name><surname>Whitcomb</surname> <given-names>W</given-names></name> <name><surname>Liss</surname> <given-names>A</given-names></name></person-group>. <article-title>&#x0201C;Forecasting mortality risk for patients admitted to intensive care units using machine learning,&#x0201D;</article-title> In: <source>Procedia Computer Science, vol. 140</source>. <publisher-loc>Chicago, IL</publisher-loc>: <publisher-name>Elsevier</publisher-name> (<year>2018</year>). p. <fpage>306</fpage>&#x02013;<lpage>13</lpage>. <pub-id pub-id-type="doi">10.1016/j.procs.2018.10.313</pub-id></citation>
</ref>
<ref id="B15">
<label>15.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ruan</surname> <given-names>Q</given-names></name> <name><surname>Yang</surname> <given-names>K</given-names></name> <name><surname>Wang</surname> <given-names>W</given-names></name> <name><surname>Jiang</surname> <given-names>L</given-names></name> <name><surname>Song</surname> <given-names>J</given-names></name></person-group>. <article-title>Clinical predictors of mortality due to COVID-19 based on an analysis of data of 150 patients from Wuhan, China</article-title>. <source>Intensive Care Med.</source> (<year>2020</year>) <volume>46</volume>:<fpage>846</fpage>&#x02013;<lpage>8</lpage>. <pub-id pub-id-type="doi">10.1007/s00134-020-05991-x</pub-id><pub-id pub-id-type="pmid">32253449</pub-id></citation></ref>
<ref id="B16">
<label>16.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>W</given-names></name> <name><surname>Tang</surname> <given-names>J</given-names></name> <name><surname>Wei</surname> <given-names>F</given-names></name></person-group>. <article-title>Updated understanding of the outbreak of 2019 novel coronavirus (2019-nCoV) in Wuhan, China</article-title>. <source>J Med Virol.</source> (<year>2020</year>) <volume>92</volume>:<fpage>441</fpage>&#x02013;<lpage>7</lpage>. <pub-id pub-id-type="doi">10.1002/jmv.25689</pub-id><pub-id pub-id-type="pmid">31994742</pub-id></citation></ref>
<ref id="B17">
<label>17.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ayed</surname> <given-names>M</given-names></name> <name><surname>Borahmah</surname> <given-names>AA</given-names></name> <name><surname>Yazdani</surname> <given-names>A</given-names></name> <name><surname>Sultan</surname> <given-names>A</given-names></name> <name><surname>Mossad</surname> <given-names>A</given-names></name> <name><surname>Rawdhan</surname> <given-names>H</given-names></name></person-group>. <article-title>Assessment of clinical characteristics and mortality-associated factors in COVID-19 critical cases in Kuwait</article-title>. <source>Med Princ Pract.</source> (<year>2021</year>) <volume>30</volume>:<fpage>185</fpage>&#x02013;<lpage>92</lpage>. <pub-id pub-id-type="doi">10.1159/000513047</pub-id><pub-id pub-id-type="pmid">33197912</pub-id></citation></ref>
<ref id="B18">
<label>18.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Galbadage</surname> <given-names>T</given-names></name> <name><surname>Peterson</surname> <given-names>BM</given-names></name> <name><surname>Awada</surname> <given-names>J</given-names></name> <name><surname>Buck</surname> <given-names>AS</given-names></name> <name><surname>Ramirez</surname> <given-names>DA</given-names></name> <name><surname>Wilson</surname> <given-names>J</given-names></name> <etal/></person-group>. <article-title>Systematic review and meta-analysis of sex-specific COVID-19 clinical outcomes</article-title>. <source>Front Med.</source> (<year>2020</year>) <volume>7</volume>:<fpage>348</fpage>. <pub-id pub-id-type="doi">10.3389/fmed.2020.00348</pub-id><pub-id pub-id-type="pmid">32671082</pub-id></citation></ref>
<ref id="B19">
<label>19.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Moon</surname> <given-names>SS</given-names></name> <name><surname>Lee</surname> <given-names>K</given-names></name> <name><surname>Park</surname> <given-names>J</given-names></name> <name><surname>Yun</surname> <given-names>S</given-names></name> <name><surname>Lee</surname> <given-names>YS</given-names></name> <name><surname>Lee</surname> <given-names>DS</given-names></name></person-group>. <article-title>Clinical characteristics and mortality predictors of COVID-19 patients hospitalized at nationally-designated treatment hospitals</article-title>. <source>J Korean Med Sci.</source> (<year>2020</year>) <volume>35</volume>:<fpage>e328</fpage>. <pub-id pub-id-type="doi">10.3346/jkms.2020.35.e328</pub-id><pub-id pub-id-type="pmid">32924343</pub-id></citation></ref>
<ref id="B20">
<label>20.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sousa</surname> <given-names>GJB</given-names></name> <name><surname>Garces</surname> <given-names>TS</given-names></name> <name><surname>Cestari</surname> <given-names>VRF</given-names></name> <name><surname>Flor&#x000EA;ncio</surname> <given-names>RS</given-names></name> <name><surname>Moreira</surname> <given-names>TMM</given-names></name> <name><surname>Pereira</surname> <given-names>MLD</given-names></name></person-group>. <article-title>Mortality and survival of COVID-19</article-title>. <source>Epidemiol Infect.</source> (<year>2020</year>) <volume>148</volume>:<fpage>e123</fpage>. <pub-id pub-id-type="doi">10.1017/S0950268820001405</pub-id><pub-id pub-id-type="pmid">32580809</pub-id></citation></ref>
<ref id="B21">
<label>21.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>F</given-names></name> <name><surname>Yu</surname> <given-names>T</given-names></name> <name><surname>Du</surname> <given-names>R</given-names></name> <name><surname>Fan</surname> <given-names>G</given-names></name> <name><surname>Liu</surname> <given-names>Y</given-names></name> <name><surname>Liu</surname> <given-names>Z</given-names></name> <etal/></person-group>. <article-title>Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study</article-title>. <source>Lancet Lond Engl.</source> (<year>2020</year>) <volume>395</volume>:<fpage>1054</fpage>&#x02013;<lpage>62</lpage>. <pub-id pub-id-type="doi">10.1016/S0140-6736(20)30566-3</pub-id><pub-id pub-id-type="pmid">32171076</pub-id></citation></ref>
<ref id="B22">
<label>22.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Grasselli</surname> <given-names>G</given-names></name> <name><surname>Zangrillo</surname> <given-names>A</given-names></name> <name><surname>Zanella</surname> <given-names>A</given-names></name> <name><surname>Antonelli</surname> <given-names>M</given-names></name> <name><surname>Cabrini</surname> <given-names>L</given-names></name> <name><surname>Castelli</surname> <given-names>A</given-names></name> <etal/></person-group>. <article-title>Baseline characteristics and outcomes of 1591 patients infected with SARS-CoV-2 admitted to ICUs of the Lombardy Region, Italy</article-title>. <source>JAMA.</source> (<year>2020</year>) <volume>323</volume>:<fpage>1574</fpage>&#x02013;<lpage>81</lpage>. <pub-id pub-id-type="doi">10.1001/jama.2020.5394</pub-id><pub-id pub-id-type="pmid">32250385</pub-id></citation></ref>
<ref id="B23">
<label>23.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nada</surname> <given-names>KM</given-names></name> <name><surname>Hsu</surname> <given-names>ES</given-names></name> <name><surname>Seashore</surname> <given-names>J</given-names></name> <name><surname>Zaidan</surname> <given-names>M</given-names></name> <name><surname>Nishi</surname> <given-names>SP</given-names></name> <name><surname>Duarte</surname> <given-names>A</given-names></name> <etal/></person-group>. <article-title>Determining cause of death during Coronavirus Disease 2019 pandemic</article-title>. <source>Crit Care Explor</source>. (<year>2021</year>) <volume>3</volume>:<fpage>e0419</fpage>. <pub-id pub-id-type="doi">10.1097/CCE.0000000000000419</pub-id><pub-id pub-id-type="pmid">33912841</pub-id></citation></ref>
<ref id="B24">
<label>24.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yan</surname> <given-names>Y</given-names></name> <name><surname>Yang</surname> <given-names>Y</given-names></name> <name><surname>Wang</surname> <given-names>F</given-names></name> <name><surname>Ren</surname> <given-names>H</given-names></name> <name><surname>Zhang</surname> <given-names>S</given-names></name> <name><surname>Shi</surname> <given-names>X</given-names></name> <etal/></person-group>. <article-title>Clinical characteristics and outcomes of patients with severe COVID-19 with diabetes</article-title>. <source>BMJ Open Diabetes Res Care.</source> (<year>2020</year>) <volume>8</volume>:<fpage>e001343</fpage>. <pub-id pub-id-type="doi">10.1136/bmjdrc-2020-001343</pub-id><pub-id pub-id-type="pmid">32345579</pub-id></citation></ref>
<ref id="B25">
<label>25.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>B</given-names></name> <name><surname>Yang</surname> <given-names>J</given-names></name> <name><surname>Zhao</surname> <given-names>F</given-names></name> <name><surname>Zhi</surname> <given-names>L</given-names></name> <name><surname>Wang</surname> <given-names>X</given-names></name> <name><surname>Liu</surname> <given-names>L</given-names></name> <etal/></person-group>. <article-title>Prevalence and impact of cardiovascular metabolic diseases on COVID-19 in China</article-title>. <source>Clin Res Cardiol Off J Ger Card Soc.</source> (<year>2020</year>) <volume>109</volume>:<fpage>531</fpage>&#x02013;<lpage>8</lpage>. <pub-id pub-id-type="doi">10.1007/s00392-020-01626-9</pub-id><pub-id pub-id-type="pmid">32161990</pub-id></citation></ref>
<ref id="B26">
<label>26.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>J</given-names></name> <name><surname>Zheng</surname> <given-names>Y</given-names></name> <name><surname>Gou</surname> <given-names>X</given-names></name> <name><surname>Pu</surname> <given-names>K</given-names></name> <name><surname>Chen</surname> <given-names>Z</given-names></name> <name><surname>Guo</surname> <given-names>Q</given-names></name> <etal/></person-group>. <article-title>Prevalence of comorbidities and its effects in patients infected with SARS-CoV-2: a systematic review and meta-analysis</article-title>. <source>Int J Infect Dis IJID Off Publ Int Soc Infect Dis.</source> (<year>2020</year>) <volume>94</volume>:<fpage>91</fpage>&#x02013;<lpage>5</lpage>. <pub-id pub-id-type="doi">10.1016/j.ijid.2020.03.017</pub-id><pub-id pub-id-type="pmid">32173574</pub-id></citation></ref>
<ref id="B27">
<label>27.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Arentz</surname> <given-names>M</given-names></name> <name><surname>Yim</surname> <given-names>E</given-names></name> <name><surname>Klaff</surname> <given-names>L</given-names></name> <name><surname>Lokhandwala</surname> <given-names>S</given-names></name> <name><surname>Riedo</surname> <given-names>FX</given-names></name> <name><surname>Chong</surname> <given-names>M</given-names></name> <etal/></person-group>. <article-title>Characteristics and outcomes of 21 critically ill patients with COVID-19 in Washington State</article-title>. <source>JAMA.</source> (<year>2020</year>) <volume>323</volume>:<fpage>1612</fpage>&#x02013;<lpage>4</lpage>. <pub-id pub-id-type="doi">10.1001/jama.2020.4326</pub-id><pub-id pub-id-type="pmid">32191259</pub-id></citation></ref>
<ref id="B28">
<label>28.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rodriguez-Morales</surname> <given-names>AJ</given-names></name> <name><surname>Cardona-Ospina</surname> <given-names>JA</given-names></name> <name><surname>Guti&#x000E9;rrez-Ocampo</surname> <given-names>E</given-names></name> <name><surname>Villamizar-Pe&#x000F1;a</surname> <given-names>R</given-names></name> <name><surname>Holguin-Rivera</surname> <given-names>Y</given-names></name> <name><surname>Escalera-Antezana</surname> <given-names>JP</given-names></name> <etal/></person-group>. <article-title>Clinical, laboratory and imaging features of COVID-19: a systematic review and meta-analysis</article-title>. <source>Travel Med Infect Dis.</source> (<year>2020</year>) <volume>34</volume>:<fpage>101623</fpage>. <pub-id pub-id-type="doi">10.1016/j.tmaid.2020.101623</pub-id><pub-id pub-id-type="pmid">32179124</pub-id></citation></ref>
<ref id="B29">
<label>29.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ferreira</surname> <given-names>FL</given-names></name> <name><surname>Bota</surname> <given-names>DP</given-names></name> <name><surname>Bross</surname> <given-names>A</given-names></name> <name><surname>M&#x000E9;lot</surname> <given-names>C</given-names></name> <name><surname>Vincent</surname> <given-names>JL</given-names></name></person-group>. <article-title>Serial evaluation of the SOFA score to predict outcome in critically ill patients</article-title>. <source>JAMA.</source> (<year>2001</year>) <volume>286</volume>:<fpage>1754</fpage>&#x02013;<lpage>8</lpage>. <pub-id pub-id-type="doi">10.1001/jama.286.14.1754</pub-id><pub-id pub-id-type="pmid">11594901</pub-id></citation></ref>
<ref id="B30">
<label>30.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gibson</surname> <given-names>PG</given-names></name> <name><surname>Qin</surname> <given-names>L</given-names></name> <name><surname>Puah</surname> <given-names>SH</given-names></name></person-group>. <article-title>COVID-19 acute respiratory distress syndrome (ARDS): clinical features and differences from typical pre-COVID-19 ARDS</article-title>. <source>Med J Aust</source>. (<year>2020</year>) <volume>213</volume>:<fpage>54</fpage>&#x02013;<lpage>6.e1</lpage>. <pub-id pub-id-type="doi">10.5694/mja2.50674</pub-id><pub-id pub-id-type="pmid">32572965</pub-id></citation></ref>
<ref id="B31">
<label>31.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tzotzos</surname> <given-names>SJ</given-names></name> <name><surname>Fischer</surname> <given-names>B</given-names></name> <name><surname>Fischer</surname> <given-names>H</given-names></name> <name><surname>Zeitlinger</surname> <given-names>M</given-names></name></person-group>. <article-title>Incidence of ARDS and outcomes in hospitalized patients with COVID-19: a global literature survey</article-title>. <source>Crit Care Lond Engl.</source> (<year>2020</year>) <volume>24</volume>:<fpage>516</fpage>. <pub-id pub-id-type="doi">10.1186/s13054-020-03240-7</pub-id><pub-id pub-id-type="pmid">32825837</pub-id></citation></ref>
<ref id="B32">
<label>32.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Alshukry</surname> <given-names>A</given-names></name> <name><surname>Ali</surname> <given-names>H</given-names></name> <name><surname>Ali</surname> <given-names>Y</given-names></name> <name><surname>Al-Taweel</surname> <given-names>T</given-names></name> <name><surname>Abu-Farha</surname> <given-names>M</given-names></name> <name><surname>AbuBaker</surname> <given-names>J</given-names></name> <etal/></person-group>. <article-title>Clinical characteristics of coronavirus disease 2019 (COVID-19) patients in Kuwait</article-title>. <source>PLoS ONE.</source> (<year>2020</year>) <volume>15</volume>:<fpage>e0242768</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0242768</pub-id><pub-id pub-id-type="pmid">33216801</pub-id></citation></ref>
<ref id="B33">
<label>33.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Elezkurtaj</surname> <given-names>S</given-names></name> <name><surname>Greuel</surname> <given-names>S</given-names></name> <name><surname>Ihlow</surname> <given-names>J</given-names></name> <name><surname>Michaelis</surname> <given-names>EG</given-names></name> <name><surname>Bischoff</surname> <given-names>P</given-names></name> <name><surname>Kunze</surname> <given-names>CA</given-names></name> <etal/></person-group>. <article-title>Causes of death and comorbidities in hospitalized patients with COVID-19</article-title>. <source>Sci Rep.</source> (<year>2021</year>) <volume>11</volume>:<fpage>4263</fpage>. <pub-id pub-id-type="doi">10.1038/s41598-021-82862-5</pub-id><pub-id pub-id-type="pmid">33608563</pub-id></citation></ref>
</ref-list> 
</back>
</article>
