<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Archiving and Interchange DTD v2.3 20070202//EN" "archivearticle.dtd">
<?covid-19-tdm?>
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="systematic-review">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Public Health</journal-id>
<journal-title>Frontiers in Public Health</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Public Health</abbrev-journal-title>
<issn pub-type="epub">2296-2565</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpubh.2022.897784</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Public Health</subject>
<subj-group>
<subject>Systematic Review</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Analysis of distribution characteristics of COVID-19 in America based on space-time scan statistic</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Zhao</surname> <given-names>Yuexu</given-names></name>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Liu</surname> <given-names>Qiwei</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1723557/overview"/>
</contrib>
</contrib-group>
<aff><institution>College of Economics, Hangzhou Dianzi University</institution>, <addr-line>Hangzhou</addr-line>, <country>China</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Omar El Deeb, Lebanese American University, Lebanon</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Keun Hwa Lee, Hanyang University, South Korea; Silvia &#x02013; Giono Cerezo, Instituto Polit&#x000E9;cnico Nacional (IPN), Mexico</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Qiwei Liu <email>qiweiliu&#x00040;hdu.edu.cn</email></corresp>
<fn fn-type="other" id="fn001"><p>This article was submitted to Infectious Diseases - Surveillance, Prevention and Treatment, a section of the journal Frontiers in Public Health</p></fn></author-notes>
<pub-date pub-type="epub">
<day>10</day>
<month>08</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>10</volume>
<elocation-id>897784</elocation-id>
<history>
<date date-type="received">
<day>16</day>
<month>03</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>18</day>
<month>07</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2022 Zhao and Liu.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Zhao and Liu</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license> </permissions>
<abstract>
<p>Based on COVID-19 epidemic data for the 50 states of the United States (the US) from December 2021 to January 2022, the spatial and temporal clustering characteristics of COVID-19 in the US are explored and analyzed. First, a retrospective spatiotemporal analysis is performed using SaTScan 9.5, and 17 incidence areas are obtained. Second, the reliability of the results is tested by the circular distribution method in the time dimension and the clustering method in the spatial dimension, confirming that the retrospective spatiotemporal analysis measures accurately in time and divides regions reasonably according to their spatial characteristics. Empirical results show that the first-level clustering area of the epidemic comprises six states with an average relative risk of 1.28, and the second-level clustering area includes 18 states with an average relative risk of 0.86. At present, the epidemic in the US continues to expand. Constructive epidemic prevention work is needed to reduce the impact of the epidemic and effectively control its spread.</p></abstract>
<kwd-group>
<kwd>spatiotemporal analysis</kwd>
<kwd>COVID-19</kwd>
<kwd>scan statistic</kwd>
<kwd>spatial aggregation</kwd>
<kwd>Omicron</kwd>
</kwd-group>
<counts>
<fig-count count="4"/>
<table-count count="7"/>
<equation-count count="10"/>
<ref-count count="19"/>
<page-count count="11"/>
<word-count count="5394"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>Introduction</title>
<p>Coronavirus disease 2019 (COVID-19) is the disease caused by the novel coronavirus identified in 2019; it presents as an acute respiratory infection in the majority of patients, some of whom develop severe illness and even die. Since the large-scale outbreak of COVID-19 on 21 January 2020, the US economy has been gradually affected. Moreover, the epidemic is a public health emergency that all countries in the world have to face. As one of the most seriously affected countries, the US had accumulated 74,741,586 COVID-19 cases as of 31 January 2022, and these cases show a certain tendency to aggregate in time and space. The scan statistic is a method for testing whether diseases aggregate and for detecting whether an abnormal increase of disease in time and space is caused by random variation. As a spatial statistical method in epidemiology, it has been widely used for infectious diseases, cardiovascular diseases, and other fields.</p>
<p>In 1965, Joseph (<xref ref-type="bibr" rid="B1">1</xref>) proposed the concept of the scan statistic. Kulldorff et al. (<xref ref-type="bibr" rid="B2">2</xref>&#x02013;<xref ref-type="bibr" rid="B5">5</xref>) proposed the spatial scan statistic and applied it to analyze breast cancer mortality in the US. For example, they used a dynamically varying scanning window to detect clusters in leukemia data from northern New York, and used the log-likelihood ratio to determine the cluster with the highest degree of aggregation. They also proposed many statistical models for spatiotemporal scanning, such as the retrospective space-time scan statistic under the Bernoulli or Poisson model, the prospective space-time scan statistic, the space-time permutation scan statistic, and the elliptical spatial scan statistic for periodic geographic disease monitoring. As research progressed, Jung et al. (<xref ref-type="bibr" rid="B6">6</xref>, <xref ref-type="bibr" rid="B7">7</xref>) proposed the ordinal model scan statistic in 2007, which performed well compared with the Bernoulli scan statistic for binary classification of prostate cancer data. Huang et al. (<xref ref-type="bibr" rid="B8">8</xref>, <xref ref-type="bibr" rid="B9">9</xref>) proposed a spatial scan statistic based on the exponential model for survival data of men with prostate cancer in the US in 2007; this method can be applied to survival data and purely spatial data. To study spatial heterogeneity in continuously measured population data, the weighted normal spatial scan statistic was proposed and applied to two-stage lung cancer survival research in 2009. Barbara (<xref ref-type="bibr" rid="B10">10</xref>) found that Cutl&#x00027;s method was more effective than Kulldorff&#x00027;s scan statistic for irregularly shaped spatiotemporal clusters, whereas for cylindrical spatiotemporal clusters the two methods gave similar results. Li et al. (<xref ref-type="bibr" rid="B11">11</xref>) analyzed the fund sustainability. Yin (<xref ref-type="bibr" rid="B12">12</xref>) studied the application of the method to early warning of infectious diseases, and graded the data of provinces and cities. Ma et al. (<xref ref-type="bibr" rid="B13">13</xref>) selected the optimal spatial scale through the number of signals in the monitoring of infectious diseases. So far, the scan statistic has been widely used in the prevention of diseases including tuberculosis, schistosomiasis, and hand, foot, and mouth disease.</p>
<p>Most of the abovementioned studies explore the spatial aggregation of various infectious diseases. COVID-19 is a highly contagious disease that has seriously affected people&#x00027;s lives since its outbreak and poses a great threat to people&#x00027;s health. Hohl et al. (<xref ref-type="bibr" rid="B14">14</xref>) used the daily COVID-19 case data provided by Johns Hopkins University at the county level, applied SaTScan to conduct a prospective space-time analysis, and detected active clusters across the US. As an alternative to using the prospective space-time scan statistic to identify emerging COVID-19 clusters, Beard et al. (<xref ref-type="bibr" rid="B15">15</xref>) proposed a COVID-19 monitoring method based on spatiotemporal event sequence similarity. Hohl et al. (<xref ref-type="bibr" rid="B16">16</xref>) used the prospective Poisson space-time scan statistic to detect daily clusters of COVID-19 at the county level in the 48 contiguous states and Washington DC, which helps to facilitate decision-making and public health resource allocation. Pei et al. (<xref ref-type="bibr" rid="B17">17</xref>) found that the epidemic distribution had obvious space-time heterogeneity, and that the spatial-temporal transmission had typical network characteristics.</p>
<p>In this paper, we study the spatial aggregation of COVID-19 in the US from the following aspects. First, we construct a dynamic scanning window, calculate the relative risk to measure the intensity of aggregation, and carry out the scan statistic analysis with SaTScan 9.5 based on the retrospective spatiotemporal analysis method. Second, we examine whether the results of SaTScan 9.5 are reasonable, using the circular distribution method (time dimension) and the cluster analysis method (spatial dimension) to test the reliability of the spatiotemporal scanning results. This comparison shows that the spatiotemporal scan analysis not only measures accurately in time but also divides regions reasonably according to their spatial characteristics. Finally, we take into account the data and how the COVID-19 pandemic changes on the ground, locating the clustering areas and their time spans. The scanning results provide an important theoretical basis for the relevant epidemic prevention work and are of crucial importance for establishing an early warning system for the disease, ultimately playing a positive role in strengthening prevention and resolving the risk of major diseases in the world.</p></sec>
<sec sec-type="methods" id="s2">
<title>Methodology</title>
<p>Retrospective spatiotemporal analysis builds a scanning window and compares the number of cases inside and outside the window. Since the scan statistic involves both time and space, the scanning window takes the form of a cylinder: the height of the cylinder represents time, and the base of the cylinder represents the geographic area. The location and size of the scanning window are dynamic, as it is unknown when and where a COVID-19 outbreak will occur.</p>
<p>In the analysis process, a position is randomly selected as the scanning center, and the cylindrical scanning window then changes continuously. The geographic size of the scanning window ranges between zero and a predefined upper limit. There are several ways to determine this upper bound; for example, one can use a percentage of the population at risk or a radius of the circle. In this article, we use the former. The time length of the scanning window is bounded by a maximum time frame, specified either as a percentage of the entire study period or as a specific number of days.</p>
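<p>As an illustration only, the construction of candidate cylinders can be sketched in Python as follows. The sketch is not the SaTScan implementation: the function names, the use of great-circle distance between region centers, and the exhaustive loop over every region as a scanning center are assumptions made for this example.</p>
<preformat><![CDATA[
```python
from math import radians, sin, cos, asin, sqrt


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(radians, (p[0], p[1], q[0], q[1]))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(h))


def candidate_cylinders(coords, population, n_days, max_pop_frac=0.5, max_days=None):
    """Yield (regions, start_day, end_day) for every cylinder within the limits.

    coords and population are dicts keyed by region (hypothetical inputs);
    days are indexed 0 .. n_days - 1 and end_day is inclusive.
    """
    total_pop = sum(population.values())
    max_days = n_days if max_days is None else max_days
    for centre in coords:
        # Grow the circular base by adding regions in order of distance from the centre.
        ordered = sorted(coords, key=lambda r: haversine_km(coords[centre], coords[r]))
        zone, zone_pop = [], 0
        for region in ordered:
            zone_pop += population[region]
            if zone_pop > max_pop_frac * total_pop:
                break  # upper bound: share of the population at risk
            zone.append(region)
            # The height of the cylinder: every admissible time interval.
            for start in range(n_days):
                for end in range(start, min(n_days, start + max_days)):
                    yield tuple(zone), start, end
```
]]></preformat>
<p>Each yielded cylinder would then be scored by a likelihood ratio; the population cap and the maximum number of days play the role of the spatial and temporal upper limits described above.</p>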
<p>To assess whether aggregation is present, the theoretical (expected) number of patients is obtained from the actual number of patients and the regional population, and the log-likelihood ratio (<italic>LLR</italic>) is constructed from the actual and theoretical numbers of patients inside and outside the window; the relative risk (<inline-formula><mml:math id="M2"><mml:mover accent="true"><mml:mrow><mml:mi>R</mml:mi></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:math></inline-formula>) is calculated to evaluate the strength of aggregation. Since the scanning window changes dynamically, numerous scanning windows are generated during the scanning process. To control the false-positive rate at a given level, the window with the largest <italic>LLR</italic> among all scanning windows is selected as the clustering area. The statistical significance of the <italic>LLR</italic> is tested by Monte Carlo simulation.</p>
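<p>The Monte Carlo significance test can be illustrated with a short Python sketch. It is a minimal example under stated assumptions, not the SaTScan code: the function names are hypothetical, the null datasets are generated by distributing the total number of cases over regions in proportion to their expected counts, and ties with the observed value are counted against the observed value.</p>
<preformat><![CDATA[
```python
import random


def monte_carlo_p_value(observed_llr, simulated_max_llrs):
    """Rank-based p value: rank 1 means the observed maximum LLR is the largest."""
    rank = 1 + sum(1 for s in simulated_max_llrs if s >= observed_llr)
    return rank / (len(simulated_max_llrs) + 1)


def simulate_null_counts(expected, n_total, rng):
    """Draw one random dataset under the null hypothesis.

    n_total cases are assigned to regions with probability proportional
    to the expected counts (assumption: the total is held fixed).
    """
    regions = list(expected)
    weights = [expected[r] for r in regions]
    counts = {r: 0 for r in regions}
    for r in rng.choices(regions, weights=weights, k=n_total):
        counts[r] += 1
    return counts
```
]]></preformat>
<p>With 999 simulated datasets, the smallest attainable value is 1/1000, which occurs when the observed maximum <italic>LLR</italic> exceeds every simulated maximum.</p>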
<p>We then give hypothesis test as follows:</p>
<p>Null Hypothesis (<italic>H</italic><sub>0</sub>): <italic>The spatial and temporal distribution of newly confirmed cases of COVID-19 in the US is completely random;</italic></p>
<p>Alternative Hypothesis (<italic>H</italic><sub>1</sub>): <italic>The spatial and temporal distribution of newly confirmed cases of COVID-19 in the US is not completely random</italic>.</p>
<p>Let the number of cases in window <italic>A</italic> be <italic>n</italic><sub><italic>A</italic></sub> and its population be <italic>m</italic><sub><italic>A</italic></sub>, and let <italic>E</italic>(<italic>A</italic>) be the expected number of cases in the scanning window under the null hypothesis, adjusted for covariates. Let the total number of cases in the whole study region be <italic>n</italic><sub><italic>T</italic></sub>, the total population be <italic>m</italic><sub><italic>T</italic></sub>, and the expected number of cases be <italic>E</italic>(<italic>T</italic>). Then
<disp-formula id="E1"><label>(1)</label><mml:math id="M3"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac><mml:mo>&#x000D7;</mml:mo><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E2"><label>(2)</label><mml:math id="M4"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>T</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mo>&#x02211;</mml:mo><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
The probability density function of specific points observed at region <italic>x</italic> is as follows:
<disp-formula id="E3"><label>(3)</label><mml:math id="M5"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>f</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mtable style="text-align:axis;" equalrows="false" columnlines="none none none none none none none none none" equalcolumns="false" class="array"><mml:mtr><mml:mtd><mml:mfrac><mml:mrow><mml:mi>p</mml:mi><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mi>q</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>T</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>A</mml:mi></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mfrac><mml:mrow><mml:mi>q</mml:mi><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mi>q</mml:mi><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>E</mml:mi><mml:mrow><mml:mo 
stretchy="false">(</mml:mo><mml:mrow><mml:mi>T</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo>&#x02209;</mml:mo><mml:mi>A</mml:mi></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
where <italic>p</italic> is the ratio of actual incidence to expected incidence in window <italic>A</italic>, <italic>q</italic> is the corresponding ratio outside window <italic>A</italic>, and the probability of any specific point is independent of all other points; see also Tang et al. (<xref ref-type="bibr" rid="B18">18</xref>) and Yang (<xref ref-type="bibr" rid="B19">19</xref>).</p>
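<p>As a small worked illustration, the expected counts can be computed in Python as below. This is a sketch under a simplifying assumption: no covariate adjustment is applied, so the expected count of a region is the overall case rate multiplied by the population of that region. The function name and the dictionary inputs are hypothetical.</p>
<preformat><![CDATA[
```python
def expected_cases(cases, population):
    """Expected number of cases per region under the null hypothesis.

    cases and population are dicts keyed by region; the expected count of a
    region is the overall case rate times the population of that region.
    """
    n_total = sum(cases.values())
    m_total = sum(population.values())
    rate = n_total / m_total
    return {region: rate * population[region] for region in population}
```
]]></preformat>
<p>Summing the returned values over any window gives <italic>E</italic>(<italic>A</italic>) for that window, and summing over all regions gives <italic>E</italic>(<italic>T</italic>), which in this unadjusted setting equals <italic>n</italic><sub><italic>T</italic></sub>.</p>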
<p>If <italic>p</italic> &#x0003E; <italic>q</italic>, the likelihood function <italic>LR</italic>(<italic>A, p, q</italic>) is denoted by:
<disp-formula id="E4"><label>(4)</label><mml:math id="M7"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>L</mml:mi><mml:mi>R</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>A</mml:mi><mml:mo>,</mml:mo><mml:mi>p</mml:mi><mml:mo>,</mml:mo><mml:mi>q</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mfrac><mml:mrow><mml:msup><mml:mrow><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mo>!</mml:mo></mml:mrow></mml:mfrac><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>T</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x0220F;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mi>A</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
Otherwise, the likelihood function <italic>LR</italic><sub>0</sub> (under the null hypothesis) is
<disp-formula id="E6"><label>(5)</label><mml:math id="M9"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>L</mml:mi><mml:msub><mml:mrow><mml:mi>R</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msup><mml:mrow><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mo>!</mml:mo></mml:mrow></mml:mfrac><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>T</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x0220F;</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mi>A</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
The test statistic &#x003BB; for the spatiotemporal scan is defined as follows:
<disp-formula id="E7"><label>(6)</label><mml:math id="M10"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>&#x003BB;</mml:mi><mml:mo>:</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mo>=</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mfrac><mml:mrow><mml:mstyle displaystyle="true"><mml:munder><mml:mrow><mml:mi>S</mml:mi><mml:mi>u</mml:mi><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi><mml:mo>,</mml:mo><mml:mi>p</mml:mi><mml:mo>&#x0003E;</mml:mo><mml:mi>q</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:mi>L</mml:mi><mml:mi>R</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>A</mml:mi><mml:mo>,</mml:mo><mml:mi>p</mml:mi><mml:mo>,</mml:mo><mml:mi>q</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mstyle displaystyle="true"><mml:munder><mml:mrow><mml:mi>S</mml:mi><mml:mi>u</mml:mi><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi><mml:mo>,</mml:mo><mml:mi>p</mml:mi><mml:mo>=</mml:mo><mml:mi>q</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:mi>L</mml:mi><mml:mi>R</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>A</mml:mi><mml:mo>,</mml:mo><mml:mi>p</mml:mi><mml:mo>,</mml:mo><mml:mi>q</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
According to Equations (4) and (5), we have
<disp-formula id="E8"><label>(7)</label><mml:math id="M11"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>&#x003BB;</mml:mi><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munder><mml:mrow><mml:mi>S</mml:mi><mml:mi>u</mml:mi><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup><mml:mtext>&#x000A0;</mml:mtext><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>T</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>T</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup><mml:mi>I</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E9"><label>(8)</label><mml:math id="M12"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>B</mml:mi><mml:mo>:</mml:mo><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:mo>&#x0003E;</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>T</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
In formula (<xref ref-type="disp-formula" rid="E8">7</xref>), <italic>I</italic>(&#x000B7;) is an indicator function, and the event <italic>B</italic> in formula (<xref ref-type="disp-formula" rid="E9">8</xref>) states that the ratio of the actual incidence to the expected incidence in window <italic>A</italic> is greater than the corresponding ratio outside window <italic>A</italic>. The statistic &#x003BB; is thus a measure of how risk within a cylinder differs from risk outside.</p>
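<p>For a single window, the logarithm of the term inside the supremum and the ratio of the risk inside to the risk outside can be sketched in Python as follows. This is an illustrative sketch, not the SaTScan implementation: the function names are hypothetical, and windows that fail the condition <italic>B</italic> are given a score of zero, a convention assumed here so that they can never be selected as a cluster.</p>
<preformat><![CDATA[
```python
from math import log


def poisson_llr(n_a, e_a, n_t, e_t):
    """Log-likelihood ratio of one window A under the discrete Poisson model.

    n_a, e_a: observed and expected cases inside the window;
    n_t, e_t: observed and expected cases in the whole study region.
    Assumes 0 < e_a < e_t, i.e. A is a proper part of the study region.
    """
    n_out, e_out = n_t - n_a, e_t - e_a
    # Indicator I(B): only windows with excess risk inside are candidates.
    if n_a / e_a <= n_out / e_out:
        return 0.0  # assumed convention for windows without excess risk
    llr = n_a * log(n_a / e_a)
    if n_out > 0:
        llr += n_out * log(n_out / e_out)
    return llr - n_t * log(n_t / e_t)


def relative_risk(n_a, e_a, n_t, e_t):
    """Observed-to-expected ratio inside the window divided by the ratio outside."""
    return (n_a / e_a) / ((n_t - n_a) / (e_t - e_a))
```
]]></preformat>
<p>For example, a window with 30 observed and 10 expected cases in a region with 40 observed and 40 expected cases has a relative risk of 9, since the ratio inside is 3 and the ratio outside is 1/3.</p>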
<p>Next, we use the Monte Carlo method to simulate the <italic>p</italic> value of the <italic>LLR</italic> and determine whether the aggregation is statistically significant. First, we simulate <italic>c</italic> random datasets, calculate the maximum <italic>LLR</italic> for each dataset, and rank these values together with the real <italic>LLR</italic> in descending order. If the rank of the real value is <italic>R</italic>, then we have
<disp-formula id="E10"><label>(9)</label><mml:math id="M13"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mi>p</mml:mi><mml:mo>=</mml:mo><mml:mi>R</mml:mi><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>c</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
If <italic>p</italic> &#x0003C; 0.05, we reject the null hypothesis. The relative risk of each aggregation is as follows:
<disp-formula id="E11"><label>(10)</label><mml:math id="M14"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mover accent="true"><mml:mrow><mml:mi>R</mml:mi></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>T</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mi>E</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula></p></sec>
<sec id="s3">
<title>Empirical analysis</title>
<sec>
<title>Source of the data</title>
<p>This study covers the 50 US states. COVID-19 case data were obtained mainly from the New York Times and include the date of diagnosis, the current area, and the source of infection of each patient. Demographic data were taken mainly from the 2020 US census, and basic geographic information for each state was derived from Google satellite map data. The latitude and longitude of each state capital were used as the center coordinates of that state.</p>
<p>The variant data are derived from a Centers for Disease Control and Prevention (CDC) study that tracks variant proportions, estimated from weekly random sampling and gene sequencing within each Department of Health and Human Services region. We combine these proportions with the daily new cases to estimate the number of variant infections in each state over a week.</p></sec>
<sec>
<title>Parameter setting</title>
<p>We use the retrospective spatiotemporal analysis method with the discrete Poisson model. The scanning period runs from 1 December 2021 to 31 January 2022, and the time interval is 1 day. As COVID-19 is highly contagious, the spatial window is capped at 50% of the population at risk, the maximum circle size is set at 30% of the population defined in the maximum circle size file, rather than 30% of the regular population, and no geographical overlap between clusters is allowed. The relevant literature and the actual situation indicate that COVID-19 outbreaks develop quickly and that the incubation period is short. Besides, the inaction of the US government in managing the outbreak lengthens the epidemic cycle. Because the daily pattern of COVID-19 changes rapidly, the minimum temporal cluster size is set to 1 day. For the significance test, the number of Monte Carlo replications is set to 999.</p></sec>
<sec>
<title>Description analysis</title>
<p>In terms of population distribution, COVID-19 in the US shows no association with gender, and patients are mainly in the 44 to 59 age group. In terms of time distribution, the US recorded its largest number of new cases on 10<sup>th</sup> January, with 1,420,374 cases. On 3<sup>rd</sup>, 18<sup>th</sup>, and 24<sup>th</sup> January, more than 1 million new cases were added daily, with 1,003,751, 1,173,885, and 1,025,999 cases, respectively. In terms of the overall trend, the outbreak was relatively serious in the early stage in each state, and the number of confirmed cases grew rapidly after a short lag. <xref ref-type="fig" rid="F1">Figure 1</xref> shows that the epidemic was not effectively controlled overall, so the cumulative number of confirmed cases kept increasing. In terms of regional distribution, cases were mainly concentrated in the east and west of the US; California has the largest number of confirmed cases, followed by New York.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>The number of COVID-19 cases.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpubh-10-897784-g0001.tif"/>
</fig></sec>
<sec>
<title>Space-time analysis</title>
<p>A retrospective spatiotemporal analysis is carried out for the US. Running SaTScan 9.5 yields 17 clusters, which are ranked in descending order of log-likelihood ratio; every cluster has <italic>p</italic> &#x0003C; 0.01. The clustering areas are tested by the aboriginality test. The specific data are summarized in <xref ref-type="table" rid="T1">Table 1</xref>.</p>
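<p>The rank-based Monte Carlo <italic>p</italic>-value of Equation (9) can be illustrated with a minimal Python sketch. It assumes that <italic>R</italic> is the rank of the observed log-likelihood ratio among the observed and simulated values and that <italic>c</italic> is the number of replications; the function name and the toy input are hypothetical and are not part of SaTScan or of the study data.</p>
<preformat>
```python
def monte_carlo_p_value(observed_llr, simulated_llrs):
    """Rank-based p-value p = R / (c + 1), as in Equation (9).

    Assumption: R is 1 plus the number of simulated statistics that are
    at least as large as the observed one, and c = len(simulated_llrs).
    """
    c = len(simulated_llrs)
    rank = 1 + sum(1 for llr in simulated_llrs if llr >= observed_llr)
    return rank / (c + 1)


# Toy input (hypothetical values, not study data): 999 simulated
# statistics, all smaller than the observed statistic.
simulated = [float(i) for i in range(999)]
p = monte_carlo_p_value(5000.0, simulated)
print(p)  # smallest p-value attainable with 999 replications: 1/1000
```
</preformat>
<p>Under these assumptions, 999 replications give a smallest attainable <italic>p</italic>-value of 1/1,000.</p>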
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Retrospective spatiotemporal analysis.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Cluster</bold></th>
<th valign="top" align="center"><bold>Start date</bold></th>
<th valign="top" align="center"><bold>Number of states</bold></th>
<th valign="top" align="center"><bold>Observed cases</bold></th>
<th valign="top" align="center"><bold>Expected cases</bold></th>
<th valign="top" align="center"><bold><inline-formula><mml:math id="M1"><mml:mover accent="true"><mml:mrow><mml:mi>R</mml:mi></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:math></inline-formula></bold></th>
<th valign="top" align="center"><bold><italic>LLR</italic></bold></th>
<th valign="top" align="center"><bold><italic>p</italic></bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">1</td>
<td valign="top" align="center">Dec 1</td>
<td valign="top" align="center">6</td>
<td valign="top" align="center">4,157,904</td>
<td valign="top" align="center">3,363,765.04</td>
<td valign="top" align="center">1.28</td>
<td valign="top" align="center">101,152.56</td>
<td valign="top" align="center">0.0000</td>
</tr>
<tr>
<td valign="top" align="left">2</td>
<td valign="top" align="center">Dec 1</td>
<td valign="top" align="center">18</td>
<td valign="top" align="center">5,817,663</td>
<td valign="top" align="center">6,526,216.68</td>
<td valign="top" align="center">0.86</td>
<td valign="top" align="center">52,598.68</td>
<td valign="top" align="center">0.0000</td>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td valign="top" align="center">Dec 1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">1,891,720</td>
<td valign="top" align="center">2,301,321.62</td>
<td valign="top" align="center">0.81</td>
<td valign="top" align="center">42,332.33</td>
<td valign="top" align="center">0.0000</td>
</tr>
<tr>
<td valign="top" align="left">4</td>
<td valign="top" align="center">Dec 1</td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">1,114,238</td>
<td valign="top" align="center">1,418,944.71</td>
<td valign="top" align="center">0.78</td>
<td valign="top" align="center">37,219.81</td>
<td valign="top" align="center">0.0000</td>
</tr>
<tr>
<td valign="top" align="left">5</td>
<td valign="top" align="center">Dec 1</td>
<td valign="top" align="center">4</td>
<td valign="top" align="center">1,965,177</td>
<td valign="top" align="center">2,274,140.59</td>
<td valign="top" align="center">0.85</td>
<td valign="top" align="center">24,001.90</td>
<td valign="top" align="center">0.0000</td>
</tr>
<tr>
<td valign="top" align="left">6</td>
<td valign="top" align="center">Dec 1</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">945,695</td>
<td valign="top" align="center">1,169,284.65</td>
<td valign="top" align="center">0.80</td>
<td valign="top" align="center">23,886.30</td>
<td valign="top" align="center">0.0000</td>
</tr>
<tr>
<td valign="top" align="left">7</td>
<td valign="top" align="center">Dec 1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">672,175</td>
<td valign="top" align="center">845,809.54</td>
<td valign="top" align="center">0.79</td>
<td valign="top" align="center">19,780.36</td>
<td valign="top" align="center">0.0000</td>
</tr>
<tr>
<td valign="top" align="left">8</td>
<td valign="top" align="center">Dec 1</td>
<td valign="top" align="center">6</td>
<td valign="top" align="center">882,793</td>
<td valign="top" align="center">1,034,449.98</td>
<td valign="top" align="center">0.85</td>
<td valign="top" align="center">12,161.49</td>
<td valign="top" align="center">0.0000</td>
</tr>
<tr>
<td valign="top" align="left">9</td>
<td valign="top" align="center">Dec 1</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">3,455,831</td>
<td valign="top" align="center">3,236,835.98</td>
<td valign="top" align="center">1.08</td>
<td valign="top" align="center">8,298.06</td>
<td valign="top" align="center">0.0000</td>
</tr>
<tr>
<td valign="top" align="left">10</td>
<td valign="top" align="center">Dec 1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">1,844,383</td>
<td valign="top" align="center">1,700,649.87</td>
<td valign="top" align="center">1.09</td>
<td valign="top" align="center">6,333.58</td>
<td valign="top" align="center">0.0000</td>
</tr>
<tr>
<td valign="top" align="left">11</td>
<td valign="top" align="center">Dec 1</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">1,351,620</td>
<td valign="top" align="center">1,228,440.97</td>
<td valign="top" align="center">1.11</td>
<td valign="top" align="center">6,284.42</td>
<td valign="top" align="center">0.0000</td>
</tr>
<tr>
<td valign="top" align="left">12</td>
<td valign="top" align="center">Dec 1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">535,773</td>
<td valign="top" align="center">465,366.48</td>
<td valign="top" align="center">1.15</td>
<td valign="top" align="center">5,172.76</td>
<td valign="top" align="center">0.0000</td>
</tr>
<tr>
<td valign="top" align="left">13</td>
<td valign="top" align="center">Dec 1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">1,110,569</td>
<td valign="top" align="center">1,011,672.39</td>
<td valign="top" align="center">1.10</td>
<td valign="top" align="center">4,878.70</td>
<td valign="top" align="center">0.0000</td>
</tr>
<tr>
<td valign="top" align="left">14</td>
<td valign="top" align="center">Dec 1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">423,426</td>
<td valign="top" align="center">485,990.37</td>
<td valign="top" align="center">0.87</td>
<td valign="top" align="center">4,288.31</td>
<td valign="top" align="center">0.0000</td>
</tr>
<tr>
<td valign="top" align="left">15</td>
<td valign="top" align="center">Dec 1</td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">1,056,440</td>
<td valign="top" align="center">990,205.76</td>
<td valign="top" align="center">1.07</td>
<td valign="top" align="center">2,254.76</td>
<td valign="top" align="center">0.0000</td>
</tr>
<tr>
<td valign="top" align="left">16</td>
<td valign="top" align="center">Dec 1</td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">2,193,108</td>
<td valign="top" align="center">2,263,168.50</td>
<td valign="top" align="center">0.97</td>
<td valign="top" align="center">1,198.58</td>
<td valign="top" align="center">0.0000</td>
</tr>
<tr>
<td valign="top" align="left">17</td>
<td valign="top" align="center">Dec 1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">381,903</td>
<td valign="top" align="center">355,779.67</td>
<td valign="top" align="center">1.07</td>
<td valign="top" align="center">949.66</td>
<td valign="top" align="center">0.0000</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The first cluster covers Connecticut, Rhode Island, New York, Massachusetts, New Hampshire, and New Jersey from 1 December 2021 to 31 January 2022. Its log-likelihood ratio of 101,152.56 is the highest of all clusters, and its relative risk is 1.28, which indicates strong aggregation of COVID-19 in these six states during this period. Over the same period, 18 states, including Colorado, form the second cluster, with a log-likelihood ratio of 52,598.68 and a relative risk of 0.86. Texas forms the third cluster over the same period, with a log-likelihood ratio of 42,332.33 and a relative risk of 0.81.</p>
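<p>The relative risk of Equation (10) can be sketched in Python as follows. The observed and expected counts of the first cluster are taken from <xref ref-type="table" rid="T1">Table 1</xref>. The national totals are not reported in this section, so the totals in the second example are hypothetical placeholders that only illustrate the arithmetic.</p>
<preformat>
```python
def relative_risk(n_a, e_a, n_t, e_t):
    """Relative risk of Equation (10).

    n_a, e_a: observed and expected cases inside the cluster.
    n_t, e_t: observed and expected cases in the whole study area.
    """
    inside = n_a / e_a                    # observed/expected inside
    outside = (n_t - n_a) / (e_t - e_a)   # observed/expected outside
    return inside / outside


# Cluster 1 in Table 1: observed-to-expected ratio inside the cluster.
ratio_inside = 4157904 / 3363765.04
print(round(ratio_inside, 2))

# Hypothetical totals (NOT from the study), chosen for easy arithmetic.
rr_toy = relative_risk(n_a=150, e_a=100, n_t=1000, e_t=1000)
print(round(rr_toy, 3))
```
</preformat>
<p>The relative risk is larger than the observed-to-expected ratio inside the cluster whenever the ratio outside the cluster is below 1, which is consistent with the first cluster, where the inside ratio is about 1.24 and the reported relative risk is 1.28.</p>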
<p>Combined with the daily incidence of each state, it can be observed that the start of a cluster coincides with the sudden increase in confirmed COVID-19 cases in the region, and its end coincides with the point at which the growth rate of confirmed cases begins to decline. According to <xref ref-type="table" rid="T2">Table 2</xref>, four of the states in the first cluster, Rhode Island, New York, Massachusetts, and New Jersey, rank among the top five states by incidence of COVID-19 in the US, and New York is the state with the second largest number of confirmed cases. Although Massachusetts and New Hampshire have fewer confirmed cases than New York, they are geographically close to New York, where the epidemic is relatively serious.</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Number of cases and morbidity in the top 20 states.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>State</bold></th>
<th valign="top" align="center"><bold>Population</bold></th>
<th valign="top" align="center"><bold>Cases</bold></th>
<th valign="top" align="center"><bold>Morbidity</bold></th>
<th valign="top" align="center"><bold>Rank</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Rhode Island</td>
<td valign="top" align="center">1,097,379</td>
<td valign="top" align="center">152,016</td>
<td valign="top" align="center">0.138526434</td>
<td valign="top" align="center">1</td>
</tr>
<tr>
<td valign="top" align="left">New York</td>
<td valign="top" align="center">20,201,249</td>
<td valign="top" align="center">2,060,920</td>
<td valign="top" align="center">0.102019435</td>
<td valign="top" align="center">2</td>
</tr>
<tr>
<td valign="top" align="left">Massachusetts</td>
<td valign="top" align="center">7,029,917</td>
<td valign="top" align="center">692,497</td>
<td valign="top" align="center">0.098507137</td>
<td valign="top" align="center">3</td>
</tr>
<tr>
<td valign="top" align="left">Delaware</td>
<td valign="top" align="center">989,948</td>
<td valign="top" align="center">94,921</td>
<td valign="top" align="center">0.095884834</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left">New Jersey</td>
<td valign="top" align="center">9,288,994</td>
<td valign="top" align="center">859,151</td>
<td valign="top" align="center">0.092491286</td>
<td valign="top" align="center">5</td>
</tr>
<tr>
<td valign="top" align="left">South Carolina</td>
<td valign="top" align="center">5,118,425</td>
<td valign="top" align="center">466,698</td>
<td valign="top" align="center">0.091180002</td>
<td valign="top" align="center">6</td>
</tr>
<tr>
<td valign="top" align="left">Wisconsin</td>
<td valign="top" align="center">5,893,718</td>
<td valign="top" align="center">535,773</td>
<td valign="top" align="center">0.090905775</td>
<td valign="top" align="center">7</td>
</tr>
<tr>
<td valign="top" align="left">Kansas</td>
<td valign="top" align="center">2,937,880</td>
<td valign="top" align="center">264,696</td>
<td valign="top" align="center">0.090097621</td>
<td valign="top" align="center">8</td>
</tr>
<tr>
<td valign="top" align="left">Alaska</td>
<td valign="top" align="center">733,391</td>
<td valign="top" align="center">65,639</td>
<td valign="top" align="center">0.089500689</td>
<td valign="top" align="center">9</td>
</tr>
<tr>
<td valign="top" align="left">Hawaii</td>
<td valign="top" align="center">1,455,271</td>
<td valign="top" align="center">129,489</td>
<td valign="top" align="center">0.088979304</td>
<td valign="top" align="center">10</td>
</tr>
<tr>
<td valign="top" align="left">Utah</td>
<td valign="top" align="center">3,271,616</td>
<td valign="top" align="center">289,753</td>
<td valign="top" align="center">0.088565712</td>
<td valign="top" align="center">11</td>
</tr>
<tr>
<td valign="top" align="left">Illinois</td>
<td valign="top" align="center">12,812,508</td>
<td valign="top" align="center">1,110,569</td>
<td valign="top" align="center">0.086678502</td>
<td valign="top" align="center">12</td>
</tr>
<tr>
<td valign="top" align="left">Louisiana</td>
<td valign="top" align="center">4,657,757</td>
<td valign="top" align="center">399,318</td>
<td valign="top" align="center">0.085731823</td>
<td valign="top" align="center">13</td>
</tr>
<tr>
<td valign="top" align="left">Florida</td>
<td valign="top" align="center">21,538,187</td>
<td valign="top" align="center">1,844,383</td>
<td valign="top" align="center">0.085633159</td>
<td valign="top" align="center">14</td>
</tr>
<tr>
<td valign="top" align="left">North Carolina</td>
<td valign="top" align="center">10,439,388</td>
<td valign="top" align="center">884,922</td>
<td valign="top" align="center">0.084767613</td>
<td valign="top" align="center">15</td>
</tr>
<tr>
<td valign="top" align="left">Kentucky</td>
<td valign="top" align="center">4,505,836</td>
<td valign="top" align="center">381,903</td>
<td valign="top" align="center">0.084757412</td>
<td valign="top" align="center">16</td>
</tr>
<tr>
<td valign="top" align="left">West Virginia</td>
<td valign="top" align="center">1,793,716</td>
<td valign="top" align="center">151,977</td>
<td valign="top" align="center">0.08472746</td>
<td valign="top" align="center">17</td>
</tr>
<tr>
<td valign="top" align="left">Vermont</td>
<td valign="top" align="center">643,077</td>
<td valign="top" align="center">54,458</td>
<td valign="top" align="center">0.084683483</td>
<td valign="top" align="center">18</td>
</tr>
<tr>
<td valign="top" align="left">California</td>
<td valign="top" align="center">39,538,223</td>
<td valign="top" align="center">3,326,342</td>
<td valign="top" align="center">0.08412978</td>
<td valign="top" align="center">19</td>
</tr>
<tr>
<td valign="top" align="left">Arizona</td>
<td valign="top" align="center">7,151,502</td>
<td valign="top" align="center">600,864</td>
<td valign="top" align="center">0.084019273</td>
<td valign="top" align="center">20</td>
</tr>
</tbody>
</table>
</table-wrap></sec>
<sec>
<title>Circular distribution analysis</title>
<p>Since the study period is 62 days, we divide 360&#x000B0; evenly over the days, so that 1 day corresponds to 5.81&#x000B0; and 1 h corresponds to 0.24&#x000B0;. To assign a single angle to each day, each day is represented by 8:00 a.m., that is, the angle one-third of the way through the day is taken as the angle of that day. The spatiotemporal scanning analysis yields 17 clustering areas, and we take the first three as examples. To make the following results comparable with the previous ones, we aggregate the daily newly confirmed cases by cluster. The <italic>r</italic>, <italic>r</italic><sub>0</sub>, and <italic>p</italic>-values of the Rayleigh test are obtained through calculation, as summarized in <xref ref-type="table" rid="T3">Table 3</xref>.</p>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>Results of circular distribution analysis.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Cluster</bold></th>
<th valign="top" align="center"><bold><italic>r</italic></bold></th>
<th valign="top" align="center"><bold><italic>r</italic><sub>0</sub></bold></th>
<th valign="top" align="center"><bold><italic>p</italic></bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">1</td>
<td valign="top" align="center">0.4392</td>
<td valign="top" align="center">0.0013</td>
<td valign="top" align="center">0.001</td>
</tr>
<tr>
<td valign="top" align="left">2</td>
<td valign="top" align="center">0.4769</td>
<td valign="top" align="center">0.0011</td>
<td valign="top" align="center">0.000</td>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td valign="top" align="center">0.5398</td>
<td valign="top" align="center">0.0019</td>
<td valign="top" align="center">0.001</td>
</tr>
</tbody>
</table>
</table-wrap>
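<p>The circular quantities can be sketched in Python as follows. This is a minimal illustration and not the authors' code: it assumes the standard circular-statistics definitions (mean vector length <italic>r</italic>, mean angle, and angular deviation <italic>s</italic> equal to the square root of &#x02212;2 ln <italic>r</italic>, in radians), and the function names are hypothetical. With these definitions, the <italic>r</italic> values in <xref ref-type="table" rid="T3">Table 3</xref> reproduce the <italic>s</italic> values reported for the same clusters.</p>
<preformat>
```python
import math

DAYS = 62                       # study period, 1 Dec 2021 to 31 Jan 2022
DEG_PER_DAY = 360.0 / DAYS      # about 5.81 degrees per day


def day_angle(day_index, hour=8.0):
    """Angle (degrees) of a 1-based day, taken at the given hour."""
    return (day_index - 1 + hour / 24.0) * DEG_PER_DAY


def angular_deviation(r):
    """Angular deviation s in degrees: sqrt(-2 ln r) radians."""
    return math.degrees(math.sqrt(-2.0 * math.log(r)))


def circular_summary(daily_cases):
    """Return (r, mean angle in degrees, s in degrees) for daily counts.

    Assumes at least one case and a non-zero mean vector.
    """
    n = sum(daily_cases)
    x = y = 0.0
    for day, cases in enumerate(daily_cases, start=1):
        a = math.radians(day_angle(day))
        x += cases * math.cos(a) / n
        y += cases * math.sin(a) / n
    r = min(math.hypot(x, y), 1.0)          # guard against rounding above 1
    mean_angle = math.degrees(math.atan2(y, x)) % 360.0
    return r, mean_angle, angular_deviation(r)


def peak_day(mean_angle):
    """1-based day of the study period that contains the mean angle."""
    return int(mean_angle // DEG_PER_DAY) + 1


# Check against the reported values: r for cluster 1 and the mean angles.
print(round(angular_deviation(0.4392), 1))
print(peak_day(32.7887), peak_day(87.1093), peak_day(77.1640))
```
</preformat>
<p>Under this convention the three mean angles fall on days 6, 16, and 14 of the study period, that is, 6, 16, and 14 December.</p>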
<p>The peak day and peak period of each cluster are summarized in <xref ref-type="table" rid="T4">Table 4</xref>.</p>
<table-wrap position="float" id="T4">
<label>Table 4</label>
<caption><p>Peak day and peak period of incidence of COVID-19 in each cluster area.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Cluster</bold></th>
<th valign="top" align="center"><bold><inline-formula><mml:math id="M6"><mml:mover accent="false" class="mml-overline"><mml:mrow><mml:mi>&#x003B1;</mml:mi></mml:mrow><mml:mo accent="true">&#x000AF;</mml:mo></mml:mover></mml:math></inline-formula></bold></th>
<th valign="top" align="center"><bold><italic>s</italic></bold></th>
<th valign="top" align="center"><bold>Peak incidence</bold></th>
<th valign="top" align="center"><bold>Epidemic peak period</bold></th>
<th valign="top" align="center"><bold>Peak period span (Day)</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">1</td>
<td valign="top" align="center">32.7887</td>
<td valign="top" align="center">73.5005</td>
<td valign="top" align="center">Dec 6</td>
<td valign="top" align="center">Dec 6&#x02013;Dec 19</td>
<td valign="top" align="center">14</td>
</tr>
<tr>
<td valign="top" align="left">2</td>
<td valign="top" align="center">87.1093</td>
<td valign="top" align="center">69.7240</td>
<td valign="top" align="center">Dec 16</td>
<td valign="top" align="center">Dec 4&#x02013;Dec 28</td>
<td valign="top" align="center">25</td>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td valign="top" align="center">77.1640</td>
<td valign="top" align="center">63.6210</td>
<td valign="top" align="center">Dec 14</td>
<td valign="top" align="center">Dec 3&#x02013;Dec 24</td>
<td valign="top" align="center">22</td>
</tr>
</tbody>
</table>
</table-wrap></sec>
<sec>
<title>Clustering analysis</title>
<p>Hierarchical cluster analysis is commonly used in classification research and can overcome the shortcomings of qualitative classification. Objects whose index characteristics are similar overall are grouped into the same class. Here, the cumulative confirmed cases, the regional population, and the incidence rate are used as the indicators of each region and are imported into R for standardization. The sum of squared deviations method (Ward's method) is then used to divide the states into four categories. Since latitude and longitude coordinates are involved in the spatiotemporal scanning analysis, the coordinates of each state capital are also added to the indicators, as shown in <xref ref-type="fig" rid="F2">Figure 2</xref>. Because full state names would clutter the diagram, each state is labeled with a geographic code (for example, US-01), as listed in <xref ref-type="table" rid="T5">Table 5</xref>.</p>
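<p>The standardization and merging steps can be sketched in plain Python as follows. This is an illustrative reimplementation and not the R code used in the study: it assumes z-score standardization and Ward's criterion (merge the pair of clusters that gives the smallest increase in the within-cluster sum of squares). The function names are hypothetical, and the six input rows are population, cases, and morbidity taken from <xref ref-type="table" rid="T2">Table 2</xref>.</p>
<preformat>
```python
import statistics


def standardize(rows):
    """Z-score each column of a list of equal-length numeric rows."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    sds = [statistics.stdev(c) for c in cols]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)]
            for row in rows]


def ward_clusters(points, k):
    """Agglomerate points into k clusters with Ward's criterion."""
    dim = len(points[0])
    clusters = [[i] for i in range(len(points))]

    def centroid(members):
        return [sum(points[i][j] for i in members) / len(members)
                for j in range(dim)]

    def merge_cost(a, b):
        # Increase in within-cluster sum of squares if a and b merge.
        ca, cb = centroid(a), centroid(b)
        dist2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
        return len(a) * len(b) / (len(a) + len(b)) * dist2

    while len(clusters) > k:
        _, i, j = min(
            (merge_cost(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Rows from Table 2: population, cases, morbidity.
states = ["Rhode Island", "New York", "Massachusetts",
          "Delaware", "New Jersey", "California"]
rows = [
    [1097379, 152016, 0.138526434],
    [20201249, 2060920, 0.102019435],
    [7029917, 692497, 0.098507137],
    [989948, 94921, 0.095884834],
    [9288994, 859151, 0.092491286],
    [39538223, 3326342, 0.08412978],
]
groups = ward_clusters(standardize(rows), k=2)
print([[states[i] for i in g] for g in groups])
```
</preformat>
<p>Appending the capital's latitude and longitude to each row before standardization gives the coordinated variant of the analysis; with only six states the grouping is purely illustrative.</p>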
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Coordinated hierarchical cluster analysis.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpubh-10-897784-g0002.tif"/>
</fig>
<table-wrap position="float" id="T5">
<label>Table 5</label>
<caption><p>Correspondence between geographic codes and states.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Code</bold></th>
<th valign="top" align="left"><bold>State</bold></th>
<th valign="top" align="left"><bold>Code</bold></th>
<th valign="top" align="left"><bold>State</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">US-01</td>
<td valign="top" align="left">Alabama</td>
<td valign="top" align="left">US-30</td>
<td valign="top" align="left">Montana</td>
</tr>
<tr>
<td valign="top" align="left">US-02</td>
<td valign="top" align="left">Alaska</td>
<td valign="top" align="left">US-31</td>
<td valign="top" align="left">Nebraska</td>
</tr>
<tr>
<td valign="top" align="left">US-04</td>
<td valign="top" align="left">Arizona</td>
<td valign="top" align="left">US-32</td>
<td valign="top" align="left">Nevada</td>
</tr>
<tr>
<td valign="top" align="left">US-05</td>
<td valign="top" align="left">Arkansas</td>
<td valign="top" align="left">US-33</td>
<td valign="top" align="left">New Hampshire</td>
</tr>
<tr>
<td valign="top" align="left">US-06</td>
<td valign="top" align="left">California</td>
<td valign="top" align="left">US-34</td>
<td valign="top" align="left">New Jersey</td>
</tr>
<tr>
<td valign="top" align="left">US-08</td>
<td valign="top" align="left">Colorado</td>
<td valign="top" align="left">US-35</td>
<td valign="top" align="left">New Mexico</td>
</tr>
<tr>
<td valign="top" align="left">US-09</td>
<td valign="top" align="left">Connecticut</td>
<td valign="top" align="left">US-36</td>
<td valign="top" align="left">New York</td>
</tr>
<tr>
<td valign="top" align="left">US-10</td>
<td valign="top" align="left">Delaware</td>
<td valign="top" align="left">US-37</td>
<td valign="top" align="left">North Carolina</td>
</tr>
<tr>
<td valign="top" align="left">US-12</td>
<td valign="top" align="left">Florida</td>
<td valign="top" align="left">US-38</td>
<td valign="top" align="left">North Dakota</td>
</tr>
<tr>
<td valign="top" align="left">US-13</td>
<td valign="top" align="left">Georgia</td>
<td valign="top" align="left">US-39</td>
<td valign="top" align="left">Ohio</td>
</tr>
<tr>
<td valign="top" align="left">US-15</td>
<td valign="top" align="left">Hawaii</td>
<td valign="top" align="left">US-40</td>
<td valign="top" align="left">Oklahoma</td>
</tr>
<tr>
<td valign="top" align="left">US-16</td>
<td valign="top" align="left">Idaho</td>
<td valign="top" align="left">US-41</td>
<td valign="top" align="left">Oregon</td>
</tr>
<tr>
<td valign="top" align="left">US-17</td>
<td valign="top" align="left">Illinois</td>
<td valign="top" align="left">US-42</td>
<td valign="top" align="left">Pennsylvania</td>
</tr>
<tr>
<td valign="top" align="left">US-18</td>
<td valign="top" align="left">Indiana</td>
<td valign="top" align="left">US-44</td>
<td valign="top" align="left">Rhode Island</td>
</tr>
<tr>
<td valign="top" align="left">US-19</td>
<td valign="top" align="left">Iowa</td>
<td valign="top" align="left">US-45</td>
<td valign="top" align="left">South Carolina</td>
</tr>
<tr>
<td valign="top" align="left">US-20</td>
<td valign="top" align="left">Kansas</td>
<td valign="top" align="left">US-46</td>
<td valign="top" align="left">South Dakota</td>
</tr>
<tr>
<td valign="top" align="left">US-21</td>
<td valign="top" align="left">Kentucky</td>
<td valign="top" align="left">US-47</td>
<td valign="top" align="left">Tennessee</td>
</tr>
<tr>
<td valign="top" align="left">US-22</td>
<td valign="top" align="left">Louisiana</td>
<td valign="top" align="left">US-48</td>
<td valign="top" align="left">Texas</td>
</tr>
<tr>
<td valign="top" align="left">US-23</td>
<td valign="top" align="left">Maine</td>
<td valign="top" align="left">US-49</td>
<td valign="top" align="left">Utah</td>
</tr>
<tr>
<td valign="top" align="left">US-24</td>
<td valign="top" align="left">Maryland</td>
<td valign="top" align="left">US-50</td>
<td valign="top" align="left">Vermont</td>
</tr>
<tr>
<td valign="top" align="left">US-25</td>
<td valign="top" align="left">Massachusetts</td>
<td valign="top" align="left">US-51</td>
<td valign="top" align="left">Virginia</td>
</tr>
<tr>
<td valign="top" align="left">US-26</td>
<td valign="top" align="left">Michigan</td>
<td valign="top" align="left">US-53</td>
<td valign="top" align="left">Washington</td>
</tr>
<tr>
<td valign="top" align="left">US-27</td>
<td valign="top" align="left">Minnesota</td>
<td valign="top" align="left">US-54</td>
<td valign="top" align="left">West Virginia</td>
</tr>
<tr>
<td valign="top" align="left">US-28</td>
<td valign="top" align="left">Mississippi</td>
<td valign="top" align="left">US-55</td>
<td valign="top" align="left">Wisconsin</td>
</tr>
<tr>
<td valign="top" align="left">US-29</td>
<td valign="top" align="left">Missouri</td>
<td valign="top" align="left">US-56</td>
<td valign="top" align="left">Wyoming</td>
</tr>
</tbody>
</table>
</table-wrap>
<p><xref ref-type="fig" rid="F2">Figure 2</xref> shows that the resulting categories are highly correlated with geographical location. States in the same category are adjacent to one another, and the case information is not reflected. Therefore, the clustering method cannot balance the number of cases against geographical location well.</p></sec>
<sec>
<title>Results comparison</title>
<p>The peak periods obtained by spatiotemporal scanning analysis are compared with those obtained by the circular distribution method, as shown in <xref ref-type="table" rid="T6">Table 6</xref>. Combined with the actual situation, the peak period obtained by the circular distribution method is close to the observed course of the COVID-19 epidemic in each region. The peak period obtained by the spatiotemporal scanning method is longer than that of the circular distribution method and generally starts earlier. Spatiotemporal scanning analysis can therefore send early warning signals for COVID-19, a fast-spreading infectious disease with sudden outbreaks, and has higher practical value for disease prevention and control.</p>
<table-wrap position="float" id="T6">
<label>Table 6</label>
<caption><p>Comparison of spatiotemporal scanning analysis (STSA) and circular distribution analysis (CDA).</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Cluster</bold></th>
<th valign="top" align="left"><bold>Peak period (STSA)</bold></th>
<th valign="top" align="center"><bold>Span (Day)</bold></th>
<th valign="top" align="left"><bold>Peak period (CDA)</bold></th>
<th valign="top" align="center"><bold>Span (Day)</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">1</td>
<td valign="top" align="left">Dec 1&#x02013;Jan 31</td>
<td valign="top" align="center">62</td>
<td valign="top" align="left">Dec 6&#x02013;Dec 19</td>
<td valign="top" align="center">14</td>
</tr>
<tr>
<td valign="top" align="left">2</td>
<td valign="top" align="left">Dec 1&#x02013;Jan 31</td>
<td valign="top" align="center">62</td>
<td valign="top" align="left">Dec 4&#x02013;Dec 28</td>
<td valign="top" align="center">25</td>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td valign="top" align="left">Dec 1&#x02013;Jan 31</td>
<td valign="top" align="center">62</td>
<td valign="top" align="left">Dec 3&#x02013;Dec 24</td>
<td valign="top" align="center">22</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The clustering areas obtained by spatiotemporal scanning analysis are compared with the classification results obtained by hierarchical clustering, as shown in <xref ref-type="table" rid="T7">Table 7</xref>. Spatiotemporal Scanning Analysis (STSA), Hierarchical Cluster Analysis (HCA), and Coordinated Hierarchical Cluster Analysis (CHCA) all classify US-44 (Rhode Island), US-25 (Massachusetts), US-09 (Connecticut), US-34 (New Jersey), and US-33 (New Hampshire) into the first category, whereas US-36 (New York) is classified into the first category only by spatiotemporal scanning. Combined with the actual situation, the hierarchical clustering results depend strongly on the cumulative confirmed cases. After the coordinate indicators are added, the results become highly correlated with geographical location, and the case information is weakened. The spatiotemporal scanning method makes good use of regional population, case information, and geographical location to give a reasonable clustering area. In terms of disease prevention and control, the spatiotemporal scanning method therefore provides a better theoretical basis for measures adapted to local conditions.</p>
<table-wrap position="float" id="T7">
<label>Table 7</label>
<caption><p>Comparison of STSA and (C)HCA.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Cluster</bold></th>
<th valign="top" align="left"><bold>State (STSA)</bold></th>
<th valign="top" align="left"><bold>State (HCA)</bold></th>
<th valign="top" align="left"><bold>State (CHCA)</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">1</td>
<td valign="top" align="left">US-09 US-44 US-36 US-25 US-33 US-34</td>
<td valign="top" align="left">US-15 US-02 US-10 US-54 US-33 US-50 US-38 US-35 US-28 US-40 US-09 US-05 US-21 US-22 US-55 US-20 US-45 US-49 US-04 US-34 US-25 US-44</td>
<td valign="top" align="left">US-09 US-10 US-13 US-17 US-18 US-20 US-21 US-24 US-25 US-26 US-29 US-33 US-34 US-37 US-39 US-42 US-44 US-47 US-50 US-51 US-54 US-55</td>
</tr>
<tr>
<td valign="top" align="left">2</td>
<td valign="top" align="left">US-08 US-56 US-35 US-49 US-46 US-31 US-20 US-40 US-38 US-04 US-30 US-19 US-16 US-29 US-27 US-48 US-05 US-32</td>
<td valign="top" align="left">US-08 US-01 US-27 US-29 US-47 US-18 US-53 US-51 US-13 US-37 US-26 US-17 US-42 US-39</td>
<td valign="top" align="left">US-01 US-04 US-05 US-08 US-15 US-22 US-28 US-35 US-40 US-45 US-49</td>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td valign="top" align="left">US-48</td>
<td valign="top" align="left">US-36 US-12 US-48 US-06</td>
<td valign="top" align="left">US-12 US-36 US-06 US-48</td>
</tr>
<tr>
<td valign="top" align="left">4</td>
<td valign="top" align="left">US-53 US-41 US-16 US-30 US-32</td>
<td valign="top" align="left">US-32 US-19 US-41 US-24 US-46 US-31 US-56 US-23 US-16 US-30</td>
<td valign="top" align="left">US-02 US-16 US-19 US-23 US-27 US-30 US-31 US-32 US-38 US-41 US-46 US-53 US-56</td>
</tr>
</tbody>
</table>
</table-wrap>
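<p>The overlap among the first-category states in <xref ref-type="table" rid="T7">Table 7</xref> can be checked with a few set operations. The following is a minimal sketch, not part of the original analysis; the variable names are illustrative, and the state codes are copied from the cluster 1 row of the table without the US- prefix.</p>
<preformat>
```python
# Cluster 1 membership copied from Table 7 (state codes without the "US-" prefix).
stsa = set("09 44 36 25 33 34".split())
hca = set("15 02 10 54 33 50 38 35 28 40 09 05 21 22 55 20 45 49 04 34 25 44".split())
chca = set("09 10 13 17 18 20 21 24 25 26 29 33 34 37 39 42 44 47 50 51 54 55".split())

# States that all three methods place in cluster 1.
shared = stsa.intersection(hca, chca)
# States that only the space-time scan places in cluster 1.
stsa_only = stsa.difference(hca.union(chca))

print(sorted(shared))     # ['09', '25', '33', '34', '44']
print(sorted(stsa_only))  # ['36']
```
</preformat>
<p>The shared set corresponds to Connecticut, Massachusetts, New Hampshire, New Jersey, and Rhode Island, and the remaining code corresponds to New York.</p>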
<p>The comparative analysis above shows that the peak periods of disease onset identified by the circular distribution method and the space-time scan method overlap to some extent, and that the cluster analysis methods produce broadly similar regional groupings. However, the spatiotemporal scanning method can also provide early warning and makes better use of geographical factors to delineate outbreak areas in detail, which makes it more instructive for the early warning, prevention, and control of COVID-19.</p>
<p>The spatiotemporal scanning method can also provide a more objective basis for grouping regions when building models in related research. According to the epidemic pattern of each region, different groups can be modeled with different covariates. Qualitative and quantitative research on the factors influencing COVID-19, which analyzes the incidence characteristics of patients in different regions and at different times and combines them with regional factors such as economic level, population flow, and medical conditions, will provide an important basis for health institutions such as regional disease control centers to develop effective epidemic prevention measures.</p></sec>
<sec>
<title>Omicron variant</title>
<p>In the study of infectious diseases, variants cannot be ignored. The first Omicron case in the US was reported on 1 December 2021, which falls within the period studied in this paper, so we focus on Omicron at this stage. Omicron has a significant growth advantage over Delta, leading to rapid community spread and higher incidence than previously seen in this pandemic. Given the sharp increase in cases and the scarcity of medical resources, its spread deserves close attention.</p>
<p><xref ref-type="fig" rid="F3">Figure 3</xref> shows that B.1.617.2 (Delta) accounted for 99.25% of infections on 4 December, while B.1.1.529 (Omicron) accounted for only a small proportion. Two weeks later, the proportion of Omicron had risen rapidly to 40.64%, while that of Delta had fallen to 58.08%. After another week, Omicron overtook Delta to become the dominant variant, and 5 weeks after that its proportion reached 95.31%. Within 2 months, Omicron had become the variant with the largest share of infections, which indicates very rapid spread.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Proportion of COVID-19 variants.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpubh-10-897784-g0003.tif"/>
</fig>
<p>Based on the CDC&#x00027;s variant tracking data and its estimates of variant proportions, we calculate the weekly number of cases of each variant in each region from the daily new cases and the weekly variant proportion of that region. <xref ref-type="fig" rid="F4">Figure 4</xref> shows the cumulative number of Omicron cases in the last week of the study period. To display the whole country in one map, Alaska (US-02) and Hawaii (US-15) are moved from their actual locations. The states are divided into four categories by clustering on the case counts.</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Distribution of Omicron cases from Jan 25 to Jan 31, 2022.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpubh-10-897784-g0004.tif"/>
</fig>
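<p>The weekly variant counts described above can be sketched as the product of a region&#x00027;s weekly case total and the variant proportion for that week. This is only an illustration under that reading of the text; the function name and the sample numbers are hypothetical and are not taken from the study data.</p>
<preformat>
```python
def weekly_variant_cases(daily_new_cases, variant_share):
    """Estimate one week's cases of a variant in one region.

    daily_new_cases: new confirmed cases for each day of the week.
    variant_share: estimated proportion of the variant that week (0 to 1).
    """
    # Reject proportions outside the interval from 0 to 1.
    if variant_share != min(max(variant_share, 0.0), 1.0):
        raise ValueError("variant_share must lie between 0 and 1")
    # Weekly total of new cases multiplied by the weekly variant proportion.
    return round(sum(daily_new_cases) * variant_share)


# Hypothetical daily counts for one region and one week (illustration only).
daily = [1000, 1200, 900, 1100, 1300, 800, 700]
print(weekly_variant_cases(daily, 0.95))  # 6650
```
</preformat>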
<p>In <xref ref-type="fig" rid="F4">Figure 4</xref>, clusters 1&#x02013;4 contain 18, 29, 2, and 1 states, and their cluster centers are 89,703, 24,703, 217,010, and 507,330 cases, respectively. The number of Omicron cases increased rapidly within 2 months. The largest count occurs in a single state, US-06 (California), which reached 507,330 cases per week, and its neighboring states also show large counts.</p>
<p>In summary, Omicron infections increased rapidly, and there is a trend of diffusion from central areas to the surrounding areas. Population flow is one of the reasons for the rapid spread of the virus. The distribution map of Omicron cases also reflects the spread from densely populated cities to other cities.</p></sec></sec>
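<p>Given the four cluster centers reported for <xref ref-type="fig" rid="F4">Figure 4</xref>, a state&#x00027;s category can be illustrated by assigning its weekly count to the nearest center. This sketch assumes a centroid-based grouping such as k-means, which the text does not state explicitly; the function name and the second sample count are hypothetical.</p>
<preformat>
```python
# Cluster centers for weekly Omicron cases as reported in the text (clusters 1 to 4).
CENTERS = {1: 89703, 2: 24703, 3: 217010, 4: 507330}


def nearest_cluster(weekly_cases):
    """Return the cluster whose reported center is closest to the given count."""
    return min(CENTERS, key=lambda c: abs(CENTERS[c] - weekly_cases))


print(nearest_cluster(507330))  # 4, the weekly count reported for US-06
print(nearest_cluster(30000))   # 2, a hypothetical count
```
</preformat>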
<sec sec-type="conclusions" id="s4">
<title>Conclusion</title>
<p>In this paper, a retrospective spatiotemporal analysis of confirmed COVID-19 cases in the 50 US states is carried out. The first cluster consists of Connecticut, Rhode Island, New York, Massachusetts, New Hampshire, and New Jersey. The second cluster comprises 18 states, and the third cluster is Texas. The capitals of states belonging to the same cluster are geographically close to one another. The clustering time differs only slightly from the peak time of daily newly confirmed cases, and incidence is higher in the most prominent clusters. The reliability test of the space-time scan results shows that the space-time scan measures timing accurately and divides regions reasonably according to their spatial characteristics. By making full use of the available temporal and spatial information, spatiotemporal scanning analysis locates each clustering area, determines its timing, quantifies it, and evaluates the risk level of the region. A high level of economic development and good medical conditions are known to play a positive role in the recovery of patients. The analysis in this paper indicates that spatiotemporal scanning analysis can greatly improve the timeliness and effectiveness of early warning of diseases and can provide a scientific basis for their early prevention and control.</p></sec>
<sec sec-type="data-availability" id="s5">
<title>Data availability statement</title>
<p>The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author.</p></sec>
<sec id="s6">
<title>Author contributions</title>
<p>YZ conceived and designed the study. QL analyzed the data. YZ and QL contributed to the writing of the manuscript. All authors contributed to the article and approved the submitted version.</p></sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p></sec>
<sec sec-type="disclaimer" id="s7">
<title>Publisher&#x00027;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
</body>
<back>
<ref-list>
<title>References</title>
<ref id="B1">
<label>1.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Naus</surname> <given-names>J</given-names></name></person-group>. <article-title>The distribution of the size of the maximum cluster of points on a line</article-title>. <source>J Am Stat Assoc.</source> (<year>1965</year>) <volume>60</volume>:<fpage>532</fpage>&#x02013;<lpage>8</lpage>. <pub-id pub-id-type="doi">10.2307/2282688</pub-id></citation></ref>
<ref id="B2">
<label>2.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kulldorff</surname> <given-names>M</given-names></name></person-group>. <article-title>Prospective time periodic geographical disease surveillance using a scan statistic</article-title>. <source>J Royal Statist Soc.</source> (<year>2001</year>) <volume>164</volume>:<fpage>61</fpage>&#x02013;<lpage>72</lpage>. <pub-id pub-id-type="doi">10.2307/2680534</pub-id></citation></ref>
<ref id="B3">
<label>3.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kulldorff</surname> <given-names>M</given-names></name> <name><surname>Heffernan</surname> <given-names>R</given-names></name> <name><surname>Hartman</surname> <given-names>J</given-names></name> <etal/></person-group>. <article-title>A space-time permutation scan statistic for the early detection of disease outbreaks</article-title>. <source>PLoS Med.</source> (<year>2005</year>) <volume>2</volume>:<fpage>e59</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pmed.0020059</pub-id><pub-id pub-id-type="pmid">15719066</pub-id></citation></ref>
<ref id="B4">
<label>4.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kulldorff</surname> <given-names>M</given-names></name> <name><surname>Huang</surname> <given-names>L</given-names></name> <name><surname>Pickle</surname> <given-names>L</given-names></name> <name><surname>Duczmal</surname> <given-names>L</given-names></name></person-group>. <article-title>An elliptic spatial scan statistic</article-title>. <source>Stat Med.</source> (<year>2006</year>) <volume>25</volume>:<fpage>3929</fpage>&#x02013;<lpage>43</lpage>. <pub-id pub-id-type="doi">10.1002/sim.2490</pub-id><pub-id pub-id-type="pmid">16435334</pub-id></citation></ref>
<ref id="B5">
<label>5.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kulldorff</surname> <given-names>M</given-names></name> <name><surname>Mostashari</surname> <given-names>F</given-names></name> <name><surname>Duczmal</surname> <given-names>L</given-names></name> <name><surname>Yih</surname> <given-names>WK</given-names></name> <name><surname>Kleinman</surname> <given-names>K</given-names></name> <name><surname>Platt</surname> <given-names>R</given-names></name></person-group>. <article-title>Multivariate scan statistic for disease surveillance</article-title>. <source>Stat Med.</source> (<year>2007</year>) <volume>26</volume>:<fpage>1824</fpage>&#x02013;<lpage>33</lpage>. <pub-id pub-id-type="doi">10.1002/sim.2818</pub-id><pub-id pub-id-type="pmid">17216592</pub-id></citation></ref>
<ref id="B6">
<label>6.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jung</surname> <given-names>I</given-names></name> <name><surname>Kulldorff</surname> <given-names>M</given-names></name> <name><surname>Klassen</surname> <given-names>A</given-names></name></person-group>. <article-title>A spatial scan statistic for ordinal data</article-title>. <source>Stat Med.</source> (<year>2007</year>) <volume>26</volume>:<fpage>1594</fpage>&#x02013;<lpage>607</lpage>. <pub-id pub-id-type="doi">10.1002/sim.2607</pub-id><pub-id pub-id-type="pmid">28753674</pub-id></citation></ref>
<ref id="B7">
<label>7.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jung</surname> <given-names>I</given-names></name> <name><surname>Kulldorff</surname> <given-names>M</given-names></name> <name><surname>Richard</surname> <given-names>OJ</given-names></name></person-group>. <article-title>A spatial scan statistic for multinomial data</article-title>. <source>Stat Med.</source> (<year>2010</year>) <volume>29</volume>:<fpage>1910</fpage>&#x02013;<lpage>8</lpage>. <pub-id pub-id-type="doi">10.1002/sim.3951</pub-id><pub-id pub-id-type="pmid">20680984</pub-id></citation></ref>
<ref id="B8">
<label>8.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname> <given-names>L</given-names></name> <name><surname>Kulldorff</surname> <given-names>M</given-names></name> <name><surname>Gregorio</surname> <given-names>D</given-names></name></person-group>. <article-title>A spatial scan statistic for survival data</article-title>. <source>Biometrics.</source> (<year>2007</year>) <volume>63</volume>:<fpage>109</fpage>&#x02013;<lpage>18</lpage>. <pub-id pub-id-type="doi">10.1111/j.1541-0420.2006.00661.x</pub-id><pub-id pub-id-type="pmid">34238302</pub-id></citation></ref>
<ref id="B9">
<label>9.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname> <given-names>L</given-names></name> <name><surname>Tiwari</surname> <given-names>RC</given-names></name> <name><surname>Zou</surname> <given-names>Z</given-names></name> <name><surname>Kulldorff</surname> <given-names>M</given-names></name> <name><surname>Feuer</surname> <given-names>EJ</given-names></name></person-group>. <article-title>Weighted normal spatial scan statistic for heterogeneous population data</article-title>. <source>J Am Stat Assoc.</source> (<year>2009</year>) <volume>104</volume>:<fpage>886</fpage>&#x02013;<lpage>98</lpage>. <pub-id pub-id-type="doi">10.1198/jasa.2009.ap07613</pub-id></citation></ref>
<ref id="B10">
<label>10.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wieckowska</surname> <given-names>B</given-names></name> <name><surname>G&#x000F3;rna</surname> <given-names>I</given-names></name> <name><surname>Trojanowski</surname> <given-names>M</given-names></name> <name><surname>Pruciak</surname> <given-names>A</given-names></name> <name><surname>Stawi&#x00144;ska-Witoszy&#x00144;ska</surname> <given-names>B</given-names></name></person-group>. <article-title>Searching for space-time clusters: the CutL method compared to Kulldorff&#x00027;s scan statistic</article-title>. <source>Geospatial Health.</source> (<year>2019</year>) <volume>14</volume>:<fpage>314</fpage>&#x02013;<lpage>20</lpage>. <pub-id pub-id-type="doi">10.4081/gh.2019.791</pub-id><pub-id pub-id-type="pmid">31724381</pub-id></citation></ref>
<ref id="B11">
<label>11.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>D</given-names></name> <name><surname>Fang</surname> <given-names>Z</given-names></name> <name><surname>Yu</surname> <given-names>Y</given-names></name></person-group>. <article-title>Scan statistic: a new method to detect the persistence of fund performance</article-title>. <source>Operat Res Manag.</source> (<year>2006</year>) <volume>15</volume>:<fpage>82</fpage>&#x02013;<lpage>7</lpage>. <pub-id pub-id-type="doi">10.3969/j.issn.1007-3221.2006.01.018</pub-id></citation></ref>
<ref id="B12">
<label>12.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Yin</surname> <given-names>F</given-names></name></person-group>. <source>Application of Spatio-Temporal Scan Statistic in Early Warning of Infectious Diseases</source>. <publisher-loc>Chengdu</publisher-loc>: <publisher-name>Sichuan University</publisher-name> (<year>2007</year>).</citation></ref>
<ref id="B13">
<label>13.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ma</surname> <given-names>Y</given-names></name> <name><surname>Li</surname> <given-names>X</given-names></name> <name><surname>Zhang</surname> <given-names>Y</given-names></name></person-group>. <article-title>Spatial scale selection of scan statistic in infectious disease surveillance</article-title>. <source>Modern Prevent Med.</source> (<year>2011</year>) <volume>38</volume>:<fpage>1601</fpage>&#x02013;<lpage>4</lpage>.</citation></ref>
<ref id="B14">
<label>14.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hohl</surname> <given-names>A</given-names></name> <name><surname>Delmelle</surname> <given-names>EM</given-names></name> <name><surname>Desjardins</surname> <given-names>MR</given-names></name> <name><surname>Lan</surname> <given-names>Y</given-names></name></person-group>. <article-title>Daily surveillance of COVID-19 using the prospective space-time scan statistic in the United States</article-title>. <source>Spat Spatiotemporal Epidemiol.</source> (<year>2020</year>) <volume>34</volume>:<fpage>100354</fpage>. <pub-id pub-id-type="doi">10.1016/j.sste.2020.100354</pub-id><pub-id pub-id-type="pmid">32807396</pub-id></citation></ref>
<ref id="B15">
<label>15.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xu</surname> <given-names>FY</given-names></name> <name><surname>Beard</surname> <given-names>K</given-names></name></person-group>. <article-title>A comparison of prospective space-time scan statistic and spatiotemporal event sequence based clustering for COVID-19 surveillance</article-title>. <source>PLoS ONE.</source> (<year>2021</year>) <volume>16</volume>:<fpage>e0252990</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0252990</pub-id><pub-id pub-id-type="pmid">34111199</pub-id></citation></ref>
<ref id="B16">
<label>16.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hohl</surname> <given-names>A</given-names></name> <name><surname>Delmelle</surname> <given-names>E</given-names></name> <name><surname>Desjardins</surname> <given-names>M</given-names></name></person-group>. <article-title>Rapid detection of COVID-19 clusters in the United States using a prospective space-time scan statistic: an update</article-title>. <source>SIGSPATIAL Special.</source> (<year>2020</year>) <volume>12</volume>:<fpage>27</fpage>&#x02013;<lpage>33</lpage>. <pub-id pub-id-type="doi">10.1145/3404111.3404116</pub-id></citation></ref>
<ref id="B17">
<label>17.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pei</surname> <given-names>T</given-names></name> <name><surname>Wang</surname> <given-names>X</given-names></name> <name><surname>Song</surname> <given-names>C</given-names></name> <name><surname>Liu</surname> <given-names>Y</given-names></name> <name><surname>Huang</surname> <given-names>Q</given-names></name> <name><surname>Shu</surname> <given-names>H</given-names></name> <etal/></person-group>. <article-title>Research progress on spatiotemporal analysis and modeling of COVID-19 epidemic</article-title>. <source>J Geo-Inform Sci</source>. (<year>2021</year>) <volume>23</volume>:<fpage>188</fpage>&#x02013;<lpage>210</lpage>. <pub-id pub-id-type="doi">10.12082/dqxxkx.2021.200434</pub-id><pub-id pub-id-type="pmid">35457742</pub-id></citation></ref>
<ref id="B18">
<label>18.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tang</surname> <given-names>X</given-names></name> <name><surname>Zhou</surname> <given-names>H</given-names></name></person-group>. <article-title>Scanning statistics and its application in epidemiology</article-title>. <source>China Health Stat.</source> (<year>2011</year>) <volume>28</volume>:<fpage>332</fpage>&#x02013;<lpage>7</lpage>. <pub-id pub-id-type="doi">10.3969/j.issn.1002-3674.2011.03.042</pub-id></citation></ref>
<ref id="B19">
<label>19.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>L</given-names></name></person-group>. <source>The Distribution Characteristics of COVID-19 in Wenzhou City Based on Scan Statistic</source>. <publisher-loc>Hangzhou</publisher-loc>: <publisher-name>Hangzhou Dianzi University</publisher-name> (<year>2020</year>).</citation></ref>
</ref-list>
</back>
</article>