<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Phys.</journal-id>
<journal-title>Frontiers in Physics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Phys.</abbrev-journal-title>
<issn pub-type="epub">2296-424X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">1061580</article-id>
<article-id pub-id-type="doi">10.3389/fphy.2023.1061580</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Physics</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Efficient solutions of fermionic systems using artificial neural networks</article-title>
<alt-title alt-title-type="left-running-head">Nordhagen et al.</alt-title>
<alt-title alt-title-type="right-running-head">
<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.3389/fphy.2023.1061580">10.3389/fphy.2023.1061580</ext-link>
</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Nordhagen</surname>
<given-names>Even M.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/2038523/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kim</surname>
<given-names>Jane M.</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/2265310/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Fore</surname>
<given-names>Bryce</given-names>
</name>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Lovato</surname>
<given-names>Alessandro</given-names>
</name>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Hjorth-Jensen</surname>
<given-names>Morten</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="aff" rid="aff4">
<sup>4</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1653258/overview"/>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>Department of Physics and Njord Center</institution>, <institution>University of Oslo</institution>, <addr-line>Oslo</addr-line>, <country>Norway</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>Department of Physics and Astronomy and Facility for Rare Isotope Beams</institution>, <institution>Michigan State University</institution>, <addr-line>East Lansing</addr-line>, <addr-line>MI</addr-line>, <country>United States</country>
</aff>
<aff id="aff3">
<sup>3</sup>
<institution>Physics Division</institution>, <institution>Argonne National Laboratory</institution>, <addr-line>Lemont</addr-line>, <addr-line>IL</addr-line>, <country>United States</country>
</aff>
<aff id="aff4">
<sup>4</sup>
<institution>Department of Physics and Center for Computing in Science Education</institution>, <institution>University of Oslo</institution>, <addr-line>Oslo</addr-line>, <country>Norway</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/79766/overview">Are Magnus Bruaset</ext-link>, Simula Research Laboratory, Norway</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/460864/overview">Tarc&#xed;sio Marciano Rocha Filho</ext-link>, University of Brasilia, Brazil</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1487901/overview">Mark Parsons</ext-link>, University of Edinburgh, United Kingdom</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Morten Hjorth-Jensen, <email>hjensen@msu.edu</email>
</corresp>
</author-notes>
<pub-date pub-type="epub">
<day>26</day>
<month>06</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>11</volume>
<elocation-id>1061580</elocation-id>
<history>
<date date-type="received">
<day>04</day>
<month>10</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>25</day>
<month>04</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2023 Nordhagen, Kim, Fore, Lovato and Hjorth-Jensen.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Nordhagen, Kim, Fore, Lovato and Hjorth-Jensen</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>In this study, we explore the similarities and differences between variational Monte Carlo techniques that employ conventional and artificial neural network representations of the ground-state wave function for fermionic systems. Our primary focus is on shallow neural network architectures, specifically the restricted Boltzmann machine, and we examine unsupervised learning algorithms that are appropriate for modeling complex many-body correlations. We assess the advantages and drawbacks of conventional and neural network wave functions by applying them to a range of circular quantum dot systems. Our findings, which include results for systems containing up to 90 electrons, emphasize the efficient implementation of these methods on both homogeneous and heterogeneous high-performance computing facilities.</p>
</abstract>
<kwd-group>
<kwd>quantum dots</kwd>
<kwd>many-body physics</kwd>
<kwd>Monte Carlo methods</kwd>
<kwd>machine learning</kwd>
<kwd>Boltzmann machines</kwd>
</kwd-group>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-at-acceptance</meta-name>
<meta-value>Statistical and Computational Physics</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1 Introduction</title>
<p>Solving the Schr&#xf6;dinger equation for systems of many interacting bosons or fermions is classified as an NP-hard problem due to the complexity of the required many-dimensional wave function, resulting in an exponential growth of degrees of freedom. Reducing the dimensionality of quantum mechanical many-body systems is an important aspect of modern physics, ranging from the development of efficient algorithms for studying many-body systems to exploiting the increase in computing power. Writing software that can fully utilize the available resources has long been recognized as an important aspect of these endeavors. Despite tremendous progress in this direction, traditional many-particle methods, whether quantum mechanical or classical, face severe dimensionality problems when applied to systems with many interacting particles.</p>
<p>Over the last two decades, quantum computing and machine learning have emerged as some of the most promising approaches for studying complex physical systems where several length and energy scales are involved. Machine learning techniques, and in particular neural-network quantum states Goodfellow et al. [<xref ref-type="bibr" rid="B1">1</xref>], have recently been applied to studies of many-body systems in various fields of physics and quantum chemistry, with very promising results; see, for example, Refs. Carleo and Troyer [<xref ref-type="bibr" rid="B2">2</xref>]; Carrasquilla and Torlai [<xref ref-type="bibr" rid="B3">3</xref>]; Pfau et al. [<xref ref-type="bibr" rid="B4">4</xref>]; Calcavecchia et al. [<xref ref-type="bibr" rid="B5">5</xref>]; Carleo et al. [<xref ref-type="bibr" rid="B6">6</xref>]; Boehnlein et al. [<xref ref-type="bibr" rid="B7">7</xref>]; Adams et al. [<xref ref-type="bibr" rid="B8">8</xref>]; Lovato et al. [<xref ref-type="bibr" rid="B9">9</xref>]. In many of these studies, the results agree well with exact analytical solutions or with state-of-the-art quantum Monte Carlo calculations.</p>
<p>The variational and diffusion Monte Carlo algorithms are among the most popular and successful methods available for ground-state studies of quantum mechanical systems. They both rely on a suitable ansatz for the ground-state of the system, often dubbed the <italic>trial wave function</italic>, which is defined in terms of a set of variational parameters whose optimal values are found by minimizing the total energy of the system. Devising flexible and accurate functional forms for the trial wave functions requires prior knowledge and physical intuition about the system under investigation. However, for many systems we do not have this intuition, and as a result it is often difficult to define a good ansatz for the state function.</p>
<p>According to the universal approximation theorem, a deep feedforward neural network can represent any continuous function within a certain error Hornik et al. [<xref ref-type="bibr" rid="B10">10</xref>]; see also Refs. Murphy [<xref ref-type="bibr" rid="B11">11</xref>]; Hastie et al. [<xref ref-type="bibr" rid="B12">12</xref>]; Bishop [<xref ref-type="bibr" rid="B13">13</xref>]; Goodfellow et al. [<xref ref-type="bibr" rid="B1">1</xref>] for further discussions of deep learning methods. Since the variational state can in principle take any functional form, it is natural to replace the trial wave function with a neural network and treat the task as a machine learning problem. This approach has been successfully implemented in recent works, see, for example, Refs. Pfau et al. [<xref ref-type="bibr" rid="B4">4</xref>]; Carleo and Troyer [<xref ref-type="bibr" rid="B2">2</xref>]; Cassella et al. [<xref ref-type="bibr" rid="B14">14</xref>]; Adams et al. [<xref ref-type="bibr" rid="B8">8</xref>]; Lovato et al. [<xref ref-type="bibr" rid="B9">9</xref>], and forms the motivation for the present study. Here, the neural network of choice is derived from so-called Gaussian-binary restricted Boltzmann machines, inspired by the recent contributions by Carleo <italic>et al.</italic>, see, for example, Refs. Carleo and Troyer [<xref ref-type="bibr" rid="B2">2</xref>]; Carleo et al. [<xref ref-type="bibr" rid="B6">6</xref>]. Unlike binary-binary restricted Boltzmann machines Sieber and Gehringer [<xref ref-type="bibr" rid="B15">15</xref>], the approximation properties of Gaussian-binary restricted Boltzmann machines are not yet well understood. However, they are considered highly expressive networks for certain problems, such as dimensionality reduction and continuous probability density estimation.
Note that neural-network representations of variational states are more general, as they do not, in principle, require prior knowledge of the ground-state wave function, thereby opening the door to systems that have yet to be solved. Particular attention, however, must be devoted to the symmetries of the problem, whose inclusion is critical to achieving accurate results.</p>
<p>In this work, we focus on systems of electrons confined to move in two-dimensional harmonic oscillator potentials, so-called quantum dots. These systems of strongly confined electrons offer a wide variety of complex and subtle phenomena which pose severe challenges to existing many-body methods. Due to their small size, quantum dots are characterized by discrete quantum levels. For instance, the ground states of circular dots show shell structures and magic numbers similar to those seen in atoms and nuclei. These structures are particularly evident in measurements of the change in electrochemical potential due to the addition of one extra electron. Here, these systems serve as our test of the applicability of restricted Boltzmann machines as artificial neural network variational states.</p>
<p>The theoretical foundation and the methodology are explained in <xref ref-type="sec" rid="s2">Section 2</xref>. The subsequent sections present our results with an analysis of computational methods and resources. In the last section we present our conclusions and perspectives for future work.</p>
</sec>
<sec sec-type="materials|methods" id="s2">
<title>2 Materials and methods</title>
<p>For any Hamiltonian <inline-formula id="inf1">
<mml:math id="m1">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="script">H</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">&#x302;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula> and trial wave function <italic>&#x3c8;</italic>
<sub>
<italic>T</italic>
</sub>, the variational principle guarantees that the expectation value of the energy <italic>E</italic>
<sub>
<italic>T</italic>
</sub> is greater than or equal to the true ground state energy <italic>E</italic>
<sub>0</sub>,<disp-formula id="e1">
<mml:math id="m2">
<mml:msub>
<mml:mrow>
<mml:mi>E</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2264;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>E</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mo stretchy="false">&#x2329;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="script">H</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">&#x302;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">&#x232a;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x2329;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">&#x232a;</mml:mo>
</mml:mrow>
</mml:mfrac>
<mml:mo>.</mml:mo>
</mml:math>
<label>(1)</label>
</disp-formula>Thus, approximate solutions to the time-independent Schr&#xf6;dinger equation can be obtained by choosing a careful parameterization of the wave function and minimizing the energy <italic>E</italic>
<sub>
<italic>T</italic>
</sub> with respect to the parameters. Since the integrals representing <italic>E</italic>
<sub>
<italic>T</italic>
</sub> are normally high dimensional, it is most efficient to evaluate them by means of Monte Carlo methods<disp-formula id="e2">
<mml:math id="m3">
<mml:msub>
<mml:mrow>
<mml:mi>E</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2248;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">&#x27e8;</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>E</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">&#x27e9;</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:msub>
<mml:mrow>
<mml:mi>E</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
<mml:mspace width="0.3333em"/>
<mml:mspace width="0.3333em"/>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x223c;</mml:mo>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">R</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">&#x7c;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>.</mml:mo>
</mml:math>
<label>(2)</label>
</disp-formula>This involves collecting <italic>n</italic> samples of configurations and averaging over the so-called local energies<disp-formula id="e3">
<mml:math id="m4">
<mml:msub>
<mml:mrow>
<mml:mi>E</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">R</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">R</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfrac>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="script">H</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">&#x302;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">R</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>.</mml:mo>
</mml:math>
<label>(3)</label>
</disp-formula>
</p>
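<p>As a rough illustration of Equations 2 and 3, the Python sketch below estimates <italic>E</italic><sub><italic>T</italic></sub> by Metropolis sampling of &#x7c;<italic>&#x3c8;</italic><sub><italic>T</italic></sub>&#x7c;<sup>2</sup> and averaging the local energy. It is not the implementation used in this work. It assumes a single non-interacting particle in a two-dimensional oscillator with Hamiltonian (&#x2212;&#x2207;<sup>2</sup> &#x2b; <italic>&#x3c9;</italic><sup>2</sup><italic>r</italic><sup>2</sup>)/2 and a Gaussian trial function exp(&#x2212;<italic>&#x3b1;&#x3c9;r</italic><sup>2</sup>), for which the local energy is 2<italic>&#x3b1;&#x3c9;</italic> &#x2b; <italic>&#x3c9;</italic><sup>2</sup><italic>r</italic><sup>2</sup>(1 &#x2212; 4<italic>&#x3b1;</italic><sup>2</sup>)/2. The function names, step size, and sample counts are illustrative choices.</p>
<preformat><![CDATA[
```python
import numpy as np


def local_energy(r, alpha, omega):
    """Local energy of psi = exp(-alpha*omega*r^2) for H = (-lap + omega^2 r^2)/2 in 2D."""
    a = alpha * omega
    r2 = np.sum(r**2)
    return 2.0 * a + 0.5 * r2 * (omega**2 - 4.0 * a**2)


def log_prob(r, alpha, omega):
    """Logarithm of the unnormalized sampling density |psi|^2."""
    return -2.0 * alpha * omega * np.sum(r**2)


def vmc_energy(alpha, omega=1.0, n_samples=20000, n_burn=2000, step=1.0, seed=0):
    """Average the local energy over Metropolis samples drawn from |psi|^2."""
    rng = np.random.default_rng(seed)
    r = rng.normal(size=2)  # arbitrary starting configuration
    logp = log_prob(r, alpha, omega)
    total = 0.0
    for it in range(n_burn + n_samples):
        # Symmetric uniform proposal around the current position
        trial = r + step * rng.uniform(-1.0, 1.0, size=2)
        logp_trial = log_prob(trial, alpha, omega)
        # Metropolis acceptance test carried out in log space
        if np.log(rng.uniform()) < logp_trial - logp:
            r, logp = trial, logp_trial
        # Accumulate only after the burn-in phase
        if it >= n_burn:
            total += local_energy(r, alpha, omega)
    return total / n_samples
```
]]></preformat>
<p>For <italic>&#x3b1;</italic> &#x3d; 1/2 this trial function is the exact non-interacting ground state, so every local energy equals <italic>&#x3c9;</italic> and the estimate has zero variance. For any other <italic>&#x3b1;</italic> the expected energy, <italic>&#x3c9;</italic>(<italic>&#x3b1;</italic> &#x2b; 1/(4<italic>&#x3b1;</italic>)), lies above <italic>&#x3c9;</italic>, in line with Equation 1.</p>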
<p>We apply the variational Monte Carlo (VMC) method to various circular quantum dot systems. These are systems of interacting electrons confined to move in a two-dimensional harmonic oscillator well. The (scaled)<xref ref-type="fn" rid="fn1">
<sup>1</sup>
</xref> Hamiltonian is given by<disp-formula id="e4">
<mml:math id="m5">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi mathvariant="script">H</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">&#x302;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:munder>
</mml:mstyle>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>&#x2207;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi>&#x3c9;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:msubsup>
<mml:mrow>
<mml:mi>r</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x2260;</mml:mo>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:munder>
</mml:mstyle>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>r</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
</mml:math>
<label>(4)</label>
</disp-formula>where <italic>&#x3c9;</italic> is the oscillator frequency, <italic>r</italic>
<sub>
<italic>i</italic>
</sub> is the distance between electron <italic>i</italic> and the origin, and <italic>r</italic>
<sub>
<italic>ij</italic>
</sub> is the distance between electrons <italic>i</italic> and <italic>j</italic>. We will henceforth assume the total number of electrons <italic>N</italic> to be even and the total spin of the system to be zero.</p>
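<p>To make the structure of Equation 4 concrete, the following Python sketch evaluates its potential-energy part for a given configuration. Because the overall factor 1/2 multiplies a sum over all ordered pairs <italic>j</italic> &#x2260; <italic>i</italic>, the repulsion reduces to a sum of 1/<italic>r</italic><sub><italic>ij</italic></sub> over distinct pairs. The function name and the array layout (one row of Cartesian coordinates per electron) are assumptions made for this illustration.</p>
<preformat><![CDATA[
```python
import numpy as np


def potential_energy(positions, omega):
    """Oscillator plus Coulomb potential of Eq. 4 for an (N, 2) array of positions."""
    n = positions.shape[0]
    # Confinement: (1/2) * omega^2 * sum_i r_i^2
    confinement = 0.5 * omega**2 * np.sum(positions**2)
    # Repulsion: (1/2) * sum_{i} sum_{j != i} 1/r_ij, i.e. each distinct pair once
    repulsion = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            repulsion += 1.0 / np.linalg.norm(positions[i] - positions[j])
    return confinement + repulsion
```
]]></preformat>
<p>For two electrons at (1, 0) and (&#x2212;1, 0) with <italic>&#x3c9;</italic> &#x3d; 1, the confinement term is 1 and the repulsion term is 1/2.</p>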
<p>A simple ansatz can be built starting from the analytical solutions to the non-interacting case. The harmonic oscillator eigenfunctions are given by<disp-formula id="e5">
<mml:math id="m6">
<mml:msub>
<mml:mrow>
<mml:mi>&#x3d5;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x221d;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi>e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>&#x3c9;</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mfenced>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:msub>
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:mi>&#x3c9;</mml:mi>
</mml:mrow>
</mml:msqrt>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:msub>
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:mi>&#x3c9;</mml:mi>
</mml:mrow>
</mml:msqrt>
<mml:mi>y</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
</mml:math>
<label>(5)</label>
</disp-formula>where <italic>H</italic>
<sub>
<italic>n</italic>
</sub> are the Hermite polynomials of degree <italic>n</italic>. To enforce the antisymmetry of the many-body wave function, products of the lowest <italic>N</italic>/2 spatial states and the two spin states <italic>&#x3be;</italic>
<sub>&#xb1;</sub>(<italic>&#x3c3;</italic>) are used as a basis for a Slater determinant<disp-formula id="equ1">
<mml:math id="m7">
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>SD</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">R</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>det</mml:mi>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:mfenced open="{" close="}">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3d5;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3be;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
</mml:math>
</disp-formula>where <italic>m</italic>, <italic>n</italic>, <italic>k</italic> label the single-particle state, <italic>i</italic> labels the particle, and <bold>
<italic>R</italic>
</bold> contains all coordinates of the <italic>N</italic> particles. As an aside, we do not include the spin projections <italic>&#x3c3;</italic>
<sub>
<italic>i</italic>
</sub> as explicit inputs to the wave function, since we describe how to treat them separately in <xref ref-type="sec" rid="s2-2">Section 2.2</xref>. We then define a reference state by pulling the common exponential term out of the determinant and inserting a single variational parameter <italic>&#x3b1;</italic>
<disp-formula id="e6">
<mml:math id="m8">
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>Ref</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">R</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi>&#x3b1;</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi>e</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>&#x3b1;</mml:mi>
<mml:mi>&#x3c9;</mml:mi>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:munder>
</mml:mstyle>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2b;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:msup>
<mml:mi>det</mml:mi>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:mfenced open="{" close="}">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:mi>&#x3c9;</mml:mi>
</mml:mrow>
</mml:msqrt>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:msub>
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:mi>&#x3c9;</mml:mi>
</mml:mrow>
</mml:msqrt>
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3be;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
<mml:mo>.</mml:mo>
</mml:math>
<label>(6)</label>
</disp-formula>Correlations among electrons can be handled by a Pad&#xe9;-Jastrow factor Drummond et al. [<xref ref-type="bibr" rid="B16">16</xref>],<disp-formula id="e7">
<mml:math id="m9">
<mml:mi>g</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">R</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi>&#x3b2;</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>exp</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3e;</mml:mo>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi>r</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>&#x3b2;</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>r</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
</mml:math>
<label>(7)</label>
</disp-formula>where <italic>&#x3b2;</italic> is a variational parameter and<disp-formula id="equ2">
<mml:math id="m10">
<mml:msub>
<mml:mrow>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfenced open="{" close="">
<mml:mrow>
<mml:mtable class="cases">
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mn>1</mml:mn>
<mml:mo>/</mml:mo>
<mml:mn>3</mml:mn>
<mml:mspace width="1em"/>
<mml:mspace width="1em"/>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mtext>if&#x2009;</mml:mtext>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mn>1</mml:mn>
<mml:mspace width="1em"/>
<mml:mspace width="1em"/>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mtext>if&#x2009;</mml:mtext>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2260;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
</mml:math>
</disp-formula>in order for the Kato cusp condition to be satisfied Huang et al. [<xref ref-type="bibr" rid="B17">17</xref>]. The product of the Slater determinant and the Pad&#xe9;-Jastrow factor is commonly named the Slater-Jastrow ansatz,<disp-formula id="e8">
<mml:math id="m11">
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>Slater</mml:mtext>
<mml:mo>-</mml:mo>
<mml:mtext>Jastrow</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">R</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi>&#x3b1;</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>&#x3b2;</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>Ref</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">R</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi>&#x3b1;</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>g</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">R</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi>&#x3b2;</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>.</mml:mo>
</mml:math>
<label>(8)</label>
</disp-formula>
</p>
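<p>The Pad&#xe9;-Jastrow factor of Equation 7 can be sketched in a few lines of Python. The example below returns the logarithm of <italic>g</italic>, which is usually the more convenient quantity numerically. It assumes spin projections encoded as &#xb1;1 and positions stored as one row per electron. The function name and this data layout are illustrative and are not taken from the code used in this work.</p>
<preformat><![CDATA[
```python
import numpy as np


def pade_jastrow_log(positions, spins, beta):
    """Logarithm of the Pade-Jastrow factor g(R; beta) of Eq. 7."""
    n = positions.shape[0]
    log_g = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            r_ij = np.linalg.norm(positions[i] - positions[j])
            # Cusp coefficients in two dimensions: 1/3 for equal spins, 1 otherwise
            a_ij = 1.0 / 3.0 if spins[i] == spins[j] else 1.0
            log_g += a_ij * r_ij / (1.0 + beta * r_ij)
    return log_g
```
]]></preformat>
<p>For two electrons a distance 5 apart with <italic>&#x3b2;</italic> &#x3d; 1, the logarithm of <italic>g</italic> is 5/6 for opposite spins and 5/18 for equal spins.</p>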
<sec id="s2-1">
<title>2.1 Gaussian-binary restricted Boltzmann machine</title>
<p>There are many possible choices for a machine-learning-inspired wave function, but artificial neural networks have recently been demonstrated to give remarkably good results in studies of quantum mechanical many-body systems Cassella et al. [<xref ref-type="bibr" rid="B14">14</xref>]; Adams et al. [<xref ref-type="bibr" rid="B8">8</xref>]; Lovato et al. [<xref ref-type="bibr" rid="B9">9</xref>]; Fore et al. [<xref ref-type="bibr" rid="B18">18</xref>]; Rigo et al. [<xref ref-type="bibr" rid="B19">19</xref>]. Inspired by Ref. Carleo and Troyer [<xref ref-type="bibr" rid="B2">2</xref>], our choice is to start from a restricted Boltzmann machine (RBM) configured for continuous inputs, illustrated in <xref ref-type="fig" rid="F1">Figure 1</xref>. The inputs <inline-formula id="inf2">
<mml:math id="m12">
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:msup>
</mml:math>
</inline-formula> are the flattened particle positions, and the interactions between the particles are mediated by <italic>H</italic> hidden binary nodes. After summing over all possible values of the hidden nodes, the marginal distribution of the inputs to the Gaussian-binary RBM takes the form<disp-formula id="e9">
<mml:math id="m13">
<mml:mi>P</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">R</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi mathvariant="bold-italic">a</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">b</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">w</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>exp</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:mfrac>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:msubsup>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#xd7;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x220f;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>exp</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>b</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi>w</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
<mml:mo>.</mml:mo>
</mml:math>
<label>(9)</label>
</disp-formula>
</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>Architecture of a restricted Boltzmann machine. Inter-layer connections between the visible and the hidden layer are represented by the solid lines, where, for instance, the line connecting <italic>x</italic>
<sub>1</sub> to <italic>h</italic>
<sub>1</sub> represents the weight <italic>w</italic>
<sub>11</sub>. The dotted lines represent the visible biases, where the line going from the bias unit to the visible unit <italic>x</italic>
<sub>3</sub> represents the bias weight <italic>a</italic>
<sub>3</sub>. The dashed lines represent the hidden biases, where the line going from the bias unit to the hidden unit <italic>h</italic>
<sub>3</sub> represents the bias weight <italic>b</italic>
<sub>3</sub>.</p>
</caption>
<graphic xlink:href="fphy-11-1061580-g001.tif"/>
</fig>
<p>Here, <inline-formula id="inf3">
<mml:math id="m14">
<mml:mi mathvariant="bold-italic">a</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:msup>
</mml:math>
</inline-formula> and <inline-formula id="inf4">
<mml:math id="m15">
<mml:mi mathvariant="bold-italic">b</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
</mml:msup>
</mml:math>
</inline-formula> are the bias parameters of the input and hidden nodes, respectively. The weights between the input and hidden nodes are <inline-formula id="inf5">
<mml:math id="m16">
<mml:mi mathvariant="bold-italic">w</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi>N</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>H</mml:mi>
</mml:mrow>
</mml:msup>
</mml:math>
</inline-formula>, while <inline-formula id="inf6">
<mml:math id="m17">
<mml:mi mathvariant="bold-italic">&#x3c3;</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:msup>
</mml:math>
</inline-formula> are the widths of the Gaussian input nodes (not to be confused with the spin projections). It is possible to train these widths by reparameterizing them as <italic>&#x3c3;</italic>
<sub>
<italic>i</italic>
</sub> &#x3d; exp(<italic>s</italic>
<sub>
<italic>i</italic>
</sub>), but in this work all of the widths were fixed to <inline-formula id="inf7">
<mml:math id="m18">
<mml:mi>&#x3c3;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>/</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:mi>&#x3c9;</mml:mi>
</mml:mrow>
</mml:msqrt>
</mml:math>
</inline-formula>, so that only the biases and weights are treated as variational parameters. See <xref ref-type="app" rid="app2">Appendix B</xref> for the derivation of the marginal probability.</p>
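<p>As a minimal sketch of how Eq. <xref ref-type="disp-formula" rid="e9">9</xref> can be evaluated, the following pure-Python function returns the logarithm of the marginal distribution for a single squared width shared by all inputs. The function name <monospace>log_rbm_marginal</monospace>, the argument layout and the numerically stable evaluation of the hidden-node terms are illustrative assumptions and are not taken from the solver used in this work.</p>
<preformat><![CDATA[
```python
import math


def log_rbm_marginal(x, a, b, w, sigma2):
    """Return ln P(x; a, b, w) of Eq. (9) for one shared squared width sigma2.

    x, a : sequences of length 2N (inputs and visible biases)
    b    : sequence of length H (hidden biases)
    w    : 2N rows of H weights; w[i][j] couples input i to hidden node j
    (hypothetical layout, chosen for illustration only)
    """
    # Gaussian part: -sum_i (x_i - a_i)^2 / (2 sigma^2)
    gauss = -sum((xi - ai) ** 2 for xi, ai in zip(x, a)) / (2.0 * sigma2)
    # Hidden part: sum_j ln(1 + exp(b_j + sum_i x_i w_ij / sigma^2))
    hidden = 0.0
    for j, bj in enumerate(b):
        v = bj + sum(xi * wi[j] for xi, wi in zip(x, w)) / sigma2
        # Stable softplus: ln(1 + e^v) = max(v, 0) + ln(1 + e^(-|v|))
        hidden += max(v, 0.0) + math.log1p(math.exp(-abs(v)))
    return gauss + hidden
```
]]></preformat>
<p>With all weights and hidden biases set to zero, the result reduces to the Gaussian exponent plus <italic>H</italic> ln 2, which provides a simple consistency check for an implementation of this kind.</p>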
<p>Notice how the marginal distribution in Eq. <xref ref-type="disp-formula" rid="e9">9</xref> mimics the Gaussian parts of the ansatzes in Eqs <xref ref-type="disp-formula" rid="e6">6</xref> and <xref ref-type="disp-formula" rid="e8">8</xref>. Based on this observation, we construct two corresponding ansatzes<disp-formula id="e10">
<mml:math id="m19">
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>RBM</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">R</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi mathvariant="bold-italic">a</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">b</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">w</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>P</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">R</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi mathvariant="bold-italic">a</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">b</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">w</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>det</mml:mi>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:mfenced open="{" close="}">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:mi>&#x3c9;</mml:mi>
</mml:mrow>
</mml:msqrt>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:msub>
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:mi>&#x3c9;</mml:mi>
</mml:mrow>
</mml:msqrt>
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3be;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
</mml:math>
<label>(10)</label>
</disp-formula>and<disp-formula id="e11">
<mml:math id="m20">
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>RBM</mml:mtext>
<mml:mo>&#x2b;</mml:mo>
<mml:mtext>PJ</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">R</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi mathvariant="bold-italic">a</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">b</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">w</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>&#x3b2;</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>P</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">R</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi mathvariant="bold-italic">a</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">b</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">w</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>g</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">R</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi>&#x3b2;</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>det</mml:mi>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:mfenced open="{" close="}">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:mi>&#x3c9;</mml:mi>
</mml:mrow>
</mml:msqrt>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:msub>
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:mi>&#x3c9;</mml:mi>
</mml:mrow>
</mml:msqrt>
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3be;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
<mml:mo>.</mml:mo>
</mml:math>
<label>(11)</label>
</disp-formula>
</p>
<p>The two trial wave functions above incorporate different levels of physical intuition. While <italic>&#x3c8;</italic>
<sub>RBM</sub> does not contain specific information about the electron-electron interactions, <italic>&#x3c8;</italic>
<sub>RBM&#x2b;PJ</sub> contains a correlation factor that explicitly enforces the cusp condition. Both ansatzes encode the required antisymmetry, and the Gaussians in the marginal distribution help localize the wave functions so that they satisfy the boundary conditions far from the oscillator well. Also, because the marginal distribution is strictly positive, these ansatzes will never collapse into the bosonic state even if the marginal distribution is not symmetric.</p>
</sec>
<sec id="s2-2">
<title>2.2 Code optimization</title>
<p>Parallel computing is an important part of our effort to develop an efficient VMC solver. However, increasing the available computational resources alone is often not sufficient. One should also develop algorithms that deliberately minimize the number of floating point operations, cache misses, and communication between parallel processes. Reducing the number of floating point operations is particularly important in Monte Carlo calculations such as those reported here, where we produce a large number of samples (typically close to a billion). To see this, consider the kinetic energy, which is usually one of the more computationally expensive terms of the Schr&#xf6;dinger equation to evaluate. It requires, among other things, the Laplacian of the wave function. The Laplacian term in the expression for the local energy can be written as<disp-formula id="equ3">
<mml:math id="m21">
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>&#x2207;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
<mml:mo>&#x3d;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>&#x2207;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2061;</mml:mo>
<mml:mi>ln</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x2207;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2061;</mml:mo>
<mml:mi>ln</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>.</mml:mo>
</mml:math>
</disp-formula>This way of writing the kinetic energy term is beneficial for two reasons. First, our trial wave function normally has an exponential form, which the logarithm removes. The same holds for many other trial wave function ansatzes. Second, this form allows the various factors of the trial wave function to be treated separately, thereby reducing the number of floating point operations: evaluating the left-hand side of the above equation directly requires more floating point operations than evaluating the first and second derivatives of the logarithm. Moreover, for trial wave functions built from exponential functions, working with the logarithm facilitates the use of automatic differentiation Neidinger [<xref ref-type="bibr" rid="B20">20</xref>]; Baydin et al. [<xref ref-type="bibr" rid="B21">21</xref>].</p>
<p>By writing the trial wave function as a product of various terms, here <italic>&#x3c8;</italic>
<sub>
<italic>T</italic>
</sub> &#x3d; <italic>&#x220f;</italic>
<sub>
<italic>j</italic>
</sub>
<italic>&#x3c8;</italic>
<sub>
<italic>j</italic>
</sub>, the kinetic energy terms from each particle <italic>i</italic> can be written as a sum of their corresponding Laplacians and gradients<disp-formula id="e12">
<mml:math id="m22">
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>&#x2207;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
<mml:mo>&#x3d;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:munder>
</mml:mstyle>
<mml:msubsup>
<mml:mrow>
<mml:mi>&#x2207;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2061;</mml:mo>
<mml:mi>ln</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:munder>
</mml:mstyle>
<mml:msub>
<mml:mrow>
<mml:mi>&#x2207;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2061;</mml:mo>
<mml:mi>ln</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>.</mml:mo>
</mml:math>
<label>(12)</label>
</disp-formula>
</p>
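<p>The bookkeeping implied by Eq. <xref ref-type="disp-formula" rid="e12">12</xref> can be sketched as follows: each factor of the trial wave function supplies the gradient and Laplacian of its own logarithm with respect to particle <italic>i</italic>, and the contributions are combined at the end. The function name <monospace>laplacian_over_psi</monospace> and the list-based data layout are hypothetical and serve only as an illustration.</p>
<preformat><![CDATA[
```python
def laplacian_over_psi(log_grads, log_laps):
    """Combine per-factor log-derivatives into (nabla_i^2 psi_T) / psi_T, Eq. (12).

    log_grads : one gradient vector of ln(psi_j) per factor j
    log_laps  : one Laplacian of ln(psi_j) per factor j
    (hypothetical layout, chosen for illustration only)
    """
    dim = len(log_grads[0])
    # Sum the gradients of all factors component by component
    total_grad = [sum(g[d] for g in log_grads) for d in range(dim)]
    # Sum of Laplacians plus squared norm of the summed gradient
    return sum(log_laps) + sum(c * c for c in total_grad)


# One-dimensional check with psi_1 = exp(-x^2 / 2) and psi_2 = exp(-x^2):
# the product is exp(-3 x^2 / 2), for which psi'' / psi = 9 x^2 - 3.
x = 0.5
value = laplacian_over_psi([[-x], [-2.0 * x]], [-1.0, -2.0])
```
]]></preformat>
<p>In this example the combined value agrees with the analytical expression 9<italic>x</italic>
<sup>2</sup> &#x2212; 3 evaluated at <italic>x</italic> &#x3d; 0.5.</p>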
<p>Obtaining analytical expressions of the gradient and Laplacian for all the wave function elements is usually computationally advantageous. However, in many Monte Carlo studies they are instead evaluated using automatic differentiation Neidinger [<xref ref-type="bibr" rid="B20">20</xref>]; Baydin et al. [<xref ref-type="bibr" rid="B21">21</xref>], which is nowadays employed routinely in VMC calculations.</p>
<p>The computational complexity of calculating the determinant in the na&#xef;ve way is <inline-formula id="inf8">
<mml:math id="m23">
<mml:mi mathvariant="script">O</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>3</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>, where <italic>n</italic> is the dimension of the Slater matrix. This calls for a reduction of dimensionality as well as efficient evaluations of the Slater determinant. In this work we do not consider open systems and assume that all single-particle states up to the Fermi level are filled. We can then split the Slater determinant into a spin-up and a spin-down part Pfau et al. [<xref ref-type="bibr" rid="B4">4</xref>] without affecting the expectation value of the energy, that is<disp-formula id="equ4">
<mml:math id="m24">
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>RBM</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>det</mml:mi>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:mfenced open="{" close="}">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3d5;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">r</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>&#x2191;</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:mi>&#x3be;</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>&#x2191;</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>det</mml:mi>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:mfenced open="{" close="}">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3d5;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">r</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>&#x2193;</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:mi>&#x3be;</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>&#x2193;</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
<mml:mo>.</mml:mo>
</mml:math>
</disp-formula>
</p>
<p>The gradient and Laplacian of the logarithm of a determinant with respect to particle <italic>i</italic> are given by<disp-formula id="equ5">
<mml:math id="m25">
<mml:mtable class="aligned">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:msub>
<mml:mrow>
<mml:mi>&#x2207;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2061;</mml:mo>
<mml:mi>ln</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mi>det</mml:mi>
<mml:mo>&#x3d;</mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:munder>
</mml:mstyle>
<mml:msub>
<mml:mrow>
<mml:mi>&#x2207;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msubsup>
<mml:mrow>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="right">
<mml:msubsup>
<mml:mrow>
<mml:mi>&#x2207;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2061;</mml:mo>
<mml:mi>ln</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mi>det</mml:mi>
<mml:mo>&#x3d;</mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:munder>
</mml:mstyle>
<mml:msubsup>
<mml:mrow>
<mml:mi>&#x2207;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msubsup>
<mml:mrow>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula>where <italic>d</italic>
<sub>
<italic>ji</italic>
</sub> is element (<italic>j</italic>, <italic>i</italic>) of the matrix and <inline-formula id="inf9">
<mml:math id="m26">
<mml:msubsup>
<mml:mrow>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> is the corresponding element of the inverse matrix Hammond et al. [<xref ref-type="bibr" rid="B22">22</xref>]. In general, this requires inverting the matrix, which is costly. Fortunately, the cost can be reduced: if we move one particle at a time when sampling configurations, only the elements of one row (or, alternatively, one column) of the Slater matrix change. In this case, there is a simple relation between the old and the new inverse matrix<disp-formula id="equ6">
<mml:math id="m27">
<mml:msubsup>
<mml:mrow>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:mfenced open="{" close="">
<mml:mrow>
<mml:mtable class="cases">
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
<mml:msubsup>
<mml:mrow>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mspace width="1em"/>
<mml:mspace width="1em"/>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mtext>if&#x2009;</mml:mtext>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>i</mml:mi>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="left">
<mml:msubsup>
<mml:mrow>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
<mml:msubsup>
<mml:mrow>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mspace width="1em"/>
<mml:mspace width="1em"/>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mtext>if&#x2009;</mml:mtext>
<mml:mi>j</mml:mi>
<mml:mo>&#x2260;</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
</mml:math>
</disp-formula>such that the new inverse Slater matrix can be found with few operations when the previous one is known. Here, <italic>R</italic>
<sub>
<italic>i</italic>
</sub> is the ratio between the new and the old determinant and <italic>S</italic>
<sub>
<italic>ij</italic>
</sub> is the inner product between the new row <italic>i</italic> of the Slater matrix and column <italic>j</italic> of the old inverse matrix,<disp-formula id="equ7">
<mml:math id="m28">
<mml:mtable class="aligned">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:msub>
<mml:mrow>
<mml:mi>R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:munder>
</mml:mstyle>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msubsup>
<mml:mrow>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="right">
<mml:msub>
<mml:mrow>
<mml:mi>S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:munder>
</mml:mstyle>
<mml:msub>
<mml:mrow>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msubsup>
<mml:mrow>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula>By using these expressions, the full Slater matrix needs to be inverted only once per simulation. We also avoid spin flips in the simulations, and we limit ourselves in this work to systems where these approximations apply. For systems where all single-particle states up to the Fermi level are filled, the above serves as a useful approximation provided the Hamiltonian does not contain spin-dependent terms. Otherwise, every proposed move should include possible spin flips as well.</p>
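<p>The single-row update described above can be sketched in a few lines. The code below assumes that moving particle <italic>i</italic> replaces row <italic>i</italic> of the Slater matrix, and it returns both the determinant ratio and the updated inverse. The function name <monospace>update_inverse</monospace> and the nested-list representation are hypothetical; a production code would typically operate on contiguous arrays instead.</p>
<preformat><![CDATA[
```python
def update_inverse(inv_old, new_row, i):
    """Update the inverse Slater matrix after row i is replaced by new_row.

    inv_old : old inverse matrix as a list of n rows
    new_row : new row i of the Slater matrix (length n)
    Returns (ratio, inv_new), where ratio = det(new) / det(old).
    (hypothetical helper, for illustration only)
    """
    n = len(inv_old)
    # S_j: inner product of the new row with column j of the old inverse
    s = [sum(new_row[l] * inv_old[l][j] for l in range(n)) for j in range(n)]
    ratio = s[i]  # R_i is the j = i component of S
    inv_new = [[0.0] * n for _ in range(n)]
    for k in range(n):
        for j in range(n):
            if j == i:
                inv_new[k][j] = inv_old[k][i] / ratio
            else:
                inv_new[k][j] = inv_old[k][j] - s[j] / ratio * inv_old[k][i]
    return ratio, inv_new


# Example: D = [[1, 2], [3, 4]] has inverse [[-2, 1], [1.5, -0.5]].
# Replacing row 0 by [2, 1] gives D' = [[2, 1], [3, 4]].
ratio, inv_new = update_inverse([[-2.0, 1.0], [1.5, -0.5]], [2.0, 1.0], 0)
```
]]></preformat>
<p>Each update of this kind costs on the order of <italic>n</italic>
<sup>2</sup> operations, compared with the order <italic>n</italic>
<sup>3</sup> operations of a full inversion.</p>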
<p>Like most other Monte Carlo schemes, the algorithm can be split into smaller individual parts and run efficiently in parallel. For each optimization step, the system is sampled independently in several processes, and the results from all the processes are averaged before performing the parameter optimization. In this way, we achieve near-perfect parallelization with the Message Passing Interface (MPI). A flow chart of the simulation code can be found in <xref ref-type="fig" rid="F2">Figure 2</xref>. Notice that the parameters are updated with respect to the gradients of the expectation value of the local energy. These gradients are given by<disp-formula id="equ8">
<mml:math id="m29">
<mml:msub>
<mml:mrow>
<mml:mi>&#x2207;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">&#x2329;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>E</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">&#x232a;</mml:mo>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>2</mml:mn>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mo stretchy="false">&#x2329;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>E</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi>&#x2207;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2061;</mml:mo>
<mml:mi>ln</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">&#x3b8;</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo stretchy="false">&#x232a;</mml:mo>
<mml:mo>&#x2212;</mml:mo>
<mml:mo stretchy="false">&#x2329;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>E</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">&#x232a;</mml:mo>
<mml:mo stretchy="false">&#x2329;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>&#x2207;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2061;</mml:mo>
<mml:mi>ln</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">&#x3b8;</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo stretchy="false">&#x232a;</mml:mo>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
</mml:math>
</disp-formula>as discussed by Umrigar and Filippi [<xref ref-type="bibr" rid="B23">23</xref>], among others. When the simulations are run with a burn-in period<xref ref-type="fn" rid="fn2">
<sup>2</sup>
</xref>, each process should have a burn-in time equal to the burn-in time for a single process. The theoretical parallelization efficiency is then given by (<italic>t</italic>
<sub>burn-in</sub> &#x2b; <italic>t</italic>
<sub>sample</sub>)/(<italic>mt</italic>
<sub>burn-in</sub> &#x2b; <italic>t</italic>
<sub>sample</sub>) where <italic>m</italic> is the number of parallel processes, <italic>t</italic>
<sub>burn-in</sub> is the burn-in time and <italic>t</italic>
<sub>sample</sub> is the total sampling time. Additionally, the weight optimization cannot be parallelized, but its computational cost is negligible compared to the sampling. The communication cost can also be neglected, even with low communication speed. Markov-chain Monte Carlo simulations of the type discussed here are therefore rather simple to parallelize, with almost no cost from communication between processes.</p>
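<p>The two expressions above, the gradient of the expectation value of the local energy and the theoretical parallelization efficiency, can be illustrated with the following sketch. It assumes that the sampled local energies and the parameter derivatives of ln <italic>&#x3c8;</italic>
<sub>
<italic>T</italic>
</sub> have already been gathered into plain lists. The function names <monospace>energy_gradient</monospace> and <monospace>parallel_efficiency</monospace> are hypothetical, and no MPI communication is shown.</p>
<preformat><![CDATA[
```python
def energy_gradient(local_energies, log_derivs):
    """Estimate 2 * (<E_L dln(psi)> - <E_L> <dln(psi)>) for each parameter.

    local_energies : one local energy per sample
    log_derivs     : per sample, the derivatives of ln(psi_T) w.r.t. each parameter
    (hypothetical layout, chosen for illustration only)
    """
    n = len(local_energies)
    n_params = len(log_derivs[0])
    mean_e = sum(local_energies) / n
    grad = []
    for p in range(n_params):
        mean_d = sum(d[p] for d in log_derivs) / n
        mean_ed = sum(e * d[p] for e, d in zip(local_energies, log_derivs)) / n
        grad.append(2.0 * (mean_ed - mean_e * mean_d))
    return grad


def parallel_efficiency(t_burn_in, t_sample, m):
    """Theoretical efficiency (t_burn_in + t_sample) / (m * t_burn_in + t_sample)."""
    return (t_burn_in + t_sample) / (m * t_burn_in + t_sample)


# Two samples, one parameter: <E_L> = 2, <dln psi> = 0.5, <E_L dln psi> = 1.5
grad = energy_gradient([1.0, 3.0], [[0.0], [1.0]])
eff = parallel_efficiency(1.0, 99.0, 4)
```
]]></preformat>
<p>In a parallel run, each process would accumulate the three averages separately before they are combined, which is why only a few numbers per parameter need to be communicated.</p>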
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>Flow chart of the solver with emphasis on the parallel processing. The sampling is parallelized across <italic>m</italic> processes, where the trial wave function (WF) is broadcast from process 0. Then <italic>E</italic>
<sub>
<italic>L</italic>
</sub>, &#x2207;<sub>
<italic>&#x3b8;</italic>
</sub> ln&#x2009;<italic>&#x3c8;</italic>
<sub>
<italic>T</italic>
</sub> and <italic>E</italic>
<sub>
<italic>L</italic>
</sub> &#x22c5;&#x2207;<sub>
<italic>&#x3b8;</italic>
</sub>&#x2009;ln&#x2009;<italic>&#x3c8;</italic>
<sub>
<italic>T</italic>
</sub> are sampled independently on each process, and an average is taken after all processes are done with sampling.</p>
</caption>
<graphic xlink:href="fphy-11-1061580-g002.tif"/>
</fig>
</sec>
</sec>
<sec sec-type="results|discussion" id="s3">
<title>3 Results and discussions</title>
<p>In this section we compare the computational complexity, ground state energy, energy convergence, contribution from the different Hamiltonian terms and electron densities for various trial wave function ansatzes. The RBM ansatz consists of a Slater determinant with Hermite polynomials as the basis, multiplied with the RBM marginal distribution, as presented in Eq. <xref ref-type="disp-formula" rid="e10">10</xref>. We have not made any attempt to include optimized single-particle state functions through mean-field optimizations like Hartree-Fock theory. The second ansatz is the RBM ansatz with a correlation factor described in Eq. <xref ref-type="disp-formula" rid="e7">7</xref>, abbreviated RBM &#x2b; PJ. We have also included results obtained using a traditional Slater-Jastrow ansatz (Eq. <xref ref-type="disp-formula" rid="e8">8</xref>), and as a reference (denoted Ref in the tables) we use a plain Slater determinant (without a correlation factor, Eq. <xref ref-type="disp-formula" rid="e6">6</xref>). The diffusion Monte Carlo (DMC) results obtained by H&#xf8;gberget [<xref ref-type="bibr" rid="B24">24</xref>] are our main reference for the ground state energy. For quantum dots with two electrons we compare our results with the corresponding analytical ones from Taut [<xref ref-type="bibr" rid="B25">25</xref>].</p>
<p>
<xref ref-type="fig" rid="F3">Figure 3</xref> displays the average CPU time per iteration as a function of system size for the different wave function ansatzes. To obtain these estimates, we averaged the CPU time per iteration over 10<sup>3</sup> iterations. The RBM &#x2b; PJ and Slater-Jastrow ansatzes are the most computationally demanding due to the Pad&#xe9;-Jastrow factor, which depends on the relative distance between electrons and requires updating of a matrix containing electron-electron distances. Each process has its own memory, and the matrix is not shared across processes.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>CPU time per iteration as a function of the number of electrons <italic>N</italic> in a quantum dot. Here we have employed 10<sup>4</sup> Monte Carlo cycles per iteration (with 10<sup>3</sup> iterations in total to reach an acceptable statistical error). The number of hidden nodes in the Boltzmann machine was set to <italic>H</italic> &#x3d; 6.</p>
</caption>
<graphic xlink:href="fphy-11-1061580-g003.tif"/>
</fig>
<p>The impact of parallel processing on computational overhead is minimal compared to other aspects of the code, except for the burn-in period. In fact, the variation in CPU time as a result of noise is much more important than the variation due to parallel processing. Additional information on CPU time per iteration for different wave function ansatzes is presented in <xref ref-type="table" rid="T1">Table 1</xref>. The reported calculations were performed for 10<sup>4</sup> Monte Carlo cycles and an oscillator frequency of <italic>&#x3c9;</italic> &#x3d; 1.0, with similar results for other values of <italic>&#x3c9;</italic>. Each benchmark simulation was run on a single core without a burn-in period. Production runs require more cycles and consequently longer CPU times per iteration. For instance, for a system size of <italic>N</italic> &#x3d; 90 with 2<sup>20</sup> Monte Carlo cycles run on 30 nodes (960 threads), the RBM, RBM &#x2b; PJ, and Slater-Jastrow ansatzes required 0.55960, 2.7345, and 1.3001&#xa0;s per iteration, respectively, with approximately 50,000 iterations needed for convergence.</p>
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>CPU time per iteration in seconds for various system sizes and ansatzes. The table complements <xref ref-type="fig" rid="F3">Figure 3</xref>, and computational details are given in the main text.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="center">
<italic>N</italic>
</th>
<th align="left">RBM</th>
<th align="left">RBM &#x2b; PJ</th>
<th align="left">Slater-Jastrow</th>
<th align="left">Ref</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="center">2</td>
<td align="left">0.013517</td>
<td align="left">0.017304</td>
<td align="left">0.011518</td>
<td align="left">0.010029</td>
</tr>
<tr>
<td align="center">6</td>
<td align="left">0.027193</td>
<td align="left">0.040015</td>
<td align="left">0.028200</td>
<td align="left">0.022379</td>
</tr>
<tr>
<td align="center">12</td>
<td align="left">0.054788</td>
<td align="left">0.078638</td>
<td align="left">0.066240</td>
<td align="left">0.043121</td>
</tr>
<tr>
<td align="center">20</td>
<td align="left">0.10498</td>
<td align="left">0.17428</td>
<td align="left">0.14075</td>
<td align="left">0.075787</td>
</tr>
<tr>
<td align="center">30</td>
<td align="left">0.15861</td>
<td align="left">0.32107</td>
<td align="left">0.28436</td>
<td align="left">0.12629</td>
</tr>
<tr>
<td align="center">42</td>
<td align="left">0.24249</td>
<td align="left">0.55319</td>
<td align="left">0.52828</td>
<td align="left">0.19511</td>
</tr>
<tr>
<td align="center">56</td>
<td align="left">0.35651</td>
<td align="left">0.90180</td>
<td align="left">0.77236</td>
<td align="left">0.29539</td>
</tr>
<tr>
<td align="center">72</td>
<td align="left">0.45964</td>
<td align="left">1.4112</td>
<td align="left">1.3279</td>
<td align="left">0.43174</td>
</tr>
<tr>
<td align="center">90</td>
<td align="left">0.66245</td>
<td align="left">2.4634</td>
<td align="left">2.0244</td>
<td align="left">0.61491</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The ground state energies of two-dimensional quantum dots of various sizes and frequencies are presented in <xref ref-type="table" rid="T2">Table 2</xref>. The RBM &#x2b; PJ and Slater-Jastrow ansatzes provide the lowest ground state energies, as expected, since the correlation factor explicitly satisfies Kato&#x2019;s cusp conditions. The relative errors with respect to diffusion Monte Carlo (DMC) calculations tend to be less than 0.2% for both methods. The results show that the RBM &#x2b; PJ ansatz dominates for small quantum dots, but is outperformed by the Slater-Jastrow ansatz for larger systems. We suspect this is because the former ansatz is more complex and contains significantly more parameters than the latter, and therefore has a hard time finding the global minimum. The optimization could be improved with a more sophisticated optimization algorithm.</p>
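<p>The relative error with respect to DMC quoted above can be computed as in the following minimal sketch. The helper name <monospace>relative_error_percent</monospace> is hypothetical; the energies are three <italic>&#x3c9;</italic> &#x3d; 1.0 rows copied from Table 2 as examples, and the statistical uncertainties are ignored for simplicity.</p>
<preformat><![CDATA[
```python
def relative_error_percent(energy: float, reference: float) -> float:
    """Relative deviation of `energy` from `reference`, in percent."""
    return 100.0 * abs(energy - reference) / abs(reference)


# (N, omega) -> (RBM+PJ, Slater-Jastrow, DMC), in Hartree, copied from Table 2.
# Only three example rows are included; uncertainties are dropped.
ROWS = {
    (2, 1.0): (2.999587, 2.99936, 3.00000),
    (20, 1.0): (156.104, 155.8900, 155.8822),
    (42, 1.0): (543.746, 542.977, 542.9428),
}

if __name__ == "__main__":
    for (n, omega), (rbm_pj, slater_jastrow, dmc) in ROWS.items():
        print(
            n,
            omega,
            round(relative_error_percent(rbm_pj, dmc), 4),
            round(relative_error_percent(slater_jastrow, dmc), 4),
        )
```
]]></preformat>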
<table-wrap id="T2" position="float">
<label>TABLE 2</label>
<caption>
<p>The ground state energy of two-dimensional quantum dots with <italic>N</italic> electrons and frequency <italic>&#x3c9;</italic>. Other references include diffusion Monte Carlo results taken from H&#xf8;gberget [<xref ref-type="bibr" rid="B24">24</xref>] and semi-analytical results obtained by Taut [<xref ref-type="bibr" rid="B25">25</xref>]. The energy is given in Hartree units, and the numbers in parentheses are the statistical uncertainties in the last digit. Bold values correspond to the lowest ground-state energy obtained in this work. For abbreviations see the text.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="center">
<italic>N</italic> </th>
<th align="center">
<italic>&#x3c9;</italic>
</th>
<th align="right">RBM</th>
<th align="right">RBM &#x2b; PJ</th>
<th align="right">Slater-Jastrow</th>
<th align="right">Ref</th>
<th align="center">DMC (H&#xf8;gberget [<xref ref-type="bibr" rid="B24">24</xref>])</th>
<th align="center">Semi-analytical (Taut [<xref ref-type="bibr" rid="B25">25</xref>])</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="center">2</td>
<td align="center">0.1</td>
<td align="right">0.46774(5)</td>
<td align="right">
<bold>0.440975(8)</bold>
</td>
<td align="right">0.44129(1)</td>
<td align="right">0.5279(1)</td>
<td align="right">0.44079(1)</td>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">1/6</td>
<td align="right">0.70389(7)</td>
<td align="right">
<bold>0.666665(1)</bold>
</td>
<td align="right">0.66710(1)</td>
<td align="right">0.7693(1)</td>
<td align="left"/>
<td align="right">2/3</td>
</tr>
<tr>
<td align="left"/>
<td align="center">0.28</td>
<td align="right">1.07100(6)</td>
<td align="right">
<bold>1.021668(7)</bold>
</td>
<td align="right">1.02192(1)</td>
<td align="right">1.1388(1)</td>
<td align="right">1.02164(1)</td>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">0.5</td>
<td align="right">1.72343(7)</td>
<td align="right">1.659637(6)</td>
<td align="right">
<bold>1.65974(1)</bold>
</td>
<td align="right">1.7983(3)</td>
<td align="right">1.65977(1)</td>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">1.0</td>
<td align="right">3.0789(1)</td>
<td align="right">
<bold>2.999587(5)</bold>
</td>
<td align="right">2.99936(1)</td>
<td align="right">3.1484(3)</td>
<td align="right">3.00000(1)</td>
<td align="right">3.0</td>
</tr>
<tr>
<td align="center">6</td>
<td align="center">0.1</td>
<td align="right">3.6971(1)</td>
<td align="right">3.5700(2)</td>
<td align="right">
<bold>3.5695(1)</bold>
</td>
<td align="right">3.8552(5)</td>
<td align="right">3.55385(5)</td>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">0.28</td>
<td align="right">7.9318(3)</td>
<td align="right">
<bold>7.6203(2)</bold>
</td>
<td align="right">7.6219(1)</td>
<td align="right">8.0517(9)</td>
<td align="right">7.60019(6)</td>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">0.5</td>
<td align="right">12.2640(6)</td>
<td align="right">
<bold>11.80494(7)</bold>
</td>
<td align="right">11.8104(2)</td>
<td align="right">12.2799(9)</td>
<td align="right">11.78484(6)</td>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">1.0</td>
<td align="right">20.5635(6)</td>
<td align="right">
<bold>20.1773(1)</bold>
</td>
<td align="right">20.1918(2)</td>
<td align="right">20.697(1)</td>
<td align="right">20.15932(8)</td>
<td align="left"/>
</tr>
<tr>
<td align="center">12</td>
<td align="center">0.1</td>
<td align="right">12.6772(4)</td>
<td align="right">12.3416(4)</td>
<td align="right">
<bold>12.29962(9)</bold>
</td>
<td align="right">12.9742(9)</td>
<td align="right">12.26984(8)</td>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">0.28</td>
<td align="right">26.389(2)</td>
<td align="right">25.7266(2)</td>
<td align="right">
<bold>25.7049(4)</bold>
</td>
<td align="right">26.625(2)</td>
<td align="right">25.63577(9)</td>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">0.5</td>
<td align="right">40.375(1)</td>
<td align="right">
<bold>39.2348(2)</bold>
</td>
<td align="right">39.2421(5)</td>
<td align="right">40.227(2)</td>
<td align="right">39.1596(1)</td>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">1.0</td>
<td align="right">67.620(3)</td>
<td align="right">65.7911(7)</td>
<td align="right">
<bold>65.7026(4)</bold>
</td>
<td align="right">66.744(3)</td>
<td align="right">65.7001(1)</td>
<td align="left"/>
</tr>
<tr>
<td align="center">20</td>
<td align="center">0.1</td>
<td align="right">30.7906(8)</td>
<td align="right">30.1444(2)</td>
<td align="right">
<bold>30.0403(2)</bold>
</td>
<td align="right">31.253(2)</td>
<td align="right">29.9779(1)</td>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">0.28</td>
<td align="right">63.592(3)</td>
<td align="right">62.1445(5)</td>
<td align="right">
<bold>62.0755(7)</bold>
</td>
<td align="right">63.681(3)</td>
<td align="right">61.9268(1)</td>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">0.5</td>
<td align="right">96.356(5)</td>
<td align="right">94.101(1)</td>
<td align="right">
<bold>94.0433(9)</bold>
</td>
<td align="right">95.755(4)</td>
<td align="right">93.8752(1)</td>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">1.0</td>
<td align="right">159.428(3)</td>
<td align="right">156.104(1)</td>
<td align="right">
<bold>155.8900(4)</bold>
</td>
<td align="right">157.904(6)</td>
<td align="right">155.8822(1)</td>
<td align="left"/>
</tr>
<tr>
<td align="center">30</td>
<td align="center">0.1</td>
<td align="right">61.853(2)</td>
<td align="right">60.774(2)</td>
<td align="right">
<bold>60.585(1)</bold>
</td>
<td align="right">62.449(4)</td>
<td align="right">60.4205(2)</td>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">0.28</td>
<td align="right">126.891(8)</td>
<td align="right">124.437(2)</td>
<td align="right">
<bold>124.195(2)</bold>
</td>
<td align="right">126.717(5)</td>
<td align="right">123.9683(2)</td>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">0.5</td>
<td align="right">191.455(9)</td>
<td align="right">187.488(2)</td>
<td align="right">
<bold>187.325(3)</bold>
</td>
<td align="right">189.977(6)</td>
<td align="right">187.0426(2)</td>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">1.0</td>
<td align="right">315.364(8)</td>
<td align="right">308.989(2)</td>
<td align="right">
<bold>308.576(1)</bold>
</td>
<td align="right">311.70(2)</td>
<td align="right">308.5627(2)</td>
<td align="left"/>
</tr>
<tr>
<td align="center">42</td>
<td align="center">0.1</td>
<td align="right">109.767(7)</td>
<td align="right">108.128(2)</td>
<td align="right">
<bold>107.928(2)</bold>
</td>
<td align="right">110.630(7)</td>
<td align="right">107.6389(2)</td>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">0.28</td>
<td align="right">224.257(9)</td>
<td align="right">220.588(3)</td>
<td align="right">
<bold>220.224(2)</bold>
</td>
<td align="right">223.837(8)</td>
<td align="right">219.8426(2)</td>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">0.5</td>
<td align="right">337.43(1)</td>
<td align="right">331.410(3)</td>
<td align="right">
<bold>331.276(3)</bold>
</td>
<td align="right">335.18(1)</td>
<td align="right">330.6306(2)</td>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">1.0</td>
<td align="right">553.40(1)</td>
<td align="right">543.746(3)</td>
<td align="right">
<bold>542.977(2)</bold>
</td>
<td align="right">548.07(2)</td>
<td align="right">542.9428(8)</td>
<td align="left"/>
</tr>
<tr>
<td align="center">56</td>
<td align="center">0.1</td>
<td align="right">179.035(8)</td>
<td align="right">176.659(2)</td>
<td align="right">
<bold>176.221(1)</bold>
</td>
<td align="right">180.08(1)</td>
<td align="right">175.9553(7)</td>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">0.28</td>
<td align="right">364.52(2)</td>
<td align="right">359.456(6)</td>
<td align="right">
<bold>358.470(2)</bold>
</td>
<td align="right">363.81(1)</td>
<td align="right">358.145(2)</td>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">0.5</td>
<td align="right">547.20(3)</td>
<td align="right">538.666(5)</td>
<td align="right">
<bold>537.841(4)</bold>
</td>
<td align="right">544.12(3)</td>
<td align="right">537.353(2)</td>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">1.0</td>
<td align="right">894.12(2)</td>
<td align="right">881.010(5)</td>
<td align="right">
<bold>879.514(3)</bold>
</td>
<td align="right">887.20(5)</td>
<td align="right">879.3986(6)</td>
<td align="left"/>
</tr>
<tr>
<td align="center">72</td>
<td align="center">0.1</td>
<td align="right">274.12(1)</td>
<td align="right">270.870(3)</td>
<td align="right">
<bold>270.296(3)</bold>
</td>
<td align="right">275.34(2)</td>
<td align="left"/>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">0.28</td>
<td align="right">556.63(2)</td>
<td align="right">549.899(8)</td>
<td align="right">
<bold>548.315(4)</bold>
</td>
<td align="right">555.45(2)</td>
<td align="left"/>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">0.5</td>
<td align="right">833.85(3)</td>
<td align="right">822.78(2)</td>
<td align="right">
<bold>821.089(6)</bold>
</td>
<td align="right">829.31(3)</td>
<td align="left"/>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">1.0</td>
<td align="right">1355.37(2)</td>
<td align="right">1341.54(2)</td>
<td align="right">
<bold>1339.85(1)</bold>
</td>
<td align="right">1349.65(6)</td>
<td align="left"/>
<td align="left"/>
</tr>
<tr>
<td align="center">90</td>
<td align="center">0.1</td>
<td align="right">399.84(1)</td>
<td align="right">395.486(4)</td>
<td align="right">
<bold>394.621(4)</bold>
</td>
<td align="right">401.19(2)</td>
<td align="left"/>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">0.28</td>
<td align="right">809.99(2)</td>
<td align="right">800.504(6)</td>
<td align="right">
<bold>799.187(5)</bold>
</td>
<td align="right">808.35(2)</td>
<td align="left"/>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">0.5</td>
<td align="right">1211.92(5)</td>
<td align="right">1198.12(3)</td>
<td align="right">
<bold>1195.025(9)</bold>
</td>
<td align="right">1205.65(4)</td>
<td align="left"/>
<td align="left"/>
</tr>
<tr>
<td align="left"/>
<td align="center">1.0</td>
<td align="right">1973.95(5)</td>
<td align="right">1948.75(2)</td>
<td align="right">
<bold>1946.27(1)</bold>
</td>
<td align="right">1958.58(5)</td>
<td align="left"/>
<td align="left"/>
</tr>
</tbody>
</table>
</table-wrap>
<p>For low-frequency dots (<italic>&#x3c9;</italic> &#x3c; 0.28), the RBM produces ground state energies lower than the reference energy, but it fails to do so for large high-frequency dots (<italic>N</italic> &#x3e; 6, <italic>&#x3c9;</italic> &#x3e; 0.28). For low-frequency dots the interaction energy dominates, and the RBM manages to learn the correlations. For high-frequency dots, the one-body part of the Hamiltonian dominates and the interaction part is less important for the final expectation value of the energy. For these systems, the Slater determinant part of the wave function ansatz is the most important one, which in turn suggests that the RBM ansatz we have used may not capture well the structure of the single-particle states that define the Slater determinants.</p>
<p>We also note that the reference values are similar to the corresponding Hartree-Fock energies up to 30 particles (Mariadason [<xref ref-type="bibr" rid="B26">26</xref>]), which is expected as both approaches try to find the optimal single Slater determinant without a correlation factor.</p>
<p>Some simulations were also performed for the RBM and RBM &#x2b; PJ ansatzes with sorted network inputs to enforce anti-symmetry under exchange of two particles, as suggested by Saito [<xref ref-type="bibr" rid="B27">27</xref>], among others. Sorted inputs showed promising results, where the ground state energy typically dropped in the fourth or fifth digit. For example, the RBM &#x2b; PJ ansatz found the ground state energy of a system with <italic>N</italic> &#x3d; 30 electrons and <italic>&#x3c9;</italic> &#x3d; 0.5 to be 187.311(1) Hartree. This is lower than the Slater-Jastrow energy of 187.325(3) Hartree (<xref ref-type="table" rid="T2">Table 2</xref>). We will investigate this further in a follow-up paper [<xref ref-type="bibr" rid="B28">28</xref>].</p>
<p>For the traditional ansatzes that do not contain artificial neural networks, the learning rate determines to a large extent how fast the trial wave function converges towards the true ground state wave function. For the RBM and the RBM &#x2b; PJ approaches, however, the results themselves are sensitive to the chosen learning rate. Too large a learning rate can easily cause exploding energies, while with too small a learning rate the energies obtained with a given trial function might not converge at all. Our strategy is to find the highest learning rate that does not lead to exploding energies. In general this requires efficient grid searching methods. Here, when the energies have converged, we decrease the learning rate by a factor of 10<xref ref-type="fn" rid="fn3">
<sup>3</sup>
</xref> and let it run until it converges again. Also, the RBMs tend to converge step-wise, making it hard to know whether or not they have converged.</p>
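<p>The step-down strategy described above can be sketched as follows. This is an illustration under stated assumptions rather than the implementation used in this work: the convergence test (comparing the mean energy of the two most recent windows of iterations), the window length, the tolerance and the names <monospace>has_converged</monospace> and <monospace>next_learning_rate</monospace> are all hypothetical. Only the reduction by a factor of 10 is taken from the text.</p>
<preformat><![CDATA[
```python
from statistics import fmean
from typing import Sequence


def has_converged(energies: Sequence[float], window: int = 50, tol: float = 1e-4) -> bool:
    """Hypothetical convergence test on the energy history.

    Compares the mean energy over the last `window` iterations with the
    mean over the `window` iterations before that.
    """
    if len(energies) < 2 * window:
        return False
    recent = fmean(energies[-window:])
    previous = fmean(energies[-2 * window:-window])
    return abs(recent - previous) < tol * max(1.0, abs(recent))


def next_learning_rate(
    learning_rate: float,
    energies: Sequence[float],
    factor: float = 10.0,
    window: int = 50,
    tol: float = 1e-4,
) -> float:
    """Divide the learning rate by `factor` once the energies have converged."""
    if has_converged(energies, window=window, tol=tol):
        return learning_rate / factor
    return learning_rate
```
]]></preformat>
<p>A windowed mean is used in this sketch because a single-iteration comparison would be dominated by statistical noise; it does not remove the difficulty noted in the text that step-wise convergence can look like a plateau.</p>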
<p>In <xref ref-type="fig" rid="F4">Figure 4</xref>, the convergence of the local energy is plotted for our four ansatzes for <italic>N</italic> &#x3d; 2 electrons and compared with the analytical result. A similar behavior is shown for <italic>N</italic> &#x3d; 56 electrons and <italic>&#x3c9;</italic> &#x3d; 0.1 in <xref ref-type="fig" rid="F5">Figure 5</xref>. For the <italic>N</italic> &#x3d; 2 case, RBM &#x2b; PJ turns out to be the most accurate ansatz, with small absolute errors. During training, the Pad&#xe9;-Jastrow parameter is rather constant, but the adjustment of weights seems important for the model to reach the energy minimum. For larger systems, the Slater-Jastrow ansatz provides a slightly lower energy than the RBM &#x2b; PJ ansatz. Since the number of inputs for the RBM increases with the number of particles, the network contains additional training parameters as we increase the number of electrons. Therefore, the networks become more complex for large quantum dots, and they are naturally harder to train.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>Energy convergence for quantum dots with <italic>N</italic> &#x3d; 2 and <italic>&#x3c9;</italic> &#x3d; 1/6. The figure shows how the ground state energy approaches the analytical value (Taut [<xref ref-type="bibr" rid="B25">25</xref>]) for various ansatzes.</p>
</caption>
<graphic xlink:href="fphy-11-1061580-g004.tif"/>
</fig>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>Energy convergence for quantum dots with <italic>N</italic> &#x3d; 56 and <italic>&#x3c9;</italic> &#x3d; 0.1. The reference value was obtained by H&#xf8;gberget [<xref ref-type="bibr" rid="B24">24</xref>] using diffusion Monte Carlo. The labeling of the various results is the same as in the previous figure.</p>
</caption>
<graphic xlink:href="fphy-11-1061580-g005.tif"/>
</fig>
<p>The spatial one-body density plots for <italic>&#x3c9;</italic> &#x3d; 1.0 and the RBM and RBM &#x2b; PJ ansatzes are presented in <xref ref-type="fig" rid="F6">Figures 6</xref>, <xref ref-type="fig" rid="F7">7</xref>. The electron densities have a wave shape for both ansatzes, with two nodes for <italic>N</italic> &#x3d; 2, three nodes for <italic>N</italic> &#x3d; 6 and so on, similar to those observed by Ghosal et al. [<xref ref-type="bibr" rid="B29">29</xref>] and H&#xf8;gberget [<xref ref-type="bibr" rid="B24">24</xref>]. The RBM seems to exaggerate the states, with more distinct peaks (higher peaks and lower valleys) compared to the RBM &#x2b; PJ results. This can be explained by more localized electrons, which becomes even more apparent in low-frequency dots (see <xref ref-type="fig" rid="F8">Figure 8</xref>), as the interactions are modelled differently. The RBM would hardly be able to model the correct electron-electron distances, as the network itself is purely linear. The electron density was found to be shape-invariant for high-frequency dots (<italic>&#x3c9;</italic> &#x3e; 0.28), with decreasing spatial range as the frequency increases.</p>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption>
<p>Electron density profiles, <italic>&#x3c1;</italic>(<italic>x</italic>, <italic>y</italic>), for two-dimensional quantum dots with frequency <italic>&#x3c9;</italic> &#x3d; 0.5 and <italic>N</italic> &#x3d; 2, 6 and 12 electrons seen from the top. The surface plot and the contour plot on the <italic>xy</italic>-plane illustrate the density, and the graph on the <italic>yz</italic>-plane represents the cross-section through <italic>x</italic> &#x3d; 0. They were obtained using RBM (left column) and RBM &#x2b; PJ (right column), with <italic>M</italic> &#x3d; 2<sup>30</sup> Monte Carlo cycles after convergence. The plots are noise-reduced using a Savitzky-Golay filter. For abbreviations and description of the natural units used, see main text.</p>
</caption>
<graphic xlink:href="fphy-11-1061580-g006.tif"/>
</fig>
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption>
<p>Electron density profiles, <italic>&#x3c1;</italic>(<italic>x</italic>, <italic>y</italic>), for two-dimensional quantum dots with frequency <italic>&#x3c9;</italic> &#x3d; 0.5 and <italic>N</italic> &#x3d; 20, 30 and 42 electrons seen from the top. The surface plot and the contour plot on the xy-plane illustrate the density, and the graph on the yz-plane represents the cross-section through <italic>x</italic> &#x3d; 0. They were obtained using RBM (left column) and RBM &#x2b; PJ (right column), with <italic>M</italic> &#x3d; 2<sup>30</sup> Monte Carlo cycles after convergence. The plots are noise-reduced using a Savitzky-Golay filter. For abbreviations and description of the natural units used, see main text.</p>
</caption>
<graphic xlink:href="fphy-11-1061580-g007.tif"/>
</fig>
<fig id="F8" position="float">
<label>FIGURE 8</label>
<caption>
<p>One-body density profile, <italic>&#x3c1;</italic>(<italic>x</italic>, <italic>y</italic>), of two-dimensional quantum dots with frequency <italic>&#x3c9;</italic> &#x3d; 0.1 and <italic>N</italic> &#x3d; 2, 6 and 12 electrons seen from left to right. The ansatzes used are RBM (upper panels) and RBM &#x2b; PJ (lower panels). The surface plot and the contour plot on the xy-plane illustrate the density, and the graph on the yz-plane represents the cross-section through <italic>x</italic> &#x3d; 0. The surface plots are noise-reduced using a Savitzky-Golay filter. For abbreviations and description of the natural units used, see main text.</p>
</caption>
<graphic xlink:href="fphy-11-1061580-g008.tif"/>
</fig>
<p>When reducing the frequency further, the interaction energy dominates over the kinetic energy and harmonic oscillator energy (see <xref ref-type="fig" rid="F10">Figures 10</xref>, <xref ref-type="fig" rid="F11">11</xref>), and the electrons naturally become more localized. The Pad&#xe9;-Jastrow factor solves this by radially localizing the electrons and conserving the circular symmetry, such that the electrons are confined to specific orbitals. This can be seen from <xref ref-type="fig" rid="F8">Figure 8</xref>, where the spatial one-body density is plotted for low-frequency dots (<italic>&#x3c9;</italic> &#x3d; 0.1) and system sizes <italic>N</italic> &#x3d; 2, 6 and 12. Radial localization is also what we would expect from the Hamiltonian, which is strictly circularly symmetric. The RBM, on the other hand, seems to localize the electrons in both the radial and angular directions, with the number of peaks corresponding to the number of electrons. This is a nonphysical solution to the problem, and shows that the RBM ansatz breaks down for low frequencies. The RBM &#x2b; PJ ansatz, in contrast, confines the electrons in orbitals. Distinct peaks in the radial direction are what is expected from Wigner crystallization, which we might see indications of with density parameters <italic>r</italic>
<sub>
<italic>s</italic>
</sub> &#x2248; 6.7, <italic>r</italic>
<sub>
<italic>s</italic>
</sub> &#x2248; 1.2, and <italic>r</italic>
<sub>
<italic>s</italic>
</sub> &#x3d; 0.3, respectively, for the three system sizes with the RBM &#x2b; PJ ansatz. Notice for instance the small &#x201c;pit&#x201d; on top of the <italic>N</italic> &#x3d; 2 plot for RBM &#x2b; PJ, which is not seen for higher oscillator frequencies. In <xref ref-type="fig" rid="F9">Figure 9</xref>, we have decreased the frequency even further to <italic>&#x3c9;</italic> &#x3d; 0.01 for <italic>N</italic> &#x3d; 2. As expected, the electrons become even more localized and clearly show Wigner crystallization effects with a density parameter of <italic>r</italic>
<sub>
<italic>s</italic>
</sub> &#x2248; 29. For the RBM ansatz, the electrons are strongly localized (in all directions) and the electron densities barely overlap. For the RBM &#x2b; PJ ansatz, we observe strong orbital confinement. Wigner crystallization was not the target of this study, but the framework seems capable of a more profound study of this phenomenon.</p>
<fig id="F9" position="float">
<label>FIGURE 9</label>
<caption>
<p>One-body density profile, <italic>&#x3c1;</italic>(<italic>x</italic>, <italic>y</italic>), of two-dimensional quantum dots with frequency <italic>&#x3c9;</italic> &#x3d; 0.01 and <italic>N</italic> &#x3d; 2 electrons. The ansatzes used are RBM (left) and RBM &#x2b; PJ (right). The surface plot and the contour plot on the xy-plane illustrate the density, and the graph on the yz-plane represents the cross-section through <italic>x</italic> &#x3d; 0. The surface plots are noise-reduced using a Savitzky-Golay filter. For abbreviations and description of the natural units used, see main text.</p>
</caption>
<graphic xlink:href="fphy-11-1061580-g009.tif"/>
</fig>
<p>To understand the behavior of the one-body density for various frequencies, we investigate the expectation values of the kinetic energy <inline-formula id="inf10">
<mml:math id="m30">
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x27e8;</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">&#x302;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo stretchy="false">&#x27e9;</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>, the harmonic oscillator potential energy <inline-formula id="inf11">
<mml:math id="m31">
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x27e8;</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>V</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">&#x302;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mtext>ext</mml:mtext>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">&#x27e9;</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> and the two-body interaction energy <inline-formula id="inf12">
<mml:math id="m32">
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">&#x27e8;</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>V</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">&#x302;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mtext>int</mml:mtext>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">&#x27e9;</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>. The energy distribution is plotted for the RBM &#x2b; PJ ansatz for <italic>N</italic> &#x3d; 2 and <italic>N</italic> &#x3d; 20 (<xref ref-type="fig" rid="F10">Figures 10</xref>, <xref ref-type="fig" rid="F11">11</xref>) for frequencies <italic>&#x3c9;</italic> &#x2208; {0.01, 0.1, 0.28, 0.5, 1.0, 2.0, 3.0, 5.0, 10.0}. For large values of the oscillator frequency, the one-body part of the Hamiltonian (kinetic energy and harmonic oscillator potential energy) dominates in absolute value over the expectation value of the two-body interaction. For such frequencies we also notice that the results can almost be interpreted in terms of the virial theorem, which provides a useful relation between the kinetic and potential energies (Fock [<xref ref-type="bibr" rid="B30">30</xref>]). For circular quantum dots without the two-electron interaction, the theorem reads <inline-formula id="inf13">
<mml:math id="m33">
<mml:mn>2</mml:mn>
<mml:mrow>
<mml:mo stretchy="false">&#x27e8;</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">&#x302;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo stretchy="false">&#x27e9;</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>2</mml:mn>
<mml:mrow>
<mml:mo stretchy="false">&#x27e8;</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>V</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">&#x302;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mtext>ext</mml:mtext>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">&#x27e9;</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>. From our results we notice that, due to the interaction energy, the ratio between the kinetic and harmonic oscillator energies is not exactly equal to one for large frequencies. However, for such frequencies the interaction energy plays a less prominent role, so the ratio is close to this ideal value. When we decrease the frequency, and the system becomes more dilute with an increase of the mean distance between particles, one notices a somewhat counterintuitive behavior. The expectation values of the kinetic energy and the harmonic oscillator potential energy decrease (and the electrons become more localized, as seen in the one-body densities above). However, many-body correlations increase in importance with decreasing frequency. This is reflected in the increased role of the expectation value of the two-body Coulomb interaction, an effect which is due to the infinite range of the Coulomb interaction. If we were to multiply the Coulomb interaction by a finite-range factor, this effect would disappear. <xref ref-type="fig" rid="F10">Figures 10</xref>, <xref ref-type="fig" rid="F11">11</xref> show this behavior clearly.</p>
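<p>As a minimal numerical illustration of the non-interacting limit (a sketch under stated assumptions, not the code used for the results in this article), the snippet below samples the exact ground state of a single particle in a two-dimensional harmonic oscillator, proportional to exp(&#x2212;<italic>&#x3c9;</italic>
<italic>r</italic>
<sup>2</sup>/2) in natural units, and compares the Monte Carlo estimates of the kinetic and harmonic oscillator energies. The function name <monospace>virial_ratio</monospace>, the sample size and the seed are hypothetical choices made for this example only.</p>
<preformat>
```python
import numpy as np


def virial_ratio(omega: float, n_samples: int = 200_000, seed: int = 0) -> float:
    """Estimate T / V_ext for one particle in a 2D harmonic oscillator.

    Assumes the exact non-interacting ground state psi ~ exp(-omega r^2 / 2),
    so that |psi|^2 is a Gaussian with variance 1 / (2 omega) per coordinate.
    """
    rng = np.random.default_rng(seed)
    # Draw positions directly from |psi|^2 (no Markov chain needed here).
    positions = rng.normal(0.0, np.sqrt(0.5 / omega), size=(n_samples, 2))
    r2 = np.sum(positions**2, axis=1)
    # Local kinetic energy: -0.5 * laplacian(psi) / psi = omega - 0.5 omega^2 r^2.
    kinetic = omega - 0.5 * omega**2 * r2
    # Harmonic oscillator potential energy.
    potential = 0.5 * omega**2 * r2
    return float(np.mean(kinetic) / np.mean(potential))


for omega in (0.1, 1.0, 10.0):
    print(f"omega = {omega:5.1f}: T / V_ext = {virial_ratio(omega):.3f}")
```
</preformat>
<p>Without the interaction, the estimated ratio equals one within the statistical error for every frequency, as required by the relation between the kinetic and harmonic oscillator energies stated by the theorem. Any deviation seen in an interacting calculation can therefore be attributed to the two-body term.</p>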
<fig id="F10" position="float">
<label>FIGURE 10</label>
<caption>
<p>Energy distribution for <italic>N</italic> &#x3d; 2 electrons for the RBM &#x2b; PJ ansatz, with frequencies ranging from <italic>&#x3c9;</italic> &#x3d; 0.01 to 10.0.</p>
</caption>
<graphic xlink:href="fphy-11-1061580-g010.tif"/>
</fig>
<fig id="F11" position="float">
<label>FIGURE 11</label>
<caption>
<p>Energy distribution for <italic>N</italic> &#x3d; 20 electrons for the RBM &#x2b; PJ ansatz, with frequencies ranging from <italic>&#x3c9;</italic> &#x3d; 0.01 to 10.0.</p>
</caption>
<graphic xlink:href="fphy-11-1061580-g011.tif"/>
</fig>
</sec>
<sec id="s4">
<title>4 Computational details</title>
<p>The quantum dot systems were studied using a general framework for variational Monte Carlo simulations (the code is available at github.com/evenmn/VMaChine). Importance sampling was used to accelerate the simulations Metropolis et al. [<xref ref-type="bibr" rid="B31">31</xref>]. To minimize the expectation value of the local energy, we applied the Adam optimizer with <italic>&#x3b2;</italic>
<sub>1</sub> &#x3d; 0.9 and <italic>&#x3b2;</italic>
<sub>2</sub> &#x3d; 0.999, as suggested by Kingma and Ba [<xref ref-type="bibr" rid="B32">32</xref>]. We used an adaptive number of cycles, starting from 2<sup>20</sup>, increasing to 2<sup>24</sup> for the last ten iterations after the energy had converged, and further to 2<sup>30</sup> for the very last iteration to reduce the statistical uncertainty and the noise in the electron density. The particle step length was chosen to give an acceptance ratio close to 99.5%, spanning from 10<sup>1</sup> for the smallest and weakest systems to 10<sup>&#x2212;3</sup> for the large and narrow oscillators. The optimal learning rate was found by grid searches and varied from 10<sup>1</sup> to 10<sup>&#x2212;5</sup>. Both the step length and the learning rate depend strongly on the system and the trial wave function.</p>
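<p>The parameter update described above can be sketched as follows. This is the standard Adam update with the quoted values of <italic>&#x3b2;</italic>
<sub>1</sub> and <italic>&#x3b2;</italic>
<sub>2</sub>; it is not taken from the VMaChine source. The learning rate, the stabilizing constant <monospace>eps</monospace>, the function name <monospace>adam_step</monospace> and the quadratic toy objective are assumptions made for illustration.</p>
<preformat>
```python
import numpy as np


def adam_step(params, grad, m, v, t, lr=1e-2, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update; returns the new parameters and moment estimates."""
    m = beta1 * m + (1.0 - beta1) * grad        # first-moment running average
    v = beta2 * v + (1.0 - beta2) * grad**2     # second-moment running average
    m_hat = m / (1.0 - beta1**t)                # bias corrections (t starts at 1)
    v_hat = v / (1.0 - beta2**t)
    params = params - lr * m_hat / (np.sqrt(v_hat) + eps)
    return params, m, v


# Toy stand-in for the energy: E(theta) = sum((theta - 1)^2), minimum at theta = 1.
theta = np.zeros(3)
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 2001):
    grad = 2.0 * (theta - 1.0)  # in VMC this would be a noisy sampled gradient
    theta, m, v = adam_step(theta, grad, m, v, t)
print(theta)
```
</preformat>
<p>In a variational Monte Carlo setting the gradient is a stochastic estimate, so the number of cycles per iteration controls how noisy each update is. This is one reason to spend more cycles on the final iterations.</p>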
<p>For the RBM and RBM &#x2b; PJ ansatzes, only the raw particle positions were input to the RBM. This choice was made for performance reasons, as feeding processed inputs such as the electron-electron distances would significantly increase both the network complexity and the computational effort.</p>
<p>The Gaussian parameter was initialized to <italic>&#x3b1;</italic> &#x3d; 1.0, which is the analytical optimum without interaction. We used Xavier initialization for the RBM weights, putting them all close to zero Glorot and Bengio [<xref ref-type="bibr" rid="B33">33</xref>]. The special case with all weights set to zero corresponds to our reference ansatz with <italic>&#x3b1;</italic> &#x3d; 1.0. For all the RBMs except the narrowest ones (<italic>&#x3c9;</italic> &#x3d; 1.0), the number of hidden nodes was set to <italic>H</italic> &#x3d; 6, giving 40 to 1272 parameters for the various system sizes. For <italic>&#x3c9;</italic> &#x3d; 1.0, we used <italic>H</italic> &#x3d; 12 hidden nodes to achieve a lower energy.</p>
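<p>For concreteness, a weight initialization of this kind can be sketched as below. The snippet assumes the uniform variant of the Xavier scheme, with weights drawn from a symmetric interval whose half-width is the square root of 6/(<italic>n</italic>
<sub>in</sub> &#x2b; <italic>n</italic>
<sub>out</sub>). Whether the uniform or the normal variant was used here is not stated, and the function name <monospace>xavier_uniform</monospace> and the seed are hypothetical.</p>
<preformat>
```python
import numpy as np


def xavier_uniform(n_visible: int, n_hidden: int, seed: int = 0) -> np.ndarray:
    """Draw weights from U(-limit, limit), limit = sqrt(6 / (n_in + n_out))."""
    rng = np.random.default_rng(seed)
    limit = np.sqrt(6.0 / (n_visible + n_hidden))
    return rng.uniform(-limit, limit, size=(n_visible, n_hidden))


# Example: N = 20 electrons in two dimensions and H = 6 hidden nodes.
n_particles, n_dimensions, n_hidden = 20, 2, 6
weights = xavier_uniform(n_particles * n_dimensions, n_hidden)
print(weights.shape, float(np.abs(weights).max()))
```
</preformat>
<p>The interval shrinks as the number of visible and hidden nodes grows, so the initial trial wave function of a larger system starts closer to the zero-weight reference ansatz.</p>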
<p>All the simulations were run on Intel Xeon E5-2670 CPUs. In total, the computational cost of this project was of the order of 10<sup>6</sup> CPU hours, with the largest number of cycles spent, for obvious reasons, on the largest systems.</p>
</sec>
<sec id="s5">
<title>5 Conclusion and perspectives</title>
<p>We found that the RBM ansatz gives a significantly lower ground state energy than the reference ansatz for low-frequency quantum dots. This may indicate that the RBM manages to capture some of the electron-electron correlations. Based on the one-body density plots, the RBM found the electrons to be localized both angularly and radially, whereas the ansatzes containing a Pad&#xe9;-Jastrow factor confine electrons radially only. For high-frequency dots, the RBM fails in the sense that the obtained ground state energy is higher than the reference energy. This can be explained by the fact that when the interactions become less important, the reference ansatz is already a good guess. The RBM &#x2b; PJ ansatz gives energies close to the DMC energies for small quantum dots. The ansatz performance needs to be seen in light of the computational cost, as the ansatzes containing a Pad&#xe9;-Jastrow factor are far more computationally intensive.</p>
<p>From the one-body density and energy distribution plots, it is apparent that the RBM ansatz is not able to capture the correlations at the same level as the Pad&#xe9;-Jastrow factor. Because of the linearity of the network, it is impossible for it to compute the distance between the particles, which is crucial to model the interactions correctly. One solution to this could be to input the electron-electron distance into the network, as discussed by Pfau et al. [<xref ref-type="bibr" rid="B4">4</xref>], but this would increase the computational cost. Also, despite including a Slater determinant, the trial wave function is not necessarily anti-symmetric when including an RBM. We performed some simulations where anti-symmetry was forced by sorting the network inputs, which showed promising results in terms of the ground state energy. However, although promising results can be obtained, RBMs are less flexible than general neural networks that make fewer assumptions about the specific mathematical forms of the trial functions. We have encoded explicitly the anti-symmetry via a Slater determinant. Furthermore, two-body correlations are constructed using a Jastrow factor. An RBM with Gaussian distributions is capable of capturing the one-body part of the problem, but is less flexible in finding two-body or more complicated many-body correlations. Although the RBM results reported here are promising compared with existing VMC calculations, recent results with neural networks like those presented in, for example, Refs. Pfau et al. [<xref ref-type="bibr" rid="B4">4</xref>]; Cassella et al. [<xref ref-type="bibr" rid="B14">14</xref>]; Lovato et al. [<xref ref-type="bibr" rid="B9">9</xref>]; Adams et al. [<xref ref-type="bibr" rid="B8">8</xref>]; Carrasquilla and Torlai [<xref ref-type="bibr" rid="B3">3</xref>] offer much more flexible and promising research avenues for deep learning methods applied to many-body problems.
Results obtained with deep neural networks for these systems will be presented in a future work [<xref ref-type="bibr" rid="B28">28</xref>].</p>
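<p>The input-sorting idea mentioned above can be illustrated with a small sketch. If the particle positions are put in a fixed order before they enter the network, the network output no longer changes when two particles are exchanged, so its product with a Slater determinant remains anti-symmetric. The lexicographic ordering rule, the function names and the toy network below are assumptions made for this example and are not the implementation used in our simulations.</p>
<preformat>
```python
import numpy as np


def sort_particles(positions: np.ndarray) -> np.ndarray:
    """Order particles lexicographically by (x, y); positions has shape (N, 2)."""
    # np.lexsort uses the last key as the primary one, so x is primary here.
    order = np.lexsort((positions[:, 1], positions[:, 0]))
    return positions[order]


def toy_network(flat_inputs: np.ndarray, weights: np.ndarray) -> float:
    """Stand-in for an RBM-like factor that is not symmetric in its inputs."""
    return float(np.prod(1.0 + np.exp(flat_inputs @ weights)))


rng = np.random.default_rng(1)
positions = rng.normal(size=(4, 2))
weights = rng.normal(scale=0.1, size=(8, 3))
swapped = positions[[1, 0, 2, 3]]  # exchange particles 0 and 1

raw = (toy_network(positions.ravel(), weights), toy_network(swapped.ravel(), weights))
symmetrized = (
    toy_network(sort_particles(positions).ravel(), weights),
    toy_network(sort_particles(swapped).ravel(), weights),
)
print("raw inputs:   ", raw)
print("sorted inputs:", symmetrized)
```
</preformat>
<p>With raw inputs the two values differ, whereas with sorted inputs they coincide, which is the exchange symmetry required of the factor multiplying the determinant.</p>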
</sec>
</body>
<back>
<sec sec-type="data-availability" id="s6">
<title>Data availability statement</title>
<p>The original contributions presented in the study are included in the article/supplementary materials; further inquiries can be directed to the corresponding author.</p>
</sec>
<sec id="s7">
<title>Author contributions</title>
<p>The work was conceived by MH-J, JK, and EN. All authors contributed to the article and approved the submitted version.</p>
</sec>
<sec id="s8">
<title>Funding</title>
<p>AL and BF are supported by the U.S. Department of Energy, Office of Science, Office of Nuclear Physics, under contract DE-AC02-06CH11357, by the NUCLEI SciDAC program, and the DOE Early Career Research Program. MH-J is supported by the U.S. Department of Energy, Office of Science, Office of Nuclear Physics, under grant No. DE-SC0021152 and U.S. National Science Foundation Grants No. PHY-1404159 and PHY-2013047. JK is supported by the U.S. National Science Foundation Grants No. PHY-1404159 and PHY-2013047. EN is supported by the Norwegian Research Council under grant 287084.</p>
</sec>
<sec sec-type="COI-statement" id="s9">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s10">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<fn-group>
<fn id="fn1">
<label>1</label>
<p>Natural units are used with energy given in units of <italic>&#x210f;</italic> and length given in units of <inline-formula id="inf14">
<mml:math id="m34">
<mml:msqrt>
<mml:mrow>
<mml:mi>&#x210f;</mml:mi>
<mml:mo>/</mml:mo>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msqrt>
</mml:math>
</inline-formula>.</p>
</fn>
<fn id="fn2">
<label>2</label>
<p>A burn-in period or time refers to the amount of time which is needed before a steady state is reached in a Markov-chain Monte Carlo simulation. When the most likely or steady state has been reached, one can start collecting samples for the stochastic evaluation of the various integrals.</p>
</fn>
<fn id="fn3">
<label>3</label>
<p>We have employed a grid of learning rates in terms of powers of ten with negative exponents. Since we have not implemented an adaptive calculation of the learning rates, we search for the optimal results using a grid of decreasing learning rates.</p>
</fn>
</fn-group>
<ref-list>
<title>References</title>
<ref id="B1">
<label>1.</label>
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Goodfellow</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Bengio</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Courville</surname>
<given-names>A</given-names>
</name>
</person-group>. <source>Deep learning</source>. <publisher-loc>Cambridge, Massachusetts</publisher-loc>: <publisher-name>The MIT Press</publisher-name> (<year>2016</year>).</citation>
</ref>
<ref id="B2">
<label>2.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Carleo</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Troyer</surname>
<given-names>M</given-names>
</name>
</person-group>. <article-title>Solving the quantum many-body problem with artificial neural networks</article-title>. <source>Science</source> (<year>2017</year>) <volume>355</volume>:<fpage>602</fpage>&#x2013;<lpage>6</lpage>. <pub-id pub-id-type="doi">10.1126/science.aag2302</pub-id>
</citation>
</ref>
<ref id="B3">
<label>3.</label>
<citation citation-type="web">
<person-group person-group-type="author">
<name>
<surname>Carrasquilla</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Torlai</surname>
<given-names>G</given-names>
</name>
</person-group>. <article-title>Neural networks in quantum many-body physics: A hands-on tutorial</article-title> (<year>2021</year>). <comment>Available at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/2101.11099">https://arxiv.org/abs/2101.11099</ext-link> (Accessed January 26, 2021)</comment>.</citation>
</ref>
<ref id="B4">
<label>4.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pfau</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Spencer</surname>
<given-names>JS</given-names>
</name>
<name>
<surname>Matthews</surname>
<given-names>AGDG</given-names>
</name>
<name>
<surname>Foulkes</surname>
<given-names>WMC</given-names>
</name>
</person-group>. <article-title>
<italic>Ab initio</italic> solution of the many-electron Schr&#xf6;dinger equation with deep neural networks</article-title>. <source>Phys Rev Res</source> (<year>2020</year>) <volume>2</volume>:<fpage>033429</fpage>. <pub-id pub-id-type="doi">10.1103/PhysRevResearch.2.033429</pub-id>
</citation>
</ref>
<ref id="B5">
<label>5.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Calcavecchia</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Pederiva</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Kalos</surname>
<given-names>MH</given-names>
</name>
<name>
<surname>K&#xfc;hne</surname>
<given-names>TD</given-names>
</name>
</person-group>. <article-title>Sign problem of the fermionic shadow wave function</article-title>. <source>Phys Rev E</source> (<year>2014</year>) <volume>90</volume>:<fpage>053304</fpage>. <pub-id pub-id-type="doi">10.1103/PhysRevE.90.053304</pub-id>
</citation>
</ref>
<ref id="B6">
<label>6.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Carleo</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Cirac</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Cranmer</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Daudet</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Schuld</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Tishby</surname>
<given-names>N</given-names>
</name>
<etal/>
</person-group> <article-title>Machine learning and the physical sciences</article-title>. <source>Rev Mod Phys</source> (<year>2019</year>) <volume>91</volume>:<fpage>045002</fpage>. <pub-id pub-id-type="doi">10.1103/RevModPhys.91.045002</pub-id>
</citation>
</ref>
<ref id="B7">
<label>7.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Boehnlein</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Diefenthaler</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Sato</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Schram</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Ziegler</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Fanelli</surname>
<given-names>C</given-names>
</name>
<etal/>
</person-group> <article-title>
<italic>Colloquium</italic>: Machine learning in nuclear physics</article-title>. <source>Rev Mod Phys</source> (<year>2022</year>) <volume>94</volume>:<fpage>031003</fpage>. <pub-id pub-id-type="doi">10.1103/RevModPhys.94.031003</pub-id>
</citation>
</ref>
<ref id="B8">
<label>8.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Adams</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Carleo</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Lovato</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Rocco</surname>
<given-names>N</given-names>
</name>
</person-group>. <article-title>Variational Monte Carlo calculations of A&#x2264;4 nuclei with an artificial neural-network correlator ansatz</article-title>. <source>Phys Rev Lett</source> (<year>2021</year>) <volume>127</volume>:<fpage>022502</fpage>. <pub-id pub-id-type="doi">10.1103/PhysRevLett.127.022502</pub-id>
</citation>
</ref>
<ref id="B9">
<label>9.</label>
<citation citation-type="web">
<person-group person-group-type="author">
<name>
<surname>Lovato</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Adams</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Carleo</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Rocco</surname>
<given-names>N</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>Hidden-nucleons neural-network quantum states for the nuclear many-body problem</article-title>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/2206.10021">https://arxiv.org/abs/2206.10021</ext-link> (Accessed June 20, 2022).</comment>
</citation>
</ref>
<ref id="B10">
<label>10.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hornik</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Stinchcombe</surname>
<given-names>M</given-names>
</name>
<name>
<surname>White</surname>
<given-names>H</given-names>
</name>
</person-group>. <article-title>Multilayer feedforward networks are universal approximators</article-title>. <source>Neural Networks</source> (<year>1989</year>) <volume>2</volume>:<fpage>359</fpage>&#x2013;<lpage>66</lpage>. <pub-id pub-id-type="doi">10.1016/0893-6080(89)90020-8</pub-id>
</citation>
</ref>
<ref id="B11">
<label>11.</label>
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Murphy</surname>
<given-names>KP</given-names>
</name>
</person-group>. <source>Machine learning: A probabilistic perspective</source>. <publisher-loc>Cambridge, Massachusetts</publisher-loc>: <publisher-name>The MIT Press</publisher-name> (<year>2012</year>).</citation>
</ref>
<ref id="B12">
<label>12.</label>
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Hastie</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Tibshirani</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Friedman</surname>
<given-names>J</given-names>
</name>
</person-group>. <source>The elements of statistical learning: Data mining, inference and prediction</source>. <publisher-loc>Berlin</publisher-loc>: <publisher-name>Springer Verlag</publisher-name> (<year>2009</year>).</citation>
</ref>
<ref id="B13">
<label>13.</label>
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Bishop</surname>
<given-names>CM</given-names>
</name>
</person-group>. <source>Pattern recognition and machine learning</source>. <publisher-loc>Berlin</publisher-loc>: <publisher-name>Springer Verlag</publisher-name> (<year>2006</year>).</citation>
</ref>
<ref id="B14">
<label>14.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cassella</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Sutterud</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Azadi</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Drummond</surname>
<given-names>ND</given-names>
</name>
<name>
<surname>Pfau</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Spencer</surname>
<given-names>JS</given-names>
</name>
<etal/>
</person-group> <article-title>Discovering quantum phase transitions with fermionic neural networks</article-title>. <source>Phys Rev Lett</source> (<year>2023</year>) <volume>130</volume>:<fpage>036401</fpage>. <pub-id pub-id-type="doi">10.1103/PhysRevLett.130.036401</pub-id>
</citation>
</ref>
<ref id="B15">
<label>15.</label>
<citation citation-type="web">
<person-group person-group-type="author">
<name>
<surname>Sieber</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Gehringer</surname>
<given-names>J</given-names>
</name>
</person-group>. <article-title>Quantitative universal approximation bounds for deep belief networks</article-title> (<year>2022</year>). <comment>Available at: <ext-link ext-link-type="uri" xlink:href="https://arxiv.org/abs/2208.09033">https://arxiv.org/abs/2208.09033</ext-link> (Accessed August 18, 2022)</comment>.</citation>
</ref>
<ref id="B16">
<label>16.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Drummond</surname>
<given-names>ND</given-names>
</name>
<name>
<surname>Towler</surname>
<given-names>MD</given-names>
</name>
<name>
<surname>Needs</surname>
<given-names>RJ</given-names>
</name>
</person-group>. <article-title>Jastrow correlation factor for atoms, molecules, and solids</article-title>. <source>Phys Rev B</source> (<year>2004</year>) <volume>70</volume>:<fpage>235119</fpage>. <pub-id pub-id-type="doi">10.1103/physrevb.70.235119</pub-id>
</citation>
</ref>
<ref id="B17">
<label>17.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huang</surname>
<given-names>C-J</given-names>
</name>
<name>
<surname>Filippi</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Umrigar</surname>
<given-names>CJ</given-names>
</name>
</person-group>. <article-title>Spin contamination in quantum Monte Carlo wave functions</article-title>. <source>J Chem Phys</source> (<year>1998</year>) <volume>108</volume>:<fpage>8838</fpage>&#x2013;<lpage>47</lpage>. <pub-id pub-id-type="doi">10.1063/1.476330</pub-id>
</citation>
</ref>
<ref id="B18">
<label>18.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fore</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>JM</given-names>
</name>
<name>
<surname>Carleo</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Hjorth-Jensen</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Lovato</surname>
<given-names>A</given-names>
</name>
</person-group>. <article-title>Dilute neutron star matter from neural-network quantum states</article-title>. <source>arXiv</source> [<comment>Preprint</comment>] (<year>2022</year>). <pub-id pub-id-type="doi">10.48550/arXiv.2212.04436</pub-id>
</citation>
</ref>
<ref id="B19">
<label>19.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rigo</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Hall</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Hjorth-Jensen</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Lovato</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Pederiva</surname>
<given-names>F</given-names>
</name>
</person-group>. <article-title>Solving the nuclear pairing model with neural network quantum states</article-title>. <source>Phys Rev E</source> (<year>2023</year>) <volume>107</volume>:<fpage>025310</fpage>. <pub-id pub-id-type="doi">10.1103/PhysRevE.107.025310</pub-id>
</citation>
</ref>
<ref id="B20">
<label>20.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Neidinger</surname>
<given-names>RD</given-names>
</name>
</person-group>. <article-title>Introduction to automatic differentiation and MATLAB object-oriented programming</article-title>. <source>SIAM Rev</source> (<year>2010</year>) <volume>52</volume>:<fpage>545</fpage>&#x2013;<lpage>63</lpage>. <pub-id pub-id-type="doi">10.1137/080743627</pub-id>
</citation>
</ref>
<ref id="B21">
<label>21.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Baydin</surname>
<given-names>AG</given-names>
</name>
<name>
<surname>Pearlmutter</surname>
<given-names>BA</given-names>
</name>
<name>
<surname>Andreyevich Radul</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Siskind</surname>
<given-names>JM</given-names>
</name>
</person-group>. <article-title>Automatic differentiation in machine learning: A survey</article-title>. <source>J Machine Learn Res</source> (<year>2018</year>) <volume>18</volume>:<fpage>1</fpage>. <pub-id pub-id-type="doi">10.5555/3122009.3242010</pub-id>
</citation>
</ref>
<ref id="B22">
<label>22.</label>
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Hammond</surname>
<given-names>BL</given-names>
</name>
<name>
<surname>Lester</surname>
<given-names>WA</given-names>
</name>
<name>
<surname>Reynolds</surname>
<given-names>PJ</given-names>
</name>
</person-group>. <source>Monte Carlo methods in ab initio quantum chemistry</source>. <publisher-loc>Singapore</publisher-loc>: <publisher-name>World Scientific</publisher-name> (<year>1994</year>).</citation>
</ref>
<ref id="B23">
<label>23.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Umrigar</surname>
<given-names>CJ</given-names>
</name>
<name>
<surname>Filippi</surname>
<given-names>C</given-names>
</name>
</person-group>. <article-title>Energy and variance optimization of many-body wave functions</article-title>. <source>Phys Rev Lett</source> (<year>2005</year>) <volume>94</volume>:<fpage>150201</fpage>. <pub-id pub-id-type="doi">10.1103/physrevlett.94.150201</pub-id>
</citation>
</ref>
<ref id="B24">
<label>24.</label>
<citation citation-type="thesis">
<person-group person-group-type="author">
<name>
<surname>H&#xf8;gberget</surname>
<given-names>J</given-names>
</name>
</person-group>. <article-title>Quantum Monte Carlo studies of generalized many-body systems</article-title>. <comment>Master&#x2019;s thesis</comment>. <publisher-loc>Oslo</publisher-loc>: <publisher-name>University of Oslo</publisher-name> (<year>2013</year>).</citation>
</ref>
<ref id="B25">
<label>25.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Taut</surname>
<given-names>M</given-names>
</name>
</person-group>. <article-title>Two electrons in a homogeneous magnetic field: Particular analytical solutions</article-title>. <source>J Phys A</source> (<year>1994</year>) <volume>27</volume>:<fpage>1045</fpage>&#x2013;<lpage>55</lpage>. <pub-id pub-id-type="doi">10.1088/0305-4470/27/3/040</pub-id>
</citation>
</ref>
<ref id="B26">
<label>26.</label>
<citation citation-type="thesis">
<person-group person-group-type="author">
<name>
<surname>Mariadason</surname>
<given-names>AA</given-names>
</name>
</person-group>. <article-title>Quantum many-body simulations of double dot system</article-title>. <comment>Master&#x2019;s thesis</comment>. <publisher-loc>Oslo</publisher-loc>: <publisher-name>University of Oslo</publisher-name> (<year>2018</year>).</citation>
</ref>
<ref id="B27">
<label>27.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Saito</surname>
<given-names>H</given-names>
</name>
</person-group>. <article-title>Method to solve quantum few-body problems with artificial neural networks</article-title>. <source>J Phys Soc Jpn</source> (<year>2018</year>) <volume>87</volume>:<fpage>074002</fpage>. <pub-id pub-id-type="doi">10.7566/jpsj.87.074002</pub-id>
</citation>
</ref>
<ref id="B28">
<label>28.</label>
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Kim</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Fore</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Nordhagen</surname>
<given-names>EM</given-names>
</name>
<name>
<surname>Lovato</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Hjorth-Jensen</surname>
<given-names>M</given-names>
</name>
</person-group>. <source>Deep learning and confined electrons in two dimensions</source> (<year>2022</year>).</citation>
</ref>
<ref id="B29">
<label>29.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ghosal</surname>
<given-names>A</given-names>
</name>
<name>
<surname>G&#xfc;&#xe7;l&#xfc;</surname>
<given-names>AD</given-names>
</name>
<name>
<surname>Umrigar</surname>
<given-names>CJ</given-names>
</name>
<name>
<surname>Ullmo</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Baranger</surname>
<given-names>HU</given-names>
</name>
</person-group>. <article-title>Incipient Wigner localization in circular quantum dots</article-title>. <source>Phys Rev B</source> (<year>2007</year>) <volume>76</volume>:<fpage>085341</fpage>. <pub-id pub-id-type="doi">10.1103/physrevb.76.085341</pub-id>
</citation>
</ref>
<ref id="B30">
<label>30.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fock</surname>
<given-names>V</given-names>
</name>
</person-group>. <source>Z f&#xfc;r Physik</source> (<year>1930</year>) <volume>63</volume>:<fpage>855</fpage>&#x2013;<lpage>8</lpage>. <pub-id pub-id-type="doi">10.1007/BF01339281</pub-id>
</citation>
</ref>
<ref id="B31">
<label>31.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Metropolis</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Rosenbluth</surname>
<given-names>AW</given-names>
</name>
<name>
<surname>Rosenbluth</surname>
<given-names>MN</given-names>
</name>
<name>
<surname>Teller</surname>
<given-names>AH</given-names>
</name>
<name>
<surname>Teller</surname>
<given-names>E</given-names>
</name>
</person-group>. <article-title>Equation of state calculations by fast computing machines</article-title>. <source>J Chem Phys</source> (<year>1953</year>) <volume>21</volume>:<fpage>1087</fpage>&#x2013;<lpage>92</lpage>. <pub-id pub-id-type="doi">10.1063/1.1699114</pub-id>
</citation>
</ref>
<ref id="B32">
<label>32.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kingma</surname>
<given-names>DP</given-names>
</name>
<name>
<surname>Ba</surname>
<given-names>J</given-names>
</name>
</person-group>. <article-title>Adam: A method for stochastic optimization</article-title> (<year>2014</year>). <comment>Available at: <ext-link ext-link-type="uri" xmlns:xlink="http://www.w3.org/1999/xlink" xlink:href="https://arxiv.org/abs/1412.6980">https://arxiv.org/abs/1412.6980</ext-link> (Accessed December 22, 2014)</comment>.</citation>
</ref>
<ref id="B33">
<label>33.</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Glorot</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Bengio</surname>
<given-names>Y</given-names>
</name>
</person-group>. <article-title>Understanding the difficulty of training deep feedforward neural networks</article-title>. In: <conf-name>Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics</conf-name>; <conf-date>May 13-15, 2010</conf-date>; <conf-loc>Sardinia, Italy</conf-loc> (<year>2010</year>).</citation>
</ref>
</ref-list>
<app-group>
<app id="app1">
<title>Appendix A Derivation of RBM distributions</title>
<p>In this appendix, we will derive the marginal and conditional distributions of a Gaussian-binary restricted Boltzmann machine with the system energy<disp-formula id="equ9">
<mml:math id="m35">
<mml:mi>E</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">h</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:mfrac>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:msubsup>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
<mml:mo>&#x2212;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:msub>
<mml:mrow>
<mml:mi>b</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi>w</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
<mml:mo>.</mml:mo>
</mml:math>
</disp-formula>There are 2<italic>N</italic> visible units <italic>x</italic>
<sub>
<italic>i</italic>
</sub> with associated biases <italic>a</italic>
<sub>
<italic>i</italic>
</sub> and <italic>H</italic> hidden units <italic>h</italic>
<sub>
<italic>j</italic>
</sub> with associated biases <italic>b</italic>
<sub>
<italic>j</italic>
</sub>. The weights <italic>w</italic>
<sub>
<italic>ij</italic>
</sub> connect the visible units to the hidden units. The joint probability distribution is given by the Boltzmann distribution<disp-formula id="equ10">
<mml:math id="m36">
<mml:mi>P</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">h</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>Z</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mi>exp</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>&#x3b2;</mml:mi>
<mml:mi>E</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">h</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
</mml:math>
</disp-formula>where <italic>Z</italic> is the partition function,<disp-formula id="equ11">
<mml:math id="m37">
<mml:mi>Z</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mo>&#x222c;</mml:mo>
<mml:mi>d</mml:mi>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mi>d</mml:mi>
<mml:mi mathvariant="bold-italic">h</mml:mi>
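<mml:mi>exp</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>&#x3b2;</mml:mi>
<mml:mi>E</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">h</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>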
<mml:mo>,</mml:mo>
</mml:math>
</disp-formula>and <italic>&#x3b2;</italic> &#x3d; 1/<italic>k</italic>
<sub>
<italic>B</italic>
</sub>
<italic>T</italic> is the inverse temperature, which we fix to 1. Because the marginal and conditional distributions are closely related for both the visible and the hidden layer, we present the distributions for each layer in a separate section.</p>
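<p>As an illustration only, the energy function and the corresponding unnormalized Boltzmann weight can be sketched in plain Python. The function names <monospace>rbm_energy</monospace> and <monospace>boltzmann_weight</monospace>, the list-based data layout, and the toy parameter values are assumptions made for this sketch and are not taken from the implementation used in this work. The partition function is not evaluated here.</p>
<preformat>
```python
import math


def rbm_energy(x, h, a, b, w, sigma):
    """Energy E(x, h) of a Gaussian-binary RBM (hypothetical helper)."""
    # Visible term: sum_i (x_i - a_i)^2 / (2 sigma_i^2)
    visible = sum(
        (xi - ai) ** 2 / (2.0 * si ** 2) for xi, ai, si in zip(x, a, sigma)
    )
    # Hidden bias term: sum_j b_j h_j
    hidden = sum(bj * hj for bj, hj in zip(b, h))
    # Coupling term: sum_i sum_j x_i w_ij h_j / sigma_i^2
    coupling = sum(
        x[i] * w[i][j] * h[j] / sigma[i] ** 2
        for i in range(len(x))
        for j in range(len(h))
    )
    return visible - hidden - coupling


def boltzmann_weight(x, h, a, b, w, sigma, beta=1.0):
    """Unnormalized weight exp(-beta E(x, h)); dividing by Z gives P(x, h)."""
    return math.exp(-beta * rbm_energy(x, h, a, b, w, sigma))


# Toy parameters (assumed values): two visible units and two hidden units.
x = [0.5, -0.2]
h = [1, 0]
a = [0.0, 0.1]
b = [0.3, -0.1]
w = [[0.2, -0.4], [0.1, 0.5]]
sigma = [1.0, 1.0]

energy = rbm_energy(x, h, a, b, w, sigma)
weight = boltzmann_weight(x, h, a, b, w, sigma)
```
</preformat>
<p>In this sketch the inverse temperature defaults to 1, matching the choice made above, and the nested list <monospace>w[i][j]</monospace> stores the weight between visible unit <italic>i</italic> and hidden unit <italic>j</italic>.</p>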
</app>
<app id="app2">
<title>Appendix B Distributions of visible units</title>
<p>The distributions of the visible units determine the properties associated with the visible layer. Recall that, for a restricted Boltzmann machine, the transformation between the visible units and the hidden units is <inline-formula id="inf15">
<mml:math id="m38">
<mml:msub>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi mathvariant="bold-italic">&#x3b8;</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>b</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:msub>
<mml:mrow>
<mml:mi>w</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula>. Using this expression and <italic>&#x3b2;</italic> &#x3d; 1, we can write the joint probability distribution as<disp-formula id="eB1">
<mml:math id="m39">
<mml:mtable class="aligned">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mi>P</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">h</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>Z</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mi>exp</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:mfrac>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:msubsup>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
<mml:mo>&#x2b;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:msub>
<mml:mrow>
<mml:mi>b</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi>w</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="right"/>
<mml:mtd columnalign="left">
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>Z</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mi>exp</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:mfrac>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:msubsup>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
<mml:mo>&#x2b;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi mathvariant="bold-italic">&#x3b8;</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
<label>(B1)</label>
</disp-formula>The marginal distribution of the visible units is given by the sum over all possible hidden states, <bold>
<italic>h</italic>
</bold> &#x2208; {0, 1}<sup><italic>H</italic></sup>:<disp-formula id="equ12">
<mml:math id="m40">
<mml:mi>P</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">x</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mfenced open="{" close="}">
<mml:mrow>
<mml:mi mathvariant="bold-italic">h</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:munder>
</mml:mstyle>
<mml:mi>P</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="bold-italic">h</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
</mml:math>
</disp-formula>since the hidden units take only binary values. Inserting the joint probability distribution from Eq. <xref ref-type="disp-formula" rid="eB1">B1</xref> and using the fact that the exponential of a sum over <italic>j</italic> factorizes into a product over <italic>j</italic>, we obtain<disp-formula id="equ13">
<mml:math id="m41">
<mml:mtable class="aligned">
<mml:mtr>
<mml:mtd columnalign="right">
<mml:mi>P</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">x</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mtd>
<mml:mtd columnalign="left">
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>Z</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mfenced open="{" close="}">
<mml:mrow>
<mml:mi mathvariant="bold-italic">h</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:munder>
</mml:mstyle>
<mml:mi>exp</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:mfrac>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:msubsup>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
<mml:mo>&#x2b;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi mathvariant="bold-italic">&#x3b8;</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="right"/>
<mml:mtd columnalign="left">
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>Z</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mi>exp</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:mfrac>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:msubsup>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#xd7;</mml:mo>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munder>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mfenced open="{" close="}">
<mml:mrow>
<mml:mi mathvariant="bold-italic">h</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:munder>
</mml:mstyle>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x220f;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:mi>exp</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="right"/>
<mml:mtd columnalign="left">
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>Z</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mi>exp</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:mfrac>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:msubsup>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mfenced>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x220f;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:mi>exp</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="right"/>
<mml:mtd columnalign="left">
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>Z</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mi>exp</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:mfrac>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:msubsup>
<mml:mrow>
<mml:mi>&#x3c3;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mfenced>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mrow>
<mml:mo>&#x220f;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
</mml:munderover>
</mml:mstyle>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>exp</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">x</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi mathvariant="bold-italic">&#x3b8;</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
<mml:mo>.</mml:mo>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:math>
</disp-formula>We use this expression as the marginal distribution of the visible units.</p>
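<p>The key step of the derivation, replacing the sum over all 2<sup><italic>H</italic></sup> hidden configurations by a product of <italic>H</italic> two-term sums, can be checked numerically on a small example. The following is a minimal sketch in plain Python. The helper names, the list-based layout, and the toy parameters are assumptions made for illustration, and both functions return the marginal up to the common factor 1/<italic>Z</italic>.</p>
<preformat>
```python
import itertools
import math


def f(j, x, b, w, sigma):
    """f_j(x; theta) = b_j + sum_i w_ij x_i / sigma_i^2."""
    return b[j] + sum(w[i][j] * x[i] / sigma[i] ** 2 for i in range(len(x)))


def gaussian_factor(x, a, sigma):
    """exp(-sum_i (x_i - a_i)^2 / (2 sigma_i^2)), which does not depend on h."""
    exponent = sum(
        (xi - ai) ** 2 / (2.0 * si ** 2) for xi, ai, si in zip(x, a, sigma)
    )
    return math.exp(-exponent)


def marginal_brute_force(x, a, b, w, sigma):
    """Unnormalized P(x): explicit sum over all 2^H binary hidden states."""
    n_hidden = len(b)
    total = 0.0
    for h in itertools.product((0, 1), repeat=n_hidden):
        total += math.exp(
            sum(h[j] * f(j, x, b, w, sigma) for j in range(n_hidden))
        )
    return gaussian_factor(x, a, sigma) * total


def marginal_closed_form(x, a, b, w, sigma):
    """Unnormalized P(x): product over j of [1 + exp(f_j(x; theta))]."""
    product = 1.0
    for j in range(len(b)):
        product *= 1.0 + math.exp(f(j, x, b, w, sigma))
    return gaussian_factor(x, a, sigma) * product


# Toy parameters (assumed values): two visible units and three hidden units.
x = [0.5, -0.2]
a = [0.0, 0.1]
b = [0.3, -0.1, 0.2]
w = [[0.2, -0.4, 0.1], [0.1, 0.5, -0.3]]
sigma = [1.0, 0.8]

brute = marginal_brute_force(x, a, b, w, sigma)
closed = marginal_closed_form(x, a, b, w, sigma)
agree = math.isclose(brute, closed, rel_tol=1e-12)
```
</preformat>
<p>The brute-force sum requires 2<sup><italic>H</italic></sup> terms, whereas the closed form requires only <italic>H</italic> factors, which is why the factorized expression is the practical one for larger hidden layers.</p>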
</app>
</app-group>
</back>
</article>