<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Behav. Neurosci.</journal-id>
<journal-title>Frontiers in Behavioral Neuroscience</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Behav. Neurosci.</abbrev-journal-title>
<issn pub-type="epub">1662-5153</issn>
<publisher>
<publisher-name>Frontiers Research Foundation</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fnbeh.2010.00170</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Neuroscience</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Tonic Dopamine Modulates Exploitation of Reward Learning</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Beeler</surname> <given-names>Jeff A.</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="author-notes" rid="fn001">&#x0002A;</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Daw</surname> <given-names>Nathaniel</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Frazier</surname> <given-names>Cristianne R. M.</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Zhuang</surname> <given-names>Xiaoxi</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Department of Neurobiology, University of Chicago</institution> <country>Chicago, IL, USA</country></aff>
<aff id="aff2"><sup>2</sup><institution>Department of Psychology, Center for Neural Science, New York University</institution> <country>New York, NY, USA</country></aff>
<aff id="aff3"><sup>3</sup><institution>Committee on Neurobiology, University of Chicago</institution> <country>Chicago, IL, USA</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Julietta U. Frey, Leibniz Institute for Neurobiology, Germany</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Satoru Otani, University of Paris VI, France; Katharina A. Braun, Otto von Guericke University, Germany</p></fn>
<fn fn-type="corresp" id="fn001"><p>&#x0002A;Correspondence: Jeff Beeler, Department of Neurobiology, The University of Chicago, 924 E 57th Street R222, Chicago, IL 60637, USA. e-mail: <email>jabeeler&#x00040;uchicago.edu</email></p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>04</day>
<month>11</month>
<year>2010</year>
</pub-date>
<pub-date pub-type="collection">
<year>2010</year>
</pub-date>
<volume>4</volume>
<elocation-id>170</elocation-id>
<history>
<date date-type="received">
<day>13</day>
<month>07</month>
<year>2010</year>
</date>
<date date-type="accepted">
<day>11</day>
<month>10</month>
<year>2010</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2010 Beeler, Daw, Frazier and Zhuang.</copyright-statement>
<copyright-year>2010</copyright-year>
<license license-type="open-access" xlink:href="http://www.frontiersin.org/licenseagreement"><p>This is an open-access article subject to an exclusive license agreement between the authors and the Frontiers Research Foundation, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are credited.</p></license>
</permissions>
<abstract>
<p>The impact of dopamine on adaptive behavior in a naturalistic environment is largely unexamined. Experimental work suggests that phasic dopamine is central to reinforcement learning whereas tonic dopamine may modulate performance without altering learning <italic>per se</italic>; however, this idea has not been developed formally or integrated with computational models of dopamine function. We quantitatively evaluate the role of tonic dopamine in these functions by studying the behavior of hyperdopaminergic DAT knockdown mice in an instrumental task in a semi-naturalistic homecage environment. In this &#x0201C;closed economy&#x0201D; paradigm, subjects earn all of their food by pressing either of two levers, but the relative cost for food on each lever shifts frequently. Compared to wild-type mice, hyperdopaminergic mice allocate more lever presses on high-cost levers, thus working harder to earn a given amount of food and maintain their body weight. However, both groups show a similarly quick reaction to shifts in lever cost, suggesting that the hyperdopaminergic mice are not slower at detecting changes, as would be expected with a learning deficit. We fit the lever choice data using reinforcement learning models to assess the distinction between acquisition and expression that the models formalize. In these analyses, hyperdopaminergic mice displayed normal learning from recent reward history but diminished capacity to exploit this learning: a reduced coupling between choice and reward history. These data suggest that dopamine modulates the degree to which prior learning biases action selection and consequently alters the expression of learned, motivated behavior.</p>
</abstract>
<kwd-group>
<kwd>dopamine</kwd>
<kwd>reinforcement learning</kwd>
<kwd>DAT knock-down</kwd>
<kwd>explore-exploit</kwd>
<kwd>behavioral flexibility</kwd>
<kwd>environmental adaptation</kwd>
</kwd-group>
<counts>
<fig-count count="5"/>
<table-count count="2"/>
<equation-count count="2"/>
<ref-count count="105"/>
<page-count count="14"/>
<word-count count="12185"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="introduction">
<title>Introduction</title>
<p>The dopamine system plays a critical role in learning about rewards and performing behaviors that yield them (Berke and Hyman, <xref ref-type="bibr" rid="B6">2000</xref>; Dayan and Balleine, <xref ref-type="bibr" rid="B24">2002</xref>; Wise, <xref ref-type="bibr" rid="B100">2004</xref>; Cagniard et al., <xref ref-type="bibr" rid="B11">2006b</xref>; Daw and Doya, <xref ref-type="bibr" rid="B21">2006</xref>; Salamone, <xref ref-type="bibr" rid="B74">2006</xref>; Berridge, <xref ref-type="bibr" rid="B7">2007</xref>; Day et al., <xref ref-type="bibr" rid="B23">2007</xref>; Phillips et al., <xref ref-type="bibr" rid="B69">2007</xref>; Schultz, <xref ref-type="bibr" rid="B80">2007a</xref>; Belin and Everitt, <xref ref-type="bibr" rid="B5">2008</xref>). Despite the ongoing debate on the precise role of dopamine in learning, motivation, and performance (Wise, <xref ref-type="bibr" rid="B100">2004</xref>; Salamone, <xref ref-type="bibr" rid="B74">2006</xref>; Berridge, <xref ref-type="bibr" rid="B7">2007</xref>), the impact of hypothesized dopamine functions on adaptive behavior in a more (semi-) naturalistic environment is largely unexamined.</p>
<p>In natural environments, animals often have to choose between several actions, and the outcome of these actions may shift across time. As a consequence, the animal has to continually sample the environment and adjust its behavior in response to changing reward contingencies. To accomplish this, the animal must strike a balance between exploiting actions that have been previously rewarded and exploring previously disfavored actions to determine whether contingencies have changed. In the study of reinforcement learning (RL), the challenge of striking such a balance has been termed the explore-exploit dilemma, and formalizes an issue that lies at the heart of behavioral flexibility and adaptive learning (Sutton and Barto, <xref ref-type="bibr" rid="B93">1998</xref>).</p>
<p>An implicit assumption in RL theories is that the learned value expectations determine action choice. Importantly, because of the explore-exploit dilemma, this control is not thought to be absolute: rather, choice in reinforcement learning tasks is characterized by a stochastic soft maximization (&#x0201C;softmax&#x0201D;) rule that allocates choices randomly, but with a bias toward the options believed to be richer (Daw et al., <xref ref-type="bibr" rid="B21">2006</xref>). An important open question, however, is how the brain controls the degree to which choice is focused on apparently better options; that is, how much prior experience biases current action selection. This is commonly operationalized in RL models by a gain parameter (called &#x0201C;temperature&#x0201D;) that scales the effect of learned values on biases in action choice; however, though some hypotheses exist, its physiological instantiation is unknown (Doya, <xref ref-type="bibr" rid="B26">2002</xref>; Daw et al., <xref ref-type="bibr" rid="B21">2006</xref>; Cohen et al., <xref ref-type="bibr" rid="B16">2007</xref>). In the present study, we consider the possibility that dopamine &#x02013; and specifically, dopamine signaling at a tonic timescale &#x02013; might be involved in controlling this aspect of behavioral expression and, as a result, modulate the balance between exploration and exploitation.</p>
<p>The hypothesized role of dopamine in learning about action values (Montague et al., <xref ref-type="bibr" rid="B58">1996</xref>; Schultz et al., <xref ref-type="bibr" rid="B83">1997</xref>) is based largely on recordings of phasic dopamine responses. However, dopamine neurons also exhibit a slower, more regular tonic background activity (Grace and Bunney, <xref ref-type="bibr" rid="B37">1984b</xref>). Pharmacological and genetic experiments, which impact dopamine signaling at a tonic timescale, suggest a role for tonic dopamine in the expression rather than acquisition of motivated behavior (Cagniard et al., <xref ref-type="bibr" rid="B10">2006a</xref>,<xref ref-type="bibr" rid="B11">b</xref>). To date, these experimental observations have not been analyzed in the context of computational reinforcement learning models, in a manner analogous to studies of phasic signaling, which has hampered efforts to formalize these results and to understand the relationship between theories of dopamine&#x00027;s action in performance and learning (Berridge, <xref ref-type="bibr" rid="B7">2007</xref>; Niv et al., <xref ref-type="bibr" rid="B65">2007</xref>; Salamone, <xref ref-type="bibr" rid="B75">2007</xref>). We take advantage of how the distinction between acquisition and expression is formalized in temporal difference RL models through the learning rate and temperature parameters, respectively, to quantitatively evaluate the impact of elevated tonic dopamine on choice behavior in the context of the computational model widely associated with phasic dopamine.</p>
<p>We used a homecage operant paradigm in which mice earn their food entirely through lever pressing. In this &#x0201C;closed economy&#x0201D; (Rowland et al., <xref ref-type="bibr" rid="B72">2008</xref>) with no access to food outside of the work environment, no experimenter-induced food restriction is needed; the amounts of resources gained and spent reflect the animal&#x00027;s behavioral strategy in adapting to its environment. In our paradigm, two levers yield food, but at different costs. At any one time, one lever is inexpensive (requiring few presses per food pellet) and the other is expensive (requiring more presses). Which lever is expensive and which is inexpensive switches every 20&#x02013;40&#x02009;min.</p>
<p>We tested wild-type C57BL/6 mice and hyperdopaminergic dopamine-transporter knock-down mice (DATkd) with reduced DA clearance and elevated extracellular tonic DA (Zhuang et al., <xref ref-type="bibr" rid="B104">2001</xref>). Fitting the data to a reinforcement learning model, we find that altered dopamine modulates temperature &#x02013; the explore-exploit parameter &#x02013; without a change in learning rate, yielding decreased responsiveness to recent reward and diminished behavioral flexibility in response to shifting environmental contingencies.</p>
</sec>
<sec id="s1" sec-type="materials|methods">
<title>Materials and Methods</title>
<sec>
<title>Animals</title>
<p>All mice were male and between 10 and 12&#x02009;weeks of age at the start of the experiment. Wild-type C57BL/6 mice were obtained from Jackson Laboratories. The dopamine transporter knock-down mice (DATkd) were from an established colony backcrossed with C57BL/6 for more than ten generations. The DATkd have been previously described and characterized (Zhuang et al., <xref ref-type="bibr" rid="B104">2001</xref>; Pecina et al., <xref ref-type="bibr" rid="B67">2003</xref>; Cagniard et al., <xref ref-type="bibr" rid="B10">2006a</xref>; Yin et al., <xref ref-type="bibr" rid="B103">2006</xref>). All mice were housed under standard 12:12 light cycles. All animal procedures were approved by the Institutional Animal Care and Use Committee at The University of Chicago.</p>
</sec>
<sec>
<title>Behavior setup and housing</title>
<p>Mice were singly housed in standard cages equipped with two levers (Med-Associates, St. Albans, VT, USA) placed on one side of the cage approximately six inches apart with a food hopper between the levers. A pellet dispenser delivered 20&#x02009;mg grain-based precision pellets (Bio-Serv, Frenchtown, NJ, USA) contingent on lever presses according to a programmed schedule. No other food was available. Water was available <italic>ad libitum</italic>. Upon initial placement in the operant homecages, three pellets were placed in the food hopper and each of the first 50 lever presses on either lever yielded a pellet (continuous reinforcement), after which a fixed ratio (FR) schedule was initiated. The cumulative lever press counts on both levers were reset at each pellet delivery. All mice acquired the lever pressing response overnight. On the first day of FR (baseline), both levers operated on an FR20 schedule. On subsequent days, at any given time one lever was expensive and the other inexpensive. The inexpensive lever was always FR20. The expensive lever incremented by 20 each day from 40 to 200. Which lever was inexpensive and which expensive switched every 20&#x02013;40&#x02009;min. After the final FR200 increment, the program reverted to baseline conditions (FR20 on both levers) for 3&#x02009;days.</p>
</sec>
<sec>
<title>Data collection and analysis</title>
<p>All events &#x02013; lever presses, pellet delivery, cost changes between levers &#x02013; were recorded and time-stamped using Med-PCIV software (Med-Associates, St. Albans, VT, USA). The data were then imported into MATLAB for analysis. Total consumption, high- and low-cost presses, ratio of low-cost to total presses, average cost per pellet, number of meals per day, average size of meals and duration of meals were calculated directly by the program operating the experiment (i.e., Figure <xref ref-type="fig" rid="F1">1</xref> and Table <xref ref-type="table" rid="T1">1</xref>). The onset of a meal was defined as the procurement of one pellet and the offset as the last pellet earned before 30&#x02009;min elapsed without procuring a pellet. To calculate average lever pressing before and after episodes of cost switching between the levers, averaged across the experiment (Figure <xref ref-type="fig" rid="F2">2</xref>), all experimental days (i.e., with a cost differential between levers) were combined into a single dataset for each mouse. The time points for all cost switches were identified, and the 10-min windows (data recorded in 0.1&#x02009;s bins) before and after each switch were averaged across switch episodes. The mean over all events was smoothed with a half-Gaussian filter using a weighted average kernel to retain original <italic>y</italic>-axis values from the data. The resulting smoothed data were averaged across mice within each genotype. To calculate runlength averaged across switch episodes, all lever presses within a run (consecutive presses on one lever without intervening presses on the other lever) were coded as the total length of the run (e.g., for a run of three presses, each would be coded as 3). Time bins in which no lever press occurred were coded with zero. When the mean across episodes was calculated, episodes without any pressing on either lever (e.g., mouse sleeping) were coded as not a number (NaN) and excluded from the mean.
To make statistical comparisons of the above analyses, the raw data (i.e., not smoothed) across 0.1&#x02009;s bins were collapsed into twenty 1-min bins, which were used as repeated measures in two-way ANOVAs. For single statistical comparisons, <italic>t</italic>-tests were used.</p>
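The runlength coding described above can be sketched as follows. This is an illustrative reconstruction in Python (the original analysis was performed in MATLAB), and the function and variable names are ours, not from the original scripts; runs are broken only by a press on the other lever, so empty bins do not end a run.

```python
import numpy as np

def code_runlengths(presses):
    """presses: one entry per time bin, 0 (no press), 1 (lever 1), or
    -1 (lever 2). Returns an array in which every press belonging to a
    run of consecutive same-lever presses is replaced by that run's
    total length; bins with no press stay 0."""
    presses = np.asarray(presses)
    out = np.zeros(len(presses), dtype=int)
    idx = np.flatnonzero(presses)          # positions of actual presses
    start = 0
    while start < len(idx):
        end = start
        # extend the run while the next press is on the same lever
        while end + 1 < len(idx) and presses[idx[end + 1]] == presses[idx[start]]:
            end += 1
        out[idx[start:end + 1]] = end - start + 1
        start = end + 1
    return out
```

For example, a run of three presses on lever 1, two on lever 2, then one on lever 1 yields codes 3, 2, and 1 at the corresponding bins.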
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p><bold>Lever pressing, consumption and body weight across experimental days</bold>. Average number of lever presses (LP) per gram of body weight on the <bold>(A)</bold> expensive lever (genotype, <italic>p</italic>&#x02009;&#x0003C;&#x02009;0.01) and <bold>(B)</bold> inexpensive lever (genotype, NS). <bold>(C)</bold> Ratio of lever presses on the low cost lever to total lever presses (genotype, <italic>p</italic>&#x02009;&#x0003D;&#x02009;0.121). <bold>(D)</bold> Average number of lever presses per pellet earned (genotype, <italic>p</italic>&#x02009;&#x0003D;&#x02009;0.059). <bold>(E)</bold> Average number of pellets earned per day per gram of body weight (genotype, <italic>p</italic>&#x02009;&#x0003D;&#x02009;0.025). <bold>(F)</bold> Daily body weight across experiment (genotype, NS). Error bars&#x02009;&#x0003D; S.E.M., <italic>N</italic>&#x02009;&#x0003D;&#x02009;10.</p></caption>
<graphic xlink:href="fnbeh-04-00170-g001.tif"/>
</fig>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p><bold>Comparison of baseline behavior between genotypes</bold>.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left"/>
<th align="left">Wild-type</th>
<th align="left">DATkd</th>
<th align="left"><italic>t</italic></th>
<th align="left"><italic>p</italic></th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Starting weight</td>
<td align="char" char=".">27.07</td>
<td align="char" char=".">26.79</td>
<td align="char" char=".">&#x02212;0.323</td>
<td align="char" char=".">0.7506</td>
</tr>
<tr>
<td align="left">Consumption (20&#x02009;mg pellets)</td>
<td align="char" char=".">159.7</td>
<td align="char" char=".">158.7</td>
<td align="char" char=".">&#x02212;0.169</td>
<td align="char" char=".">0.8692</td>
</tr>
<tr>
<td align="left">Total lever presses/day</td>
<td align="char" char=".">3394.4</td>
<td align="char" char=".">3415.2</td>
<td align="char" char=".">0.174</td>
<td align="char" char=".">0.8648</td>
</tr>
<tr>
<td align="left">Number of meals/day</td>
<td align="char" char=".">10.5</td>
<td align="char" char=".">10.1</td>
<td align="char" char=".">&#x02212;0.569</td>
<td align="char" char=".">0.5810</td>
</tr>
<tr>
<td align="left">Average meal size</td>
<td align="char" char=".">15.7</td>
<td align="char" char=".">17.0</td>
<td align="char" char=".">0.882</td>
<td align="char" char=".">0.3967</td>
</tr>
<tr>
<td align="left">Average meal duration</td>
<td align="char" char=".">75.3</td>
<td align="char" char=".">78.0</td>
<td align="char" char=".">0.368</td>
<td align="char" char=".">0.7197</td>
</tr>
</tbody>
</table>
</table-wrap>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p><bold>Mean allocation of effort and runlength on the high and low cost lever following the switch in reward contingency (dashed line)</bold>. Mean lever presses per minute 10&#x02009;min before and after reward contingency switch for <bold>(A)</bold> wild-type and <bold>(B)</bold> DATkd (genotype &#x000D7; lever &#x000D7; time, <italic>p</italic>&#x02009;&#x0003C;&#x02009;0.0001). Mean runlength on each lever for <bold>(C)</bold> wild-type and <bold>(D)</bold> DATkd (genotype &#x000D7; lever &#x000D7; time, <italic>p</italic>&#x02009;&#x0003C;&#x02009;0.001). Mean rate of reinforcement across all contingency switches for <bold>(E)</bold> wild-type and <bold>(F)</bold> DATkd on the low &#x02192; high cost lever (solid line, gold shading) and high &#x02192; low cost lever (dotted line, gray shading) averaged across all episodes of contingency switches between levers (vertical dashed lines). Shading&#x02009;&#x0003D;&#x02009;S.E.M., <italic>N</italic>&#x02009;&#x0003D;&#x02009;10.</p></caption>
<graphic xlink:href="fnbeh-04-00170-g002.tif"/>
</fig>
</sec>
<sec>
<title>Data modeling</title>
<p>To model leverpress-by-leverpress how choices were impacted by rewarding feedback, we first removed temporal information from the dataset to express the data as a series of choices <italic>c<sub>t</sub></italic> (&#x0003D;1 or &#x02212;1 according to which was pressed) of either lever, and of accompanying rewards <italic>r<sub>t</sub></italic> (&#x0003D;1, 0, or &#x02212;1 where no reward was coded as 0 and a rewarded response on lever 1 or &#x02212;1 was coded as 1 or &#x02212;1, respectively). We characterized the choice sequences using two models, a more general logistic regression model (Lau and Glimcher, <xref ref-type="bibr" rid="B50">2005</xref>) and a more specific model based on temporal difference learning (Sutton and Barto, <xref ref-type="bibr" rid="B93">1998</xref>), and estimated the free parameters of these models for mice of each genotype.</p>
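The choice/reward coding just described can be made concrete with a minimal sketch. The raw event representation used here (a list of lever/rewarded pairs) is a hypothetical stand-in for the actual Med-PC log format, and the names are ours:

```python
# Minimal sketch of the (c_t, r_t) coding described above; the input
# format is a hypothetical simplification of the raw operant log.
def encode_choices_rewards(events):
    """events: iterable of (lever, rewarded) pairs with lever in {1, -1}.
    Returns (c, r): choices c_t in {1, -1}, and rewards r_t coded 0 when
    unrewarded, else signed by the lever that produced the pellet."""
    c = [lever for lever, _ in events]
    r = [lever if rewarded else 0 for lever, rewarded in events]
    return c, r
```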
<p>In the regression model (Lau and Glimcher, <xref ref-type="bibr" rid="B50">2005</xref>), the dependent variable was taken to be the binary choice variable <italic>c</italic>, and as explanatory variables for each <italic>t</italic> we included the <italic>N</italic> rewards preceding it, <italic>r</italic><sub><italic>t</italic>&#x02212;N&#x02026;t&#x02212;1</sub>. Additionally, we included the prior leverpress (<italic>c</italic><sub><italic>t</italic>&#x02212;1</sub>) to capture a tendency to stay or switch, and a bias variable (1) to capture a fixed, overall preference for or against lever 1, for a total of <italic>N</italic>&#x02009;&#x0002B;&#x02009;2 free parameters (regression weights expressing, for each explanatory variable, how it impacted the chance of choosing either lever). We used logistic regression to estimate maximum likelihood weights for each mouse&#x00027;s choices separately, using the entire dataset concatenated across experimental days. We repeated the fit process for <italic>N</italic>&#x02009;&#x0003D;&#x02009;1&#x02013;100.</p>
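The regression setup can be illustrated by assembling the design matrix of lagged rewards, previous choice, and bias; the fit itself could then be done with any standard logistic regression routine. This is a sketch with illustrative names, not the original MATLAB code:

```python
import numpy as np

def lagged_design(choices, rewards, N):
    """Explanatory variables for each trial t >= N: the N preceding signed
    rewards r_{t-1} ... r_{t-N}, the previous choice c_{t-1} (stay/switch
    tendency), and a constant bias, giving N + 2 regressors per trial.
    y is 1 for lever 1 and 0 for lever -1."""
    c = np.asarray(choices, dtype=float)
    r = np.asarray(rewards, dtype=float)
    X = np.array([np.concatenate([r[t - N:t][::-1],   # r_{t-1} ... r_{t-N}
                                  [c[t - 1], 1.0]])   # prior press + bias
                  for t in range(N, len(c))])
    y = (c[N:] == 1).astype(int)
    return X, y
```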
<p>Error-driven reinforcement learning models such as temporal difference learning are closely related to a special case of the above model (Lau and Glimcher, <xref ref-type="bibr" rid="B50">2005</xref>) with many fewer parameters, and we also fit the parameters of such a model to animals&#x02019; choice behavior. In particular, we assumed subjects maintain a value <italic>V<sub>t</sub></italic> for each lever, and for each choice updated the value of the chosen lever according to <italic>V</italic><sub><italic>t</italic> &#x0002B;&#x02009;1</sub>(<italic>c<sub>t</sub></italic>)&#x02009;&#x0003D;&#x02009;<italic>V<sub>t</sub></italic>(<italic>c<sub>t</sub></italic>) &#x0002B;&#x02009;&#x003B1;<italic><sub>V</sub></italic>&#x000B7;&#x003B4;<italic><sub>t</sub></italic>, where &#x003B1;<italic><sub>V</sub></italic> is a free <italic>learning rate</italic> parameter and the <italic>prediction error</italic> &#x003B4;<italic><sub>t</sub></italic> is the difference between the received and expected reward amounts, which in our notation can be written &#x003B4;<italic><sub>t</sub>&#x02009;&#x0003D;&#x02009;abs</italic>(<italic>r<sub>t</sub></italic>) &#x02212;&#x02009;<italic>V<sub>t</sub></italic>(<italic>c<sub>t</sub></italic>). Additionally, defining &#x02212;<italic>c<sub>t</sub></italic> as the option not chosen, we assumed this option is also updated according to <italic>V</italic><sub><italic>t</italic> &#x0002B;&#x02009;1</sub>(&#x02212;<italic>c<sub>t</sub></italic>) <italic>&#x0003D;&#x02009;V<sub>t</sub></italic>(&#x02212;<italic>c<sub>t</sub></italic>) &#x0002B;&#x02009;&#x003B1;<italic><sub>V</sub></italic>(0 &#x02212;&#x02009;<italic>V<sub>t</sub></italic>(&#x02212;<italic>c<sub>t</sub></italic>)). (See Daw and Dayan, <xref ref-type="bibr" rid="B20">2004</xref>; Corrado et al., <xref ref-type="bibr" rid="B18">2005</xref>; Lau and Glimcher, <xref ref-type="bibr" rid="B50">2005</xref>). 
Finally, we assumed subjects choose probabilistically according to a <italic>softmax</italic> choice rule, which is normally written:</p>
<disp-formula id="E1"><label>(1)</label><mml:math id="m1"><mml:mtable><mml:mtr><mml:mtd><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>c</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>exp</mml:mi><mml:mo>&#x02061;</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:mo>&#x003B2;</mml:mo><mml:mo>&#x022C5;</mml:mo><mml:msub><mml:mi>V</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='false'>)</mml:mo></mml:mrow><mml:mrow><mml:mi>exp</mml:mi><mml:mo>&#x02061;</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:mo>&#x003B2;</mml:mo><mml:mo>&#x022C5;</mml:mo><mml:msub><mml:mi>V</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='false'>)</mml:mo><mml:mo>+</mml:mo><mml:mi>exp</mml:mi><mml:mo>&#x02061;</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:mo>&#x003B2;</mml:mo><mml:mo>&#x022C5;</mml:mo><mml:msub><mml:mi>V</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:mfrac></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mo>=</mml:mo><mml:mo>&#x003C3;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mo>&#x003B2;</mml:mo><mml:mrow><mml:mo>[</mml:mo> <mml:mrow><mml:msub><mml:mi>V</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>V</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow> 
<mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Here the parameter &#x003B2; controls the degree to which choices are focused on the apparently best option. We refer to this parameter as the temperature, although it is technically the inverse temperature; the term originates in statistical mechanics where larger temperatures (here, smaller inverse temperatures) imply that particle velocities are more randomly distributed. In the second form of the equation, &#x003C3;(<italic>z</italic>) is the logistic function 1/(1 &#x0002B;&#x02009;exp(&#x02212;<italic>z</italic>)), highlighting the relationship between the RL model and logistic regression.</p>
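The value update and the softmax rule of Eq. 1 can be written compactly. The following is a minimal Python sketch of one trial of the two-lever model (function names are ours; the original fits were done in MATLAB):

```python
import math

def td_update(V, choice, reward, alpha):
    """One press: the chosen lever's value moves toward |r_t| by the
    prediction error; the unchosen lever decays toward 0.
    V is a dict keyed by lever (1 and -1)."""
    delta = abs(reward) - V[choice]              # prediction error delta_t
    V[choice] = V[choice] + alpha * delta
    V[-choice] = V[-choice] + alpha * (0.0 - V[-choice])
    return V

def p_choose_lever1(V, beta):
    """Eq. 1: softmax probability of lever 1, written via the logistic
    function sigma(beta * (V(1) - V(-1)))."""
    return 1.0 / (1.0 + math.exp(-beta * (V[1] - V[-1])))
```

With equal values the choice is at chance; after a rewarded press on lever 1, the probability of choosing lever 1 rises by an amount controlled by &#946;.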
<p>We augmented the model from Eq. <xref ref-type="disp-formula" rid="E1">1</xref> with additional bias terms, matching those used in the logistic regression model. Also, because the fits of the logistic regression model (see <xref ref-type="sec" rid="s2">Results</xref>) suggested additional short-latency effects of reward on choice, we included an additional term to capture these effects:</p>
<disp-formula id="E2"><label>(2)</label><mml:math id="m2"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>c</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mo>&#x02009;</mml:mo><mml:mo>&#x02009;</mml:mo><mml:mo>&#x02009;</mml:mo><mml:mo>=</mml:mo><mml:mo>&#x003C3;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mo>&#x003B2;</mml:mo><mml:mi>V</mml:mi></mml:msub><mml:mrow><mml:mo>[</mml:mo> <mml:mrow><mml:msub><mml:mi>V</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>V</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow> <mml:mo>]</mml:mo></mml:mrow><mml:mo>+</mml:mo><mml:msub><mml:mo>&#x003B2;</mml:mo><mml:mn>1</mml:mn></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mo>&#x003B2;</mml:mo><mml:mi>c</mml:mi></mml:msub><mml:msub><mml:mi>c</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mo>&#x003B2;</mml:mo><mml:mi>s</mml:mi></mml:msub><mml:mrow><mml:mo>[</mml:mo> <mml:mrow><mml:msub><mml:mi>S</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>S</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy='false'>)</mml:mo></mml:mrow> <mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Here, as in the logistic regression model, the parameters &#x003B2;<sub>1</sub> and &#x003B2;<sub>c</sub> code biases for or against lever 1, and for or against sticking with the previous choice. <italic>S<sub>t</sub></italic> is a second, &#x0201C;short-latency&#x0201D; value function updated from received rewards using the same learning rules as <italic>V<sub>t</sub></italic> but with its own learning rate and temperature parameters, &#x003B1;<sub>s</sub> and &#x003B2;<italic><sub>s</sub></italic>. As for the logistic regression model, we fit the model of Eq.&#x02009;<xref ref-type="disp-formula" rid="E2">2</xref> to the choice and reward sequences for each mouse separately, in order to extract maximum likelihood estimates for the six free parameters (&#x003B1;<sub>V</sub>, &#x003B2;<sub>V</sub>, &#x003B1;<sub>s</sub>, &#x003B2;<italic><sub>s</sub></italic>, &#x003B2;<sub>1</sub>, and &#x003B2;<sub>c</sub>). For this, we searched for parameter estimates that maximized the log likelihood of the entire choice sequence (the sum over trials of the log of Eq. <xref ref-type="disp-formula" rid="E2">2</xref>) using a non-linear function optimizer (fmincon from MATLAB optimization toolbox, Mathworks, Natick, MA, USA).</p>
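As a sketch of the likelihood being maximized, the following computes the negative log likelihood of a choice sequence under a simplified version of Eq. 2 (the short-latency S term is omitted for brevity, so only four of the six parameters appear). In practice this function would be handed to a numerical optimizer, analogous to the fmincon fit described above; all names are illustrative:

```python
import math

def neg_log_lik(params, choices, rewards):
    """Negative log likelihood under a reduced Eq. 2 (V term, lever bias,
    stay bias; no short-latency S term). params = (alpha, beta, b1, bc)."""
    alpha, beta, b1, bc = params
    V = {1: 0.0, -1: 0.0}
    prev_c = 0.0
    nll = 0.0
    for c, r in zip(choices, rewards):
        z = beta * (V[1] - V[-1]) + b1 + bc * prev_c
        p1 = 1.0 / (1.0 + math.exp(-z))          # P(c_t = 1)
        nll -= math.log(p1 if c == 1 else 1.0 - p1)
        V[c] += alpha * (abs(r) - V[c])          # chosen lever update
        V[-c] += alpha * (0.0 - V[-c])           # unchosen lever decays
        prev_c = c
    return nll
```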
<p>To measure goodness of model fit, we report a pseudo-<italic>r</italic><sup>2</sup> statistic (Camerer and Ho, <xref ref-type="bibr" rid="B13">1999</xref>; Daw et al., <xref ref-type="bibr" rid="B21">2006</xref>), defined as (<italic>R&#x02009;</italic>&#x02212; <italic>L</italic>)<italic>/R</italic>, where <italic>R</italic> is the negative log likelihood of the data under random chance (the number of choices multiplied by &#x02212;<italic>log</italic>(0.5)), and <italic>L</italic> is the negative log likelihood of the data under the model. To compare models, we used the Bayesian Information Criterion (Schwarz, <xref ref-type="bibr" rid="B85">1978</xref>) to correct the raw likelihoods for the number of free parameters fit. Likelihoods and BIC scores were aggregated across mice. For comparing parameters between genotypes, we treated the parameter estimates as random variables instantiated once per animal and then tested for between-group differences with two-sample <italic>t</italic>-tests. For visualization purposes, we plotted the mean coefficients for lagged reward from the logistic regression model with <italic>N</italic>&#x02009;&#x0003D;&#x02009;100, averaged across animals within each genotype. For the reinforcement learning model, we computed the equivalent weights on lagged rewards implicit in Eq.&#x02009;<xref ref-type="disp-formula" rid="E2">2</xref> (for rewards &#x003C4; trials ago, this is &#x003B1;<sub>V</sub>&#x000B7;&#x003B2;<sub>V</sub>&#x000B7;(1 &#x02013;&#x02009;&#x003B1;<sub>V</sub>)<sup>&#x003C4;&#x02013;1</sup> &#x0002B;&#x02009;&#x003B1;<sub>S</sub>&#x000B7;&#x003B2;<sub>S</sub>&#x000B7;(1 &#x02013;&#x02009;&#x003B1;<sub>S</sub>)<sup>&#x003C4;&#x02013;1</sup>, which can be obtained by iteratively substituting the update rules for <italic>V</italic> and <italic>S</italic> into Eq. <xref ref-type="disp-formula" rid="E2">2</xref>, &#x003C4; times), and again averaged these across animals.</p>
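The two fit statistics defined above are straightforward to compute; a minimal sketch (function names are ours):

```python
import math

def pseudo_r2(nll, n_choices):
    """(R - L) / R, where R = n * -log(0.5) is the chance-level negative
    log likelihood and L (nll) is the model's negative log likelihood."""
    R = n_choices * -math.log(0.5)
    return (R - nll) / R

def bic(nll, n_params, n_choices):
    """Bayesian Information Criterion: twice the negative log likelihood
    plus a complexity penalty that grows with the parameter count."""
    return 2.0 * nll + n_params * math.log(n_choices)
```

A model fitting no better than chance scores a pseudo-<italic>r</italic><sup>2</sup> of 0, and a perfect model scores 1; lower BIC indicates a better parameter-count-adjusted fit.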
</sec>
</sec>
<sec id="s2">
<title>Results</title>
<sec>
<title>Wild-type and DATkd exhibit similar behavior when the cost of both levers is low</title>
<p>To assess potential non-task-related differences between the groups, baseline behavior was examined during periods in which both levers yielded reward equally on a low-cost, FR20 schedule. Baseline measures were taken at the beginning and end of the experimental period. As there were no significant differences between pre- and post-experiment consumption (mean difference in food consumed, 0.15&#x02009;g; <italic>t</italic>&#x02009;&#x0003D;&#x02009;0.732, <italic>p</italic>&#x02009;&#x0003D;&#x02009;0.4792, <italic>N</italic>&#x02009;&#x0003D;&#x02009;6&#x02013;7), they are combined in Table <xref ref-type="table" rid="T1">1</xref>. No differences were observed in total consumption, total lever pressing, number of meals, meal size, meal duration, or starting weights between the groups. Although hyperdopaminergic mice have been associated with greater motivation and willingness to work for reward when food-restricted (Cagniard et al., <xref ref-type="bibr" rid="B10">2006a</xref>,<xref ref-type="bibr" rid="B11">b</xref>), we observed no difference in primary motivation for food or in the expenditure of energy (lever pressing) to obtain food under these initial, low-cost conditions.</p>
</sec>
<sec>
<title>DATkd mice allocate more effort to high-cost lever pressing</title>
<p>During the experimental period there is always a cost differential between the levers, and the assignment of low versus high cost to the left or right levers switches every 20&#x02013;40&#x02009;min. Figures <xref ref-type="fig" rid="F1">1</xref>A and B show lever pressing on the high and low cost levers across the experiment. A full repeated-measures ANOVA with genotype and lever as independent variables reveals a significant main effect of genotype (<italic>F</italic><sub>(1,18)</sub>&#x02009;&#x0003D;&#x02009;17.13, <italic>p</italic>&#x02009;&#x0003C;&#x02009;0.001) and a trend for a genotype &#x000D7;&#x02009;lever interaction (<italic>F</italic><sub>(1,18)</sub>&#x02009;&#x0003D;&#x02009;3.43, <italic>p</italic>&#x02009;&#x0003D;&#x02009;0.08) on lever pressing. Analyzing the levers separately, the DATkd mice expend more effort on the high cost lever than wild-type (Figure <xref ref-type="fig" rid="F1">1</xref>A, <italic>F</italic><sub>(1,144)</sub>&#x02009;&#x0003D;&#x02009;8.65, <italic>p</italic>&#x02009;&#x0003C;&#x02009;0.01). There is no statistically significant difference in pressing on the low cost lever (Figure <xref ref-type="fig" rid="F1">1</xref>B, <italic>F</italic><sub>(1,144)</sub>&#x02009;&#x0003D;&#x02009;1.95, <italic>p</italic>&#x02009;&#x0003D;&#x02009;0.179). This significant increase in high-cost pressing results in a trend toward a diminished ratio of low-cost to total pressing (Figure <xref ref-type="fig" rid="F1">1</xref>C, <italic>F</italic><sub>(1,144)</sub>&#x02009;&#x0003D;&#x02009;2.64, <italic>p</italic>&#x02009;&#x0003D;&#x02009;0.121) and, as a result, the DATkd, on average, expend more effort lever pressing to earn each pellet than wild-type mice (Figure <xref ref-type="fig" rid="F1">1</xref>D, <italic>F</italic><sub>(1,144)</sub>&#x02009;&#x0003D;&#x02009;4.04, <italic>p</italic>&#x02009;&#x0003D;&#x02009;0.059). 
Data in Figures <xref ref-type="fig" rid="F1">1</xref>A and B are normalized to body weight, i.e., lever presses per gram of body weight.</p>
<p>The DATkd mice consume more food (Figure <xref ref-type="fig" rid="F1">1</xref>E, <italic>F</italic><sub>(1,144)</sub>&#x02009;&#x0003D;&#x02009;5.94, <italic>p</italic>&#x02009;&#x0003D;&#x02009;0.025) per gram of body weight without gaining more weight than wild-type (Figure <xref ref-type="fig" rid="F1">1</xref>F, <italic>F</italic><sub>(1,144)</sub>&#x02009;&#x0003D;&#x02009;0.01, <italic>p</italic>&#x02009;&#x0003D;&#x02009;0.922), reflecting a less efficient behavioral strategy for maintaining energy balance. That is, the DATkd mice work harder and eat more to maintain the same body weight as wild-type. The increase in consumption does not reflect an overall higher basal activity level as there were no consumption or weight differences when the cost of both levers was low.</p>
</sec>
<sec>
<title>Wild-type and DATkd both respond to cost switches between levers but employ different strategies for maximizing reward</title>
<p>There are several possible explanations for why the DATkd spend more effort working for food on the high-cost lever in order to maintain their body weight. Their learning may be impaired, such that they cannot process reward information accurately and efficiently enough to respond to changes in reward contingencies between the levers. Alternatively, they may be more perseverative in their behavior, making it difficult for them to disengage from one lever and engage the other. This would not only result in wasted presses on the high cost lever, but would also reduce sampling efficiency, making them slower to recognize when the cost contingencies between levers have changed. To examine their behavioral strategies in greater detail, we analyzed lever pressing on the high and low cost levers before and after episodes of contingency switches between the levers.</p>
</sec>
<sec>
<title>Total effort allocation</title>
<p>Figures <xref ref-type="fig" rid="F2">2</xref>A and B show the average lever press rate on both levers 10&#x02009;min prior to and after a switch in reward contingencies between the levers (vertical dashed line), averaged across the experiment. A significant difference is observed in the pattern of responding across contingency changes between the groups (Figures <xref ref-type="fig" rid="F2">2</xref>A and B; genotype main effect, <italic>F</italic><sub>(1,342)</sub>&#x02009;&#x0003D;&#x02009;17.11, <italic>p</italic>&#x02009;&#x0003C;&#x02009;0.001; genotype&#x02009;&#x000D7; lever&#x02009;&#x000D7; time, <italic>F</italic><sub>(19,342)</sub>&#x02009;&#x0003D;&#x02009;2.53, <italic>p</italic>&#x02009;&#x0003C;&#x02009;0.001). Prior to a switch in reward contingencies, wild-type mice exhibit pressing on both levers but clearly favor the inexpensive lever (Figure <xref ref-type="fig" rid="F2">2</xref>A; pre-switch main effect of lever, <italic>F</italic><sub>(1,81)</sub>&#x02009;&#x0003D;&#x02009;15.07, <italic>p</italic>&#x02009;&#x0003D;&#x02009;0.0037). After cost contingencies switch, the wild-type show an initial burst of activity on what was once the low cost lever, but is now more expensive, followed by a decline in presses on this lever (Figure <xref ref-type="fig" rid="F2">2</xref>A; post-switch lever&#x02009;&#x000D7; time, <italic>F</italic><sub>(9,81)</sub>&#x02009;&#x0003D;&#x02009;72.518, <italic>p</italic>&#x02009;&#x0003D;&#x02009;0.0001). After this burst, they increase their pressing on the newly established low cost lever, reversing their distribution of pressing in order to favor lower pressing per pellet (Figure <xref ref-type="fig" rid="F2">2</xref>A; last five bins only, lever main effect, <italic>F</italic><sub>(1,36)</sub>&#x02009;&#x0003D;&#x02009;10.726, <italic>p</italic>&#x02009;&#x0003D;&#x02009;0.0096). 
The observed increase in pressing on the previously cheap but now expensive lever could reflect the animals&#x02019; recognition of the contingency change or could arise simply as a consequence of continuing to press the previously preferred lever until it yields reward at the higher ratio. Figure <xref ref-type="fig" rid="F2">2</xref>E shows the rate of earned reinforcement 10&#x02009;min prior to and following the shift in lever costs, averaged across the experiment. After the contingency change, there is an immediate increase in earned rewards on the now cheap lever, followed by a brief decrease before the mice establish a new preference, shifting effort to the now cheap lever. This indicates that the burst on the previously cheap, now expensive lever does not arise simply because the mice are completing the now higher ratio. Instead, the mice rapidly experience reward at the new contingencies but nonetheless return to the previously cheap lever and persist with it temporarily before shifting and establishing a new preference. This suggests that the sharp increase in presses on the previously cheap, now expensive lever following contingency changes is analogous to an extinction burst. These data demonstrate that wild-type mice have an overall preference for the low cost lever (Figure <xref ref-type="fig" rid="F2">2</xref>A; full lever &#x000D7;&#x02009;time, <italic>F</italic><sub>(19,171)</sub>&#x02009;&#x0003D;&#x02009;17.9, <italic>p</italic>&#x02009;&#x0003C;&#x02009;0.0001) and an ability to recognize when the reward contingencies switch between levers. After a contingency change, the wild-type mice sample the new contingencies to establish the relative value of each lever and then establish a new policy to exploit their updated knowledge until the next contingency switch.</p>
<p>In contrast, the DATkd do not show a preference for the low-cost lever prior to contingency changes (Figure <xref ref-type="fig" rid="F2">2</xref>B; pre-switch main effect of lever, <italic>F</italic><sub>(1,81)</sub>&#x02009;&#x0003D;&#x02009;0.176, <italic>p</italic>&#x02009;&#x0003D;&#x02009;0.6848). However, they exhibit the same initial response to a change in cost contingencies as the wild-type (Figure <xref ref-type="fig" rid="F2">2</xref>B; post-switch lever &#x000D7;&#x02009;time, <italic>F</italic><sub>(9,81)</sub>&#x02009;&#x0003D;&#x02009;9.127, <italic>p</italic>&#x02009;&#x0003C;&#x02009;0.001): an initial burst of activity on what was once the low cost lever, but is now more expensive. After this burst, the DATkd do not show a preference for one lever or the other (Figure <xref ref-type="fig" rid="F2">2</xref>B; last five bins only, lever main effect, <italic>F</italic><sub>(1,36)</sub>&#x02009;&#x0003D;&#x02009;0.035, <italic>p</italic>&#x02009;&#x0003D;&#x02009;0.8556). Figure <xref ref-type="fig" rid="F2">2</xref>F shows that, like the wild-type, the DATkd mice also receive immediate reinforcement under the new contingencies, suggesting that the increased pressing on the previously cheap lever, as in wild-type, reflects an extinction burst. This indicates that the DATkd are sensitive to changes in reward contingencies and, like wild-type, sample the new contingencies to establish a new action policy (Figure <xref ref-type="fig" rid="F2">2</xref>B; full lever &#x000D7;&#x02009;time, <italic>F</italic><sub>(19,171)</sub>&#x02009;&#x0003D;&#x02009;3.39, <italic>p</italic>&#x02009;&#x0003C;&#x02009;0.0001), ruling out the possibility that the DATkd are slower to recognize changes in the costs of the levers. However, despite their sensitivity to changes in the cost of rewards and the energetic advantage this knowledge could potentially provide if they were to exploit it, they do not preferentially press the inexpensive lever. 
Instead, they adopt an action policy of pressing both levers equally, despite the levers&#x02019; relative rates of return.</p>
</sec>
<sec>
<title>Run length as an index of persistence</title>
<p>Measuring average lever press rates alone does not enable us to evaluate the pattern of switching between levers. To study this pattern, we analyzed run length &#x02013; the number of consecutive presses on a single lever before switching to the other lever (see <xref ref-type="sec" rid="s1">Materials and Methods</xref>) &#x02013; observing a significantly different pattern between the groups (Figures <xref ref-type="fig" rid="F2">2</xref>C and D; genotype &#x000D7;&#x02009;lever &#x000D7;&#x02009;time, <italic>F</italic><sub>(19,342)</sub>&#x02009;&#x0003D;&#x02009;3.545, <italic>p</italic>&#x02009;&#x0003C;&#x02009;0.0001). In wild-type, run length is consistent with the distribution of pressing observed in Figure <xref ref-type="fig" rid="F2">2</xref>A: the mice show greater run length on the low cost lever prior to the reward contingency switch between levers, followed by an extinction burst on the now high cost lever and a subsequent increase in run length on the now low cost lever (Figure <xref ref-type="fig" rid="F2">2</xref>C; lever &#x000D7;&#x02009;time, <italic>F</italic><sub>(18,162)</sub>&#x02009;&#x0003D;&#x02009;4.674, <italic>p</italic>&#x02009;&#x0003C;&#x02009;0.0001). In contrast, prior to the reward contingency change, the DATkd show greater run length on the expensive lever. After the change in costs between levers, the DATkd <italic>decrease</italic> their run length on the new low cost lever and <italic>increase</italic> persistence on the new high cost lever, resulting overall in no significant difference in pressing between the levers across time (Figure <xref ref-type="fig" rid="F2">2</xref>D; lever &#x000D7;&#x02009;time, <italic>F</italic><sub>(18,162)</sub>&#x02009;&#x0003D;&#x02009;0.317, <italic>p</italic>&#x02009;&#x0003D;&#x02009;0.9967). This indicates that the DATkd increase or decrease their persistence commensurate with the cost of both levers, rather than focusing long runs on the low cost lever. 
Again, this suggests that the hyperdopaminergic mice are sensitive to contingency changes and their persistence on the expensive lever, relative to wild-type, is not indiscriminate.</p>
</sec>
<sec>
<title>Rate of responding and post-reinforcement pauses similar between groups</title>
<p>Apparent differences in choice behavior between the genotypes might arise secondary to a more fundamental difference in motor performance. We analyzed several measures to assess this possibility and find little difference between the groups. First, there is no significant difference between groups in the rate of responding averaged across meal episodes (mean: WT, 4.75 &#x000B1;&#x02009;0.173; DATkd, 5.52 &#x000B1;&#x02009;0.236; genotype main effect, <italic>F</italic><sub>(1,180)</sub>&#x02009;&#x0003D;&#x02009;2.347, <italic>p</italic>&#x02009;&#x0003D;&#x02009;0.1429, data not shown). Second, a histogram of inter-response times (IRTs) normalized as a percentage of total IRTs shows no main effect of genotype (Figures <xref ref-type="fig" rid="F3">3</xref>A and B; <italic>F</italic><sub>(1,162)</sub>&#x02009;&#x0003D;&#x02009;3.155, <italic>p</italic>&#x02009;&#x0003D;&#x02009;0.0925), though wild-type exhibit a slightly greater percentage of shorter IRTs (Figure <xref ref-type="fig" rid="F3">3</xref>A; genotype &#x000D7;&#x02009;bins <italic>F</italic><sub>(9,162)</sub>&#x02009;&#x0003D;&#x02009;2.67, <italic>p</italic>&#x02009;&#x0003D;&#x02009;0.0065). These data suggest no great differences between the groups in rate of responding, though the wild-type may exhibit slightly more rapid, successive presses. Because subtle differences in pausing after reward may be lost in the IRT histogram, we specifically evaluated post-reinforcement pauses (PRPs). Figures <xref ref-type="fig" rid="F3">3</xref>C and D show histograms of PRPs for both groups, with no significant differences observed. Together with the absence of differences at baseline, these data indicate that generalized performance or vigor differences between the groups cannot account for the observed difference in behavioral choices and strategy.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p><bold>Inter-response times and post-reinforcement pauses across experiment</bold>. <bold>(A)</bold> Histogram of inter-response times (IRTs) in 1&#x02009;s bins normalized to percentage of total IRTs for WT (blue bars) and DATkd (red bars) (genotype &#x000D7; bins, <italic>p</italic>&#x02009;&#x0003D;&#x02009;0.0065). <bold>(B)</bold> Scatterplot of individual subject IRT histograms. <bold>(C)</bold> Histogram of post-reinforcement pauses (PRPs) in 1&#x02009;s bins normalized to total PRPs for WT (blue trace) and DATkd (red trace) (genotype &#x000D7; bins, NS). <bold>(D)</bold> Scatterplot of individual subject PRP histograms. <italic>N</italic>&#x02009;&#x0003D;&#x02009;10.</p></caption>
<graphic xlink:href="fnbeh-04-00170-g003.tif"/>
</fig>
</sec>
<sec>
<title>DATkd show effort distribution similar to wild-type when cost differential is stationary</title>
<p>There are several potential explanations for the behavioral results described above. The DATkd mice may be insensitive to costs and/or might derive some intrinsic value from lever pressing itself. To test these possibilities, we conducted a similar experiment with a cheap and an expensive lever, but in which the assignment of cheap and expensive to each lever remained constant. We observe no significant differences between the groups in this stationary version of the paradigm (Figures <xref ref-type="fig" rid="F4">4</xref>A&#x02013;D). This clearly indicates that the DATkd do not derive an intrinsic value from lever pressing. More importantly, though the results in the switching paradigm are consistent with a reduced sensitivity to cost in the DATkd, this experiment indicates that they are not indifferent to cost. Thus, their apparent reduced sensitivity to cost in the switching paradigm arises as a consequence of how they use reward (and cost) history to determine their behavioral strategy in a dynamic environment, and not as a result of a generalized indifference to cost.</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p><bold>Effort and earned rewards when the price of the high and low cost levers does not switch</bold>. Average lever presses on the <bold>(A)</bold> low cost and <bold>(B)</bold> high cost lever and average pellets earned on <bold>(C)</bold> low and <bold>(D)</bold> high cost levers as the price of the high cost lever increases across days. No significant genotype differences across panels. Error bars&#x02009;&#x0003D; S.E.M, <italic>N</italic>&#x02009;&#x0003D;&#x02009;6.</p></caption>
<graphic xlink:href="fnbeh-04-00170-g004.tif"/>
</fig>
</sec>
<sec>
<title>DATkd lever choice is less influenced by recent reward than wild-type</title>
<p>The aggregate behavioral measures examined so far arise from cumulative, choice-by-choice decision-making. Animals must allocate their lever presses guided by recent rewarding outcomes, which are the only feedback that signals the periodic changes in cost contingencies. To understand how animals adapted their lever pressing, choice-by-choice, in response to reward outcomes and history, we fit behavior with reinforcement learning models that predict lever choice as a function of past experience (e.g., Lau and Glimcher, <xref ref-type="bibr" rid="B50">2005</xref>). For this analysis, we considered only which levers were chosen in what order, and not the actual timing of lever presses. In this way, we were able to abstract away the temporal patterning of the behavior and analyze the choice between levers in a manner consistent with previous work on tasks in which choices occurred in discrete trials rather than ongoing free-operant responses (Sugrue et al., <xref ref-type="bibr" rid="B92">2004</xref>; Lau and Glimcher, <xref ref-type="bibr" rid="B50">2005</xref>; Daw et al., <xref ref-type="bibr" rid="B21">2006</xref>). We used two models adapted from that literature, first a general logistic regression model that tests the overall form of the learning constrained by few assumptions (Lau and Glimcher, <xref ref-type="bibr" rid="B50">2005</xref>) and, suggested by these fits, a more specific model based on temporal difference learning (Sutton and Barto, <xref ref-type="bibr" rid="B93">1998</xref>). Parameters estimated from the fit of the more specific model characterize different aspects of the learning, and these were compared between genotypes.</p>
<p>First, logistic regression was used to predict choices as a function of the rewards received (or not) for recent previous lever presses, along with additional predictive variables to capture biases (see <xref ref-type="sec" rid="s1">Materials and Methods</xref>). Figure <xref ref-type="fig" rid="F5">5</xref>A depicts the regression coefficients for rewards received from 1 to 100 lever presses previously, in predicting the current lever press. Coefficients (<italic>y</italic>-axis) greater than zero indicate that a reward tends to promote staying on the lever that produced it, while coefficients less than zero indicate that rewards instead promote switching. A standard error-driven reinforcement learning model (such as Eq. <xref ref-type="disp-formula" rid="E1">1</xref> from Materials and Methods) is equivalent to the logistic regression model with reward history coefficients that are everywhere positive, largest for the most recent rewards and with the effect of reward declining exponentially with delay (Lau and Glimcher, <xref ref-type="bibr" rid="B50">2005</xref>). The coefficients illustrated in Figure <xref ref-type="fig" rid="F5">5</xref>A instead were sharply negative for the most recent reward, indicating a strong tendency to switch to the other lever. This effect decayed quickly and was replaced by the opposite tendency to stay on the lever that recently yielded reward.</p>
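A minimal Python sketch of this style of lagged-reward regression (our own illustration, not the analysis code used in the study): signed reward-history regressors are built per lag and fit by maximum likelihood, so a positive coefficient at a lag means a reward at that lag promotes staying on the lever that produced it.

```python
import numpy as np
from scipy.optimize import minimize

def lagged_reward_design(choices, rewards, n_lags):
    """One column per lag: a reward earned on lever 0 enters as +1 and a
    reward earned on lever 1 as -1, for each of the n_lags presses
    preceding the current press (unrewarded presses enter as 0)."""
    choices = np.asarray(choices)
    signed = np.asarray(rewards, float) * np.where(choices == 0, 1.0, -1.0)
    T = len(choices)
    X = np.zeros((T, n_lags))
    for lag in range(1, n_lags + 1):
        X[lag:, lag - 1] = signed[:T - lag]
    return X

def fit_logistic(X, y):
    """Maximum likelihood logistic regression: P(y = 1) = sigmoid(X @ w),
    where y codes a press on lever 0. Positive weights mean recent reward
    promotes staying; negative weights mean it promotes switching."""
    def nll(w):
        z = X @ w
        return np.sum(np.logaddexp(0.0, z) - y * z)  # stable -log likelihood
    return minimize(nll, np.zeros(X.shape[1]), method="BFGS").x
```

Bias terms (e.g., for the previously chosen lever) would enter as additional columns of the design matrix in the same way.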
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p><bold>Model of reward function and persistence on high and low cost lever averaged across reward procurement</bold>. <bold>(A)</bold> Reward history as 100 discrete parameters representing 100 actions (rewarded or not) back in time, solid line represents group averages superimposed on a scatterplot of individual subjects (wild-type, blue; DATkd, red). <bold>(B)</bold> Reward as a continuous function comprised of two exponentials (4 parameters). Though the function incorporates the effect of reward infinitely back in time, only the first 100 actions back are shown. Light traces show curves plotted &#x000B1; standard error of parameters within groups. <bold>(C,D)</bold> The two exponentials of the model plotted separately. Solid lines represent model using group means of parameters and light traces represent individual subjects. See Table <xref ref-type="table" rid="T2">2</xref> for statistics. N&#x02009;&#x0003D;&#x02009;10.</p></caption>
<graphic xlink:href="fnbeh-04-00170-g005.tif"/>
</fig>
<p>We reasoned that, rather than following a single exponential curve as in a standard reinforcement learning model, the reward dependency was well characterized by the superposition of two exponentials: a short-latency tendency to switch that initially overwhelms a more traditional, longer-latency value learning process.</p>
<p>We therefore fit the animals&#x02019; choices with an augmented error-driven learning model (Eq.&#x02009;<xref ref-type="disp-formula" rid="E2">2</xref>), which included a standard value learning process accompanied by a second, short-latency process plus bias terms. This is equivalent to constraining the reward history coefficients from the logistic regression model to follow a curve described by the sum of two exponentials. Figure <xref ref-type="fig" rid="F5">5</xref>B displays the reward dependency curves implied by the best-fitting parameters of this reduced model to the choice data, in the same manner as those from the regression model; they appear to capture the major features of the original fits while somewhat &#x0201C;cleaning up&#x0201D; the noise. Although the reinforcement learning model had far fewer free parameters than the regression model (six per animal), it fit the choice data nearly as well (negative log likelihood, aggregated over animals, 1.156e &#x0002B;&#x02009;5; pseudo-<italic>r</italic><sup>2</sup>, 0.83). In order to compare the goodness of fit taking into account the number of parameters optimized, we used the Bayesian Information Criterion (BIC; Schwarz, <xref ref-type="bibr" rid="B85">1978</xref>) to penalize data likelihoods for the number of free parameters. According to this score, the best of the regression models, trading off fit and complexity, was that for <italic>N&#x02009;&#x0003D;&#x02009;</italic>20 (the number of rewards back in time for which coefficients were fit; 22 free parameters per animal; negative log likelihood, 1.168e &#x0002B;&#x02009;5; pseudo-<italic>r</italic><sup>2</sup>, 0.83). The 6-parameter reinforcement learning model thus fit the data better (smaller negative log likelihood) than this model, even before correcting for the fact that it had about 1/4 the number of free parameters. 
(The difference in BIC-corrected likelihoods was 4.81e &#x0002B;&#x02009;4 in favor of the simpler model, which constitutes &#x0201C;very strong&#x0201D; evidence according to the guidelines of Kass and Raftery, <xref ref-type="bibr" rid="B45">1995</xref>). In all, these results suggest that the choice data were well characterized by the 6-parameter reinforcement learning model.</p>
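To make the shape of this sum-of-two-exponentials reward-dependency curve concrete, it can be evaluated numerically. The parameter values below are purely illustrative (chosen to be of roughly the magnitudes reported in Table 2), not the fitted estimates:

```python
import numpy as np

def reward_dependency(taus, aV, bV, aS, bS):
    """Implied weight on a reward tau presses back: the sum of a slow
    'stay' exponential (positive beta_V) and a fast 'switch' exponential
    (negative beta_S), as in the expression given in Materials and Methods."""
    taus = np.asarray(taus, dtype=float)
    return (aV * bV * (1.0 - aV) ** (taus - 1)
            + aS * bS * (1.0 - aS) ** (taus - 1))

taus = np.arange(1, 101)
# Illustrative parameters only: slow value process vs. fast switch process.
w = reward_dependency(taus, aV=0.04, bV=40.0, aS=0.34, bS=-8.0)
```

With these values the most recent reward carries a negative net weight (a tendency to switch) that decays within a few presses and gives way to a smaller, slowly decaying positive weight (a tendency to stay), matching the qualitative pattern in Figure 5B.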
<p>Finally, having developed, fit and validated a computational characterization of the choice behavior, we used the estimates of the model&#x00027;s free parameters to compare the learning process between genotypes. Table <xref ref-type="table" rid="T2">2</xref> presents fitted parameters for each group and statistical comparisons. These comparisons show a selective difference in the parameter &#x003B2;<sub>V</sub>, which was smaller in the DATkd mice (<italic>t</italic>&#x02009;&#x0003D;&#x02009;3.1, <italic>p</italic>&#x02009;&#x0003C;&#x02009;0.01). This is the temperature parameter for the value learning process, which controls the extent to which learning about values guides action choice. This is consistent with the aggregate findings (Figures <xref ref-type="fig" rid="F1">1</xref> and <xref ref-type="fig" rid="F2">2</xref>) that they distribute effort more evenly across both levers, resulting in more high cost lever presses and an overall less cost-effective behavioral strategy. By contrast, the remaining parameters of the model did not differ. These results suggest that the effect of the DAT knockdown was specific to the value learning process and not to the short-latency switching part of the model (Figures <xref ref-type="fig" rid="F5">5</xref>C and D, two exponentials plotted separately) or the other bias terms. Within value learning, the genotype difference was specific to the temperature parameter rather than the learning rate parameter &#x003B1;<sub>V</sub>, which characterizes how readily values adapt to feedback. This selective difference between groups is also apparent in Figures <xref ref-type="fig" rid="F5">5</xref>B and C, where the tendency toward a short latency switch following a reward appears similar between groups, but the subsequent countervailing tendency to return to a lever that has delivered reward appears blunted (Figures <xref ref-type="fig" rid="F5">5</xref>B and D, lower peak). 
Although this tendency is scaled down in the DATkd mice, the time course over which rewards exert their effect, i.e., the timescale of decay of the function, which reflects the learning rate parameter, appears unchanged. Together, these results indicate that the DATkd mice, choice-by-choice, adapt their choices to recent rewards with a similar temporal profile, but that recent rewards exert an overall less profound influence on their behavior, resulting in diminished coupling between temporally local rates of reinforcement and decision-making.</p>
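The genotype comparisons in Table 2 treat each animal's fitted parameter as a single observation. A brief sketch of that test, with invented per-animal &#x003B2;<sub>V</sub> values standing in for the real estimates (the means below merely echo the group means in Table 2; the spread and sample values are fabricated for illustration):

```python
import numpy as np
from scipy.stats import ttest_ind

# Hypothetical per-animal estimates of the value temperature beta_V
# (10 mice per genotype; numbers invented for illustration only).
rng = np.random.default_rng(42)
beta_v_wt = rng.normal(loc=39.9, scale=9.0, size=10)
beta_v_kd = rng.normal(loc=26.1, scale=9.0, size=10)

# Two-sample t-test on the per-animal parameter estimates.
t_stat, p_val = ttest_ind(beta_v_wt, beta_v_kd)
```

The same call is repeated for each of the six parameters, one test per row of Table 2.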
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p><bold>Fitted model parameters by genotype</bold>.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th align="left"/>
<th align="left">Wild-type</th>
<th align="left">DATkd</th>
<th align="left"><italic>t</italic></th>
<th align="left"><italic>p</italic></th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">&#x02018;&#x02018;Switch&#x02019;&#x02019; learning rate</td>
<td align="char" char=".">0.342</td>
<td align="char" char=".">0.3431</td>
<td align="char" char=".">0.020</td>
<td align="char" char=".">0.984</td>
</tr>
<tr>
<td align="left">&#x02018;&#x02018;Switch&#x02019;&#x02019; Temperature</td>
<td align="char" char=".">&#x02212;8.185</td>
<td align="char" char=".">&#x02212;9.245</td>
<td align="char" char=".">0.521</td>
<td align="char" char=".">0.608</td>
</tr>
<tr>
<td align="left">&#x02018;&#x02018;Stay&#x02019;&#x02019; learning rate</td>
<td align="char" char=".">0.042</td>
<td align="char" char=".">0.044</td>
<td align="char" char=".">0.121</td>
<td align="char" char=".">0.904</td>
</tr>
<tr>
<td align="left">&#x02018;&#x02018;Stay&#x02019;&#x02019; temperature</td>
<td align="char" char=".">39.9</td>
<td align="char" char=".">26.06</td>
<td align="char" char=".">3.016</td>
<td align="char" char=".">0.007</td>
</tr>
<tr>
<td align="left">Last lever pressed</td>
<td align="char" char=".">3.082</td>
<td align="char" char=".">3.259</td>
<td align="char" char=".">1.163</td>
<td align="char" char=".">0.260</td>
</tr>
<tr>
<td align="left">Bias</td>
<td align="char" char=".">0.461</td>
<td align="char" char=".">0.455</td>
<td align="char" char=".">0.026</td>
<td align="char" char=".">0.979</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
</sec>
<sec sec-type="discussion">
<title>Discussion</title>
<p>Though dopamine has been studied for decades, its impact on adaptive behavior in complex, naturalistic environments can be difficult to infer in the absence of paradigms designed specifically to examine adaptation to environmental conditions. The paradigm used here trades the highly controlled approach of traditional behavioral testing for a semi-naturalistic design that generates a rich dataset against which different models and hypotheses can be examined (and generated), and in the process eliminates many difficult-to-address confounds, such as the impact of food restriction, handling, time of testing, and so on.</p>
<p>In the present study, we used a closed-economy, homecage paradigm to ask whether elevated tonic dopamine alters the animals&#x02019; flexible adaptation to changing environmental reward contingencies. When shifting of reward contingencies between the levers is introduced, wild-type mice distribute more effort to the currently less expensive lever, increasing the yield for energy expended. In contrast, the hyperdopaminergic mice distribute their effort approximately equally between the levers, apparently less influenced by the relative cost of the two levers. As a consequence, on average they expend more effort for each pellet earned than wild-type mice. In this paradigm, however, little is gained by this extra effort. Data from the low-cost baseline, when both levers function at the same cost, and from a non-switching version of the task indicate that the differences observed between genotypes cannot be attributed to differences in baseline consumption, generalized effects of activity level, differences in motor performance, or an intrinsic valuation of lever pressing. Rather, the observed difference arises specifically as a consequence of on-going adaptation to a dynamic environment.</p>
<sec>
<title>Discerning alterations in reinforcement learning (acquisition) from changes in motivation (expression)</title>
<p>A fundamental debate is whether dopamine influences behavior through reinforcement learning or by modulating the expression of motivated behavior (Wise, <xref ref-type="bibr" rid="B100">2004</xref>; Salamone, <xref ref-type="bibr" rid="B74">2006</xref>; Berridge, <xref ref-type="bibr" rid="B7">2007</xref>). Accumulating data support both perspectives; however, distinguishing the relative contributions of learning versus expression to adaptive behavior and integrating these two roles into a comprehensive framework remain elusive. To disentangle these two potential influences on adaptive behavior, we ask how dopamine alters the updating and utilization of incentive values in decision-making, on a choice-by-choice basis, in response to shifting environmental contingencies and reward outcomes. By fitting the data to the computational model at the heart of reinforcement learning theories of dopamine (Montague et al., <xref ref-type="bibr" rid="B58">1996</xref>; Schultz et al., <xref ref-type="bibr" rid="B83">1997</xref>; Sutton and Barto, <xref ref-type="bibr" rid="B93">1998</xref>), we find that elevated tonic dopamine does not alter learning, as reflected in the learning rate parameter, but does alter the expression of that learning, as reflected in the temperature parameter, which modulates the degree to which prior reward biases action selection. Surprisingly, the DATkd mice are less influenced by recent reward, resulting in diminished coupling between on-going reward information and behavioral choice.</p>
<p>It has been suggested that tonic and phasic dopamine may serve different functions (Schultz, <xref ref-type="bibr" rid="B81">2007b</xref>), with tonic dopamine contributing to the scaling of motivated behavior (Cagniard et al., <xref ref-type="bibr" rid="B11">2006b</xref>; Berridge, <xref ref-type="bibr" rid="B7">2007</xref>; Salamone, <xref ref-type="bibr" rid="B75">2007</xref>) while phasic dopamine provides a prediction error signal critical to learning (Schultz et al., <xref ref-type="bibr" rid="B82">1993</xref>, <xref ref-type="bibr" rid="B83">1997</xref>; Schultz and Dickinson, <xref ref-type="bibr" rid="B84">2000</xref>). Consistent with previous work (Zhuang et al., <xref ref-type="bibr" rid="B104">2001</xref>; Cagniard et al., <xref ref-type="bibr" rid="B10">2006a</xref>,<xref ref-type="bibr" rid="B11">b</xref>; Yin et al., <xref ref-type="bibr" rid="B103">2006</xref>), the current study supports this view, as the DATkd mice retain phasic dopamine activity (Zhuang et al., <xref ref-type="bibr" rid="B104">2001</xref>; Cagniard et al., <xref ref-type="bibr" rid="B11">2006b</xref>) and show no alterations in learning. In contrast, we show for the first time that tonic dopamine can alter the temperature parameter in a temporal difference RL model, which suggests a mechanism by which the expression of motivated behavior may be modulated or scaled by dopamine within a common framework with its role in reinforcement learning.</p>
</sec>
<sec>
<title>Functional accounts of dopamine</title>
<p>In contrast to theories that focus on dopamine&#x00027;s role in reward learning, associated with phasic activity (but see Gutkin et al., <xref ref-type="bibr" rid="B38">2006</xref>; Palmiter, <xref ref-type="bibr" rid="B66">2008</xref>; Zweifel et al., <xref ref-type="bibr" rid="B105">2009</xref>), tonic dopamine has been associated with motivational accounts of dopamine function, whereby dopamine increases an animal&#x00027;s energy expenditure toward a goal. The effects of dopamine on motivation have been characterized as enhanced incentive or &#x0201C;wanting&#x0201D; (Berridge, <xref ref-type="bibr" rid="B7">2007</xref>), decreased sensitivity to cost (Aberman and Salamone, <xref ref-type="bibr" rid="B1">1999</xref>; Salamone et al., <xref ref-type="bibr" rid="B78">2001</xref>; Mingote et al., <xref ref-type="bibr" rid="B55">2005</xref>), &#x0201C;scaling&#x0201D; of reinforced responding (Cagniard et al., <xref ref-type="bibr" rid="B11">2006b</xref>), or as a mediator of &#x0201C;vigor&#x0201D; (Lyons and Robbins, <xref ref-type="bibr" rid="B53">1975</xref>; Taylor and Robbins, <xref ref-type="bibr" rid="B94">1984</xref>; Niv et al., <xref ref-type="bibr" rid="B65">2007</xref>).</p>
<p>In one attempt to formalize these ideas and reconcile them with RL models of phasic dopamine, Niv et al. (<xref ref-type="bibr" rid="B65">2007</xref>) proposed that instrumental actions actually involve two separate decisions: what to do (the choice between actions), and when (or how vigorously) to do it. They suggested, moreover, that phasic dopamine might affect the choice of &#x0201C;what to do&#x0201D; via learning, while tonic dopamine would modulate the vigor of the chosen action, as an expression effect. In the present study, the DATkd mice show altered choices between levers, suggesting that tonic dopamine can, independently of learning, affect the choice of what to do as well as the vigor with which a choice is pursued (see also Salamone et al., <xref ref-type="bibr" rid="B77">2003</xref>).</p>
<p>The most straightforward and mechanistic interpretation of the data is that tonic dopamine modulates the gain in action selection mechanisms (Servan-Schreiber et al., <xref ref-type="bibr" rid="B87">1990</xref>; Braver et al., <xref ref-type="bibr" rid="B9">1999</xref>). Dopamine affects cellular and synaptic processes widely throughout the brain (Hsu et al., <xref ref-type="bibr" rid="B42">1995</xref>; Kiyatkin and Rebec, <xref ref-type="bibr" rid="B49">1996</xref>; Flores-Hernandez et al., <xref ref-type="bibr" rid="B28">1997</xref>; Nicola et al., <xref ref-type="bibr" rid="B64">2000</xref>; Cepeda et al., <xref ref-type="bibr" rid="B14">2001</xref>; Horvitz, <xref ref-type="bibr" rid="B41">2002</xref>; Reynolds and Wickens, <xref ref-type="bibr" rid="B71">2002</xref>; Bamford et al., <xref ref-type="bibr" rid="B4">2004</xref>; Goto and Grace, <xref ref-type="bibr" rid="B34">2005a</xref>,<xref ref-type="bibr" rid="B35">b</xref>; Calabresi et al., <xref ref-type="bibr" rid="B12">2007</xref>; Wu et al., <xref ref-type="bibr" rid="B101">2007</xref>; Kheirbek et al., <xref ref-type="bibr" rid="B48">2008</xref>; Wickens, <xref ref-type="bibr" rid="B98">2009</xref>), especially in the striatum, believed to be central in action selection (Mogenson et al., <xref ref-type="bibr" rid="B57">1980</xref>; Mink, <xref ref-type="bibr" rid="B56">1996</xref>; Redgrave et al., <xref ref-type="bibr" rid="B70">1999</xref>). Activation of D2 receptors on corticostriatal terminals has been shown to filter cortical input (Cepeda et al., <xref ref-type="bibr" rid="B14">2001</xref>; Bamford et al., <xref ref-type="bibr" rid="B4">2004</xref>) and activation of D1 receptors on striatal medium spiny neurons (MSNs) can provide a gain function by altering the threshold for switching from the down-state to the up-state while facilitating responsiveness of those MSNs already in the up-state (Nicola et al., <xref ref-type="bibr" rid="B64">2000</xref>). 
Consequently, dopamine is positioned to modulate the processing of information flowing through the striatum by modulating both plasticity and gain (or temperature), reflecting dopaminergic roles in learning and in the expression of learning, respectively (Braver et al., <xref ref-type="bibr" rid="B9">1999</xref>). This hypothesis, that tonic dopamine modulates gain on corticostriatal processing and thereby sets the temperature at which learned expected values influence action selection, would explain how tonic dopamine could affect both the choice of &#x0201C;what to do&#x0201D; and the &#x0201C;scaling&#x0201D; of the expression of learned, reinforced behavioral responses.</p>
<p>Insofar as functional aspects of behavior, such as incentive and cost (or exploration, performance, uncertainty, and so on), are processed through the striatum, a temperature/gain regulation function of dopamine would alter these functional aspects of behavior. However, the functional effects and the underlying mechanism need not be co-extensive. Depending upon the input, task, or specific anatomical region manipulated, a temperature modulation function might have seemingly distinct functional effects on behavior (Braver et al., <xref ref-type="bibr" rid="B9">1999</xref>). Though response selection is particularly associated with the dorsolateral striatum and incentive processing with ventral regions, the nucleus accumbens in particular (Nicola, <xref ref-type="bibr" rid="B63">2007</xref>; Humphries and Prescott, <xref ref-type="bibr" rid="B43">2010</xref>), in the present study we cannot discern which striatal compartment contributes to the observed phenotype. Determining the unique contributions of the ventral and dorsal striatum to behavioral flexibility will require further studies.</p>
<p>The notion that dopamine may change the expression of motivated behavior by altering the gain operating on the processing of either cost or incentive is consistent with previous theories of dopaminergic function (Salamone and Correa, <xref ref-type="bibr" rid="B76">2002</xref>; Berridge, <xref ref-type="bibr" rid="B7">2007</xref>). However, discerning whether dopamine operates on costs, incentive value, or both may ultimately require a greater understanding of the precise neural representation of these functional constructs.</p>
<p>For example, Rushworth and colleagues (Rudebeck et al., <xref ref-type="bibr" rid="B73">2006</xref>) have suggested that delay- and effort-based costs are tracked by the orbitofrontal and anterior cingulate cortices, respectively, both of which project to the ventral striatum. Shidara and colleagues (Shidara et al., <xref ref-type="bibr" rid="B89">1998</xref>, <xref ref-type="bibr" rid="B90">2005</xref>; Shidara and Richmond, <xref ref-type="bibr" rid="B91">2004</xref>) provide evidence that the anterior cingulate processes reward expectancy and that the ventral striatum tracks progress toward a reward. Presumably such information maintains focus on a goal, favoring task-related action selection during the exertion of effort or across a temporal delay. This would give rise to an apparently reduced sensitivity to costs, though the underlying mechanism would be an enhanced representation of progress toward a goal. A mechanism such as this would equally support dopamine theories of enhanced incentive and of reduced sensitivity to costs, both of which would arise as a consequence of dopaminergic modulation of gain in corticostriatal processing of information modulating action selection. Importantly, though, in this view dopamine is not modulating incentive value or cost sensitivity <italic>per se,</italic> but the gain in action selection processing, which alters the influence of incentive or costs on behavioral choice.</p>
</sec>
<sec>
<title>Dopamine and the regulation of exploration and exploitation</title>
<p>It is curious that increased tonic dopamine diminishes coupling between choice and reward history, when one might expect an enhanced gain function to make an organism more sensitive to recent reward and to marginal contrasts between the putative values of two choices. However, the effects of changing dopamine concentrations in various brain regions associated with different functions have often been characterized by an inverted-U-shaped curve (Seamans et al., <xref ref-type="bibr" rid="B86">1998</xref>; Williams and Dayan, <xref ref-type="bibr" rid="B99">2005</xref>; Delaveau et al., <xref ref-type="bibr" rid="B25">2007</xref>; Vijayraghavan et al., <xref ref-type="bibr" rid="B96">2007</xref>; Clatworthy et al., <xref ref-type="bibr" rid="B15">2009</xref>; Monte-Silva et al., <xref ref-type="bibr" rid="B59">2009</xref>; Schellekens et al., <xref ref-type="bibr" rid="B79">2010</xref>), such that too much dopamine may effectively reduce gain as observed at the behavioral level. One reason for this might be saturation in realistic neural representations: although gain can be increased without bound in the model, in the brain too much dopamine might ultimately wash out fine discriminations as representations saturate. As a consequence, only middle ranges of extracellular dopamine would provide optimal gain for exploiting prior learning. In contrast, low dopamine would diminish exploitation, resulting in generalized, non-goal- and non-task-related exploration, while high dopamine would facilitate exploration between established, goal- and task-related options.</p>
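The saturation argument can be made concrete with a toy calculation. The bounded (tanh) response function and the particular gain values below are assumptions chosen for illustration, not measured dose&#x02013;response curves: as gain rises, the contrast between two nearby learned values first grows and then collapses as both representations saturate, tracing an inverted U.

```python
import math

def discriminability(gain, v_hi=0.8, v_lo=0.6):
    """Contrast between two values passed through a bounded (tanh) response.

    `gain` stands in for tonic dopamine level; tanh stands in for any
    saturating neural response (an assumption, not a measured curve).
    """
    return math.tanh(gain * v_hi) - math.tanh(gain * v_lo)

# Contrast rises, peaks at a middle gain, then collapses toward zero
# as both responses pin against the representational ceiling.
low, mid, high = (discriminability(g) for g in (0.3, 1.2, 6.0))
```

Note the contrast with an unbounded softmax, where increasing gain always sharpens discrimination; the inverted U here depends entirely on the bounded response, which is the point of the saturation account.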
<p>Because it modulates the connection between value and choice, the gain mechanism embodied by the softmax temperature in reinforcement learning models is often identified with regulating the balance between exploration and exploitation. If tonic dopamine affects this temperature, then it might, functionally, be involved in regulating exploration by modulating the degree to which prior learning biases action selection; that is, by controlling the degree of exploitation. Dopamine may not be unique in modulating the balance between exploration and exploitation; other accounts have associated exploration with top-down control from anterior frontal cortex (Daw et al., <xref ref-type="bibr" rid="B21">2006</xref>) and/or with temperature regulation by another monoamine neuromodulator, norepinephrine (Aston-Jones and Cohen, <xref ref-type="bibr" rid="B2">2005a</xref>,<xref ref-type="bibr" rid="B3">b</xref>).</p>
</sec>
<sec>
<title>Dopamine and behavioral flexibility</title>
<p>The ability to flexibly deploy and modify learned behaviors in response to a changing environment is critical to adaptation. Though the PFC is widely associated with behavioral flexibility, considerable data suggest that flexibility arises from a cortico-striatal circuit in which both cortical and subcortical regions contribute important components to flexible behavior (Cools et al., <xref ref-type="bibr" rid="B17">2004</xref>; Frank and Claus, <xref ref-type="bibr" rid="B30">2006</xref>; Lo and Wang, <xref ref-type="bibr" rid="B52">2006</xref>; Hazy et al., <xref ref-type="bibr" rid="B40">2007</xref>; Floresco et al., <xref ref-type="bibr" rid="B29">2009</xref>; Haluk and Floresco, <xref ref-type="bibr" rid="B39">2009</xref>; Pennartz et al., <xref ref-type="bibr" rid="B68">2009</xref>; Kehagia et al., <xref ref-type="bibr" rid="B46">2010</xref>). In the present study, it is possible that altered dopamine signaling in the PFC contributed to the observed phenotype. Xu et al. (<xref ref-type="bibr" rid="B102">2009</xref>) recently reported that DAT knock-out mice (DATko) lack LTP in prefrontal pyramidal cells. However, the knock-out line used in that study and the knock-down used here differ significantly, making it difficult to draw inferences from one line to the other. The DATko phenotype is more severe and is complicated by developmental abnormalities, including growth retardation, pituitary hypoplasia, lactation deficits, and high mortality (Bosse et al., <xref ref-type="bibr" rid="B8">1997</xref>), none of which occur in the knock-down line used here (Zhuang et al., <xref ref-type="bibr" rid="B104">2001</xref>). 
More importantly, the DATko, consistent with a loss of PFC LTP, show learning and memory deficits (Giros et al., <xref ref-type="bibr" rid="B33">1996</xref>; Gainetdinov et al., <xref ref-type="bibr" rid="B32">1999</xref>; Morice et al., <xref ref-type="bibr" rid="B60">2007</xref>; Weiss et al., <xref ref-type="bibr" rid="B97">2007</xref>; Dzirasa et al., <xref ref-type="bibr" rid="B27">2009</xref>). In contrast, learning has been shown to be normal in the DATkd (Cagniard et al., <xref ref-type="bibr" rid="B10">2006a</xref>,<xref ref-type="bibr" rid="B11">b</xref>; Yin et al., <xref ref-type="bibr" rid="B103">2006</xref>), including in the present study. Moreover, the weight of evidence suggests that dopamine reuptake in the PFC is mediated primarily by the norepinephrine transporter (NET) rather than by DAT, suggesting that a knock-down of DAT would not significantly alter the kinetics of reuptake in the PFC (Sesack et al., <xref ref-type="bibr" rid="B88">1998</xref>; Mundorf et al., <xref ref-type="bibr" rid="B62">2001</xref>; Moron et al., <xref ref-type="bibr" rid="B61">2002</xref>). In contrast, the changes in dopamine dynamics in the striatum are pronounced and well documented (Zhuang et al., <xref ref-type="bibr" rid="B104">2001</xref>; Cagniard et al., <xref ref-type="bibr" rid="B11">2006b</xref>).</p>
<p>It is unlikely that behavioral flexibility is localized specifically to any single anatomical region; rather, flexibility is likely an emergent property arising from interdependent interaction between structures within circuits. For example, Kellendonk et al. (<xref ref-type="bibr" rid="B47">2006</xref>) demonstrate that overexpression of D2 receptors in the striatum can alter PFC function. From this perspective, we would expect that the PFC does contribute to the observed phenotype because it is an integral component of the corticostriatal circuit mediating choice behavior. In the present study, however, the weight of evidence supports the notion that potential changes in PFC function arise as a consequence of alterations in dopaminergic tone in the striatum rather than in the PFC directly, consistent with the widely held view that the striatum critically mediates reward learning and action selection. To this we add the suggestion that striatal dopamine may contribute to behavioral flexibility by modulating the degree to which prior learning is or is not exploited.</p>
</sec>
<sec>
<title>Distinguishing the contribution of tonic and phasic dopamine</title>
<p>Dopamine cells have been characterized as having two primary modes (Grace and Bunney, <xref ref-type="bibr" rid="B36">1984a</xref>,<xref ref-type="bibr" rid="B37">b</xref>): tonic (slow, irregular pacemaker activity) and phasic (short bursts of high-frequency spikes). Experimentally isolating and manipulating these modes to investigate their putatively distinct functions remains a significant challenge. When DAT expression is reduced, the amplitude of evoked dopamine release is reduced to 25% of wild-type levels (Zhuang et al., <xref ref-type="bibr" rid="B104">2001</xref>). Despite this reduced release, the effect on tonic dopamine is robust and clear, resulting in both an increased rate of tonic activity and elevated extracellular dopamine in the striatum (Zhuang et al., <xref ref-type="bibr" rid="B104">2001</xref>; Cagniard et al., <xref ref-type="bibr" rid="B11">2006b</xref>). In contrast, phasic activity itself remains unaltered (Cagniard et al., <xref ref-type="bibr" rid="B11">2006b</xref>).</p>
<p>Though phasic activity itself remains intact, the impact of the reduced amplitude of release during that activity is uncertain. That is, it is possible that reduced dopamine during phasic release, rather than increased tonic activity, underlies the observed phenotype. The weight of evidence argues against this. Phasic activity is most widely associated with mediating a prediction error during reward learning (Schultz et al., <xref ref-type="bibr" rid="B83">1997</xref>), with evidence that the magnitude of phasic activity correlates with the magnitude of unexpected reward (Tobler et al., <xref ref-type="bibr" rid="B95">2005</xref>). However, we observe no alterations in reward learning. Dopamine has also been associated with energizing and mobilizing reward-oriented appetitive behavior, but we observe no reduction in motivation or effort.</p>
<p>Bergman and colleagues (Joshua et al., <xref ref-type="bibr" rid="B44">2009</xref>) suggest that phasic dopamine activity may itself be composed of two components: a fast phase that serves an activational function and a more prolonged, slow phase that modulates plasticity. It is intriguing to consider that a reduction in the amplitude of the putative fast phase may result in less activation and gain of learned values, effectively reducing the bias of prior learning on choice, as observed here. However, in the present study the mice have extensive experience with the lever and reward contingencies. The literature on phasic dopamine suggests that bursting should occur primarily during unexpected outcomes, such as the contingency switches in this task. However, it is precisely around these switches that WT and DATkd behavior is similar, while differences in choice are observed primarily during the stable periods between contingency switches. Thus, though we cannot conclusively rule out a potential role for reduced amplitude of phasic release in the phenotype observed here, the weight of evidence points to the pronounced changes in tonic dopamine as the critical factor.</p>
<p>Though dopamine is often associated with greater motivation, willingness to work, and persistence in pursuing a goal, the present study suggests a potential trade-off between such enhanced motivation and flexibility. The relative value of persistence and flexibility will depend upon the environment. Consequently, polymorphisms in genes regulating dopamine function (D&#x00027;Souza and Craig, <xref ref-type="bibr" rid="B19">2008</xref>; Frank et al., <xref ref-type="bibr" rid="B31">2009</xref>; Le Foll et al., <xref ref-type="bibr" rid="B51">2009</xref>; Marco-Pallares et al., <xref ref-type="bibr" rid="B54">2009</xref>) may reflect evolutionary pressures arising in different environments. In some environments, extraordinary persistence (exploitation of prior learning) may be essential for survival. In others, exploration is essential, and persistence with a previously, but not currently, successful action wastes energy. Genetic diversity in dopamine function may enhance adaptive survival by providing a range of phylogenetic solutions to the problem of determining the degree to which an organism should base future behavior on past outcomes, a vexing challenge in adaptation for any organism.</p>
</sec>
</sec>
<sec>
<title>Conflict of Interest Statement</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<ack><p>This work was supported by NIDA, DA25875 (Jeff A. Beeler), and NIMH, MH066216 (Xiaoxi Zhuang), a Scholar Award from the McKnight Foundation (Nathaniel Daw) and a NARSAD Young Investigator Award (Nathaniel Daw).</p></ack>
<ref-list><title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Aberman</surname> <given-names>J. E.</given-names></name> <name><surname>Salamone</surname> <given-names>J. D.</given-names></name></person-group> (<year>1999</year>). <article-title>Nucleus accumbens dopamine depletions make rats more sensitive to high ratio requirements but do not impair primary food reinforcement</article-title>. <source>Neuroscience</source> <volume>92</volume>, <fpage>545</fpage>&#x02013;<lpage>552</lpage>.<pub-id pub-id-type="doi">10.1016/S0306-4522(99)00004-4</pub-id><pub-id pub-id-type="pmid">10408603</pub-id></citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Aston-Jones</surname> <given-names>G.</given-names></name> <name><surname>Cohen</surname> <given-names>J. D.</given-names></name></person-group> (<year>2005a</year>). <article-title>Adaptive gain and the role of the locus coeruleus-norepinephrine system in optimal performance</article-title>. <source>J. Comp. Neurol.</source> <volume>493</volume>, <fpage>99</fpage>&#x02013;<lpage>110</lpage>.<pub-id pub-id-type="doi">10.1002/cne.20723</pub-id></citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Aston-Jones</surname> <given-names>G.</given-names></name> <name><surname>Cohen</surname> <given-names>J. D.</given-names></name></person-group> (<year>2005b</year>). <article-title>An integrative theory of locus coeruleus-norepinephrine function: adaptive gain and optimal performance</article-title>. <source>Annu. Rev. Neurosci.</source> <volume>28</volume>, <fpage>403</fpage>&#x02013;<lpage>450</lpage>.<pub-id pub-id-type="doi">10.1146/annurev.neuro.28.061604.135709</pub-id></citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bamford</surname> <given-names>N. S.</given-names></name> <name><surname>Zhang</surname> <given-names>H.</given-names></name> <name><surname>Schmitz</surname> <given-names>Y.</given-names></name> <name><surname>Wu</surname> <given-names>N. P.</given-names></name> <name><surname>Cepeda</surname> <given-names>C.</given-names></name> <name><surname>Levine</surname> <given-names>M. S.</given-names></name> <name><surname>Schmauss</surname> <given-names>C.</given-names></name> <name><surname>Zakharenko</surname> <given-names>S. S.</given-names></name> <name><surname>Zablow</surname> <given-names>L.</given-names></name> <name><surname>Sulzer</surname> <given-names>D.</given-names></name></person-group> (<year>2004</year>). <article-title>Heterosynaptic dopamine neurotransmission selects sets of corticostriatal terminals</article-title>. <source>Neuron</source> <volume>42</volume>, <fpage>653</fpage>&#x02013;<lpage>663</lpage>.<pub-id pub-id-type="doi">10.1016/S0896-6273(04)00265-X</pub-id><pub-id pub-id-type="pmid">15157425</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Belin</surname> <given-names>D.</given-names></name> <name><surname>Everitt</surname> <given-names>B. J.</given-names></name></person-group> (<year>2008</year>). <article-title>Cocaine seeking habits depend upon dopamine-dependent serial connectivity linking the ventral with the dorsal striatum</article-title>. <source>Neuron</source> <volume>57</volume>, <fpage>432</fpage>&#x02013;<lpage>441</lpage>.<pub-id pub-id-type="doi">10.1016/j.neuron.2007.12.019</pub-id><pub-id pub-id-type="pmid">18255035</pub-id></citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Berke</surname> <given-names>J. D.</given-names></name> <name><surname>Hyman</surname> <given-names>S. E.</given-names></name></person-group> (<year>2000</year>). <article-title>Addiction, dopamine, and the molecular mechanisms of memory</article-title>. <source>Neuron</source> <volume>25</volume>, <fpage>515</fpage>&#x02013;<lpage>532</lpage>.<pub-id pub-id-type="doi">10.1016/S0896-6273(00)81056-9</pub-id><pub-id pub-id-type="pmid">10774721</pub-id></citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Berridge</surname> <given-names>K. C.</given-names></name></person-group> (<year>2007</year>). <article-title>The debate over dopamine&#x00027;s role in reward: the case for incentive salience</article-title>. <source>Psychopharmacology (Berl.)</source> <volume>191</volume>, <fpage>391</fpage>&#x02013;<lpage>431</lpage>.<pub-id pub-id-type="doi">10.1007/s00213-006-0578-x</pub-id><pub-id pub-id-type="pmid">17072591</pub-id></citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bosse</surname> <given-names>R.</given-names></name> <name><surname>Fumagalli</surname> <given-names>F.</given-names></name> <name><surname>Jaber</surname> <given-names>M.</given-names></name> <name><surname>Giros</surname> <given-names>B.</given-names></name> <name><surname>Gainetdinov</surname> <given-names>R. R.</given-names></name> <name><surname>Wetsel</surname> <given-names>W. C.</given-names></name> <name><surname>Missale</surname> <given-names>C.</given-names></name> <name><surname>Caron</surname> <given-names>M. G.</given-names></name></person-group> (<year>1997</year>). <article-title>Anterior pituitary hypoplasia and dwarfism in mice lacking the dopamine transporter</article-title>. <source>Neuron</source> <volume>19</volume>, <fpage>127</fpage>&#x02013;<lpage>138</lpage>.<pub-id pub-id-type="doi">10.1016/S0896-6273(00)80353-0</pub-id><pub-id pub-id-type="pmid">9247269</pub-id></citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Braver</surname> <given-names>T. S.</given-names></name> <name><surname>Barch</surname> <given-names>D. M.</given-names></name> <name><surname>Cohen</surname> <given-names>J. D.</given-names></name></person-group> (<year>1999</year>). <article-title>Cognition and control in schizophrenia: a computational model of dopamine and prefrontal function</article-title>. <source>Biol. Psychiatry</source> <volume>46</volume>, <fpage>312</fpage>&#x02013;<lpage>328</lpage>.<pub-id pub-id-type="doi">10.1016/S0006-3223(99)00116-X</pub-id><pub-id pub-id-type="pmid">10435197</pub-id></citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cagniard</surname> <given-names>B.</given-names></name> <name><surname>Balsam</surname> <given-names>P. D.</given-names></name> <name><surname>Brunner</surname> <given-names>D.</given-names></name> <name><surname>Zhuang</surname> <given-names>X.</given-names></name></person-group> (<year>2006a</year>). <article-title>Mice with chronically elevated dopamine exhibit enhanced motivation, but not learning, for a food reward</article-title>. <source>Neuropsychopharmacology</source> <volume>31</volume>, <fpage>1362</fpage>&#x02013;<lpage>1370</lpage>.<pub-id pub-id-type="doi">10.1038/sj.npp.1300966</pub-id></citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cagniard</surname> <given-names>B.</given-names></name> <name><surname>Beeler</surname> <given-names>J. A.</given-names></name> <name><surname>Britt</surname> <given-names>J. P.</given-names></name> <name><surname>McGehee</surname> <given-names>D. S.</given-names></name> <name><surname>Marinelli</surname> <given-names>M.</given-names></name> <name><surname>Zhuang</surname> <given-names>X.</given-names></name></person-group> (<year>2006b</year>). <article-title>Dopamine scales performance in the absence of new learning</article-title>. <source>Neuron</source> <volume>51</volume>, <fpage>541</fpage>&#x02013;<lpage>547</lpage>.<pub-id pub-id-type="doi">10.1016/j.neuron.2006.07.026</pub-id></citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Calabresi</surname> <given-names>P.</given-names></name> <name><surname>Picconi</surname> <given-names>B.</given-names></name> <name><surname>Tozzi</surname> <given-names>A.</given-names></name> <name><surname>Di Filippo</surname> <given-names>M.</given-names></name></person-group> (<year>2007</year>). <article-title>Dopamine-mediated regulation of corticostriatal synaptic plasticity</article-title>. <source>Trends Neurosci.</source> <volume>30</volume>, <fpage>211</fpage>&#x02013;<lpage>219</lpage>.<pub-id pub-id-type="doi">10.1016/j.tins.2007.03.001</pub-id><pub-id pub-id-type="pmid">17367873</pub-id></citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Camerer</surname> <given-names>C.</given-names></name> <name><surname>Ho</surname> <given-names>T.-H.</given-names></name></person-group> (<year>1999</year>). <article-title>Experience-weighted attraction learning in normal form games</article-title>. <source>Econometrica</source> <volume>67</volume>, <fpage>827</fpage>&#x02013;<lpage>874</lpage>.<pub-id pub-id-type="doi">10.1111/1468-0262.00054</pub-id></citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cepeda</surname> <given-names>C.</given-names></name> <name><surname>Hurst</surname> <given-names>R. S.</given-names></name> <name><surname>Altemus</surname> <given-names>K. L.</given-names></name> <name><surname>Flores-Hernandez</surname> <given-names>J.</given-names></name> <name><surname>Calvert</surname> <given-names>C. R.</given-names></name> <name><surname>Jokel</surname> <given-names>E. S.</given-names></name> <name><surname>Grandy</surname> <given-names>D. K.</given-names></name> <name><surname>Low</surname> <given-names>M. J.</given-names></name> <name><surname>Rubinstein</surname> <given-names>M.</given-names></name> <name><surname>Ariano</surname> <given-names>M. A.</given-names></name> <name><surname>Levine</surname> <given-names>M. S.</given-names></name></person-group> (<year>2001</year>). <article-title>Facilitated glutamatergic transmission in the striatum of D2 dopamine receptor-deficient mice</article-title>. <source>J. Neurophysiol.</source> <volume>85</volume>, <fpage>659</fpage>&#x02013;<lpage>670</lpage>.<pub-id pub-id-type="pmid">11160501</pub-id></citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Clatworthy</surname> <given-names>P. L.</given-names></name> <name><surname>Lewis</surname> <given-names>S. J.</given-names></name> <name><surname>Brichard</surname> <given-names>L.</given-names></name> <name><surname>Hong</surname> <given-names>Y. T.</given-names></name> <name><surname>Izquierdo</surname> <given-names>D.</given-names></name> <name><surname>Clark</surname> <given-names>L.</given-names></name> <name><surname>Cools</surname> <given-names>R.</given-names></name> <name><surname>Aigbirhio</surname> <given-names>F. I.</given-names></name> <name><surname>Baron</surname> <given-names>J. C.</given-names></name> <name><surname>Fryer</surname> <given-names>T. D.</given-names></name> <name><surname>Robbins</surname> <given-names>T. W.</given-names></name></person-group> (<year>2009</year>). <article-title>Dopamine release in dissociable striatal subregions predicts the different effects of oral methylphenidate on reversal learning and spatial working memory</article-title>. <source>J. Neurosci.</source> <volume>29</volume>, <fpage>4690</fpage>&#x02013;<lpage>4696</lpage>.<pub-id pub-id-type="doi">10.1523/JNEUROSCI.3266-08.2009</pub-id><pub-id pub-id-type="pmid">19369539</pub-id></citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cohen</surname> <given-names>J. D.</given-names></name> <name><surname>McClure</surname> <given-names>S. M.</given-names></name> <name><surname>Yu</surname> <given-names>A. J.</given-names></name></person-group> (<year>2007</year>). <article-title>Should I stay or should I go? How the human brain manages the trade-off between exploitation and exploration</article-title>. <source>Philos. Trans. R. Soc. Lond. B. Biol. Sci.</source> <volume>362</volume>, <fpage>933</fpage>&#x02013;<lpage>942</lpage>.<pub-id pub-id-type="doi">10.1098/rstb.2007.2098</pub-id></citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cools</surname> <given-names>R.</given-names></name> <name><surname>Clark</surname> <given-names>L.</given-names></name> <name><surname>Robbins</surname> <given-names>T. W.</given-names></name></person-group> (<year>2004</year>). <article-title>Differential responses in human striatum and prefrontal cortex to changes in object and rule relevance</article-title>. <source>J. Neurosci.</source> <volume>24</volume>, <fpage>1129</fpage>&#x02013;<lpage>1135</lpage>.<pub-id pub-id-type="doi">10.1523/JNEUROSCI.4312-03.2004</pub-id><pub-id pub-id-type="pmid">14762131</pub-id></citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Corrado</surname> <given-names>G. S.</given-names></name> <name><surname>Sugrue</surname> <given-names>L. P.</given-names></name> <name><surname>Seung</surname> <given-names>H. S.</given-names></name> <name><surname>Newsome</surname> <given-names>W. T.</given-names></name></person-group> (<year>2005</year>). <article-title>Linear-nonlinear-Poisson models of primate choice dynamics</article-title>. <source>J. Exp. Anal. Behav.</source> <volume>84</volume>, <fpage>581</fpage>&#x02013;<lpage>617</lpage>.<pub-id pub-id-type="doi">10.1901/jeab.2005.23-05</pub-id><pub-id pub-id-type="pmid">16596981</pub-id></citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>D&#x00027;Souza</surname> <given-names>U. M.</given-names></name> <name><surname>Craig</surname> <given-names>I. W.</given-names></name></person-group> (<year>2008</year>). <article-title>Functional genetic polymorphisms in serotonin and dopamine gene systems and their significance in behavioural disorders</article-title>. <source>Prog. Brain Res.</source> <volume>172</volume>, <fpage>73</fpage>&#x02013;<lpage>98</lpage>.<pub-id pub-id-type="doi">10.1016/S0079-6123(08)00904-7</pub-id><pub-id pub-id-type="pmid">18772028</pub-id></citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Daw</surname> <given-names>N. D.</given-names></name> <name><surname>Dayan</surname> <given-names>P.</given-names></name></person-group> (<year>2004</year>). <article-title>Neuroscience: matchmaking</article-title>. <source>Science (New York, NY)</source> <volume>304</volume>, <fpage>1753</fpage>&#x02013;<lpage>1754</lpage>.</citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Daw</surname> <given-names>N. D.</given-names></name> <name><surname>Doya</surname> <given-names>K.</given-names></name></person-group> (<year>2006</year>). <article-title>The computational neurobiology of learning and reward</article-title>. <source>Curr. Opin. Neurobiol.</source> <volume>16</volume>, <fpage>199</fpage>&#x02013;<lpage>204</lpage>.<pub-id pub-id-type="doi">10.1016/j.conb.2006.03.006</pub-id><pub-id pub-id-type="pmid">16563737</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Daw</surname> <given-names>N. D.</given-names></name> <name><surname>O&#x00027;Doherty</surname> <given-names>J. P.</given-names></name> <name><surname>Dayan</surname> <given-names>P.</given-names></name> <name><surname>Seymour</surname> <given-names>B.</given-names></name> <name><surname>Dolan</surname> <given-names>R. J.</given-names></name></person-group> (<year>2006</year>). <article-title>Cortical substrates for exploratory decisions in humans</article-title>. <source>Nature</source> <volume>441</volume>, <fpage>876</fpage>&#x02013;<lpage>879</lpage>.<pub-id pub-id-type="doi">10.1038/nature04766</pub-id><pub-id pub-id-type="pmid">16778890</pub-id></citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Day</surname> <given-names>J. J.</given-names></name> <name><surname>Roitman</surname> <given-names>M. F.</given-names></name> <name><surname>Wightman</surname> <given-names>R. M.</given-names></name> <name><surname>Carelli</surname> <given-names>R. M.</given-names></name></person-group> (<year>2007</year>). <article-title>Associative learning mediates dynamic shifts in dopamine signaling in the nucleus accumbens</article-title>. <source>Nat. Neurosci.</source> <volume>10</volume>, <fpage>1020</fpage>&#x02013;<lpage>1028</lpage>.<pub-id pub-id-type="doi">10.1038/nn1923</pub-id><pub-id pub-id-type="pmid">17603481</pub-id></citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dayan</surname> <given-names>P.</given-names></name> <name><surname>Balleine</surname> <given-names>B. W.</given-names></name></person-group> (<year>2002</year>). <article-title>Reward, motivation, and reinforcement learning</article-title>. <source>Neuron</source> <volume>36</volume>, <fpage>285</fpage>&#x02013;<lpage>298</lpage>.<pub-id pub-id-type="doi">10.1016/S0896-6273(02)00963-7</pub-id><pub-id pub-id-type="pmid">12383782</pub-id></citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Delaveau</surname> <given-names>P.</given-names></name> <name><surname>Salgado-Pineda</surname> <given-names>P.</given-names></name> <name><surname>Micallef-Roll</surname> <given-names>J.</given-names></name> <name><surname>Blin</surname> <given-names>O.</given-names></name></person-group> (<year>2007</year>). <article-title>Amygdala activation modulated by levodopa during emotional recognition processing in healthy volunteers: a double-blind, placebo-controlled study</article-title>. <source>J. Clin. Psychopharmacol.</source> <volume>27</volume>, <fpage>692</fpage>&#x02013;<lpage>697</lpage>.<pub-id pub-id-type="doi">10.1097/jcp.0b013e31815a444d</pub-id><pub-id pub-id-type="pmid">18004139</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Doya</surname> <given-names>K.</given-names></name></person-group> (<year>2002</year>). <article-title>Metalearning and neuromodulation</article-title>. <source>Neural. Netw.</source> <volume>15</volume>, <fpage>495</fpage>&#x02013;<lpage>506</lpage>.<pub-id pub-id-type="doi">10.1016/S0893-6080(02)00044-8</pub-id><pub-id pub-id-type="pmid">12371507</pub-id></citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dzirasa</surname> <given-names>K.</given-names></name> <name><surname>Ramsey</surname> <given-names>A. J.</given-names></name> <name><surname>Takahashi</surname> <given-names>D. Y.</given-names></name> <name><surname>Stapleton</surname> <given-names>J.</given-names></name> <name><surname>Potes</surname> <given-names>J. M.</given-names></name> <name><surname>Williams</surname> <given-names>J. K.</given-names></name> <name><surname>Gainetdinov</surname> <given-names>R. R.</given-names></name> <name><surname>Sameshima</surname> <given-names>K.</given-names></name> <name><surname>Caron</surname> <given-names>M. G.</given-names></name> <name><surname>Nicolelis</surname> <given-names>M. A.</given-names></name></person-group> (<year>2009</year>). <article-title>Hyperdopaminergia and NMDA receptor hypofunction disrupt neural phase signaling</article-title>. <source>J. Neurosci.</source> <volume>29</volume>, <fpage>8215</fpage>&#x02013;<lpage>8224</lpage>.<pub-id pub-id-type="doi">10.1523/JNEUROSCI.1773-09.2009</pub-id><pub-id pub-id-type="pmid">19553461</pub-id></citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Flores-Hernandez</surname> <given-names>J.</given-names></name> <name><surname>Galarraga</surname> <given-names>E.</given-names></name> <name><surname>Bargas</surname> <given-names>J.</given-names></name></person-group> (<year>1997</year>). <article-title>Dopamine selects glutamatergic inputs to neostriatal neurons</article-title>. <source>Synapse (New York, NY)</source> <volume>25</volume>, <fpage>185</fpage>&#x02013;<lpage>195</lpage>.<pub-id pub-id-type="doi">10.1002/(SICI)1098-2396(199702)25:2&#x0003C;185::AID-SYN9&#x0003E;3.0.CO;2-8</pub-id></citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Floresco</surname> <given-names>S. B.</given-names></name> <name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Enomoto</surname> <given-names>T.</given-names></name></person-group> (<year>2009</year>). <article-title>Neural circuits subserving behavioral flexibility and their relevance to schizophrenia</article-title>. <source>Behav. Brain Res.</source> <volume>204</volume>, <fpage>396</fpage>&#x02013;<lpage>409</lpage>.<pub-id pub-id-type="doi">10.1016/j.bbr.2008.12.001</pub-id><pub-id pub-id-type="pmid">19110006</pub-id></citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Frank</surname> <given-names>M. J.</given-names></name> <name><surname>Claus</surname> <given-names>E. D.</given-names></name></person-group> (<year>2006</year>). <article-title>Anatomy of a decision: striato-orbitofrontal interactions in reinforcement learning, decision making, and reversal</article-title>. <source>Psychol. Rev.</source> <volume>113</volume>, <fpage>300</fpage>&#x02013;<lpage>326</lpage>.<pub-id pub-id-type="doi">10.1037/0033-295X.113.2.300</pub-id><pub-id pub-id-type="pmid">16637763</pub-id></citation></ref>
<ref id="B31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Frank</surname> <given-names>M. J.</given-names></name> <name><surname>Doll</surname> <given-names>B. B.</given-names></name> <name><surname>Oas-Terpstra</surname> <given-names>J.</given-names></name> <name><surname>Moreno</surname> <given-names>F.</given-names></name></person-group> (<year>2009</year>). <article-title>Prefrontal and striatal dopaminergic genes predict individual differences in exploration and exploitation</article-title>. <source>Nat. Neurosci.</source> <volume>12</volume>, <fpage>1062</fpage>&#x02013;<lpage>1068</lpage>.<pub-id pub-id-type="doi">10.1038/nn.2342</pub-id><pub-id pub-id-type="pmid">19620978</pub-id></citation></ref>
<ref id="B32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gainetdinov</surname> <given-names>R. R.</given-names></name> <name><surname>Wetsel</surname> <given-names>W. C.</given-names></name> <name><surname>Jones</surname> <given-names>S. R.</given-names></name> <name><surname>Levin</surname> <given-names>E. D.</given-names></name> <name><surname>Jaber</surname> <given-names>M.</given-names></name> <name><surname>Caron</surname> <given-names>M. G.</given-names></name></person-group> (<year>1999</year>). <article-title>Role of serotonin in the paradoxical calming effect of psychostimulants on hyperactivity</article-title>. <source>Science (New York, NY)</source> <volume>283</volume>, <fpage>397</fpage>&#x02013;<lpage>401</lpage>.</citation></ref>
<ref id="B33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Giros</surname> <given-names>B.</given-names></name> <name><surname>Jaber</surname> <given-names>M.</given-names></name> <name><surname>Jones</surname> <given-names>S. R.</given-names></name> <name><surname>Wightman</surname> <given-names>R. M.</given-names></name> <name><surname>Caron</surname> <given-names>M. G.</given-names></name></person-group> (<year>1996</year>). <article-title>Hyperlocomotion and indifference to cocaine and amphetamine in mice lacking the dopamine transporter</article-title>. <source>Nature</source> <volume>379</volume>, <fpage>606</fpage>&#x02013;<lpage>612</lpage>.<pub-id pub-id-type="doi">10.1038/379606a0</pub-id><pub-id pub-id-type="pmid">8628395</pub-id></citation></ref>
<ref id="B34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Goto</surname> <given-names>Y.</given-names></name> <name><surname>Grace</surname> <given-names>A. A.</given-names></name></person-group> (<year>2005a</year>). <article-title>Dopamine-dependent interactions between limbic and prefrontal cortical plasticity in the nucleus accumbens: disruption by cocaine sensitization</article-title>. <source>Neuron</source> <volume>47</volume>, <fpage>255</fpage>&#x02013;<lpage>266</lpage>.<pub-id pub-id-type="doi">10.1016/j.neuron.2005.06.017</pub-id></citation></ref>
<ref id="B35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Goto</surname> <given-names>Y.</given-names></name> <name><surname>Grace</surname> <given-names>A. A.</given-names></name></person-group> (<year>2005b</year>). <article-title>Dopaminergic modulation of limbic and cortical drive of nucleus accumbens in goal-directed behavior</article-title>. <source>Nat. Neurosci.</source> <volume>8</volume>, <fpage>805</fpage>&#x02013;<lpage>812</lpage>.<pub-id pub-id-type="doi">10.1038/nn1471</pub-id></citation></ref>
<ref id="B36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Grace</surname> <given-names>A. A.</given-names></name> <name><surname>Bunney</surname> <given-names>B. S.</given-names></name></person-group> (<year>1984a</year>). <article-title>The control of firing pattern in nigral dopamine neurons: burst firing</article-title>. <source>J. Neurosci.</source> <volume>4</volume>, <fpage>2877</fpage>&#x02013;<lpage>2890</lpage>.</citation></ref>
<ref id="B37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Grace</surname> <given-names>A. A.</given-names></name> <name><surname>Bunney</surname> <given-names>B. S.</given-names></name></person-group> (<year>1984b</year>). <article-title>The control of firing pattern in nigral dopamine neurons: single spike firing</article-title>. <source>J. Neurosci.</source> <volume>4</volume>, <fpage>2866</fpage>&#x02013;<lpage>2876</lpage>.</citation></ref>
<ref id="B38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gutkin</surname> <given-names>B. S.</given-names></name> <name><surname>Dehaene</surname> <given-names>S.</given-names></name> <name><surname>Changeux</surname> <given-names>J. P.</given-names></name></person-group> (<year>2006</year>). <article-title>A neurocomputational hypothesis for nicotine addiction</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>103</volume>, <fpage>1106</fpage>&#x02013;<lpage>1111</lpage>.<pub-id pub-id-type="doi">10.1073/pnas.0510220103</pub-id><pub-id pub-id-type="pmid">16415156</pub-id></citation></ref>
<ref id="B39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Haluk</surname> <given-names>D. M.</given-names></name> <name><surname>Floresco</surname> <given-names>S. B.</given-names></name></person-group> (<year>2009</year>). <article-title>Ventral striatal dopamine modulation of different forms of behavioral flexibility</article-title>. <source>Neuropsychopharmacology</source> <volume>34</volume>, <fpage>2041</fpage>&#x02013;<lpage>2052</lpage>.<pub-id pub-id-type="doi">10.1038/npp.2009.21</pub-id><pub-id pub-id-type="pmid">19262467</pub-id></citation></ref>
<ref id="B40"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hazy</surname> <given-names>T. E.</given-names></name> <name><surname>Frank</surname> <given-names>M. J.</given-names></name> <name><surname>O&#x00027;Reilly</surname> <given-names>R. C.</given-names></name></person-group> (<year>2007</year>). <article-title>Towards an executive without a homunculus: computational models of the prefrontal cortex/basal ganglia system</article-title>. <source>Philos. Trans. R. Soc. Lond. B. Biol. Sci.</source> <volume>362</volume>, <fpage>1601</fpage>&#x02013;<lpage>1613</lpage>.<pub-id pub-id-type="doi">10.1098/rstb.2007.2055</pub-id><pub-id pub-id-type="pmid">17428778</pub-id></citation></ref>
<ref id="B41"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Horvitz</surname> <given-names>J. C.</given-names></name></person-group> (<year>2002</year>). <article-title>Dopamine gating of glutamatergic sensorimotor and incentive motivational input signals to the striatum</article-title>. <source>Behav. Brain Res.</source> <volume>137</volume>, <fpage>65</fpage>&#x02013;<lpage>74</lpage>.<pub-id pub-id-type="doi">10.1016/S0166-4328(02)00285-1</pub-id><pub-id pub-id-type="pmid">12445716</pub-id></citation></ref>
<ref id="B42"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hsu</surname> <given-names>K. S.</given-names></name> <name><surname>Huang</surname> <given-names>C. C.</given-names></name> <name><surname>Yang</surname> <given-names>C. H.</given-names></name> <name><surname>Gean</surname> <given-names>P. W.</given-names></name></person-group> (<year>1995</year>). <article-title>Presynaptic D2 dopaminergic receptors mediate inhibition of excitatory synaptic transmission in rat neostriatum</article-title>. <source>Brain Res.</source> <volume>690</volume>, <fpage>264</fpage>&#x02013;<lpage>268</lpage>.<pub-id pub-id-type="doi">10.1016/0006-8993(95)00734-8</pub-id><pub-id pub-id-type="pmid">8535848</pub-id></citation></ref>
<ref id="B43"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Humphries</surname> <given-names>M. D.</given-names></name> <name><surname>Prescott</surname> <given-names>T. J.</given-names></name></person-group> (<year>2010</year>). <article-title>The ventral basal ganglia, a selection mechanism at the crossroads of space, strategy, and reward</article-title>. <source>Prog. Neurobiol.</source> <volume>90</volume>, <fpage>385</fpage>&#x02013;<lpage>417</lpage>.<pub-id pub-id-type="doi">10.1016/j.pneurobio.2009.11.003</pub-id><pub-id pub-id-type="pmid">19941931</pub-id></citation></ref>
<ref id="B44"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Joshua</surname> <given-names>M.</given-names></name> <name><surname>Adler</surname> <given-names>A.</given-names></name> <name><surname>Bergman</surname> <given-names>H.</given-names></name></person-group> (<year>2009</year>). <article-title>The dynamics of dopamine in control of motor behavior</article-title>. <source>Curr. Opin. Neurobiol.</source> <volume>19</volume>, <fpage>615</fpage>&#x02013;<lpage>620</lpage>.<pub-id pub-id-type="doi">10.1016/j.conb.2009.10.001</pub-id><pub-id pub-id-type="pmid">19896833</pub-id></citation></ref>
<ref id="B45"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kass</surname> <given-names>R. E.</given-names></name> <name><surname>Raftery</surname> <given-names>A. E.</given-names></name></person-group> (<year>1995</year>). <article-title>Bayes factors</article-title>. <source>J. Am. Stat. Assoc.</source> <volume>90</volume>, <fpage>773</fpage>&#x02013;<lpage>795</lpage>.</citation></ref>
<ref id="B46"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kehagia</surname> <given-names>A. A.</given-names></name> <name><surname>Murray</surname> <given-names>G. K.</given-names></name> <name><surname>Robbins</surname> <given-names>T. W.</given-names></name></person-group> (<year>2010</year>). <article-title>Learning and cognitive flexibility: frontostriatal function and monoaminergic modulation</article-title>. <source>Curr. Opin. Neurobiol.</source> <volume>20</volume>, <fpage>199</fpage>&#x02013;<lpage>204</lpage>.<pub-id pub-id-type="doi">10.1016/j.conb.2010.01.007</pub-id><pub-id pub-id-type="pmid">20167474</pub-id></citation></ref>
<ref id="B47"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kellendonk</surname> <given-names>C.</given-names></name> <name><surname>Simpson</surname> <given-names>E. H.</given-names></name> <name><surname>Polan</surname> <given-names>H. J.</given-names></name> <name><surname>Malleret</surname> <given-names>G.</given-names></name> <name><surname>Vronskaya</surname> <given-names>S.</given-names></name> <name><surname>Winiger</surname> <given-names>V.</given-names></name> <name><surname>Moore</surname> <given-names>H.</given-names></name> <name><surname>Kandel</surname> <given-names>E. R.</given-names></name></person-group> (<year>2006</year>). <article-title>Transient and selective overexpression of dopamine D2 receptors in the striatum causes persistent abnormalities in prefrontal cortex functioning</article-title>. <source>Neuron</source> <volume>49</volume>, <fpage>603</fpage>&#x02013;<lpage>615</lpage>.<pub-id pub-id-type="doi">10.1016/j.neuron.2006.01.023</pub-id><pub-id pub-id-type="pmid">16476668</pub-id></citation></ref>
<ref id="B48"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kheirbek</surname> <given-names>M. A.</given-names></name> <name><surname>Beeler</surname> <given-names>J. A.</given-names></name> <name><surname>Ishikawa</surname> <given-names>Y.</given-names></name> <name><surname>Zhuang</surname> <given-names>X.</given-names></name></person-group> (<year>2008</year>). <article-title>A cAMP pathway underlying reward prediction in associative learning</article-title>. <source>J. Neurosci.</source> <volume>28</volume>, <fpage>11401</fpage>&#x02013;<lpage>11408</lpage>.<pub-id pub-id-type="doi">10.1523/JNEUROSCI.4115-08.2008</pub-id><pub-id pub-id-type="pmid">18971482</pub-id></citation></ref>
<ref id="B49"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kiyatkin</surname> <given-names>E. A.</given-names></name> <name><surname>Rebec</surname> <given-names>G. V.</given-names></name></person-group> (<year>1996</year>). <article-title>Dopaminergic modulation of glutamate-induced excitations of neurons in the neostriatum and nucleus accumbens of awake, unrestrained rats</article-title>. <source>J. Neurophysiol.</source> <volume>75</volume>, <fpage>142</fpage>&#x02013;<lpage>153</lpage>.<pub-id pub-id-type="pmid">8822548</pub-id></citation></ref>
<ref id="B50"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lau</surname> <given-names>B.</given-names></name> <name><surname>Glimcher</surname> <given-names>P. W.</given-names></name></person-group> (<year>2005</year>). <article-title>Dynamic response-by-response models of matching behavior in rhesus monkeys</article-title>. <source>J. Exp. Anal. Behav.</source> <volume>84</volume>, <fpage>555</fpage>&#x02013;<lpage>579</lpage>.<pub-id pub-id-type="doi">10.1901/jeab.2005.110-04</pub-id><pub-id pub-id-type="pmid">16596980</pub-id></citation></ref>
<ref id="B51"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Le Foll</surname> <given-names>B.</given-names></name> <name><surname>Gallo</surname> <given-names>A.</given-names></name> <name><surname>Le Strat</surname> <given-names>Y.</given-names></name> <name><surname>Lu</surname> <given-names>L.</given-names></name> <name><surname>Gorwood</surname> <given-names>P.</given-names></name></person-group> (<year>2009</year>). <article-title>Genetics of dopamine receptors and drug addiction: a comprehensive review</article-title>. <source>Behav. Pharmacol.</source> <volume>20</volume>, <fpage>1</fpage>&#x02013;<lpage>17</lpage>.<pub-id pub-id-type="doi">10.1097/FBP.0b013e3283242f05</pub-id><pub-id pub-id-type="pmid">19179847</pub-id></citation></ref>
<ref id="B52"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lo</surname> <given-names>C. C.</given-names></name> <name><surname>Wang</surname> <given-names>X. J.</given-names></name></person-group> (<year>2006</year>). <article-title>Cortico-basal ganglia circuit mechanism for a decision threshold in reaction time tasks</article-title>. <source>Nat. Neurosci.</source> <volume>9</volume>, <fpage>956</fpage>&#x02013;<lpage>963</lpage>.<pub-id pub-id-type="doi">10.1038/nn1722</pub-id><pub-id pub-id-type="pmid">16767089</pub-id></citation></ref>
<ref id="B53"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Lyons</surname> <given-names>M.</given-names></name> <name><surname>Robbins</surname> <given-names>T. W.</given-names></name></person-group> (<year>1975</year>). <article-title>&#x0201C;The action of central nervous system stimulant drugs: a general theory concerning amphetamine effects,&#x0201D;</article-title> in <source>Current Developments in Psychopharmacology</source>, Vol. <volume>2</volume>, ed. <person-group person-group-type="editor"><name><surname>Essman</surname> <given-names>W.</given-names></name></person-group> (<publisher-loc>New York</publisher-loc>: <publisher-name>Spectrum</publisher-name>), <fpage>79</fpage>&#x02013;<lpage>163</lpage>.</citation></ref>
<ref id="B54"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Marco-Pallares</surname> <given-names>J.</given-names></name> <name><surname>Cucurell</surname> <given-names>D.</given-names></name> <name><surname>Cunillera</surname> <given-names>T.</given-names></name> <name><surname>Kramer</surname> <given-names>U. M.</given-names></name> <name><surname>Camara</surname> <given-names>E.</given-names></name> <name><surname>Nager</surname> <given-names>W.</given-names></name> <name><surname>Bauer</surname> <given-names>P.</given-names></name> <name><surname>Schule</surname> <given-names>R.</given-names></name> <name><surname>Schols</surname> <given-names>L.</given-names></name> <name><surname>Munte</surname> <given-names>T. F.</given-names></name> <name><surname>Rodriguez-Fornells</surname> <given-names>A.</given-names></name></person-group> (<year>2009</year>). <article-title>Genetic variability in the dopamine system (dopamine receptor D4, catechol-<italic>O</italic>-methyltransferase) modulates neurophysiological responses to gains and losses</article-title>. <source>Biol. Psychiatry</source> <volume>66</volume>, <fpage>154</fpage>&#x02013;<lpage>161</lpage>.<pub-id pub-id-type="doi">10.1016/j.biopsych.2009.01.006</pub-id><pub-id pub-id-type="pmid">19251248</pub-id></citation></ref>
<ref id="B55"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mingote</surname> <given-names>S.</given-names></name> <name><surname>Weber</surname> <given-names>S. M.</given-names></name> <name><surname>Ishiwari</surname> <given-names>K.</given-names></name> <name><surname>Correa</surname> <given-names>M.</given-names></name> <name><surname>Salamone</surname> <given-names>J. D.</given-names></name></person-group> (<year>2005</year>). <article-title>Ratio and time requirements on operant schedules: effort-related effects of nucleus accumbens dopamine depletions</article-title>. <source>Eur. J. Neurosci.</source> <volume>21</volume>, <fpage>1749</fpage>&#x02013;<lpage>1757</lpage>.<pub-id pub-id-type="doi">10.1111/j.1460-9568.2005.03972.x</pub-id><pub-id pub-id-type="pmid">15845103</pub-id></citation></ref>
<ref id="B56"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mink</surname> <given-names>J. W.</given-names></name></person-group> (<year>1996</year>). <article-title>The basal ganglia: focused selection and inhibition of competing motor programs</article-title>. <source>Prog. Neurobiol.</source> <volume>50</volume>, <fpage>381</fpage>&#x02013;<lpage>425</lpage>.<pub-id pub-id-type="doi">10.1016/S0301-0082(96)00042-1</pub-id><pub-id pub-id-type="pmid">9004351</pub-id></citation></ref>
<ref id="B57"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mogenson</surname> <given-names>G. J.</given-names></name> <name><surname>Jones</surname> <given-names>D. L.</given-names></name> <name><surname>Yim</surname> <given-names>C. Y.</given-names></name></person-group> (<year>1980</year>). <article-title>From motivation to action: functional interface between the limbic system and the motor system</article-title>. <source>Prog. Neurobiol.</source> <volume>14</volume>, <fpage>69</fpage>&#x02013;<lpage>97</lpage>.<pub-id pub-id-type="doi">10.1016/0301-0082(80)90018-0</pub-id><pub-id pub-id-type="pmid">6999537</pub-id></citation></ref>
<ref id="B58"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Montague</surname> <given-names>P. R.</given-names></name> <name><surname>Dayan</surname> <given-names>P.</given-names></name> <name><surname>Sejnowski</surname> <given-names>T. J.</given-names></name></person-group> (<year>1996</year>). <article-title>A framework for mesencephalic dopamine systems based on predictive Hebbian learning</article-title>. <source>J. Neurosci.</source> <volume>16</volume>, <fpage>1936</fpage>&#x02013;<lpage>1947</lpage>.<pub-id pub-id-type="pmid">8774460</pub-id></citation></ref>
<ref id="B59"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Monte-Silva</surname> <given-names>K.</given-names></name> <name><surname>Kuo</surname> <given-names>M. F.</given-names></name> <name><surname>Thirugnanasambandam</surname> <given-names>N.</given-names></name> <name><surname>Liebetanz</surname> <given-names>D.</given-names></name> <name><surname>Paulus</surname> <given-names>W.</given-names></name> <name><surname>Nitsche</surname> <given-names>M. A.</given-names></name></person-group> (<year>2009</year>). <article-title>Dose-dependent inverted U-shaped effect of dopamine (D2-like) receptor activation on focal and nonfocal plasticity in humans</article-title>. <source>J. Neurosci.</source> <volume>29</volume>, <fpage>6124</fpage>&#x02013;<lpage>6131</lpage>.<pub-id pub-id-type="doi">10.1523/JNEUROSCI.0728-09.2009</pub-id><pub-id pub-id-type="pmid">19439590</pub-id></citation></ref>
<ref id="B60"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Morice</surname> <given-names>E.</given-names></name> <name><surname>Billard</surname> <given-names>J. M.</given-names></name> <name><surname>Denis</surname> <given-names>C.</given-names></name> <name><surname>Mathieu</surname> <given-names>F.</given-names></name> <name><surname>Betancur</surname> <given-names>C.</given-names></name> <name><surname>Epelbaum</surname> <given-names>J.</given-names></name> <name><surname>Giros</surname> <given-names>B.</given-names></name> <name><surname>Nosten-Bertrand</surname> <given-names>M.</given-names></name></person-group> (<year>2007</year>). <article-title>Parallel loss of hippocampal LTD and cognitive flexibility in a genetic model of hyperdopaminergia</article-title>. <source>Neuropsychopharmacology</source> <volume>32</volume>, <fpage>2108</fpage>&#x02013;<lpage>2116</lpage>.<pub-id pub-id-type="doi">10.1038/sj.npp.1301354</pub-id><pub-id pub-id-type="pmid">17342172</pub-id></citation></ref>
<ref id="B61"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Moron</surname> <given-names>J. A.</given-names></name> <name><surname>Brockington</surname> <given-names>A.</given-names></name> <name><surname>Wise</surname> <given-names>R. A.</given-names></name> <name><surname>Rocha</surname> <given-names>B. A.</given-names></name> <name><surname>Hope</surname> <given-names>B. T.</given-names></name></person-group> (<year>2002</year>). <article-title>Dopamine uptake through the norepinephrine transporter in brain regions with low levels of the dopamine transporter: evidence from knock-out mouse lines</article-title>. <source>J. Neurosci.</source> <volume>22</volume>, <fpage>389</fpage>&#x02013;<lpage>395</lpage>.<pub-id pub-id-type="pmid">11784783</pub-id></citation></ref>
<ref id="B62"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mundorf</surname> <given-names>M. L.</given-names></name> <name><surname>Joseph</surname> <given-names>J. D.</given-names></name> <name><surname>Austin</surname> <given-names>C. M.</given-names></name> <name><surname>Caron</surname> <given-names>M. G.</given-names></name> <name><surname>Wightman</surname> <given-names>R. M.</given-names></name></person-group> (<year>2001</year>). <article-title>Catecholamine release and uptake in the mouse prefrontal cortex</article-title>. <source>J. Neurochem.</source> <volume>79</volume>, <fpage>130</fpage>&#x02013;<lpage>142</lpage>.<pub-id pub-id-type="doi">10.1046/j.1471-4159.2001.00554.x</pub-id><pub-id pub-id-type="pmid">11595765</pub-id></citation></ref>
<ref id="B63"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nicola</surname> <given-names>S. M.</given-names></name></person-group> (<year>2007</year>). <article-title>The nucleus accumbens as part of a basal ganglia action selection circuit</article-title>. <source>Psychopharmacology (Berl.)</source> <volume>191</volume>, <fpage>521</fpage>&#x02013;<lpage>550</lpage>.<pub-id pub-id-type="doi">10.1007/s00213-006-0510-4</pub-id><pub-id pub-id-type="pmid">16983543</pub-id></citation></ref>
<ref id="B64"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nicola</surname> <given-names>S. M.</given-names></name> <name><surname>Surmeier</surname> <given-names>J.</given-names></name> <name><surname>Malenka</surname> <given-names>R. C.</given-names></name></person-group> (<year>2000</year>). <article-title>Dopaminergic modulation of neuronal excitability in the striatum and nucleus accumbens</article-title>. <source>Annu. Rev. Neurosci.</source> <volume>23</volume>, <fpage>185</fpage>&#x02013;<lpage>215</lpage>.<pub-id pub-id-type="doi">10.1146/annurev.neuro.23.1.185</pub-id><pub-id pub-id-type="pmid">10845063</pub-id></citation></ref>
<ref id="B65"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Niv</surname> <given-names>Y.</given-names></name> <name><surname>Daw</surname> <given-names>N. D.</given-names></name> <name><surname>Joel</surname> <given-names>D.</given-names></name> <name><surname>Dayan</surname> <given-names>P.</given-names></name></person-group> (<year>2007</year>). <article-title>Tonic dopamine: opportunity costs and the control of response vigor</article-title>. <source>Psychopharmacology (Berl.)</source> <volume>191</volume>, <fpage>507</fpage>&#x02013;<lpage>520</lpage>.<pub-id pub-id-type="doi">10.1007/s00213-006-0502-4</pub-id><pub-id pub-id-type="pmid">17031711</pub-id></citation></ref>
<ref id="B66"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Palmiter</surname> <given-names>R. D.</given-names></name></person-group> (<year>2008</year>). <article-title>Dopamine signaling in the dorsal striatum is essential for motivated behaviors: lessons from dopamine-deficient mice</article-title>. <source>Ann. N. Y. Acad. Sci.</source> <volume>1129</volume>, <fpage>35</fpage>&#x02013;<lpage>46</lpage>.<pub-id pub-id-type="doi">10.1196/annals.1417.003</pub-id><pub-id pub-id-type="pmid">18591467</pub-id></citation></ref>
<ref id="B67"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pecina</surname> <given-names>S.</given-names></name> <name><surname>Cagniard</surname> <given-names>B.</given-names></name> <name><surname>Berridge</surname> <given-names>K. C.</given-names></name> <name><surname>Aldridge</surname> <given-names>J. W.</given-names></name> <name><surname>Zhuang</surname> <given-names>X.</given-names></name></person-group> (<year>2003</year>). <article-title>Hyperdopaminergic mutant mice have higher &#x0201C;wanting&#x0201D; but not &#x0201C;liking&#x0201D; for sweet rewards</article-title>. <source>J. Neurosci.</source> <volume>23</volume>, <fpage>9395</fpage>&#x02013;<lpage>9402</lpage>.<pub-id pub-id-type="pmid">14561867</pub-id></citation></ref>
<ref id="B68"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pennartz</surname> <given-names>C. M.</given-names></name> <name><surname>Berke</surname> <given-names>J. D.</given-names></name> <name><surname>Graybiel</surname> <given-names>A. M.</given-names></name> <name><surname>Ito</surname> <given-names>R.</given-names></name> <name><surname>Lansink</surname> <given-names>C. S.</given-names></name> <name><surname>van der Meer</surname> <given-names>M.</given-names></name> <name><surname>Redish</surname> <given-names>A. D.</given-names></name> <name><surname>Smith</surname> <given-names>K. S.</given-names></name> <name><surname>Voorn</surname> <given-names>P.</given-names></name></person-group> (<year>2009</year>). <article-title>Corticostriatal interactions during learning, memory processing, and decision making</article-title>. <source>J. Neurosci.</source> <volume>29</volume>, <fpage>12831</fpage>&#x02013;<lpage>12838</lpage>.<pub-id pub-id-type="doi">10.1523/JNEUROSCI.3177-09.2009</pub-id><pub-id pub-id-type="pmid">19828796</pub-id></citation></ref>
<ref id="B69"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Phillips</surname> <given-names>P. E.</given-names></name> <name><surname>Walton</surname> <given-names>M. E.</given-names></name> <name><surname>Jhou</surname> <given-names>T. C.</given-names></name></person-group> (<year>2007</year>). <article-title>Calculating utility: preclinical evidence for cost-benefit analysis by mesolimbic dopamine</article-title>. <source>Psychopharmacology (Berl.)</source> <volume>191</volume>, <fpage>483</fpage>&#x02013;<lpage>495</lpage>.<pub-id pub-id-type="doi">10.1007/s00213-006-0626-6</pub-id><pub-id pub-id-type="pmid">17119929</pub-id></citation></ref>
<ref id="B70"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Redgrave</surname> <given-names>P.</given-names></name> <name><surname>Prescott</surname> <given-names>T. J.</given-names></name> <name><surname>Gurney</surname> <given-names>K.</given-names></name></person-group> (<year>1999</year>). <article-title>The basal ganglia: a vertebrate solution to the selection problem?</article-title> <source>Neuroscience</source> <volume>89</volume>, <fpage>1009</fpage>&#x02013;<lpage>1023</lpage>.<pub-id pub-id-type="doi">10.1016/S0306-4522(98)00319-4</pub-id><pub-id pub-id-type="pmid">10362291</pub-id></citation></ref>
<ref id="B71"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Reynolds</surname> <given-names>J. N.</given-names></name> <name><surname>Wickens</surname> <given-names>J. R.</given-names></name></person-group> (<year>2002</year>). <article-title>Dopamine-dependent plasticity of corticostriatal synapses</article-title>. <source>Neural. Netw.</source> <volume>15</volume>, <fpage>507</fpage>&#x02013;<lpage>521</lpage>.<pub-id pub-id-type="doi">10.1016/S0893-6080(02)00045-X</pub-id><pub-id pub-id-type="pmid">12371508</pub-id></citation></ref>
<ref id="B72"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rowland</surname> <given-names>N. E.</given-names></name> <name><surname>Vaughan</surname> <given-names>C. H.</given-names></name> <name><surname>Mathes</surname> <given-names>C. M.</given-names></name> <name><surname>Mitra</surname> <given-names>A.</given-names></name></person-group> (<year>2008</year>). <article-title>Feeding behavior, obesity, and neuroeconomics</article-title>. <source>Physiol. Behav.</source> <volume>93</volume>, <fpage>97</fpage>&#x02013;<lpage>109</lpage>.<pub-id pub-id-type="doi">10.1016/j.physbeh.2007.08.003</pub-id><pub-id pub-id-type="pmid">17825853</pub-id></citation></ref>
<ref id="B73"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rudebeck</surname> <given-names>P. H.</given-names></name> <name><surname>Walton</surname> <given-names>M. E.</given-names></name> <name><surname>Smyth</surname> <given-names>A. N.</given-names></name> <name><surname>Bannerman</surname> <given-names>D. M.</given-names></name> <name><surname>Rushworth</surname> <given-names>M. F.</given-names></name></person-group> (<year>2006</year>). <article-title>Separate neural pathways process different decision costs</article-title>. <source>Nat. Neurosci.</source> <volume>9</volume>, <fpage>1161</fpage>&#x02013;<lpage>1168</lpage>.<pub-id pub-id-type="doi">10.1038/nn1756</pub-id><pub-id pub-id-type="pmid">16921368</pub-id></citation></ref>
<ref id="B74"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Salamone</surname> <given-names>J. D.</given-names></name></person-group> (<year>2006</year>). <article-title>Will the last person who uses the term &#x02018;reward&#x02019; please turn out the lights? Comments on processes related to reinforcement, learning, motivation and effort</article-title>. <source>Addict. Biol.</source> <volume>11</volume>, <fpage>43</fpage>&#x02013;<lpage>44</lpage>.<pub-id pub-id-type="doi">10.1111/j.1369-1600.2006.00011.x</pub-id><pub-id pub-id-type="pmid">16759335</pub-id></citation></ref>
<ref id="B75"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Salamone</surname> <given-names>J. D.</given-names></name></person-group> (<year>2007</year>). <article-title>Functions of mesolimbic dopamine: changing concepts and shifting paradigms</article-title>. <source>Psychopharmacology (Berl.)</source> <volume>191</volume>, <fpage>389</fpage>.<pub-id pub-id-type="doi">10.1007/s00213-006-0623-9</pub-id><pub-id pub-id-type="pmid">17334798</pub-id></citation></ref>
<ref id="B76"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Salamone</surname> <given-names>J. D.</given-names></name> <name><surname>Correa</surname> <given-names>M.</given-names></name></person-group> (<year>2002</year>). <article-title>Motivational views of reinforcement: implications for understanding the behavioral functions of nucleus accumbens dopamine</article-title>. <source>Behav. Brain Res.</source> <volume>137</volume>, <fpage>3</fpage>&#x02013;<lpage>25</lpage>.<pub-id pub-id-type="doi">10.1016/S0166-4328(02)00282-6</pub-id><pub-id pub-id-type="pmid">12445713</pub-id></citation></ref>
<ref id="B77"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Salamone</surname> <given-names>J. D.</given-names></name> <name><surname>Correa</surname> <given-names>M.</given-names></name> <name><surname>Mingote</surname> <given-names>S.</given-names></name> <name><surname>Weber</surname> <given-names>S. M.</given-names></name></person-group> (<year>2003</year>). <article-title>Nucleus accumbens dopamine and the regulation of effort in food-seeking behavior: implications for studies of natural motivation, psychiatry, and drug abuse</article-title>. <source>J. Pharmacol. Exp. Ther.</source> <volume>305</volume>, <fpage>1</fpage>&#x02013;<lpage>8</lpage>.<pub-id pub-id-type="doi">10.1124/jpet.102.035063</pub-id><pub-id pub-id-type="pmid">12649346</pub-id></citation></ref>
<ref id="B78"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Salamone</surname> <given-names>J. D.</given-names></name> <name><surname>Wisniecki</surname> <given-names>A.</given-names></name> <name><surname>Carlson</surname> <given-names>B. B.</given-names></name> <name><surname>Correa</surname> <given-names>M.</given-names></name></person-group> (<year>2001</year>). <article-title>Nucleus accumbens dopamine depletions make animals highly sensitive to high fixed ratio requirements but do not impair primary food reinforcement</article-title>. <source>Neuroscience</source> <volume>105</volume>, <fpage>863</fpage>&#x02013;<lpage>870</lpage>.<pub-id pub-id-type="doi">10.1016/S0306-4522(01)00249-4</pub-id><pub-id pub-id-type="pmid">11530224</pub-id></citation></ref>
<ref id="B79"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schellekens</surname> <given-names>A. F.</given-names></name> <name><surname>Grootens</surname> <given-names>K. P.</given-names></name> <name><surname>Neef</surname> <given-names>C.</given-names></name> <name><surname>Movig</surname> <given-names>K. L.</given-names></name> <name><surname>Buitelaar</surname> <given-names>J. K.</given-names></name> <name><surname>Ellenbroek</surname> <given-names>B.</given-names></name> <name><surname>Verkes</surname> <given-names>R. J.</given-names></name></person-group> (<year>2010</year>). <article-title>Effect of apomorphine on cognitive performance and sensorimotor gating in humans</article-title>. <source>Psychopharmacology</source> <volume>207</volume>, <fpage>559</fpage>&#x02013;<lpage>569</lpage>.<pub-id pub-id-type="doi">10.1007/s00213-009-1686-1</pub-id><pub-id pub-id-type="pmid">19834690</pub-id></citation></ref>
<ref id="B80"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schultz</surname> <given-names>W.</given-names></name></person-group> (<year>2007a</year>). <article-title>Behavioral dopamine signals</article-title>. <source>Trends Neurosci.</source> <volume>30</volume>, <fpage>203</fpage>&#x02013;<lpage>210</lpage>.<pub-id pub-id-type="doi">10.1016/j.tins.2007.03.007</pub-id></citation></ref>
<ref id="B81"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schultz</surname> <given-names>W.</given-names></name></person-group> (<year>2007b</year>). <article-title>Multiple dopamine functions at different time courses</article-title>. <source>Annu. Rev. Neurosci.</source> <volume>30</volume>, <fpage>259</fpage>&#x02013;<lpage>288</lpage>.<pub-id pub-id-type="doi">10.1146/annurev.neuro.28.061604.135722</pub-id></citation></ref>
<ref id="B82"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schultz</surname> <given-names>W.</given-names></name> <name><surname>Apicella</surname> <given-names>P.</given-names></name> <name><surname>Ljungberg</surname> <given-names>T.</given-names></name></person-group> (<year>1993</year>). <article-title>Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task</article-title>. <source>J. Neurosci.</source> <volume>13</volume>, <fpage>900</fpage>&#x02013;<lpage>913</lpage>.<pub-id pub-id-type="pmid">8441015</pub-id></citation></ref>
<ref id="B83"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schultz</surname> <given-names>W.</given-names></name> <name><surname>Dayan</surname> <given-names>P.</given-names></name> <name><surname>Montague</surname> <given-names>P. R.</given-names></name></person-group> (<year>1997</year>). <article-title>A neural substrate of prediction and reward</article-title>. <source>Science</source> <volume>275</volume>, <fpage>1593</fpage>&#x02013;<lpage>1599</lpage>.</citation></ref>
<ref id="B84"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schultz</surname> <given-names>W.</given-names></name> <name><surname>Dickinson</surname> <given-names>A.</given-names></name></person-group> (<year>2000</year>). <article-title>Neuronal coding of prediction errors</article-title>. <source>Annu. Rev. Neurosci.</source> <volume>23</volume>, <fpage>473</fpage>&#x02013;<lpage>500</lpage>.<pub-id pub-id-type="doi">10.1146/annurev.neuro.23.1.473</pub-id><pub-id pub-id-type="pmid">10845072</pub-id></citation></ref>
<ref id="B85"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schwarz</surname> <given-names>G.</given-names></name></person-group> (<year>1978</year>). <article-title>Estimating the dimension of a model</article-title>. <source>Ann. Stat.</source> <volume>6</volume>, <fpage>461</fpage>&#x02013;<lpage>464</lpage>.<pub-id pub-id-type="doi">10.1214/aos/1176344136</pub-id></citation></ref>
<ref id="B86"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Seamans</surname> <given-names>J. K.</given-names></name> <name><surname>Floresco</surname> <given-names>S. B.</given-names></name> <name><surname>Phillips</surname> <given-names>A. G.</given-names></name></person-group> (<year>1998</year>). <article-title>D1 receptor modulation of hippocampal-prefrontal cortical circuits integrating spatial memory with executive functions in the rat</article-title>. <source>J. Neurosci.</source> <volume>18</volume>, <fpage>1613</fpage>&#x02013;<lpage>1621</lpage>.<pub-id pub-id-type="pmid">9454866</pub-id></citation></ref>
<ref id="B87"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Servan-Schreiber</surname> <given-names>D.</given-names></name> <name><surname>Printz</surname> <given-names>H.</given-names></name> <name><surname>Cohen</surname> <given-names>J. D.</given-names></name></person-group> (<year>1990</year>). <article-title>A network model of catecholamine effects: gain, signal-to-noise ratio, and behavior</article-title>. <source>Science</source> <volume>249</volume>, <fpage>892</fpage>&#x02013;<lpage>895</lpage>.</citation></ref>
<ref id="B88"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sesack</surname> <given-names>S. R.</given-names></name> <name><surname>Hawrylak</surname> <given-names>V. A.</given-names></name> <name><surname>Guido</surname> <given-names>M. A.</given-names></name> <name><surname>Levey</surname> <given-names>A. I.</given-names></name></person-group> (<year>1998</year>). <article-title>Cellular and subcellular localization of the dopamine transporter in rat cortex</article-title>. <source>Adv. Pharmacol.</source> <volume>42</volume>, <fpage>171</fpage>&#x02013;<lpage>174</lpage>.</citation></ref>
<ref id="B89"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shidara</surname> <given-names>M.</given-names></name> <name><surname>Aigner</surname> <given-names>T. G.</given-names></name> <name><surname>Richmond</surname> <given-names>B. J.</given-names></name></person-group> (<year>1998</year>). <article-title>Neuronal signals in the monkey ventral striatum related to progress through a predictable series of trials</article-title>. <source>J. Neurosci.</source> <volume>18</volume>, <fpage>2613</fpage>&#x02013;<lpage>2625</lpage>.<pub-id pub-id-type="pmid">9502820</pub-id></citation></ref>
<ref id="B90"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shidara</surname> <given-names>M.</given-names></name> <name><surname>Mizuhiki</surname> <given-names>T.</given-names></name> <name><surname>Richmond</surname> <given-names>B. J.</given-names></name></person-group> (<year>2005</year>). <article-title>Neuronal firing in anterior cingulate neurons changes modes across trials in single states of multitrial reward schedules</article-title>. <source>Exp. Brain Res.</source> <volume>163</volume>, <fpage>242</fpage>&#x02013;<lpage>245</lpage>.<pub-id pub-id-type="doi">10.1007/s00221-005-2232-y</pub-id><pub-id pub-id-type="pmid">15912371</pub-id></citation></ref>
<ref id="B91"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shidara</surname> <given-names>M.</given-names></name> <name><surname>Richmond</surname> <given-names>B. J.</given-names></name></person-group> (<year>2004</year>). <article-title>Differential encoding of information about progress through multi-trial reward schedules by three groups of ventral striatal neurons</article-title>. <source>Neurosci. Res.</source> <volume>49</volume>, <fpage>307</fpage>&#x02013;<lpage>314</lpage>.<pub-id pub-id-type="doi">10.1016/j.neures.2004.03.008</pub-id><pub-id pub-id-type="pmid">15196779</pub-id></citation></ref>
<ref id="B92"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sugrue</surname> <given-names>L. P.</given-names></name> <name><surname>Corrado</surname> <given-names>G. S.</given-names></name> <name><surname>Newsome</surname> <given-names>W. T.</given-names></name></person-group> (<year>2004</year>). <article-title>Matching behavior and the representation of value in the parietal cortex</article-title>. <source>Science</source> <volume>304</volume>, <fpage>1782</fpage>&#x02013;<lpage>1787</lpage>.</citation></ref>
<ref id="B93"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Sutton</surname> <given-names>R.</given-names></name> <name><surname>Barto</surname> <given-names>A.</given-names></name></person-group> (<year>1998</year>). <source>Reinforcement Learning: An Introduction</source>. <publisher-loc>Cambridge, MA</publisher-loc>: <publisher-name>MIT Press</publisher-name>.</citation></ref>
<ref id="B94"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Taylor</surname> <given-names>J. R.</given-names></name> <name><surname>Robbins</surname> <given-names>T. W.</given-names></name></person-group> (<year>1984</year>). <article-title>Enhanced behavioural control by conditioned reinforcers following microinjections of d-amphetamine into the nucleus accumbens</article-title>. <source>Psychopharmacology (Berl.)</source> <volume>84</volume>, <fpage>405</fpage>&#x02013;<lpage>412</lpage>.<pub-id pub-id-type="doi">10.1007/BF00555222</pub-id><pub-id pub-id-type="pmid">6440188</pub-id></citation></ref>
<ref id="B95"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tobler</surname> <given-names>P. N.</given-names></name> <name><surname>Fiorillo</surname> <given-names>C. D.</given-names></name> <name><surname>Schultz</surname> <given-names>W.</given-names></name></person-group> (<year>2005</year>). <article-title>Adaptive coding of reward value by dopamine neurons</article-title>. <source>Science</source> <volume>307</volume>, <fpage>1642</fpage>&#x02013;<lpage>1645</lpage>.</citation></ref>
<ref id="B96"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vijayraghavan</surname> <given-names>S.</given-names></name> <name><surname>Wang</surname> <given-names>M.</given-names></name> <name><surname>Birnbaum</surname> <given-names>S. G.</given-names></name> <name><surname>Williams</surname> <given-names>G. V.</given-names></name> <name><surname>Arnsten</surname> <given-names>A. F.</given-names></name></person-group> (<year>2007</year>). <article-title>Inverted-U dopamine D1 receptor actions on prefrontal neurons engaged in working memory</article-title>. <source>Nat. Neurosci.</source> <volume>10</volume>, <fpage>376</fpage>&#x02013;<lpage>384</lpage>.<pub-id pub-id-type="doi">10.1038/nn1846</pub-id><pub-id pub-id-type="pmid">17277774</pub-id></citation></ref>
<ref id="B97"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Weiss</surname> <given-names>S.</given-names></name> <name><surname>Nosten-Bertrand</surname> <given-names>M.</given-names></name> <name><surname>McIntosh</surname> <given-names>J. M.</given-names></name> <name><surname>Giros</surname> <given-names>B.</given-names></name> <name><surname>Martres</surname> <given-names>M. P.</given-names></name></person-group> (<year>2007</year>). <article-title>Nicotine improves cognitive deficits of dopamine transporter knockout mice without long-term tolerance</article-title>. <source>Neuropsychopharmacology</source> <volume>32</volume>, <fpage>2465</fpage>&#x02013;<lpage>2478</lpage>.<pub-id pub-id-type="doi">10.1038/sj.npp.1301385</pub-id><pub-id pub-id-type="pmid">17375139</pub-id></citation></ref>
<ref id="B98"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wickens</surname> <given-names>J. R.</given-names></name></person-group> (<year>2009</year>). <article-title>Synaptic plasticity in the basal ganglia</article-title>. <source>Behav. Brain Res.</source> <volume>199</volume>, <fpage>119</fpage>&#x02013;<lpage>128</lpage>.<pub-id pub-id-type="doi">10.1016/j.bbr.2008.10.030</pub-id><pub-id pub-id-type="pmid">19026691</pub-id></citation></ref>
<ref id="B99"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Williams</surname> <given-names>J.</given-names></name> <name><surname>Dayan</surname> <given-names>P.</given-names></name></person-group> (<year>2005</year>). <article-title>Dopamine, learning, and impulsivity: a biological account of attention-deficit/hyperactivity disorder</article-title>. <source>J. Child Adolesc. Psychopharmacol.</source> <volume>15</volume>, <fpage>160</fpage>&#x02013;<lpage>179</lpage>; discussion <fpage>157</fpage>&#x02013;<lpage>169</lpage>.<pub-id pub-id-type="doi">10.1089/cap.2005.15.160</pub-id><pub-id pub-id-type="pmid">15910202</pub-id></citation></ref>
<ref id="B100"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wise</surname> <given-names>R. A.</given-names></name></person-group> (<year>2004</year>). <article-title>Dopamine, learning and motivation</article-title>. <source>Nat. Rev. Neurosci.</source> <volume>5</volume>, <fpage>483</fpage>&#x02013;<lpage>494</lpage>.<pub-id pub-id-type="doi">10.1038/nrn1406</pub-id></citation></ref>
<ref id="B101"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>N.</given-names></name> <name><surname>Cepeda</surname> <given-names>C.</given-names></name> <name><surname>Zhuang</surname> <given-names>X.</given-names></name> <name><surname>Levine</surname> <given-names>M. S.</given-names></name></person-group> (<year>2007</year>). <article-title>Altered corticostriatal neurotransmission and modulation in dopamine transporter knock-down mice</article-title>. <source>J. Neurophysiol.</source> <volume>98</volume>, <fpage>423</fpage>&#x02013;<lpage>432</lpage>.<pub-id pub-id-type="doi">10.1152/jn.00971.2006</pub-id><pub-id pub-id-type="pmid">17522168</pub-id></citation></ref>
<ref id="B102"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xu</surname> <given-names>T. X.</given-names></name> <name><surname>Sotnikova</surname> <given-names>T. D.</given-names></name> <name><surname>Liang</surname> <given-names>C.</given-names></name> <name><surname>Zhang</surname> <given-names>J.</given-names></name> <name><surname>Jung</surname> <given-names>J. U.</given-names></name> <name><surname>Spealman</surname> <given-names>R. D.</given-names></name> <name><surname>Gainetdinov</surname> <given-names>R. R.</given-names></name> <name><surname>Yao</surname> <given-names>W. D.</given-names></name></person-group> (<year>2009</year>). <article-title>Hyperdopaminergic tone erodes prefrontal long-term potential via a D2 receptor-operated protein phosphatase gate</article-title>. <source>J. Neurosci.</source> <volume>29</volume>, <fpage>14086</fpage>&#x02013;<lpage>14099</lpage>.<pub-id pub-id-type="doi">10.1523/JNEUROSCI.0974-09.2009</pub-id><pub-id pub-id-type="pmid">19906957</pub-id></citation></ref>
<ref id="B103"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yin</surname> <given-names>H. H.</given-names></name> <name><surname>Zhuang</surname> <given-names>X.</given-names></name> <name><surname>Balleine</surname> <given-names>B. W.</given-names></name></person-group> (<year>2006</year>). <article-title>Instrumental learning in hyperdopaminergic mice</article-title>. <source>Neurobiol. Learn. Mem.</source> <volume>85</volume>, <fpage>283</fpage>&#x02013;<lpage>288</lpage>.<pub-id pub-id-type="doi">10.1016/j.nlm.2005.12.001</pub-id><pub-id pub-id-type="pmid">16423542</pub-id></citation></ref>
<ref id="B104"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhuang</surname> <given-names>X.</given-names></name> <name><surname>Oosting</surname> <given-names>R. S.</given-names></name> <name><surname>Jones</surname> <given-names>S. R.</given-names></name> <name><surname>Gainetdinov</surname> <given-names>R. R.</given-names></name> <name><surname>Miller</surname> <given-names>G. W.</given-names></name> <name><surname>Caron</surname> <given-names>M. G.</given-names></name> <name><surname>Hen</surname> <given-names>R.</given-names></name></person-group> (<year>2001</year>). <article-title>Hyperactivity and impaired response habituation in hyperdopaminergic mice</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>98</volume>, <fpage>1982</fpage>&#x02013;<lpage>1987</lpage>.<pub-id pub-id-type="doi">10.1073/pnas.98.4.1982</pub-id><pub-id pub-id-type="pmid">11172062</pub-id></citation></ref>
<ref id="B105"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zweifel</surname> <given-names>L. S.</given-names></name> <name><surname>Parker</surname> <given-names>J. G.</given-names></name> <name><surname>Lobb</surname> <given-names>C. J.</given-names></name> <name><surname>Rainwater</surname> <given-names>A.</given-names></name> <name><surname>Wall</surname> <given-names>V. Z.</given-names></name> <name><surname>Fadok</surname> <given-names>J. P.</given-names></name> <name><surname>Darvas</surname> <given-names>M.</given-names></name> <name><surname>Kim</surname> <given-names>M. J.</given-names></name> <name><surname>Mizumori</surname> <given-names>S. J.</given-names></name> <name><surname>Paladini</surname> <given-names>C. A.</given-names></name> <name><surname>Phillips</surname> <given-names>P. E.</given-names></name> <name><surname>Palmiter</surname> <given-names>R. D.</given-names></name></person-group> (<year>2009</year>). <article-title>Disruption of NMDAR-dependent burst firing by dopamine neurons provides selective assessment of phasic dopamine-dependent behavior</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>106</volume>, <fpage>7281</fpage>&#x02013;<lpage>7288</lpage>.<pub-id pub-id-type="doi">10.1073/pnas.0813415106</pub-id><pub-id pub-id-type="pmid">19342487</pub-id></citation></ref>
</ref-list>
</back>
</article>