Implicit Value Updating Explains Transitive Inference Performance: The Betasort Model.

Greg Jensen, Fabian Muñoz, Yelda Alkan, Vincent P Ferrera, Herbert S Terrace
Author Information
  1. Greg Jensen: Department of Neuroscience, Columbia University, New York, New York, United States of America; Department of Psychology, Columbia University, New York, New York, United States of America.
  2. Fabian Muñoz: Department of Neuroscience, Columbia University, New York, New York, United States of America.
  3. Yelda Alkan: Department of Neuroscience, Columbia University, New York, New York, United States of America.
  4. Vincent P Ferrera: Department of Neuroscience, Columbia University, New York, New York, United States of America; Department of Psychiatry, Columbia University, New York, New York, United States of America.
  5. Herbert S Terrace: Department of Psychology, Columbia University, New York, New York, United States of America.

Abstract

Transitive inference (the ability to infer that B > D given that B > C and C > D) is a widespread characteristic of serial learning, observed in dozens of species. Despite these robust behavioral effects, reinforcement learning models reliant on reward prediction error or associative strength routinely fail to perform these inferences. We propose an algorithm called betasort, inspired by cognitive processes, which performs transitive inference at low computational cost. This is accomplished by (1) representing stimulus positions along a unit span using beta distributions, (2) treating positive and negative feedback asymmetrically, and (3) updating the position of every stimulus during every trial, whether that stimulus was visible or not. Performance was compared for rhesus macaques, humans, and the betasort algorithm, as well as Q-learning, an established reward-prediction error (RPE) model. Of these, only Q-learning failed to respond above chance during critical test trials. Betasort's success (when compared to RPE models) and its computational efficiency (when compared to full Markov decision process implementations) suggest that the study of reinforcement learning in organisms will be best served by a feature-driven approach to comparing formal models.
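
The three features listed in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `BetaSketch`, its methods, the unit-sized increments, and the rule used to shift unseen stimuli are all assumptions made for illustration, and any additional parameters of the published model are omitted. Each stimulus keeps two counts that define a beta distribution over the unit span; rewarded trials only consolidate the current estimates, unrewarded trials move the presented pair apart, and every stimulus is updated on every trial.

```python
import random


class BetaSketch:
    """Illustrative sketch only; names and update sizes are assumptions."""

    def __init__(self, n_items, seed=None):
        # One beta distribution per stimulus, starting at Beta(1, 1) (uniform).
        self.upper = [1.0] * n_items  # evidence that the item sits high
        self.lower = [1.0] * n_items  # evidence that the item sits low
        self.rng = random.Random(seed)

    def position(self, i):
        # Mean of Beta(upper, lower): the estimated position on the unit span.
        return self.upper[i] / (self.upper[i] + self.lower[i])

    def choose(self, a, b):
        # Sample a position for each presented item and pick the larger one.
        xa = self.rng.betavariate(self.upper[a], self.lower[a])
        xb = self.rng.betavariate(self.upper[b], self.lower[b])
        return a if xa > xb else b

    def _consolidate(self, i, p):
        # Reinforce the current estimate without moving its mean.
        self.upper[i] += p
        self.lower[i] += 1.0 - p

    def update(self, chosen, other, rewarded):
        # Positions are read before any count changes on this trial.
        pos = [self.position(i) for i in range(len(self.upper))]
        if rewarded:
            # Positive feedback: consolidate every item, seen or unseen.
            for i, p in enumerate(pos):
                self._consolidate(i, p)
            return
        # Negative feedback: push the chosen item down and the other item up.
        self.lower[chosen] += 1.0
        self.upper[other] += 1.0
        lo = min(pos[chosen], pos[other])
        hi = max(pos[chosen], pos[other])
        for i, p in enumerate(pos):
            if i in (chosen, other):
                continue
            if p > hi:
                self.upper[i] += 1.0  # unseen item above the pair moves up
            elif p < lo:
                self.lower[i] += 1.0  # unseen item below the pair moves down
            else:
                self._consolidate(i, p)  # unseen item between the pair stays


if __name__ == "__main__":
    # Train on adjacent pairs only (item 0 ranks highest), then probe B vs D.
    model = BetaSketch(5, seed=1)
    for _ in range(2000):
        hi_item = model.rng.randrange(4)
        lo_item = hi_item + 1
        pick = model.choose(hi_item, lo_item)
        other = lo_item if pick == hi_item else hi_item
        model.update(pick, other, rewarded=(pick == hi_item))
    picks_b = sum(model.choose(1, 3) == 1 for _ in range(1000))
    print("proportion choosing B over D:", picks_b / 1000)
```

The demo at the bottom trains only on adjacent pairs and then probes the never-presented pair B vs D, mirroring the critical test trials described above. The printed proportion depends on the seed and on the assumed update sizes, so it should not be read as a reproduction of the reported results.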

Grants

  1. R01 MH081153/NIMH NIH HHS
  2. 5R01MH081153/NIMH NIH HHS

MeSH Terms

Algorithms
Animals
Computational Biology
Humans
Learning
Macaca mulatta
Machine Learning
Male
Models, Neurological
Models, Statistical
Reward
