Vicarious reinforcement learning signals when instructing others.

Matthew A J Apps, Elise Lesage, Narender Ramnani
Author Information
  1. Matthew A J Apps: Nuffield Department of Clinical Neuroscience, University of Oxford, Oxford OX1 9DU, United Kingdom; Department of Experimental Psychology, University of Oxford, Oxford OX1 2JD, United Kingdom; and Department of Psychology, Royal Holloway, University of London, Surrey TW20 0EX, United Kingdom. Email: matthew.apps@ndcn.ox.ac.uk.
  2. Elise Lesage: Department of Psychology, Royal Holloway, University of London, Surrey TW20 0EX, United Kingdom, and Neuroimaging Research Branch, Intramural Research Program, National Institute on Drug Abuse, National Institutes of Health, Baltimore, Maryland 21224.
  3. Narender Ramnani: Department of Psychology, Royal Holloway, University of London, Surrey TW20 0EX, United Kingdom.

Abstract

Reinforcement learning (RL) theory posits that learning is driven by discrepancies between the predicted and actual outcomes of actions (prediction errors [PEs]). In social environments, learning is often guided by similar RL mechanisms. For example, teachers monitor the actions of students and provide feedback to them. This feedback evokes PEs in students that guide their learning. We report the first study that investigates the neural mechanisms that underpin RL signals in the brain of a teacher. Neurons in the anterior cingulate cortex (ACC) signal PEs when learning from the outcomes of one's own actions but also signal information when outcomes are received by others. Does a teacher's ACC signal PEs when monitoring a student's learning? Using fMRI, we studied brain activity in human subjects (teachers) as they taught a confederate (student) action-outcome associations by providing positive or negative feedback. We examined activity time-locked to the student's responses, the point at which teachers can infer the student's predictions and already know the actual outcomes. We fitted an RL-based computational model to the behavior of the student to characterize their learning, and examined whether a teacher's ACC signals when a student's predictions are wrong. In line with our hypothesis, activity in the teacher's ACC covaried with the PE values in the model. Additionally, activity in the teacher's insula and ventromedial prefrontal cortex covaried with the predicted value according to the student. Our findings highlight that the ACC signals PEs vicariously for others' erroneous predictions, when monitoring and instructing their learning. These results suggest that RL mechanisms, processed vicariously, may underpin and facilitate teaching behaviors.
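
The abstract does not state the exact form of the RL model. As a minimal sketch, assuming a simple Rescorla-Wagner (delta-rule) update, trial-by-trial predicted values and PEs of the kind that could be compared against brain activity might be generated as follows. The function name, the learning rate `alpha`, and the initial value `v0` are illustrative assumptions, not the authors' implementation or parameters.

```python
from typing import List, Sequence, Tuple


def student_value_and_pe(
    outcomes: Sequence[float],
    alpha: float = 0.3,  # assumed learning rate (hypothetical value)
    v0: float = 0.5,     # assumed initial predicted value (hypothetical value)
) -> Tuple[List[float], List[float]]:
    """Return per-trial predicted values and prediction errors.

    outcomes: actual outcome on each trial (e.g. 1 = correct, 0 = incorrect).
    """
    values: List[float] = []
    pes: List[float] = []
    v = v0
    for outcome in outcomes:
        pe = outcome - v      # prediction error: actual minus predicted outcome
        values.append(v)      # prediction held before the outcome is known
        pes.append(pe)
        v = v + alpha * pe    # delta-rule update of the prediction
    return values, pes


if __name__ == "__main__":
    # Hypothetical outcome sequence for one action-outcome association.
    vals, errs = student_value_and_pe([1, 1, 0], alpha=0.5, v0=0.0)
    print(vals)
    print(errs)
```

In this sketch the PE is largest when the outcome departs most from the current prediction, which corresponds to the "student's predictions are wrong" condition described in the abstract.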

MeSH Term

Adolescent
Adult
Association Learning
Brain Mapping
Computer Simulation
Cues
Feedback, Psychological
Female
Gyrus Cinguli
Humans
Image Processing, Computer-Assisted
Individuality
Magnetic Resonance Imaging
Male
Oxygen
Reinforcement, Psychology
Teaching
Young Adult

Chemicals

Oxygen
