Regional personality assessment through social media language.

Salvatore Giorgi, Khoa Le Nguyen, Johannes C Eichstaedt, Margaret L Kern, David B Yaden, Michal Kosinski, Martin E P Seligman, Lyle H Ungar, H Andrew Schwartz, Gregory Park
Author Information
  1. Salvatore Giorgi: Department of Computer and Information Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
  2. Khoa Le Nguyen: Department of Psychology and Neuroscience, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA.
  3. Johannes C Eichstaedt: Department of Psychology, Institute for Human-Centered A.I., Stanford University, Stanford, California, USA.
  4. Margaret L Kern: Melbourne Graduate School of Education, University of Melbourne, Melbourne, Victoria, Australia.
  5. David B Yaden: Department of Psychiatry and Behavioral Sciences, Johns Hopkins University School of Medicine, Baltimore, Maryland, USA.
  6. Michal Kosinski: Graduate School of Business, Stanford University, Stanford, California, USA.
  7. Martin E P Seligman: Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
  8. Lyle H Ungar: Department of Computer and Information Science, University of Pennsylvania, Philadelphia, Pennsylvania, USA.
  9. H Andrew Schwartz: Department of Computer Science, Stony Brook University, Stony Brook, New York, USA.
  10. Gregory Park: Department of Psychology, University of Pennsylvania, Philadelphia, Pennsylvania, USA.

Abstract

OBJECTIVE: We explore the personality of counties as assessed through linguistic patterns on social media. Such studies were previously limited by the cost and feasibility of large-scale surveys; however, language-based computational models applied to large social media datasets now allow for large-scale personality assessment.
METHOD: We applied a language-based assessment of the five-factor model of personality to 6,064,267 U.S. Twitter users. We aggregated the Twitter-based personality scores to 2,041 counties and compared them with political, economic, social, and health outcomes measured through surveys and by government agencies.
RESULTS: There was significant personality variation across counties. Openness to experience was higher on the coasts, conscientiousness was uniformly spread, extraversion was higher in southern states, agreeableness was higher in western states, and emotional stability was highest in the south. Across 13 outcomes, language-based personality estimates replicated patterns that have been observed in individual-level and geographic studies. These include higher Republican vote share in less agreeable counties and increased life satisfaction in more conscientious counties.
CONCLUSIONS: Results suggest that regions vary in their personality and that these differences can be studied through computational linguistic analysis of social media. Furthermore, these methods may be used to explore other psychological constructs across geographies.
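
The aggregation and comparison steps described under METHOD can be illustrated with a minimal sketch. It assumes user-level trait scores are already available as (county, score) pairs and uses a plain unweighted mean per county followed by a Pearson correlation with a county-level outcome. The function names, the input format, and the `min_users` threshold are hypothetical choices made for illustration; the abstract does not specify how the authors aggregated scores or which statistics they computed.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def aggregate_to_counties(user_scores, min_users=2):
    """Average user-level trait scores within each county.

    user_scores: iterable of (county_id, score) pairs (assumed format).
    min_users: illustrative inclusion threshold, not taken from the paper.
    """
    by_county = defaultdict(list)
    for county_id, score in user_scores:
        by_county[county_id].append(score)
    # Keep only counties with enough users for a stable average.
    return {c: mean(s) for c, s in by_county.items() if len(s) >= min_users}


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def correlate_with_outcome(county_scores, county_outcomes):
    """Correlate county trait averages with an outcome over shared counties."""
    shared = sorted(set(county_scores) & set(county_outcomes))
    return pearson_r(
        [county_scores[c] for c in shared],
        [county_outcomes[c] for c in shared],
    )
```

A call such as `correlate_with_outcome(aggregate_to_counties(pairs), outcomes)` would give one trait-outcome correlation; repeating it per trait and per outcome yields the kind of comparison table the abstract alludes to.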

Grants

  1. R01 MH125702/NIMH NIH HHS

MeSH Term

Extraversion, Psychological
Humans
Language
Personality
Personality Assessment
Social Media
