New trends in natural language processing: statistical natural language processing.

M Marcus
Author Information
  1. M Marcus: Department of Computer and Information Science, University of Pennsylvania, Philadelphia 19104-6389, USA.

Abstract

The field of natural language processing (NLP) has seen a dramatic shift in both research direction and methodology in the past several years. In the past, most work in computational linguistics tended to focus on purely symbolic methods. Recently, more and more work is shifting toward hybrid methods that combine new empirical corpus-based methods, including the use of probabilistic and information-theoretic techniques, with traditional symbolic methods. This work is made possible by the recent availability of linguistic databases that add rich linguistic annotation to corpora of natural language text. Already, these methods have led to a dramatic improvement in the performance of a variety of NLP systems, with similar improvement likely in the coming years. This paper focuses on these trends, surveying in particular three areas of recent progress: part-of-speech tagging, stochastic parsing, and lexical semantics.

References

  1. Proc Natl Acad Sci U S A. 1995 Oct 24;92(22):9964-9 [PMID: 7479810]

MeSH Terms

Algorithms
Computers
Humans
Language
Linguistics
Semantics
Stochastic Processes
User-Computer Interface
Vocabulary
Voice

