Section C

Theoretical and Applied Linguistics

 

LI18: Computational Linguistics

This paper is available for the academic year 2018-19.

This paper provides an introduction to computational linguistics, covering the fundamental techniques which can be used to model linguistic phenomena computationally at the levels of morphology, syntax, semantics and pragmatics. Students are taught how such techniques are implemented, evaluated and applied to natural language processing (NLP) tasks. An overview of the use of such techniques is provided, along with an introduction to several applications (e.g. machine translation, sentiment analysis and dialogue systems). At the end of the course, students will understand basic computational linguistics techniques as well as their limitations and current performance levels when applied to linguistic research and to real-world tasks.

The course will follow the main textbook used for Computational Linguistics worldwide: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition by Daniel Jurafsky and James H. Martin (2008, Second Edition, Prentice-Hall). This book will be accessible to all those taking the paper. More specialised reading is listed in each chapter of the book. These and other relevant readings will be introduced to students during the lectures. Relevant readings are freely available on the Web (and will be downloadable as PDF documents). Additionally, we will draw on updated topics introduced in the new draft version of Jurafsky and Martin, available online at https://web.stanford.edu/~jurafsky/slp3/ed3book.pdf. Material on these topics will be summarised in the lecture notes.

Aims

  • To introduce the fundamental techniques of natural language processing (NLP)
  • To develop an understanding of the possibilities and limitations of those techniques
  • To understand the framework within which NLP continues to develop
  • To gain insights into current and future applications

Scope

  • Focus on basic natural language processing techniques at the levels of morphology, syntax, semantics and pragmatics
  • Focus on text (rather than speech) processing
  • No prerequisite courses in computational linguistics or computer science are required. This is an entry-level course accessible to any undergraduate student in linguistics, and it does not require any programming skills.

Topics:

Proposed lecture schedule/topics to be covered:

Michaelmas Term

1. Introduction: broad overview of NLP research, language models, complexity of language applications
2. Regular expressions, text normalization and edit distance
3. Finite state techniques
4. N-gram language models
5. Naïve Bayes and sentiment classification 
6. Part-of-speech tagging
7. Context-free grammars and syntactic parsing
8. Parsing algorithms and treebanks

Lent Term

9. Compositional semantics
10. Distributional semantics
11. Neural networks and neural language models
12. Lexical semantics
13. Computational discourse
14. Dialogue systems and chatbots
15. Information extraction and question answering
16. Machine translation

Preparatory reading: 

Daniel Jurafsky and James Martin, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. 2nd edn, available online: https://www.cs.colorado.edu/~martin/slp.html

Teaching and learning: 

The first part of the course (Michaelmas Term) will provide an introduction and cover morphological and syntactic processing of language. The second part (Lent Term) will focus on computational semantics and pragmatics, and introduce some well-known NLP applications.

You will receive sixteen lectures in total, eight in Michaelmas Term and eight in Lent Term. You will also have eight supervisions, normally three during Michaelmas Term, four in Lent Term and one in Easter Term.

The paper's Moodle site can be found here.

Assessment: 

Assessment will be by a three-hour written examination.

Course Contacts: 
Dr Nigel Collier