Computational Linguistics, WS 2024/25

Winter Semester 2024/25
Prof. Dr. Alexander Koller
Tutors: Anina Klaus; Yash Sarrof
Tue, Fri 10-12; Building C72, Room -1.05 (go down the stairs around the elevator)

First class: Friday, October 18

The course “Computational Linguistics” is the introductory course to computational linguistics for MSc students. Its leading questions are: How can we uncover the hidden structure of natural language with computational methods? How can we design algorithms to do this efficiently, even in the face of massive ambiguity? How can we train statistical models that will help us choose the right interpretation? And what is the value of working with hidden linguistic structures when we can already do so much with purely neural LLMs?

The course covers a wide range of techniques for tagging, parsing, and semantics, including symbolic (classical statistical NLP, e.g. PCFGs), neural, and neurosymbolic models. It assumes very little previous knowledge about computational linguistics and then covers a lot of ground relatively quickly, making this a good class to take for first-year MSc students.

Online learning platform

We will make heavy use of Moodle for all course activities. I will upload the slides, assignments, and additional materials there, and will link to video recordings of the classes and to reading materials. You will also upload your solutions to the assignments there, and I urge you to use the discussion forum.

Please join the Moodle for this course as soon as you can. You will need a university account for this.

Structure of the course

We will meet twice a week for lectures and to discuss the assignments. I will assign some reading material for each lecture, and will assume that you have read this material before the lecture. This will allow us to cover more ground in the lectures, and allow you to identify questions that you’d like to ask.

Furthermore, I will hand out six assignments over the course of the semester. Assignments will mostly be programming projects that are designed to give you a deeper understanding of the course material. I will prepare the assignments under the assumption that you will use Python and NLTK to solve them. In the second half of the course, we will train neural networks using PyTorch; feel free to acquaint yourself with that too.

After you turn in each assignment, the tutors will score it, and we will then discuss your solutions in class. You should be prepared to explain your own solution to the other students. In addition, the tutors will offer regular tutorial sessions, in which you can ask us questions about the current assignment or anything else pertaining to the course. Attending the tutorials is voluntary, but strongly recommended.

Prerequisites

As an introductory class, this course does not assume any prior knowledge in computational linguistics. However, the six assignments are quite programming-intensive. To succeed in the course, you will need solid programming skills, or you must be prepared to develop them quickly as you go along.

We will give you a one-lecture crash course in Python as part of this course, and you can attend the Python programming course at the LST department in parallel. (Note that this is a course for our first-year BSc students, and it is taught in German.) However, if you have never programmed before, Computational Linguistics may not be an ideal course for you; consider taking only the Python course in your first semester, and then taking Computational Linguistics in your third semester.

Conversely, if you have a previous degree in computational linguistics, it is possible that you are already familiar with most of the content of this course. Feel free to come to the first class to hear in more detail what we cover; then you can decide if this course is worth your time.

Grading

This class is worth 6 credit points, which translates into 180 hours of work. Please schedule your semester accordingly.

The grade for the course is based 50% on the assignments and 50% on your final project. Towards the end of the semester, you will propose a topic for a small final project that applies or extends the techniques from the course. Generally speaking, the workload for a final project should be similar to that of an assignment. You will then work on your project in the term break, and submit a working system together with a short paper that explains what problem your system solves and how it does this. We will grade the project on the difficulty and creativity of the task, the quality of your solution, and the clarity of the presentation.

Because work on the final project takes place in the term break, there will be no opportunity for a resit exam. If you can’t finish your project by the deadline, I invite you to attend the course again next year.

In addition, you must successfully complete the assignments for the course. You must submit solutions to at least five of the six assignments. We will then add up your two best scores from Assignments 1-3 and your two best scores from Assignments 4-6. To pass the course, you must obtain at least 250 points (out of 400).
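
The point rule above can be sketched in a few lines of Python. This is only an illustration, not an official grade calculator. It assumes that each assignment is scored out of 100 points (inferred from the maximum of 400 over four counted scores), that an assignment you did not submit is recorded as None and counts as 0 points, and the function names are made up for this example.

```python
from typing import Optional, Sequence

PASS_THRESHOLD = 250  # from the rule above: at least 250 of 400 points
MIN_SUBMISSIONS = 5   # from the rule above: at least five of six assignments


def assignment_total(scores: Sequence[Optional[float]]) -> float:
    """Add the two best scores from Assignments 1-3 and the two best from 4-6."""
    if len(scores) != 6:
        raise ValueError("expected exactly six entries, one per assignment")
    # Assumption: a missing submission (None) counts as 0 points.
    filled = [0.0 if s is None else float(s) for s in scores]
    best_first_half = sorted(filled[:3], reverse=True)[:2]
    best_second_half = sorted(filled[3:], reverse=True)[:2]
    return sum(best_first_half) + sum(best_second_half)


def meets_requirement(scores: Sequence[Optional[float]]) -> bool:
    """Check both conditions: enough submissions and enough points."""
    submitted = sum(s is not None for s in scores)
    return submitted >= MIN_SUBMISSIONS and assignment_total(scores) >= PASS_THRESHOLD


# Hypothetical student who skipped Assignment 4.
example = [80, 55, 70, None, 60, 65]
print(assignment_total(example))   # 80 + 70 + 65 + 60 = 275.0
print(meets_requirement(example))  # True: five submissions, 275 >= 250
```

Note that both conditions must hold: under this reading, a student with four perfect scores but only four submissions would not meet the requirement.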