CSC3160 Fundamentals of Speech and Language Processing

What distinguishes speech and language processing from other data processing is the use of knowledge of language. In this course, we will study how to describe, process and compute different levels of language knowledge, including phonetics and phonology, morphology, syntax, and semantics, and how this knowledge is used in speech and language applications such as named entity recognition, information extraction, question answering, speech recognition, and speech synthesis.

News

This page is for Spring 2024. Please note that there is a significant change in assessment from Spring 2023.

I only write recommendation letters for students who get an 'A'. Sorry if you only get an 'A-'.

I am happy to refer you to internship opportunities if you make high-quality contributions to Amphion. This is an example of a high-quality contribution. See the comments from a top Silicon Valley investor.

Course Information

This course is designed as a first course for students who are interested in speech and language technology. The first half of the course focuses on the fundamentals and introduces tools for students to use, and the second half emphasises applications, giving students the opportunity to see how speech and language technology could impact human life. In particular, the topics include:

Understanding human speech: spectrogram, fundamental frequency, formant, etc.
Human sounds and their organization
Words and their relationship to other words
Syntax: Structure of sentences
Text processing and regular expressions
Language models
Embedding: Representations of the meaning of words
Word classification and named entity recognition
Applications: speech recognition, speech synthesis, machine translation, chatbot, etc.

Prerequisites

Basic concepts of probability: Some lectures will be easier to follow if you know the basics of probability.

Textbooks

Recommended Books:

Speech and Language Processing (3rd ed. draft), by Dan Jurafsky and James H. Martin

Grading Policy (CSC3160)

Assignments (40%)

Assignment 1 (10%): Speech alignment and synthesis
Assignment 2 (10%): Text processing
Assignment 3 (10%): Word embedding and classification
Assignment 4 (10%): TBD

Midterm exam (25%)

Date: Mar 11 (same as lecture time)

Scope: Lectures 1-11

Final exam (30%)

Date: May 8 (same as lecture time)

Scope: All lectures (including high-level concepts from guest lectures)

For CSC3160 students: if you would like to work on the final project instead of the final exam, please let me know in advance.

Final project (30%) [AIR6063 students only]

Due date: May 8 (EOD)

The project is to reproduce a paper within Amphion. The Amphion team will provide you with computational resources and guidelines. The topics include text processing, modeling and speech processing. If you have a preference, please let the teaching team know; otherwise, we can discuss and select a paper for you to work on. You can work as a team (2 students) or as an individual. You need to write a project report (max 6 pages) for the final project. Here is the report template.

Rating guideline

Implementation (20%): You need to submit your implementation in a single pull request. There are three rating categories:

No pull request: 0%
Pull request but not merged: Maximum 15%
Merged pull request: Default 20%. Here is an example of a merged pull request.

Overall quality of report (10%): This is an overall evaluation of the final project report, including all dimensions: writing, creativity, convincing experiments and analysis.

Participation (5%)

Here are some ways to earn the participation credit, which is capped at 5%.

Attending guest lectures: In the second half of the course, we have four invited speakers. We encourage students to attend the guest lectures and participate in Q&A. All students get 0.75% per guest lecture (3% in total), either by attending in person or by writing a guest lecture report if they attend remotely or watch the recording.
Completing feedback surveys: We will send out two feedback surveys during the semester to improve the course. All students get 0.5% per completed survey (1% in total).
Course and Teaching Evaluation (CTE): The school will send requests for CTE to all students. The CTE is worth 1% credit.
Contributing to Amphion: Fixing a bug or writing high-quality comments (for at least one model file) for Amphion is worth 1% credit.
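
As a rough illustration of how the cap works, the sketch below adds up the credits listed above and caps the total at 5%. The function name and its parameters are hypothetical and are not part of any official grading tool; the sketch assumes each item is counted exactly as described above.

```python
def participation_credit(guest_lectures=0, surveys=0,
                         cte_done=False, amphion_contribution=False):
    """Return participation credit in percentage points, capped at 5.0.

    Hypothetical helper for illustration only.
    """
    if not 0 <= guest_lectures <= 4:
        raise ValueError("there are four guest lectures")
    if not 0 <= surveys <= 2:
        raise ValueError("there are two feedback surveys")

    total = 0.75 * guest_lectures            # 0.75% per guest lecture
    total += 0.5 * surveys                   # 0.5% per feedback survey
    total += 1.0 if cte_done else 0.0        # CTE is worth 1%
    total += 1.0 if amphion_contribution else 0.0  # bug fix or comments: 1%
    return min(total, 5.0)                   # credit is capped at 5%
```

For example, a student who earns every item would total 6% before the cap and receive 5%.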

Late Policy

The penalty is 0.5% off the final course grade for each late day.
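
A minimal sketch of the arithmetic, assuming the penalty accumulates linearly with the number of late days and that the grade is never reduced below zero. The floor at zero is an assumption, not something stated in the policy, and the function name is hypothetical.

```python
def apply_late_penalty(final_grade, late_days):
    """Subtract 0.5 percentage points from the final course grade per late day.

    Hypothetical helper for illustration only; the floor at 0 is an assumption.
    """
    if late_days < 0:
        raise ValueError("late_days must be non-negative")
    return max(final_grade - 0.5 * late_days, 0.0)
```

For example, three late days would reduce a final course grade of 90% to 88.5%.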

Date	Lecture Description	Readings	Lecture Note	Events/Deadlines
Jan 8	Lecture 1: Introduction and course overview		[Slides]
Jan 10	Lecture 2: Understanding sounds	Pitch, loudness and timbre What is a Sound Spectrum? [Colab demo]	[Slides]
Jan 15	Lecture 3: Basics of speech signal processing and analysis		[Slides]
Jan 17	Lecture 4: Introduction of speech production		[Slides]	All assignments are out [Assignment 1] [Assignment 2] [Assignment 3] [Assignment 4]
Jan 22	Lecture 5: Speech representation	Basic Representations	[Slides]
Jan 24	Lecture 6: Phones and Phonation		[Slides]
Jan 29	No class
Jan 31	Lecture 8: Text processing		[Slides]	Assignment 1 due (11:59pm)
Feb 26	Lecture 9: Words, morphology, and parts of speech		[Slides]
Feb 28	Lecture 10: Word embedding		[Slides]	Assignment 2 due (11:59pm)
Mar 4	Lecture 11: Syntax: structure of sentences		[Slides]
Mar 6	Lecture 7: Speech perception		[Slides]
Mar 11	Mid-term exam (scope: lecture 1 - 11)
Mar 13	Lecture 12: Language model		[Slides]	Assignment 3 due (11:59pm)
Mar 18	No class
Mar 20	Lecture 13: Word2Vec and Sentiment Analysis		[Slides]
Mar 25	Lecture 14: TTS		[Slides]
Mar 27	Lecture 15: Voice conversion		[Slides]
Apr 1	Lecture 16: Automatic Speech Recognition		[Slides]
Apr 3	Lecture 17: Machine Translation		[Slides]
Apr 8	Lecture 18: Question answering		[Slides]
Apr 10	Lecture 19: Chatbot		[Slides]
Apr 15	No class: Instructor attending ICASSP 2024
Apr 17	No class: Instructor attending ICASSP 2024			Assignment 4 due (11:59pm)
Apr 22	Invited talk: Wei Li
Apr 24	Invited talk: Lei Wang
Apr 29	Invited talk: Xu Tan
May 6	Lecture 22: In-class Q&A and review for final exam
May 8	Final exam