This page is for Spring 2024. Please note that there is a significant change in assessment from Spring 2023.
I only write recommendation letters if you get an 'A'. Sorry if you only get an 'A-'.
I am happy to refer you to internship opportunities if you make high-quality contributions to Amphion. This is an example of a high-quality contribution. See the comments from a top Silicon Valley investor.
This course is designed as the first course for students who are interested in speech and language technology. The first half of the course focuses on the fundamentals and introduces tools for students to use, and the second half emphasises applications, giving students the opportunity to see how speech and language technology could impact human life. In particular, the topics include:
Recommended Books:
Date: Mar 11 (same as lecture time)
Scope: Lectures 1 - 11
Date: May 8 (same as lecture time)
Scope: All lectures (including high-level concepts from guest lectures)
For CSC3160 students: if you would like to work on the final project instead of the final exam, please let me know in advance.
Due date: May 8 (EOD)
The project is to reproduce a paper within Amphion. The Amphion team will provide you with computational resources and guidelines. The topics span text processing, modeling, and speech processing. If you have a preference, please let the teaching team know; otherwise, we can discuss and select a paper for you to work on. You can work as a team (2 students) or as an individual. You need to write a project report (max 6 pages) for the final project. Here is the report template.
Rating guideline
Here are some ways to earn the participation credit, which is capped at 5%.
The penalty is 0.5% off the final course grade for each late day.
Date | Lecture Description | Readings | Lecture Note | Events/Deadlines |
---|---|---|---|---|
Jan 8 | Lecture 1: Introduction and course overview | [Slides] | ||
Jan 10 | Lecture 2: Understanding sounds | Pitch, loudness and timbre; What is a Sound Spectrum?; [Colab demo] | [Slides] | |
Jan 15 | Lecture 3: Basics of speech signal processing and analysis | [Slides] | ||
Jan 17 | Lecture 4: Introduction of speech production | [Slides] | All assignments are out: [Assignment 1] [Assignment 2] [Assignment 3] [Assignment 4] | |
Jan 22 | Lecture 5: Speech representation | Basic Representations | [Slides] | |
Jan 24 | Lecture 6: Phones and Phonation | [Slides] | ||
Jan 29 | No class | |||
Jan 31 | Lecture 8: Text processing | [Slides] | Assignment 1 due (11:59pm) | |
Feb 26 | Lecture 9: Words, morphology, and parts of speech | [Slides] | ||
Feb 28 | Lecture 10: Word embedding | [Slides] | Assignment 2 due (11:59pm) | |
Mar 4 | Lecture 11: Syntax: structure of sentences | [Slides] | ||
Mar 6 | Lecture 7: Speech perception | [Slides] | ||
Mar 11 | Mid-term exam (scope: lecture 1 - 11) | |||
Mar 13 | Lecture 12: Language model | [Slides] | Assignment 3 due (11:59pm) | |
Mar 18 | No class | |||
Mar 20 | Lecture 13: Word2Vec and Sentiment Analysis | [Slides] | ||
Mar 25 | Lecture 14: TTS | [Slides] | ||
Mar 27 | Lecture 15: Voice conversion | [Slides] | ||
Apr 1 | Lecture 16: Automatic Speech Recognition | [Slides] | ||
Apr 3 | Lecture 17: Machine Translation | [Slides] | ||
Apr 8 | Lecture 18: Question answering | [Slides] | ||
Apr 10 | Lecture 19: Chatbot | [Slides] | ||
Apr 15 | No class: Instructor attending ICASSP 2024 | |||
Apr 17 | No class: Instructor attending ICASSP 2024 | Assignment 4 due (11:59pm) | ||
Apr 22 | Invited talk: Wei Li | |||
Apr 24 | Invited talk: Lei Wang | |||
Apr 29 | Invited talk: Xu Tan | |||
May 6 | Lecture 22: In-class Q&A and review for final exam | |||
May 8 | Final exam |