A final project poster session is planned for the end of the course (tentatively May 20, 2023). It gives students the opportunity to connect with the speech and language research and industry community.
Anyone from CUHK-Shenzhen or the speech and language technology community is welcome to join. More details will be provided closer to the event. Feel free to reach out!
This course is designed as a first course for students who are interested in speech and language technology. The first half of the course focuses on the fundamentals and introduces tools for students to use. The second half emphasises applications, giving students the opportunity to see how speech and language technology can impact human life. In particular, the topics include:
Recommended Books:
The mid-term exam will be held on March 9, 2023. It covers Lecture 1 through Lecture 12.
You need to write a project proposal (2 pages) and a project report (max 6 pages) for the final project. Here is the report template. You are also expected to report project milestones and make a project poster presentation. After the final project deadline, feel free to make your project open source.
Here are some ways to earn participation credit, which is capped at 5%.
Late submissions are penalized 0.5% of the final course grade for each late day.
Date | Lecture Description | Readings | Lecture Note | Events/Deadlines |
---|---|---|---|---|
Jan 4 | Tutorial 0: GitHub, LaTeX, and Colab | Learn LaTeX in 30 minutes; Colab official tutorial; Official tutorials of GitHub | [Slides] | Self-study |
Jan 5 | Lecture 1: Introduction and course overview | | [Slides] [Video] | |
Jan 10 | Lecture 2: Machine learning in a nutshell | Deep Learning in a Nutshell: Core Concepts; Machine learning, explained | [Slides] [Video] | |
Jan 11 | Tutorial 1: PyTorch | PyTorch Quickstarts; PyTorch Installation | [Slides] [Video] [Colab] | |
Jan 12 | Lecture 3: Understanding sound and acoustics | Pitch, loudness and timbre; What is a Sound Spectrum? | [Slides] [HTML] [Video] | Assignment 1 out |
Jan 15 | Tutorial 2: TorchAudio (by TorchAudio team) | TorchAudio Documentation | [Slides] [Video] | 10:00am via Zoom |
Jan 17 | Lecture 4: Understanding human speech | Voice Acoustics: an introduction; Introduction to Speech Processing | [Slides] [Video] | |
Feb 9 | Lecture 5: Human sounds and their organization | Chapter 25: Phonetics | [Slides] [Video] | |
Feb 14 | Lecture 6: Text processing and regular expressions | Chapter 2: Regular Expressions, Text Normalization, Edit Distance | [Slides] [Video] | Assignment 2 out; Assignment 1 due (11:59pm) |
Feb 15 | Tutorial 3: Text processing | | | |
Feb 16 | Lecture 7: Words and their relationship to other words | | [Slides] [Video] | |
Feb 21 | Lecture 8: Syntax: Structure of sentences | | [Slides] [Video] | |
Feb 23 | Lecture 9: Language models | Chapter 3: N-gram Language Models; Chapter 7: Neural Networks and Neural Language Models | [Slides] [Video] | Assignment 2 due (11:59pm) |
Feb 28 | Lecture 10: Language models | Chapter 3: N-gram Language Models; Chapter 7: Neural Networks and Neural Language Models | [Slides] [Video] | |
Mar 2 | Lecture 11: Embedding: Representations of the meaning of words | Chapter 6: Vector Semantics and Embeddings | [Slides] [Video] | Project proposal due (11:59pm) |
Mar 7 | Lecture 12: Embedding: Representations of the meaning of words | Chapter 6: Vector Semantics and Embeddings | [Slides] [Video] | |
Mar 8 | Tutorial 4: Word embedding | | | |
Mar 9 | Midterm exam | | | Assignment 3 out |
Mar 14 | Lecture 13: Word classifications and Named entities recognition | | [Slides] [Video] | |
Mar 15 | Tutorial 5: Visualization and plotting | | | |
Mar 16 | Lecture 14: SLP Application - Sentiment analysis | | [Slides] [Video] | |
Mar 21 | Lecture 15: SLP Application - Text summarization | | [Slides] [Video] | Assignment 3 due (11:59pm) |
Mar 22 | Lecture 16: Summarizing Conversations: From Meetings to Social Media (by Nancy Chen) | | | Invited talk. Location: DY103, Time: 12-13 |
Mar 28 | Lecture 17: SLP Application - Fundamentals of speech recognition (by Xiong Xiao) | | | Invited guest lecture |
Mar 30 | Lecture 18: SLP Application - Text-to-speech synthesis | | [Slides] [Video] | Project milestone 1 due (11:59pm) |
Apr 6 | Lecture 19: SLP Application - Voice conversion | | [Slides] [Video] | |
Apr 11 | Lecture 20: SLP Application - Machine translation | Chapter 10: Machine Translation | [Slides] [Video] | |
Apr 13 | Lecture 21: SLP Application - Question answering | Chapter 23: Question Answering | [Slides] [Video] | |
Apr 18 | Lecture 22: SLP Application - Chatbot | Chapter 24: Chatbots and Dialogue Systems | [Slides] [Video] | Project milestone 2 due (11:59pm) |
Apr 20 | Lecture 23: Industry applications of speech and language processing | | | Invited talk |
Apr 25 | Lecture 24: Industry applications of speech and language processing | | | Invited talk |
Apr 27 | Final project review preparation | | | |
May 4 | Final project review preparation | | | Final project report early submission due (11:59pm) |
May 11 | Final project report due (11:59pm) | | | |
May 20 | Final project poster session | | | Open to the CUHK-Shenzhen community and invited guests. Details will be available soon. Tentative time: 9am - 1pm. |