236610: Generative AI – Diffusion Models (Winter 2023/2024)

Given as an “Advanced Topics in CS” course

Last update: January 9th 2024

Lecturer:

  • Prof. Michael Elad (elad@cs.technion.ac.il)

Teaching Assistant:

  • Noam Elata (noamelata@campus.technion.ac.il)

Credit:

  • 2 Points

Time and Place: 

  • Sunday 16:30-18:30
  • The meetings will be held via Zoom (link)
  • In case of a hybrid class (Zoom + frontal), we will meet in Taub 4

Format: 

  • Registration should be done directly with the lecturer by email
  • This course will be given via Zoom (and not in person, as originally planned)
  • This course will be given in English
  • Meetings will be recorded and shared via YouTube
  • Priority is given to Technion graduate students from CS, EE, BioMed and DDS. Towards the beginning of the semester, if there are vacancies, we will admit undergraduate students with the appropriate credentials
  • Attendance will NOT be checked and will NOT affect the grade

Pre-requisites:

  • 236200 or 236201 (Introduction to Data Processing and Representation) or 046200 (Processing and Analysis of Images)
  • A demonstrable background in deep-learning programming: either an advanced course on this topic or a project

Parallel Courses:

  • 048954 (Statistical Methods in Image Processing) is NOT considered a parallel course, but students should avoid double-dipping with their projects

Course Description:

A fascinating topic in deep learning over the last decade is the creation of information out of ‘thin air’: the synthesis of images, video clips, music clips, and more. For the most part, this topic has been discussed in conjunction with learning machines called GANs, although other algorithmic tools have also been harnessed for this task, such as VAEs, energy-based methods, normalizing flows, and more. A real revolution is taking place these days with the introduction of diffusion methods as an alternative to all of the above. These methods are based on an iterative approach in which a Gaussian noise vector is gradually converted into a sample from the desired distribution, with the help of the score function of that distribution. As it turns out, this function can be approximated very accurately by a plain image denoiser – a simple algorithm designed to remove Gaussian noise from an image. This makes diffusion methods accessible and easy to train and run. In this course, we review the diffusion approach by going over the mathematical foundations underlying the proposed algorithms, becoming familiar with a variety of diffusion methods, and studying applications that rely on this technique, such as solving inverse problems, conditional sampling, and more.
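As a rough illustration of the idea described above (not taken from the course material), the sketch below samples a toy one-dimensional distribution using only a denoiser. It assumes the data take the values -1 and +1 with equal probability, for which the optimal (MMSE) denoiser is tanh(y / sigma^2) in closed form. The score is obtained from the denoiser through Tweedie's formula, and sampling uses annealed Langevin dynamics. The function names, the noise schedule and the step-size constant `c` are all illustrative assumptions.

```python
import numpy as np

def denoiser(y, sigma):
    """MMSE denoiser for x in {-1, +1} (equal odds), observed as y = x + sigma * noise."""
    return np.tanh(y / sigma**2)

def score(y, sigma):
    """Score of the noisy density via Tweedie's formula: (D(y) - y) / sigma^2."""
    return (denoiser(y, sigma) - y) / sigma**2

def annealed_langevin(n_samples=2000, sigma_max=3.0, sigma_min=0.01,
                      n_levels=20, steps_per_level=50, c=0.2, seed=0):
    """Turn Gaussian noise into samples by Langevin steps at decreasing noise levels."""
    rng = np.random.default_rng(seed)
    sigmas = np.geomspace(sigma_max, sigma_min, n_levels)  # assumed geometric schedule
    y = sigma_max * rng.standard_normal(n_samples)         # start from pure noise
    for sigma in sigmas:
        step = c * sigma**2                                # step size shrinks with sigma
        for _ in range(steps_per_level):
            z = rng.standard_normal(n_samples)
            # Langevin update: drift along the score plus injected Gaussian noise
            y = y + 0.5 * step * score(y, sigma) + np.sqrt(step) * z
    return y

samples = annealed_langevin()
print("fraction near +1:", np.mean(samples > 0))
print("largest distance from a mode:", np.max(np.abs(np.abs(samples) - 1.0)))
```

With this step-size choice each update reads y + (c/2)(D(y) - y) + sigma * sqrt(c) * z, i.e. a small move toward the denoiser's output plus fresh noise. In practice the closed-form denoiser would be replaced by a trained neural network.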

Course Structure:

  • After a few (6-7) lectures by the lecturer, we will concentrate on students’ lectures on their assigned projects, covering various recent papers in this domain. Each student participating in the course will be assigned a paper (or more, depending on the content) as the basis for their project.
  • The project itself will include the following activities:
    • Reading the paper(s) and understanding their content;
    • Implementing the algorithms suggested in the paper(s), if any;
    • Exploring possible extensions to the work suggested;
    • Preparing a slide show to present the above – the slides should be cleared by the lecturer before being presented;
    • Presenting the slides to the class; and
    • Optional (bonus grade): Submitting a report in a paper format to summarize all the above.

Comment: We may ask you to present the slides BEFORE the completion of the additional research. Also, we may hold these presentations in a single concentrated day after the exam period of the winter semester.

Grading Policy:

  • 50% of the grade will be dictated by the quality of the presentation created and the lecture given
  • 50% of the grade will be dictated by the quality of the work that goes beyond the paper’s content

Choosing a Project’s Paper:

  • Please choose your paper for the project from THIS LIST
  • Projects can be done in pairs.
  • Note: You have to notify the teaching team (both Noam and Miki) of the paper you have chosen. If it has been already taken, you will have to choose again
  • You are welcome to suggest your own choice of paper outside this list, but it should be (i) relevant to the course; (ii) accompanied by open source code, or a reasonable alternative to it; and (iii) not something so basic that we intend to present it in class

Tentative Syllabus: 

1. Introduction
– General Description and Administration
– Review of Basic Mathematical Tools

2. Background
– A Prior for Images–How and Why
– Evolution of Priors in Image Processing–Classical Era
– Priors in Image Processing: The Era of Deep-Learning

3. Introduction to Diffusion
– Say Hello to the Score Function
– Image Denoisers
– RED & PnP
– Langevin Dynamics
– Diffusion Models–Introduction

4. Alternative Diffusion Models
– DDPM: Forward Path
– DDPM: Reverse Path
– DDPM: The Formally Introduced Reverse Path
– Related Topic: Probability Estimation

5. Acceleration Methods
– Denoising Diffusion GANs
– Nested Diffusion
– Denoising Diffusion Implicit Models
– Other Acceleration Methods

6. Guided Diffusion
– The Concept of Guidance
– Classifier Guidance
– Classifier-Free Guidance

7. Diffusion for Inverse Problems
– Inverse Problems (IP)–Some Fundamentals
– Diffusion Models for IP–Conceptual Solutions
– Diffusion Models for IP–Diving into The Bayesian Approach
– Recent Work of Relevance

8. Diffusion: Applications [optional]
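
For orientation on the "DDPM: Forward Path" item in the syllabus above, here is a minimal sketch of the standard DDPM forward (noising) process. It is an illustration and not the course's own code. It assumes the commonly used linear schedule of beta values from 1e-4 to 0.02 over T = 1000 steps; the helper `q_sample` is a hypothetical name. The sketch checks that applying the one-step rule x_t = sqrt(1 - beta_t) x_{t-1} + sqrt(beta_t) eps repeatedly gives the same mean coefficient and variance as the closed form x_t = sqrt(alpha_bar_t) x_0 + sqrt(1 - alpha_bar_t) eps.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)   # assumed linear noise schedule
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)       # alpha_bar_t = product of alphas up to step t

def q_sample(x0, t, rng):
    """Draw x_t directly from x_0 using the closed-form marginal of the forward path."""
    eps = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return x_t, eps

# Track how the one-step rule transforms the mean coefficient and the noise variance.
mean_coef, var = 1.0, 0.0
for beta in betas:
    mean_coef *= np.sqrt(1.0 - beta)   # x_0 is scaled by sqrt(1 - beta_t) at each step
    var = (1.0 - beta) * var + beta    # old noise is scaled, new noise of variance beta_t is added

print("step-by-step mean coefficient:", mean_coef, "closed form:", np.sqrt(alpha_bar[-1]))
print("step-by-step variance:", var, "closed form:", 1.0 - alpha_bar[-1])

rng = np.random.default_rng(0)
x_T, _ = q_sample(np.ones(4), T - 1, rng)   # at t = T the sample is almost pure noise
```

The closed form is what makes training practical: a noisy sample at any step t can be drawn in one shot, without simulating the whole chain.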

Preliminary Resources:

People to Follow:

  • Stefano Ermon (Stanford CS)
  • Yang Song (OpenAI)
  • Arash Vahdat (NVIDIA)
  • Diederik P. Kingma (Google)
  • Prafulla Dhariwal (OpenAI)
  • Tim Salimans (Google)

Video Recordings:

  • The 1st meeting – A general description of the course, a review of basic mathematical tools, the importance of the prior in image processing
  • The 2nd meeting – Continuing the background chapter, describing the deep-learning era with a focus on image samplers
  • The 3rd meeting – Introducing the score function, its relation to denoisers, and then moving to Langevin dynamics and the diffusion models relying on it
  • The 4th meeting – Denoising Diffusion Probabilistic Models (DDPM) – a thorough derivation of the reverse path, and likelihood evaluations
  • The 5th meeting – Acceleration methods – DDIM, Denoising Diffusion GANs, Nested Diffusion and more (multi-scale, consistency, latent diffusion)
  • The 6th meeting – Guided Diffusion via Classifier Guidance and Classifier-Free Guidance, an introduction to text-to-image methods and a deep dive into Imagen
  • The 7th meeting –
  • The 8th meeting –