Talks


Patch-Based Image Processing

From Sparse Representations to Deep Learning
4-8 September 2017
Summer School "Signal Processing meets Deep Learning" in Capri

Within the wide field of sparse approximation, convolutional sparse coding (CSC) has gained increasing attention in recent years. This model assumes a structured dictionary built as a union of banded circulant matrices. Most of the attention has been devoted to the practical side of CSC, proposing efficient algorithms for the pursuit problem, and identifying applications that benefit from this model. Interestingly, a systematic theoretical understanding of CSC seems to have been left aside, with the assumption that the existing classical results are sufficient. In this talk we start by presenting a novel analysis of the CSC model and its associated pursuit. Our study is based on the observation that while being global, this model can be characterized and analyzed locally.

We show that uniqueness of the representation, its stability with respect to noise, and successful greedy or convex recovery are all guaranteed assuming that the underlying representation is locally sparse. These new results are much stronger and more informative than those obtained by deploying the classical sparse theory. Armed with these new insights, we proceed by proposing a multi-layer extension of this model, ML-CSC, in which signals are assumed to emerge from a cascade of CSC layers. This, in turn, is shown to be tightly connected to Convolutional Neural Networks (CNN), so much so that the forward-pass of the CNN is in fact the thresholding pursuit serving the ML-CSC model. This connection brings a fresh view to CNN, as we are able to attribute to this architecture theoretical claims such as uniqueness of the representations throughout the network, and their stable estimation, all guaranteed under simple local sparsity conditions. Lastly, identifying the weaknesses in the above scheme, we propose an alternative to the forward-pass algorithm, which is both tightly connected to deconvolutional and recurrent neural networks, and has better theoretical guarantees.
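
As a rough illustration of the claimed CNN connection, the sketch below implements the layered thresholding pursuit whose thresholding steps play the role of a CNN forward pass. It uses small random (non-convolutional) dictionaries with made-up sizes, so it is only an illustration of the idea, not the talk's actual construction, in which the dictionaries are banded circulant.

```python
import numpy as np

def soft_threshold(x, beta):
    # Element-wise soft-thresholding (shrinkage) operator.
    return np.sign(x) * np.maximum(np.abs(x) - beta, 0.0)

def ml_csc_forward(x, dictionaries, thresholds):
    """Layered thresholding pursuit: estimate the representation of every
    layer of an ML-CSC model, layer by layer. With a one-sided (ReLU-like)
    threshold this coincides with the forward pass of a CNN."""
    gamma = x
    estimates = []
    for D, beta in zip(dictionaries, thresholds):
        gamma = soft_threshold(D.T @ gamma, beta)  # correlate with atoms, then shrink
        estimates.append(gamma)
    return estimates

# Toy example with random dictionaries, for illustration only.
rng = np.random.default_rng(0)
D1 = rng.standard_normal((64, 128))   # layer-1 dictionary
D2 = rng.standard_normal((128, 256))  # layer-2 dictionary
x = rng.standard_normal(64)
gammas = ml_csc_forward(x, [D1, D2], thresholds=[0.5, 0.5])
print([g.shape for g in gammas])
```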

This 3-hour talk was given at the Summer School "Signal Processing meets Deep Learning" in Capri. It summarizes portions of the PhD work of my three students, Vardan Papyan, Yaniv Romano, and Jeremias Sulam.
Style Transfer via Texture Synthesis
March 19th, 2017
Hebrew University

Style-transfer is a process of migrating a style from a given image to the content of another, synthesizing a new image which is an artistic mixture of the two. Recent work on this problem adopting Convolutional Neural Networks (CNN) ignited a renewed interest in this field, due to the very impressive results obtained. There exists an alternative path towards handling the style-transfer task, via generalization of texture-synthesis algorithms. I will present a novel such style-transfer algorithm that extends the texture-synthesis work of Kwatra et al. (2005), while aiming to produce stylized images that come closer in quality to the CNN-based ones.

This talk was given in the computer vision seminar in the Hebrew University. This is a joint work with Peyman Milanfar - Google Research.
The Dichotomy between Global Processing and Local Modeling
December 7-11, 2015
Berlin, Germany

Recent work in image processing repeatedly shows highly efficient reconstruction algorithms that lean on modeling of small overlapping patches. Such methods impose a local model in order to regularize a global inverse problem. Why does this work so well? Does this leave room for improvements? What does a local model imply globally on the unknown signal? In this talk we will start from algorithmic attempts that aim to understand this dichotomy in order to narrow the global-local gap. Gradually, we will turn the discussion to a theoretical point of view that provides a deeper understanding of such local models, and their global implications.
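
One common way to write down this dichotomy, given here as an illustrative formulation rather than the exact objective used in the talk, is a global inverse problem regularized by a local sparse model imposed on every extracted patch:

```latex
\min_{x,\{\alpha_k\}} \;
  \frac{\lambda}{2}\,\|Hx - y\|_2^2
  + \sum_k \left( \frac{1}{2}\,\|R_k x - D\alpha_k\|_2^2 + \mu\,\|\alpha_k\|_0 \right)
```

Here y is the measured image, H the degradation operator, R_k the operator extracting the k-th patch, and D a local dictionary; the global unknown x is thus constrained only through its small overlapping patches.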

This was given as a plenary talk in the International Matheon Conference on Compressed-Sensing and its Applications. The talk is based on joint work with Dmitry Batenkov, Jeremias Sulam, Vardan Papyan, and Yaniv Romano.
Facial Image Compression using Patch-Ordering-Based Adaptive Wavelet Transform
April 19-24, 2015
ICASSP, Brisbane, Australia

Compression of frontal facial images is an appealing and important application. Recent work has shown that specially tailored algorithms for this task can lead to performance far exceeding JPEG2000. This paper proposes a novel such compression algorithm, exploiting our recently developed redundant tree-based wavelet transform. Originally meant for functions defined on graphs and clouds of points, this new transform has been shown to be highly effective as an image-adaptive redundant and multi-scale decomposition. The key concept behind this method is reordering of the image pixels so as to form a highly smooth 1D signal that can be sparsified by a regular wavelet. In this work we bring this image-adaptive transform to the realm of compression of aligned frontal facial images. Given a training set of such images, the transform is designed to best sparsify the whole set using a common feature-ordering. Our compression scheme consists of sparse coding using the transform, followed by entropy coding of the obtained coefficients. The inverse transform and a post-processing stage are used to decode the compressed image. We demonstrate the performance of the proposed scheme and compare it to other competing algorithms.
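
A minimal sketch of the encoding idea described above, assuming the PyWavelets package and a hypothetical pixel ordering `order` learned from the training set; the entropy coding and post-processing stages are omitted, so this is only an illustration of the reorder-then-sparsify concept.

```python
import numpy as np
import pywt  # PyWavelets, assumed available

def encode_with_ordering(image, order, wavelet="db4", keep=0.05):
    """Reorder the pixels by a fixed, training-derived ordering so the
    resulting 1D signal is smooth, sparsify it with an ordinary 1D wavelet,
    and keep only the largest coefficients (entropy coding omitted)."""
    signal = image.ravel()[order]                 # common ordering shared by all faces
    coeffs = pywt.wavedec(signal, wavelet)
    flat, slices = pywt.coeffs_to_array(coeffs)
    k = max(1, int(keep * flat.size))
    flat[np.argsort(np.abs(flat))[:-k]] = 0.0     # sparse coding by hard thresholding
    return flat, slices

def decode_with_ordering(flat, slices, order, shape, wavelet="db4"):
    coeffs = pywt.array_to_coeffs(flat, slices, output_format="wavedec")
    signal = pywt.waverec(coeffs, wavelet)[: np.prod(shape)]
    image = np.empty(np.prod(shape))
    image[order] = signal                         # undo the reordering
    return image.reshape(shape)
```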

This poster was presented at ICASSP 2015. It has been accepted as an IEEE-SPL paper.
Wavelet for Graphs and its Deployment to Image Processing
May 12-14, 2014
SIAM Imaging Science, Hong Kong

What if we take all the overlapping patches from a given image and organize them to create the shortest path by using their mutual Euclidean distances? This suggests a reordering of the image pixels in a way that creates a maximal 1D regularity. What could we do with such a construction? In this talk we consider a wider perspective of the above, and introduce a wavelet transform for graph-structured data. The proposed transform is based on a 1D wavelet decomposition coupled with a pre-reordering of the input so as to best sparsify the given data. We adapt this transform to image processing tasks by considering the image as a graph, where every patch is a node, and edges are obtained by Euclidean distances between corresponding patches. We show several ways to use the above ideas in practice, leading to state-of-the-art image denoising, deblurring, inpainting, and face-image compression results.
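
A minimal sketch of the reordering step, assuming an (N, p) array of flattened patches. The actual construction in the talk is considerably smarter and more efficient; this greedy O(N^2) nearest-neighbor pass only approximates the shortest-path idea.

```python
import numpy as np

def greedy_patch_ordering(patches):
    """Build an approximate shortest path through all patches by greedy
    nearest-neighbor hops in Euclidean distance. The returned permutation
    can then be applied to the pixel values before a plain 1D wavelet."""
    n = patches.shape[0]
    visited = np.zeros(n, dtype=bool)
    order = [0]                        # start arbitrarily from the first patch
    visited[0] = True
    for _ in range(n - 1):
        dists = np.linalg.norm(patches - patches[order[-1]], axis=1)
        dists[visited] = np.inf        # never revisit a patch
        nxt = int(np.argmin(dists))
        order.append(nxt)
        visited[nxt] = True
    return np.array(order)
```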

This is a joint work with Idan Ram and Israel Cohen. This talk was given as a plenary talk at SIAM Imaging Science in Hong Kong.
Image Processing via Pixel Permutation
April 1st, 2014
Israel Machine Vision Conference (IMVC), in Tel-Aviv, Israel

Images are 2D signals, and should be processed as such — this is the common belief in the image processing community. Is it truly the case? Around thirty years ago, some researchers suggested converting images into 1D signals, so as to harness well-developed 1D tools such as adaptive-filtering and Kalman-estimation techniques. These attempts resulted in poorly performing algorithms, strengthening the above belief. Why should we force unnatural causality between spatially ordered pixels? Indeed, why? In this talk I will present a conversion of images into 1D signals that leads to state-of-the-art results in a series of applications – denoising, inpainting, compression, and more. The core idea in our work is that there exists a permutation of the image pixels that carries in it most of the “spatial content”, and this ordering is within reach, even if the image is corrupted. We expose this permutation and use it in order to process the image as if it were a one-dimensional signal, treating successfully a series of image processing problems.

This is a joint work with Idan Ram and Israel Cohen. This talk was given as a plenary talk in the Israel Machine Vision Conference (IMVC).
Sparse Modeling of Graph-Structured Data ... and ... Images
March 13 - 15, 2014
The Institute of Statistical Mathematics, Tachikawa, Tokyo

Images, video, audio, text documents, financial data, medical information, traffic info – all these and many others are data sources that can be effectively processed. Why? Is it obvious? In this talk we will start by discussing “modeling” of data as a way to enable their actual processing, putting emphasis on sparsity-based models. We will turn our attention to graph-structured data and propose a tailored sparsifying transform for its dimensionality reduction and subsequent processing. We shall conclude by showing how this new transform becomes relevant and powerful in revisiting … classical image processing tasks.

This is a joint work with Idan Ram and Israel Cohen. This talk was given as a plenary talk in a Workshop on Mathematical Approaches to Large-Dimensional Data Analysis.
Wavelet for Graphs and its Deployment to Image Processing
July 9th, 2013
SPARS-2013, Lausanne

What if we take all the overlapping patches from a given image and organize them to create the shortest path by using their mutual distances? This suggests a reordering of the image pixels in a way that creates a maximal 1D regularity. What could we do with such a construction? In this talk we consider a wider perspective of the above, and introduce a wavelet transform for graph-structured data. The proposed transform is based on a 1D wavelet decomposition coupled with a pre-reordering of the input so as to best sparsify the given data. We adapt this transform to image processing tasks by considering the image as a graph, where every patch is a node, and edges are obtained by Euclidean distances between corresponding patches. We show several ways to use the above ideas in practice, leading to state-of-the-art image denoising, deblurring, and inpainting results.

This is a joint work with Idan Ram and Israel Cohen. This talk was given as a plenary talk in SPARS-2013. It was also given in Erasmus University, Rotterdam, the Netherlands, on January 30th, 2014.
Sparse Modeling of Graph-Structured Data ... and ... Images
May 28-29, 2013
Technion, Israel

Images, video, audio, text documents, financial data, medical information, traffic info — all these and many others are data sources that can be effectively processed. Why? Is it obvious? In this talk we will start by discussing “modeling” of data as a way to enable their actual processing, putting emphasis on sparsity-based models. We will turn our attention to graph-structured data and propose a tailored sparsifying transform for its dimensionality reduction and subsequent processing. We shall conclude by showing how this new transform becomes relevant and powerful in revisiting … classical image processing tasks.

This is a joint work with Idan Ram and Israel Cohen. This talk was given as an invited talk in the 3rd Annual International TCE Conference on Machine Learning & Big Data, May 28-29, 2013, in the Technion, Israel.
Recent Results on the Co-Sparse Analysis Model
March 18-22, 2013
84th Annual Meeting of the International Association of Applied Mathematics and Mechanics (GAMM), Novi-Sad, Serbia

In this talk we describe the co-sparse analysis model, with emphasis on pursuit algorithms and dictionary learning for it. We present two of our recent activities on this subject: (i) A theoretical study of the Analysis-Thresholding algorithm, exposing measures of goodness for the dictionary that govern the pursuit performance; and (ii) The development of an analysis K-SVD algorithm that trains a dictionary from signal examples and its use for image denoising.
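
A minimal sketch of an analysis-thresholding step of the kind studied in item (i), with illustrative variable names; the talk's theoretical contribution concerns when such a step succeeds, not this particular implementation.

```python
import numpy as np

def analysis_thresholding(y, Omega, cosupport_size):
    """Pick the rows of the analysis dictionary Omega that respond most weakly
    to the noisy signal y, declare them the co-support, and project y onto the
    null space of those rows (assumed full row rank)."""
    scores = np.abs(Omega @ y)
    cosupport = np.argsort(scores)[:cosupport_size]   # rows with smallest responses
    Omega_L = Omega[cosupport]
    # Orthogonal projector onto the null space of Omega_L.
    P = np.eye(len(y)) - Omega_L.T @ np.linalg.pinv(Omega_L.T)
    return P @ y, cosupport
```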

This is a joint work with Tomer Peleg and Ron Rubinstein. This talk was given as an invited talk in the workshop "Mathematical signal and image processing" organized by Gitta Kutyniok and Otmar Scherzer.
Recent Results on the Co-Sparse Analysis Model
December 7th, 2012
NIPS, Lake Tahoe

In this talk we describe the co-sparse analysis model, with emphasis on pursuit algorithms and dictionary learning for it. We present two of our recent activities on this subject: (i) A theoretical study of the Analysis-Thresholding algorithm, exposing measures of goodness for the dictionary that govern the pursuit performance; and (ii) The development of an analysis K-SVD algorithm that trains a dictionary from signal examples and its use for image denoising.

This is a joint work with Tomer Peleg and Ron Rubinstein. This talk was given as an invited talk in the workshop "Analysis Operator Learning vs. Dictionary Learning: Fraternal Twins in Sparse Modeling".
The Analysis Sparse Model - Definition, Pursuit, Dictionary Learning, and Beyond (Short-Version)
May, 2012
SIAM Imaging Science Conference, Philadelphia

The synthesis-based sparse representation model for signals has drawn considerable interest in the past decade. Such a model assumes that the signal of interest can be decomposed as a linear combination of a few atoms from a given dictionary. In this talk we concentrate on an alternative, analysis-based model, where an analysis operator, hereafter referred to as the “Analysis Dictionary”, multiplies the signal, leading to a sparse outcome. While the two alternative models seem to be very close and similar, they are in fact very different. In this talk we define clearly the analysis model and describe how to generate signals from it. We discuss the pursuit denoising problem that seeks the zeros of the signal with respect to the analysis dictionary given noisy measurements. Finally, we explore ideas for learning the analysis dictionary from a set of signal examples. We demonstrate this model’s effectiveness in several experiments, treating synthetic data and real images, showing a successful and meaningful recovery of the analysis dictionary.
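
A minimal sketch of signal generation under the analysis model, using a random analysis dictionary for illustration; the construction used in the talk may differ in its details.

```python
import numpy as np

def sample_cosparse_signal(Omega, cosparsity, rng=None):
    """Choose a random co-support of `cosparsity` rows of Omega and project a
    random vector onto the null space of those rows, so that Omega @ x has
    (at least) `cosparsity` zeros."""
    rng = np.random.default_rng() if rng is None else rng
    p, d = Omega.shape
    cosupport = rng.choice(p, size=cosparsity, replace=False)
    Omega_L = Omega[cosupport]
    P = np.eye(d) - Omega_L.T @ np.linalg.pinv(Omega_L.T)  # projector onto null space
    return P @ rng.standard_normal(d), cosupport

# Example: a 2x-redundant random analysis dictionary of size 100x50.
rng = np.random.default_rng(1)
Omega = rng.standard_normal((100, 50))
x, L = sample_cosparse_signal(Omega, cosparsity=30, rng=rng)
print(np.max(np.abs(Omega[L] @ x)))  # ~0: the co-support rows annihilate x
```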

This is a short-version of the talk below. It was given as an invited talk in the SIAM Imaging Science Conference, in the Session "Sparse and Redundant Representations for Image Reconstruction and Geometry Extraction", organized by Weihong Guo (Case Western Reserve University, USA), Philadelphia May 2012. This talk was also given in a Machine-Learning Workshop in Janelia Farm (May, 2012). Joint work with Ron Rubinstein (former PhD student), Tomer Faktor (PhD student), Remi Gribonval and Sangnam Nam (INRIA, Rennes), and Mike Davies (UEdin).
The Analysis Sparse Model - Definition, Pursuit, Dictionary Learning, and Beyond
January 16th, 2012
Mathematics and Image Analysis 2012 Workshop (MIA'12), Paris

The synthesis-based sparse representation model for signals has drawn considerable interest in the past decade. Such a model assumes that the signal of interest can be decomposed as a linear combination of a few atoms from a given dictionary. In this talk we concentrate on an alternative, analysis-based model, where an analysis operator, hereafter referred to as the “Analysis Dictionary”, multiplies the signal, leading to a sparse outcome. While the two alternative models seem to be very close and similar, they are in fact very different. In this talk we define clearly the analysis model and describe how to generate signals from it. We discuss the pursuit denoising problem that seeks the zeros of the signal with respect to the analysis dictionary given noisy measurements. Finally, we explore ideas for learning the analysis dictionary from a set of signal examples. We demonstrate this model’s effectiveness in several experiments, treating synthetic data and real images, showing a successful and meaningful recovery of the analysis dictionary.

Invited talk. This talk was also given as a keynote talk in LVA-ICA, March 13th, 2012, Tel-Aviv, and in the Oberwolfach (Germany) workshop on harmonic analysis on June 13th, 2012. Joint work with Ron Rubinstein (former PhD student), Tomer Peleg (PhD student), Remi Gribonval and Sangnam Nam (INRIA, Rennes), and Mike Davies (UEdin).
From SD to HD: Improving Video Sequences Through Super-Resolution
November 10th, 2011
Haifa, Special Workshop for "final" employees. This presentation covers material spanning two decades of my research activity.

Multi-channel TV broadcast, Internet video and YouTube, home DVD movies, video conference calls, cellular video calls and more – there is no doubt that videos are abundant and in everyday use. In many cases, the quality of the available video is poor, something commonly referred to as “low-resolution”. As an example, high-definition (HD) TVs are commonly sold these days to customers who hope to enjoy a better viewing experience. Nevertheless, most TV broadcast today is still done in standard-definition (SD), leading to poor image quality on these screens. The field of Super-Resolution deals with ways to improve video content to increase optical resolution. The core idea: the visual content of several images can be fused, and this can lead to an outcome with better resolution. For years it has been assumed that such fusion requires knowing the exact motion the objects undergo within the scene. Since this motion may be quite complex in general, this stood as a major obstacle for industrial applications. Three years ago a breakthrough was made in this field, allowing the need for exact motion estimation to be bypassed. In this lecture we shall survey the work in this field from its early days (25 years ago) until very recently, and show the evolution of ideas and results obtained. No prior knowledge in image processing is required.

Jointly done with various collaborators in different periods of time: Arie Feuer (1994-1997), Yacov Hel-Or (1999), Peyman Milanfar and Sina Farsiu (2001-2005), and Matan Protter (2006-2011). These slides were adapted from Matan's talk – see his site.
A Course on Sparse and Redundant Representation Modeling of Images - Iceland Summer-School
August 15-20, 2010
These lectures were given as part of a graduate summer school on Sparsity in Image and Signal Analysis, Holar, Iceland.

This course (4 lectures and one tutorial) presents the core ideas and achievements made in the field of sparse and redundant representation modeling, with emphasis on the impact of this field on image processing applications. The five lectures (given as PPTX and PDF) are organized as follows:

Lecture 1: The core sparse approximation problem and pursuit algorithms that aim to approximate its solution (a minimal pursuit sketch appears after this list).

Lecture 2: The theory on the uniqueness of the sparsest solution of a linear system, the notion of stability for the noisy case, guarantees for the performance of pursuit algorithms using the mutual coherence and the RIP.

Lecture 3: Signal (and image) models and their importance, the Sparseland model and its use, analysis versus synthesis modeling, a Bayesian estimation point of view, dictionary learning with the MOD and the K-SVD, global and local image denoising, local image inpainting.

Lecture 4: Sparse representations in image processing – image deblurring, global image separation and image inpainting; using dictionary learning for image and video denoising and inpainting; image scale-up using a pair of learned dictionaries; and facial image compression with the K-SVD.
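
As a companion to Lecture 1, here is a minimal sketch of Orthogonal Matching Pursuit, one canonical pursuit algorithm covered there; variable names and the fixed-cardinality stopping rule are illustrative, not taken from the lecture slides.

```python
import numpy as np

def omp(D, y, k):
    """Orthogonal Matching Pursuit: greedily pick the atom most correlated
    with the current residual, then re-fit all chosen atoms by least squares."""
    residual = y.copy()
    support = []
    x = np.zeros(D.shape[1])
    for _ in range(k):
        correlations = np.abs(D.T @ residual)
        correlations[support] = 0.0        # do not pick the same atom twice
        support.append(int(np.argmax(correlations)))
        coeffs, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coeffs
    x[support] = coeffs
    return x
```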

Single Image Super-Resolution Using Sparse Representation
April 14th, 2010
SIAM Imaging Science 2010 Conference, Chicago. Mini-Symp. on Recent Advances in Sparse and Non-local Image Regularization (organized by Gabriel Peyre, Peyman Milanfar, and Michael Elad).

Scaling up a single image while preserving its sharpness and visual quality is a difficult and highly ill-posed inverse problem. A series of algorithms have been proposed over the years for its solution, with varying degrees of success. In CVPR 2008, Yang, Wright, Huang and Ma proposed a solution to this problem based on sparse representation modeling and dictionary learning. In this talk I present a variant of their method with several important differences. In particular, the proposed algorithm does not need a separate training phase, as the dictionaries are learned directly from the image to be scaled up. Furthermore, the high-resolution dictionary is learned differently, by forcing its alignment with the low-resolution one. We show the benefit these modifications bring in terms of the simplicity of the overall algorithm and its output quality.
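
A minimal sketch of the alignment idea mentioned above, under the assumption that the sparse codes Q found over the low-resolution dictionary are reused verbatim for the corresponding high-resolution patches P_h; the variable names are hypothetical and the surrounding training pipeline is omitted.

```python
import numpy as np

def align_highres_dictionary(P_h, Q):
    """Given high-resolution training patches P_h (one column per patch) and
    the sparse codes Q obtained for their low-resolution counterparts, choose
    the high-resolution dictionary by least squares so the *same* codes
    reproduce the high-resolution patches:
        A_h = argmin_A ||P_h - A Q||_F^2 = P_h Q^+ ."""
    return P_h @ np.linalg.pinv(Q)
```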

This is a joint work with Roman Zeyde and Matan Protter (CS - Technion).
Sparse and Redundant Representation Modeling for Image Processing
December 11th, 2008
Computational Algebraic Statistics, Theory and Applications (CASTA), Kyoto, Japan.

In this talk we describe applications such as image denoising and beyond using sparse and redundant representations. Our focus is on ways to perform these tasks with dictionaries trained using the K-SVD algorithm. As trained dictionaries are limited to handling small image patches, we deploy them within a Bayesian reconstruction procedure by forming an image prior that forces every patch in the resulting image to have a sparse representation.
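
A minimal sketch of how such a patch-based prior is typically deployed: sparse-code every overlapping patch over the trained dictionary and average the overlapping estimates back into the image. This is illustrative code, not the exact K-SVD denoising procedure; `sparse_code` stands for any pursuit routine (e.g. OMP), and the usual averaging with the noisy image itself is simplified away.

```python
import numpy as np

def denoise_by_patch_averaging(noisy, D, sparse_code, patch=8):
    """Approximate every overlapping patch by a sparse combination of
    dictionary atoms and average the overlapping reconstructions."""
    H, W = noisy.shape
    acc = np.zeros_like(noisy, dtype=float)
    weight = np.zeros_like(noisy, dtype=float)
    for i in range(H - patch + 1):
        for j in range(W - patch + 1):
            y = noisy[i:i+patch, j:j+patch].ravel()
            alpha = sparse_code(D, y)                       # any pursuit routine
            acc[i:i+patch, j:j+patch] += (D @ alpha).reshape(patch, patch)
            weight[i:i+patch, j:j+patch] += 1.0
    return acc / weight
```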

Invited talk. Joint work with Michal Aharon (CS - Technion), Guillermo Sapiro (UMN), Julien Mairal (INRIA - France), and Matan Protter (CS - Technion).
Super-Resolution-Reconstruction of Image Sequences Without Explicit Motion Estimation
July 8th, 2008
SIAM Imaging Science 2008, San-Diego. Special Session on Locally Adaptive Patch-based Image and Video Restoration - Part II.

Super-resolution reconstruction proposes a fusion of several low-quality images into one higher-quality result with better optical resolution. Classic super-resolution techniques strongly rely on the availability of accurate motion estimation for this fusion task. When the motion is estimated inaccurately, as often happens for non-global motion fields, annoying artifacts appear in the super-resolved outcome. Encouraged by recent developments on the video denoising problem, where state-of-the-art algorithms are formed with no explicit motion estimation, we seek a super-resolution algorithm of similar nature that will allow processing sequences with general motion patterns. In this talk we base our solution on the Non-Local-Means (NLM) algorithm. We show how this denoising method is generalized to become a relatively simple super-resolution algorithm with no explicit motion estimation. Results on several test movies show that the proposed method is very successful in providing super-resolution on general sequences.
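
For reference, a minimal sketch of the plain single-image NLM weighting that the proposed super-resolution generalizes across frames and scales; this handles one interior pixel only, and the parameter values are arbitrary, so it is an illustration of the weighting principle rather than the talk's algorithm.

```python
import numpy as np

def nlm_pixel(image, i, j, patch=7, search=21, h=10.0):
    """Estimate pixel (i, j) as a weighted average of pixels whose surrounding
    patches look similar -- no motion estimation involved."""
    r, s = patch // 2, search // 2
    H, W = image.shape
    ref = image[i-r:i+r+1, j-r:j+r+1]
    num, den = 0.0, 0.0
    for m in range(max(r, i - s), min(H - r, i + s + 1)):
        for n in range(max(r, j - s), min(W - r, j + s + 1)):
            cand = image[m-r:m+r+1, n-r:n+r+1]
            w = np.exp(-np.sum((ref - cand) ** 2) / (h ** 2))
            num += w * image[m, n]
            den += w
    return num / den
```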

Joint work with Matan Protter (CS - Technion), Hiro Takeda and Peyman Milanfar (UCSC).
Image Denoising and Beyond via Learned Dictionaries and Sparse Representations
June 26th, 2008
Tel-Aviv University, Approximation Seminar, the Mathematics Department.

In this survey talk we focus on the use of sparse and redundant representations and learned dictionaries for image denoising and other related problems. We discuss the K-SVD algorithm for learning a dictionary that describes the image content effectively. We then show how to harness this algorithm for image denoising, by working on small patches and forcing sparsity over the trained dictionary. The above is extended to color image denoising and inpainting, video denoising, and facial image compression, leading in all these cases to state-of-the-art results. We conclude with very recent results on the use of several sparse representations for getting better denoising performance. An algorithm to generate such a set of representations is developed, and our analysis shows that by this method we approximate the minimum mean-squared-error (MMSE) estimator, thus getting better results.
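
A minimal sketch of the averaging idea mentioned at the end: run a randomized variant of OMP several times and average the reconstructions. The softmax-style atom selection below is illustrative; the actual randomization rule in the cited work may differ.

```python
import numpy as np

def randomized_omp_average(D, y, k, runs=10, temperature=1.0, rng=None):
    """Average the reconstructions of several randomly drawn sparse
    representations, approximating an MMSE-like estimate of the clean signal."""
    rng = np.random.default_rng() if rng is None else rng
    estimates = []
    for _ in range(runs):
        residual, support = y.copy(), []
        for _ in range(k):
            c = np.abs(D.T @ residual)
            p = np.exp((c - c.max()) / temperature)   # atoms more correlated with
            p[support] = 0.0                          # the residual are more likely;
            p /= p.sum()                              # never re-pick a chosen atom
            support.append(int(rng.choice(D.shape[1], p=p)))
            coeffs, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
            residual = y - D[:, support] @ coeffs
        estimates.append(D[:, support] @ coeffs)
    return np.mean(estimates, axis=0)
```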

This talk surveys a wide group of papers, including a statement of recent results obtained with Irad Yavneh.