# Talks

## Sparse Representation Practice

Over the past decade there has been great interest in a synthesis-based model for signals, based on sparse and redundant representations. Such a model assumes that the signal of interest can be decomposed as a linear combination of a few columns from a given matrix (the dictionary). An alternative, analysis-based model can be envisioned, where an analysis operator multiplies the signal, leading to a sparse outcome. In this work we propose a simple but effective analysis operator learning algorithm, where analysis “atoms” are learned sequentially by identifying directions that are orthogonal to a subset of the training data. We demonstrate the effectiveness of the algorithm in several experiments, treating synthetic data and real images, showing a successful and meaningful recovery of the analysis operator.
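The core step mentioned above, finding a direction orthogonal to a subset of the training data, can be sketched with a singular value decomposition. This is only an illustration of that single linear-algebra step and not the learning algorithm from the talk: the function name `atom_orthogonal_to`, the toy data, and the assumption that the subset has already been chosen are all hypothetical.

```python
import numpy as np

def atom_orthogonal_to(subset):
    """Return the unit vector w minimizing ||subset.T @ w||_2.

    subset: array of shape (d, m), one training signal per column.
    """
    # The left singular vectors are ordered by decreasing singular value,
    # so the last one is the direction least present in the data.
    u, _, _ = np.linalg.svd(subset, full_matrices=True)
    return u[:, -1]

rng = np.random.default_rng(0)
d = 6

# Hypothetical toy data: 20 signals that all lie in a (d-1)-dimensional
# subspace, i.e. they are exactly orthogonal to one hidden direction.
true_atom = rng.standard_normal(d)
true_atom /= np.linalg.norm(true_atom)
signals = rng.standard_normal((d, 20))
signals -= np.outer(true_atom, true_atom @ signals)

w = atom_orthogonal_to(signals)
residual = np.linalg.norm(signals.T @ w)   # how far w is from orthogonality
alignment = abs(w @ true_atom)             # agreement with the hidden direction, up to sign
```

On noisy data the smallest singular value is no longer zero, so `residual` can serve as a rough measure of how well the candidate atom fits the chosen subset.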

How do we choose a network architecture in deep-learning solutions? By copying existing networks or guessing new ones, and sometimes by applying various small modifications to them via trial and error. This inelegant, brute-force strategy has proven useful for a wide variety of imaging tasks. However, it comes with a painful cost – our networks tend to be quite heavy and cumbersome. Could we do better? In this talk we would like to propose a different point of view on this important question, by advocating the following two rules: (i) Rather than “guessing” architectures, we should rely on classic signal and image processing concepts and algorithms, and turn these into networks to be learned in a supervised manner. More specifically, (ii) Sparse representation modeling is key in many (if not all) of the successful architectures that we are using. I will demonstrate these claims by presenting three recent image denoising networks that are light-weight and yet quite effective, as they follow the above guidelines.

Sparse approximation is a well-established theory, with a profound impact on the fields of signal and image processing. In this talk we start by presenting this model and its features, and then turn to describe two special cases of it – the convolutional sparse coding (CSC) and its multi-layered version (ML-CSC). Amazingly, as we will carefully show, ML-CSC provides a solid theoretical foundation to … deep-learning. Alongside this main message of bringing a theoretical backbone to deep-learning, another central message will accompany us throughout the talk: generative models for describing data sources enable a systematic way to design algorithms, while also providing a complete mechanism for a theoretical analysis of these algorithms’ performance. This talk is meant for newcomers to this field – no prior knowledge of sparse approximation is assumed.

Sparse approximation is a well-established theory, with a profound impact on the fields of signal and image processing. In this talk we start by presenting this model, and then turn to describe two special cases of it – the convolutional sparse coding (CSC) and its multi-layered version (ML-CSC). Amazingly, as we will carefully show, ML-CSC provides a solid theoretical foundation to … deep-learning. This talk is meant for newcomers to these fields – no prior knowledge of sparse approximation is assumed.

Sparse approximation is a well-established theory, with a profound impact on the fields of signal and image processing. In this talk, we describe a special case of this model: the multi-layered convolutional sparse coding (ML-CSC) construction. As we will carefully show, ML-CSC provides a solid theoretical foundation to the field of deep learning, explaining the architectures in use, their performance limits, and prospects for future alternatives.

Atomic decomposition, the periodic table, assembling a molecule … all this sounds like the opening of a chemistry lecture. But no! These topics will come up in a lecture about image processing. Image processing is a central field in our lives: from the television in our homes and the digital camera in our pocket (and recently as part of the cellular phone), through watching DVD movies, quality control on production lines, and tracking and security systems, to ultrasound, tomography, and magnetic resonance imaging in hospitals. In all of these, and in many other products, image processing is an indispensable technology. The field is vibrant and thriving in both industry and academia, with tens of thousands of engineers and scientists around the world working in it day in and day out. So what is image processing? This is the topic we will discuss in this lecture. Image processing refers to the handling of images by computer. We will address questions such as how an image makes its way into the computer, how it is stored there, what can be done with it once it is there, and more. One of the central goals of this lecture is to present the frontier of knowledge in the field, and in particular the research activity at the Technion in this arena. The general discussion of image processing above will serve as the foundation on which we build to present our recent work, in which we deal with models for images that use “chemical” ideas, with which we handle data in general and images in particular. We will describe how an image can be decomposed into atoms, how to build a periodic table of elements for describing images, and how we harness all of these to solve practical problems in the field, such as enhancing images and videos and repairing various degradations in them, filling in missing parts of images, compression, and more.

Sparse approximation is a well-established theory, with a profound impact on the fields of signal and image processing. In this talk we start by presenting this model, and then turn to describe two special cases of it – the convolutional sparse coding (CSC) and its multi-layered version (ML-CSC). Amazingly, as we will carefully show, ML-CSC provides a solid theoretical foundation to … deep-learning. This talk is meant for newcomers to these fields – no prior knowledge of sparse approximation is assumed.

Sparse approximation is a well-established theory, with a profound impact on the fields of signal and image processing. In this talk we describe two special cases of this model – the convolutional sparse coding (CSC) and its multi-layered version (ML-CSC). We show that the projection of signals (a.k.a. pursuit) onto the ML-CSC model leads to various deep convolutional neural network architectures. This connection brings a fresh view to CNN, as we are able to accompany it with theoretical claims such as uniqueness of the representations throughout the network, and their stable estimation, all guaranteed under simple local sparsity conditions. The ‘take-home message’ from this talk is this: the ML-CSC model can serve as the theoretical foundation to deep-learning.

Sparse approximation is a well-established theory, with a profound impact on the fields of signal and image processing. In this talk we start by presenting this model and its features, and then turn to describe two special cases of it: the convolutional sparse coding (CSC) and its multi-layered version (ML-CSC). Amazingly, as we will carefully show, ML-CSC provides a solid theoretical foundation to the field of deep-learning. Alongside this main message of bringing a theoretical backbone to deep-learning, another central message will accompany us throughout the talk: generative models for describing data sources enable a systematic way to design algorithms, while also providing a complete mechanism for a theoretical analysis of these algorithms’ performance. This talk is meant for newcomers to this field – no prior knowledge of sparse approximation is assumed.

Within the wide field of sparse approximation, convolutional sparse coding (CSC) has gained increasing attention in recent years. This model assumes a structured dictionary built as a union of banded circulant matrices. Most of the attention has been devoted to the practical side of CSC, proposing efficient algorithms for the pursuit problem, and identifying applications that benefit from this model. Interestingly, a systematic theoretical understanding of CSC seems to have been left aside, with the assumption that the existing classical results are sufficient. In this talk we start by presenting a novel analysis of the CSC model and its associated pursuit. Our study is based on the observation that while being global, this model can be characterized and analyzed locally.

We show that uniqueness of the representation, its stability with respect to noise, and successful greedy or convex recovery are all guaranteed assuming that the underlying representation is locally sparse. These new results are much stronger and more informative than those obtained by deploying the classical sparse theory. Armed with these new insights, we proceed by proposing a multi-layer extension of this model, ML-CSC, in which signals are assumed to emerge from a cascade of CSC layers. This, in turn, is shown to be tightly connected to Convolutional Neural Networks (CNN), so much so that the forward-pass of the CNN is in fact the Thresholding pursuit serving the ML-CSC model. This connection brings a fresh view to CNN, as we are able to attribute to this architecture theoretical claims such as uniqueness of the representations throughout the network, and their stable estimation, all guaranteed under simple local sparsity conditions. Lastly, identifying the weaknesses in the above scheme, we propose an alternative to the forward-pass algorithm, which is both tightly connected to deconvolutional and recurrent neural networks, and has better theoretical guarantees.
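The claimed equivalence between the forward pass and a layered thresholding pursuit can be checked numerically on a toy example. The sketch below makes several simplifying assumptions that are not in the abstract: the dictionaries are small dense random matrices rather than convolutional (banded circulant) ones, the thresholding is the one-sided (non-negative) soft variant, and all names (`layered_thresholding`, `forward_pass`, `betas`) are illustrative.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def soft_nonneg_threshold(z, beta):
    """One-sided soft thresholding: shrink by beta, zero out the rest."""
    return np.where(z > beta, z - beta, 0.0)

def layered_thresholding(x, dictionaries, thresholds):
    """Estimate the representations of a multi-layer sparse model, layer by layer."""
    gamma = x
    estimates = []
    for D, beta in zip(dictionaries, thresholds):
        gamma = soft_nonneg_threshold(D.T @ gamma, beta)
        estimates.append(gamma)
    return estimates

def forward_pass(x, weights, biases):
    """Plain feed-forward network: ReLU(W @ a + b) at every layer."""
    a = x
    activations = []
    for W, b in zip(weights, biases):
        a = relu(W @ a + b)
        activations.append(a)
    return activations

rng = np.random.default_rng(0)
D1 = rng.standard_normal((16, 32))   # first-layer dictionary (dense stand-in)
D2 = rng.standard_normal((32, 48))   # second-layer dictionary
betas = [0.5, 1.0]                   # thresholds, chosen arbitrarily
x = rng.standard_normal(16)

pursuit = layered_thresholding(x, [D1, D2], betas)
# Same computation seen as a network: weights are the transposed
# dictionaries and each bias is minus the corresponding threshold.
network = forward_pass(x, [D1.T, D2.T], [-betas[0], -betas[1]])
```

The two lists agree entry by entry because `relu(z - beta)` and one-sided soft thresholding at level `beta` are the same function.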

In this survey talk I will walk you through a decade of fascinating research activity on “sparse and redundant representations”. We will start with a classic image processing task of noise removal and use it as a platform for the introduction of data models in general, and sparsity and redundancy as specific forces in such models. The emerging model will be shown to lead to a series of key theoretical and numerical questions, which we will handle next. A key problem with the use of sparse and redundant representation modeling is the need for a sparsifying dictionary – we will discuss ways to obtain such a dictionary by learning from examples, and introduce the K-SVD algorithm. Then we will show how all these merge into a coherent theory that can be deployed successfully to various image processing applications.

In this talk we present a generic recursive algorithm for improving image denoising methods. Given the initial denoised image, we suggest repeating the following procedure: (i) Strengthen the signal by adding the previous denoised image to the degraded input image, (ii) Operate the denoising method on the strengthened image, and (iii) Subtract the previous denoised image from the restored signal-strengthened outcome. The convergence of this process is studied for the K-SVD image denoising and related algorithms. Furthermore, still in the context of K-SVD image denoising, we introduce an interesting interpretation of this SOS (Strengthen-Operate-Subtract) algorithm as a technique for closing the gap between the local patch-modeling and the global restoration task, thereby leading to improved performance. We demonstrate the SOS boosting algorithm for several leading denoising methods (K-SVD, NLM, BM3D, and EPLL), showing a tendency to further improve denoising performance.
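Steps (i) to (iii) translate directly into a short loop around a black-box denoiser. The sketch below is only meant to show the structure of the recursion: the denoiser is a hypothetical stand-in (linear Tikhonov smoothing of a 1D signal), not K-SVD or any of the methods named above, and the iteration count is arbitrary.

```python
import numpy as np

def sos_boost(noisy, denoise, n_iters=200):
    """Strengthen-Operate-Subtract recursion around a black-box denoiser."""
    estimate = denoise(noisy)                 # initial denoised signal
    for _ in range(n_iters):
        strengthened = noisy + estimate       # (i) strengthen the signal
        restored = denoise(strengthened)      # (ii) operate the denoiser
        estimate = restored - estimate        # (iii) subtract previous estimate
    return estimate

# Hypothetical stand-in denoiser: W = (I + 2 L)^-1 with L a 1D Laplacian.
n = 64
first_diff = np.diff(np.eye(n), axis=0)       # (n-1) x n difference operator
W = np.linalg.inv(np.eye(n) + 2.0 * first_diff.T @ first_diff)

rng = np.random.default_rng(0)
clean = np.sin(np.linspace(0.0, 2.0 * np.pi, n))
noisy = clean + 0.3 * rng.standard_normal(n)

boosted = sos_boost(noisy, lambda v: W @ v)

# For a linear denoiser the recursion has a closed-form fixed point:
# x = W (y + x) - x  =>  (2I - W) x = W y.
fixed_point = np.linalg.solve(2.0 * np.eye(n) - W, W @ noisy)
```

For this particular linear denoiser the iterates settle at `fixed_point`; whether the recursion converges, and whether it improves the result, for a nonlinear denoiser is exactly the kind of question the talk addresses and is not answered by this toy.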

Images, video, audio, text documents, financial data, medical information, traffic info – all these and many others are data sources that can be effectively processed. Why? Is it obvious? In this talk we will start by discussing “modeling” of data as a way to enable their actual processing, putting emphasis on sparsity-based models. We will turn our attention to graph-structured data and propose a tailored sparsifying transform for its dimensionality reduction and subsequent processing. We shall conclude by showing how this new transform becomes relevant and powerful in revisiting … classical image processing tasks.

What if we take all the overlapping patches from a given image and organize them to create the shortest path by using their mutual distances? This suggests a reordering of the image pixels in a way that creates a maximal 1D regularity. Could we repeat this process in several scales? What could we do with such a construction? In this talk we consider a wider perspective of the above line of questions: We introduce a wavelet transform that is meant for data organized as a connected graph or as a cloud of high-dimensional points. The proposed transform constructs a tree that applies 1D wavelet decomposition filters, coupled with a pre-reordering of the input, so as to best sparsify the given data. We adapt this transform to image processing tasks by considering the image as a graph, where every patch is a node, and edges are weighted by the Euclidean distances between corresponding patches. We show three ways to use the above ideas in practice – adopt only the patch-reordering, use the obtained wavelet transform as a sparsifying process, and a third approach where this transform is used as a regularizer. State-of-the-art image denoising, deblurring, and inpainting results are obtained with the proposed schemes.

The synthesis-based sparse representation model for signals has drawn considerable interest in the past decade. Such a model assumes that the signal of interest can be decomposed as a linear combination of a few atoms from a given dictionary. In this talk we concentrate on an alternative, analysis-based model, where an analysis operator – hereafter referred to as the “Analysis Dictionary” – multiplies the signal, leading to a sparse outcome. While the two alternative models seem to be very close and similar, they are in fact very different. In this talk we define clearly the analysis model and describe how to generate signals from it. We discuss the pursuit denoising problem that seeks the zeros of the signal with respect to the analysis dictionary given noisy measurements. Finally, we explore ideas for learning the analysis dictionary from a set of signal examples. We demonstrate this model’s effectiveness in several experiments, treating synthetic data and real images, showing a successful and meaningful recovery of the analysis dictionary.

The synthesis-based sparse representation model for signals has drawn considerable interest in the past decade. Such a model assumes that the signal of interest can be decomposed as a linear combination of a *few* atoms from a given dictionary. In this talk we concentrate on an alternative, analysis-based model, where an analysis operator – hereafter referred to as the “Analysis Dictionary” – multiplies the signal, leading to a sparse outcome. Our goal is to learn the analysis dictionary from a set of signal examples, and the approach taken is parallel and similar to the one adopted by the K-SVD algorithm that serves the corresponding problem in the synthesis model. We present the development of the algorithm steps, which include a tailored pursuit algorithm termed the “Backward Greedy” algorithm and a penalty function for the dictionary update stage. We demonstrate its effectiveness in several experiments, treating synthetic data and real images, showing a successful and meaningful recovery of the analysis dictionary.

In the commonly used sparse representation modeling, the atoms are assumed to be independent of each other when forming the signal. In this talk we shall introduce a statistical model called the Boltzmann Machine (BM) that enables such dependencies to be taken into account. Adopting a Bayesian point of view, we first treat the pursuit problem – given a signal, and assuming that the model parameters and the dictionary are known, find its sparse representation. We derive the exact MAP estimation, and show that just like in the independent case, this leads to an exponential search problem. We derive two algorithms for its evaluation: a greedy approximation approach for the general case, and an exact estimation that corresponds to a unitary dictionary and banded interaction matrix. We also consider the estimation of the model parameters, learning these parameters directly from training data. We show that given the signals’ representations, this problem can be posed as a convex optimization task by using the Maximum Pseudo-Likelihood (MPL).

This course (4 lectures and one tutorial) brings the core ideas and achievements made in the field of sparse and redundant representation modeling, with emphasis on the impact of this field on image processing applications. The lectures (given as PPTX and PDF) are organized as follows:

Lecture 1: The core sparse approximation problem and pursuit algorithms that aim to approximate its solution.

Lecture 2: The theory on the uniqueness of the sparsest solution of a linear system, the notion of stability for the noisy case, guarantees for the performance of pursuit algorithms using the mutual coherence and the RIP.

Lecture 3: Signal (and image) models and their importance, the Sparseland model and its use, analysis versus synthesis modeling, a Bayesian estimation point of view, dictionary learning with the MOD and the K-SVD, global and local image denoising, local image inpainting.

Lecture 4: Sparse representations in image processing – image deblurring, global image separation and image inpainting, using dictionary learning for image and video denoising and inpainting, image scale-up using a pair of learned dictionaries, facial image compression with the K-SVD.
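As a small illustration of the coherence-based guarantees listed under Lecture 2, the sketch below computes the mutual coherence of a dictionary and the classical uniqueness threshold: a solution of the linear system with fewer than (1 + 1/μ)/2 non-zeros is necessarily the sparsest one. The two-ortho dictionary used here (identity plus a normalized 4×4 Hadamard matrix) and the function names are illustrative choices, not taken from the course material.

```python
import numpy as np

def mutual_coherence(D):
    """Largest absolute normalized inner product between distinct atoms."""
    Dn = D / np.linalg.norm(D, axis=0)   # normalize every column (atom)
    gram = np.abs(Dn.T @ Dn)
    np.fill_diagonal(gram, 0.0)          # ignore each atom's product with itself
    return gram.max()

def uniqueness_bound(mu):
    """Solutions with fewer non-zeros than this value are the sparsest possible."""
    return 0.5 * (1.0 + 1.0 / mu)

# Illustrative two-ortho dictionary: identity joined with a normalized Hadamard basis.
hadamard = 0.5 * np.array([
    [1.0,  1.0,  1.0,  1.0],
    [1.0, -1.0,  1.0, -1.0],
    [1.0,  1.0, -1.0, -1.0],
    [1.0, -1.0, -1.0,  1.0],
])
D = np.hstack([np.eye(4), hadamard])

mu = mutual_coherence(D)
bound = uniqueness_bound(mu)
```

The same `mu` also appears in the standard success conditions for greedy and convex pursuit, which is why it is cheap to compute yet often pessimistic.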

This course (5 lectures) brings the core ideas and achievements made in the field of sparse and redundant representation modeling, with emphasis on the impact of this field on image processing applications. The five lectures (given as PPTX and PDF) are organized as follows:

Lecture 1: The core sparse approximation problem and pursuit algorithms that aim to approximate its solution.

Lecture 2: The theory on the uniqueness of the sparsest solution of a linear system, the notion of stability for the noisy case, guarantees for the performance of pursuit algorithms using the mutual coherence and the RIP.

Lecture 3: Signal (and image) models and their importance, the Sparseland model and its use, analysis versus synthesis modeling, a Bayesian estimation point of view.

Lecture 4: First steps in image processing with the Sparseland model – image deblurring, image denoising, image separation, and image inpainting. Global versus local processing of images. Dictionary learning with the MOD and the K-SVD.

Lecture 5: Advanced image processing – using dictionary learning for image and video denoising and inpainting, image scale-up using a pair of learned dictionaries, facial image compression with the K-SVD.

This survey talk focuses on the use of sparse and redundant representations and learned dictionaries for image denoising and other related problems. We discuss the K-SVD algorithm for learning a dictionary that describes the image content efficiently. We then show how to harness this algorithm for image denoising, by working on small patches and forcing sparsity over the trained dictionary. The above is extended to color image denoising and inpainting, video denoising, and facial image compression, leading in all these cases to state-of-the-art results. We conclude with more recent results on the use of several sparse representations for getting better denoising performance. An algorithm to generate such a set of representations is developed, and our analysis shows that by this we approximate the minimum-mean-squared-error (MMSE) estimator, thus getting better results.

Scaling up a single image while preserving its sharpness and visual quality is a difficult and highly ill-posed inverse problem. A series of algorithms have been proposed over the years for its solution, with varying degrees of success. In CVPR 2008, Yang, Wright, Huang and Ma proposed a solution to this problem based on sparse representation modeling and dictionary learning. In this talk I present a variant of their method with several important differences. In particular, the proposed algorithm does not need a separate training phase, as the dictionaries are learned directly from the image to be scaled-up. Furthermore, the high-resolution dictionary is learned differently, by forcing its alignment with the low-resolution one. We show the benefit these modifications bring in terms of simplicity of the overall algorithm, and its output quality.

Cleaning of noise from signals is a classical and long-studied problem in signal processing. Algorithms for this task necessarily rely on a-priori knowledge about the signal characteristics, along with information about the noise properties. For signals that admit sparse representations over a known dictionary, a commonly used denoising technique is to seek the sparsest representation that synthesizes a signal close enough to the corrupted one. As this problem is too complex in general, approximation methods, such as greedy pursuit algorithms, are often employed. In this line of reasoning, we are led to believe that detection of the sparsest representation is key in the success of the denoising goal. Does this mean that other competitive and slightly inferior sparse representations are meaningless? Suppose we are served with a group of competing sparse representations, each claiming to explain the signal differently. Can those be fused somehow to lead to a better result? Surprisingly, the answer to this question is positive; merging these representations can form a more accurate, yet dense, estimate of the original signal even when the latter is known to be sparse. In this talk we demonstrate this behavior, propose a practical way to generate such a collection of representations by randomizing the Orthogonal Matching Pursuit (OMP) algorithm, and produce a clear analytical justification for the superiority of the associated Randomized OMP (RandOMP) algorithm. We show that while the Maximum a-posteriori Probability (MAP) estimator aims to find and use the sparsest representation, the Minimum Mean-Squared-Error (MMSE) estimator leads to a fusion of representations to form its result. Thus, working with an appropriate mixture of candidate representations, we are surpassing the MAP and tending towards the MMSE estimate, and thereby getting a far more accurate estimation, especially at medium and low SNR.
Another topic covered in this talk concerns the case of a unitary dictionary. In such a case it is well-known that the MAP estimator has a closed-form and exact solution, and OMP is accurately computing it. Can a similar result be derived for MMSE? We show that this is indeed possible, obtaining a recursive formula that computes the MMSE simply and exactly.
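The idea of randomizing OMP and fusing the resulting representations can be sketched as follows. This is a structural illustration only: the sampling rule (a softmax over squared correlations with a hypothetical `temperature` parameter), the fixed cardinality `k`, and the plain averaging are assumptions made for the sketch, not the exact probabilities or weights derived in the talk.

```python
import numpy as np

def randomized_omp(D, y, k, temperature, rng):
    """OMP variant: the next atom is sampled, not picked by argmax."""
    residual = y.copy()
    support = []
    coeffs = np.zeros(0)
    for _ in range(k):
        scores = (D.T @ residual) ** 2 / temperature
        if support:
            scores[support] = -np.inf          # never reselect an atom
        probs = np.exp(scores - scores.max())  # softmax over correlations
        probs /= probs.sum()
        support.append(int(rng.choice(D.shape[1], p=probs)))
        # Least-squares fit on the current support, as in plain OMP.
        coeffs, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coeffs
    x = np.zeros(D.shape[1])
    x[support] = coeffs
    return x

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 40))
D /= np.linalg.norm(D, axis=0)                 # unit-norm atoms
x_true = np.zeros(40)
x_true[[3, 17, 29]] = [1.5, -2.0, 1.0]
y = D @ x_true + 0.2 * rng.standard_normal(20)

runs = [randomized_omp(D, y, k=3, temperature=0.1, rng=rng) for _ in range(50)]
single = runs[0]                               # one sparse candidate
averaged = np.mean(runs, axis=0)               # fused, generally dense estimate
denoised = D @ averaged                        # signal estimate from the fusion
```

Each run returns a representation with at most `k` non-zeros, while `averaged` spreads over the union of the sampled supports, which mirrors the abstract's point that the fused estimate is dense even though the signal is sparse.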

In this talk we describe applications such as image denoising and beyond using sparse and redundant representations. Our focus is on ways to perform these tasks with trained dictionaries using the K-SVD algorithm. As trained dictionaries are limited to handling small image patches, we deploy these within a Bayesian reconstruction procedure by forming an image prior that forces every patch in the resulting image to have a sparse representation.

In this survey talk we focus on the use of sparse and redundant representations and learned dictionaries for image denoising and other related problems. We discuss the K-SVD algorithm for learning a dictionary that describes the image content effectively. We then show how to harness this algorithm for image denoising, by working on small patches and forcing sparsity over the trained dictionary. The above is extended to color image denoising and inpainting, video denoising, and facial image compression, leading in all these cases to state-of-the-art results. We conclude with very recent results on the use of several sparse representations for getting better denoising performance. An algorithm to generate such a set of representations is developed, and our analysis shows that by this method we approximate the minimum-mean-squared-error (MMSE) estimator, thus getting better results.

Modeling of signals or images by a sparse and redundant representation has been shown in recent years to be very effective, often leading to state-of-the-art results in many applications. Applications leaning on this model can be cast as energy minimization problems, where the unknown is a high-dimensional and very sparse vector. Surprisingly, traditional tools in optimization, including very recently developed interior-point algorithms, tend to perform very poorly on these problems. A recently emerging alternative is a family of techniques known as “iterated-shrinkage” methods. There are various such algorithms, but common to them all is the fact that each of their iterations requires a simple forward and inverse transform (e.g. wavelet), and a scalar shrinkage look-up-table (LUT) step. In this talk we shall explain the need for such algorithms, present some of them, and show how they perform on a classic image deblurring problem.
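
To make the structure of such an iteration concrete, here is a minimal Python sketch of one common member of this family, written as a proximal-gradient loop for the penalty 0.5·‖y − Dx‖² + λ‖x‖₁. The names `ista`, `soft_threshold`, `lam`, and `n_iter` are chosen for this example, and a generic matrix `D` stands in for the transform (and blur) operators; the algorithms presented in the talk may differ in their step rule and shrinkage curve.

```python
import numpy as np

def soft_threshold(v, t):
    """Scalar shrinkage applied entry-wise: the look-up-table step."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(D, y, lam, n_iter=200):
    """Minimize 0.5*||y - D x||^2 + lam*||x||_1 by iterated shrinkage."""
    L = np.linalg.norm(D, 2) ** 2      # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        residual = y - D @ x           # synthesis ("inverse transform")
        back = D.T @ residual          # analysis ("forward transform")
        x = soft_threshold(x + back / L, lam / L)
    return x
```

Each pass costs one multiplication by `D`, one by its transpose, and an entry-wise shrinkage, which is what makes these methods attractive for very high-dimensional unknowns.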

In this very brief talk I describe the need to model images in general, and then briefly present the Sparse-Land model. The talk includes a demonstration of a sequence of applications in image processing where this model has been deployed successfully, including denoising of still, color and video images, inpainting, and compression. The moral to take home is: “The Sparse-Land model is a new and promising model that can adapt to many types of data sources. Its potential for medical imaging is an important opportunity that should be explored”.

In this talk we consider several inverse problems in image processing, using sparse and redundant representations over trained dictionaries. Using the K-SVD algorithm, we obtain a dictionary that describes the image content effectively. Two training options are considered: using the corrupted image itself, or training on a corpus of high-quality images. Since the K-SVD is limited to handling small image patches, we extend its deployment to arbitrary image sizes by defining a global image prior that forces sparsity over patches in every location in the image. We show how such Bayesian treatment leads to a simple and effective denoising algorithm for gray-level images with state-of-the-art denoising performance. We then extend these results to color images, handling their denoising, inpainting, and demosaicing. Following the above ideas, with necessary modifications to avoid color artifacts and over-fitting, we present state-of-the-art results in each of these applications. Another extension considered is video denoising: we demonstrate how the above method can be extended to work with 3D patches and to propagate the dictionary from one frame to another, yielding improved denoising performance while also substantially reducing the computational load per pixel.

In signal and image processing, we often use transforms in order to simplify operations or to enable better treatment of the given data. A recent trend in these fields is the use of overcomplete linear transforms that lead to a sparse description of signals. This new breed of methods is more difficult to use, often requiring more computations. Still, they are much more effective in applications such as signal compression and inverse problems. In fact, much of the success attributed to the wavelet transform in recent years is directly related to the above-mentioned trend. In this talk we will present a survey of this recent path of research, and its main results. We will discuss both the theoretic and the application sides to this field. No previous knowledge is assumed (… just common sense, and a little bit of linear algebra).

We address the image denoising problem, where zero-mean, white, homogeneous additive Gaussian noise should be removed from a given image. The approach taken is based on sparse and redundant representations over a trained dictionary. The proposed algorithm denoises the image, while simultaneously training a dictionary on its (corrupted) content using the K-SVD algorithm. As the dictionary training algorithm is limited to handling small image patches, we extend its deployment to arbitrary image sizes by defining a global image prior that forces sparsity over patches in every location in the image. We show how such Bayesian treatment leads to a simple and effective denoising algorithm, with state-of-the-art performance, equivalent to and sometimes surpassing recently published leading alternative denoising methods.
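
One way to picture how patch-level processing is lifted to a whole image is the aggregation step sketched below in Python. It assumes that every overlapping patch has already been cleaned (for example by sparse coding over the trained dictionary, which is omitted here), and that the global penalty is quadratic in the image, so the final image is a weighted average of the noisy input and the overlapping cleaned patches. The helper names and the weight `lam` are hypothetical, introduced only for this example.

```python
import numpy as np

def extract_patches(img, p):
    """All overlapping p-by-p patches, in row-major order of their corner."""
    H, W = img.shape
    return np.array([img[i:i + p, j:j + p]
                     for i in range(H - p + 1)
                     for j in range(W - p + 1)])

def aggregate(noisy, patches, p, lam):
    """Blend cleaned patches with the noisy image, pixel by pixel.

    Each pixel becomes (lam * noisy + sum of patch estimates covering it)
    divided by (lam + number of patches covering it).
    """
    H, W = noisy.shape
    num = lam * noisy.astype(float)
    den = np.full((H, W), float(lam))
    k = 0
    for i in range(H - p + 1):
        for j in range(W - p + 1):
            num[i:i + p, j:j + p] += patches[k]
            den[i:i + p, j:j + p] += 1.0
            k += 1
    return num / den
```

A quick sanity check of the sketch: if the "cleaned" patches are simply the patches of the input image, `aggregate` returns the input unchanged for any `lam`.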

In this talk we present a novel method for separating images into texture and piece-wise smooth parts, and show how this formulation can also lead to image inpainting. Our separation and inpainting processes are based on sparse and redundant representations of the two contents – cartoon and texture – over different dictionaries. Using the Basis Pursuit Denoising (BPDN) to formulate the overall penalty function, we achieve a separation of the image, denoising, and inpainting. In fact, with a small modification, the damn thing can make coffee.

Recently, three independent works suggested iterated shrinkage algorithms that generalize the classic work by Donoho and Johnstone. Daubechies, Defrise and De Mol developed such an algorithm for deblurring, using the concept of surrogate functions. Figueiredo and Nowak used the EM algorithm to construct such an algorithm, and later extended their work by using the bound-optimization technique. Elad developed such an algorithm based on a parallel coordinate descent (PCD) point of view. In this talk we describe these methods with an emphasis on the latter, and demonstrate how it can be of importance for the minimization of a general basis pursuit penalty function. As such, the proposed algorithms form a new pursuit technique, falling in between the basis pursuit and the matching pursuit.

Shrinkage is a well-known and appealing denoising technique. The use of shrinkage is known to be optimal for Gaussian white noise, provided that the sparsity of the signal’s representation is enforced using a unitary transform. Still, shrinkage is also practiced successfully with non-unitary, and even redundant representations. In this lecture we shed some light on this behavior. We show that simple shrinkage could be interpreted as the first iteration of an algorithm that solves the basis pursuit denoising (BPDN) problem. Thus, this work leads to a sequential shrinkage algorithm that can be considered as a novel and effective pursuit method. We demonstrate this algorithm, both synthetically, and for the image denoising problem, showing in both cases its superiority over several popular alternatives.
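
The contrast between the unitary and the redundant case can be checked numerically. The Python sketch below uses a plain gradient-plus-shrinkage step for the penalty 0.5·‖y − Dx‖² + λ‖x‖₁; this step form, the names `shrinkage_step` and `lam`, and the random test matrices are assumptions for illustration and are not necessarily the sequential algorithm developed in the lecture.

```python
import numpy as np

def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def shrinkage_step(D, y, x, lam, step):
    """One shrinkage iteration for 0.5*||y - D x||^2 + lam*||x||_1."""
    return soft_threshold(x + step * D.T @ (y - D @ x), step * lam)

rng = np.random.default_rng(0)
y = rng.standard_normal(8)
lam = 0.3

# Unitary case: one step from zero is plain shrinkage of the transform
# coefficients, and a second step leaves the result unchanged.
Q, _ = np.linalg.qr(rng.standard_normal((8, 8)))
x1 = shrinkage_step(Q, y, np.zeros(8), lam, step=1.0)
x2 = shrinkage_step(Q, y, x1, lam, step=1.0)

# Redundant case: the first step from zero is still a simple shrinkage
# of D.T @ y (up to scaling), but further steps keep refining it.
D = rng.standard_normal((8, 16))
D /= np.linalg.norm(D, axis=0)
step = 1.0 / np.linalg.norm(D, 2) ** 2
z1 = shrinkage_step(D, y, np.zeros(16), lam, step)
z2 = shrinkage_step(D, y, z1, lam, step)
```

With a unitary `Q` the product `Q.T @ Q` is the identity, so the update collapses to `soft_threshold(Q.T @ y, lam)` regardless of the current `x`; with a redundant `D` this cancellation no longer happens, which is why iterating becomes meaningful.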

In recent years there has been a growing interest in the study of sparse representation for signals. Using an over-complete dictionary that contains prototype signal-atoms, signals are described as sparse linear combinations of these atoms. Recent activity in this field concentrated mainly on the study of pursuit algorithms that decompose signals with respect to a given dictionary. Designing dictionaries to better fit the above model can be done by either selecting pre-specified transforms, or by adapting the dictionary to a set of training signals. Both these techniques have been considered in recent years; however, this topic is still largely open. In this presentation we address the latter problem of designing dictionaries, and introduce the K-SVD algorithm for this task. We show how this algorithm could be interpreted as a generalization of the K-Means clustering process, and demonstrate its behavior in both synthetic tests and in applications on real data. The accompanying paper also describes its generalization to the non-negative matrix factorization problem that suits signals generated under an additive model with positive atoms.
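
The dictionary-update half of such a scheme can be sketched in a few lines of Python. The function below revises a single atom together with the coefficients that use it, via a rank-one SVD approximation of the residual restricted to the signals that currently use that atom. The function name and argument layout (dictionary `D`, coefficient matrix `X`, training signals `Y` as columns) are assumptions for this example, and the sparse-coding stage that alternates with this update is omitted.

```python
import numpy as np

def ksvd_atom_update(D, X, Y, j):
    """Update atom j of D and row j of X in place, keeping X's support."""
    users = np.flatnonzero(X[j])          # signals that use atom j
    if users.size == 0:
        return                            # unused atom: nothing to fit
    # Residual of those signals with atom j's own contribution added back.
    E = Y[:, users] - D @ X[:, users] + np.outer(D[:, j], X[j, users])
    # The best rank-one fit of E gives the new atom and its coefficients.
    U, s, Vt = np.linalg.svd(E, full_matrices=False)
    D[:, j] = U[:, 0]
    X[j, users] = s[0] * Vt[0]
```

The link to K-Means is visible here: if every signal were restricted to use exactly one atom with a unit coefficient, the update of an atom would reduce to averaging the signals assigned to it.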

A recent trend in signal, image, and data analysis is the use of overcomplete linear transforms that lead to a sparse description of the processed data. This new breed of methods is more difficult to use, but they are much more effective in applications such as data compression and solution of inverse problems. In this talk we present a wide-angle view of this recent path of research.

In signal and image processing, we often use transforms in order to simplify operations or to enable better treatment of the given data. A recent trend in these fields is the use of overcomplete linear transforms that lead to a sparse description of signals. This new breed of methods is more difficult to use, often requiring more computations. Still, they are much more effective in applications such as signal compression and inverse problems. In fact, much of the success attributed to the wavelet transform in recent years is directly related to the above-mentioned trend. In this talk we will present a survey of this recent path of research, its main results, and the involved players and their contributions. We will discuss both the theoretic and the application sides to this field. No previous knowledge is assumed.

Transforming signals is typically done in order to simplify their representations. Among the many ways to do so, the use of linear combinations taken over a redundant dictionary is appealing due to both its simplicity and its diversity. Choosing the sparsest of all solutions aligns well with our desire for a simple signal description, and this also leads to uniqueness. Since the search for the sparsest representation is NP-hard, methods such as the Basis-Pursuit (BP) and the Matching Pursuit (MP) were proposed in the mid-1990s to approximate the desired sparse solution.

The pioneering work by Donoho and Huo (’99) started a sequence of research efforts, all aiming to theoretically understand the quality of approximations obtained by the pursuit algorithms, and the limits to their success. A careful study established that both BP and MP algorithms are expected to lead to the sparsest of all representations if indeed such a solution is sparse enough. Later work generalized these results to the case where error is allowed in the representation. Very recent results addressed the same analysis from a probabilistic point of view, finding bounds on the average performance, and showing a close resemblance to empirical evidence.
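
To give "sparse enough" a concrete shape: one well-known sufficient condition of this kind, which is not necessarily the exact statement covered in the talk, says that a representation with fewer than (1 + 1/μ)/2 non-zeros is the unique sparsest one and is recovered by BP and by orthogonal matching pursuit, where μ is the mutual coherence of the dictionary. The Python sketch below computes μ and this threshold; the function names are chosen for this example.

```python
import numpy as np

def mutual_coherence(D):
    """Largest absolute inner product between distinct normalized atoms."""
    Dn = D / np.linalg.norm(D, axis=0)
    G = np.abs(Dn.T @ Dn)
    np.fill_diagonal(G, 0.0)              # ignore each atom with itself
    return float(G.max())

def sparsity_bound(D):
    """Coherence-based threshold 0.5 * (1 + 1/mu)."""
    return 0.5 * (1.0 + 1.0 / mutual_coherence(D))

# Two-ortho example: spikes joined with a normalized 4x4 Hadamard basis.
H2 = np.array([[1.0, 1.0], [1.0, -1.0]])
H4 = np.kron(H2, H2) / 2.0
D = np.hstack([np.eye(4), H4])
mu = mutual_coherence(D)
bound = sparsity_bound(D)
```

The bound is a worst-case guarantee and tends to be pessimistic, which is consistent with the later probabilistic analyses of average performance mentioned above.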

All these results lead to the ability to use the pursuit algorithms with a clear understanding of their expected behavior, in what Stanley Osher would have called an “emotionally uninvolved” manner. This paves the way for future transforms that will be (i) based on overcomplete (redundant) representations, (ii) linear in constructing signals and non-linear in their decomposition, and (iii) driven by sparsity as their core force. Furthermore, as signal transforms, signal compression, and inverse problems are all tangled together, we are now armed with new and effective tools when addressing many problems in signal and image processing.

In this talk we present a survey of this recent path of research, its main results, and the involved players and their contributions. We will discuss both the theoretic and the application sides to this field. No previous knowledge is assumed.

In this talk we present a novel method for separating images into texture and piecewise smooth parts, and show how this formulation can also lead to image inpainting. Our separation and inpainting process exploits both the variational and the sparsity mechanisms, by combining the Basis Pursuit Denoising (BPDN) algorithm and the Total-Variation (TV) regularization scheme.

The basic idea in this work is the use of two appropriate dictionaries, one for the representation of textures, and the other for the natural scene parts, assumed to be piecewise smooth. Both dictionaries are chosen such that they lead to sparse representations over one type of image content (either texture or piecewise smooth). The use of the BPDN with the two augmented dictionaries leads to the desired separation, along with noise removal as a by-product. As choosing a proper dictionary for natural scenes is very hard, a TV regularization is employed to better direct the separation process.
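
For readers who prefer a formula, a penalty of the following general form captures the ingredients described above. The symbols are assumptions introduced here: $x$ is the given image, $T_t$ and $T_n$ are the texture and piecewise-smooth dictionaries, $\alpha_t$ and $\alpha_n$ are their representation vectors, and $\lambda$, $\gamma$ are weights. The exact functional used in the talk may differ.

```latex
\min_{\alpha_t,\,\alpha_n}\;
  \|\alpha_t\|_1 + \|\alpha_n\|_1
  + \lambda \,\bigl\| x - T_t\alpha_t - T_n\alpha_n \bigr\|_2^2
  + \gamma \,\mathrm{TV}\!\left( T_n\alpha_n \right)
```

The two $\ell_1$ terms promote sparsity in each dictionary, the quadratic term ties the sum of the two parts to the image while absorbing noise, and the TV term acts only on the piecewise-smooth part $T_n\alpha_n$. The separated components are then read off as $T_t\alpha_t$ and $T_n\alpha_n$.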

This concept of separation via sparse and over-complete representation of the image is shown to have a direct and natural extension to image inpainting. When some of the pixels in known locations in the image are missing, the same separation formulation can be changed to fit the problem of decomposing the image while filling in the holes. Thus, as a by-product of the separation we achieve inpainting. This approach should be compared to a recently published inpainting system by Bertalmio, Vese, Sapiro, and Osher. We will present several experimental results that validate the algorithm’s performance.

The separation of image content into semantic parts plays a vital role in applications such as compression, enhancement, restoration, and more. In recent years several pioneering works suggested such separation based on a variational formulation, and others using independent component analysis and sparsity. In this talk we present a novel method for separating images into texture and piecewise smooth parts, exploiting both the variational and the sparsity mechanisms, by combining the Basis Pursuit Denoising (BPDN) algorithm and the Total-Variation (TV) regularization scheme.

The basic idea in our work is the use of two appropriate dictionaries, one for the representation of textures, and the other for the natural scene parts, assumed to be piecewise smooth. Both dictionaries are chosen such that they lead to sparse representations over one type of image content (either texture or piecewise smooth). The use of the BPDN with the two augmented dictionaries leads to the desired separation, along with noise removal as a by-product. As choosing a proper dictionary for natural scenes is very hard, a TV regularization is employed to better direct the separation process. We will present several experimental results that validate the algorithm’s performance.