Talks
Super-Resolution
Image denoising is the most fundamental problem in image enhancement, and it is largely solved: it has reached impressive heights in performance and quality, almost as good as it can ever get. Interestingly, it turns out that we can solve many other problems using the image denoising “engine”. I will describe the Regularization by Denoising (RED) framework, which uses the denoising engine to define the regularization of any inverse problem. The idea is to define an explicit, image-adaptive regularization functional directly from a high-performance denoiser. Surprisingly, the resulting regularizer is guaranteed to be convex, and the overall objective functional is explicit, clear and well-defined. With complete flexibility in choosing the iterative optimization procedure for minimizing this functional, RED is capable of incorporating any image denoising algorithm as a regularizer and of treating general inverse problems very effectively, and it is guaranteed to converge to the globally optimal result.
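As a rough guide to what "explicit regularization functional" means here, the objective can be written out as follows. This is a sketch in notation assumed for illustration (the abstract itself defines no symbols): $x$ is the unknown image, $y$ the measurements, $\ell(y, x)$ a data-fidelity term, $f(\cdot)$ the chosen denoiser, $\lambda$ a regularization weight and $\mu$ a step size. The simple gradient expression holds only under assumptions on the denoiser, such as local homogeneity and a symmetric Jacobian.

```latex
% RED objective: data fidelity plus a denoiser-driven regularizer
E(x) = \ell(y, x) + \frac{\lambda}{2}\, x^{T}\bigl(x - f(x)\bigr)

% Under the assumptions noted above, the regularizer's gradient is the
% denoising residual, so the gradient of the objective is
\nabla E(x) = \nabla_x \ell(y, x) + \lambda \bigl(x - f(x)\bigr)

% One possible minimizer: steepest descent, one denoiser call per iteration
x_{k+1} = x_k - \mu \Bigl[ \nabla_x \ell(y, x_k) + \lambda \bigl(x_k - f(x_k)\bigr) \Bigr]
```

Steepest descent is only one choice; the flexibility mentioned in the abstract means other iterative solvers can be applied to the same functional.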
Multi-channel TV broadcast, Internet video and YouTube, home DVD movies, video conference calls, cellular video calls and more: there is no doubt that videos are abundant and in everyday use. In many cases the quality of the available video is poor, something commonly referred to as “low-resolution”. As an example, high-definition (HD) TVs are commonly sold these days to customers who hope to enjoy a better viewing experience. Nevertheless, most TV broadcast today is still done in standard definition (SD), leading to poor image quality on these screens. The field of Super-Resolution deals with ways to process video content so as to increase its optical resolution. The core idea is that fusing the visual content of several images can yield an outcome of higher resolution. For years it was assumed that such fusion requires knowing the exact motion that objects undergo within the scene. Since this motion may be quite complex in general, this stood as a major obstacle for industrial applications. Three years ago a breakthrough was made in this field, making it possible to bypass the need for exact motion estimation. In this lecture we shall survey the work in this field from its early days (25 years ago) until very recently, and show the evolution of the ideas and the results obtained. No prior knowledge of image processing is required.
Scaling up a single image while preserving its sharpness and visual quality is a difficult and highly ill-posed inverse problem. A series of algorithms have been proposed over the years for its solution, with varying degrees of success. At CVPR 2008, Yang, Wright, Huang and Ma proposed a solution to this problem based on sparse representation modeling and dictionary learning. In this talk I present a variant of their method with several important differences. In particular, the proposed algorithm does not need a separate training phase, as the dictionaries are learned directly from the image to be scaled up. Furthermore, the high-resolution dictionary is learned differently, by forcing its alignment with the low-resolution one. We show the benefit these modifications bring in terms of the simplicity of the overall algorithm and its output quality.
Super-resolution reconstruction fuses several low-quality images into one higher-quality result with better optical resolution. Classic super-resolution techniques rely strongly on the availability of accurate motion estimation for this fusion task. When the motion is estimated inaccurately, as often happens for non-global motion fields, annoying artifacts appear in the super-resolved outcome. Encouraged by recent developments on the video denoising problem, where state-of-the-art algorithms are formed with no explicit motion estimation, we seek a super-resolution algorithm of a similar nature that will allow processing of sequences with general motion patterns. In this talk we base our solution on the Non-Local-Means (NLM) algorithm. We show how this denoising method is generalized to become a relatively simple super-resolution algorithm with no explicit motion estimation. Results on several test movies show that the proposed method is very successful in providing super-resolution on general sequences.
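To make the NLM building block concrete, here is a minimal sketch of Non-Local-Means denoising on a 1-D signal. It illustrates only the denoising step, not the super-resolution generalization presented in the talk. The function name `nlm_denoise_1d` and the parameters `patch_radius`, `search_radius` and `h` are assumptions chosen for illustration.

```python
import numpy as np

def nlm_denoise_1d(y, patch_radius=2, search_radius=5, h=0.3):
    """Non-Local-Means on a 1-D signal (illustrative sketch).

    Each output sample is a weighted average of the samples in a search
    window; a sample gets a large weight when the patch around it
    resembles the patch around the sample being estimated.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    width = 2 * patch_radius + 1
    padded = np.pad(y, patch_radius, mode="reflect")
    # patches[i] is the window of length `width` centred on y[i]
    patches = np.stack([padded[i:i + width] for i in range(n)])
    out = np.empty(n)
    for i in range(n):
        lo = max(0, i - search_radius)
        hi = min(n, i + search_radius + 1)
        # Mean squared difference between the reference patch and candidates
        d2 = np.mean((patches[lo:hi] - patches[i]) ** 2, axis=1)
        # Similar patches (small d2) receive weights close to 1
        w = np.exp(-d2 / h ** 2)
        out[i] = np.sum(w * y[lo:hi]) / np.sum(w)
    return out

# Usage: denoise a noisy step edge
rng = np.random.default_rng(0)
clean = np.concatenate([np.zeros(100), np.ones(100)])
noisy = clean + 0.1 * rng.standard_normal(clean.size)
denoised = nlm_denoise_1d(noisy)
```

No motion or correspondence is estimated anywhere: the patch-similarity weights decide which samples contribute, which is the property the abstract builds on.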
The super-resolution reconstruction problem addresses the fusion of several low-quality images into one higher-resolution outcome. A typical scenario for such a process could be the fusion of several video fields into a higher-resolution output that can lead to a high-quality printout. The super-resolution result provides TRUE resolution, as opposed to the typically used interpolation techniques. The core idea behind this ability is the fact that higher frequencies exist in the measurements, although in an aliased form, and they can be recovered thanks to the motion between the frames. Ever since the pioneering work by Tsai and Huang (1984), who demonstrated the core ability to get super-resolution, much work has been devoted by various research groups to this problem and ways to solve it. In this talk I intend to present the core ideas behind the super-resolution (SR) problem, and our very recent results in this field. Starting from the problem modeling, and posing the super-resolution task as a general inverse problem, we shall see how the SR problem can be addressed effectively using ML and later MAP estimation methods. The talk also shows various ingredients that are added to the reconstruction process to make it robust and efficient. Many results will accompany these descriptions, so as to show the strengths of the methods.
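The claim that motion between frames makes aliased detail recoverable can be illustrated with an idealized 1-D sketch. The assumptions are strong and are not part of the talk: no blur, no noise, known integer shifts on the high-resolution grid that cover every sampling phase, and circular boundary handling. The helper names `decimate_shifted` and `fuse_frames` are hypothetical.

```python
import numpy as np

def decimate_shifted(x, shift, factor):
    """Simulate one low-resolution frame: shift the high-resolution
    signal by `shift` samples (circularly), then keep every
    `factor`-th sample. Each frame on its own is aliased."""
    return np.roll(x, -shift)[::factor]

def fuse_frames(frames, shifts, factor):
    """Interleave the frames back onto the high-resolution grid.
    A frame taken with shift s holds exactly the samples x[s::factor]."""
    n = frames[0].size * factor
    x_hat = np.zeros(n)
    for frame, s in zip(frames, shifts):
        x_hat[s::factor] = frame
    return x_hat

# Usage: three frames at shifts 0, 1, 2 fully cover a factor-3 grid
factor = 3
shifts = [0, 1, 2]
x = np.cos(2 * np.pi * 5 * np.arange(12) / 12)  # too fast for any single frame
frames = [decimate_shifted(x, s, factor) for s in shifts]
x_hat = fuse_frames(frames, shifts, factor)
```

In this idealized setting the fusion is a pure interleaving and recovers the signal exactly. With blur, noise or sub-pixel and non-global motion, no such direct recipe exists, which is where the estimation methods mentioned in the abstract come in.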
In our field, when composing a super-resolved image, two ingredients contribute to the ability to get a leap in resolution: (i) the existence of many and diverse measurements, and (ii) the availability of a model that reliably describes the image to be produced. This second part, also known as the regularization or the prior, is of generic importance: it can be deployed in any inverse problem and used in many other applications (compression, image synthesis, and more). The well-known recent work by Baker and Kanade (’02) and the work that followed (Lin and Shum ’04, Robinson and Milanfar ’05) all suggest that while the measurements alone are limited in the resolution increase they can provide, the prior could be used to break this barrier. Clearly, the better the prior used, the higher the quality we can expect from the overall reconstruction procedure. Indeed, recent work on super-resolution (and other inverse problems) departs from the regular Tikhonov method and tends toward robust counterparts, such as TV or the bilateral prior (see Farsiu et al. ’04).
A recent trend of growing popularity is the use of examples in defining the prior. Indeed, Baker and Kanade were the first to introduce this notion to the super-resolution task. There are several ways to use examples to shape a better prior. The work by Mumford and Zhu (’99) and the follow-up contribution by Haber and Tenorio (’02) suggest a parametric approach. Baker and Kanade (’02), Freeman et al. (several contributions, ’01) and Nakagaki and Katsaggelos (’03) all use the examples to directly learn the reconstruction function, by observing pairs of low-resolution and high-resolution images.
In this talk we survey this line of work and show how it can be extended in several important ways. We show a general framework that builds an example-based prior that is independent of the inverse problem at hand, and we demonstrate it on several such problems, with promising results.
The super-resolution reconstruction process deals with the fusion of several low-quality, low-resolution images into one higher-resolution and possibly better final image. We start by showing that, from a theoretical point of view, this fusion process is based on the generalized sampling theorems due to Yen (1956) and Papoulis (1977). When a more realistic scenario is considered, with blur, arbitrary motion and additive noise, an estimation approach is taken instead.
We describe methods based on Maximum-Likelihood (ML), Maximum-A-posteriori Probability (MAP) and Projection onto Convex Sets (POCS) as candidate tools to use. Underlying all these methods is the development of a model describing the relation between the measurements (low-quality images) and the desired output (high-resolution image). Through this path we present the basic rationale behind super-resolution, and then present the dichotomy between the static and the dynamic super-resolution processes. We propose a treatment of both, and deal with several interesting special cases.
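A common way to write such a measurement model, and the ML and MAP estimators built on it, is sketched below. The notation is assumed for illustration and is not taken from the abstract: $x$ is the high-resolution image, $y_k$ the $k$-th low-resolution measurement, $F_k$ a warp (motion) operator, $H$ a blur, $D$ a decimation operator, $v_k$ additive noise, $\rho(\cdot)$ a prior and $\lambda$ its weight. The least-squares form of the ML estimator further assumes white Gaussian noise.

```latex
% Measurement model: warp, blur, decimate, add noise
y_k = D H F_k\, x + v_k, \qquad k = 1, \dots, N

% ML estimate (white Gaussian noise assumed): least-squares fit to all frames
\hat{x}_{\mathrm{ML}} = \arg\min_{x} \sum_{k=1}^{N} \bigl\| D H F_k\, x - y_k \bigr\|_2^2

% MAP estimate: the same fit plus a prior term
\hat{x}_{\mathrm{MAP}} = \arg\min_{x} \sum_{k=1}^{N} \bigl\| D H F_k\, x - y_k \bigr\|_2^2 + \lambda\, \rho(x)
```

The prior term matters most when the measurements are few or their motions are not diverse enough, since the least-squares problem alone is then ill-posed.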