## The PIRM challenge on perceptual super resolution

Part of the PIRM Workshop at ECCV 2018

PIRM Challenge report

Single-image super-resolution has gained much attention in recent years. The appearance of methods based on deep neural networks and the great advancement in generative modeling (e.g. GANs) have facilitated a major performance leap. One of the ultimate goals in super-resolution is to produce outputs with high visual quality, as perceived by human observers. However, many works have observed a fundamental disagreement between this recent leap in performance, as quantified by common evaluation metrics (PSNR, SSIM), and the subjective evaluation of human observers (reported e.g. in the SRGAN and EnhanceNet papers).

This observation has led to two distinct research directions. The first aims to improve performance according to conventional evaluation metrics (e.g. PSNR), and frequently produces visually unpleasing results. The second targets high perceptual quality, and commonly performs poorly according to conventional metrics. Previous benchmarks and challenges are mostly relevant to the first line of work.

The PIRM-SR challenge will compare and rank perceptual single-image super-resolution methods. In contrast to previous challenges, the evaluation will be done in a perceptual-quality-aware manner based on [Blau and Michaeli, CVPR'18], and not based solely on distortion measurement (e.g. PSNR/SSIM). This unified approach quantifies the accuracy and perceptual quality of algorithms jointly, and will enable perceptually driven methods to compete alongside algorithms that target PSNR maximization.

* References for the methods appearing in the figures above can be found in this paper.

The Task 4x super-resolution of images which were down-sampled with a bicubic kernel.

Evaluation The perception-distortion plane will be divided into three regions defined by thresholds on the RMSE. In each region, the winning algorithm is the one that achieves the best perceptual quality.

Perceptual quality will be quantified by combining the quality measures of [Ma et al.] and [NIQE] by
$$\text{Perceptual index} = \tfrac{1}{2} ((10 - \text{Ma}) + \text{NIQE}).$$
Note that a lower perceptual index indicates better perceptual quality. In case of a marginal difference in the perceptual index between two submissions (up to 0.01 apart), the submission with the lower RMSE will be ranked higher.
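As a rough illustration of the formula and the tie-break rule above, the following Python sketch computes the perceptual index from already-computed Ma and NIQE scores and orders a few submissions. The `Submission` class, the function names, and the single pass over adjacent near-ties are assumptions made for this example; the text does not specify how the organizers handle chains of near-ties, and computing the Ma and NIQE scores themselves is outside the scope of this sketch.

```python
from dataclasses import dataclass


@dataclass
class Submission:
    # Hypothetical container for illustration; not part of the challenge code.
    name: str
    ma: float    # Ma et al. score (higher is better)
    niqe: float  # NIQE score (lower is better)
    rmse: float  # distortion (lower is better)


def perceptual_index(ma: float, niqe: float) -> float:
    """Perceptual index = 1/2 * ((10 - Ma) + NIQE); lower is better."""
    return 0.5 * ((10.0 - ma) + niqe)


def rank_submissions(subs, tol=0.01):
    """Order by perceptual index, then let RMSE decide adjacent near-ties.

    Assumption: a single pass over neighbouring entries is enough for a
    sketch; it is one possible reading of the stated pairwise rule.
    """
    ranked = sorted(subs, key=lambda s: perceptual_index(s.ma, s.niqe))
    for i in range(len(ranked) - 1):
        first, second = ranked[i], ranked[i + 1]
        gap = abs(perceptual_index(second.ma, second.niqe)
                  - perceptual_index(first.ma, first.niqe))
        if gap <= tol and second.rmse < first.rmse:
            # Marginal difference in perceptual index: lower RMSE ranks higher.
            ranked[i], ranked[i + 1] = second, first
    return ranked
```

With made-up scores, a submission whose perceptual index is 0.005 worse but whose RMSE is lower would move ahead of its neighbour, while a submission 0.1 worse would stay behind regardless of its RMSE.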

See [Blau and Michaeli, CVPR'18] for an explanation of the rationale behind this evaluation method.

Regions The three regions are defined by

* Region 1: RMSE ≤ 11.5
* Region 2: 11.5 < RMSE ≤ 12.5
* Region 3: 12.5 < RMSE ≤ 16

We encourage participation in all three regions.
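The region thresholds can be expressed as a small lookup. This is only a sketch: the function name `region_for_rmse` is hypothetical, and returning `None` for an RMSE above 16 is an assumption about how out-of-range submissions might be treated, which the text does not state.

```python
from typing import Optional


def region_for_rmse(rmse: float) -> Optional[int]:
    """Map an RMSE value to its challenge region (1, 2 or 3).

    Returns None when the RMSE exceeds 16, i.e. falls outside all
    three regions (assumed behaviour, for illustration only).
    """
    if rmse <= 11.5:
        return 1
    if rmse <= 12.5:
        return 2
    if rmse <= 16:
        return 3
    return None
```

Note that the upper bound of each region is inclusive, so an RMSE of exactly 11.5 belongs to Region 1 and an RMSE of exactly 12.5 belongs to Region 2.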

Data for evaluation Algorithms will be evaluated on a set of 100 images. The validation set can be downloaded here. Scores are computed on the Y-channel after removing a 4-pixel border.
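For a quick local sanity check, the RMSE on the Y-channel with a 4-pixel border removed might be approximated as in the NumPy sketch below. Several details are assumptions rather than facts from this page: the ITU-R BT.601 luma conversion (the convention used by MATLAB's `rgb2ycbcr`), 8-bit RGB inputs in the range 0 to 255, and the function names. The provided Matlab validation code remains the reference, and its results may differ slightly (for example in rounding).

```python
import numpy as np


def rgb_to_y(img) -> np.ndarray:
    """Convert an H x W x 3 RGB image in [0, 255] to a luma (Y) channel.

    Assumption: BT.601 coefficients with the 16..235 range, as in
    MATLAB's rgb2ycbcr; the official code may differ in detail.
    """
    img = np.asarray(img, dtype=np.float64)
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    return 16.0 + (65.481 * r + 128.553 * g + 24.966 * b) / 255.0


def y_rmse(sr, hr, border: int = 4) -> float:
    """RMSE between the Y-channels of two images, ignoring a border."""
    y_sr, y_hr = rgb_to_y(sr), rgb_to_y(hr)
    if y_sr.shape != y_hr.shape:
        raise ValueError("images must have the same size")
    h, w = y_hr.shape
    # Crop `border` pixels from every side before measuring the error.
    diff = (y_sr - y_hr)[border:h - border, border:w - border]
    return float(np.sqrt(np.mean(diff ** 2)))
```

A typical call would pass the super-resolved output and the high-resolution ground truth as arrays of equal size, for example `y_rmse(sr_image, hr_image)`, and compare the value against the region thresholds.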

Submission Submit your results on the validation set to see your ranking on the leaderboard. After registering, you will receive submission instructions. During the validation phase (until July 17th), each group is limited to 20 validation submissions in total.

Self validation Evaluate your results on your own with the validation set and code found here.

Discussion forum Ask questions and find previous responses on the challenge discussion forum.

## Data

Validation and test set Two sets of 100 image pairs (high-res and low-res).

Validation code Matlab code for computing the RMSE and perceptual index.

## Final submission

Test dataset In the test phase (last week of the challenge), an additional set of 100 images will be released for the final evaluation.

Final submission In addition to the test set results, each group must submit: (i) a test code/executable for reproducing the results, and (ii) a fact sheet describing the method (a format will be released).

Final results and ranking The final results will be announced at the PIRM Workshop (in conjunction with ECCV'18).

Paper submission (optional) Challenge participants are invited to submit papers for the ECCV workshop proceedings. Papers will be accepted based on: (i) academic quality, and (ii) challenge ranking. The length limit is 14 pages (excluding references) in ECCV format.

## Final Ranking

* 3rd place in region 1 and 2nd place in region 2 are shared equally by two teams due to a marginal difference in both the perceptual index and the RMSE.
* For marginal differences in the perceptual index between two submissions (up to 0.01 apart), the submission with a lower RMSE was ranked higher.

* The CX, EDSR and ENet baselines (marked in red) did not participate in the challenge.

Technion, Israel
