The PIRM challenge on perceptual super resolution

Part of the PIRM Workshop at ECCV 2018

Single-image super-resolution has gained much attention in recent years. The appearance of deep neural-net based methods and the great advancement in generative modeling (e.g. GANs) have facilitated a major performance leap. One of the ultimate goals in super-resolution is to produce outputs with high visual quality, as perceived by human observers. However, many works have observed a fundamental disagreement between this recent leap in performance, as quantified by common evaluation metrics (PSNR, SSIM), and the subjective evaluation of human observers (reported e.g. in the SRGAN and EnhanceNet papers).

This observation led to the formation of two distinct research directions. The first aims to improve performance according to conventional evaluation metrics (e.g. PSNR), but frequently produces visually unpleasing results. The second targets high perceptual quality, but commonly performs poorly according to conventional metrics. Previous benchmarks and challenges are mostly relevant to the first line of work.

The PIRM-SR challenge will compare and rank perceptual single-image super-resolution methods. In contrast to previous challenges, the evaluation will be done in a perceptual-quality aware manner based on [Blau and Michaeli, CVPR'18], and not based solely on distortion measurement (e.g. PSNR/SSIM). This unified approach quantifies the accuracy and perceptual quality of algorithms jointly, and will enable perceptually-driven methods to compete alongside algorithms that target PSNR maximization.

* References for the methods appearing in the figures above can be found in this paper.

Task and Evaluation

The Task 4x super-resolution of images that were down-sampled with a bicubic kernel.

Evaluation The perception-distortion plane will be divided into three regions defined by thresholds on the RMSE. In each region, the winning algorithm is the one that achieves the best perceptual quality.

Perceptual quality will be quantified by combining the quality measures of [Ma et al.] and [NIQE]: $$\text{Perceptual index} = \tfrac{1}{2} ((10 - \text{Ma}) + \text{NIQE}).$$ Notice that a lower perceptual index indicates better perceptual quality. In the case of a marginal difference in the perceptual index between two submissions (up to 0.01 apart), the submission with the lower RMSE will be ranked higher.
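As a minimal sketch, the perceptual index is a simple combination of the two no-reference scores. The function below assumes the Ma et al. score and the NIQE score have already been computed (e.g. with the authors' Matlab implementations); the function name is illustrative, not part of the official evaluation code:

```python
def perceptual_index(ma_score: float, niqe_score: float) -> float:
    """Combine the score of Ma et al. (higher is better) with NIQE
    (lower is better) into the challenge's perceptual index.
    A lower index means better perceptual quality."""
    return 0.5 * ((10.0 - ma_score) + niqe_score)

# Example: Ma = 8.0, NIQE = 3.0  ->  0.5 * ((10 - 8) + 3) = 2.5
print(perceptual_index(8.0, 3.0))
```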

See [Blau and Michaeli, CVPR'18] for an explanation of the rationale behind this evaluation method.

Regions The three regions are defined by
Region 1: RMSE ≤ 11.5
Region 2: 11.5 < RMSE ≤ 12.5
Region 3: 12.5 < RMSE ≤ 16
We encourage participation in all three regions.
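The region thresholds above translate directly into a lookup on a submission's RMSE. The following helper is a sketch (the function name is hypothetical, not from the challenge toolkit):

```python
def challenge_region(rmse: float):
    """Map an RMSE value to its PIRM-SR region (1, 2 or 3).
    Returns None if the RMSE exceeds the Region 3 bound of 16."""
    if rmse <= 11.5:
        return 1   # Region 1: RMSE <= 11.5
    if rmse <= 12.5:
        return 2   # Region 2: 11.5 < RMSE <= 12.5
    if rmse <= 16.0:
        return 3   # Region 3: 12.5 < RMSE <= 16
    return None
```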

Data for evaluation Algorithms will be evaluated on a set of 100 images. The validation set can be downloaded here. Scores are computed on the y-channel after removing a 4-pixel border.
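The scoring protocol (y-channel only, 4-pixel border removed) can be sketched with NumPy as below. This assumes the images have already been converted to their luminance (y) channel as 2-D arrays; the function name is illustrative and the official scoring is done in Matlab:

```python
import numpy as np

def challenge_rmse(y_out: np.ndarray, y_gt: np.ndarray,
                   border: int = 4) -> float:
    """RMSE between two y-channel images after cropping a
    `border`-pixel frame on all sides, mirroring the challenge's
    evaluation protocol. Inputs are 2-D luminance arrays."""
    a = y_out[border:-border, border:-border].astype(np.float64)
    b = y_gt[border:-border, border:-border].astype(np.float64)
    return float(np.sqrt(np.mean((a - b) ** 2)))
```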

Submission Submit your results on the validation set to see your ranking on the leaderboard. After registering, you will receive submission instructions. During the validation phase (until July 17th), each group is limited to 20 validation submissions in total.

Self validation Evaluate your results on your own with the self-validation set and code found here. These are not the validation images, but are drawn from the same distribution of scenes, quality, etc.

Discussion forum Ask questions and find previous responses on the challenge discussion forum.


Self-validation set A set of 100 image pairs (high-res and low-res) for self-validation.

Self-validation code A Matlab code for self-validation.

Validation set A set of 100 images (low-res only) for submissions during the validation phase.

Test set A set of 100 images (low-res only) for submissions during the final test phase.

Fact sheet template A template for the fact sheet required in final submissions.

Final submission

Test dataset In the test phase (last week of the challenge), an additional set of 100 images will be released for the final evaluation.

Final submission In addition to the test set results, each group must submit: (i) a test code/executable for reproducing the results, and (ii) a fact sheet describing the method (a format will be released).

Final results and ranking The final results will be announced at the PIRM Workshop (in conjunction with ECCV'18).

Paper submission (optional) Challenge participants are invited to submit papers for the ECCV workshop proceedings. Papers will be accepted based on: (i) academic quality, and (ii) challenge ranking. The length limit is 14 pages (excluding references) in ECCV format.

Final Ranking

* 3rd place in region 1 and 2nd place in region 2 are shared equally by two teams due to a marginal difference in both the perceptual index and the RMSE.
* For marginal differences in the perceptual index between two submissions (up to 0.01 apart), the submission with a lower RMSE was ranked higher.

* The CX, EDSR and ENet baselines (marked in red) did not participate in the challenge.

Important dates



May 1st: Validation data released
July 18th: Test data released
July 25th: Final results submission deadline
July 27th: Fact sheet submission deadline
August 1st: Challenge results released to participants
August 22nd: Paper submission deadline (optional, for challenge participants only)
September 5th: Notification of accepted papers
September 14th: PIRM 2018 Workshop
September 30th: Camera-ready deadline

* All dates refer to 11:59 PM CET



Lihi Zelnik-Manor, Technion, Israel
Tomer Michaeli, Technion, Israel
Radu Timofte, ETH Zurich, Switzerland
Roey Mechrez, Technion, Israel
Yochai Blau, Technion, Israel