HarmonyTransformer

Transformer for Image Harmonization and Beyond

Zonghui Guo, Zhaorui Gu, Bing Zheng, Junyu Dong, and Haiyong Zheng*

Intelligent Information Sensing and Processing Lab

College of Electronic Engineering, Ocean University of China

Abstract

Image harmonization, which aims to make composite images look more realistic, is an important and challenging task. A composite, synthesized by combining the foreground of one image with the background of another, inevitably looks inharmonious because the two sources were captured under distinct imaging conditions, namely lighting. Current solutions mainly adopt an encoder-decoder architecture built on convolutional neural networks (CNNs) to capture the context of the composite image and infer what the foreground should look like given the surrounding background. In this work, we seek to solve image harmonization with the Transformer, leveraging its powerful ability to model long-range context dependencies in order to adjust the foreground light so that it is compatible with the background light while keeping structure and semantics unchanged. We present the design of our two vision Transformer frameworks and the corresponding methods, together with comprehensive experiments and an empirical study that demonstrate the power of the Transformer and investigate its use for vision. Our methods achieve state-of-the-art performance on image harmonization as well as on four additional vision and graphics tasks, i.e., image enhancement, image inpainting, white-balance editing, and portrait relighting, indicating the effectiveness of our work.
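The composite formation described in the abstract can be illustrated with a minimal NumPy sketch. It is not taken from the released code: the function names `make_composite` and `blend_harmonized` are hypothetical, and the final blending step (keeping background pixels untouched and taking only the foreground from a model output) is an assumption about how a harmonized result is commonly assembled, not a statement of the authors' implementation.

```python
import numpy as np


def make_composite(foreground, background, mask):
    """Paste the masked foreground region onto the background.

    foreground, background: float arrays of shape (H, W, 3) in [0, 1].
    mask: float array of shape (H, W, 1); 1 marks foreground pixels.
    """
    return mask * foreground + (1.0 - mask) * background


def blend_harmonized(harmonized, composite, mask):
    """Assumed post-processing: take the foreground from the model output
    and leave the background of the composite unchanged."""
    return mask * harmonized + (1.0 - mask) * composite


# Toy 2x2 example: a bright foreground pixel pasted onto a dark background.
fg = np.full((2, 2, 3), 0.9)
bg = np.full((2, 2, 3), 0.2)
mask = np.zeros((2, 2, 1))
mask[0, 0, 0] = 1.0  # only the top-left pixel belongs to the foreground

composite = make_composite(fg, bg, mask)

# Stand-in for a network prediction (hypothetical constant image).
harmonized = np.full((2, 2, 3), 0.5)
result = blend_harmonized(harmonized, composite, mask)
```

In this toy case only the top-left pixel changes after blending, which mirrors the goal stated above: the foreground appearance is adjusted while the background is left as it was.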

Source Code

Code and models are available at https://github.com/zhenglab/HarmonyTransformer.

Citation

Zonghui Guo, Zhaorui Gu, Bing Zheng, Junyu Dong, Haiyong Zheng*. Transformer for Image Harmonization and Beyond. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, DOI: 10.1109/TPAMI.2022.3207091.

1. Transformer for Image Harmonization

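[Figure: qualitative comparison. Panels: Real, Composite, S²AM, DoveNet, RainNet, IntrinsicHarmony, DHT, DHT+]

[Figure: qualitative comparison. Panels: Real, Composite, HT(Normal), HT+(Normal), HT(Inverted), HT+(Inverted)]

[Figure: qualitative comparison. Panels: Composite, DIH, S²AM, DoveNet, RainNet, IntrinsicHarmony, DHT, DHT+]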

2. Transformer for Image Enhancement

[Figure: qualitative comparison. Panels: Input, DeepUPE, CSRNet, DeepLPF, SymmetricEn, DHT, HT+, Ground Truth]

3. Transformer for Image Inpainting

[Figure: qualitative comparison. Panels: Input, Partial-Conv, Gated-Conv, RFR-Net, CTSDG, HT+, Ground Truth]

4. Transformer for White-Balance Editing

[Figure: qualitative comparison. Panels: Input, KNN-WB, D-WBE, WB-HT+, Ground Truth]

[Figure: qualitative comparison. Panels: Input, D-WBE, WB-HT+, Ground Truth]

5. Transformer for Portrait Relighting

[Figure: qualitative comparison. Panels: Original, Desired SH, SIPR, DPR, ShadowMaskFace, DHT+, Ground Truth]

[Figure: qualitative comparison. Panels: Original, Desired, SIPR, DPR, ShadowMaskFace, DHT+]