Transformer for Image Harmonization and Beyond
Intelligent Information Sensing and Processing Lab
College of Electronic Engineering, Ocean University of China
Abstract
Image harmonization, aiming to make composite images look more realistic, is an important and challenging task.
The composite, synthesized by combining foreground from one image with background from another image,
inevitably suffers from an inharmonious appearance caused by distinct imaging conditions, i.e., lighting.
Current solutions mainly adopt an encoder-decoder architecture built on convolutional neural networks (CNNs) to capture the context of composite images,
trying to infer what the foreground should look like by referring to the surrounding background.
In this work, we seek to solve image harmonization with a Transformer, leveraging its powerful ability to model long-range context dependencies
to adjust the foreground lighting so that it is compatible with the background lighting while keeping structure and semantics unchanged.
We present the design of our two vision Transformer frameworks and the corresponding methods, together with comprehensive experiments and an empirical study,
demonstrating the power of the Transformer and investigating its use for vision. Our methods achieve state-of-the-art performance on
image harmonization as well as on four additional vision and graphics tasks, i.e., image enhancement, image inpainting, white-balance editing, and portrait relighting, indicating the superiority of our work.
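The abstract describes a composite as the foreground of one image combined with the background of another. A minimal sketch of that compositing step is given here, assuming a binary foreground mask and RGB values in [0, 1]. The names `make_composite`, `foreground`, `background`, and `mask` are hypothetical and are not taken from the authors' released code.

```python
from typing import List, Tuple

Pixel = Tuple[float, ...]
Image = List[List[Pixel]]
Mask = List[List[float]]


def make_composite(foreground: Image, background: Image, mask: Mask) -> Image:
    """Blend per pixel: mask * foreground + (1 - mask) * background.

    Illustrative assumption: mask is 1.0 where the pasted object is
    and 0.0 elsewhere; all three inputs share the same height and width.
    """
    composite = []
    for fg_row, bg_row, m_row in zip(foreground, background, mask):
        row = []
        for fg, bg, m in zip(fg_row, bg_row, m_row):
            row.append(tuple(m * f + (1.0 - m) * b for f, b in zip(fg, bg)))
        composite.append(row)
    return composite


# Toy 1x2 images: a bright, warm foreground source and a dim, cool background.
foreground = [[(0.9, 0.8, 0.6), (0.9, 0.8, 0.6)]]
background = [[(0.1, 0.2, 0.3), (0.1, 0.2, 0.3)]]
mask = [[1.0, 0.0]]  # left pixel is foreground, right pixel is background

composite = make_composite(foreground, background, mask)
```

In this toy case the left pixel keeps the bright foreground colors while the right pixel keeps the dim background colors, which is the kind of lighting mismatch a harmonization model is meant to reduce inside the masked region.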
Source Code
Code and models are available at https://github.com/zhenglab/HarmonyTransformer.
Citation
Zonghui Guo, Zhaorui Gu, Bing Zheng, Junyu Dong, Haiyong Zheng*. Transformer for Image Harmonization and Beyond. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2022, DOI: 10.1109/TPAMI.2022.3207091.
1. Transformer for Image Harmonization
Image comparison panels: Real, Composite, S2AM, DoveNet, RainNet, IntrinsicHarmony, DHT, DHT+
Image comparison panels: Real, Composite, HT (Normal), HT+ (Normal), HT (Inverted), HT+ (Inverted)
Image comparison panels: Composite, DIH, S2AM, DoveNet, RainNet, IntrinsicHarmony, DHT, DHT+
2. Transformer for Image Enhancement
Image comparison panels: Input, DeepUPE, CSRNet, DeepLPF, SymmetricEn, DHT, HT+, Ground Truth
3. Transformer for Image Inpainting
Image comparison panels: Input, Partial-Conv, Gated-Conv, RFR-Net, CTSDG, HT, HT+, Ground Truth
4. Transformer for White-Balance Editing
Image comparison panels: Input, KNN-WB, D-WBE, WB-HT+, Ground Truth
Image comparison panels: Input, D-WBE, WB-HT+, Ground Truth
5. Transformer for Portrait Relighting
Image comparison panels: Original, Desired SH, SIPR, DPR, ShadowMaskFace, DHT+, Ground Truth
Image comparison panels: Original, Desired, SIPR, DPR, ShadowMaskFace, DHT+