Satoshi Iizuka*, Edgar Simo-Serra*, and Hiroshi Ishikawa, Waseda University (*equal contribution)


Colorization of Black-and-white Pictures 2

Our Goal: Fully-automatic colorization 3

Colorization of Old Films 4

Related Work
- Scribble-based [Levin+ 2004; Yatziv+ 2004; An+ 2009; Xu+ 2013; Endo+ 2016]: colors are specified with user scribbles; requires manual input [Levin+ 2004]
- Reference image-based [Chia+ 2011; Gupta+ 2012]: transfers the colors of a reference image; requires a very similar reference
(Figure: input, reference, and output [Gupta+ 2012]) 5

Related Work
- Automatic colorization with hand-crafted features [Cheng+ 2015]: combines multiple existing image features and computes chrominance with a shallow neural network; depends on the performance of semantic segmentation and only handles simple outdoor scenes
(Figure: input, image features, neural network, chroma, output) 6

Contributions
- A novel end-to-end network that jointly learns global and local features for automatic image colorization
- A new fusion layer that elegantly merges the global and local features
- Exploiting classification labels for learning 7

Layers of Our Model
- Fully-connected layer: every neuron is connected to all neurons of the adjacent layers
- Convolutional layer: takes the underlying spatial structure into account
(Figure: fully-connected vs. convolutional layer; axes x, y; number of feature maps) 8
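The contrast between the two layer types can be sketched in plain NumPy (an illustrative toy, not the network's actual implementation; the function names are made up here):

```python
import numpy as np

def fully_connected(x, W, b):
    """Every input neuron connects to every output neuron."""
    return W @ x + b

def conv2d_single(x, k):
    """A 2-D convolution (valid padding): the same small kernel slides
    over the image, so weights are shared across spatial positions."""
    h, w = x.shape
    kh, kw = k.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i+kh, j:j+kw] * k)
    return out

x = np.arange(16.0).reshape(4, 4)
k = np.ones((3, 3)) / 9.0            # 3x3 averaging kernel
print(conv2d_single(x, k).shape)     # (2, 2): the output keeps a spatial layout
```

The fully-connected layer ties every input to every output through its own weight, while the convolution reuses one small kernel at every position, which is what lets it respect spatial structure.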

Our Model
- Two branches: local features and global features
- Composed of four networks: low-level features, global features, mid-level features, and colorization
(Architecture diagram: luminance, scaling, low-level features, mid-level features / global features, fusion layer, colorization, upsampling, chrominance) 9

Low-Level Features
- Extract low-level features such as edges and corners
- Lower resolution for efficient processing
(Architecture diagram with the low-level features networks highlighted; their weights are shared between the two branches) 10

Global Features Compute a global 256-dimensional vector representation of the entire image 11

Mid-Level Features Extract mid-level features such as texture 12

Fusion Layer (architecture diagram with the fusion layer highlighted) 13

Fusion Layer
- Combines the global features with the mid-level features
- The resulting features are independent of the input resolution
- \( y^{\mathrm{fusion}}_{u,v} = \sigma\!\left( b + W \begin{bmatrix} y^{\mathrm{global}} \\ y^{\mathrm{mid}}_{u,v} \end{bmatrix} \right) \) 14
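A minimal sketch of such a fusion step, assuming a 256-channel mid-level feature map and a 256-d global vector as in the talk (loop-based NumPy for clarity; this is an illustration, not the authors' code):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fusion_layer(y_mid, y_global, W, b):
    """Replicate the global vector at every spatial position (u, v),
    concatenate it with that position's mid-level feature, and apply a
    shared affine map followed by a sigmoid. Because the same W and b
    are used everywhere, the output keeps the spatial size of y_mid,
    so any input resolution works."""
    H, V, C = y_mid.shape                     # (height, width, 256)
    out = np.zeros((H, V, W.shape[0]))
    for u in range(H):
        for v in range(V):
            stacked = np.concatenate([y_global, y_mid[u, v]])  # 512-d
            out[u, v] = sigmoid(b + W @ stacked)
    return out

rng = np.random.default_rng(0)
y_mid = rng.standard_normal((4, 5, 256))      # tiny map for the demo
y_global = rng.standard_normal(256)
W = rng.standard_normal((256, 512)) * 0.01    # 512 -> 256 fusion weights
b = np.zeros(256)
print(fusion_layer(y_mid, y_global, W, b).shape)  # (4, 5, 256)
```

In practice this is equivalent to a 1×1 convolution over the concatenated features, which is why the fused representation stays resolution-independent.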

Colorization Compute the chrominance from the fused features and restore the image to the input resolution 15

Training of Colors
- Mean Squared Error (MSE) as the loss function
- Optimization with ADADELTA [Zeiler 2012], which adaptively sets the learning rate
(Diagram: input, model forward pass, output, MSE against the ground truth, backward pass) 16
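The ADADELTA rule the slide refers to can be sketched on a toy 1-D problem (the update follows [Zeiler 2012]; the scalar MSE objective and the 2000-step budget are made up here for illustration):

```python
import numpy as np

def adadelta_step(grad, state, rho=0.95, eps=1e-6):
    """One ADADELTA update: the step size is the ratio of running RMS
    averages of past updates and past gradients, so no global learning
    rate has to be chosen by hand."""
    Eg2, Edx2 = state
    Eg2 = rho * Eg2 + (1 - rho) * grad ** 2          # accumulate gradient
    dx = -np.sqrt(Edx2 + eps) / np.sqrt(Eg2 + eps) * grad
    Edx2 = rho * Edx2 + (1 - rho) * dx ** 2          # accumulate update
    return dx, (Eg2, Edx2)

# Toy MSE: minimize (w - 3)^2 starting from w = 0
w, state = 0.0, (0.0, 0.0)
for _ in range(2000):
    grad = 2.0 * (w - 3.0)        # d/dw of (w - 3)^2
    dx, state = adadelta_step(grad, state)
    w += dx
print(round(w, 2))                # w approaches the target 3.0
```

Starting cautiously and accelerating as the accumulators warm up is characteristic of ADADELTA, which is why it was attractive for training without hand-tuned learning-rate schedules.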

Joint Training
- Train for classification jointly with colorization
- The classification network is connected to the global features
(Predicted labels for an example image: 20.60% formal garden, 16.13% arch, 13.50% abbey, 7.07% botanical garden, 6.53% golf course) 17
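The joint objective can be sketched as the colorization MSE plus a weighted classification cross-entropy on the global-feature branch (NumPy sketch; the weight `alpha` is illustrative, not a value stated in the talk):

```python
import numpy as np

def mse_loss(pred_chroma, gt_chroma):
    """Colorization term: mean squared error over the chrominance map."""
    return np.mean((pred_chroma - gt_chroma) ** 2)

def cross_entropy(logits, label):
    """Classification term: negative log-probability of the true label."""
    z = logits - logits.max()                 # for numerical stability
    log_probs = z - np.log(np.exp(z).sum())
    return -log_probs[label]

def joint_loss(pred_chroma, gt_chroma, logits, label, alpha=1 / 300):
    """Total loss: colorization MSE plus alpha times the scene-label
    cross-entropy; alpha balances the two objectives."""
    return mse_loss(pred_chroma, gt_chroma) + alpha * cross_entropy(logits, label)

pred = np.zeros((2, 2, 2))        # predicted chrominance (toy size)
gt = np.zeros((2, 2, 2))          # ground-truth chrominance
logits = np.zeros(205)            # one score per scene class
print(joint_loss(pred, gt, logits, label=3))  # log(205)/300: pure classification term
```

Backpropagating this combined loss pushes the global features to encode scene semantics, which in turn improves the colors chosen by the fusion and colorization networks.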

Dataset
- MIT Places Scene Dataset [Zhou+ 2014]
- 2.3 million training images with 205 scene labels, 256 × 256 pixels each
(Example classes: abbey, airport terminal, aquarium, baseball field, dining room, forest road, gas station, gift shop) 18

Computational Time Colorizes an image within a few seconds (as fast as 80 ms) 20

Colorization of MIT Places Dataset 21

Comparisons (columns: input, [Cheng+ 2015], ours w/o global features, ours w/ global features) 22

Effectiveness of Global Features (columns: input, w/o global features, w/ global features) 23

User Study
- 10 users participated
- Each user was shown 500 images of each of three types, 1,500 images in total
- 90% of our results were judged natural (choices: natural / unnatural) 24

Colorization of Historical Photographs Mount Moran, 1941 Scott's Run, 1937 Youngsters, 1912 Burns Basement, 1910 25

Style Transfer Low-Level Features 26

Style Transfer Low-Level Features 27

Style Transfer Adapting the colorization of one image to the style of another
(Figure: input pairs with local and global feature sources swapped, and the resulting output) 28

Limitations
- Difficult to output colorful images
- Cannot restore the exact original colors
(Examples: input, ground truth, and output pairs) 29

Conclusion
- A novel approach for image colorization that fuses global and local information
- Fusion layer
- Joint training of colorization and classification
- Style transfer
(Photos: Farm Land, 1933; California National Park, 1936; Homes, 1936; Spinners, 1910; Doffer Boys, 1909) 30

Thank you!
Project page: http://hi.cs.waseda.ac.jp/~iizuka/projects/colorization
Code on GitHub: https://github.com/satoshiiizuka/siggraph2016_colorization
(Photos: Community Center, 1936; North Dome, 1936; Norris Dam, 1933; Miner, 1937) 31