Supplementary Material for Video Propagation Networks

Varun Jampani¹, Raghudeep Gadde¹,² and Peter V. Gehler¹,²
¹ Max Planck Institute for Intelligent Systems, Tübingen, Germany
² Bernstein Center for Computational Neuroscience, Tübingen, Germany
{varun.jampani,raghudeep.gadde,peter.gehler}@tuebingen.mpg.de

1. Parameters and Additional Results

In this supplementary material, we present the experiment protocols and additional qualitative results for the experiments on video object segmentation, semantic video segmentation and video color propagation. Table 1 lists the feature scales and other parameters used in the different experiments. Figures 1 and 2 show qualitative results on video object segmentation, with some failure cases in Fig. 3. Figure 4 shows qualitative results on semantic video segmentation, and Fig. 5 shows results on video color propagation.

Experiment                                   | Feature Type               | Feature Scale-1, Λ_a                 | Feature Scale-2, Λ_b               | α    | Input Frames | Loss Type
Video Object Segmentation                    | (x, y, Y, Cb, Cr, t)       | (0.02, 0.02, 0.07, 0.4, 0.4, 0.01)   | (0.03, 0.03, 0.09, 0.5, 0.5, 0.2)  | 0.5  | 9            | Logistic
Semantic Video Seg., with CNN1 [5], NoFlow   | (x, y, R, G, B, t)         | (0.08, 0.08, 0.2, 0.2, 0.2, 0.04)    | (0.11, 0.11, 0.2, 0.2, 0.2, 0.04)  | 0.5  | 3            | Logistic
Semantic Video Seg., with CNN1 [5], Flow     | (x+u_x, y+u_y, R, G, B, t) | (0.11, 0.11, 0.14, 0.14, 0.14, 0.03) | (0.08, 0.08, 0.12, 0.12, 0.12, 0.01) | 0.65 | 3          | Logistic
Semantic Video Seg., with CNN2 [3], Flow     | (x+u_x, y+u_y, R, G, B, t) | (0.08, 0.08, 0.2, 0.2, 0.2, 0.04)    | (0.09, 0.09, 0.25, 0.25, 0.25, 0.03) | 0.5  | 4          | Logistic
Video Color Propagation                      | (x, y, I, t)               | (0.04, 0.04, 0.2, 0.04)              | No second kernel                   | 1    | 4            | MSE

Table 1. Experiment Protocols. Experiment protocols for the different experiments presented in this work. Feature Type: feature spaces used for the bilateral convolutions, with position (x, y) and color (R, G, B or Y, Cb, Cr) features in [0, 255]; u_x, u_y denote the optical flow with respect to the present frame, and I denotes grayscale intensity. Feature Scales (Λ_a, Λ_b): validated scales for the features used. α: exponential time decay for the input frames. Input Frames: number of input frames given to VPN. Loss Type: loss used for back-propagation; MSE corresponds to the Euclidean mean squared error loss and Logistic to the multinomial logistic loss.
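To make the role of these parameters concrete, the following is a minimal sketch (not the authors' code; the function names and the exact decay form exp(-α·Δt) are assumptions) of how per-pixel bilateral features such as (x, y, Y, Cb, Cr, t) can be assembled and scaled element-wise by a feature scale Λ, and how an exponential time decay α can down-weight older input frames.

```python
import numpy as np

def bilateral_features(frame_ycbcr, t, feature_scale):
    """Per-pixel features (x, y, Y, Cb, Cr, t), scaled element-wise by feature_scale (Lambda).

    frame_ycbcr:   (H, W, 3) array with Y, Cb, Cr values in [0, 255].
    t:             frame index used as the time feature.
    feature_scale: 6 scale values, e.g. Lambda_a = (0.02, 0.02, 0.07, 0.4, 0.4, 0.01).
    """
    h, w, _ = frame_ycbcr.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)  # pixel coordinates
    feats = np.stack(
        [xs, ys,
         frame_ycbcr[..., 0], frame_ycbcr[..., 1], frame_ycbcr[..., 2],
         np.full((h, w), float(t), dtype=np.float32)],
        axis=-1)                                     # (H, W, 6)
    return feats * np.asarray(feature_scale, dtype=np.float32)

def frame_weights(num_frames, alpha):
    """Assumed exponential time decay exp(-alpha * dt) over the input frames,
    where dt = 0 is the present frame and larger dt are older frames."""
    dt = np.arange(num_frames, dtype=np.float32)
    return np.exp(-alpha * dt)

# Example: the video object segmentation row of Table 1.
lambda_a = (0.02, 0.02, 0.07, 0.4, 0.4, 0.01)
frame = np.random.randint(0, 256, size=(120, 160, 3)).astype(np.float32)  # stand-in Y, Cb, Cr frame
features = bilateral_features(frame, t=8, feature_scale=lambda_a)         # (120, 160, 6)
weights = frame_weights(num_frames=9, alpha=0.5)                          # one weight per input frame
```

In the network itself, such scaled features define the positions at which the bilateral convolutions operate; the sketch only illustrates how the scales Λ_a, Λ_b and the decay α from Table 1 enter the computation.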

Figure 1. Video Object Segmentation. Shown are Frames 5, 10, 15, 20, 25 and 30 of an example video with the corresponding ground truth (GT) masks and the predictions from BVS [2], OFL [4], VPN (VPN-Stage2) and VPN-DLab (VPN-DeepLab) models.

Figure 2. Video Object Segmentation. Shown are several frames (Frames 5, 15, 30, 50 and Frames 5, 10, 20, 30) from two example videos with the corresponding ground truth (GT) masks and the predictions from BVS [2], OFL [4], VPN (VPN-Stage2) and VPN-DLab (VPN-DeepLab) models.

Figure 3. Failure Cases for Video Object Segmentation. Shown are several frames (Frames 4, 14, 24, 36 and Frames 5, 15, 30, 50) from two example videos with the corresponding ground truth (GT) masks and the predictions from BVS [2], OFL [4], VPN (VPN-Stage2) and VPN-DLab (VPN-DeepLab) models.

Figure 4. Semantic Video Segmentation. Input video frames and the corresponding ground truth (GT) segmentation, together with the predictions of the CNN [5] and of the CNN combined with VPN-Flow (Ours).

Figure 5. Video Color Propagation. Input grayscale video frames (Frames 2, 7, 13 and 19 of two example videos) and the corresponding ground truth (GT) color images, together with the color predictions of Levin et al. [1] and the VPN-Stage1 (Ours) model.

References
[1] A. Levin, D. Lischinski, and Y. Weiss. Colorization using optimization. ACM Transactions on Graphics (ToG), 23(3):689-694, 2004.
[2] N. Märki, F. Perazzi, O. Wang, and A. Sorkine-Hornung. Bilateral space video segmentation. In IEEE Conference on Computer Vision and Pattern Recognition, pages 743-751, 2016.
[3] S. R. Richter, V. Vineet, S. Roth, and V. Koltun. Playing for data: Ground truth from computer games. In European Conference on Computer Vision, pages 102-118. Springer, 2016.
[4] Y.-H. Tsai, M.-H. Yang, and M. J. Black. Video segmentation via object flow. In IEEE Conference on Computer Vision and Pattern Recognition, 2016.
[5] F. Yu and V. Koltun. Multi-scale context aggregation by dilated convolutions. In International Conference on Learning Representations, 2016.