Finding Dense Subgraphs via Low-Rank Bilinear Optimization


Dimitris S. Papailiopoulos (dimitris@utexas.edu)
Ioannis Mitliagkas (ioannis@utexas.edu)
Alexandros G. Dimakis (dimakis@austin.utexas.edu)
Constantine Caramanis (constantine@utexas.edu)
The University of Texas at Austin

Abstract

Given a graph, the Densest k-Subgraph (DkS) problem asks for the subgraph on k vertices that contains the largest number of edges. In this work, we develop a new algorithm for DkS that searches a low-dimensional space for provably good solutions. Our algorithm comes with novel performance bounds that depend on the graph spectrum. Our graph-dependent bounds are in practice significantly tighter than worst-case a priori bounds: for most tested real-world graphs we find subgraphs with density provably within 70% of the optimum. Our algorithm runs in nearly linear time, under spectral assumptions satisfied by most graphs found in applications. Moreover, it is highly scalable and parallelizable. We demonstrate this by implementing it in MapReduce and executing numerous experiments on massive real-world graphs that have up to billions of edges. We empirically show that our algorithm can find subgraphs of significantly higher density compared to the previous state of the art.

1. Introduction

Given a graph G on n vertices with m edges and a parameter k, we are interested in finding an induced subgraph on k vertices with the largest average degree, also known as the maximum density. This is the Densest k-Subgraph (DkS) problem, a fundamental problem in combinatorial optimization with applications in numerous fields including the social sciences, communication networks, and biology (see e.g. (Hu et al., 2005; Gibson et al., 2005; Dourisboure et al., 2007; Saha et al., 2010; Miller et al., 2010; Bahmani et al., 2012)).

Proceedings of the 31st International Conference on Machine Learning, Beijing, China, 2014. JMLR: W&CP volume 32. Copyright 2014 by the author(s).

DkS is a notoriously hard problem. It is NP-hard by reduction from MaxClique.
Moreover, Khot showed in (Khot, 2004) that, under widely believed complexity-theoretic assumptions, DkS cannot be approximated within an arbitrary constant factor.¹ The best known approximation ratio was n^{1/3+ε} (for some small ε) due to (Feige et al., 2001). Recently, (Bhaskara et al., 2010) introduced an algorithm with approximation ratio n^{1/4+ε} that runs in time n^{O(1/ε)}. Such results, where the approximation factor scales as a polynomial in the number of vertices, are too pessimistic for real-world applications. The resistance to better approximation, despite the long history of the problem, suggests that DkS is probably very hard in the worst case.

¹ A ρ-approximation ratio means that there exists an algorithm that produces in polynomial time a number A such that 1 ≤ opt/A ≤ ρ, where opt is the optimal density.

Our Contributions. In this work we move beyond the worst-case framework. We present a novel DkS algorithm that has two key features: i) it comes with approximation guarantees that are surprisingly tight on real-world graphs, and ii) it is fully parallelizable and can scale up to graphs with billions of edges. Our algorithm combines spectral and combinatorial techniques; it relies on examining candidate subgraphs obtained from vectors lying in a low-dimensional subspace of the adjacency matrix of the graph. This is accomplished through a framework called the Spannogram, which we define below. At its core, our algorithm has an exact solver for constant-rank bilinear programs with combinatorial constraints.

Our approximation guarantees are data-dependent: they are related to the spectrum of the adjacency matrix. Let opt denote the average degree (i.e., the density) of the densest k-subgraph, where 0 ≤ opt ≤ k − 1. Our algorithm takes as input the graph, the subgraph size k, and an accuracy parameter d ∈ {1, ..., n}, and runs in time O(n^{d+1}). Note that under a spectral condition satisfied by many real graphs, the computational time is reduced to nearly linear, as we subsequently discuss. The output is a subgraph on k vertices with density opt_d, for which we obtain the following approximation result:

Theorem 1. For any unweighted graph,

    opt_d ≥ opt/(2 + o_n(1)) − 2λ_{d+1},

where λ_i is the i-th largest, in magnitude, eigenvalue of the adjacency matrix of the graph. Moreover, if the graph is bipartite, or if the d largest eigenvalues of the graph are positive, then

    opt_d ≥ opt − 2λ_{d+1}.

Our bounds come close to 2 + ε and 1 + ε factor approximations when λ_{d+1} is significantly smaller than the density of the densest k-subgraph. In the following theorem, we give such an example. However, we would like to note that in the worst case our bounds might not yield anything meaningful.

Theorem 2. If the densest k-subgraph contains a constant fraction of all the edges, and k = Ω(√|E|), then we can approximate DkS within a factor of 2 + ε, in time n^{O(1/ε²)}. If additionally the graph is bipartite, we can approximate DkS within a factor of 1 + ε.

The above result is similar to the 1 + ε approximation ratio of (Arora et al., 1995) for dense graphs, where the densest k-subgraph contains a constant fraction of the Θ(n²) edges and k = Θ(n). The innovation here is that our ratio also applies to sparse graphs with a sublinear number of edges.

Computable upper bounds. In addition to these theoretical guarantees, our analysis allows us to obtain a graph-dependent upper bound on the optimal subgraph density. This is shown in Fig. 3 in our experimental section, where for many graphs our algorithm is provably within 70% of the upper bound on opt. These are far stronger guarantees than the best available a priori bounds. This illustrates the potential power of graph-dependent guarantees, which, however, require the execution of an algorithm.

Nearly-linear time approximation. Our algorithm has a worst-case running time of O(n^{d+1}), where d controls the quality of the approximation.
However, under some mild spectral assumptions, a randomized version of our algorithm runs in nearly-linear time.

Theorem 3. Let the d largest eigenvalues of the graph be positive and let the d-th and (d+1)-st largest have constant ratio: λ_d/λ_{d+1} = Θ(1). Then, we can modify our algorithm to output, with high probability, a subgraph with density (1 − 1/log n)·opt_d in nearly-linear time O(m log n + n log^{d+1} n), where m is the number of edges.

We found that the above spectral condition holds for all d ≤ 5 in many real-world graphs that we tested.

Scalability. We develop two key scalability features that allow us to scale up efficiently on massive graphs.

Vertex sparsification: We introduce a pre-processing step that eliminates vertices that are unlikely to be part of the densest k-subgraph. The elimination is based on the vertices' weighted leverage scores (Mahoney & Drineas, 2009; Boutsidis et al., 2009) and admits a provable bound on the introduced error. We empirically found that, even with a negligible additional error, the elimination dramatically reduced problem sizes in all tested datasets.

MapReduce implementation: We show that our algorithm is fully parallelizable and tailor it for the MapReduce framework. We use our MapReduce implementation to run experiments on Elastic MapReduce (EMR) on Amazon. In our large-scale experiments, we were able to scale out to thousands of mappers and reducers in parallel over 800 cores, and find large dense subgraphs in graphs with billions of edges.

1.1. Related work

DkS algorithms: One of the few positive results for DkS is a 1 + ε approximation for dense graphs where m = Ω(n²), in the linear subgraph setting k = Ω(n) (Arora et al., 1995). For some values of m = o(n²), a 2 + ε approximation was established by (Suzuki & Tokuyama, 2005). Moreover, for any k = Ω(n), a constant-factor approximation is possible via a greedy approach by (Asahiro et al., 2000), or via semidefinite relaxations by (Srivastav & Wolf, 1998) and (Feige & Langberg, 2001).
Recently, (Alon et al., 2013) established new approximation results for graphs with small ε-rank, using an approximate DkS solver for low-rank perturbed versions of the adjacency matrix. There is a vast literature on algorithms for detecting communities and well-connected subgraphs: greedy schemes (Ravi et al., 1994), optimization approaches (Jethava et al., 2012; Aspremont et al., 2010; Ames, 2011), and the truncated power method (Yuan & Zhang, 2011). We compare with several of these algorithms in our evaluation section.

The Spannogram framework: We present an exact solver for bilinear optimization problems on matrices of constant rank, under {0, 1} and sparsity constraints on the variables. Our theory is a generalization of the Spannogram framework, originally introduced in the foundational work of (Karystinos & Liavas, 2010) and further developed in (Asteris et al., 2014; Papailiopoulos et al., 2013), which obtains exact solvers for low-rank quadratic optimization problems with combinatorial constraints, such as sparse PCA.

MapReduce algorithms for graphs: The design of MapReduce algorithms for massive graphs is an active research area, as Hadoop becomes one of the standards for storing large data sets. The related work by Bahmani et al. (Bahmani et al., 2012) designs a novel MapReduce algorithm for the densest subgraph problem. The densest subgraph problem requires finding a subgraph of highest normalized density without enforcing a specific subgraph size. Surprisingly, without a subgraph size restriction, the densest subgraph problem becomes polynomially solvable and is therefore fundamentally different from what we consider in this paper.

2. Proposed Algorithm

The density of a subgraph indexed by a vertex set S ⊆ {1, ..., n} is equal to the average degree of the vertices within S:

    den(S) = (1_S^T A 1_S) / |S|,

where A is the adjacency matrix (A_{i,j} = 1 if (i,j) is an edge, else A_{i,j} = 0) and the indicator vector 1_S has 1s in the entries indexed by S and 0s otherwise. Observe that 1_S^T A 1_S = Σ_{i,j ∈ S} A_{i,j} is twice the number of edges in the subgraph with vertices in S. For a fixed subgraph size |S| = k, we can express DkS as a quadratic optimization:

    DkS:  opt = (1/k) · max_{|S|=k} 1_S^T A 1_S,

where |S| = k denotes that the optimization variable is a k-vertex subset of {1, ..., n}.

The bipartite version of DkS. We approximate DkS via approximating its bipartite version. This problem can be expressed as a bilinear maximization:

    DkBS:  opt_B = (1/k) · max_{|X|=k} max_{|Y|=k} 1_X^T A 1_Y.

As we see in the following lemma, the two problems are fundamentally related: a good solution for the bipartite version of the problem maps to a half as good solution for DkS. The proof can be found in the supplemental material, where we describe how to convert an algorithm for DkBS to one for DkS.

Lemma 1. A ρ-approximation algorithm for DkBS implies a 2ρ-approximation algorithm for DkS.

2.1. DkBS through low-rank approximations

At the core of our approximation lies a constant-rank solver: we show that DkBS can be solved in polynomial time on constant-rank matrices.
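To make the definitions above concrete, here is a minimal numpy sketch (ours, not the paper's code) of the density den(S) and a brute-force DkS solver; the exhaustive search is of course only feasible for toy graphs:

```python
from itertools import combinations
import numpy as np

def density(A, S):
    """den(S) = 1_S^T A 1_S / |S|: the average degree inside S."""
    S = list(S)
    return A[np.ix_(S, S)].sum() / len(S)

def brute_force_dks(A, k):
    """Exhaustive DkS: checks all C(n, k) vertex subsets."""
    n = A.shape[0]
    return max(combinations(range(n), k), key=lambda S: density(A, S))

# Toy graph: a triangle {0, 1, 2} with a pendant vertex 3.
A = np.zeros((4, 4))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3)]:
    A[i, j] = A[j, i] = 1

S = brute_force_dks(A, 3)
print(set(S), density(A, S))   # the triangle, density 2.0
```

Note that 1_S^T A 1_S counts each internal edge twice, so the triangle's density is 2·3/3 = 2, i.e., its average degree.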
We solve constant-rank instances of DkBS instead of DkS due to an important implication: DkS is NP-hard even for rank-1 matrices, as we show in the supplemental material. The exact steps of our low-rank DkBS algorithm are given in the pseudo-code tables referred to as Algorithms 1 and 2.² We continue with a high-level description.

Step 1: Obtain A_d = Σ_{i=1}^{d} λ_i v_i v_i^T, a rank-d approximation of A. Here, λ_i is the i-th largest in magnitude eigenvalue and v_i the corresponding eigenvector.

Step 2: Use A_d to obtain O(n^d) candidate subgraphs. For any matrix A, we can solve DkBS by exhaustively checking all pairs (X, Y) of k-subsets of vertices. Surprisingly, if we want to find the (X, Y) pairs that maximize 1_X^T A_d 1_Y, i.e., the bilinear problem on the rank-d matrix A_d, then we show that only O(n^d) candidate pairs need to be examined.

In the next section, we derive the constant-rank solver using two key facts. First, for each fixed vertex set Y, we show that it is easy to find the optimal set X that maximizes 1_X^T A_d 1_Y. Since this turns out to be easy, the challenge is to find the number of different vertex sets Y that we need to check. Do we need to exhaustively check all (n choose k) supports Y? We show that this question is equivalent to searching the span of the first d eigenvectors of A, and collecting in a set S_d the top-k coordinates of all vectors in that d-dimensional space. By modifying the Spannogram theory of (Karystinos & Liavas, 2010; Asteris et al., 2014; Papailiopoulos et al., 2013), we show that this set has size O(n^d) and can be constructed in time O(n^{d+1}). This will imply that DkBS can be solved in time O(n^{d+1}) on A_d.

Computational Complexity. The worst-case time complexity of our algorithm is O(T_d + n^{d+1}), where T_d is the time to compute the first d eigenvectors of A. Under conditions satisfied by many real-world graphs, we show that the complexity reduces to nearly linear in the size of G: O(m log n + n log^{d+1} n).
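Step 1 and the top_k primitive can be sketched with numpy as follows. This is our own illustration (for large graphs one would use a sparse eigensolver such as scipy.sparse.linalg.eigsh instead of a dense eigendecomposition):

```python
import numpy as np

def top_k(v, k):
    """Indices of the k largest *signed* entries of v."""
    return set(np.argpartition(-v, k - 1)[:k])

def rank_d_approximation(A, d):
    """A_d = sum_{i<=d} lam_i v_i v_i^T, keeping the d eigenvalues
    that are largest in magnitude."""
    lam, V = np.linalg.eigh(A)               # symmetric eigendecomposition
    order = np.argsort(-np.abs(lam))[:d]     # d largest |eigenvalues|
    lam_d, V_d = lam[order], V[:, order]
    A_d = V_d @ np.diag(lam_d) @ V_d.T
    return A_d, lam_d, V_d

# Sanity check on a random symmetric 0/1 matrix: the spectral-norm error
# of this rank-d truncation is exactly |lambda_{d+1}| (Eckart-Young).
rng = np.random.default_rng(0)
B = (rng.random((20, 20)) < 0.3).astype(float)
A = np.triu(B, 1)
A = A + A.T
A_d, lam_d, V_d = rank_d_approximation(A, d=2)
lam_sorted = np.sort(np.abs(np.linalg.eigvalsh(A)))[::-1]
err = np.linalg.norm(A - A_d, 2)
```

The error term |λ_{d+1}| measured here is exactly the loss that appears in the paper's spectral approximation bounds.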
Algorithm 1 lowrankDkBS(k, d, A)
1: [V_d, Λ_d] = EVD(A, d)
2: S_d = Spannogram_d(k, V_d)
3: {X, Y} = arg max_{|X|=k, Y ∈ S_d} 1_X^T V_d Λ_d V_d^T 1_Y
4: Output: {X, Y}

Spannogram_d(k, V_d)
1: S_d = {top_k(v) : v ∈ span(v_1, ..., v_d)}
2: Output: S_d

Remark 1. In the following, we present an exact solver for DkBS on constant-rank approximations of A. Our general algorithm for DkS makes a number of calls to the DkBS low-rank solver, on slightly different matrices that are obtained by sub-sampling the entries of a low-rank approximation of A. The details of our general algorithm can be found in the supplemental material.

² In the pseudo-code of Algorithm 1, top_k(v) denotes the indices of the k largest signed elements of v.

2.2. Approximation Guarantees

We approximate DkBS by finding a solution to the constant-rank problem

    max_{|X|=k} max_{|Y|=k} 1_X^T A_d 1_Y.

We output a pair of vertex sets X_d, Y_d, which we refer to as the rank-d optimal solution, that has density opt_B^d = (1/k) · 1_{X_d}^T A 1_{Y_d}. Our approximation guarantees measure how far opt_B^d is from opt_B, the optimal density for DkBS. Our bounds capture a simple core idea: the loss in our approximation comes from solving the problem on A_d instead of solving it on the full-rank matrix A. This loss is quantified in the next lemma. The proofs of the following results are in the supplemental material.

Lemma 2. For any matrix A,

    opt_B^d ≥ opt_B − 2λ_{d+1},

where λ_i is the i-th largest eigenvalue of A.

Using an appropriate pre-processing step and then running Algorithm 1 as a subroutine on a sub-sampled and low-rank version of A, we output a k-subgraph Z_d that has density opt_d. By essentially combining Lemmata 1 and 2, we obtain the following bounds.

Theorem 1. The k-subgraph indexed by Z_d has density

    opt_d ≥ opt/(2 + o_n(1)) − 2λ_{d+1},

where λ_i is the i-th largest, in magnitude, eigenvalue of the adjacency matrix of the graph. Moreover, if the graph is bipartite, or if the d largest eigenvalues of the graph are positive, then

    opt_d ≥ opt − 2λ_{d+1}.

Using bounds on eigenvalues of graphs, Theorem 1 translates to the following approximation guarantees.

Theorem 2. If the densest k-subgraph contains a constant fraction of all the edges, and k = Ω(√|E|), then we can approximate DkS within a factor of 2 + ε, in time n^{O(1/ε²)}. If additionally the graph is bipartite, then we can approximate DkS within a factor of 1 + ε.

Remark 2. The above results are similar to the 1 + ε ratio of (Arora et al., 1995), which holds for graphs where the densest k-subgraph contains Ω(n²) edges.

Graph-dependent bounds.
For any given graph, after running our algorithm, we can compute an upper bound on the optimal density opt via bounds on opt_B, since it is easy to see that opt_B ≥ opt. Our graph-dependent bound is the minimum of three upper bounds on the unknown optimal density:

Lemma 3. The optimal density of DkS can be bounded as

    opt ≤ min{ (1/k) · 1_{X_d}^T A_d 1_{Y_d} + λ_{d+1},  λ_1,  k − 1 }.

In our experimental section, we plot the above upper bounds and show that for most tested graphs our algorithm performs provably within 70% of the upper bound on the optimal density. These are far stronger guarantees than the best available a priori bounds.

3. The Spannogram Framework

In this section, we describe how our constant-rank solver operates by examining candidate vectors in a low-dimensional span of A. Here, we work on a rank-d matrix

    A_d = v_1 u_1^T + ... + v_d u_d^T,  where u_i = λ_i v_i,

and we wish to solve:

    max_{|X|=k, |Y|=k} 1_X^T (v_1 u_1^T + ... + v_d u_d^T) 1_Y.   (1)

Observe that we can rewrite (1) in the following way:

    max_{|Y|=k} max_{|X|=k} 1_X^T [ v_1 (u_1^T 1_Y) + ... + v_d (u_d^T 1_Y) ]
  = max_{|Y|=k} max_{|X|=k} 1_X^T v_Y,   (2)

where c_i = u_i^T 1_Y and v_Y = c_1 v_1 + ... + c_d v_d is an n-dimensional vector generated by the d-dimensional subspace spanned by v_1, ..., v_d.

We will now make a key observation: for every fixed vector v_Y in (2), the index set X that maximizes 1_X^T v_Y can be easily computed. It is not hard to see that for any fixed vector v_Y, the k-subset X that maximizes 1_X^T v_Y = Σ_{i ∈ X} [v_Y]_i corresponds to the set of the k largest signed coordinates of v_Y. That is, the locally optimal k-set is top_k(v_Y).

We now wish to find all possible locally optimal sets X. If we could possibly check all vectors v_Y, then we could find all locally optimal index sets top_k(v_Y). Let us denote by S_d the set of all k-subsets X that are optimal solutions of the inner maximization of (2) for some vector v in the span of v_1, ..., v_d:

    S_d = { top_k(v_1 c_1 + ... + v_d c_d) : c_1, ..., c_d ∈ R }.
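The key observation is easy to verify numerically: for a fixed v_Y, the inner maximization over |X| = k is solved by taking the k largest signed coordinates. A small self-contained check (our sketch; `top_k` is our helper name):

```python
from itertools import combinations
import numpy as np

def top_k(v, k):
    # indices of the k largest signed entries of v
    return set(np.argpartition(-v, k - 1)[:k])

rng = np.random.default_rng(1)
n, d, k = 8, 2, 3
V = rng.standard_normal((n, d))      # stand-ins for v_1, ..., v_d
c = rng.standard_normal(d)           # coefficients c_1, ..., c_d
v_Y = V @ c                          # some vector in span(v_1, ..., v_d)

# Brute-force the inner maximization max_{|X|=k} 1_X^T v_Y:
best_X = max(combinations(range(n), k), key=lambda X: v_Y[list(X)].sum())
print(set(best_X) == top_k(v_Y, k))
```

Since selecting the k largest entries is a partial-sort/selection problem, each such inner maximization costs only O(n).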

Clearly, this set contains all possible locally optimal X sets of the form top_k(v_Y). Therefore, we can rewrite DkBS on A_d as

    max_{|Y|=k} max_{X ∈ S_d} 1_X^T A_d 1_Y.   (3)

The above problem can now be solved in the following way: for every set X ∈ S_d, find the locally optimal set Y that maximizes 1_X^T A_d 1_Y, that is, top_k(A_d^T 1_X). Then, we simply need to test all such (X, Y) pairs on A_d and keep the optimizer. Due to the above, the problem of solving DkBS on A_d is equivalent to constructing the set of k-supports S_d, and then finding the optimal solution in that set. How large can S_d be, and can we construct it in polynomial time? Initially, one could expect that the set S_d could have size as large as (n choose k). Instead, we show that the set S_d is tremendously smaller.

Lemma 4. The set S_d has size at most O(n^d) and can be built in time O(n^{d+1}) using Algorithm 2.

3.1. Constructing the set S_d

We build up to the general rank-d algorithm by explaining special cases that are easier to understand.

Rank-1 case. We start with the d = 1 case, where we have S_1 = {top_k(c_1 v_1) : c_1 ∈ R}. It is not hard to see that there are only two supports to include in S_1: top_k(v_1) and top_k(−v_1). These two sets can be constructed in time O(n), via a partial sorting and selection algorithm (Cormen et al., 2001). Hence, S_1 has size 2 and can be constructed in time O(n).

Rank-2 case. This is the first non-trivial case, which exhibits the details of the Spannogram algorithm. Let φ ∈ [0, π) be an auxiliary angle and let c = [c_1, c_2]^T = [sin φ, cos φ]^T.³ Then, we re-express c_1 v_1 + c_2 v_2 in terms of

    v(φ) = sin(φ) v_1 + cos(φ) v_2.   (4)

This means that we can rewrite the set S_2 as:

    S_2 = { top_k(±v(φ)) : φ ∈ [0, π) }.

Observe that each element of v(φ) is a continuous spectral curve in φ:

    [v(φ)]_i = [v_1]_i sin(φ) + [v_2]_i cos(φ).

Consequently, the top/bottom-k supports of v(φ) (i.e., top_k(±v(φ))) are themselves a function of φ. How can we find all possible supports?
³ Observe that when we scan φ, the vectors c, −c express all possible unit-norm vectors on the circle.

[Figure 1 here: the five curves [v(φ)]_1, ..., [v(φ)]_5 plotted over φ ∈ [0, π).]

Figure 1. A rank d = 2 spannogram for n = 5 and two random vectors v_1, v_2. Observe that every two curves intersect in exactly one point. These intersection points define intervals in which a top-k set is invariant.

The Spannogram. In Fig. 1, we draw an example plot of five curves [v(φ)]_i, i = 1, ..., 5, which we call a spannogram. From the spannogram in Fig. 1, we can see that the continuity of these sinusoidal curves implies a local invariance property of the top/bottom-k supports top_k(±v(φ)) in a small neighborhood around a fixed φ. So, when does a top/bottom-k support change? The index sets top_k(±v(φ)) change if and only if two curves cross, i.e., when the ordering of two elements [v(φ)]_i, [v(φ)]_j changes.

Finding all supports: There are n curves and each pair intersects at exactly one point in the φ domain.⁴ Therefore, there are exactly (n choose 2) intersection points. These (n choose 2) intersection points define (n choose 2) + 1 intervals. Within an interval, the top/bottom-k supports top_k(±v(φ)) remain the same. Hence, it is now clear that |S_2| ≤ 2[(n choose 2) + 1] = O(n²).

A way to find all supports in S_2 is to compute the v(φ_{i,j}) vectors at the intersection points of all pairs of curves i, j, and then the supports in the two adjacent intervals of each intersection point. The v(φ_{i,j}) vector at an intersection point of two curves i and j can be easily computed by first solving the linear equation

    [v(φ_{i,j})]_i = [v(φ_{i,j})]_j  ⇒  (e_i − e_j)^T [v_1 v_2] c_{i,j} = 0_{2×1}

for the unknown vector c_{i,j}, where e_i is the i-th column of the n × n identity matrix, i.e., c_{i,j} = nullspace((e_i − e_j)^T [v_1 v_2]). Then, we compute v(φ_{i,j}) = [v_1 v_2] c_{i,j}. Further details on breaking ties in top_k(v(φ_{i,j})) can be found in the supplemental material.

Computational cost: We have (n choose 2) intersection points, where we calculate the top/bottom-k supports for each v(φ_{i,j}). The top/bottom-k elements of every v(φ_{i,j}) can be computed in time O(n) using a partial sorting and selection algorithm (Cormen et al., 2001). Since we perform this routine a total of O(n²) times, the total complexity of our rank-2 algorithm is O(n³).

⁴ Here we assume that the curves are in general position. This can always be accomplished by infinitesimally perturbing the curves, as in (Papailiopoulos et al., 2013).

General rank-d case. The algorithm generalizes to arbitrary dimension d, as we show in the supplemental material; its pseudo-code is given as Algorithm 2.

Remark 3. Observe that each iteration of the loop in line 2 of Algorithm 2 can be computed in parallel. This will allow us to parallelize the Spannogram.

Algorithm 2 Spannogram_d(k, V_d)
1: S_d = ∅
2: for all (i_1, ..., i_d) ∈ {1, ..., n}^d and s ∈ {−1, 1} do
3:   c = s · nullspace([ [V_d]_{i_1,:} − [V_d]_{i_2,:} ; ... ; [V_d]_{i_1,:} − [V_d]_{i_d,:} ])
4:   v = V_d c
5:   S = top_k(v)
6:   T = S \ {i_1, ..., i_d}
7:   for all subsets J of {i_1, ..., i_d} do
8:     S_d = S_d ∪ {T ∪ J}
9:   end for
10: end for
11: Output: S_d

3.2. An approximate S_d in nearly-linear time

In our exact solver, we solve DkBS on A_d in time O(n^{d+1}). Surprisingly, when A_d has only positive eigenvalues, we can tightly approximate DkBS on A_d in nearly linear time.

Theorem 3. Let the d largest eigenvalues of the graph be positive and let λ_d/λ_{d+1} = Θ(1). Then, using Algorithm 3 as a subroutine in Algorithm 1 yields, with high probability, a subgraph with density (1 − 1/log n)·opt_d in nearly-linear time O(m log n + n log^{d+1} n), where m is the number of edges.

The main idea is that instead of checking all O(n^d) possible sets in S_d, we can approximately solve the problem by randomly sampling O(log^{d+1} n) vectors in the span of v_1, ..., v_d. Our proof is based on the fact that we can approximate the surface of the d-dimensional sphere with M = O(log^{d+1} n) randomly sampled vectors from the span of v_1, ..., v_d, which then allows us to identify, with high probability, near-optimal candidates in S_d. The modified algorithm is very simple and is given below; its analysis can be found in the supplemental material.

Algorithm 3 Spannogram_approx(k, V_d, Λ_d)
1: for i = 1 : 2 log^{d+1} n do
2:   v = (Λ_d^{1/2} V_d^T)^T randn(d, 1)
3:   S_d = S_d ∪ top_k(v) ∪ top_k(−v)
4: end for
5: Output: S_d
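The rank-2 construction can be sketched end-to-end in a few lines of numpy. This is our illustrative reimplementation, not the authors' code: instead of solving for the two adjacent intervals of each crossing explicitly, it samples v(φ) slightly to each side of every crossing angle φ_{i,j}, which visits every interval when the crossings are in general position. On a small instance the result then matches exhaustive search over all (X, Y) pairs.

```python
from itertools import combinations
import numpy as np

def top_k(v, k):
    # indices of the k largest signed entries, as a canonical tuple
    return tuple(sorted(np.argpartition(-v, k - 1)[:k]))

def spannogram_rank2_candidates(V, k, eps=1e-7):
    """Candidate top-k supports of v(phi) = sin(phi) v_1 + cos(phi) v_2.
    Supports only change where curves i and j cross, so we sample both
    sides of every crossing angle, for both signs of v."""
    n = V.shape[0]
    cands = set()
    for i, j in combinations(range(n), 2):
        r = V[i] - V[j]                  # crossing: r[0] sin + r[1] cos = 0
        phi = np.arctan2(-r[1], r[0])
        for p in (phi - eps, phi + eps):
            v = V @ np.array([np.sin(p), np.cos(p)])
            cands.add(top_k(v, k))
            cands.add(top_k(-v, k))
    return cands

def dkbs_rank2(A, k):
    """max over |X|=|Y|=k of 1_X^T A_2 1_Y, via the rank-2 spannogram."""
    lam, V = np.linalg.eigh(A)
    order = np.argsort(-np.abs(lam))[:2]
    A2 = V[:, order] @ np.diag(lam[order]) @ V[:, order].T
    best = -np.inf
    for X in spannogram_rank2_candidates(V[:, order], k):
        Y = top_k(A2[list(X)].sum(axis=0), k)   # optimal Y for this X
        best = max(best, A2[np.ix_(list(X), list(Y))].sum())
    return best

rng = np.random.default_rng(7)
M = rng.standard_normal((7, 7))
A = (M + M.T) / 2          # any symmetric matrix works for the rank-2 solver
val = dkbs_rank2(A, 3)
```

The O(n²) candidate sets come from the n(n−1)/2 crossings, matching the |S_2| bound above; the per-candidate work is the O(n) selection step.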
4. Scaling up

In this section, we present the two key scalability features that allow us to scale up to graphs with billions of edges.

4.1. Vertex Sparsification

We introduce a very simple and efficient pre-processing step for discarding vertices that are unlikely to appear in a top-k set in S_d. This step runs after we compute A_d and uses the leverage score, ℓ_i = Σ_{j=1}^{d} [V_d]²_{i,j} λ_j, of the i-th vertex to decide whether we will discard it or not. We show in the supplemental material that, by appropriately setting a threshold, we can guarantee a provable bound on the introduced error. In our experimental results, the above elimination is able to reduce n to approximately n̂ ≈ n/10 for a provably small additive error, even for data sets where n = 10^8.

4.2. MapReduce Implementation

A MapReduce implementation allows scaling out to a large number of compute nodes that can work in parallel. The reader can refer to (Meng & Mahoney; Bahmani et al., 2012) for a comprehensive treatment of the MapReduce paradigm. In short, the Hadoop/MapReduce infrastructure stores the input graph as a distributed file spread across multiple machines; it provides a tuple-streaming abstraction, where each map and reduce function receives and emits tuples as (key, value) pairs. The role of the keys is to ensure information aggregation: all the tuples with the same key are processed by the same reducer.

For the spectral decomposition step of our scheme, we design a simple implementation of the power method in MapReduce. The details are beyond the scope of this work; high-performance implementations are already available in the literature, e.g. (Lin & Schatz, 2010). We instead focus on the novel implementation of the Spannogram. Our MapReduce implementation of the rank-2 Spannogram is outlined in Algorithm 4. The Mapper is responsible for the duplication and dissemination of the eigenvectors, V_2 and U_2 = V_2 Λ_2, to all reducers. Line 3 emits the j-th row of V_2 and U_2 once for every node i.
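Vertex sparsification can be sketched as follows. This is our reading of the leverage-score formula, weighting the squared eigenvector entries by the eigenvalue magnitudes; the "keep the top fraction" rule below is purely illustrative, since the paper sets a threshold with a provable error bound:

```python
import numpy as np

def leverage_scores(A, d):
    """l_i = sum_j |lambda_j| * [V_d]_{i,j}^2 over the top-d (in
    magnitude) eigenpairs of A."""
    lam, V = np.linalg.eigh(A)
    order = np.argsort(-np.abs(lam))[:d]
    return (V[:, order] ** 2) @ np.abs(lam[order])

# Toy graph: a 4-clique (vertices 0-3) plus a disjoint 3-path (4-5-6).
A = np.zeros((7, 7))
for i, j in [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (4, 5), (5, 6)]:
    A[i, j] = A[j, i] = 1

scores = leverage_scores(A, d=1)
keep = np.argsort(-scores)[:4]   # sparsify: keep the highest-leverage vertices
```

On this toy graph, the leading eigenvector concentrates on the clique, so the low-leverage path vertices are the ones discarded, which is exactly the intended behavior: vertices unlikely to appear in a dense k-subgraph are eliminated before the spannogram runs.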
Since i is use as the ey, this ensures that every reucer receives V 2, U 2 in their entirety. From the breaown of the Spannogram in Section 3, it is unerstoo that, for the ran-2 case, it su ces to solve a simple system of equations for every pair of noes. The Reucer for noe i receives the full eigenvectors V 2, U 2 an is responsible for solving the problem for every pair (i, j), where j>i. Then, Line 6 emits the best caniate compute at Reucer i. A

trivial final step, not outlined here, collects all candidate sets and keeps the best one as the final solution.

Figure 2. Planted clique experiments for random graphs G(n, 1/2, k = 3√n). (a) Densities of the subgraphs recovered by G-Feige, G-Ravi, TPower, and the Spannogram vs. the expected number of edges |E| (from 10^4 to 10^10). (b) Running times of the Spannogram and of the power iteration for the two top eigenvectors on 2.5 billion edges, on 240, 400, and 800 cores.

The basic outline in Algorithm 4 comes with heavy communication needs and was chosen here for ease of exposition. The more efficient version that we implemented does not replicate V_2, U_2 n times. Instead, the number of reducers, say R = n^δ, is fine-tuned to the capabilities of the cluster. The mappers emit V_2, U_2 R times, once for every reducer. Then, reducer r is responsible for solving for node pairs (i, j), where i ≡ r (mod R) and j > i. Depending on the performance bottleneck, different choices for δ are more appropriate. We divide the construction of the O(n^2) candidate sets in S_2 to O(n^δ) reducers and each of them computes O(n^{2−δ}) candidate subgraphs. The total communication cost for this parallelization scheme is O(n^{1+δ}): the n^δ reducers need to have access to the entire V_2, U_2, which have 2 · 2 · n entries in total. Moreover, the total computation cost for each reducer is O(n^{3−δ}).

Algorithm 4 SpannogramMR(V_2, U_2)
1: Map({[V_2]_{j,:}, [U_2]_{j,:}, j}):
2: for i = 1 : n do
3:   emit ⟨i, {[V_2]_{j,1}, [V_2]_{j,2}, [U_2]_{j,1}, [U_2]_{j,2}, j}⟩
4: end for
1: Reduce_i(⟨i, {[V_2]_{j,1}, [V_2]_{j,2}, [U_2]_{j,1}, [U_2]_{j,2}, j}⟩, ∀j):
2: for each j ≥ i + 1 do
3:   c = nullspace([V_2]_{i,:} − [V_2]_{j,:})
4:   [den_j, {X_j, Y_j}] = max_{|Y| = k, X ∈ top_k(±V_2 c)} 1_X^T V_2 U_2^T 1_Y
5: end for
6: emit ⟨i, {X_i, Y_i}⟩ = max_j 1_{X_j}^T V_2 U_2^T 1_{Y_j}

5.
Experimental Evaluation

We experimentally evaluate the performance of our algorithm and compare it to the truncated power method (TPower) of (Yuan & Zhang, 2011), a greedy algorithm by (Feige et al., 2001) (G-Feige) and another greedy algorithm by (Ravi et al., 1994) (G-Ravi). We performed experiments on synthetic dense subgraphs and also massive real graphs from multiple sources. In all experiments we compare the density of the subgraph obtained by the Spannogram to the density of the output subgraphs given by the other algorithms. Our experiments illustrate three key points: (1) for all tested graphs, our method outperforms, sometimes significantly, all other algorithms compared; (2) our method is highly scalable, allowing us to solve far larger problem instances; (3) our data-dependent upper bound in many cases provides a certificate of near-optimality, far more accurate and useful than what a priori bounds are able to offer.

Planted clique. We first consider the so-called (and now much studied) Planted Clique problem: we seek to find a clique of size k that has been planted in a graph where all other edges are drawn independently with probability 1/2. We scale our randomized experiments from n = 100 up to n = 10^5. In all cases we set the size of the clique to k = 3√n, close to what is believed to be the critical computability threshold. In all our experiments, G-Ravi, TPower, and the Spannogram successfully recovered the hidden clique. However, as can be seen in Fig. 2, the Spannogram algorithm is the only one able to scale up to n = 10^5, a massive dense graph with about 2.5 billion edges. The reason is that this graph does not fit in the main memory of one machine and caused all centralized algorithms to crash after several hours. Our MapReduce implementation scales out smoothly, since it splits the problem over multiple smaller problems solved in parallel. Specifically, we used the Amazon Web Services Elastic MapReduce framework (aws). We implemented our map and reduce functions in Python and used the MRJob class (mrj).
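The division of work among reducers described in Section 4.2 (reducer r handles all node pairs (i, j) with i ≡ r (mod R) and j > i) is easy to simulate in plain Python. The sketch below is our own illustration of the partitioning only, not the actual MRJob code:

```python
from itertools import combinations

def partition_pairs(n, R):
    """Assign each node pair (i, j), j > i, to reducer r = i mod R."""
    buckets = {r: [] for r in range(R)}
    for i, j in combinations(range(n), 2):
        buckets[i % R].append((i, j))
    return buckets

# With n = 10 nodes and R = 4 reducers, the 45 pairs are split into
# disjoint buckets; each reducer only needs V_2, U_2 plus its bucket.
buckets = partition_pairs(n=10, R=4)
```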
For our biggest experiments we used a 100-machine strong cluster, consisting of m1.xlarge AWS instances (a total of 800 cores). The running times of our experiments over MapReduce are shown in Fig. 2(b). The main bottleneck is the computation of the first two eigenvectors, which is performed by repeating the power iteration for a few (typically 4) iterations. This step is not the emphasis of this work and has not been optimized. The Spannogram algorithm is significantly faster, and the benefits of parallelization are clear since it is CPU intensive. In principle, the other algorithms could also be implemented over MapReduce, but that requires non-trivial distributed algorithm design. As is well known, e.g., (Meng & Mahoney), implementing iterative machine learning algorithms over MapReduce can be a significant task, and schemes which perform worse in standard metrics can be highly preferable for this parallel

framework. Careful MapReduce algorithmic design is needed, especially for dense graphs like the one in the hidden clique problem.

Figure 3. Subgraph density vs. subgraph size (k). We compare our DkS Spannogram algorithm with the algorithms from (Feige et al., 2001) (G-Feige), (Ravi et al., 1994) (G-Ravi), and (Yuan & Zhang, 2011) (TPower). Across all subgraph sizes, we obtain higher subgraph densities using Spannograms of rank d = 2 or 5. We also obtain a provable data-dependent upper bound (solid black line) on the objective. This proves that for these data sets, our algorithm is typically within 80% from optimality, for all sizes up to k = 250, and indeed for small subgraph sizes we find a clique which is clearly optimal. Further experiments on multiple other data sets are shown in the supplemental material.

Real Datasets. Next, we demonstrate our method's performance on real datasets and also illustrate the power of our data-dependent bounds. We ran experiments on large graphs from different applications and our findings are presented in Fig. 3. The figure compares the density achieved by the Spannogram algorithm for ranks 1, 2 and 5 to the performance of G-Feige, G-Ravi and TPower. The figure shows that the rank-2 and rank-5 versions of our algorithm improve, sometimes significantly, over the other techniques. Our novel data-dependent upper bound shows that our results on these data sets are provably near-optimal. The experiments are performed on two community graphs (com-LiveJournal and com-DBLP), a web graph (web-NotreDame), and a subset of the Facebook graph. A larger set of experiments is included in the supplemental material. Note that the largest graph in Figure 3 contains no more than 35 million edges; these cases fit in the main memory of a single machine, and the running times are presented in the supplemental material, all performed on a standard MacBook Pro laptop using Matlab.
In summary, rank-2 took less than one second for all these graphs, while prior work methods took approximately the same time, up to a few seconds. Rank-1 was significantly faster than all other methods on all tested graphs and took fractions of a second. Rank-5 took up to 1000 seconds for the largest graph (LiveJournal). We conclude that our algorithm is an efficient option for finding dense subgraphs. Different rank choices give a tradeoff between accuracy and performance, while the parallel nature allows scalability when needed. Further, our theoretical upper bound can be useful for practitioners investigating dense structures in large graphs.

6. Acknowledgments

The authors would like to acknowledge support from NSF grants CCF 1344364, CCF 1344179, DARPA XDATA, and research gifts by Google, Docomo and Microsoft.

References

Amazon Web Services, Elastic MapReduce. URL http://aws.amazon.com/elasticmapreduce/.

MRJob. URL http://pythonhosted.org/mrjob/.

Alon, Noga, Lee, Troy, Shraibman, Adi, and Vempala, Santosh. The approximate rank of a matrix and its algorithmic applications. In Proceedings of the 45th Annual ACM Symposium on Theory of Computing, pp. 675-684. ACM, 2013.

Ames, Brendan P. W. Convex relaxation for the planted clique, biclique, and clustering problems. PhD thesis, University of Waterloo, 2011.

Arora, Sanjeev, Karger, David, and Karpinski, Marek. Polynomial time approximation schemes for dense instances of NP-hard problems. In STOC, 1995.

Asahiro, Yuichi, Iwama, Kazuo, Tamaki, Hisao, and Tokuyama, Takeshi. Greedily finding a dense subgraph. Journal of Algorithms, 34(2):203-221, 2000.

Asteris, Megasthenis, Papailiopoulos, Dimitris S., and Karystinos, George N. The sparse principal component of a constant-rank matrix. IEEE Trans. IT, 60(4):2281-2290, 2014.

Bahmani, Bahman, Kumar, Ravi, and Vassilvitskii, Sergei. Densest subgraph in streaming and MapReduce. Proceedings of the VLDB Endowment, 5(5):454-465, 2012.

Bhaskara, Aditya, Charikar, Moses, Chlamtac, Eden, Feige, Uriel, and Vijayaraghavan, Aravindan. Detecting high log-densities: an O(n^{1/4}) approximation for densest k-subgraph. In STOC, 2010.

Boutsidis, Christos, Mahoney, Michael W., and Drineas, Petros. An improved approximation algorithm for the column subset selection problem. In Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 968-977. Society for Industrial and Applied Mathematics, 2009.

Cormen, Thomas H., Leiserson, Charles E., Rivest, Ronald L., and Stein, Clifford. Introduction to Algorithms. MIT Press, 2001.

d'Aspremont, Alexandre et al. Weak recovery conditions using graph partitioning bounds. 2010.

Dourisboure, Yon, Geraci, Filippo, and Pellegrini, Marco. Extraction and classification of dense communities in the web. In WWW, 2007.
Feige, Uriel and Langberg, Michael. Approximation algorithms for maximization problems arising in graph partitioning. Journal of Algorithms, 41(2):174-211, 2001.

Feige, Uriel, Peleg, David, and Kortsarz, Guy. The dense k-subgraph problem. Algorithmica, 29(3):410-421, 2001.

Gibson, David, Kumar, Ravi, and Tomkins, Andrew. Discovering large dense subgraphs in massive graphs. In PVLDB, 2005.

Hu, Haiyan, Yan, Xifeng, Huang, Yu, Han, Jiawei, and Zhou, Xianghong Jasmine. Mining coherent dense subgraphs across massive biological networks for functional discovery. Bioinformatics, 21(suppl 1):i213-i221, 2005.

Jethava, Vinay, Martinsson, Anders, Bhattacharyya, Chiranjib, and Dubhashi, Devdatt. The Lovász theta function, SVMs and finding large dense subgraphs. In NIPS, 2012.

Karystinos, George N. and Liavas, Athanasios P. Efficient computation of the binary vector that maximizes a rank-deficient quadratic form. IEEE Trans. IT, 56(7):3581-3593, 2010.

Khot, Subhash. Ruling out PTAS for graph min-bisection, densest subgraph and bipartite clique. In FOCS, 2004.

Lin, Jimmy and Schatz, Michael. Design patterns for efficient graph algorithms in MapReduce. In Proceedings of the Eighth Workshop on Mining and Learning with Graphs, pp. 78-85. ACM, 2010.

Mahoney, Michael W. and Drineas, Petros. CUR matrix decompositions for improved data analysis. Proceedings of the National Academy of Sciences, 106(3):697-702, 2009.

Meng, Xiangrui and Mahoney, Michael W. Robust regression on MapReduce. ICML 2013, (to appear).

Miller, B., Bliss, N., and Wolfe, P. Subgraph detection using eigenvector L1 norms. In NIPS, 2010.

Papailiopoulos, Dimitris S., Dimakis, Alexandros G., and Korokythakis, Stavros. Sparse PCA through low-rank approximations. arXiv preprint arXiv:1303.0551, 2013.

Ravi, Sekharipuram S., Rosenkrantz, Daniel J., and Tayi, Giri K. Heuristic and special case algorithms for dispersion problems. Operations Research, 42(2):299-310, 1994.

Saha, Barna, Hoch, Allison, Khuller, Samir, Raschid, Louiqa, and Zhang, Xiao-Ning.
Dense subgraphs with restrictions and applications to gene annotation graphs. In Research in Computational Molecular Biology, pp. 456-472. Springer, 2010.

Srivastav, Anand and Wolf, Katja. Finding dense subgraphs with semidefinite programming. Springer, 1998.

Suzuki, Akiko and Tokuyama, Takeshi. Dense subgraph problems with output-density conditions. In Algorithms and Computation, pp. 266-276. Springer, 2005.

Yuan, Xiao-Tong and Zhang, Tong. Truncated power method for sparse eigenvalue problems. arXiv preprint arXiv:1112.2679, 2011.

Supplemental Material for: Finding Dense Subgraphs via Low-Rank Bilinear Optimization

1. Proof of Lemma 4: Building the set S_d for arbitrary d-dimensional subspaces

In our general case, we solve DBS on

A_d = V U^T = Σ_{i=1}^{d} v_i u_i^T,

where V = [v_1 ... v_d] and U = [λ_1 v_1 ... λ_d v_d]. Solving the problem on A_d is equivalent to answering the following combinatorial question: how many different top-k supports top_k(c_1 v_1 + ... + c_d v_d) are there in a d-dimensional subspace?

Here we define d−1 auxiliary angles φ_1, ..., φ_{d−1} ∈ Φ^{d−1} = [0, π)^{d−1} and we rewrite the coefficients c_1, ..., c_d as

c = [c_1, ..., c_d]^T = [sin φ_1, cos φ_1 sin φ_2, ..., cos φ_1 cos φ_2 ⋯ sin φ_{d−1}, cos φ_1 cos φ_2 ⋯ cos φ_{d−1}]^T.

Clearly we can express every vector in the span of V as a linear combination c_1 v_1 + ... + c_d v_d in terms of φ:

v(φ_1, ..., φ_{d−1}) = (sin φ_1) v_1 + (cos φ_1 sin φ_2) v_2 + ... + (cos φ_1 cos φ_2 ⋯ cos φ_{d−1}) v_d.  (1)

For notational simplicity let us define a vector φ = [φ_1, ..., φ_{d−1}] that contains all d−1 auxiliary phase variables.

We can use the above derivations to rewrite the set S_d that contains all top-k coordinates in the span of V as:

S_d = {top_k(c_1 v_1 + ... + c_d v_d) : c_1, ..., c_d ∈ R}
    = {top_k(±v(φ)) : φ ∈ Φ^{d−1}}
    = {top_k(±((sin φ_1) v_1 + (cos φ_1 sin φ_2) v_2 + ... + (cos φ_1 cos φ_2 ⋯ cos φ_{d−1}) v_d)) : φ ∈ Φ^{d−1}}.

Observe again that each element of v(φ) is a continuous spectral curve in the d−1 auxiliary variables:

[v(φ)]_i = (sin φ_1) [v_1]_i + (cos φ_1 sin φ_2) [v_2]_i + ... + (cos φ_1 cos φ_2 ⋯ cos φ_{d−1}) [v_d]_i.

Consequently, the top/bottom-k supports of v(φ) (i.e., top_k(±v(φ))) are themselves a function of the d−1 variables in φ. How can we find all possible supports?

Remark 1. In our general problem we wish to find all top and bottom k coordinates that appear in a d-dimensional subspace. In the following discussion, for simplicity we handle the top-k coordinates problem.
Finding the bottom-k trivially follows, by just checking the k smallest coordinates of each vector c_1 v_1 + ... + c_d v_d that we construct using our algorithm.
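The spherical parameterization in (1) can be checked numerically. The following sketch (our own illustration, not the authors' code) maps the d−1 angles to the coefficient vector c and confirms that c always lies on the unit sphere, so v(φ) = V c sweeps all directions in span(V):

```python
import numpy as np

def coeffs_from_angles(phi):
    """Map d-1 angles to the d spherical coefficients
    c = [sin p1, cos p1 sin p2, ..., cos p1 ... cos p_{d-1}]."""
    d = len(phi) + 1
    c = np.empty(d)
    prefix = 1.0  # running product cos p1 * ... * cos p_{m-1}
    for m, angle in enumerate(phi):
        c[m] = prefix * np.sin(angle)
        prefix *= np.cos(angle)
    c[-1] = prefix
    return c

phi = np.array([0.3, 1.1, 2.0])   # d = 4
c = coeffs_from_angles(phi)
# ||c|| = 1 for every phi, so c covers the unit sphere in R^d.
```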

1.1. Ranking regions for a single coordinate [v(φ)]_i

We now show that for each single coordinate [v(φ)]_i, we can partition Φ^{d−1} into regions wherein the i-th coordinate [v(φ)]_i retains the same ranking relative to the other n−1 coordinates in the vector v(φ).

Let us first consider for simplicity [v(φ)]_1. We aim to find all values of φ where [v(φ)]_1 is one of the k largest coordinates of v(φ). We observe that this region can be characterized by using n−1 boundary tests:

[v(φ)]_1 ≷ [v(φ)]_2
[v(φ)]_1 ≷ [v(φ)]_3
...
[v(φ)]_1 ≷ [v(φ)]_n

Each of the above boundary tests defines a bounding curve that partitions the Φ^{d−1} domain. We refer to this bounding curve as B_{1,j}(φ) : Φ^{d−1} ↦ Φ^{d−2}. A B_{1,j}(φ) curve partitions Φ^{d−1} and defines two regions of φ angles:

R_{1>j} = {φ ∈ Φ^{d−1} : [v(φ)]_1 > [v(φ)]_j}  and  R_{1≤j} = {φ ∈ Φ^{d−1} : [v(φ)]_1 ≤ [v(φ)]_j}  (2)

such that R_{1>j} ∪ R_{1≤j} = Φ^{d−1}. Observe that these n−1 curves B_{1,2}(φ), ..., B_{1,n}(φ) partition Φ^{d−1} into T disjoint cells C_1^1, ..., C_T^1, such that ∪_{i=1}^{T} C_i^1 = Φ^{d−1}.

Within each cell C_i^1, the first coordinate [v(φ)]_1 retains a fixed ranking relative to the rest of the elements in v(φ); e.g., in a specific cell it might be the largest element, and in another cell it might be the 10th smallest, etc. This happens because for all values of φ in a single cell, the respective orderings [v(φ)]_1 ≷ [v(φ)]_2, ..., [v(φ)]_1 ≷ [v(φ)]_n remain the same. If we have access to a single point, say φ_0, that belongs to a specific cell, say C_j^1, then we can calculate v(φ_0) and find the ranking of the first coordinate [v(φ)]_1, which remains invariant for all φ ∈ C_j^1. Hence, if we visit all these cells, then we can find all possible rankings that the first coordinate [v(φ)]_1 takes in the d-dimensional span of v_1, ..., v_d.
In the following subsections, we show that the number of these cells is bounded as T ≤ 2 · (n−1 choose d−1).

Observe that each bounding curve B_{1,j}(φ) has a one-to-one correspondence to an equation [v(φ)]_1 = [v(φ)]_j, which is linear in c:

[v(φ)]_1 = [v(φ)]_j ⇒ e_1^T V c − e_j^T V c = 0 ⇒ (e_1 − e_j)^T V c = 0.  (3)

Due to their linear characterization with respect to c, it is easy to see that each (d−1)-tuple of bounding curves intersects at a single point in Φ^{d−1}:¹

[v(φ)]_1 = [v(φ)]_{i_1}, ..., [v(φ)]_1 = [v(φ)]_{i_{d−1}} ⇒ [(e_1 − e_{i_1})^T; (e_1 − e_{i_2})^T; ...; (e_1 − e_{i_{d−1}})^T] V c = 0_{(d−1)×1}.

Let us denote the solution of the above linear inverse problem as c_{1,i_1,...,i_{d−1}}. We refer to c_{1,i_1,...,i_{d−1}} as an intersection vector. For each intersection vector c_{1,i_1,...,i_{d−1}}, we can compute its polar expression and solve for the angles φ that generate it. These d−1 input angles correspond exactly to the intersection point of the d−1 curves specified by the above d−1 equations. We denote the d−1 angles that generate c_{1,i_1,...,i_{d−1}} as φ_{1,i_1,...,i_{d−1}}, which we refer to as the intersection point of the d−1 curves B_{1,i_1}(φ), ..., B_{1,i_{d−1}}(φ).

¹As a matter of fact, due to the sign ambiguity of the solution, this corresponds to two intersection points. However, the following discussion omits this technical detail for simplicity.
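Numerically, an intersection vector is a one-dimensional nullspace, which can be read off the SVD. A minimal sketch (our own, with d = 3 and hypothetical indices) of solving the system (e_1 − e_{i_m})^T V c = 0:

```python
import numpy as np

def intersection_vector(V, i0, others):
    """Unit vector c with (e_{i0} - e_j)^T V c = 0 for every j in `others`."""
    M = V[i0] - V[list(others)]  # (d-1) x d system; row m is (e_{i0} - e_{i_m})^T V
    _, _, vt = np.linalg.svd(M)
    return vt[-1]                # right singular vector of the zero singular value

rng = np.random.default_rng(1)
V = rng.standard_normal((8, 3))          # n = 8, d = 3
c = intersection_vector(V, 0, [3, 5])
v = V @ c
# Coordinates 0, 3 and 5 of v coincide at this intersection point, as (3) requires.
```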

Since the φ_{1,i_1,...,i_{d−1}} intersection points are defined for every d−1 curves, the total number of intersection points is (n−1 choose d−1). In the following subsections, we show how we can visit all cells by just examining these intersection points. We proceed to show that if we visit the adjacent cells of the intersection points defined for all coordinates, then we can find all top-k supports in the span of V.

1.2. Visiting all cells = finding all top-k supports

Our goal is to find all top-k supports that can appear in the span of V. To do so, it is sufficient to visit the cells where [v(φ)]_1 is the k-th largest coordinate, then the cells where [v(φ)]_2 is the k-th largest, and so on. Within such cells, one coordinate (say [v(φ)]_i) always remains the k-th largest, while the identities of the bottom n−k coordinates remain the same. This means that in such a cell, we have [v(φ)]_i ≥ [v(φ)]_{j_1}, ..., [v(φ)]_i ≥ [v(φ)]_{j_{n−k}} for all φ in that cell and some specific n−k other coordinates indexed by j_1, ..., j_{n−k}. Hence, although the sorting of the top k−1 elements might change within that cell (i.e., the first might become the second largest, and vice versa), the coordinates that participate in the top k−1 support will be the same, while at the same time the k-th largest will be [v(φ)]_i.

Hence, for each coordinate [v(φ)]_i, we need to visit the cells wherein it is the k-th largest. We do this by examining all cells wherein [v(φ)]_i retains a fixed ranking. Visiting all these cells (T for each coordinate) is possible by visiting all (n−1 choose d−1) intersection points of the B_{i,j}(φ) curves as defined earlier. Since we know that each cell is adjacent to at least one intersection point, at each of these points we visit all adjacent cells.
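The cell-visiting procedure described above can be rendered as a short, unoptimized NumPy enumeration. This is our own sketch, not the paper's implementation: it enumerates the (n choose d) intersection vectors, and at each one expands the tied coordinates into all admissible top-k supports:

```python
import numpy as np
from itertools import combinations

def spannogram_candidates(V, k):
    """Enumerate candidate top-k supports S_d for span(V), V of size n x d."""
    n, d = V.shape
    S_d = set()
    for idx in combinations(range(n), d):
        # Rows (e_{i1} - e_{im})^T V, m = 2..d: a (d-1) x d system.
        M = V[idx[0]] - V[list(idx[1:])]
        c = np.linalg.svd(M)[2][-1]          # 1-dim nullspace (general position)
        for s in (1.0, -1.0):                # sign ambiguity: +/- c
            v = V @ (s * c)
            top = np.argsort(-v)[:k]         # top-k support at this point
            T = [i for i in top if i not in idx]  # untied top coordinates
            t = k - len(T)                   # slots left for tied coordinates
            for J in combinations(idx, t):   # all t-subsets of the tied set
                S_d.add(frozenset(T) | frozenset(J))
    return S_d

rng = np.random.default_rng(2)
cands = spannogram_candidates(rng.standard_normal((6, 2)), k=3)
```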
For each cell that we visit, we compute the support of the k largest coordinates of a vector v(φ_0) with a φ_0 that lies in that cell. We include this top-k index set in S_d and carry out the same procedure for all cells. Since we visit all coordinates and all their adjacent cells, we visit all cells C_j^i, and hence this procedure will construct all possible supports in

S_d = {top_k(c_1 v_1 + ... + c_d v_d) : c_1, ..., c_d ∈ R}.

1.3. Constructing the set S_d

To visit all possible cells C_j^i, we now have to check the intersection points, which are obtained by solving the system of d−1 equations

[v(φ)]_{i_1} = [v(φ)]_{i_2} = ... = [v(φ)]_{i_d}, i.e., [v(φ)]_{i_1} = [v(φ)]_{i_2}, ..., [v(φ)]_{i_1} = [v(φ)]_{i_d}.  (4)

We can rewrite the above as

[(e_{i_1} − e_{i_2})^T; ...; (e_{i_1} − e_{i_d})^T] V c = 0_{(d−1)×1},  (5)

where the solution is the nullspace of the matrix, which has dimension 1.

To explore all possible candidate vectors, we need to visit all cells. To do so, we compute all possible (n choose d) solution intersection vectors c_{i_1,...,i_d}. On each intersection vector we need to compute the locally optimal support set top_k(V c_{i_1,...,i_d}). Then observe that the coordinates i_1, ..., i_d of V c_{i_1,...,i_d} have the same value, since they all satisfy equation (5). Let us assume that t of them appear in the set top_k(V c_{i_1,...,i_d}). Then, finding the top-k supports of all neighboring cells is equivalent to checking all different supports that can be generated by taking all (d choose t) possible t-subsets of the i_1, ..., i_d coordinates with respect to V c_{i_1,...,i_d}, while keeping the rest of the elements in V c_{i_1,...,i_d} in their original ranking, as computed in top_k(V c_{i_1,...,i_d}). This induces at most O((d choose d/2)) local sortings, i.e., at most (d choose d/2)

top-k supports. All these sortings will eventually be the elements of the set S_d. The number of all candidate support sets will now be O((n choose d) · (d choose d/2)) = O(n^d), and the total computational complexity is O(n^{d+1}), since for each point we compute the top-k support in linear time O(n). For completeness, the algorithm of the Spannogram framework that generates S_d is given below.

Algorithm 1 Spannogram Algorithm for S_d.
1: S_d = ∅
2: for all d-subsets (i_1, ..., i_d) of {1, ..., n} and s ∈ {−1, 1} do
3:   c = s · nullspace([(V)_{i_1,:} − (V)_{i_2,:}; ...; (V)_{i_1,:} − (V)_{i_d,:}])
4:   v = V c
5:   S = top_k(v)
6:   T = S \ {i_1, ..., i_d}
7:   for all (k − |T|)-subsets J of {i_1, ..., i_d} do
8:     S_d = S_d ∪ {T ∪ J}
9:   end for
10: end for
11: Output: S_d.

1.4. Resolution of singularities

In our proofs, we assume that the curves in v(φ) are in general position. This is needed so that no more than d−1 curves intersect at a single point. This assumption is equivalent to requiring that every d×d submatrix of V is full rank. This general position requirement can be handled by introducing infinitesimal perturbations in V. The details of the analysis of this method can be found in (Papailiopoulos et al., 2013).

2. Going from DkS to DBS and back

In this subsection we show how a ρ-approximation algorithm for DBS for arbitrary matrices implies a 2ρ-approximation for DkS. Our proof goes through a randomized sampling argument. At the end of this section we also show a deterministic scheme that converts a DBS algorithm to an algorithm for DkS. This deterministic translation only guarantees that a ρ-approximation algorithm for DBS for arbitrary matrices can be converted to a 4ρ-approximation for DkS. Our deterministic method uses much simpler vertex pruning techniques.
Algorithm 2 randombipartite(G)
1: L = ∅, R = ∅
2: for each vertex v in G do
3:   Z = Bernoulli(1/2)
4:   if Z == 1 then
5:     Put v in L
6:   else
7:     Put v in R
8:   end if
9: end for
10: G_B = G
11: Delete all edges in G_B(L) and G_B(R)
12: Output: G_B
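A direct NumPy transcription of Algorithm 2, using an adjacency-matrix representation of our own choosing:

```python
import numpy as np

def random_bipartite(A, rng):
    """Split vertices into L / R by fair coin flips; keep only crossing edges."""
    n = A.shape[0]
    in_L = rng.random(n) < 0.5                # Z = Bernoulli(1/2) per vertex
    cross = np.logical_xor.outer(in_L, in_L)  # True iff endpoints on opposite sides
    return A * cross, in_L

rng = np.random.default_rng(3)
A = (rng.random((20, 20)) < 0.4).astype(int)
A = np.triu(A, 1); A = A + A.T                # random symmetric adjacency matrix
A_B, in_L = random_bipartite(A, rng)          # edges inside G_B(L), G_B(R) deleted
```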

2.1. Proof of Lemma 1: Randomized Reduction

Let us denote by G(S) the subgraph in G induced by a vertex set S. Let the adjacency matrix of the bipartite graph G_B created by randombipartite(G) be

A_B = [ 0_{n_1×n_2}  B ; B^T  0_{n_2×n_1} ],

where n_1 + n_2 = n. In the following, we refer to B as the bi-adjacency matrix of the bipartite graph G_B. Moreover, we denote as L and R the two disjoint vertex sets of a bipartite graph. Before we proceed, let us state a simple property of the quadratic form of bipartite graphs.

Proposition 1. Let A_B = [ 0_{n_1×n_2}  B ; B^T  0_{n_2×n_1} ] be the adjacency matrix of a bipartite graph. Then, for any subset of vertices S, we have that S = S_l ∪ S_r, with S_l = S ∩ L and S_r = S ∩ R. Moreover,

1_S^T A_B 1_S = 2 · 1_{S_l}^T B 1_{S_r}.

Proof. It is easy to see that S_l and S_r are the vertex subsets of S that correspond to either the left or right nodes of the bipartite graph. Since the two sets are disjoint, we have 1_S = 1_{S_l} + 1_{S_r}. Then, the quadratic form on A_B can be equivalently rewritten as a bilinear form on B:

1_S^T A_B 1_S = (1_{S_l} + 1_{S_r})^T [ 0_{n_1×n_2}  B ; B^T  0_{n_2×n_1} ] (1_{S_l} + 1_{S_r}) = 1_{S_r}^T B^T 1_{S_l} + 1_{S_l}^T B 1_{S_r} = 2 · 1_{S_l}^T B 1_{S_r}.

Due to the above, we will consider the following DBS problem:

{X_B, Y_B} = arg max_{|X| = k_1, |Y| = k_2} 1_X^T B 1_Y,

where k_1 + k_2 = k. Due to Proposition 1, X_B ∩ Y_B = ∅, since the columns and rows of B index two disjoint vertex sets L and R, respectively. Then, our approximate solution with respect to the original DkS problem on A will be S_B = X_B ∪ Y_B. Clearly |S_B| = k.

Proposition 2. The density of the above approximate solution satisfies

den(S_B) = 1_{S_B}^T A 1_{S_B} ≥ 1_{S_B}^T A_B 1_{S_B} = 2 · 1_{X_B}^T B 1_{Y_B}.

Proof.
The result follows immediately from the nonnegativity of the entries in A, and the fact that A_B contains a subset of the entries of A. The last equality follows from Proposition 1.

We will now show that den(S_B) is at least opt/(2 + ε) with high probability, if we solve DBS on log n graphs independently created using randombipartite(G), and keep the best solution among all log n extracted sets S_B.

Proposition 3. Let us partition the vertices of G into two sets L, R according to randombipartite(G): we flip a fair coin for each vertex of G and put it with probability 1/2 in either of the two sets. Then, we create a bipartite graph G_B that has as left and right vertex sets the sets L and R, respectively. The edges that we maintain from the original graph are only those that cross from L to R, while we delete all edges in the subgraphs induced by the sets L and R. Then, there exists a k-subgraph in G_B that contains 0.5 · opt edges, in expectation.
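Proposition 1 (and the edge counting behind Proposition 3) is easy to sanity-check numerically. The following sketch is our own verification, not part of the paper: it draws a random graph, applies the random bipartition, and confirms that 1_S^T A_B 1_S = 2 · 1_{S_l}^T B 1_{S_r} for an arbitrary vertex subset S:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 30
A = (rng.random((n, n)) < 0.3).astype(int)
A = np.triu(A, 1); A = A + A.T                  # random symmetric adjacency matrix

in_L = rng.random(n) < 0.5                      # randombipartite(G): fair coin per vertex
A_B = A * np.logical_xor.outer(in_L, in_L)      # keep only edges crossing L / R
B = A_B[np.ix_(in_L, ~in_L)]                    # bi-adjacency block of G_B

S = rng.choice(n, size=8, replace=False)        # an arbitrary k-subgraph, k = 8
ones_S = np.zeros(n); ones_S[S] = 1
quad = ones_S @ A_B @ ones_S                    # 1_S^T A_B 1_S
bilin = 2 * (ones_S[in_L] @ B @ ones_S[~in_L])  # 2 * 1_{S_l}^T B 1_{S_r}
# quad and bilin agree, as Proposition 1 states.
```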