HIGH-DIMENSIONAL CHANGEPOINT ESTIMATION


HIGH-DIMENSIONAL CHANGEPOINT ESTIMATION VIA SPARSE PROJECTION

Richard Samworth, University of Cambridge
Joint work with Tengyao Wang

[Figure: nodes in binary segmentation algorithm; peak of projected CUSUM]

Tengyao August 13; 2/31

Heterogeneity in Big Data

One of the most commonly encountered issues with Big Data is heterogeneity. Departures from traditional, stylised i.i.d. models can take many forms, e.g. missing data, correlated errors, data combined from multiple sources, ...

In data streams, heterogeneity is manifested through non-stationarity. Perhaps the simplest model assumes population changes occur at a finite set of time points.

Changepoint estimation

Changepoint problems have a rich history (Page, 1955).

State-of-the-art univariate methods include PELT (Killick, Fearnhead and Eckley, 2012), Wild Binary Segmentation (Fryzlewicz, 2014) and SMUCE (Frick, Munk and Sieling, 2014).

Some ideas extend to multivariate settings (Horváth, Kokoszka and Steinebach, 1999; Ombao, Von Sachs and Guo, 2005; Aue et al., 2009; Kirch, Mushal and Ombao, 2015).

There is increasing interest in the high-dimensional setting, possibly with a sparsity condition on the coordinates of change (Aston and Kirch, 2014; Enikeeva and Harchaoui, 2014; Jirak, 2015; Cho and Fryzlewicz, 2015; Cho, 2016).

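Basic model

Let $X = (X_1, \ldots, X_n) \in \mathbb{R}^{p \times n}$ have independent columns $X_t \sim N_p(\mu_t, \sigma^2 I_p)$.

Assume there exist changepoints $1 \le z_1 < z_2 < \cdots < z_\nu \le n-1$ such that
$$\mu_{z_i+1} = \cdots = \mu_{z_{i+1}} =: \mu^{(i)}, \quad 0 \le i \le \nu,$$
where $z_0 := 0$ and $z_{\nu+1} := n$.

Writing $\theta^{(i)} := \mu^{(i)} - \mu^{(i-1)}$, $1 \le i \le \nu$, we assume $\exists\, k \in \{1, \ldots, p\}$ such that $\|\theta^{(i)}\|_0 \le k$ for $1 \le i \le \nu$.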
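The model above can be simulated directly. The following is a minimal sketch, not part of the talk: the function name `simulate_mean_change`, the choice of a zero pre-change mean $\mu^{(0)}$ and the example parameter values are all assumptions made for illustration.

```python
import numpy as np

def simulate_mean_change(n, p, changepoints, thetas, sigma=1.0, seed=0):
    """Draw X in R^{p x n} with independent N_p(mu_t, sigma^2 I_p) columns.

    changepoints: increasing integers z_1 < ... < z_nu in {1, ..., n-1}
    thetas: list of length-p arrays; thetas[i] is the jump at changepoints[i]
    The pre-change mean mu^(0) is set to zero (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    mu = np.zeros((p, n))
    for z, theta in zip(changepoints, thetas):
        # Columns z+1, ..., n (1-based) are shifted by theta.
        mu[:, z:] += np.asarray(theta, dtype=float)[:, None]
    X = mu + sigma * rng.standard_normal((p, n))
    return X, mu

# Example: one change at z = 40 that affects k = 3 of p = 50 coordinates.
theta = np.zeros(50)
theta[:3] = 1.0
X, mu = simulate_mean_change(n=100, p=50, changepoints=[40], thetas=[theta])
```

Here `mu` is returned alongside `X` only so that the piecewise-constant mean structure can be inspected; an estimator would see `X` alone.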

Further model assumptions

Assume stationary run lengths satisfy
$$\frac{1}{n}\min\{z_{i+1} - z_i : 0 \le i \le \nu\} \ge \tau,$$
and the magnitudes of mean changes are such that
$$\|\theta^{(i)}\|_2 \ge \vartheta, \quad 1 \le i \le \nu.$$

Let $\mathcal{P}(n, p, k, \nu, \vartheta, \tau, \sigma^2)$ be the set of distributions of such $X$.

Projection-based single changepoint estimation

Signal-plus-noise decomposition: $X = \mu + W$.

Let $\nu = 1$, write $z := z_1$, $\theta := \theta^{(1)}$ and $\tau := n^{-1}\min\{z, n - z\}$.

For any $a \in \mathbb{S}^{p-1}$, $a^\top X_t \sim N(a^\top \mu_t, \sigma^2)$.

Hence $a = \theta / \|\theta\|_2 =: v$ maximises the magnitude of the difference in means between the two segments.

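CUSUM transformation

Define the CUSUM transformation $T_{p,n} : \mathbb{R}^{p \times n} \to \mathbb{R}^{p \times (n-1)}$ by
$$[T(M)]_{j,t} = [T_{p,n}(M)]_{j,t} := \sqrt{\frac{t(n-t)}{n}} \left( \frac{1}{n-t} \sum_{r=t+1}^{n} M_{j,r} - \frac{1}{t} \sum_{r=1}^{t} M_{j,r} \right).$$

Applying $T$ to $X = \mu + W$ gives $T(X) = T(\mu) + T(W)$, which we write as $T = A + E$.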
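As an illustration of the definition above, here is a small NumPy sketch of the row-wise CUSUM transformation. The function name `cusum_transform` and the toy input are assumptions for this example; the formula is the scaled difference between the mean after time $t$ and the mean up to time $t$.

```python
import numpy as np

def cusum_transform(M):
    """Row-wise CUSUM transformation of a p x n matrix.

    Returns a p x (n-1) array whose 0-based column t-1 is [T(M)]_{., t}
    for t = 1, ..., n-1.
    """
    M = np.asarray(M, dtype=float)
    n = M.shape[1]
    t = np.arange(1, n)                            # t = 1, ..., n-1
    left = np.cumsum(M, axis=1)[:, :-1]            # sum_{r <= t} M[j, r]
    right = M.sum(axis=1, keepdims=True) - left    # sum_{r > t} M[j, r]
    return np.sqrt(t * (n - t) / n) * (right / (n - t) - left / t)

# A single noiseless row that jumps from 0 to 1 after time z = 3 (n = 5).
T = cusum_transform(np.array([[0.0, 0.0, 0.0, 1.0, 1.0]]))
peak = int(np.argmax(np.abs(T[0]))) + 1   # convert 0-based index to time t
```

For this noiseless row the absolute CUSUM statistic is largest at the true changepoint, which is the property the projection method exploits.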

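SVD of CUSUM transformation

When $\nu = 1$, we can compute $A = T(\mu)$ explicitly:
$$A_{j,t} = \begin{cases} \sqrt{\dfrac{t}{n(n-t)}}\,(n-z)\,\theta_j, & \text{if } t \le z, \\[2mm] \sqrt{\dfrac{n-t}{nt}}\,z\,\theta_j, & \text{if } t > z \end{cases} \;=:\; (\theta\gamma^\top)_{j,t},$$
so the oracle projection direction is the leading left singular vector of the rank 1 matrix $A$.

We could therefore consider estimating $v$ by
$$\hat v_{\max,k} \in \operatorname*{argmax}_{\tilde v \in \mathbb{S}^{p-1}(k)} \|T^\top \tilde v\|_2,$$
and indeed when $n \ge 6$, with probability at least $1 - 4(p \log n)^{-1/2}$,
$$\sin\angle(\hat v_{\max,k}, v) \le \frac{16\sqrt{2}\,\sigma}{\tau\vartheta} \sqrt{\frac{k \log(p \log n)}{n}}.$$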
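The rank-one structure of $A$ can be checked numerically. The sketch below is an illustration under assumed toy values of $n$, $p$, $z$ and $\theta$; it redefines a hypothetical helper `cusum_transform` so that it runs on its own, and compares the CUSUM transform of a noiseless mean matrix with $\theta\gamma^\top$.

```python
import numpy as np

def cusum_transform(M):
    """Row-wise CUSUM transformation, p x n -> p x (n-1)."""
    M = np.asarray(M, dtype=float)
    n = M.shape[1]
    t = np.arange(1, n)
    left = np.cumsum(M, axis=1)[:, :-1]
    right = M.sum(axis=1, keepdims=True) - left
    return np.sqrt(t * (n - t) / n) * (right / (n - t) - left / t)

n, p, z = 60, 8, 25                       # toy sizes, chosen for illustration
theta = np.zeros(p)
theta[[1, 4]] = [2.0, -1.0]               # a 2-sparse change
mu = np.zeros((p, n))
mu[:, z:] = theta[:, None]                # mean shifts by theta after time z

A = cusum_transform(mu)

# gamma_t from the closed form: one branch for t <= z, another for t > z.
t = np.arange(1, n)
gamma = np.where(t <= z,
                 np.sqrt(t / (n * (n - t))) * (n - z),
                 np.sqrt((n - t) / (n * t)) * z)

U, s, Vt = np.linalg.svd(A, full_matrices=False)
v = theta / np.linalg.norm(theta)         # oracle projection direction
alignment = abs(U[:, 0] @ v)              # 1 means equal up to sign
```

The leading left singular vector is only determined up to sign, which is why the comparison uses the absolute inner product.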

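A computationally efficient projection

Computing the $k$-sparse leading left singular vector of a matrix is NP-hard (Tillmann and Pfetsch, 2014). However,
$$\max_{u \in \mathbb{S}^{p-1}(k)} \|u^\top T\|_2 = \max_{u \in \mathbb{S}^{p-1}(k),\, w \in \mathbb{S}^{n-2}} u^\top T w = \max_{u \in \mathbb{S}^{p-1},\, w \in \mathbb{S}^{n-2},\, \|u\|_0 \le k} \langle u w^\top, T \rangle = \max_{M \in \mathcal{M}} \langle M, T \rangle,$$
where $\mathcal{M} := \{M : \|M\|_* = 1,\ \mathrm{rk}(M) = 1,\ \mathrm{nnzr}(M) \le k\}$.

For $\lambda > 0$, we therefore consider computing
$$\hat M \in \operatorname*{argmax}_{M \in \mathcal{S}_1} \{\langle T, M \rangle - \lambda \|M\|_1\},$$
where $\mathcal{S}_1 := \{M \in \mathbb{R}^{p \times (n-1)} : \|M\|_* \le 1\}$, using ADMM.

We can then let $\hat v$ be a leading left singular vector of $\hat M$.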

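Alternative relaxation

Let $\mathcal{S}_2 := \{M \in \mathbb{R}^{p \times (n-1)} : \|M\|_2 \le 1\}$. Then the simple dual formulation leads to
$$\tilde M := \frac{\mathrm{soft}(T, \lambda)}{\|\mathrm{soft}(T, \lambda)\|_2} = \operatorname*{argmax}_{M \in \mathcal{S}_2} \{\langle T, M \rangle - \lambda \|M\|_1\}.$$

Suppose $\hat M \in \operatorname*{argmax}_{M \in \mathcal{S}} \{\langle T, M \rangle - \lambda \|M\|_1\}$ for $\mathcal{S} = \mathcal{S}_1$ or $\mathcal{S} = \mathcal{S}_2$, and let $\hat v \in \operatorname*{argmax}_{\tilde v \in \mathbb{S}^{p-1}} \|\hat M^\top \tilde v\|_2$. If $n \ge 6$ and $\lambda \ge 2\sigma\sqrt{\log(p \log n)}$, then with probability at least $1 - 4(p \log n)^{-1/2}$,
$$\sin\angle(\hat v, v) \le \frac{32\lambda\sqrt{k}}{\tau\vartheta\sqrt{n}}.$$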
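The closed form for the $\mathcal{S}_2$ relaxation is simple to compute. The sketch below is illustrative only: it reads $\|M\|_2$ for a matrix as the Frobenius norm (an assumption about the notation), and the names `soft` and `projection_direction_s2` and the toy matrix are hypothetical.

```python
import numpy as np

def soft(T, lam):
    """Entrywise soft thresholding: sign(T) * max(|T| - lam, 0)."""
    return np.sign(T) * np.maximum(np.abs(T) - lam, 0.0)

def projection_direction_s2(T, lam):
    """Return (v_hat, M_tilde) for the Frobenius-ball relaxation."""
    M = soft(T, lam)
    norm = np.linalg.norm(M)              # Frobenius norm for a 2-D array
    if norm == 0.0:
        raise ValueError("lam removed every entry of T; try a smaller lam")
    M_tilde = M / norm
    U, _, _ = np.linalg.svd(M_tilde, full_matrices=False)
    return U[:, 0], M_tilde               # leading left singular vector

def objective(T, M, lam):
    """<T, M> - lam * ||M||_1, the penalised criterion being maximised."""
    return float(np.sum(T * M) - lam * np.sum(np.abs(M)))

# Toy CUSUM matrix: only the first row carries entries larger than lam.
T = np.array([[3.0, -4.0, 0.5],
              [0.5, 0.2, -0.8]])
v_hat, M_tilde = projection_direction_s2(T, lam=1.0)
```

In this toy case thresholding removes the second row entirely, so the estimated direction puts all its weight on the first coordinate.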

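Changepoint estimation after projection

Input: $X \in \mathbb{R}^{p \times n}$, $\lambda > 0$.
Step 1: Perform the CUSUM transformation $T \leftarrow T(X)$.
Step 2: Find $\hat M \in \operatorname*{argmax}_{M \in \mathcal{S}} \{\langle T, M \rangle - \lambda \|M\|_1\}$ for $\mathcal{S} = \mathcal{S}_1$ or $\mathcal{S}_2$.
Step 3: Find $\hat v \in \operatorname*{argmax}_{\tilde v \in \mathbb{S}^{p-1}} \|\hat M^\top \tilde v\|_2$.
Step 4: Let $\hat z \in \operatorname*{argmax}_{1 \le t \le n-1} |\hat v^\top T_t|$, where $T_t$ is the $t$th column of $T$, and set $\bar T_{\max} \leftarrow |\hat v^\top T_{\hat z}|$.
Output: $\hat z$, $\bar T_{\max}$.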
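The four steps can be sketched end to end as follows. This is a minimal illustration using the $\mathcal{S}_2$ choice in Step 2 (soft thresholding, read with the Frobenius norm), not the authors' implementation; the function names, the simulated data and the parameter values are assumptions, and no sample splitting is performed.

```python
import numpy as np

def cusum_transform(M):
    """Row-wise CUSUM transformation, p x n -> p x (n-1)."""
    M = np.asarray(M, dtype=float)
    n = M.shape[1]
    t = np.arange(1, n)
    left = np.cumsum(M, axis=1)[:, :-1]
    right = M.sum(axis=1, keepdims=True) - left
    return np.sqrt(t * (n - t) / n) * (right / (n - t) - left / t)

def soft(T, lam):
    """Entrywise soft thresholding."""
    return np.sign(T) * np.maximum(np.abs(T) - lam, 0.0)

def single_changepoint(X, lam):
    """Return (z_hat, T_max, v_hat) for one changepoint."""
    T = cusum_transform(X)                           # Step 1
    M_hat = soft(T, lam)                             # Step 2 (S_2, up to scaling)
    if not M_hat.any():
        raise ValueError("lam removed every entry of T; try a smaller lam")
    U, _, _ = np.linalg.svd(M_hat, full_matrices=False)
    v_hat = U[:, 0]                                  # Step 3
    proj = np.abs(v_hat @ T)                         # |v_hat^T T_t|, t = 1..n-1
    z_hat = int(np.argmax(proj)) + 1                 # Step 4 (1-based time)
    return z_hat, float(proj[z_hat - 1]), v_hat

# Simulated example (all values are illustrative assumptions).
rng = np.random.default_rng(1)
n, p, k, z, sigma = 200, 100, 5, 80, 1.0
theta = np.zeros(p)
theta[:k] = 3.0
X = sigma * rng.standard_normal((p, n))
X[:, z:] += theta[:, None]

lam = 2.0 * sigma * np.sqrt(np.log(p * np.log(n)))
z_hat, T_max, v_hat = single_changepoint(X, lam)
```

Rescaling `M_hat` by a positive constant does not change its leading left singular vector, so the normalisation by the norm is skipped in Step 2.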

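Sample-splitting version performance

Let $\sigma > 0$ be known and $X \sim P \in \mathcal{P}(n, p, k, 1, \vartheta, \tau, \sigma^2)$. Let $\hat z$ be the output of the sample-splitting algorithm with input $X$, $\sigma$ and $\lambda := 2\sigma\sqrt{\log(p \log n)}$.

$\exists\, C, C' > 0$ such that if $n \ge 12$, $z$ is even and
$$\frac{C\sigma}{\vartheta\tau} \sqrt{\frac{k \log(p \log n)}{n}} \le 1,$$
then with probability at least $1 - 4\{p \log(n/2)\}^{-1/2} - 17/\log(n/2)$,
$$\frac{1}{n}|\hat z - z| \le \frac{C' \sigma^2 \log\log n}{n\vartheta^2}.$$

If $\sigma$ is constant, $\log p = O(\log n)$, $\vartheta \asymp n^{-a}$, $\tau \asymp n^{-b}$, $k \asymp n^{c}$ and $a + b + c/2 < 1/2$, then the rate of convergence is $o(n^{-1+2a+\delta})$ for all $\delta > 0$.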
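To see where the exponent condition and the rate come from, one can substitute the polynomial scalings into the two displayed quantities. The following is a short worked calculation under the stated assumptions ($\sigma$ constant, $\log p = O(\log n)$, $\vartheta \asymp n^{-a}$, $\tau \asymp n^{-b}$, $k \asymp n^{c}$); it is a reading aid rather than part of the original slides.

```latex
\begin{align*}
\frac{\sigma}{\vartheta\tau}\sqrt{\frac{k\log(p\log n)}{n}}
  &\asymp n^{a}\, n^{b}\, n^{c/2}\, n^{-1/2}\sqrt{\log(p\log n)}
   = O\!\left(n^{a+b+c/2-1/2}\sqrt{\log n}\right) \to 0
   \quad \text{when } a+b+c/2 < 1/2, \\
\frac{\sigma^{2}\log\log n}{n\vartheta^{2}}
  &\asymp n^{-1+2a}\log\log n
   = o\!\left(n^{-1+2a+\delta}\right)
   \quad \text{for every } \delta > 0 .
\end{align*}
```

The first line shows that the signal-strength condition holds for all sufficiently large $n$; the second converts the error bound into the stated rate.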

Multiple changepoint estimation

inspect

Wild binary segmentation scheme (Fryzlewicz, 2014)


Example

[Figure: projected CUSUM statistics against candidate changepoint location; nodes in binary segmentation algorithm; peak of projected CUSUM]

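Multiple changepoint estimation

inspect

Input: $X \in \mathbb{R}^{p \times n}$, $\lambda > 0$, $\xi > 0$, $\beta > 0$, $Q \in \mathbb{N}$.
Step 1: Set $\hat Z \leftarrow \emptyset$. Draw $(s_1, e_1), \ldots, (s_Q, e_Q)$ from $\{(l, r) \in \mathbb{Z}^2 : 0 \le l < r \le n\}$.
Step 2: Run wbs$(0, n)$, where wbs is defined below.
Step 3: Let $\hat\nu \leftarrow |\hat Z|$ and sort $\hat Z$ to yield $\hat z_1 < \cdots < \hat z_{\hat\nu}$.
Output: $\hat z_1, \ldots, \hat z_{\hat\nu}$.

Function wbs$(s, e)$
    Set $Q_{s,e} \leftarrow \{q : s + n\beta \le s_q < e_q \le e - n\beta\}$
    For $q \in Q_{s,e}$, let $(\hat z^{[q]}, \bar T^{[q]}_{\max}) \leftarrow$ SingleCP$(X^{[q]}, \lambda)$
    Find $q_0 \in \operatorname*{argmax}_{q \in Q_{s,e}} \bar T^{[q]}_{\max}$ and set $b \leftarrow s_{q_0} + \hat z^{[q_0]}$
    If $\bar T^{[q_0]}_{\max} > \xi$ then $\hat Z \leftarrow \hat Z \cup \{b\}$; wbs$(s, b)$; wbs$(b, e)$
end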
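The recursion above can be sketched in a few lines. This is an illustrative reading, not the InspectChangepoint package: it assumes the intervals are drawn uniformly at random, that $X^{[q]}$ is the block of columns $s_q + 1, \ldots, e_q$, and that SingleCP uses soft thresholding (the $\mathcal{S}_2$ choice); all function names and the simulated example are hypothetical.

```python
import numpy as np

def cusum_transform(M):
    """Row-wise CUSUM transformation, p x n -> p x (n-1)."""
    n = M.shape[1]
    t = np.arange(1, n)
    left = np.cumsum(M, axis=1)[:, :-1]
    right = M.sum(axis=1, keepdims=True) - left
    return np.sqrt(t * (n - t) / n) * (right / (n - t) - left / t)

def single_cp(X, lam):
    """SingleCP sketch: returns (z_hat, T_max) for the columns of X."""
    T = cusum_transform(X)
    M = np.sign(T) * np.maximum(np.abs(T) - lam, 0.0)
    if not M.any():
        return 0, 0.0          # nothing survives thresholding: no evidence
    U, _, _ = np.linalg.svd(M, full_matrices=False)
    proj = np.abs(U[:, 0] @ T)
    z_hat = int(np.argmax(proj)) + 1
    return z_hat, float(proj[z_hat - 1])

def inspect(X, lam, xi, beta, Q, seed=0):
    """Wild-binary-segmentation wrapper around single_cp."""
    n = X.shape[1]
    rng = np.random.default_rng(seed)
    # Step 1: Q random pairs (s_q, e_q) with 0 <= s_q < e_q <= n.
    intervals = [tuple(sorted(rng.choice(n + 1, size=2, replace=False)))
                 for _ in range(Q)]
    found = []

    def wbs(s, e):
        best_T, best_b = 0.0, None
        for s_q, e_q in intervals:
            inside = s + n * beta <= s_q and e_q <= e - n * beta
            if not inside or e_q - s_q < 2:   # need at least two columns
                continue
            z_q, T_q = single_cp(X[:, s_q:e_q], lam)
            if T_q > best_T:
                best_T, best_b = T_q, s_q + z_q
        if best_b is not None and best_T > xi:
            found.append(int(best_b))
            wbs(s, best_b)
            wbs(best_b, e)

    wbs(0, n)                                  # Step 2
    return sorted(found)                       # Step 3

# Simulated example with two changes (illustrative values only).
rng = np.random.default_rng(2)
n, p = 300, 50
X = rng.standard_normal((p, n))
X[:5, 100:] += 3.0         # change at z_1 = 100 in coordinates 0-4
X[10:15, 200:] += 3.0      # change at z_2 = 200 in coordinates 10-14
lam = 4.0 * np.sqrt(np.log(n * p))
estimates = inspect(X, lam=lam, xi=lam, beta=0.05, Q=200)
```

The margin $n\beta$ keeps candidate intervals away from the ends of the current segment, so a changepoint that has already been recorded is not picked up again by the recursive calls.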

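Theory for inspect

Sample-splitting variant of inspect: whenever SingleCP is called, its second and third steps are carried out on an independent copy $X'$ of $X$.

Assume $\sigma > 0$ is known and $X, X' \overset{\mathrm{iid}}{\sim} P \in \mathcal{P}(n, p, k, \nu, \vartheta, \tau, \sigma^2)$. Let $\hat z_1 < \cdots < \hat z_{\hat\nu}$ be the output of this variant with input $X$, $X'$, $\lambda := 4\sigma\sqrt{\log(np)}$, $\xi := \lambda$, $\beta$ and $Q$.

Define $\rho = \rho_n := \lambda^2 n^{-1} \vartheta^{-2} \tau^{-4}$ and assume $n\tau \ge 14$. $\exists\, C, C' > 0$ such that if $C'\rho < \beta/2 < \tau/C$ and $C\rho k \tau^2 \le 1$, then
$$\mathbb{P}_P\bigl\{\hat\nu = \nu \ \&\ |\hat z_i - z_i| \le C' n\rho,\ 1 \le i \le \nu\bigr\} \ge 1 - \frac{e^{-\tau^2 Q/9}}{\tau} - \frac{6\log n}{np^4}.$$

Assume $\log p = O(\log n)$, $\vartheta \asymp n^{-a}$, $\tau \asymp n^{-b}$, $k \asymp n^{c}$. If $a + b + c/2 < 1/2$ and $2a + 5b < 1$, then the conditions can hold for large $n$ and the rate is $o(n^{-(1-2a-4b)+\delta})$ for all $\delta > 0$.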

S_1 or S_2?

Angles (in degrees) between the oracle projection direction $v$ and the estimated projection directions $\hat v_{S_1}$ (using $\mathcal{S}_1$) and $\hat v_{S_2}$ (using $\mathcal{S}_2$), for different choices of $\vartheta$.

ϑ              0.5   1.0   1.5   2.0   2.5   3.0   3.5   4.0   4.5   5.0
∠(v̂_S1, v)    75.3  60.2  44.6  32.1  24.0  19.7  15.9  12.6  10.0   7.7
∠(v̂_S2, v)    75.7  61.7  46.8  34.4  26.5  21.7  18.1  15.2  12.2  10.2

Other parameters: n = 5, p = 1, k = 3, z = 2, σ² = 1.

Single changepoint simulations: RMSE

$\theta = (1, 2^{-1/2}, \ldots, k^{-1/2}, 0, \ldots, 0) \in \mathbb{R}^p$, $\vartheta = 0.8$, $\sigma^2 = 1$

n   p   k    z   inspect   dc      sbs     scan    agg_2   agg_∞
5   5   3    2   11.2      22.2    72.7    11.6    115.9   22.4
5   5   22   2   31.       8.8     87.1    65.7    113.2   83.1
5   5   5    2   35.3      15.9    12.9    86.8    112.7   17.9
5   5   5    2   48.8      147.7   129.6   12.     114.6   15.8
5   2   3    2   18.4      56.     99.4    26.4    163.    26.6
5   2   45   2   43.5      152.3   133.8   126.8   164.9   132.6
5   2   2    2   52.8      159.1   151.6   15.6    163.2   158.4
5   2   2    2   59.6      162.1   162.4   166.1   163.    176.
2   5   3    8   8.6       15.5    159.7   8.6     22.6    15.5
2   5   22   8   12.4      31.2    48.7    17.     25.9    32.1
2   5   5    8   14.6      39.6    57.7    2.4     25.3    38.6
2   5   5    8   23.9      72.7    86.1    35.6    25.1    71.8
2   2   3    8   9.3       15.9    215.7   9.      143.6   16.1
2   2   45   8   16.7      35.8    1.7     21.3    152.5   39.2
2   2   2    8   25.6      56.7    126.5   32.     151.8   59.1
2   2   2    8   48.4      17.9    28.     66.1    15.6    153.5

Changepoint density estimates

Left: (n, p, k, z, ϑ, σ²) = (2, 1, 32, 8, 0.5, 1). Right: (n, p, k, z, ϑ, σ²) = (2, 1, 32, 8, 1, 1).

[Figure: two panels showing density against estimated changepoint location for inspect, dc, sbs, scan, agg_2 and agg_∞]

Misspecified settings

(n, p, k, z, ϑ) = (2, 1, 32, 8, 1.5).

Model             inspect   dc     sbs     scan   agg_2   agg_∞
M_unif            2.7       9.6    17.1    4.9    4.3     1.2
M_exp             2.6       9.6    42.6    5.     4.7     9.6
M_cs,loc(0.2)     3.5       9.7    19.2    7.     5.4     9.8
M_cs,loc(0.5)     5.8       9.7    24.6    8.7    9.3     9.6
M_cs(0.5)         1.5       7.7    14.9    3.     3.6     6.7
M_cs(0.9)         2.7       9.9    18.6    4.7    4.7     9.6
M_temp(0.1)       6.1       2.3    12.8    9.4    1.9     2.2
M_temp(0.3)       3.1       32.4   276.4   38.8   38.2    34.8
M_async(1)        5.8       11.5   18.5    7.8    7.      11.3

Multiple changepoint simulations

n = 2, p = 2, k = 4, z = (5, 1, 15), σ² = 1.

Columns: (ϑ^(1), ϑ^(2), ϑ^(3)); method; ν̂ = 1, 2, 3, 4, 5; ARI; % best.

(ϑ^(1), ϑ^(2), ϑ^(3)) = (.4, .8, 1.2)
inspect   62 34 4.74 5
dc        62 32 5 1.69 19
sbs       54 44 1 1.7 21
scan      2 95 3.68 19
agg_2     81 17 2.71 2
agg_∞     68 29 3.68 8

(ϑ^(1), ϑ^(2), ϑ^(3)) = (.6, 1.2, 1.8)
inspect   19 71 9 1.9 55
dc        28 53 17 2.85 22
sbs       18 67 14 1.85 14
scan      74 26.78 14
agg_2     23 66 1 1.87
agg_∞     32 58 9 1.85 1

Histograms of estimated changepoints

n = 2, p = 2, k = 4, z = (5, 1, 15), (ϑ^(1), ϑ^(2), ϑ^(3)) = (0.6, 1.2, 1.8), σ = 1.

[Figure: six histograms of frequency against estimated changepoint location, one each for changepoints estimated by inspect, dc, sbs, scan, agg_2 and agg_∞]

Comparative genomic hybridisation dataset

[Figure: comparative genomic hybridisation data; axes: loci, patient number]

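Temporal dependence

Now assume the noise vectors $W_1, \ldots, W_n$ form a stationary Gaussian process with covariance function $K$, so $K(u) = \mathrm{Cov}(W_t, W_{t+u})$.

Assume $\sum_{u=0}^{n-1} \|K(u)\|_{\mathrm{op}} \le B$, where $B$ is known, and let $\hat z$ be the output of inspect with $\lambda := \sigma\sqrt{8B\log(np)}$.

$\exists\, C, C' > 0$ such that if $n \ge 12$ and $z$ are even, and
$$\frac{C\sigma}{\vartheta\tau} \sqrt{\frac{kB\log(np)}{n}} \le 1,$$
then
$$\mathbb{P}\left( \frac{1}{n}|\hat z - z| \le \frac{C' \sigma^2 B \log n}{n\vartheta^2} \right) \ge 1 - \frac{12}{n}.$$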

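Spatial dependence

Now suppose $W_1, \ldots, W_n \overset{\mathrm{iid}}{\sim} N_p(0, \Sigma)$ with $\Sigma \succ 0$. Then
$$v_{\mathrm{proj}} = \operatorname*{argmax}_{a \in \mathbb{S}^{p-1}} \frac{|a^\top \theta|}{\sqrt{a^\top \Sigma a}} \;\propto\; \Sigma^{-1/2} \operatorname*{argmax}_{b \in \mathbb{S}^{p-1}} |b^\top \Sigma^{-1/2} \theta|, \quad \text{so that} \quad v_{\mathrm{proj}} = \frac{\Sigma^{-1}\theta}{\|\Sigma^{-1}\theta\|_2}$$
(up to sign).

If $\hat\Theta$ is an estimator of $\Theta := \Sigma^{-1}$, and $\hat v$ is a leading left singular vector of $\hat M$ as before, then we can estimate $v_{\mathrm{proj}}$ by
$$\hat v_{\mathrm{proj}} := \frac{\hat\Theta \hat v}{\|\hat\Theta \hat v\|_2}.$$

Under assumptions on $\Theta$ that allow us to control $\|\hat\Theta - \Theta\|_{\mathrm{op}}$, analogous theory can be obtained.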
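A small numerical check makes the role of $\Sigma^{-1}$ concrete. The sketch below is illustrative: the function names `oracle_projection` and `standardised_gap`, and the two-dimensional $\Sigma$ and $\theta$, are assumptions chosen for the example. It compares the standardised mean gap $|a^\top\theta| / \sqrt{a^\top \Sigma a}$ along $\theta/\|\theta\|_2$ with the gap along $\Sigma^{-1}\theta / \|\Sigma^{-1}\theta\|_2$.

```python
import numpy as np

def oracle_projection(theta, Sigma):
    """Unit vector proportional to Sigma^{-1} theta."""
    w = np.linalg.solve(Sigma, theta)     # avoids forming the inverse explicitly
    return w / np.linalg.norm(w)

def standardised_gap(a, theta, Sigma):
    """|a^T theta| / sqrt(a^T Sigma a): mean change in noise standard deviations."""
    return abs(a @ theta) / np.sqrt(a @ Sigma @ a)

# Toy example: strongly correlated noise, change in the first coordinate only.
Sigma = np.array([[1.0, 0.9],
                  [0.9, 1.0]])
theta = np.array([1.0, 0.0])

v = theta / np.linalg.norm(theta)         # ignores the correlation
v_proj = oracle_projection(theta, Sigma)  # accounts for the correlation

gap_v = standardised_gap(v, theta, Sigma)
gap_proj = standardised_gap(v_proj, theta, Sigma)
```

With correlated noise, projecting along $\theta$ is no longer the best choice: the second coordinate carries no mean change but helps cancel noise, so `gap_proj` exceeds `gap_v`.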

Summary

inspect is a new method for high-dimensional changepoint estimation.
A convex relaxation is used to find the projection direction, then CUSUM and WBS are used to identify multiple changepoints.
R package InspectChangepoint available!

References

Aston, J. A. D. and Kirch, C. (2014) Change points in high dimensional settings. arXiv:1409.1771.
Aue, A., Hörmann, S., Horváth, L. and Reimherr, M. (2009) Break detection in the covariance structure of multivariate time series models. Ann. Statist., 37, 4046–4087.
Cho, H. (2016) Change-point detection in panel data via double CUSUM statistic. Preprint.
Cho, H. and Fryzlewicz, P. (2015) Multiple changepoint detection for high dimensional time series via sparsified binary segmentation. J. R. Stat. Soc. Ser. B, 77, 475–507.
Enikeeva, F. and Harchaoui, Z. (2014) High-dimensional change-point detection with sparse alternatives. arXiv:1312.1900v2.
Frick, K., Munk, A. and Sieling, H. (2014) Multiscale change point inference. J. R. Stat. Soc. Ser. B, 76, 495–580.
Fryzlewicz, P. (2014) Wild binary segmentation for multiple change-point detection. Ann. Statist., 42, 2243–2281.
Horváth, L., Kokoszka, P. and Steinebach, J. (1999) Testing for changes in dependent observations with an application to temperature changes. J. Multi. Anal., 68, 96–199.
Jirak, M. (2015) Uniform change point tests in high dimension. Ann. Statist., 43, 2451–2483.
Killick, R., Fearnhead, P. and Eckley, I. A. (2012) Optimal detection of changepoints with a linear computational cost. J. Amer. Statist. Assoc., 107, 1590–1598.
Kirch, C., Mushal, B. and Ombao, H. (2015) Detection of changes in multivariate time series with applications to EEG data. J. Amer. Statist. Assoc., 110, 1197–1216.
Ombao, H., Von Sachs, R. and Guo, W. (2005) SLEX analysis of multivariate nonstationary time series. J. Amer. Statist. Assoc., 100, 519–531.
Page, E. S. (1955) A test for a change in a parameter occurring at an unknown point. Biometrika, 42, 523–527.
Tillmann, A. N. and Pfetsch, M. E. (2014) The computational complexity of the restricted isometry property, the nullspace property, and related concepts in compressed sensing. IEEE Trans. Inform. Theory, 60, 1248–1259.
Wang, T. and Samworth, R. J. (2017) High-dimensional changepoint estimation via sparse projection. J. Roy. Statist. Soc., Ser. B, to appear. http://arxiv.org/abs/1606.06246v2.