

8th DIMACS Implementation Challenge: The Traveling Salesman Problem
http://www.research.att.com/~dsj/chtsp/
David S. Johnson
AT&T Labs - Research
Florham Park, NJ 07932-0971
dsj@research.att.com
http://www.research.att.com/~dsj/
Co-Organized with Lyle McGeoch, Fred Glover, Cesar Rego

DIMACS Implementation Challenges
1. Network Flows and Matching, 1990-91
2. Clique, Coloring, and Satisfiability, 1992-93
3. Parallel Computing on Trees and Graphs, 1993-94
4. Fragment Assembly and Genome Rearrangement, 1994-95
5. Priority Queues and Dictionaries, 1995-96
6. Near Neighbor Searches in High Dimension, 1997-98
7. Semidefinite Programming, 1999-2000
8. The Traveling Salesman Problem, 2000

OUTLINE OF TALK
- Why a Challenge
- Who should Participate
- How to Participate
- Preliminary Results
  - Machine Speeds and Normalizations
  - Algorithmic Comparisons
- Future Directions

SCIENTIFIC GOALS
- Determine the current state of the art with respect to tradeoffs between running time and quality of solution for the TSP.
- Identify promising algorithmic ideas for the TSP worthy of further investigation.
- Gain insight into combinatorial optimization in general by seeing how various generic ideas are best adapted to the TSP context.
- Explore how best to conduct a distributed algorithmic comparison project of this sort, and how best to analyze and display the resulting data.
- Produce a DIMACS technical report summarizing what we learn, with all participants as co-authors.

OTHER AGENDAS
- Obtain source material for a summary chapter on experimental analysis of TSP algorithms, to be written with Lyle McGeoch for an upcoming book on the TSP edited by Gutin and Punnen.
- Establish a long-lived mechanism for future researchers to evaluate their algorithms in comparison to works of the past.
- Stop the flow of uninformed papers on the TSP.

DESIRED PARTICIPANTS
- Current TSP researchers
- Researchers who have published experimental results about TSP algorithms in the past, so that those results can be put in perspective
- New TSP researchers interested in investigating new ideas and unanswered questions
- Future TSP researchers who want to compare with previous results

ARENAS FOR COMPETITION (Currently Restricted to Symmetric TSP)
1. Heuristics
   - Tour Construction Heuristics
   - Simple Local Optimization (2-Opt, 3-Opt, and Variants)
   - Lin-Kernighan Variants
   - Chained Lin-Kernighan Variants
   - Other Metaheuristics (Simulated Annealing, Tabu Search, Neural Nets, Genetic Algorithms, etc.)
2. Fast Lower Bound Algorithms
3. Optimization Algorithms
4. Open to Suggestions
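As an illustration of the "Simple Local Optimization" arena, here is a minimal 2-Opt sketch for Euclidean instances. It is a plain quadratic-neighborhood version written for readability, not any participant's implementation; the function names `tour_length` and `two_opt` and the improvement tolerance are assumptions made for this example.

```python
import math

def tour_length(tour, pts):
    """Length of the closed tour visiting pts in the order given by tour."""
    n = len(tour)
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % n]]) for i in range(n))

def two_opt(tour, pts):
    """Repeatedly replace two tour edges by two shorter ones until no such move exists."""
    tour = list(tour)
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            for j in range(i + 2, n):
                if i == 0 and j == n - 1:
                    continue  # these two edges share a city, so there is nothing to swap
                a, b = pts[tour[i]], pts[tour[i + 1]]
                c, d = pts[tour[j]], pts[tour[(j + 1) % n]]
                delta = (math.dist(a, c) + math.dist(b, d)
                         - math.dist(a, b) - math.dist(c, d))
                if delta < -1e-9:  # hypothetical tolerance against float noise
                    # Reversing the segment between the two edges performs the swap.
                    tour[i + 1:j + 1] = reversed(tour[i + 1:j + 1])
                    improved = True
    return tour

# A self-crossing tour on the unit square is untangled into the perimeter.
square = [(0, 0), (1, 1), (1, 0), (0, 1)]
better = two_opt([0, 1, 2, 3], square)
```

Codes competing at the instance sizes in this Challenge would need neighbor lists and other speedups; this sketch only shows the move itself.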

HOW TO PARTICIPATE
1. Download Instances, Instance Generators, and Benchmark Codes from the website.
2. Compile Generators and Benchmark Codes (C code) using your standard compilers on your standard machine.
3. Run the Generators to generate the random instances in the testbed, comparing to downloaded samples to verify that the Generators are performing correctly.
4. Run the Benchmark Greedy code on selected random instances (as specified on the website) to (roughly) benchmark your machine's speed as a function of instance size. Do this for all the specified instances that will fit in your machine's memory.
5. Run your own codes on all the Benchmark Instances that they can handle. Allowed excuses for failure to run: instance too big, running time too long, code can't handle instances of this type (distance matrices, fractional coordinates, etc.).
6. Send results to dsj@research.att.com using formats specified at the website. (Tentative initial deadline: 30 September 2000.)
7. Extra Credit: Perform extra experiments as suggested by DSJ or other participants. Suggest extra experiments to be performed by DSJ or other participants.
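Step 4 uses a Greedy tour construction code as the machine benchmark. For readers unfamiliar with the heuristic, the following is a small sketch of the standard Greedy (shortest-edge-first) idea. It is not the Challenge's C benchmark code: it sorts all city pairs, so it only suits small instances, and the name `greedy_tour` is an assumption for this example.

```python
import math
from itertools import combinations

def greedy_tour(pts):
    """Greedy edge heuristic: add the shortest edges that keep every city at
    degree <= 2 and create no cycle, then close the resulting path.
    Assumes at least 3 cities."""
    n = len(pts)
    parent = list(range(n))  # union-find forest used for cycle detection

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    adj = [[] for _ in range(n)]
    edges = sorted(combinations(range(n), 2),
                   key=lambda e: math.dist(pts[e[0]], pts[e[1]]))
    added = 0
    for u, v in edges:
        if len(adj[u]) < 2 and len(adj[v]) < 2 and find(u) != find(v):
            parent[find(u)] = find(v)
            adj[u].append(v)
            adj[v].append(u)
            added += 1
            if added == n - 1:
                break  # the edges now form one Hamiltonian path

    # Walk the path from one endpoint; the closing edge back to the start is implicit.
    start = next(c for c in range(n) if len(adj[c]) < 2)
    tour, prev, cur = [start], None, start
    while len(tour) < n:
        nxt = next(x for x in adj[cur] if x != prev)
        tour.append(nxt)
        prev, cur = cur, nxt
    return tour

corners = [(0, 0), (0, 1), (1, 1), (1, 0)]
order = greedy_tour(corners)
```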

TESTBED, Part I - 55 Random Instances
1. Uniform Random Euclidean Instances (points uniform in the 10^6 x 10^6 square), with sizes increasing by factors of sqrt(10) from 1,000 to 10,000,000
   - Ten 1,000-city instances
   - Five 3,162-city instances
   - Three 10,000-city instances
   - Two 31,623-city instances
   - Two 100,000-city instances
   - One each of 10^5.5-, 10^6-, 10^6.5-, and 10^7-city instances
2. Clustered Random Euclidean Instances (points clustered in the 10^6 x 10^6 square)
   - Ten 1,000-city instances
   - Five 3,162-city instances
   - Three 10,000-city instances
   - Two 31,623-city instances
   - Two 100,000-city instances
3. Random Distance Matrices (distances chosen uniformly from the range 0 to 10^6)
   - Four 1,000-city instances
   - Two 3,162-city instances
   - One 10,000-city instance

TESTBED, Part II - 34 TSPLIB Instances
dsj1000 pr1002 si1032 u1060 vm1084 pcb1173 d1291 rl1304 rl1323 nrw1379
fl1400 u1432 fl1577 d1655 vm1748 u1817 rl1889 d2103 u2152 u2319
pr2392 pcb3038 fl3795 fnl4461 rl5915 rl5934 pla7397 rl11849 usa13509 brd14051
d15112 d18512 pla33810 pla85900

[Figure: 10,000-City Uniform Random Euclidean Instance; both axes run from 0 to 10^6]

[Figure: 1,000-City Clustered Random Euclidean Instance; both axes run from 0 to 10^6]

[Figure: 3,162-City Clustered Random Euclidean Instance; both axes run from 0 to 10^6]

[Figure: 10,000-City Clustered Random Euclidean Instance; both axes run from 0 to 10^6]

[Figure: Lin-Kernighan Results. Percent excess over Held-Karp bound (0 to 6) versus number of cities (10^3 to 5*10^5), for Uniform Random Euclidean, Clustered Random Euclidean, Random Distance Matrix, and TSPLIB instances]

[Figure: 100,000-City Uniform Random Euclidean Instance (from Johnson, Bentley, McGeoch, & Rothberg, 1993). Percent excess versus VAX 8550 time in seconds, with points labeled FRP, SP, ST, DST, NN, CL, DMST, NA+, NI, CHCI, CI, CHGA, KP, GRRA+, CW, RI, FA+, ACH, FI, 2OPT, 3OPT, LK]

The Test Battery (User Seconds)

Command                    Instance          500 MHz  400 MHz  300 MHz  500 MHz  440 MHz   196 MHz  135 MHz
                           (runs x cities)   Alpha    MIPS     MIPS     Pentium  Sparc     MIPS     IBM
                                                      R12000   R12000   III      Ultra 10  R10000   Power2
time greedy E1k.0 1000     1000 x 1,000         12       14       18       22       23        29       51
time greedy E3k.0 316      316 x 3,162          14       15       20       24       27        31       55
time greedy E10k.0 100     100 x 10,000         16       17       23       28       34        37       74
time greedy E31k.0 32      32 x 31,623          23       22       29       40       52        71      128
time greedy E100k.0 10     10 x 100,000         36       41       59       47       73       160      143
time greedy E316k.0 3      3 x 316,228          55       73       92       52       89       380      184
time greedy E1M.0 1        1 x 1,000,000        88      124      156       68      120       480      259
time greedy E3M.0 1        1 x 3,162,278       330      711      790      235      430      1690      911
time greedy E10M.0 1       1 x 10,000,000     1330     3670     3600       --       --      6100       --

[Figure: Machine Speeds. Seconds/(N log N) versus number of cities N (10^3 to 10^7), for 500 MHz Alpha, 500 MHz Pentium, 440 MHz Sparc, 400 MHz MIPS, 300 MHz MIPS, 196 MHz MIPS, and 135 MHz PowerPC]
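The quantity plotted here can be recomputed from the Test Battery figures. The sketch below does so for the 500 MHz Alpha, taking the printed user seconds at face value and dividing by the number of runs to get a per-run time. The slides do not state the base of the logarithm; the natural log used here is an assumption, as is the function name.

```python
import math

# (cities per run, number of runs, total user seconds) for the 500 MHz Alpha,
# as listed in the Test Battery.
ALPHA_500 = [
    (1000, 1000, 12), (3162, 316, 14), (10000, 100, 16),
    (31623, 32, 23), (100000, 10, 36), (316228, 3, 55),
    (1000000, 1, 88), (3162278, 1, 330), (10000000, 1, 1330),
]

def seconds_per_nlogn(n, runs, total_seconds):
    """Per-run time divided by N log N (natural log: an assumption)."""
    per_run = total_seconds / runs
    return per_run / (n * math.log(n))

speeds = [seconds_per_nlogn(*row) for row in ALPHA_500]
```

If Greedy ran in time exactly proportional to N log N with no memory-hierarchy effects, these values would be constant across instance sizes; in this data they grow with N.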

[Figure: Normalization: 196 MHz MIPS to 500 MHz Alpha. Correction factor (0.15 to 0.45) versus number of cities (10^3 to 10^7), with usa13509 marked]

Correction Factor = (Benchmark Greedy time for Alpha) / (Benchmark Greedy time for MIPS)
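A sketch of how such a correction factor might be applied to a running time measured on the 196 MHz MIPS. The benchmark times are the Test Battery values for the two machines. The slides do not say how the factor is obtained for sizes between benchmark points; linear interpolation in log N, clamping outside the benchmarked range, and the function names are all assumptions for this example.

```python
import math
from bisect import bisect_right

SIZES = [1000, 3162, 10000, 31623, 100000, 316228, 1000000, 3162278, 10000000]
ALPHA_500 = [12, 14, 16, 23, 36, 55, 88, 330, 1330]    # target machine
MIPS_196 = [29, 31, 37, 71, 160, 380, 480, 1690, 6100]  # source machine

def correction_factor(n, sizes=SIZES, target=ALPHA_500, source=MIPS_196):
    """Benchmark Greedy time on the target divided by that on the source,
    interpolated linearly in log(n) between benchmark sizes (an assumption)."""
    factors = [t / s for t, s in zip(target, source)]
    if n <= sizes[0]:
        return factors[0]
    if n >= sizes[-1]:
        return factors[-1]
    i = bisect_right(sizes, n) - 1
    lo, hi = math.log(sizes[i]), math.log(sizes[i + 1])
    w = (math.log(n) - lo) / (hi - lo)
    return factors[i] + w * (factors[i + 1] - factors[i])

def normalize(seconds_on_source, n):
    """Estimate the running time on the target machine for an n-city instance."""
    return seconds_on_source * correction_factor(n)

factor_usa13509 = correction_factor(13509)
```

The repetition counts cancel in the ratio because both machines ran the same commands, so total user seconds can be used directly.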

[Figure: Errors in Running Time Normalization: Greedy Algorithm. MIPS normalized time / Alpha actual time versus number of cities (10^3 to 10^7), with regions marked Overestimate and Underestimate]

[Figure: Errors in Running Time Normalization: Lin-Kernighan. MIPS normalized time / Alpha actual time versus number of cities (10^3 to 10^7), with regions marked Overestimate and Underestimate]

[Figure: Greedy versus Clarke-Wright (Alpha vs MIPS). Ratio of normalized running times versus number of cities (10^3 to 10^7), with regions marked CW Better and GR Better]

[Figure: Greedy versus Clarke-Wright (Same Machine). Ratio of normalized running times versus number of cities (10^3 to 10^7), with regions marked CW Better and GR Better]

[Figure: Greedy versus Clarke-Wright. Percent difference in tour lengths versus number of cities (10^3 to 10^7), with regions marked CW Better and GR Better]

[Figure: Chained LK: Johnson-McGeoch vs Applegate-Cook-Rohe. Ratio of normalized running times versus number of cities (1,000 to 100,000), with regions marked A-C-R Better and J-M Better]

[Figure: Chained LK: Johnson-McGeoch vs Applegate-Cook-Rohe. Percent difference in tour lengths versus number of cities (1,000 to 100,000), with regions marked A-C-R Better and J-M Better]

usa13509

Excess over   Excess over   Normalized
HK Bound      Optimal       Running Time   Algorithm
  17.25         16.48           0.28       greedy
   4.96          4.27          87.41       2opt
   3.32          2.64           1.23       acrlk
   3.25          2.57           1.31       acrclk10
   3.03          2.35          87.53       3opt
   2.42          1.74           1.84       acrclk100
   2.18          1.51         101.76       lk
   1.37          0.71           6.69       acrclk1000
   1.30          0.63         155.98       ilk1n
   1.04          0.38         276.02       ilk3n
   0.91          0.25          63.83       acrclkn
   0.86          0.19         609.67       ilkn
   0.00         -0.66         118.86       heldkarp
  -0.12         -0.78         106.93       rhk1
  -0.30         -0.95          26.15       rhk2
  -0.35         -1.01          13.32       rhk3
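The two excess columns compare the same tour against two different references. A minimal sketch of the computation, assuming the usual definition of percent excess; the tour length, bound, and optimum below are hypothetical round numbers chosen for illustration, not values from this instance.

```python
def percent_excess(value, reference):
    """Percent by which value exceeds reference (negative if it falls below)."""
    return 100.0 * (value / reference - 1.0)

# Hypothetical numbers for illustration only.
hk_bound = 1000.0   # Held-Karp lower bound
optimal = 1007.0    # optimal tour length
tour = 1050.0       # length of a heuristic tour

over_hk = percent_excess(tour, hk_bound)
over_opt = percent_excess(tour, optimal)
bound_vs_opt = percent_excess(hk_bound, optimal)
```

Because a lower bound never exceeds the optimum, a lower-bound algorithm shows a zero or negative entry in the "over Optimal" column, and any tour's excess over the bound is at least its excess over the optimum.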

3,162-City Random Distance Matrix

Excess over   Excess over   Normalized
HK Bound      Optimal       Running Time   Algorithm
 219.05        219.04          16.70       greedy
  89.17         89.17          19.48       2opt
  46.73         46.72          18.90       3opt
   7.43          7.43           3.58       acrlk
   6.08          6.08           4.77       acrclk10
   4.66          4.66          12.56       acrclk100
   4.18          4.18          20.19       lk
   3.75          3.75          62.64       acrclk1000
   2.72          2.72          31.51       ilk1n
   2.50          2.50          57.55       ilk3n
   2.31          2.31         121.50       ilkn
   0.00          0.00         612.49       concorde
   0.00          0.00          40.36       heldkarp

CONCLUSIONS
Yet to be derived.
Your Help Needed!