Scrambling and Descrambling SMT-LIB Benchmarks


Tjark Weber
Uppsala University, Sweden

SMT 2016, Coimbra, Portugal

Motivation

The benchmarks used in the SMT Competition are known in advance. Competing solvers could cheat by simply looking up the correct answer for each benchmark in the SMT Library. To make this form of cheating more difficult, benchmarks in the competition are lightly scrambled.

Scrambling: Example

Original benchmark:

    (set-logic UFNIA)
    (set-info :status unsat)
    (declare-fun f (Int Int) Int)
    (declare-fun x () Int)
    (assert (forall ((y Int)) (< (f y y) y)))
    (assert (> x 0))
    (assert (> (f x x) (* 2 x)))
    (check-sat)
    (exit)

Scrambled benchmark:

    (set-logic UFNIA)
    (declare-fun x2 () Int)
    (declare-fun x1 (Int Int) Int)
    (assert (< (* x2 2) (x1 x2 x2)))
    (assert (> x2 0))
    (assert (forall ((x3 Int)) (> x3 (x1 x3 x3))))
    (check-sat)
    (exit)

The Benchmark Scrambler

The benchmark scrambler parses SMT-LIB benchmarks into an abstract syntax tree, which is then printed again in concrete SMT-LIB syntax.

- Originally developed by Alberto Griggio
- Written in C++ (approx. 1,000 lines of code)
- Based on a Flex/Bison parser (approx. 900 lines) for the SMT-LIB language
- Used (with minor modifications) at every SMT-COMP since 2011

The (Old) Scrambling Algorithm

1. Comments and other artifacts that have no logical effect are removed.
2. Input names, in the order in which they are encountered during parsing, are replaced by names of the form x1, x2, ...
3. Variables bound by the same binder (e.g., let, forall) are shuffled.
4. Arguments to commutative operators (e.g., and, +) are shuffled.
5. Anti-symmetric operators (e.g., <, bvslt) are randomly replaced by their counterparts (e.g., >, bvsgt).
6. Consecutive declarations are shuffled.
7. Consecutive assertions are shuffled.

All pseudo-random choices depend on a seed value that is not known to competition solvers.
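The shuffling steps can be illustrated with a small Python sketch. It is not the C++ scrambler: it assumes terms are already parsed into nested lists of strings, covers only steps 4, 5, and 7, and uses a deliberately short operator table. The names `scramble_term`, `scramble`, `COMMUTATIVE`, and `FLIP` are hypothetical.

```python
import random

# Assumed, incomplete operator tables for illustration only.
COMMUTATIVE = {"and", "or", "+", "*", "="}
FLIP = {"<": ">", ">": "<", "<=": ">=", ">=": "<="}


def scramble_term(term, rng):
    """Return a scrambled copy of a term given as nested lists of strings."""
    if not isinstance(term, list) or not term:
        return term  # atom (symbol or numeral)
    head = term[0]
    if not isinstance(head, str):
        # e.g. a binder list such as [["y", "Int"]]: just recurse
        return [scramble_term(t, rng) for t in term]
    args = [scramble_term(a, rng) for a in term[1:]]
    if head in COMMUTATIVE:
        rng.shuffle(args)  # step 4: shuffle commutative arguments
    elif head in FLIP and len(args) == 2 and rng.random() < 0.5:
        # step 5: (< a b) becomes (> b a), and so on
        head, args = FLIP[head], args[::-1]
    return [head] + args


def scramble(assertions, seed):
    """Scramble each assertion, then shuffle the assertion list (step 7)."""
    rng = random.Random(seed)  # all choices depend on the secret seed
    out = [scramble_term(a, rng) for a in assertions]
    rng.shuffle(out)
    return out
```

For example, `scramble([[">", "x", "0"], ["and", "p", "q"]], seed)` returns the two assertions in some order, each possibly rewritten to an equivalent form such as `["<", "0", "x"]`.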

Benchmark Normalization

Since scrambling loses information (e.g., input names), the original benchmark cannot be restored from the scrambled benchmark alone. However, how difficult is it to identify some original benchmark(s) in the SMT Library that could have resulted in the scrambled output?

This turns out to be computationally easy. We use a normalization algorithm:

[Diagram: scrambling turns the original benchmark into the scrambled benchmark; normalization is applied to both the original and the scrambled benchmark to obtain the normalized benchmark.]

The Normalization Algorithm

1. Comments and other artifacts that have no logical effect are removed.
2. For original benchmarks, input names, in the order in which they are encountered during parsing, are replaced by names of the form x1, x2, ... For scrambled benchmarks, input names are retained.
3. Variables bound by the same binder (e.g., let, forall) are sorted.
4. Arguments to commutative operators (e.g., and, +) are sorted.
5. Anti-symmetric operators (e.g., <, bvslt) are replaced by a canonical representation.
6. Consecutive declarations are sorted.
7. Consecutive assertions are sorted.

Where the scrambler shuffles, the normalizer sorts.
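A minimal Python sketch of the "sort instead of shuffle" idea follows. It assumes terms are nested lists of strings with names already in the x1, x2, ... form, covers only steps 4, 5, and 7, and picks `<` and `<=` as the canonical operators. That choice, the sort key, and the names `normalize_term` and `normalize` are assumptions made for illustration, not details from the talk.

```python
# Assumed, incomplete operator tables for illustration only.
COMMUTATIVE = {"and", "or", "+", "*", "="}
CANONICAL = {">": "<", ">=": "<="}  # rewrite (> a b) as (< b a)


def normalize_term(term):
    """Return a canonical copy of a term given as nested lists of strings."""
    if not isinstance(term, list) or not term:
        return term  # atom
    head = term[0]
    if not isinstance(head, str):
        return [normalize_term(t) for t in term]  # e.g. binder lists
    args = [normalize_term(a) for a in term[1:]]
    if head in COMMUTATIVE:
        args.sort(key=repr)  # step 4: any fixed total order works
    elif head in CANONICAL and len(args) == 2:
        head, args = CANONICAL[head], args[::-1]  # step 5
    return [head] + args


def normalize(assertions):
    """Normalize every assertion and sort the assertion list (step 7)."""
    return sorted((normalize_term(a) for a in assertions), key=repr)


# The assertions of the example benchmark, with f, x, y renamed to x1, x2, x3.
original = [
    ["forall", [["x3", "Int"]], ["<", ["x1", "x3", "x3"], "x3"]],
    [">", "x2", "0"],
    [">", ["x1", "x2", "x2"], ["*", "2", "x2"]],
]
scrambled = [
    ["<", ["*", "x2", "2"], ["x1", "x2", "x2"]],
    [">", "x2", "0"],
    ["forall", [["x3", "Int"]], [">", "x3", ["x1", "x3", "x3"]]],
]
same_normal_form = normalize(original) == normalize(scrambled)
```

Under these assumptions the original and scrambled assertion lists reduce to the same normal form, which is what makes the lookup attack possible.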

The World's Fastest SMT Solver

Our normalization algorithm allows us to build a cheating SMT solver.

Before the competition:
1. Normalize all 154,238 benchmarks used in the Main Track of SMT-COMP 2015.
2. For each normal form, compute its SHA-512 hash digest. Create a map from digests to benchmark status.

During the competition, for each scrambled benchmark:
1. Normalize the benchmark (retaining input names).
2. Compute the SHA-512 digest of the normal form.
3. Use this to look up the benchmark's status in the pre-computed map.
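The lookup itself is simple. The following Python sketch assumes each normal form is available as a text string and that the status is one of the strings "sat" or "unsat". The function names `digest`, `build_table`, and `look_up`, as well as the "unknown" fallback, are hypothetical. Only `hashlib.sha512` is a real library call.

```python
import hashlib


def digest(normal_form: str) -> str:
    """SHA-512 digest (hex) of a normalized benchmark given as text."""
    return hashlib.sha512(normal_form.encode("utf-8")).hexdigest()


def build_table(library):
    """Before the competition: map digests to benchmark status.

    `library` is an iterable of (normal_form_text, status) pairs.
    """
    return {digest(nf): status for nf, status in library}


def look_up(table, normalized_benchmark: str) -> str:
    """During the competition: answer by table lookup."""
    return table.get(digest(normalized_benchmark), "unknown")
```

A dictionary keyed by digest keeps each answer to a single hash computation plus a constant-time lookup, regardless of how hard the benchmark is to solve.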

The World's Fastest SMT Solver: Performance

We compare the performance of our normalizing solver to the performance of a virtual best solver obtained by using, for each benchmark, the best performance of any solver that participated in SMT-COMP 2015.

[Figure: run-time comparison for each benchmark.]

The World's Fastest SMT Solver: Performance (cont.)

[Figure: run-times plotted against the number of benchmarks solved.]

Our normalizing solver solves every benchmark and is (on average) 223 times faster than the virtual best solver.

Benchmark Similarities in the SMT Library

Our normalization algorithm allows us to identify similar benchmarks in the SMT Library. There are 196,375 non-incremental benchmarks in the 2015 release of the SMT Library.

We call two benchmarks similar if they have the same normal form.
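Grouping benchmarks by this similarity relation amounts to bucketing them by normal form. A minimal Python sketch, assuming a dictionary from benchmark name to normal-form text (the function name `similarity_classes` and the input format are hypothetical):

```python
from collections import defaultdict


def similarity_classes(normal_forms):
    """Group benchmark names that share a normal form.

    `normal_forms` maps a benchmark name to its normal form as text.
    Returns a list of classes, each a list of benchmark names.
    """
    classes = defaultdict(list)
    for name, nf in normal_forms.items():
        classes[nf].append(name)
    return list(classes.values())
```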

Benchmark Similarities in the SMT Library: Findings

[Figure: number of equivalence classes plotted against class size in benchmarks, both on logarithmic axes.]

- 30,799 benchmarks (16%) are duplicates with respect to similarity.
- Up to 1,499 similar versions of a single benchmark.
- 119 benchmarks with unknown status are similar (and thus equisatisfiable) to benchmarks with known status.

Requirements on a Good Scrambling Algorithm

1. Must not affect satisfiability.
2. Must be efficient.
3. Should (ideally) not affect solving times.
4. Given two benchmarks, it should be hard to decide without additional information (such as the seed used for scrambling) whether one is a scrambled version of the other.

The old scrambling algorithm meets (1)-(3), but falls short of (4).

Observation: Our normalization algorithm crucially relies on the fact that the replacement of input names with names of the form x1, x2, ... is entirely predictable.

A New Scrambling Algorithm

1. Comments and other artifacts that have no logical effect are removed.
2. Input names, in the order in which they are encountered during parsing, are replaced by names of the form x1, x2, ...
3. A random permutation π is applied to all names, replacing each name xi with π(xi).
4. Variables bound by the same binder (e.g., let, forall) are shuffled.
5. Arguments to commutative operators (e.g., and, +) are shuffled.
6. Anti-symmetric operators (e.g., <, bvslt) are randomly replaced by their counterparts (e.g., >, bvsgt).
7. Consecutive declarations are shuffled.
8. Consecutive assertions are shuffled.
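The new step 3 can be sketched in Python as follows. The sketch assumes names have already been replaced by x1, x2, ... and that terms are nested lists of strings. The function names `permute_names` and `rename` are hypothetical, and a real scrambler would also have to rename declarations and bound variables consistently.

```python
import random


def permute_names(names, seed):
    """Draw a seeded random permutation pi of the given names."""
    rng = random.Random(seed)
    shuffled = list(names)
    rng.shuffle(shuffled)
    return dict(zip(names, shuffled))  # pi: old name -> new name


def rename(term, pi):
    """Apply pi to every name occurring in a nested-list term."""
    if isinstance(term, list):
        return [rename(t, pi) for t in term]
    return pi.get(term, term)  # operators and numerals are left alone
```

Because the permutation depends on the secret seed, the position of a name in the parse order no longer reveals which name it receives, which is the predictability the normalizer relied on.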

The New Scrambling Algorithm is GI-Complete

Theorem. For the new scrambling algorithm, the problem of determining whether two benchmarks are scrambled versions of each other is GI-complete (polynomial-time equivalent to the graph isomorphism problem).

Proof of GI-hardness: Given a graph G = (V, E), construct a corresponding SMT-LIB benchmark B(G) as follows:

    for each vertex v ∈ V:         (declare-fun v () Bool)
    for each edge {v1, v2} ∈ E:    (assert (= v1 v2))

Now two graphs G and H are isomorphic if and only if B(G) and B(H) are scrambled versions of each other.
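The construction of B(G) is mechanical. A minimal Python sketch, assuming vertices are given as strings that are valid SMT-LIB symbols and edges as pairs of such strings (the function name `benchmark_from_graph` is hypothetical):

```python
def benchmark_from_graph(vertices, edges):
    """Build the SMT-LIB text B(G) for a graph G = (V, E).

    One Boolean constant is declared per vertex and one equality is
    asserted per edge, following the construction in the proof.
    """
    lines = [f"(declare-fun {v} () Bool)" for v in vertices]
    lines += [f"(assert (= {u} {w}))" for u, w in edges]
    return "\n".join(lines)
```

For a triangle on the vertices a, b, c this yields three declarations followed by three equality assertions.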

Conclusions

- The scrambling algorithm used at SMT-COMP since 2011 is ineffective at obscuring the original benchmark. However, we have no reason to believe that cheating has occurred at past competitions.
- Our improved scrambling algorithm renders the problem of identifying the original benchmark GI-complete. This algorithm has now been used at SMT-COMP 2016.
- Nonetheless, the competition may have to rely on social disincentives and scrutiny more than on technical measures to prevent this form of cheating.

Is there an even better scrambling algorithm?