Test Compression for Circuits with Multiple Scan Chains

Ondřej Novák, Jiří Jeníček, Martin Rozkovec
Institute of Information Technologies and Electronics, Technical University in Liberec, Liberec, Czech Republic
ondrej.novak@tul.cz, jiri.jenicek@tul.cz, martin.rozkovec@tul.cz

Abstract: The paper presents a test pattern compression method for circuits with a high number of parallel scan chains. It reduces test time while keeping the hardware overhead low. The decompression method is based on continuous LFSR reseeding, used in such a way that an LFSR lockout can be escaped within a small number of clock cycles. It requires separate control of the LFSR decompressor and the scan chain clock inputs. The paper discusses decompression effectiveness for different LFSR shapes, scan chain lengths and numbers of parallel LFSR inputs. We have found that an LFSR with state skipping saves hardware compared with an LFSR accompanied by a phase shifter. It can be designed so that it uses a lower number of internal XOR gates, guarantees maximum separation between scan chains and does not introduce an extra delay on the LFSR outputs. Experimental results on benchmark circuits show that the presented test pattern decompression provides unreduced fault coverage and short test lengths, while the hardware overhead is low compared with designs produced by current industrial tools.

Keywords: Design for Testability in IC Design, ASIC Testing, Test Data Volume Compression, Linear Finite State Machines, Test Application Time Reduction

I. INTRODUCTION

It is commonly accepted that for large circuits it is useful to adopt some method of test compression and on-chip decompression that reduces test data volume and test time during test pattern transfer from a tester (ATE) to the circuit under test (CUT) [21].
When looking for an appropriate test compression method, it is necessary to consider several parameters: test data volume, test time, test access mechanism (TAM) bandwidth, compression algorithm complexity, decompressor hardware overhead, and the ability of the test equipment to compress test responses and mask unknown logic values in the CUT responses (X-values) [5]. The most important parameters considered when evaluating the quality of test pattern compression are: keeping 100% test coverage, the number of test clock cycles (test length), the number of tester channels devoted to the CUT test, hardware overhead, the possibility of masking non-valid responses, and the energy spent on testing (low test energy, constrained test power).

The idea of test compression is based on the fact that modern circuits can typically be tested with deterministic test patterns containing more than 95% don't care bits. The care bits of a test pattern are obtained on the decompressor outputs by unrolling the decompressor seeds, while the don't care bits are replaced by pseudorandom values that appear on the remaining decompressor output positions [12], [1], [17], [13] and [22]. Test pattern decompression can use special finite automata, usually accompanied by a phase shifter, that reach the required logical values on the appropriate outputs in the appropriate clock cycles and generate pseudorandom values on the rest of the outputs. The sequence of automaton seeds and injected internal state values serves as the compressed test pattern. Usually the automaton is linear, which means that its output space (the space of all vectors it can generate) is a linear subspace spanned by a Boolean matrix. This feature is crucial for the speed of test pattern encoding. LFSRs and/or ring generators with additional phase shifters are widely used for decompression of test patterns [8], [12], [1], [17], [9], [14], [22], [2], [7] and [11].
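The linearity property mentioned above can be illustrated with a small experiment: for a linear automaton, the state sequence produced by the XOR of two seeds equals the XOR of the two individual sequences. The following is a minimal sketch, assuming a 4-bit Fibonacci LFSR; the names `lfsr_states`, `xor_traces` and `superposition_holds`, as well as the tap positions, are hypothetical and chosen only for illustration.

```python
from itertools import product

TAPS = (2, 3)   # assumed feedback taps of a small 4-bit example LFSR
CYCLES = 8      # number of clock cycles to unroll


def lfsr_states(seed, taps, cycles):
    """Return the successive states of a Fibonacci LFSR started from `seed`."""
    state = list(seed)
    trace = []
    for _ in range(cycles):
        trace.append(tuple(state))
        feedback = 0
        for t in taps:
            feedback ^= state[t]
        state = [feedback] + state[:-1]  # shift by one, feedback enters stage 0
    return trace


def xor_traces(a, b):
    """Bitwise XOR of two state traces of equal length."""
    return [tuple(x ^ y for x, y in zip(sa, sb)) for sa, sb in zip(a, b)]


def superposition_holds():
    """Check trace(a XOR b) == trace(a) XOR trace(b) for all seed pairs."""
    for a in product((0, 1), repeat=4):
        for b in product((0, 1), repeat=4):
            mixed = tuple(x ^ y for x, y in zip(a, b))
            expected = xor_traces(lfsr_states(a, TAPS, CYCLES),
                                  lfsr_states(b, TAPS, CYCLES))
            if lfsr_states(mixed, TAPS, CYCLES) != expected:
                return False
    return True


print(superposition_holds())
```

Because every output bit is an XOR of seed bits, each specified bit of a test pattern contributes one linear equation in the seed variables, which is why the encoding can be handled by standard linear algebra over GF(2).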
Encoding a test pattern with a linear automaton requires solving the corresponding system of linear equations, one equation per specified bit. Testing of multi-core systems can be done in a shorter time by sharing the tester channels among several cores simultaneously and retaining the unused tester data between consecutive test patterns [14]. The encoding ability depends on the number of free variables that can be used for LFSR seeding. The seeds have to be loaded either at the beginning of the encoding (static approach) or they can be injected continuously during the scan chain loading (dynamic approach - continuous reseeding). The second approach has the advantage that shorter LFSRs can be used. Even if there are free variables in the phase of scan chain loading, the pattern may be locked out, so that the parallel outputs cannot load the scan chains with the required logical values. This situation can either be accepted (in cases where the reduced fault coverage does not matter) or it can be solved by adding ATE channels and/or reducing the number of parallel scan chains, which reduces the probability of a pattern lockout. An alternative way to eliminate the lockout is to add a control signal that gates scan chain shifting whenever it is necessary to reseed the LFSR.

Another possibility is simply to broadcast the data from the ATE to several parallel scan chains. This solution can be considered a special case of linear decompression in which the decompressor consists of fan-out wires only. Compared with a sequential decompressor automaton, this arrangement reduces the chance of successfully encoding test patterns, because the bits loaded into the parallel scan chains are equal and thus there exist many bit pairs and larger tuples that cannot be set to different logical values.
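The broadcast restriction described above is easy to state in code: a test cube can be broadcast only if no two chains require different values in the same shift position. The sketch below is illustrative only; the function name `broadcast_stream` and the string encoding of cubes ('0', '1', 'X' for don't care) are assumptions made for this example.

```python
def broadcast_stream(chains):
    """Merge per-chain test cubes ('0', '1', 'X') into one broadcast stream.

    Returns the merged stream, or None when two chains need different
    values in the same shift position, so the cube cannot be broadcast.
    """
    merged = []
    for column in zip(*chains):
        required = {bit for bit in column if bit in "01"}
        if len(required) > 1:
            return None  # conflicting care bits in one position
        merged.append(required.pop() if required else "X")
    return "".join(merged)


print(broadcast_stream(["1X0", "X10"]))  # compatible cubes
print(broadcast_stream(["1X0", "0X0"]))  # conflict in the first position
```

The second call fails because the first position must be 1 in one chain and 0 in the other, which is exactly the kind of bit pair that a pure fan-out decompressor cannot set independently.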
The advantage of test pattern broadcasting is that it is easy to constrain the ATPG with the constraints imposed by broadcasting and to force it to generate patterns that respect the topology of the scan chains. One way to overcome the pattern lockout of this scheme is the Illinois Scan method [16]. It keeps the advantages of ATPG constraining and makes it possible to convert the parallel loading scheme into a single scan chain that can be loaded with an arbitrary pattern. The conversion can be done for the whole test pattern (static approach) or during the test pattern loading (dynamic approach) [18]. The dynamic approach usually provides shorter test sequences but requires more complicated computations for test pattern encoding.

The Embedded Deterministic Test (EDT) method [17] uses a ring generator accompanied by a phase shifter. If the number of flip-flops in the ring generator and the number of ATE channels that load it are high enough, the probability of a decompressor lockout is small. The number of tester channels can be reduced for a modular testing scheme [7]. Similarly to [16], EDT uses ATPG constraining to further improve the encodability of test patterns. This arrangement guarantees high ATPG efficiency of the method.

Smart BIST [12] uses decompressor hardware with continuous reseeding similar to EDT, so it has a similar encoding capability. It uses an LFSR with a phase shifter loading the parallel scan chains, and it adds a mechanism that gates the shifting of the scan chains. This arrangement can solve the problem of data lockout, as it is possible to reseed the LFSR into an arbitrary state while the scan chains are gated. A modular scheme that can be used for large circuits was introduced in [12]; test patterns for each module can be calculated concurrently. This scheme appears to be the most promising arrangement for reducing the test time while keeping the hardware overhead relatively low. Gating the scan chains makes it necessary to add one additional ATE channel, which is a disadvantage compared with the other methods. On the other hand, this additional signal can be used for controlling other required testing functions such as a bypass testing mode.
The bypass mode enables direct loading of specific values into the scan chains without any compression and is usually required to be embedded in the design. During the period of scan chain gating the activity of the circuit is significantly reduced. For this reason this solution is a good fit for current requirements on reducing test power.

Smart BIST uses a phase shifter to avoid dependencies between bits loaded into parallel scan chains. To enable lockout escape, the phase shifter output bit values have to be linearly independent. For this reason the length of the LFSR feeding the phase shifter has to be at least equal to the number of phase shifter outputs. This requirement can be considered a disadvantage of the method, as other methods that do not guarantee full encodability of patterns [17] can use shorter LFSRs. The Smart BIST principles were described in [12], but no information about test lengths and hardware overhead was published. Results of experiments with different modifications of the original scheme could help IC designers, and this was the main motivation of our research.

We have chosen Smart BIST as the basis of optimization experiments and compared the resulting test length and hardware overhead with an industrial compression tool that decompresses the patterns in an automaton and a phase shifter without the possibility of lockout escaping. Differently from the original methodology, we introduced a hardware saving LFSR with state skipping, similarly to [20] but with no possibility of switching between multiple polynomials. We have chosen an LFSR polynomial that has a low number of XORs while generating a non-overlapped output sequence of the corresponding chosen code. Further, we developed an algorithm of continuous LFSR reseeding that searches for the reseeding bit or vector sequence efficiently in the LFSR state space while keeping the CPU time acceptably low.
We have performed a set of experiments mapping the influence of the ratio between the maximum scan chain length and the number of test clock cycles (test length) and between the number of LFSR inputs fed in parallel and the number of test clock cycles.

II. DECOMPRESSION HARDWARE

The decompression automaton usually consists of an LFSR-like automaton accompanied by a phase shifter. This scheme was used in [12]. Let us consider an m-bit LFSR. Lockout escaping with the help of an m-bit LFSR input sequence is possible only if the Boolean expressions describing the phase shifter outputs are linearly independent. This can be achieved only if m ≥ number of phase shifter outputs. The typical number of XORs forming one phase shifter output bit that fulfills the above requirements is equal to 2 [4]. The total number of XORs in a phase shifter enabling lockout escaping is then typically close to 2m.

    Step   LFSR                  State skipping LFSR (skips 4 states)
    0:     0    1    2    3      0    1    2    3
    1:     23   0    1    2
    2:     12   23   0    1
    3:     01   12   23   0
    4:     023  01   12   23     023  01   12   23
    5:     13   023  01   12
    6:     02   13   023  01
    7:     123  02   13   023
    8:     012  123  02   13     012  123  02   13

Fig. 1. LFSR and state skipping LFSR comparison

The m-bit LFSR input sequence forms an information sequence of the code generated on the LFSR outputs, and thus an arbitrary state can be reached within m clock cycles and lockout escaping is guaranteed. It is possible to use an LFSR with state skipping [6] instead of adding a phase shifter to the LFSR outputs. The state skipping LFSR performs successive jumps of constant length in its state sequence, since it omits a predetermined number of states by directly calculating the state that follows them. The dependency between bits of the LFSR output sequence is then reduced similarly to the phase shifter sequence, as repetition of output bits is avoided.
We propose this solution because it has a lower hardware overhead than the phase shifter, provided that the characteristic LFSR polynomial guarantees periodic LFSR behavior with a period greater than or equal to m*scan_chain_length. We have chosen a characteristic LFSR polynomial of the form x^m + x^(m-1) + 1. These polynomials are very suitable for LFSRs that skip over m states, as the total number of XORs in the feedback is then equal to m.
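The XOR count claimed above can be checked by symbolic simulation. The following sketch, which is not the authors' tool, represents each flip-flop value as the set of initial-state symbols XORed together, steps an LFSR with the polynomial x^m + x^(m-1) + 1, and counts the two-input XOR gates of the skip-over-m next-state logic. The helper names (`lfsr_step`, `skip_state`, `label`, `xor_gates`) and the simple gate-sharing heuristic are assumptions made for this illustration.

```python
def lfsr_step(state):
    """One symbolic step of the LFSR with polynomial x^m + x^(m-1) + 1.

    Each entry is a frozenset of initial-state symbols; set symmetric
    difference (^) plays the role of XOR.
    """
    feedback = state[-2] ^ state[-1]
    return [feedback] + state[:-1]


def skip_state(m, steps):
    """Symbolic state reached from symbols 0..m-1 after `steps` clock cycles."""
    state = [frozenset([i]) for i in range(m)]
    for _ in range(steps):
        state = lfsr_step(state)
    return state


def label(term):
    """Render a symbol set the way Fig. 1 does, e.g. {0, 2, 3} -> '023'."""
    return "".join(str(i) for i in sorted(term))


def xor_gates(state):
    """Count 2-input XORs, reusing another output when it is a sub-expression."""
    total = 0
    for s in state:
        if len(s) <= 1:
            continue
        cost = len(s) - 1
        for t in state:
            if len(t) >= 2 and t < s:           # t is a proper sub-expression of s
                cost = min(cost, len(s - t))    # one XOR per leftover symbol
        total += cost
    return total


print([label(t) for t in skip_state(4, 4)])   # row 4 of Fig. 1
print(xor_gates(skip_state(4, 4)))            # gates for skipping 4 states
```

For m = 4 the sketch reproduces the row 023, 01, 12, 23 of Fig. 1 with four XOR gates; the term 23 is shared between the first and the last flip-flop input, which is why the count stays at m.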

Fig. 2. Example of the m-bit LFSR with the characteristic polynomial x^m + x^(m-1) + 1 performing skipping over m states. Tsin is a primary input of the LFSR and, depending on the Tctr signal, it is either simply shifted through the register (primary outputs are omitted in the figure) or it is XORed regularly with every second flip-flop (dist = 2).

An example of the 4-bit LFSR is given in Fig. 1. Let us consider that the initial states of the LFSR flip-flops are represented by the symbols 0, 1, 2 and 3. Then the next states are 23, 0, 1 and 2 (shown in the next row of symbols). The symbol 23 represents the XOR of 2 and 3. The LFSR skipping over 4 states generates the flip-flop states 023, 01, 12 and 23 immediately after the initial state. The skipping LFSR is shown in the right-hand part of the figure; its number of XOR gates is equal to 4.

An interesting property of the LFSR with state skipping is that there are no gates modifying the output sequence of the LFSR flip-flops. For this reason the test pattern decoding LFSR could be shared with the first bits of the parallel scan chains. This arrangement can be used only when the first scan chain bits cannot be overwritten by the functional outputs (the first scan chain bits serve as inputs only). Sharing the flip-flops between the decompressor and the scan chains reduces the hardware overhead. Due to complications with adapting the EDA tools, we performed our experiments with separated scan chains and decompressor only.

We have found that for large m the resulting test lengths are significantly reduced for multi-input LFSRs where the Tsin input is XORed with regularly distributed LFSR flip-flops (in Fig. 2 the distance between these flip-flops is denoted as dist). The controllability of the LFSR flip-flops can also be improved by using two or more independent input bit sequences (a Tsin signal of 2 or more bits). We have verified that for circuits with approximately one hundred parallel scan chains this solution led to a shorter test sequence, but the total amount of stored data grew, and the lower number of introduced test patterns did not compensate for the growing amount of tester data and number of channels. For this reason we have studied the influence of other parameter variations only for a 1-bit Tsin signal.

The number of parallel scan chains fed from the LFSR is also important from the hardware overhead point of view. We experimentally verified that the shortest test sequences are obtained if the number of scan chains is several times bigger than the maximum scan chain length. Each additional parallel scan chain adds an overhead of one XOR, a flip-flop and a multiplexer. A tradeoff between the test length and the hardware overhead has to be found.

Fig. 3. Simplified scheme of the test equipment (signals Tsin, Tclk, Tctr, Clk and Tsout; blocks LFSR, CUT with scan chains SC 1 to SC 4, and MISR)

The simplified scheme of the test equipment is shown in Fig. 3. The test data are loaded into the scan chains from the LFSR outputs. The scan chains have separate clocking, so the system of decompressor, scan chains and Multiple-Input Signature Register (MISR) can perform all necessary modes of operation. The decompressor and the MISR can perform decompressing/compacting and serial loading/readout depending on the Tctr signal value. The CUT scan chains are controlled by the Tctr signal as well. In the pattern decompression mode, the compressed data are decompressed in the LFSR and concurrently loaded into the CUT scan chains, while the response signature is compacted in the MISR (Tctr is high, both clocks are active). By holding the CUT clock signal we can perform one step of the continuous LFSR reseeding.

III. TEST PATTERN COMPRESSION

Phase 1, test pattern map creation: The test pattern map can be created with any ATPG and fault simulator able to handle test patterns with don't care bits. The map has to contain information about the faults covered by the test. This fault map is ordered: the most difficult patterns (greatest number of care bits) come first, the easiest patterns last. This ordering is useful in the further phases because, for the majority of test sets, the most difficult patterns have to be incorporated into the final test set anyway. In the experiments we considered stuck-at faults only, but any type of fault that can be detected with a single test pattern can be considered. The pattern map used in our experiments contains one separately generated test pattern for each fault; no compaction of the vectors was performed. We can expect that a compaction preserving the number of don't care bits in the patterns could further reduce the resulting test length.

Phase 2, test pattern encoding: Test patterns are encoded separately, the most difficult ones first, and a fault simulation is performed for each pattern. The encoding consists of solving linear equations that represent the conditions on the LFSR input bit sequence that has to be loaded in order to create the care bits on the appropriate outputs in the appropriate clock cycles. If the care bits are encodable at the proper positions in the test pattern, the phase is finished; if not, several attempts at reseeding the LFSR are made (reseeding is done by introducing a new bit on the LFSR input while gating the scan chain shifting). There are several possible clock cycles in which the clk gating could be done. The reseeding algorithm searches for the first possibility of successful bit reseeding within a given span of clock cycles.
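The encoding idea of Phase 2 can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the LFSR is modeled as skipping over m states with the polynomial x^m + x^(m-1) + 1, Tsin is assumed to be XORed into every dist-th flip-flop, the initial state is zero, and all Tsin bits stay free until the end (the paper's bounded search over alternative clk sequences is omitted). The names `skip_step`, `solve_gf2`, `encode` and `replay` are hypothetical. Each state bit is kept as a bitmask of the Tsin variables it depends on, so a care bit becomes one XOR equation.

```python
def skip_step(state, tsin, dist):
    """Next state of the assumed state-skipping LFSR.

    Works on concrete bits and on symbolic bitmasks alike (both use XOR).
    """
    m = len(state)
    nxt = [state[0] ^ state[m - 2] ^ state[m - 1]]
    nxt += [state[j - 1] ^ state[j] for j in range(1, m)]
    for j in range(0, m, dist):          # assumed injection points of Tsin
        nxt[j] ^= tsin
    return nxt


def solve_gf2(equations, nvars):
    """Solve XOR equations (variable bitmask, rhs); None if inconsistent."""
    basis = {}
    for mask, rhs in equations:
        while mask:
            pivot = mask.bit_length() - 1
            if pivot not in basis:
                basis[pivot] = (mask, rhs)
                break
            bmask, brhs = basis[pivot]
            mask, rhs = mask ^ bmask, rhs ^ brhs
        else:
            if rhs:
                return None              # equation reduced to 0 = 1
    x = [0] * nvars                      # free variables default to 0
    for pivot in sorted(basis):
        mask, value = basis[pivot]
        for q in range(pivot):
            if mask >> q & 1:
                value ^= x[q]
        x[pivot] = value
    return x


def encode(slices, m, dist, max_gated=16):
    """Return (Tsin bits, clk bits); clk = 0 marks a gated scan clock cycle."""
    state, equations, clk = [0] * m, [], []
    for cube in slices:
        gated = 0
        while True:
            k = len(clk)                 # index of the new Tsin variable
            state = skip_step(state, 1 << k, dist)
            trial = equations + [(state[j], int(c))
                                 for j, c in enumerate(cube) if c in "01"]
            if solve_gf2(trial, k + 1) is not None:
                equations = trial
                clk.append(1)            # scan chains shift in this cycle
                break
            clk.append(0)                # hold scan chains, LFSR is reseeded
            gated += 1
            if gated > max_gated:
                raise ValueError("care bits not encodable within the limit")
    return solve_gf2(equations, len(clk)), clk


def replay(tsin, clk, m, dist):
    """Concrete simulation: list of LFSR states loaded into the scan chains."""
    state, loaded = [0] * m, []
    for bit, active in zip(tsin, clk):
        state = skip_step(state, bit, dist)
        if active:
            loaded.append(list(state))
    return loaded


tsin, clk = encode(["X1XX", "0XX1"], m=4, dist=2)
print(tsin, clk, replay(tsin, clk, 4, 2))
```

In this small example the first slice cannot be produced in the first clock cycle, so one gated cycle is inserted before the slice is loaded; replaying the resulting Tsin and clk sequences on the concrete model confirms that all care bits are met.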
In our experiments we search within a distance (measured in the number of clock cycles) equal to m in order to keep the CPU time acceptable.

Phase 2 algorithm:
1. Divide the test pattern into m-tuples of bits that will be loaded in parallel into the scan chains. Set the clock cycle number NCC = 0.
2. Consider the next m-tuple of the test pattern bits.
3. Try to find the value or values of the not yet set Tsin bit that guarantee that the already set Tsin and clk sequences together with the new bits encode the considered part of the test pattern. If successful, go to 6.
4. Try to find another Tsin and clk sequence with the same number of non-active clk cycles that enables finding a Tsin bit value that successfully completes the sequence (limited number of attempts, limited depth of the search). If successful, go to 6.
5. Set NCC = NCC + 1. Set clk(NCC) non-active. Go to 3.
6. Set NCC = NCC + 1. Add the Tsin(NCC) bit and an active clk(NCC) bit to the sequences. Is the pattern completed? If yes, go to the next pattern; if no, go to 2.

IV. EXPERIMENTAL RESULTS

In the first experiment we studied the influence of the decompressor LFSR length on the number of test clock cycles and the hardware overhead. We have chosen the ISCAS89 circuit S38417, which has 1636 internal flip-flops and/or inputs and outputs concatenated into scan chains. We divided these scannable points into a variable number of parallel scan chains of approximately the same length. The relative hardware overhead corresponding to decompressors of different lengths and the test lengths are plotted in Fig. 4. We have performed test compression according to the proposed algorithm. In this experiment we have chosen the parameter dist equal to 5 for all the considered cases. We can see that the effectiveness of adding more parallel chains declines for bigger numbers.

Fig. 4. Test length, area overhead vs # of scan chains for ISCAS89 circuit S38417

The second experiment describes the dependency of the test length on the parameter dist for the same circuit as in the first experiment. The number of scan chains was set to 79. In Fig. 5 we can see that XORing the input bit with the content of regularly distributed flip-flops is hardware consuming, and for this reason we have to make a tradeoff between hardware overhead and test length. On the other hand, we can see that the dependency of the test length is not monotonic. The minimum test length is obtained for the parameter dist equal to 5. The optimum ratio between the number of scan chains and dist was found for several smaller ITC benchmark circuits [3] by exercising all relevant dist/number_of_scan_chains ratios. The graphs showing the test length space are given in Fig. 6 and Fig. 7. Extrapolating the results, we can claim that the most interesting values of the number of scan chains lie in the interval between 2*scan_chain_length and 3*scan_chain_length. The dist parameter value has to be between 2 and 5.

Fig. 5. Test length, area overhead vs dist

TABLE I. NUMBERS OF TEST CLOCK CYCLES, DISTANCE BETWEEN PARALLEL INPUT XORS, HARDWARE OVERHEAD FOR ITC'99 AND ISCAS89 BENCHMARK CIRCUITS USING THE PROPOSED DESIGN AND EDT

    Circuit  # scan   Industrial test  # clock cycles  # clock cycles  Proposed  Area: proposed/
             chains   coverage [%]     industrial      proposed        dist      industrial
    b03      13       95               729             157             2         0.5
    b04      15       97.4             2693            1187            2         0.54
    b05      6        99.5             2304            921             2         0.34
    b06      8        96.8             337             49              2         0.40
    b07      16       95.3             1513            804             5         0.51
    b08      9        98               1071            281             2         0.43
    b09      5        84.3             537             420             2         0.33
    b10      25       91.7             1329            323             3         0.62
    b11      17       96               2504            837             3         0.55
    b12      28       94.7             4326            2216            5         0.65
    b15      22       95.5             20740           37554           4         0.66
    b17      38       99.4             124019          134297          5         0.70
    b20      23       96.2             34820           60036           5         0.60
    s38417   80       99.3             36499           35108           5         0.93
    s35932   80       99.7             3298            2547            5         0.93
    s38584   80       99.7             23087           17633           3         0.95

We have also experimented with multi-bit Tsin inputs. This arrangement requires adding extra tester channels depending on the number of parallel inputs. For the benchmark circuit S38417 and a two-bit Tsin input the test length was reduced by the factor 0.85, while the total test data volume was increased by the factor 1.2. According to this result it seems more advantageous to use a one-bit Tsin input in cases where the test data volume and the number of tester channels are more important than the test time.

Fig. 6. b04 - Colormapped test length vs dist vs # of scan chains (minimum 1187 at 15 chains, dist = 2)

Fig. 7. b07 - Colormapped test length vs dist vs # of scan chains (minimum 804 at 16 chains, dist = 5)

Table I summarizes the best obtained test lengths and the hardware overhead for the corresponding ratios between the number of parallel scan chains and the dist parameter, and compares the results with a standard industrial test compression method.
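The clock-cycle columns of Table I can be compared directly. The short script below, added here only as an illustration, computes the ratio of proposed to industrial test length per circuit and lists the circuits for which the proposed method is shorter; the values are copied from Table I, while the variable names are arbitrary.

```python
CIRCUITS = ["b03", "b04", "b05", "b06", "b07", "b08", "b09", "b10",
            "b11", "b12", "b15", "b17", "b20", "s38417", "s35932", "s38584"]
INDUSTRIAL = [729, 2693, 2304, 337, 1513, 1071, 537, 1329,
              2504, 4326, 20740, 124019, 34820, 36499, 3298, 23087]
PROPOSED = [157, 1187, 921, 49, 804, 281, 420, 323,
            837, 2216, 37554, 134297, 60036, 35108, 2547, 17633]

# Ratio below 1.0 means the proposed method needs fewer test clock cycles.
ratios = {name: prop / ind
          for name, ind, prop in zip(CIRCUITS, INDUSTRIAL, PROPOSED)}
shorter = [name for name in CIRCUITS if ratios[name] < 1.0]
longer = [name for name in CIRCUITS if ratios[name] >= 1.0]

for name in CIRCUITS:
    print(f"{name:8s} {ratios[name]:.2f}")
print("shorter:", len(shorter), "longer:", longer)
```

Note that the industrial figures were obtained at the reduced test coverage listed in the table, so the ratios compare test lengths only and not tests of equal quality.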
This method had no possibility of gating the scan chain shifting, so the fault coverage was reduced due to lockouts. The flow we used for it was the following. We created parallel scan chains and wrapper chains in the benchmark circuit and removed its primary inputs. The number of parallel scan chains was set equal to the figures required for the proposed method. The initial number of ATE channels was set to one in order to lower the hardware overhead. A single ATE channel was composed of the test clock signal, three control signals and a single pair of serial data signals. We aimed to create a test with the maximum test coverage. The term test coverage means the percentage of faults detected among all testable faults. The test coverage is influenced by the ability of the ATPG to create test patterns that detect the testable faults under the restrictions imposed by the need to avoid possible lockouts caused by the decompressor. The obtained fault coverage depended on the number of required parallel scan chains and the number of tester channels. These constraints were necessary because they enabled a comparison with the proposed method.

For the proposed decompressor we used a similar flow. The CUT was equipped with a wrapper and scan chains similarly to the experiment with the industrial tool. We used separate flip-flops for the test pattern decompression LFSR and for the parallel scan chains. The embedded test equipment was controlled via a clock signal, a single control signal and a pair of serial data signals. For all experiments we used an industrial synthesis tool with a standard AMS 370 cell library. We compared the area overhead of the decompressor and the MISR only. The CUT hardware, including its primary inputs and outputs, is not included in our measurements.
The results presented in Table I show that the proposed solution gives competitive results in terms of test length and hardware overhead. Test compression according to the proposed method has no influence on the test coverage; for this reason we have not listed this value in Table I. We have also compared the proposed decompression system with the Illinois Scan methods [16] and [18]. The test lengths of the proposed method for the largest ISCAS89 benchmarks were more than two times shorter, so the comparison is not included in the table. The compression algorithm computations can be done in parallel on a computer grid, so with modern computers it is feasible to encode even the largest circuits. The proposed method influences the average power spent during test, as it gates the clocking of the scan chains during LFSR reseeding. For larger
circuits the reduction of the scan chain activity was approximately 10%.

V. CONCLUSION

We have experimentally verified that it is possible to reduce the number of clock cycles and the number of ATE channels compared with commonly used compression methods while keeping the hardware overhead low. The proposed decompression method effectively eliminates the LFSR lockout. The proposed skipping LFSR provides channel separation similar to that of an LFSR with a phase shifter, but with lower hardware overhead than the originally introduced Smart BIST solution. We have studied how the length of the test sequence depends on the number of LFSR flip-flops XORed in parallel with the test data bit (the Tsin sequence) and on the number of parallel scan chains loaded from the decompressor. The demonstrated results can help designers optimize the parameters of decompression with the Smart BIST. The decompression parameters obtained are competitive with the design results provided by industrial compression tools. The experiments have demonstrated that the proposed decompressor has low hardware overhead and does not reduce fault coverage, even for circuits with limited access to the primary inputs, for which the industrial tool significantly degraded the fault coverage. The experiments were performed with a developmental compression tool and were time-consuming. Given the positive results, parallel computation and custom-tailored solvers are worth using, as they can substantially speed up the process.

ACKNOWLEDGEMENT

This work was supported by the COST L-13019 program and the COST Action IC1103-Median program.

REFERENCES

[1] Balakrishnan, K.J.; Touba, N.A., "Improving Linear Test Data Compression," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 14, no. 11, pp. 1227-1237, Nov. 2006.
[2] Czysz, D.; Mrugalski, G.; Mukherjee, N.; Rajski, J.; Szczerbicki, P.; Tyszer, J., "Deterministic Clustering of Incompatible Test Cubes for Higher Power-Aware EDT Compression," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 30, no. 8, pp. 1225-1238, Aug. 2011.
[3] Davidson, S., "ITC'99 Benchmark Circuits - Preliminary Results," Proc. International Test Conference, p. 1125, 1999.
[4] Rajski, J.; Tamarapalli, N.; Tyszer, J., "Automated synthesis of phase shifters for built-in self-test applications," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 19, no. 10, pp. 1175-1188, Oct. 2000.
[5] Garg, R.; Putman, R.; Touba, N.A., "Increasing output compaction in presence of unknowns using an X-canceling MISR with deterministic observation," Proc. VTS, pp. 35-42, 2008.
[6] Golan, P., "Pseudoexhaustive Test Pattern Generation for Structured Digital Circuits," Proc. IX International Conference on Fault Tolerant Systems and Diagnostics (FTSD9), Brno, Czechoslovakia, pp. 214-220, 1986.
[7] Janicki, J.; Tyszer, J.; Mrugalski, G.; Rajski, J., "Bandwidth-aware test compression logic for SoC designs," Proc. 17th IEEE European Test Symposium (ETS), pp. 1-6, 2012.
[8] Lee, J.; Touba, N.A., "Low power test data compression based on LFSR reseeding," Proc. IEEE International Conference on Computer Design: VLSI in Computers and Processors (ICCD 2004), pp. 180-185, 2004.
[9] Lee, J.; Touba, N.A., "LFSR-Reseeding Scheme Achieving Low-Power Dissipation During Test," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 26, no. 2, pp. 396-401, Feb. 2007.
[10] Lee, J.; Touba, N.A., "LFSR-Reseeding Scheme Achieving Low-Power Dissipation During Test," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 26, no. 2, pp. 396-401, Feb. 2007.
[11] Kavousianos, X.; Tenentes, V.; Chakrabarty, K.; Kalligeros, E., "Defect-Oriented LFSR Reseeding to Target Unmodeled Defects Using Stuck-at Test Sets," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 19, no. 12, pp. 2330-2335, Dec. 2011.
[12] Koenemann, B.; Barnhart, C.; Keller, B.; Snethen, T.; Farnsworth, O.; Wheater, D., "A SmartBIST variant with guaranteed encoding," Proc. 10th Asian Test Symposium, pp. 325-330, 2001.
[13] Larsson, A.; Larsson, E.; Chakrabarty, K.; Eles, P.; Peng, Z., "Test-Architecture Optimization and Test Scheduling for SOCs with Core-Level Expansion of Compressed Test Patterns," Proc. Design, Automation and Test in Europe (DATE '08), pp. 188-193, 2008.
[14] Mrugalski, G.; Mukherjee, N.; Rajski, J.; Czysz, D.; Tyszer, J., "Compression based on deterministic vector clustering of incompatible test cubes," Proc. International Test Conference (ITC 2009), pp. 1-10, 2009.
[15] Muthyala, S.S.; Touba, N.A., "SOC test compression scheme using sequential linear decompressors with retained free variables," Proc. IEEE 31st VLSI Test Symposium (VTS), pp. 1-6, Apr. 29-May 2, 2013.
[16] Pandey, A.R.; Patel, J.H., "Reconfiguration Technique for Reducing Test Time and Test Volume in Illinois Scan Architecture Based Designs," Proc. 20th VLSI Test Symp. (VTS 02), IEEE CS Press, pp. 9-15, 2002.
[17] Rajski, J.; Tyszer, J.; Kassab, M.; Mukherjee, N., "Embedded deterministic test," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 23, no. 5, pp. 776-792, May 2004.
[18] Shah, M.A.; Patel, J.H., "Enhancement of the Illinois scan architecture for use with multiple scan inputs," Proc. IEEE Computer Society Annual Symposium on VLSI, pp. 167-172, 19-20 Feb. 2004.
[19] Tenentes, V.; Kavousianos, X.; Kalligeros, E., "State Skip LFSRs: Bridging the Gap between Test Data Compression and Test Set Embedding for IP Cores," Proc. Design, Automation and Test in Europe (DATE '08), pp. 474-479, 2008.
[20] Tenentes, V.; Kavousianos, X.; Kalligeros, E., "State Skip LFSRs: Bridging the Gap between Test Data Compression and Test Set Embedding for IP Cores," Proc. Design, Automation and Test in Europe (DATE '08), pp. 474-479, 10-14 March 2008.
[21] Touba, N.A., "Survey of Test Vector Compression Techniques," IEEE Design & Test of Computers, vol. 23, no. 4, pp. 294-303, April 2006.
[22] Wang, Z.; Chakrabarty, K.; Wang, S., "Integrated LFSR reseeding, test-access optimization, and test scheduling for core-based system-on-chip," IEEE Trans. CAD, vol. 28, pp. 1251-1263, Aug. 2009.