Power-driven FPGA to ASIC Conversion


WenHai Fang (a) and Lambert Spaanenburg (b)
(a) SwitchCore AB, Emdalavägen 1, Lund (Sweden)
(b) Dept. of Information Technology, Lund University / LTH, P.O. Box 11, Lund (Sweden)

ABSTRACT

Gate arrays are often presented as a convenient means for ASIC prototyping. Obviously, both can perform the same function and can therefore be built from the same behavioral description. Design development implies a process of successive parameter bindings, leaving steadily less freedom for the remaining implementation choices. On the other hand, the ASIC offers more place & route freedom than the gate array. Hence it is commonly suggested that an optimal prototype will always have an acceptable ASIC realization. But this does not make the gate array an easy stepping-stone in ASIC development. Differences in platform technology induce different structural sugaring to achieve a reasonable implementation. This cannot easily be ported, unless the implementation is developed while keeping the restrictions of the other technology in mind. This implies that a number of scaling rules must form the foundation of the design transformation process. This paper looks into the platform commonalities of Field-Programmable Gate-Arrays and standard-cell ASICs from fundamental physical principles. These basic considerations are then related to show how the area and speed restrictions in the logic synthesis can be applied to carry power-efficient designs efficiently from prototype to realization. This is illustrated in the design of the SNOW-2 encryption core, where a consistent 3% power reduction is achieved.

Keywords: Application-Specific Integrated Circuit, Field-Programmable Gate-Array, Computational Energy, Dynamic Power Dissipation, Encryption.

1. INTRODUCTION

Most microprocessors are embedded in products with non-computing purposes, which brings aspects other than pure number-crunching performance to the foreground. Though these aspects are also of concern in classical computers, the move towards embedded computing probably makes them the prime design aspect. On the other hand, embedded systems tend to be prototyped on Field-Programmable Gate-Arrays (FPGA). They offer a rich set of pre-integrated gates and macros that can be personalized to the desired system, together with a second system that performs this configuration. Compared to Application-Specific Integrated Circuits (ASIC), area consumption has become the degree of gate utilization, while power consumption is largely overshadowed by the static dissipation of the pre-integrated parts.

For effective prototyping, it is desirable that the decisions made in designing an FPGA will also hold for the ASIC. Altera has for some time advertised that an FPGA can be transformed into an ASIC by a technique called HardCopy [1]. This essentially takes away all the unused gates from a design after Place & Route. This is clearly not the most efficient approach, though it will reduce the area and the power consumption. The alternative is to start from scratch, i.e. to use the VHDL-coded design for a new Place & Route effort. It is often found that this requires an entirely new set of P&R decisions to be made. This leaves the question whether and under which circumstances the P&R decisions for the FPGA implementation can be re-used.

The problem of re-using design decisions becomes even more pressing when aiming for a low-power design. Techniques for power reduction can be applied at several stages of system design [2]. The software implementation can already be coded for power efficiency. Most of this has to do with handling the memory access, as the communication external to the chip and the distribution of such data streams within the chip are a major contributor. This is in turn reflected in architectural decisions that aim to limit the system part that is critical for overall speed. Having a larger part of the system work at a lower clock speed will always benefit the power efficiency. In the extreme, one may opt for no synchronicity. Still, a number of measures can be taken on the logic level: clock gating and parallel processing, in combination with design steps such as re-timing and unfolding. On a lower level of abstraction, a number of circuit design techniques have been proposed, like Dynamic Voltage Scaling [3].

Notably, IP cores for secure portable products have to be low on area and energy. In this paper we report on a further development towards a power-efficient implementation of the SNOW-2.0 standard. Therefore, we will first discuss some applicable techniques to see their impact on FPGA and ASIC design. Then we treat some specific methods for FPGA designs. Subsequently, the SNOW-2.0 design is introduced [4] and the derived techniques are applied. Finally we evaluate such techniques and find that the dynamic power dissipation can be reduced by a factor.

2. THE BASICS OF POWER REDUCTION

The power dissipation of a logic gate is composed of a static contribution and a dynamic one. The static or quiescent power is consumed to keep the circuit in a well-defined static state. It was drastically reduced by the advent of CMOS technology, but returns when discussing FPGAs. The dynamic contribution is largely due to the clock frequency $f_{clk}$ and the logic switching rate $\alpha_{0 \to 1}$, formulated by Neil Weste [5] for full-custom design as

$P_{dyn} = \alpha_{0 \to 1} \, f_{clk} \, C_L \, V_{dd}^2$.

Though it is valid for FPGA design also, the interpretation for power reduction methods seems to be different.

Figure 1: Full-custom power/delay curve (power P versus frequency f, with regions I, II and III) in theory (a) and in the practice of FPGA design (b), demonstrating the effect of different P&R efforts on Fang's design with 400,000 bits [4].

In full-custom design, the gate is sized to drive a capacitive load $C_L$ at just the right frequency. As the gate delay depends on $C_L / (V_{dd} \cdot K)$, where $K = W/L$, a higher frequency is reached by raising the geometry ratio K. However, doing so raises not only the frequency but also the capacitive load $C_L$, which is composed of a driver contribution $K \cdot C_G$ and a driven part $C_W$. Consequently the law of diminishing returns takes effect, which can be described by recognizing three regions in the power/frequency curve (Figure 1a):

a. Region I. If the load capacitance is mainly due to the wiring (and fan-out), the gate delay is a function of $C_W / (V_{dd} \cdot K)$, which makes the power dissipation a linear function of K (and therefore of f).
b. Region II. If both wiring and logic play a role in the load capacitance, K has to change more than before to raise the frequency. Consequently, the power curve will be slightly non-linear.
c. Region III. If the load capacitance is only slightly caused by wiring but largely driver-dominated, the gate delay becomes geometry-independent. As the logic gate is so large that it is merely loaded by itself, the frequency is almost saturated and can only be marginally changed. But that little change in frequency takes a drastic increase in K, making for a steep rise of the power/frequency curve.

We can easily deduce from Figure 1a that reducing the operating frequency for designs at operating points in region III will drastically bring the power dissipation down. A typical example is bank switching. As the power dissipation rises by more than a factor 2 when the frequency is doubled, doubling the circuitry while halving the frequency will effectively bring the power consumption down.

In FPGA design the gates are not sized. Therefore the driver contribution to the load capacitance is fixed and the power dissipation will be linear in the frequency as long as the gates are fast enough. We encounter this voluntary limitation to region I behavior, somewhat relaxed, in semi-custom design also, and even full-custom design can be done this way. For ASIC design, the slow rise of the region I curve precludes the benefit of structural duplication. As the power dissipation rises by less than a factor 2 with frequency, doubling the circuitry while halving the frequency will only raise the power consumption.

Apparently, the important parameter is the Differential Power Dissipation $DPD = \partial P / \partial f$. If this value is more than 2, then the power dissipation will drop by a larger ratio than the frequency. If it is smaller, then the effect is debatable and may not be an improvement at all. In general, the Differential Power Dissipation is layout dependent. This shows when the Place & Route is performed at different levels of effort. As an example, we show in Figure 1b the effect of different P&R efforts on the best result discussed by Fang [4].

What remains is modifying the circuit structure. For an ASIC, this can be part of the design and can be furthered by also tuning the logic structures. For instance, a circuit with a low admissible clock rate can be lowered in propagation speed to restrict the power dissipation by serializing the logic structure. Of specific interest here is tapering, a serialization style aimed at moderating the relation between K and C for a given f. It is mostly applied in I/O circuits, but also explains the benefits of using buffers when a relatively large load has to be driven. In FPGA design, the tooling will use pre-placed buffers, as long as the tooling is made aware of the problem. For instance, the clock will be distributed over a balanced arrangement of buffers to ensure timing.

The apparent contrast between ASIC and FPGA design is that in the latter case spatial tapering cannot be done with similar comfort. Fortunately, we can still create a tapering in time by manipulating externally supplied signals, notably the clock and the reset. Both the clock and the reset are global wires that cross the entire chip and are therefore a major source of dissipation.
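The bank-switching argument from earlier in this section can be checked with a few lines of arithmetic. The sketch below is only illustrative: the power-law curve `power_law` and its exponents are hypothetical stand-ins for the regions of Figure 1a, not measured data, and the overhead of the extra multiplexing logic is ignored.

```python
def dynamic_power(alpha, f_clk, c_load, v_dd):
    """Dynamic power P_dyn = alpha * f_clk * C_L * V_dd**2, in watts."""
    return alpha * f_clk * c_load * v_dd ** 2


def duplication_ratio(power_at_f, power_at_half_f):
    """Power of two circuit copies at f/2, relative to one copy at f.

    Below 1.0: duplicating the circuit and halving the clock saves power.
    At or above 1.0: duplication does not pay off.
    """
    return 2.0 * power_at_half_f / power_at_f


def power_law(f, exponent, scale=1.0):
    """Hypothetical power/frequency curve P = scale * f**exponent."""
    return scale * f ** exponent


if __name__ == "__main__":
    f = 100e6  # assumed operating frequency in Hz
    # Exponent 1 mimics region I (linear); exponent 3 mimics a steep region III.
    for name, n in (("region I (linear)", 1.0), ("region III (steep)", 3.0)):
        ratio = duplication_ratio(power_law(f, n), power_law(f / 2, n))
        print(f"{name}: duplicated design uses {ratio:.2f}x the power")
```

With a linear curve the ratio is exactly 1, so any duplication overhead makes the design worse; only when the power more than doubles with a doubling of the frequency does the ratio drop below 1.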
Multiple clock and multiple power lines can help to reduce the dissipation of these global wires, while tuning every local circuit to the requirements by adapting the signal frequency (and not the circuit activity).

3. DECENTRALISED CONTROL

There has always been a heavy debate in processor design between centralized and decentralized control. Centralized control evolves from the separate attention to control and data-path that results from the Instruction Set Architecture concept. In synchronous design, the rationale is the distribution of the signals that designate what the effect of the continuous stream of clock pulses has to be. In asynchronous design, this issue does not exist and the existence of many control signals became the ruling factor. It has been argued that the main advantage of such designs is the lack of the global clock [6]. More recently it has been concluded that this signal causes a large share of the power dissipation [7].

It is easy to see that the power dissipation of a synchronous system can therefore be lowered by distributed control, assuming that this is used to localize the clock. It extends the concept of tapered timing with clock gating and local powering. The beneficial implementation is based on clock gating. Here the clock to a circuit is passed through a gate, where a hold signal is fed to the other input. The output will not pass the clock while the hold signal is active, and consequently the circuit is not clocked. The BUFGCE cell (available in Spartan-3, Virtex-II, Virtex-II Pro and Virtex-II Pro X) can implement such a clock gating block.

Clock gating (or local clocking) goes hand in hand with local powering. An example of this gated powering can be found in the classical microprocessor with on-chip cache. Assuming locality of operation and data, the hierarchical construction of the cache allows most of this memory to be sleeping, as it will not be needed for the current part of the program execution.

The only twiddle factor left to play with is the switch factor. In the case where an interconnect bus shows little logic activity, one may opt for time multiplexing. The question to answer here is: what is the power minimization equivalent of multiplexing? Let us take a look at clock phases in order to answer this question (Figure 2a).

Figure 2: (a) Clock phases and (b) a 2-edged register.
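Before turning to the clock phases, the clock-gating scheme described earlier in this section can be modelled behaviourally. The following is a minimal sketch, not a description of any particular cell: the sample-based signal representation, the function names and the hold pattern are assumptions made for illustration, and the hold signal is only changed while the clock is low so that no glitches appear.

```python
def gated_clock(clk, hold):
    """Gate the clock: it passes only while the hold signal is 0."""
    return [c & (1 - h) for c, h in zip(clk, hold)]


def rising_edges(signal):
    """Count the 0 -> 1 transitions in a sampled signal."""
    return sum(1 for prev, cur in zip(signal, signal[1:])
               if prev == 0 and cur == 1)


if __name__ == "__main__":
    cycles = 8
    clk = [0, 1] * cycles              # two samples per clock period
    # Hypothetical activity: the block is needed for 2 of the 8 cycles only.
    hold = [0, 0] * 2 + [1, 1] * 6
    gated = gated_clock(clk, hold)
    print("clock edges on the global net:", rising_edges(clk))
    print("clock edges seen by the block:", rising_edges(gated))
```

Since the dynamic dissipation of the gated sub-net scales with the number of clock transitions it sees, the block in this example would draw roughly a quarter of the clock-related dynamic power of its ungated counterpart.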

Ever since the period in which TTL was the ruling technology, it has been good practice to separate data and control on different clock edges. In the LSI era, the technique lost popularity as new circuitry for clock balancing needed to be developed. But it allows the controller to provide the data path with signals that will keep the circuit active when desired, while holding it inactive otherwise. The signal has both a rising and a falling edge; the combination is used to create the read & store functionality that makes a double latch into a flip-flop (Figure 2b). This can easily be extended, for instance to registers that operate on both edges but at half the frequency.

The above discussion shows that the concept of multiplexing does not change fundamentally. The only difference is that the aim is now to raise the operational meaning of the line without raising the activity. In other words, every signal change needs to have as much operational meaning as possible as long as the activity rate remains the same. This has always been good practice, but here it also implies that using both edges allows reducing the frequency. Consequently, the power dissipation goes down while the noise margin (in terms of time between two events) remains the same.

Of major concern are the I/O gates, which are a main source of power dissipation next to the global (clock) lines. Multiplexing signals on a smaller number of I/O gates has a significant impact on power dissipation. The same is true for the gated powering of the peripherals.

4. THE ASIC/FPGA TRADE-OFF

The Field-Programmable Gate Array (FPGA) is based on standard blocks from which any type of logic can be created. Such standard blocks are actually memory parts, where functions can be stored such that the block behaves as the intended logic. This is the principle, but the raw reality is that there is one type of block with a specific capacity. The bare components will still be there when less than the block capacity is used. Using more than the capacity is not possible; in that case simply more blocks will be used. This turns logic synthesis into allocation.

A related issue has to do with macro usage. Macros are blocks optimized for a specific function domain such as multiplication and data storage. The function can also be built from the logic slices, which are the standard blocks in Xilinx families, leaving the question of when to use the macro and when to use the slices. The crossover point is important, but clearly the exact value depends on the FPGA type.

The Block SelectRAM (BS-RAM) is a configurable memory module, generated by the Core Generator toolset with parameters that are assigned by the user. As a means of data storage it is meant to bring improvement over Distributed RAM. The latter is based on the individual register elements contained in the logic slices, which makes the intrinsic speed higher than that of a macro element. Distributed RAMs are crucial to many high-performance applications that use relatively small, embedded RAM blocks such as FIFOs or small register files. But beyond a certain number of register sets, speed or area will have deteriorated by such an amount that a single macro has become more efficient. Therefore the question is: when to use what? The answer to this question will typically be different for different design goals.

Figure 3: (a) The Pareto curve (area in slices versus execution speed in ns) for SNOW 2.0 designs with a different BS- versus Distributed RAM ratio and (b) the power dissipation in BS- and Distributed RAM.

There has been a long-running dispute on how to relate slices and LUTs on the one hand to logic gates on the other. The use of the equivalent logic gate has been proposed and enjoys a limited popularity. It expresses a design in the number of 2-input logic NAND-gates that are needed to achieve the same functionality, but this does not do justice to technologies where the logic NAND is not an efficient basic building block. Relating slices to memory locations is even more difficult to achieve. It seems justified to use an equivalence ratio within the same technology. This can be obtained by starting from a design with many memory blocks and doing a number of successive transformations to decrease the memory usage, or vice versa [4]. From a typical case study, of which the Pareto curve for the different implementations is shown in Figure 3a, it can be deduced that a BS-RAM is the area equivalent of 256 slices. Therefore it is convenient to replace any logic part in excess of 256 slices by a lookup table in BS-RAM.

We cannot take this for a universal truth, as a synthesis report will not clarify how much of each slice is actually used. The size of a logic circuit that corresponds to a table of fixed size varies considerably. This underlines that it is very hard to translate an FPGA design into gate equivalents, or to compare designs on FPGAs of different brands. Hence we will restrict our evaluation to Xilinx FPGAs only.

In this paper we decide whether to use BS- or Distributed RAM on the basis of power dissipation instead of just area. From Figure 3b we find that BS-RAM has a constant consumption of 421 mW, while the Distributed RAM increases its dissipation with the register count. The reason is that BS-RAMs internally use a gated clock to activate just one row at a time. On the other hand, the clock drives all blocks in a Distributed RAM, regardless of the addressed block. Not surprisingly, the crossover point is around 10 registers of 32 bits each.

5. A CASE STUDY

A typical IP core for quality embedded systems finds application in secure communication. Data encryption has always been adequately supported in software, but the introduction into embedded devices necessitates hardware. This raises questions on the relation between algorithmic architecture and efficient hardware implementation. Such relations have been under investigation within the ECRYPT European Network of Excellence over the past years.

The most widespread encryption algorithms are DES and AES. Clearly a lot of work has been spent on finding suitable hardware implementations. AES is not really low in its computational demands and is not really efficient in streaming; therefore ECRYPT has initiated a competition for new concepts that can be suitably integrated into an embedded system, called eSTREAM. The SNOW encryption algorithm for streaming had already been proposed. Though the first version was rapidly broken, the second version [10] has survived the scrutiny of the ECRYPT community and has recently seen its first commercial application for portable telephony. Therefore SNOW does not participate in the competition anymore, but has already moved on to become a standard.

Figure 4: (a) A schematic view of SNOW-2.0 (Key & IV, LFSR, FSM, keystream, plaintext/ciphertext addition) and (b) an initial implementation.

SNOW-2.0 is a word-oriented stream cipher generator with a word size of 32 bits (Figure 4a). It takes only two input values: a secret key of either 128 or 256 bits and a public initialisation value IV of 128 bits. The length of the linear feedback shift register (LFSR) is 16 words. The FSM consists of two 32-bit registers, R1 and R2, and a so-called S-box that calculates the output of the FSM. This output is XORed with the first element of the LFSR to provide the desired running key.
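The LFSR feedback of SNOW-2.0 multiplies 32-bit words by a field element α and by its inverse; in hardware each of these reduces to a byte shift plus an XOR with one of 256 stored patterns. The sketch below shows how such tables could be generated in software. It is a hedged illustration: the field polynomials are quoted from the published SNOW 2.0 specification as best recalled and are not taken from this paper, and all function and table names are hypothetical.

```python
BETA_POLY = 0x1A9  # x^8 + x^7 + x^5 + x^3 + 1, assumed to define GF(2^8)


def gf256_mul(a, b):
    """Multiply two elements of GF(2^8) modulo BETA_POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= BETA_POLY
        b >>= 1
    return result


def gf256_pow_beta(exponent):
    """Return beta**exponent, where beta = 0x02 is a root of BETA_POLY."""
    value = 1
    for _ in range(exponent % 255):
        value = gf256_mul(value, 0x02)
    return value


def build_table(exponents):
    """256-entry table; entry c packs the bytes c * beta**e, first e on top."""
    factors = [gf256_pow_beta(e) for e in exponents]
    table = []
    for c in range(256):
        word = 0
        for factor in factors:
            word = (word << 8) | gf256_mul(c, factor)
        table.append(word)
    return table


# Assumed: alpha is a root of x^4 + b^23 x^3 + b^245 x^2 + b^48 x + b^239.
MUL_ALPHA = build_table((23, 245, 48, 239))
# It follows that alpha^-1 = b^16 alpha^3 + b^39 alpha^2 + b^6 alpha + b^64.
MUL_ALPHA_INV = build_table((16, 39, 6, 64))


def mul_alpha(w):
    """Multiply a 32-bit word by alpha: byte shift left plus table XOR."""
    return ((w << 8) & 0xFFFFFFFF) ^ MUL_ALPHA[w >> 24]


def mul_alpha_inv(w):
    """Multiply a 32-bit word by alpha^-1: byte shift right plus table XOR."""
    return (w >> 8) ^ MUL_ALPHA_INV[w & 0xFF]
```

Each table holds 256 words of 32 bits, which is the kind of look-up content that an FPGA implementation would store in a block RAM rather than compute on the fly.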

This paper describes the gradual and continuous enhancement of the original SNOW 2.0 IP core in line with the different approaches to power reduction discussed in Sections 2 to 4. The power estimation is performed using XPower on the back-annotated design. The initial design makes use of BS-RAMs, configured to allow for two read or two write accesses simultaneously. Figure 4b shows that 5 LFSR values are used at a time to calculate the new LFSR value, the value of R1 and the keystream. The LFSR itself is created around one BS-RAM. In the feedback loop, multiplication with α and α⁻¹ can be implemented as a simple byte shift plus an additional XOR with one of 256 possible patterns. So this can be implemented using a look-up table stored in a BS-RAM. The S-box is also implemented using a look-up table. In other words, the only components in use are XOR, bit adder, bit shifter and look-up table.

Figure 5: The different realizations of the S-box: the small (a) and the large (b) design, with operational performance (c).

First of all we have looked at the filling and clocking of the BS-RAMs used to implement the S-box. There are two options, as illustrated in Figures 5a and 5b. By using two BS-RAMs with 512 * 32 bits each, the individual clock frequency is reduced. Alternatively, the two boxes can be merged into a single RAM with 256 * bits at the expense of some additional glue logic. Here, the full clock rate needs to be used, while we have more gates. As illustrated in Figure 5c, the double box solution is preferred, reducing the power needs by at least 7-12%.

The next step is the insertion of a DCM in combination with double-edge triggering, as discussed with Figure 2. This allows separating the system logic into full-speed and half-speed parts. The effect is a reduction of at least 6 - %.

Then the point of interest is the LFSR. It uses only 16 words in a BS-RAM and can easily be replaced by the Xilinx SRL16, which is efficiently mapped on the LUTs. This gives a minimal 3 6% reduction, based on the fact that a BS-RAM dissipates the same amount of power irrespective of the filling grade, while the LUT dissipates only for the parts that are used.

As we have mentioned in Section 3, power consumption can also be reduced by clock gating. The initial key block works only in the beginning. Hence, we use clock gating to shut this block down after use. This brings an additional 4% reduction. Finally, the creation of the key stream is separated from the creation of new LFSR values, which further reduces the power dissipation by 2%. The block diagram of the final design is shown in Figure 6a. The overall reduction of the dynamic power dissipation in the three BS-RAM designs is frequency dependent and ranges from 16 to 34%.

We have noted that a tool like XPower measures power as an average over the simulation period. In stream encryption, we have an initialization phase followed by execution. To measure the execution power correctly, the simulation has to cover at least 20,000 clock cycles.

6. DISCUSSION

A first design of the SNOW-2.0 IP core has been discussed by Fang [4]. It focuses on the utilization of the FPGA resources to achieve a small footprint and a high speed. This design has been taken as reference for the evaluation of the power reduction techniques discussed above. The power estimations produced by the Xilinx XPower tool are shown in Figure 6b. All values represent a design that is synthesized by Synplify-.1 using a speed restriction that corresponds to the required clock frequency. The original design, on which the modifications have been implemented in sequence, displays a dynamic dissipation that rises by 5 mW per MHz increase in clock frequency.

Figure 6: The final power-efficient SNOW design (a) and the dynamic power dissipation compared to the original design (b).

The new design starts at a lower dissipation level as it contains 44% less equivalent used slices. The effect of our measures can be found from the slope of the curve, showing an increase of 4 mW per MHz. Both designs show a steeper increase at the end of the affordable clock frequency spectrum. When the FPGA architecture gets to the limit of its clock budget, the combinatorial logic gets parallelized.

At this point we like to draw attention to what happens between 100 and 140 MHz. The Differential Power Dissipation (DPD) has been decreased from 5 mW/MHz to 4 mW/MHz and now suddenly drops, without any change in the original VHDL file, to 1.4 mW/MHz. The synthesis report reveals no major change in the logic but a 2% increase in the number of occupied slices. This suggests that the difference is caused by problems in the Place & Route. Checking the logs of previous experiments, we have noticed that the curve above 100 MHz cannot always be reproduced. One out of four physical design attempts ends not with the shown curve but with a linear extension of the curve below 100 MHz, similar to the result for Fang's design. This has a striking similarity to what has been observed earlier in ASIC-oriented design.

Looking back at what we stated in Section 2, we can now conclude that the three regions in the power/delay curve can also be distinguished in FPGA design. The difference lies in the role of time domination. Where in ASIC design an additional sizing of the driving gate compensates the additional loading, in FPGA design a different physical design of the overall circuit will take place. In region I, the speed requirements are so loose that the tools attempt to place successive gates close to one another. When all the space is taken, the remaining gates will be placed far away. In region III, the speed requirements are dominant and hard to fulfill. In the transition, arbitrarily dense packing becomes unfeasible and a more balanced arrangement may occur.

The consequence of this observation lies in an eventual conversion of the FPGA into an ASIC. When a HardCopy is made of the FPGA implementation, the timing-restricted (but not timing-dominated) regime will provide a better starting point, as the layout is less related to the specifics of the FPGA architecture.

7. CONCLUSIONS

We have applied variations on three techniques in an attempt to decrease the power dissipation of an existing design:

a. Optimal division between distributed and BS-RAM for compact storage;
b. Multiple and gated clocks for distributed and block activation;
c. Adaptation of the logic structure to enable the efficient application of the above two techniques.

As the example stream encryption core operates on a bit per second, it is unlikely to have underused parts. The design operates successfully for internal clock frequencies up to 306 MHz, while displaying power figures for a maximum external frequency of 153 MHz. This is because the process concept in VHDL forbids the use of double-edge triggered registers. We have used the Digital Clock Manager (DCM) to reach the same effect by local frequency doubling. As a consequence, the FPGA-based design is limited in its clock range to half that of the original design. This is only a matter of technology. The noted restriction will disappear when moving to an ASIC.

In literature, some SNOW implementations are described where BS-RAM usage is varied to exchange throughput for area consumption [4]. Some typical designs are mentioned in Table 1 with their power figures to enable a comparison with the low-power design as developed in this paper. All designs give output on every clock cycle and therefore have the same throughput of 3840 Mbps (32 bits per cycle) at a 120 MHz clock.

Table 1: Comparison between designs (f = 120 MHz) with a 128-bit secret key.

                                        Design A [4]   Design C [4]   Design D [4]   This paper
Slice count / BS-RAMs                   72/7           1360/4         1936/0         624/3
Dynamic power
Overall power
Throughput / Dynamic power
Throughput / #Slices and Dynamic power
Throughput / #Slices and Power

The discussed measures lead to a reduction in pure power dissipation of up to 3%. Alternatively, we can use computational power, defined as throughput divided by power [11], for the purpose of cleaning the comparison between the designs from irrelevant influences. We have refrained from including the results from investigations on equivalent gate metrics [12], because their power estimates are based on too limited measurement periods. Though a distinct effect of BS-RAM usage on power dissipation cannot be denied, our low-power measures prove to be much more effective. Overall the dynamic computational power has increased by 39%. Normalized on area, the improvement is even 102% compared to the best design published earlier [4].

ACKNOWLEDGEMENTS

The authors like to thank Thomas Johansson, Chao Chen, Yitao Jia and Suleyman Malki for their support throughout this research.

REFERENCES

1. Altera, 1 March.
2. H. van Gageldonk, An Asynchronous Low-Power 80C51 Microcontroller, Ph.D. Thesis, Eindhoven University of Technology, pp. 33-34, 49-54.
3. D. Shin, J. Kim and S. Lee, Low-energy intra-task voltage scheduling using static timing analysis, Proceedings DAC.
4. W. Fang, T. Johansson and L. Spaanenburg, Snow 2.0 IP Core for Trusted Hardware, Proceedings FPL 2005, Tampere, Finland, August.
5. N. Weste and K. Eshraghian, Principles of CMOS VLSI Design: A Systems Perspective, Addison-Wesley.
6. L. Spaanenburg, et alieni, One-chip microcomputer design based on isochronity and selftesting, Digest EDA'4, Warwick (England), March.
7. K. v. Berkel and M. Rem, VLSI programming of asynchronous circuits for low power, in Asynchronous Digital Circuit Design (G. Birtwistle and A. Davis, eds.), Workshops in Computing, Springer-Verlag.
8. eSTREAM, 1 March.
9. P. Ekdahl and T. Johansson, SNOW a new stream cipher, Proceedings 1st NESSIE workshop.
10. P. Ekdahl, On LFSR based Stream Cipher Analysis and Design, Ph.D. Thesis, Lund Institute of Technology, Lund University.
11. T.A.C.M. Claassen, High Speed: Not the only way to exploit the intrinsic computational power of silicon, Digest IEEE ISSCC.
12. T. Good, W. Chelton and M. Benaissa, Review of stream cipher candidates from a low resource hardware perspective, 1 March 2006.


Field Programmable Gate Arrays (FPGAs) Field Programmable Gate Arrays (FPGAs) Introduction Simulations and prototyping have been a very important part of the electronics industry since a very long time now. Before heading in for the actual

More information

L12: Reconfigurable Logic Architectures

L12: Reconfigurable Logic Architectures L12: Reconfigurable Logic Architectures Acknowledgements: Materials in this lecture are courtesy of the following sources and are used with permission. Frank Honore Prof. Randy Katz (Unified Microelectronics

More information

VLSI Design: 3) Explain the various MOSFET Capacitances & their significance. 4) Draw a CMOS Inverter. Explain its transfer characteristics

VLSI Design: 3) Explain the various MOSFET Capacitances & their significance. 4) Draw a CMOS Inverter. Explain its transfer characteristics 1) Explain why & how a MOSFET works VLSI Design: 2) Draw Vds-Ids curve for a MOSFET. Now, show how this curve changes (a) with increasing Vgs (b) with increasing transistor width (c) considering Channel

More information

Efficient Architecture for Flexible Prescaler Using Multimodulo Prescaler

Efficient Architecture for Flexible Prescaler Using Multimodulo Prescaler Efficient Architecture for Flexible Using Multimodulo G SWETHA, S YUVARAJ Abstract This paper, An Efficient Architecture for Flexible Using Multimodulo is an architecture which is designed from the proposed

More information

Use of Low Power DET Address Pointer Circuit for FIFO Memory Design

Use of Low Power DET Address Pointer Circuit for FIFO Memory Design International Journal of Education and Science Research Review Use of Low Power DET Address Pointer Circuit for FIFO Memory Design Harpreet M.Tech Scholar PPIMT Hisar Supriya Bhutani Assistant Professor

More information

LFSR Counter Implementation in CMOS VLSI

LFSR Counter Implementation in CMOS VLSI LFSR Counter Implementation in CMOS VLSI Doshi N. A., Dhobale S. B., and Kakade S. R. Abstract As chip manufacturing technology is suddenly on the threshold of major evaluation, which shrinks chip in size

More information

VLSI Technology used in Auto-Scan Delay Testing Design For Bench Mark Circuits

VLSI Technology used in Auto-Scan Delay Testing Design For Bench Mark Circuits VLSI Technology used in Auto-Scan Delay Testing Design For Bench Mark Circuits N.Brindha, A.Kaleel Rahuman ABSTRACT: Auto scan, a design for testability (DFT) technique for synchronous sequential circuits.

More information

EECS150 - Digital Design Lecture 18 - Circuit Timing (2) In General...

EECS150 - Digital Design Lecture 18 - Circuit Timing (2) In General... EECS150 - Digital Design Lecture 18 - Circuit Timing (2) March 17, 2010 John Wawrzynek Spring 2010 EECS150 - Lec18-timing(2) Page 1 In General... For correct operation: T τ clk Q + τ CL + τ setup for all

More information

Clock Gating Aware Low Power ALU Design and Implementation on FPGA

Clock Gating Aware Low Power ALU Design and Implementation on FPGA Clock Gating Aware Low ALU Design and Implementation on FPGA Bishwajeet Pandey and Manisha Pattanaik Abstract This paper deals with the design and implementation of a Clock Gating Aware Low Arithmetic

More information

OF AN ADVANCED LUT METHODOLOGY BASED FIR FILTER DESIGN PROCESS

OF AN ADVANCED LUT METHODOLOGY BASED FIR FILTER DESIGN PROCESS IMPLEMENTATION OF AN ADVANCED LUT METHODOLOGY BASED FIR FILTER DESIGN PROCESS 1 G. Sowmya Bala 2 A. Rama Krishna 1 PG student, Dept. of ECM. K.L.University, Vaddeswaram, A.P, India, 2 Assistant Professor,

More information

Design of Fault Coverage Test Pattern Generator Using LFSR

Design of Fault Coverage Test Pattern Generator Using LFSR Design of Fault Coverage Test Pattern Generator Using LFSR B.Saritha M.Tech Student, Department of ECE, Dhruva Institue of Engineering & Technology. Abstract: A new fault coverage test pattern generator

More information

Figure.1 Clock signal II. SYSTEM ANALYSIS

Figure.1 Clock signal II. SYSTEM ANALYSIS International Journal of Advances in Engineering, 2015, 1(4), 518-522 ISSN: 2394-9260 (printed version); ISSN: 2394-9279 (online version); url:http://www.ijae.in RESEARCH ARTICLE Multi bit Flip-Flop Grouping

More information

REDUCING DYNAMIC POWER BY PULSED LATCH AND MULTIPLE PULSE GENERATOR IN CLOCKTREE

REDUCING DYNAMIC POWER BY PULSED LATCH AND MULTIPLE PULSE GENERATOR IN CLOCKTREE Available Online at www.ijcsmc.com International Journal of Computer Science and Mobile Computing A Monthly Journal of Computer Science and Information Technology IJCSMC, Vol. 3, Issue. 5, May 2014, pg.210

More information

Prototyping an ASIC with FPGAs. By Rafey Mahmud, FAE at Synplicity.

Prototyping an ASIC with FPGAs. By Rafey Mahmud, FAE at Synplicity. Prototyping an ASIC with FPGAs By Rafey Mahmud, FAE at Synplicity. With increased capacity of FPGAs and readily available off-the-shelf prototyping boards sporting multiple FPGAs, it has become feasible

More information

A video signal processor for motioncompensated field-rate upconversion in consumer television

A video signal processor for motioncompensated field-rate upconversion in consumer television A video signal processor for motioncompensated field-rate upconversion in consumer television B. De Loore, P. Lippens, P. Eeckhout, H. Huijgen, A. Löning, B. McSweeney, M. Verstraelen, B. Pham, G. de Haan,

More information

Low Power VLSI Circuits and Systems Prof. Ajit Pal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur

Low Power VLSI Circuits and Systems Prof. Ajit Pal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Low Power VLSI Circuits and Systems Prof. Ajit Pal Department of Computer Science and Engineering Indian Institute of Technology, Kharagpur Lecture No. # 29 Minimizing Switched Capacitance-III. (Refer

More information

CHAPTER 6 ASYNCHRONOUS QUASI DELAY INSENSITIVE TEMPLATES (QDI) BASED VITERBI DECODER

CHAPTER 6 ASYNCHRONOUS QUASI DELAY INSENSITIVE TEMPLATES (QDI) BASED VITERBI DECODER 80 CHAPTER 6 ASYNCHRONOUS QUASI DELAY INSENSITIVE TEMPLATES (QDI) BASED VITERBI DECODER 6.1 INTRODUCTION Asynchronous designs are increasingly used to counter the disadvantages of synchronous designs.

More information

A Symmetric Differential Clock Generator for Bit-Serial Hardware

A Symmetric Differential Clock Generator for Bit-Serial Hardware A Symmetric Differential Clock Generator for Bit-Serial Hardware Mitchell J. Myjak and José G. Delgado-Frias School of Electrical Engineering and Computer Science Washington State University Pullman, WA,

More information

Optimization of Multi-Channel BCH Error Decoding for Common Cases. Russell Dill Master's Thesis Defense April 20, 2015

Optimization of Multi-Channel BCH Error Decoding for Common Cases. Russell Dill Master's Thesis Defense April 20, 2015 Optimization of Multi-Channel BCH Error Decoding for Common Cases Russell Dill Master's Thesis Defense April 20, 2015 Bose-Chaudhuri-Hocquenghem (BCH) BCH is an Error Correcting Code (ECC) and is used

More information

SYNCHRONOUS DERIVED CLOCK AND SYNTHESIS OF LOW POWER SEQUENTIAL CIRCUITS *

SYNCHRONOUS DERIVED CLOCK AND SYNTHESIS OF LOW POWER SEQUENTIAL CIRCUITS * SYNCHRONOUS DERIVED CLOCK AND SYNTHESIS OF LOW POWER SEUENTIAL CIRCUITS * Wu Xunwei (Department of Electronic Engineering Hangzhou University Hangzhou 328) ing Wu Massoud Pedram (Department of Electrical

More information

International Journal of Scientific & Engineering Research, Volume 5, Issue 9, September ISSN

International Journal of Scientific & Engineering Research, Volume 5, Issue 9, September ISSN International Journal of Scientific & Engineering Research, Volume 5, Issue 9, September-2014 917 The Power Optimization of Linear Feedback Shift Register Using Fault Coverage Circuits K.YARRAYYA1, K CHITAMBARA

More information

Timing Error Detection: An Adaptive Scheme To Combat Variability EE241 Final Report Nathan Narevsky and Richard Ott {nnarevsky,

Timing Error Detection: An Adaptive Scheme To Combat Variability EE241 Final Report Nathan Narevsky and Richard Ott {nnarevsky, Timing Error Detection: An Adaptive Scheme To Combat Variability EE241 Final Report Nathan Narevsky and Richard Ott {nnarevsky, tomott}@berkeley.edu Abstract With the reduction of feature sizes, more sources

More information

COPY RIGHT. To Secure Your Paper As Per UGC Guidelines We Are Providing A Electronic Bar Code

COPY RIGHT. To Secure Your Paper As Per UGC Guidelines We Are Providing A Electronic Bar Code COPY RIGHT 2018IJIEMR.Personal use of this material is permitted. Permission from IJIEMR must be obtained for all other uses, in any current or future media, including reprinting/republishing this material

More information

International Journal of Emerging Technologies in Computational and Applied Sciences (IJETCAS)

International Journal of Emerging Technologies in Computational and Applied Sciences (IJETCAS) International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) International Journal of Emerging Technologies in Computational

More information

EL302 DIGITAL INTEGRATED CIRCUITS LAB #3 CMOS EDGE TRIGGERED D FLIP-FLOP. Due İLKER KALYONCU, 10043

EL302 DIGITAL INTEGRATED CIRCUITS LAB #3 CMOS EDGE TRIGGERED D FLIP-FLOP. Due İLKER KALYONCU, 10043 EL302 DIGITAL INTEGRATED CIRCUITS LAB #3 CMOS EDGE TRIGGERED D FLIP-FLOP Due 16.05. İLKER KALYONCU, 10043 1. INTRODUCTION: In this project we are going to design a CMOS positive edge triggered master-slave

More information

12-bit Wallace Tree Multiplier CMPEN 411 Final Report Matthew Poremba 5/1/2009

12-bit Wallace Tree Multiplier CMPEN 411 Final Report Matthew Poremba 5/1/2009 12-bit Wallace Tree Multiplier CMPEN 411 Final Report Matthew Poremba 5/1/2009 Project Overview This project was originally titled Fast Fourier Transform Unit, but due to space and time constraints, the

More information

AN EFFICIENT LOW POWER DESIGN FOR ASYNCHRONOUS DATA SAMPLING IN DOUBLE EDGE TRIGGERED FLIP-FLOPS

AN EFFICIENT LOW POWER DESIGN FOR ASYNCHRONOUS DATA SAMPLING IN DOUBLE EDGE TRIGGERED FLIP-FLOPS AN EFFICIENT LOW POWER DESIGN FOR ASYNCHRONOUS DATA SAMPLING IN DOUBLE EDGE TRIGGERED FLIP-FLOPS NINU ABRAHAM 1, VINOJ P.G 2 1 P.G Student [VLSI & ES], SCMS School of Engineering & Technology, Cochin,

More information

Long and Fast Up/Down Counters Pushpinder Kaur CHOUHAN 6 th Jan, 2003

Long and Fast Up/Down Counters Pushpinder Kaur CHOUHAN 6 th Jan, 2003 1 Introduction Long and Fast Up/Down Counters Pushpinder Kaur CHOUHAN 6 th Jan, 2003 Circuits for counting both forward and backward events are frequently used in computers and other digital systems. Digital

More information

A FOUR GAIN READOUT INTEGRATED CIRCUIT : FRIC 96_1

A FOUR GAIN READOUT INTEGRATED CIRCUIT : FRIC 96_1 A FOUR GAIN READOUT INTEGRATED CIRCUIT : FRIC 96_1 J. M. Bussat 1, G. Bohner 1, O. Rossetto 2, D. Dzahini 2, J. Lecoq 1, J. Pouxe 2, J. Colas 1, (1) L. A. P. P. Annecy-le-vieux, France (2) I. S. N. Grenoble,

More information

Design of a Low Power Four-Bit Binary Counter Using Enhancement Type Mosfet

Design of a Low Power Four-Bit Binary Counter Using Enhancement Type Mosfet Design of a Low Power Four-Bit Binary Counter Using Enhancement Type Mosfet Praween Sinha Department of Electronics & Communication Engineering Maharaja Agrasen Institute Of Technology, Rohini sector -22,

More information

Computer Architecture and Organization

Computer Architecture and Organization A-1 Appendix A - Digital Logic Computer Architecture and Organization Miles Murdocca and Vincent Heuring Appendix A Digital Logic A-2 Appendix A - Digital Logic Chapter Contents A.1 Introduction A.2 Combinational

More information

Combinational vs Sequential

Combinational vs Sequential Combinational vs Sequential inputs X Combinational Circuits outputs Z A combinational circuit: At any time, outputs depends only on inputs Changing inputs changes outputs No regard for previous inputs

More information

Scan. This is a sample of the first 15 pages of the Scan chapter.

Scan. This is a sample of the first 15 pages of the Scan chapter. Scan This is a sample of the first 15 pages of the Scan chapter. Note: The book is NOT Pinted in color. Objectives: This section provides: An overview of Scan An introduction to Test Sequences and Test

More information

Implementation of Dynamic RAMs with clock gating circuits using Verilog HDL

Implementation of Dynamic RAMs with clock gating circuits using Verilog HDL Implementation of Dynamic RAMs with clock gating circuits using Verilog HDL B.Sanjay 1 SK.M.Javid 2 K.V.VenkateswaraRao 3 Asst.Professor B.E Student B.E Student SRKR Engg. College SRKR Engg. College SRKR

More information

Music Electronics Finally DeMorgan's Theorem establishes two very important simplifications 3 : Multiplexers

Music Electronics Finally DeMorgan's Theorem establishes two very important simplifications 3 : Multiplexers Music Electronics Finally DeMorgan's Theorem establishes two very important simplifications 3 : ( A B )' = A' + B' ( A + B )' = A' B' Multiplexers A digital multiplexer is a switching element, like a mechanical

More information

An FPGA Implementation of Shift Register Using Pulsed Latches

An FPGA Implementation of Shift Register Using Pulsed Latches An FPGA Implementation of Shift Register Using Pulsed Latches Shiny Panimalar.S, T.Nisha Priscilla, Associate Professor, Department of ECE, MAMCET, Tiruchirappalli, India PG Scholar, Department of ECE,

More information

Digital Systems Design

Digital Systems Design ECOM 4311 Digital Systems Design Eng. Monther Abusultan Computer Engineering Dept. Islamic University of Gaza Page 1 ECOM4311 Digital Systems Design Module #2 Agenda 1. History of Digital Design Approach

More information

Modeling Latches and Flip-flops

Modeling Latches and Flip-flops Lab Workbook Introduction Sequential circuits are digital circuits in which the output depends not only on the present input (like combinatorial circuits), but also on the past sequence of inputs. In effect,

More information

Efficient 500 MHz Digital Phase Locked Loop Implementation sin 180nm CMOS Technology

Efficient 500 MHz Digital Phase Locked Loop Implementation sin 180nm CMOS Technology Efficient 500 MHz Digital Phase Locked Loop Implementation sin 180nm CMOS Technology Akash Singh Rawat 1, Kirti Gupta 2 Electronics and Communication Department, Bharati Vidyapeeth s College of Engineering,

More information

Sharif University of Technology. SoC: Introduction

Sharif University of Technology. SoC: Introduction SoC Design Lecture 1: Introduction Shaahin Hessabi Department of Computer Engineering System-on-Chip System: a set of related parts that act as a whole to achieve a given goal. A system is a set of interacting

More information

Reconfigurable Architectures. Greg Stitt ECE Department University of Florida

Reconfigurable Architectures. Greg Stitt ECE Department University of Florida Reconfigurable Architectures Greg Stitt ECE Department University of Florida How can hardware be reconfigurable? Problem: Can t change fabricated chip ASICs are fixed Solution: Create components that can

More information

ELEN Electronique numérique

ELEN Electronique numérique ELEN0040 - Electronique numérique Patricia ROUSSEAUX Année académique 2014-2015 CHAPITRE 5 Sequential circuits design - Timing issues ELEN0040 5-228 1 Sequential circuits design 1.1 General procedure 1.2

More information

Dual Slope ADC Design from Power, Speed and Area Perspectives

Dual Slope ADC Design from Power, Speed and Area Perspectives Dual Slope ADC Design from Power, Speed and Area Perspectives Isaac Macwan, Xingguo Xiong, Lawrence Hmurcik Department of Electrical & Computer Engineering, University of Bridgeport, Bridgeport, CT 06604

More information

High Performance Dynamic Hybrid Flip-Flop For Pipeline Stages with Methodical Implanted Logic

High Performance Dynamic Hybrid Flip-Flop For Pipeline Stages with Methodical Implanted Logic High Performance Dynamic Hybrid Flip-Flop For Pipeline Stages with Methodical Implanted Logic K.Vajida Tabasum, K.Chandra Shekhar Abstract-In this paper we introduce a new high performance dynamic hybrid

More information

Performance Evolution of 16 Bit Processor in FPGA using State Encoding Techniques

Performance Evolution of 16 Bit Processor in FPGA using State Encoding Techniques Performance Evolution of 16 Bit Processor in FPGA using State Encoding Techniques Madhavi Anupoju 1, M. Sunil Prakash 2 1 M.Tech (VLSI) Student, Department of Electronics & Communication Engineering, MVGR

More information

P.Akila 1. P a g e 60

P.Akila 1. P a g e 60 Designing Clock System Using Power Optimization Techniques in Flipflop P.Akila 1 Assistant Professor-I 2 Department of Electronics and Communication Engineering PSR Rengasamy college of engineering for

More information

PICOSECOND TIMING USING FAST ANALOG SAMPLING

PICOSECOND TIMING USING FAST ANALOG SAMPLING PICOSECOND TIMING USING FAST ANALOG SAMPLING H. Frisch, J-F Genat, F. Tang, EFI Chicago, Tuesday 6 th Nov 2007 INTRODUCTION In the context of picosecond timing, analog detector pulse sampling in the 10

More information

Reconfigurable FPGA Implementation of FIR Filter using Modified DA Method

Reconfigurable FPGA Implementation of FIR Filter using Modified DA Method Reconfigurable FPGA Implementation of FIR Filter using Modified DA Method M. Backia Lakshmi 1, D. Sellathambi 2 1 PG Student, Department of Electronics and Communication Engineering, Parisutham Institute

More information

Low Power D Flip Flop Using Static Pass Transistor Logic

Low Power D Flip Flop Using Static Pass Transistor Logic Low Power D Flip Flop Using Static Pass Transistor Logic 1 T.SURIYA PRABA, 2 R.MURUGASAMI PG SCHOLAR, NANDHA ENGINEERING COLLEGE, ERODE, INDIA Abstract: Minimizing power consumption is vitally important

More information

WINTER 15 EXAMINATION Model Answer

WINTER 15 EXAMINATION Model Answer Important Instructions to examiners: 1) The answers should be examined by key words and not as word-to-word as given in the model answer scheme. 2) The model answer and the answer written by candidate

More information

A Novel Low Power pattern Generation Technique for Concurrent Bist Architecture

A Novel Low Power pattern Generation Technique for Concurrent Bist Architecture A Novel Low Power pattern Generation Technique for Concurrent Bist Architecture Y. Balasubrahamanyam, G. Leenendra Chowdary, T.J.V.S.Subrahmanyam Research Scholar, Dept. of ECE, Sasi institute of Technology

More information

EE178 Spring 2018 Lecture Module 5. Eric Crabill

EE178 Spring 2018 Lecture Module 5. Eric Crabill EE178 Spring 2018 Lecture Module 5 Eric Crabill Goals Considerations for synchronizing signals Clocks Resets Considerations for asynchronous inputs Methods for crossing clock domains Clocks The academic

More information

Tutorial 11 ChipscopePro, ISE 10.1 and Xilinx Simulator on the Digilent Spartan-3E board

Tutorial 11 ChipscopePro, ISE 10.1 and Xilinx Simulator on the Digilent Spartan-3E board Tutorial 11 ChipscopePro, ISE 10.1 and Xilinx Simulator on the Digilent Spartan-3E board Introduction This lab will be an introduction on how to use ChipScope for the verification of the designs done on

More information

A Fast Constant Coefficient Multiplier for the XC6200

A Fast Constant Coefficient Multiplier for the XC6200 A Fast Constant Coefficient Multiplier for the XC6200 Tom Kean, Bernie New and Bob Slous Xilinx Inc. Abstract. We discuss the design of a high performance constant coefficient multiplier on the Xilinx

More information

[Krishna*, 4.(12): December, 2015] ISSN: (I2OR), Publication Impact Factor: 3.785

[Krishna*, 4.(12): December, 2015] ISSN: (I2OR), Publication Impact Factor: 3.785 IJESRT INTERNATIONAL JOURNAL OF ENGINEERING SCIENCES & RESEARCH TECHNOLOGY DESIGN AND IMPLEMENTATION OF BIST TECHNIQUE IN UART SERIAL COMMUNICATION M.Hari Krishna*, P.Pavan Kumar * Electronics and Communication

More information

Chapter 4. Logic Design

Chapter 4. Logic Design Chapter 4 Logic Design 4.1 Introduction. In previous Chapter we studied gates and combinational circuits, which made by gates (AND, OR, NOT etc.). That can be represented by circuit diagram, truth table

More information

A NOVEL DESIGN OF COUNTER USING TSPC D FLIP-FLOP FOR HIGH PERFORMANCE AND LOW POWER VLSI DESIGN APPLICATIONS USING 45NM CMOS TECHNOLOGY

A NOVEL DESIGN OF COUNTER USING TSPC D FLIP-FLOP FOR HIGH PERFORMANCE AND LOW POWER VLSI DESIGN APPLICATIONS USING 45NM CMOS TECHNOLOGY A NOVEL DESIGN OF COUNTER USING TSPC D FLIP-FLOP FOR HIGH PERFORMANCE AND LOW POWER VLSI DESIGN APPLICATIONS USING 45NM CMOS TECHNOLOGY Ms. Chaitali V. Matey 1, Ms. Shraddha K. Mendhe 2, Mr. Sandip A.

More information

FPGA Based Implementation of Convolutional Encoder- Viterbi Decoder Using Multiple Booting Technique

FPGA Based Implementation of Convolutional Encoder- Viterbi Decoder Using Multiple Booting Technique FPGA Based Implementation of Convolutional Encoder- Viterbi Decoder Using Multiple Booting Technique Dr. Dhafir A. Alneema (1) Yahya Taher Qassim (2) Lecturer Assistant Lecturer Computer Engineering Dept.

More information

2.6 Reset Design Strategy

2.6 Reset Design Strategy 2.6 Reset esign Strategy Many design issues must be considered before choosing a reset strategy for an ASIC design, such as whether to use synchronous or asynchronous resets, will every flipflop receive

More information

A Low Power Implementation of H.264 Adaptive Deblocking Filter Algorithm

A Low Power Implementation of H.264 Adaptive Deblocking Filter Algorithm A Low Power Implementation of H.264 Adaptive Deblocking Filter Algorithm Mustafa Parlak and Ilker Hamzaoglu Faculty of Engineering and Natural Sciences Sabanci University, Tuzla, 34956, Istanbul, Turkey

More information

AN OPTIMIZED IMPLEMENTATION OF MULTI- BIT FLIP-FLOP USING VERILOG

AN OPTIMIZED IMPLEMENTATION OF MULTI- BIT FLIP-FLOP USING VERILOG AN OPTIMIZED IMPLEMENTATION OF MULTI- BIT FLIP-FLOP USING VERILOG 1 V.GOUTHAM KUMAR, Pg Scholar In Vlsi, 2 A.M.GUNA SEKHAR, M.Tech, Associate. Professor, ECE Department, 1 gouthamkumar.vakkala@gmail.com,

More information

High Performance Carry Chains for FPGAs

High Performance Carry Chains for FPGAs High Performance Carry Chains for FPGAs Matthew M. Hosler Department of Electrical and Computer Engineering Northwestern University Abstract Carry chains are an important consideration for most computations,

More information

Design and FPGA Implementation of 100Gbit/s Scrambler Architectures for OTN Protocol Chethan Kumar M 1, Praveen Kumar Y G 2, Dr. M. Z. Kurian 3.

Design and FPGA Implementation of 100Gbit/s Scrambler Architectures for OTN Protocol Chethan Kumar M 1, Praveen Kumar Y G 2, Dr. M. Z. Kurian 3. International Journal of Computer Engineering and Applications, Volume VI, Issue II, May 14 www.ijcea.com ISSN 2321 3469 Design and FPGA Implementation of 100Gbit/s Scrambler Architectures for OTN Protocol

More information

Improve Performance of Low-Power Clock Branch Sharing Double-Edge Triggered Flip-Flop

Improve Performance of Low-Power Clock Branch Sharing Double-Edge Triggered Flip-Flop Sumant Kumar et al. 2016, Volume 4 Issue 1 ISSN (Online): 2348-4098 ISSN (Print): 2395-4752 International Journal of Science, Engineering and Technology An Open Access Journal Improve Performance of Low-Power

More information

Polar Decoder PD-MS 1.1

Polar Decoder PD-MS 1.1 Product Brief Polar Decoder PD-MS 1.1 Main Features Implements multi-stage polar successive cancellation decoder Supports multi-stage successive cancellation decoding for 16, 64, 256, 1024, 4096 and 16384

More information

An Efficient High Speed Wallace Tree Multiplier

An Efficient High Speed Wallace Tree Multiplier Chepuri satish,panem charan Arur,G.Kishore Kumar and G.Mamatha 38 An Efficient High Speed Wallace Tree Multiplier Chepuri satish, Panem charan Arur, G.Kishore Kumar and G.Mamatha Abstract: The Wallace

More information

DIGITAL SYSTEM FUNDAMENTALS (ECE421) DIGITAL ELECTRONICS FUNDAMENTAL (ECE422) COUNTERS

DIGITAL SYSTEM FUNDAMENTALS (ECE421) DIGITAL ELECTRONICS FUNDAMENTAL (ECE422) COUNTERS COURSE / CODE DIGITAL SYSTEM FUNDAMENTALS (ECE421) DIGITAL ELECTRONICS FUNDAMENTAL (ECE422) COUNTERS One common requirement in digital circuits is counting, both forward and backward. Digital clocks and

More information

DIFFERENTIAL CONDITIONAL CAPTURING FLIP-FLOP TECHNIQUE USED FOR LOW POWER CONSUMPTION IN CLOCKING SCHEME

DIFFERENTIAL CONDITIONAL CAPTURING FLIP-FLOP TECHNIQUE USED FOR LOW POWER CONSUMPTION IN CLOCKING SCHEME DIFFERENTIAL CONDITIONAL CAPTURING FLIP-FLOP TECHNIQUE USED FOR LOW POWER CONSUMPTION IN CLOCKING SCHEME Mr.N.Vetriselvan, Assistant Professor, Dhirajlal Gandhi College of Technology Mr.P.N.Palanisamy,

More information

Combining Dual-Supply, Dual-Threshold and Transistor Sizing for Power Reduction

Combining Dual-Supply, Dual-Threshold and Transistor Sizing for Power Reduction Combining Dual-Supply, Dual-Threshold and Transistor Sizing for Reduction Stephanie Augsburger 1, Borivoje Nikolić 2 1 Intel Corporation, Enterprise Processors Division, Santa Clara, CA, USA. 2 Department

More information

Abstract 1. INTRODUCTION. Cheekati Sirisha, IJECS Volume 05 Issue 10 Oct., 2016 Page No Page 18532

Abstract 1. INTRODUCTION. Cheekati Sirisha, IJECS Volume 05 Issue 10 Oct., 2016 Page No Page 18532 www.ijecs.in International Journal Of Engineering And Computer Science ISSN: 2319-7242 Volume 5 Issue 10 Oct. 2016, Page No. 18532-18540 Pulsed Latches Methodology to Attain Reduced Power and Area Based

More information

System IC Design: Timing Issues and DFT. Hung-Chih Chiang

System IC Design: Timing Issues and DFT. Hung-Chih Chiang System IC esign: Timing Issues and FT Hung-Chih Chiang Outline SoC Timing Issues Timing terminologies Synchronous vs. asynchronous design Interfaces and timing closure Clocking issues Reset esign for Testability

More information

The basic logic gates are the inverter (or NOT gate), the AND gate, the OR gate and the exclusive-or gate (XOR). If you put an inverter in front of

The basic logic gates are the inverter (or NOT gate), the AND gate, the OR gate and the exclusive-or gate (XOR). If you put an inverter in front of 1 The basic logic gates are the inverter (or NOT gate), the AND gate, the OR gate and the exclusive-or gate (XOR). If you put an inverter in front of the AND gate, you get the NAND gate etc. 2 One of the

More information

Design of Low Power D-Flip Flop Using True Single Phase Clock (TSPC)

Design of Low Power D-Flip Flop Using True Single Phase Clock (TSPC) Design of Low Power D-Flip Flop Using True Single Phase Clock (TSPC) Swetha Kanchimani M.Tech (VLSI Design), Mrs.Syamala Kanchimani Associate Professor, Miss.Godugu Uma Madhuri Assistant Professor, ABSTRACT:

More information

Designing Integrated Accelerator for Stream Ciphers with Structural Similarities

Designing Integrated Accelerator for Stream Ciphers with Structural Similarities Designing Integrated Accelerator for Stream Ciphers with Structural Similarities Sourav Sen Gupta 1, Anupam Chattopadhyay 2,andAyeshaKhalid 2 1 Centre of Excellence in Cryptology, Indian Statistical Institute,

More information

EE178 Lecture Module 4. Eric Crabill SJSU / Xilinx Fall 2005

EE178 Lecture Module 4. Eric Crabill SJSU / Xilinx Fall 2005 EE178 Lecture Module 4 Eric Crabill SJSU / Xilinx Fall 2005 Lecture #9 Agenda Considerations for synchronizing signals. Clocks. Resets. Considerations for asynchronous inputs. Methods for crossing clock

More information

NH 67, Karur Trichy Highways, Puliyur C.F, Karur District UNIT-III SEQUENTIAL CIRCUITS

NH 67, Karur Trichy Highways, Puliyur C.F, Karur District UNIT-III SEQUENTIAL CIRCUITS NH 67, Karur Trichy Highways, Puliyur C.F, 639 114 Karur District DEPARTMENT OF ELETRONICS AND COMMUNICATION ENGINEERING COURSE NOTES SUBJECT: DIGITAL ELECTRONICS CLASS: II YEAR ECE SUBJECT CODE: EC2203

More information
