Data-Pattern Enabled Self-Recovery Low-Power Storage System for Big Video Data


IEEE TRANSACTIONS ON BIG DATA, UNDER REVIEW

Jonathon Edstrom, Dongliang Chen, Yifu Gong, Jinhui Wang, Member, IEEE, and Na Gong, Member, IEEE

Abstract: The growing popularity of powerful mobile devices such as smartphones and tablets has resulted in exponential growth in the demand for video applications. However, due to the large video data size and intensive computation, mobile video applications require frequent embedded memory access, which consumes a large amount of power and limits battery life. In this paper, we present a low-cost self-recovery video storage system that exploits meaningful data patterns hidden in big video data by introducing data mining techniques into the hardware design process. We propose a two-dimensional data-pattern approach to explore horizontal data-association and vertical data-correlation characteristics. Such data relationship discovery and pattern identification enable a new dimension for the hardware design space and bring self-recovery ability to memories in the presence of bitcell failures. Based on the identified optimal data patterns, we present a low-cost and efficient SRAM design that enables data self-recovery at low voltages. A 45nm 32kb SRAM is implemented that delivers good video quality at near-threshold voltage (0.5 V) with negligible area overhead (7.94%).

Index Terms: videos; data mining; data pattern; low-power; self-recovery; on-chip memory

1 INTRODUCTION

Information has driven the remarkable evolution of human society. According to market research, by 2020 the amount of data that is created, replicated, and consumed will be as large as 40 ZB (zettabytes; 1 ZB = $10^{21}$ B) [1, 2], and more than half of the data traffic will be video data [3]. Traditional, plain TV sets are losing ground to hybrid TVs, PCs, game consoles, and more recently, mobile devices such as tablets and smartphones.
In this new, mobile, big video age, one of the biggest remaining contributors to user dissatisfaction is short battery life [3]. In particular, due to their intensive computation and large data size, video applications demand continuously increasing storage space. As a result, embedded static random-access memory (SRAM) occupies over 65% of the mobile video decoder chip area and is also a major contributor to mobile battery consumption (>92% of the motion compensation energy [32]). This situation is only expected to grow for the next-generation mobile video format, H.265/HEVC, which has 2x-3x higher memory demand than H.264 [32].

Voltage scaling techniques have been widely applied to reduce the power consumption of memory systems. Researchers have shown that SRAM achieves maximum efficiency at near-threshold voltage [14]. However, as voltage scales, SRAMs become susceptible to failure due to significant process variation. Various techniques have been developed to correct or eliminate these memory failures as voltage is scaled. Traditional low-power memory techniques can be divided into three general categories: (i) assist schemes, such as adjustment of cell voltage [5] and boosted wordline voltage [6]; (ii) large bitcells, such as the upsized 6T cell [24], asymmetric 7T cell [7], single-ended read-decoupled 8T cell [8], read-disturb-free 9T cell [9], and bit-interleaving 12T cell [10]; and (iii) error correction techniques, spanning from the use of error correction codes [11] to data remapping [12].

Fig. 1. Big-data enabled intelligent efficient hardware. (Diagram labels: big-data enabled data knowledge; traditional isolated hardware design process; positive feedback loop: big-data enabled better hardware will support big-data applications better.)

The authors are with the Department of Electrical and Computer Engineering, North Dakota State University, ND. E-mail: {jonathon.edstrom, dongliang.chen, yifu.gong, jinhui.wang.1, na.gong}@ndsu.edu.
Unfortunately, almost all existing solutions add considerable implementation overhead to the original memory design. For example, the area overhead penalty can be as high as [value missing]%. Such large overhead leads to increased layout area, higher design complexity, and reduced performance of the entire system.

Recently, a new branch of low-voltage embedded memory techniques has been developed to embrace memory faults, instead of avoiding them (assist techniques or cells with more than six transistors) or correcting them (e.g. ECC). These techniques aim to mitigate the impact of memory faults by minimizing the magnitude of the error caused by a faulty cell, based on memory fault positions determined by run-time testing (e.g. built-in self-test (BIST)). We refer to these techniques as fault-position aware mitigation techniques. For example, Ganapathy et al. [13] developed a shifting technique that always stores the least-significant bits (LSBs) in the faulty cells, which may lead to a tolerable output quality. Ferreron et al. [14] presented a squeezing technique that compresses zeros and stores them in less memory space, thereby avoiding the memory failures present at low voltage. However, even with predetermined memory fault positions, the existing techniques still involve complex operations (e.g. shifting value calculation and storage), and the overhead incurred is still significant (e.g. 65% in [13]).

To address the storage challenge of videos as well as other big-data applications, we propose to leverage the assets of big data to extract useful knowledge and actionable information for hardware design. Recently, it has been observed that today's big-data applications, including videos, have three common data characteristics [33]: (i) redundant inputs, (ii) multiple acceptable outputs, and (iii) statistical computations. These intrinsic characteristics provide substantial opportunities for data relationship discovery and pattern identification, which enable a new dimension for the hardware design space and bring opportunities for multi-dimensional innovations in circuits and systems, as illustrated in Fig. 1.

Specifically, in this paper, we present a novel Data Pattern enabled Self-Recovery video SRAM (DPSR) that achieves efficient near-threshold voltage computing while delivering good video output quality. By introducing advanced data mining techniques, we investigate meaningful data patterns hidden in video data and use them to enable self-recovery in SRAM. We propose a two-dimensional (2D) data pattern approach that explores horizontal data-association and vertical data-correlation characteristics to determine the optimal data patterns for self-recovery. Based on this, we develop an efficient SRAM design technique to implement DPSR with negligible area overhead (7.94%) and performance penalty. Earlier, in [30], we presented a basic DPSR design storing only chroma data associations, including some preliminary results.

Manuscript received 14 Oct
We extend our original work with the following additional contributions:

• We investigate data associations between bits of luma data in various videos to enable additional power savings by implementing the design across both luma and chroma data in the video memory (Sections 3.1, 3.2 and 3.3).

• We propose a new hardware design that realizes near-threshold voltage storage for the luma data based on the discovered optimal data patterns. We analyze different hardware bit prediction schemes and implement the optimal wordline architecture for the highest bit prediction percentage. Since there is twice as much luma data as chroma data in typical mobile videos, our additions allow for triple the power savings as compared to our previous design [30] (Sections 4 and 5.2).

• To analyze the quality of video output, we add the structural similarity (SSIM) metric proposed in [23], which is aware of the user's perception because it includes calculations for luminance, contrast, and structural changes in the video (Section 4.3). Also, to verify the power efficiency of the proposed technique, we develop memory power consumption models for both active and leakage power consumption and perform a detailed analysis (Section 5.3).

• Most importantly, in order to emulate the use of mobile devices in the environment of big data, we expand the video data to larger-scale, real videos using the recently released YouTube-8M dataset [31]. Specifically, 10,000 unique YouTube-8M videos, with 57.6 GB total data size, representing 500,000 individual frames, have been analyzed using data mining techniques to identify the general data patterns existing in various videos (Sections 3.1 and 3.2). Additionally, 25 videos from YouTube-8M, separate from the 10,000 videos used in the training dataset, are used to verify the correct prediction percentage and video output qualities (Sections 3.3 and 5.4).

It should be emphasized that the biggest challenge in achieving data-enabled hardware is that hardware designers cannot easily observe the inherent data behaviors directly from a large volume of video data. To realize the proposed data-pattern enabled efficient video storage, hardware designers require a deep understanding and systematic study of the inherent data relationships in massive video data. Traditional data techniques cannot solve this problem, due to the increased complexity and the growing amount of video data. Also, the larger the data size, the more general the data patterns that can be identified and the more power saving opportunities that can be enabled. Accordingly, big data may be the only practical way to realize the full potential of the proposed intelligent hardware. Note that the data pattern identification process is conducted at design time (off-line), thereby avoiding run-time performance overhead caused by big data algorithms.

The rest of the paper is organized as follows. Section 2 presents SRAM failure at near-threshold voltage. In Section 3, the data-mining enabled mobile video data patterns are analyzed for self-recovery. In Section 4, we present DPSR. The evaluation results are provided in Section 5. Finally, the conclusion is drawn in Section 6.

2 EMBEDDED MEMORY FAILURE ANALYSIS AT NEAR-THRESHOLD VOLTAGE

It has been shown that computing efficiency is maximized when a circuit operates at near-threshold voltage [14]. However, at 0.5 V (our target near-threshold voltage), SRAM failures become more severe with increasing process variation. In particular, the random dopant fluctuation (RDF) effect leads to threshold voltage (Vth) variation and SRAM cell failures [15]. For current manufacturing technologies, the failure probability of an SRAM cell (Pfail) typically ranges between $10^{-3}$ and $10^{-2}$, depending on the bitcell area [14, 16].

The minimum-sized SRAM has the highest failure rate, $10^{-2}$, and larger bitcells have a lower failure probability. With 58% area overhead, the failure rate can be reduced from $10^{-2}$ to $10^{-3}$ [16]. In our analysis, we consider both the $10^{-2}$ (minimum-sized SRAM) and $10^{-3}$ (upsized SRAM) conditions. It should be noted that the failure rate

can be further optimized using a recently developed priority-based sizing technique [26].

To further study the SRAM failure characteristics at low voltage, we investigated error maps for a 512-word × 64-bit SRAM with Pfail equal to $10^{-2}$ and $10^{-3}$. During the fault injection process, we assumed that the failed bits are located across the memory cells according to a uniform distribution with the given failure probabilities, which introduces embedded memory failures into the decoding process. The use of a uniform distribution for the errors is confirmed by the memory failure measurements in [29]. The results are shown in Fig. 2: SRAM faults are uniformly distributed in the array.

Fig. 2. Error maps in the SRAM array at 0.5 V. (a) Failure rate is $10^{-3}$ (0.001) and (b) failure rate is $10^{-2}$ (0.01). Each dot on the maps marks a bitcell failure location by row number (y axis) and column number (x axis) in the SRAM array.

We also analyzed the probability of different numbers of faults in the same wordline (32-bit word); the results are listed in Table 1. It can be seen that a wordline has a low number of faulty cells. The probability of two faults existing in the same wordline is only 3.6% when the SRAM bitcell failure rate is $10^{-2}$. Accordingly, in the presence of a memory fault, the SRAM may achieve self-recovery based on other bits in the same wordline if meaningful bit-level data patterns exist.

TABLE 1
FAULT PROBABILITY IN A 32-BIT SRAM WORD
Columns: number of faults per wordline; probability at SRAM failure rate $10^{-3}$ (0.001); probability at SRAM failure rate $10^{-2}$ (0.01). (Most numeric entries are missing; for 5, 6, and 7 faults per wordline the $10^{-3}$ column reads 0%.)
*Calculations based on Monte Carlo simulation ($10^{9}$ trials)

3 DATA PATTERN INVESTIGATION FOR SELF-RECOVERY

This section presents our data-mining methodology to discover data patterns hidden in video data to enable reliable self-recovery.
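The per-wordline fault probabilities discussed in Section 2 (Table 1) can be cross-checked analytically. The sketch below is not the authors' Monte Carlo procedure; it assumes that bitcell failures are independent with probability Pfail, so the number of faults in a 32-bit word follows a binomial distribution. The function name `prob_k_faults` is introduced here for illustration only.

```python
from math import comb


def prob_k_faults(k: int, p_fail: float, word_bits: int = 32) -> float:
    """Probability of exactly k faulty bitcells in one word.

    Assumption (not stated in the paper): bitcell failures are
    independent, so the fault count is Binomial(word_bits, p_fail).
    """
    return comb(word_bits, k) * p_fail**k * (1.0 - p_fail) ** (word_bits - k)


# Tabulate 0..3 faults per 32-bit wordline for both failure rates
# considered in the paper (upsized and minimum-sized SRAM).
for p_fail in (1e-3, 1e-2):
    row = ", ".join(f"k={k}: {100 * prob_k_faults(k, p_fail):.4f}%" for k in range(4))
    print(f"Pfail={p_fail:g} -> {row}")

p_two_faults = prob_k_faults(2, 1e-2)
```

Under this independence assumption, the probability of exactly two faults at Pfail = $10^{-2}$ comes out at roughly 3.7%, which is in line with the 3.6% figure quoted above.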
Specifically, we propose a new two-dimensional (2D) data pattern approach to explore horizontal data-association and vertical data-correlation characteristics, thereby achieving optimal data patterns.

3.1 Association Rule Mining Enabled Horizontal Association

Today's mobile video frames are typically stored and processed in YUV format. The YUV format includes one luma (Y) component, which contains the brightness information of the image, and two chroma components, which contain the blue-difference (Cb) and red-difference (Cr) color components. Fig. 3 shows a typical frame of video data stored in embedded memory, using a YUV 4:2:0 video as an example. As shown, each pixel has 8-bit luma data and 8-bit subsampled chroma data.

Fig. 3. 2D data-pattern enabled self-correction and data pattern analysis dataset. (Diagram labels: 16x16-pixel 4:2:0 YUV video frame; luma (Y) data, 8 bits/pixel, Y1 (MSB) to Y8 (LSB); chroma (Cb) data, 8 bits/4 pixels, Cb1 (MSB) to Cb8 (LSB); chroma (Cr) data, 8 bits/4 pixels, Cr1 (MSB) to Cr8 (LSB); vertical data pattern; horizontal data pattern enabled by rule mining, with a dataset/database of transactions whose items 1 to X take values in {0,1}.)

Data-pattern analysis dataset, number of bits (number of frames): Akiyo 364,953,600 (300); [name missing] 364,953,600 (300); Container 364,953,600 (300); Flower 304,128,000 (250); Foreman 304,128,000 (250); Coastguard [value missing]; Hall 304,128,000 (250); Mobile 304,128,000 (250); Mother-Daughter 304,128,000 (250); News 304,128,000 (250); Silent 304,128,000 (250); Tempete 316,293,120 (260); Waterfall 316,293,120 (260). Total: 4,221,296,640 bits (3470 frames).

Since video data is stored in on-chip memory as binary bits, we utilize

an association data mining technique to identify horizontal bit-level data patterns. Association rule mining was introduced in 1993 to discover relationships between different variables, called items, in a dataset or database [17]. A complete dataset is made up of many transactions, where each transaction contains a set of items. Each item can be associated with a binary attribute, 0 or 1, that indicates whether the item is present in its corresponding transaction. This type of data organization is illustrated in Fig. 3. Each rule generated by the association rule mining process is an implication of the form X → Y, where X and Y are disjoint sets of items or individual items. Each rule is also accompanied by statistics collected from the dataset, called support and confidence values. The support value for a set of items is the proportion of transactions in the dataset that contain that set of items. The confidence value for an association rule X → Y is the proportion of transactions containing X that also contain Y, that is, the conditional probability P(Y | X).

To enable association data mining, we first use 12 different video benchmarks to build a dataset [27, 28]. In total, the video data size is 4,221,296,640 bits from 3470 frames, as shown in Fig. 3. Each video data bit is defined as an individual item, and we used Weka [18] to run the well-known association rule mining algorithm, Apriori, on our large video dataset. Table 2 lists the horizontal data patterns we obtained for chroma data based on the video benchmarks.

We further expand the video data to larger-scale, real video datasets in order to emulate the use of mobile devices in the environment of big data. We use Google's recently released YouTube-8M dataset [31], which is the largest video dataset to date. Specifically, 10,000 unique videos from the YouTube-8M dataset, with 57.6 GB total data size, representing 500,000 individual frames, were analyzed using data-mining methods. A script was written to download these 10,000 videos from the ~7 million available URLs provided in the YouTube-8M dataset. After each video was downloaded, 50 contiguous frames were randomly selected from the video and converted from the MP4 format to the raw YUV format using the FFmpeg decoder [32] for data-mining analysis. To support large-scale video data processing, our experiments were performed on the Thunder cluster at the Center for Computationally Assisted Science and Technology (CCAST) of North Dakota State University, which consists of 53 compute nodes. Each node has dual-socket Intel Xeon 2670v2 Ivy Bridge processors (10 cores per socket) at 2.5 GHz with 64 GB DDR3 RAM, and all nodes are interconnected with FDR InfiniBand at a 56 Gbit/s transfer rate. As illustrated in Fig. 3, each video data bit is defined as an individual item, and the well-known association rule mining algorithm, Apriori, was used on our large video dataset to gather confidence and support metric calculations. The average results obtained for the horizontal data patterns are also listed in Table 2. We can see that the association rules obtained from the video benchmarks are very general and also exist in large-scale videos.

TABLE 2
DISCOVERED ASSOCIATION RULES
Columns: rule; confidence and support from 12 video benchmarks [27, 28]; confidence and support from 10,000 YouTube-8M videos [31].
Rules (antecedent → predicted bit; the consequent values and all numeric entries are missing): Y2=1 → Y1; Y3=1 → Y1; Y2=0 → Y1; Y1=1 → Y2; Y1=1 → Y3; Cb2=0 → Cb1; Cb2=1 → Cb1; Cb1=0 → Cb2; Cb1=1 → Cb2; Cr2=1 → Cr1; Cr2=0 → Cr1; Cr1=1 → Cr2; Cr1=0 → Cr2; Cr1=1 → Cr3; Cr1=0 → Cr3.
*Bit 1 (i.e. Y1, Cb1, Cr1) is the MSB. Bit 8 (i.e. Y8, Cb8, Cr8) is the LSB.

3.2 Vertical Correlation

The vertical data correlation characteristics of multimedia applications have been studied by many researchers [19, 20]. These works show that the most-significant bits (MSBs) of video data are strongly correlated with those of neighboring pixels and that their switching probability is very low. As listed in Table 3, for the video benchmarks, the correlation probability of the MSBs (Y1, Cb1, Cr1) in neighboring pixels is over 93%, while it is reduced to 53% for the LSB of Cb (Cb8). Similar correlation characteristics can be observed in the YouTube-8M videos: the MSBs in neighboring pixels have very strong correlation (with probability over 90%), but the LSBs are more random and have little correlation with neighboring pixels.

TABLE 3
VERTICAL CORRELATION PROBABILITY
Columns: bit position (Y1 to Y8, Cb1 to Cb8, Cr1 to Cr8); correlation probability from 12 video benchmarks; correlation probability from 10,000 YouTube-8M videos. (Numeric entries are missing.)

Previous works have exploited this correlation for power savings: through bit prediction, where the absence of transistor switching saves power [4], and by attempting to load the same value (reading continuous 0s or 1s) from a memory bitcell, which eliminates the precharge cost when the value read matches the previous bitline read [19]. This work instead uses the correlation property of YUV data in a novel bit correction technique that attempts to correct memory faults with high precision.

TABLE 4
OPTIMAL LUMA DATA PATTERNS FROM 25 YOUTUBE-8M VIDEOS, SEPARATE FROM THE 10,000 VIDEOS USED IN THE TRAINING DATASET
Y bit: optimal data pattern (the Correct Prediction (%) column is missing)
Y1: (Y1 previous)
Y2: (Y2 previous)
Y3: (Y3 previous)
Y4: (Y4 previous)
Y5: (Y5 previous)
Y6: (Y6 previous)
Y7: (Y7 previous)
Y8: (Y8 previous)
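The support and confidence definitions of Section 3.1 and the neighboring-pixel correlation of Section 3.2 can be illustrated on a handful of 8-bit samples. This is a minimal sketch, not the Weka/Apriori flow used in the paper; the helper names (`bit_plane`, `rule_stats`, `vertical_correlation`) and the four toy sample values are hypothetical.

```python
import numpy as np


def bit_plane(samples: np.ndarray, k: int) -> np.ndarray:
    """Return bit k of each 8-bit sample, with k=1 the MSB and k=8 the LSB."""
    return (samples >> (8 - k)) & 1


def rule_stats(antecedent: np.ndarray, consequent: np.ndarray) -> tuple[float, float]:
    """Support and confidence of the rule antecedent -> consequent.

    Both arguments are boolean arrays with one entry per transaction
    (here: one entry per chroma sample).
    """
    support = float(np.mean(antecedent & consequent))   # P(X and Y)
    confidence = float(np.mean(consequent[antecedent]))  # P(Y | X)
    return support, confidence


def vertical_correlation(bits: np.ndarray) -> float:
    """Fraction of neighboring samples whose bit value does not switch."""
    return float(np.mean(bits[1:] == bits[:-1]))


# Toy Cb samples (hypothetical values chosen for illustration only).
cb = np.array([0b10000000, 0b10000001, 0b01000000, 0b11000000], dtype=np.uint8)
cb1, cb2 = bit_plane(cb, 1), bit_plane(cb, 2)

# Rule "Cb2 = 0 -> Cb1 = 1" evaluated over the toy samples.
support, confidence = rule_stats(cb2 == 0, cb1 == 1)
msb_correlation = vertical_correlation(cb1)
print(support, confidence, msb_correlation)
```

On real data, each decoded YUV plane would be flattened into such an array, and the same three quantities would be gathered per bit position.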
By comparing the correlation percentages and the association rules, we can identify the best combination of association rules and correlation between bits to construct an optimal pattern for data self-recovery.

3.3 Optimal Data Patterns for Self-Recovery

In order to select an optimal data pattern from association and correlation, we define the Weighted Confidence based on the support and confidence of a particular rule as follows:

$$\text{Weighted Confidence} = \text{Confidence}(\text{Rule}) \times \text{Support}(\text{Rule}) + \text{Confidence}(\text{Complement Rule}) \times \text{Support}(\text{Complement Rule}) \qquad (1)$$

For example, the Weighted Confidence of the association rule Cr1 → Cr2 can be expressed as

$$\text{Weighted Confidence of } Cr1 \rightarrow Cr2 = \text{Confidence}(Cr1{=}0 \rightarrow Cr2{=}1) \times \text{Support}(Cr1{=}0 \rightarrow Cr2{=}1) + \text{Confidence}(Cr1{=}1 \rightarrow Cr2{=}0) \times \text{Support}(Cr1{=}1 \rightarrow Cr2{=}0) \qquad (2)$$

We then compare this parameter with the Correlation, which is the sum of the non-switching correlation values for 0 and 1. This is equivalent to the Weighted Confidence calculation but instead uses the correlation percentages of the individual bit values (0 or 1), and is calculated as follows:

$$\text{Correlation} = \text{Confidence}(Bit_{previous}{=}0 \rightarrow Bit_{current}{=}0) + \text{Confidence}(Bit_{previous}{=}1 \rightarrow Bit_{current}{=}1) \qquad (3)$$

where $Bit_{previous}$ and $Bit_{current}$ represent the video data bits in the same position of two neighboring pixels. As an example, the Correlation of Cr2 is

$$\text{Correlation of } Cr2 = \text{Confidence}(Cr2_{previous}{=}0 \rightarrow Cr2_{current}{=}0) + \text{Confidence}(Cr2_{previous}{=}1 \rightarrow Cr2_{current}{=}1) \qquad (4)$$

Accordingly, we obtain the optimal bit-level data patterns with high prediction rate to enable self-recovery, as listed in Table 4 for luma (Y) and Table 5 for chroma (Cr and Cb).

TABLE 5 (Cb bits)
OPTIMAL CHROMA DATA PATTERNS FROM 25 YOUTUBE-8M VIDEOS, SEPARATE FROM 10,000 VIDEOS USED BEFORE
Cb bit: optimal data pattern (the Correct Prediction (%) column is missing)
Cb1: Cb2 → Cb1
Cb2: Cb1 → Cb2
Cb3: (Cb3 previous)
Cb4: (Cb4 previous)
Cb5: (Cb5 previous)
Cb6: (Cb6 previous)
Cb7: (Cb7 previous)
Cb8: (Cb8 previous)
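The selection between a horizontal association rule and the vertical correlation described in Section 3.3 can be sketched as follows. This is an illustration under a stated assumption rather than the authors' exact computation: Support is taken here to be the fraction of samples that match a rule's antecedent, so that each Confidence × Support product is the probability that the rule fires and predicts correctly. The vertical case is scored with the same function (predict the current bit as equal to the previous one). All function names and the synthetic bit streams are hypothetical.

```python
import numpy as np


def confidence(x: np.ndarray, y: np.ndarray, xv: int, yv: int) -> float:
    """P(y == yv | x == xv), or 0.0 if the antecedent never occurs."""
    mask = x == xv
    return float(np.mean(y[mask] == yv)) if mask.any() else 0.0


def antecedent_support(x: np.ndarray, xv: int) -> float:
    """Fraction of samples with x == xv (assumed meaning of Support here)."""
    return float(np.mean(x == xv))


def weighted_confidence(x: np.ndarray, y: np.ndarray, invert: bool) -> float:
    """Sum over the rule and its complement of Confidence x Support.

    invert=True scores "y is the inverse of x" (e.g. Cr1=0 -> Cr2=1 and
    Cr1=1 -> Cr2=0); invert=False scores "y equals x".
    """
    total = 0.0
    for xv in (0, 1):
        yv = 1 - xv if invert else xv
        total += confidence(x, y, xv, yv) * antecedent_support(x, xv)
    return total


# Synthetic bit streams: Cr2 is the inverse of Cr1 about 90% of the time.
rng = np.random.default_rng(0)
cr1 = rng.integers(0, 2, size=10_000)
cr2 = np.where(rng.random(10_000) < 0.1, cr1, 1 - cr1)

horizontal = weighted_confidence(cr1, cr2, invert=True)         # Cr1 -> Cr2
vertical = weighted_confidence(cr2[:-1], cr2[1:], invert=False)  # previous -> current
best_pattern = "horizontal" if horizontal >= vertical else "vertical"
print(horizontal, vertical, best_pattern)
```

With this reading of Support, the score equals the overall fraction of correctly predicted bits, which is why it can be compared directly across the two pattern types.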
TABLE 5 (Cr bits)
Cr bit: optimal data pattern (the Correct Prediction (%) column is missing)
Cr1: Cr2 → Cr1
Cr2: Cr1 → Cr2
Cr3: Cr1 → Cr3
Cr4: (Cr4 previous)
Cr5: (Cr5 previous)
Cr6: (Cr6 previous)
Cr7: (Cr7 previous)
Cr8: (Cr8 previous)

25 videos from YouTube-8M, separate from the 10,000 videos used in the training dataset, are used to verify the correct prediction percentage shown in Table 4 and Table 5. These videos are obtained using the same method as before, but are distinct from the previous 10,000 videos to ensure that our technique works properly for correction. Our analysis shows that luma data is more random and has less association with other bits in the same pixel, so the optimal luma data patterns all come from correlation.

3.4 Recovery Failure Caused by Double Faults in Data Pattern

Since each discovered optimal data pattern used for self-recovery relates two bits in the same wordline, recovery may fail if both bits in a pattern fail simultaneously. Table 6 lists the recovery failure rate. It shows that DPSR has good reliability, with an extremely low recovery failure rate (less than 0.2%). This is because there is a low probability of having multiple faults in the same wordline, as discussed in Section 2.

TABLE 6
DPSR RECOVERY FAILURE RATE
Columns: SRAM Pfail $10^{-3}$ (0.001); SRAM Pfail $10^{-2}$ (0.01). Rows: Correction Faults; Double Faults; DPSR Recovery Failure. (Numeric entries are missing.)

4 DPSR HARDWARE IMPLEMENTATION

Utilizing the obtained optimal bit-level data patterns, we present a simple but efficient DPSR hardware design with low implementation cost. Fig. 4 shows the array architecture of the proposed DPSR, where the total array size is 32 kbits, organized as four blocks of 256 words × 32 bits. In this design, both luma data and chroma data are stored in the same SRAM but in different blocks. Block 1 and block 2 store the luma data, and each wordline stores the luma data of 4 pixels. Block 3 and block 4 store the chroma data, and each wordline stores the chroma data of 2 pixels.

For the luma data stored in blocks 1 and 2, the optimal luma patterns obtained in Section 3 are vertical correlation rules; that is, the luma data of the previous pixel is used to recover the data of the current pixel.
Since SRAM reads are row-wise, reading two physical rows would cause a considerable performance penalty. Accordingly, we adapt the vertical correlation based luma self-recovery to a hardware-friendly design scheme. Because each wordline stores the luma data of 4 pixels, we use the neighboring pixel in the same row to correct data in the current pixel. For example, if a data bit in pixel 1 fails (see Fig. 4), we use the corresponding bit in pixel 2 for recovery; if a bit in pixel 4 fails, we use the corresponding bit in pixel 3 (which is the neighboring pixel in the same row) for recovery. To verify that our design maximizes the correct bit predictions, we calculated the correct prediction percentage for predicting each bit using the corresponding bit of both the previous and the next pixel. Our analysis shows that they have approximately equal values (both ~79.4% average correct prediction), based on calculations from all samples in our training and verification video benchmarks [27, 28].

Chroma data self-recovery is implemented in SRAM block 3 and block 4 using the optimal chroma patterns. As shown in Fig. 4, each wordline stores two pixels of chroma data with both Cr and Cb. Both vertical correlation rules and horizontal data pattern rules are used for the chroma data self-recovery (see Table 5). Similarly, for recovery based on vertical correlation rules, we use the neighboring pixel stored in the same row for data correction, thereby avoiding a performance penalty. For example, if Cb1 in Pixel 1 fails (see Fig. 4), the inverted value of Cb2 in the same pixel is used for recovery; if Cr4 in Pixel 1 fails, Cr4 in Pixel 2, stored in the same row, is used for recovery.

As shown in Fig. 4, a hierarchical readout bitline (RBL) scheme (local RBL and global RBL) is applied to reduce the access time of the memory. The self-recovery logic of DPSR can be implemented simply by connecting multiplexers (MUX) to the readout bitlines of a conventional SRAM.
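The recovery rules described above can be modeled behaviorally on a 32-bit word. The sketch below is not the authors' circuit: the bit layout (pixel 1 in bits 31..24 for luma; Cb of chroma pixel 1 in bits 23..16 with Cb1 at bit 23), the pairing of luma pixels as (1, 2) and (3, 4), and the function names are assumptions made for illustration.

```python
def recover_luma_bit(word: int, fault_bit: int) -> int:
    """Replace a faulty luma bit with the same bit of the neighboring pixel.

    Assumed layout: four 8-bit luma pixels per 32-bit word, pixel 1 in
    bits 31..24 and pixel 4 in bits 7..0; neighbors are paired as
    (pixel 1, pixel 2) and (pixel 3, pixel 4).
    """
    byte_index = fault_bit // 8            # 3 = pixel 1 ... 0 = pixel 4
    neighbor_byte = byte_index ^ 1         # the other pixel of the pair
    source_bit = neighbor_byte * 8 + fault_bit % 8
    value = (word >> source_bit) & 1
    return (word & ~(1 << fault_bit)) | (value << fault_bit)


def recover_from_inverted_bit(word: int, fault_bit: int, source_bit: int) -> int:
    """Replace a faulty bit with the inverse of another bit in the same word.

    Models a horizontal rule such as "Cb1 is the inverse of Cb2".
    """
    value = 1 - ((word >> source_bit) & 1)
    return (word & ~(1 << fault_bit)) | (value << fault_bit)


# Luma example: pixels 1 and 2 both hold 0x80, but the MSB of pixel 1
# (bit 31) was read back as 0 from a faulty cell.
faulty_luma = 0x00800000
fixed_luma = recover_luma_bit(faulty_luma, fault_bit=31)

# Chroma example: Cb1 (assumed bit 23) is faulty and Cb2 (bit 22) reads 0,
# so Cb1 is recovered as 1.
fixed_chroma = recover_from_inverted_bit(0x00000000, fault_bit=23, source_bit=22)
print(hex(fixed_luma), hex(fixed_chroma))
```

In hardware this selection is a 2-to-1 multiplexer per global bitline, with the fault-position bit as the select signal; the functions above only mimic that choice for one known fault position at a time.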
Each global bitline (gbl) is connected to a MUX that is controlled by the received fault positions. If a fault is indicated, self-recovery is enabled by selecting the data pattern: the fault position information serves as the select signal of the MUX, which determines the bit that is output. Similar to other existing fault-position aware mitigation techniques, DPSR receives pre-determined locations of the faulty bits, which are usually obtained either during post-fabrication testing or during power-on self-test (POST) [14, 21, 22]. Such a testing process can also be used to track memory failures caused by temporal degradation, such as the aging effect.

Fig. 4. Proposed DPSR with data self-recovery ability. (Diagram labels: read decoders, 256 wordlines each; SRAM blocks 1 to 4 (256*32) with write drivers; self-recovery MUX and readout; each SRAM block X (256*32) is divided into sub-arrays 1 to 8 (32 x 32); SRAM blocks 1 and 2 hold luma of pixels 1 to 4 in bits 31 to 0; SRAM blocks 3 and 4 hold Cr and Cb of pixels 1 and 2 in bits 31 to 0; signals wbl[31:0], wblb[31:0], lbl1[31:0], lbl2[31:0], PRE, gbl[31:0], gblx[31:0], S; luma self-recovery MUX with select signals S31 to S0 for Y31 to Y0, pairing for example gblx31 with gblx23 and gblx0 with gblx8; chroma self-recovery MUX with select signals S31 to S0 for Cr1, Cr2, ..., Cb8, pairing for example gblx31 with gblx30 and gblx0 with gblx16.)

The evaluation results in the following sections show

that DPSR also achieves smaller silicon area overhead, while delivering good output quality at near-threshold voltage.

5 EVALUATION METHODOLOGY AND RESULTS

To evaluate the effectiveness of the proposed technique, a 32kb SRAM is implemented using a high-performance 45-nm FreePDK CMOS process [31] to meet the multi-megahertz performance requirement of today's mobile video decoders.

5.1 Performance

We first evaluate the performance of the proposed DPSR. Due to the added multiplexers, the read access time of DPSR increases from 0.27 ns to 0.31 ns, which is fast enough to deliver high-quality video formats such as 8K Ultra HD [25].

5.2 Layout

As discussed before, embedded SRAMs usually occupy a large portion of the area of a video chip, and therefore the area cost of the embedded SRAM is an important design concern. Fig. 5 shows the layout of DPSR. Each added self-recovery logic block (MUX) occupies an area of [value missing] µm × [value missing] µm, resulting in 7.94% area overhead. Note that because the self-recovery logic is added to the readout bitlines, increasing the number of words in a memory helps to reduce the area overhead.

Fig. 5. Layout design of DPSR (with 7.94% area overhead). (Labeled blocks: read decoder, write decoder, luma MUX, chroma MUX, and SRAM blocks 1 to 4 (32 × 256 each); the dimensions in µm are missing.)

5.3 Power Efficiency

To evaluate the power effectiveness of the proposed technique, we model the power consumption of the memory as:

$$P_{Dynamic} = \sum_{j=0}^{31} \sum_{i=0}^{1} P_j(i)\,\frac{R(i) + W(i)}{2} \qquad (5)$$

$$P_{Leak} = \sum_{j=0}^{31} \sum_{i=0}^{1} P_j(i)\,L(i) \qquad (6)$$

where $P_{Dynamic}$ and $P_{Leak}$ are the dynamic and leakage power consumption, respectively; $i$ is the value stored in the SRAM; and $j$ is the bit number in a word, which ranges from 0 to 31. $P_j(i)$ is the probability that data bit $j$ is 0 or 1, which is extracted from various video benchmarks. $R(i)$, $W(i)$, and $L(i)$ are the read power, write power, and leakage power consumption, respectively, of a single SRAM bitcell storing the data bit $i$. Fig.
6 compares the power consumption of different memory operations. As expected, all power components decrease as the voltage scales from 1 V to 0.5 V. It should be noted that the power consumption overhead caused by the self-correction logic in the proposed technique is negligible compared to the power reduction enabled by lowering the voltage to near-threshold, since the dynamic and leakage power consumption scale quadratically and linearly with voltage, respectively. The proposed memory at 0.5 V consumes 219 µW dynamic power and 193 µW leakage power. As compared to the conventional memory at 1 V, 81.52% dynamic power savings and 82.45% leakage power savings can be achieved by the proposed technique.

Fig. 6. Power consumption of different memory operations (W(0): write 0, W(1): write 1, R(0): read 0, R(1): read 1, L(0): leakage power of storing 0, L(1): leakage power of storing 1).

5.4 Video Output Quality Analysis

Different from the video benchmark sets used in Section 3, we use a new video benchmark set for verification: 3 videos from [26] and another 5 videos from [27]. We adopt the well-known peak signal-to-noise ratio (PSNR) metric to evaluate the video quality, which is defined as [19]

$$PSNR = 10 \log_{10}\left(\frac{255^2}{MSE}\right) \qquad (7)$$

where MSE is the mean square error between the original videos (Org) and the degraded videos (Deg), expressed as

$$MSE = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \left[ Org(i,j) - Deg(i,j) \right]^2 \qquad (8)$$

Researchers have shown that a PSNR of 30 dB or higher is acceptable for a video [14]. Table 7 compares the PSNR values of different techniques when Pfail is $10^{-2}$ (for minimum-sized SRAM) and $10^{-3}$ (for upsized SRAM with 58% area overhead [16]). In addition to the video benchmarks, 10 YouTube-8M videos from the 25 videos used for verification earlier in Table 4 and Table 5 (separate from the 10,000 videos used in the training dataset) are used for calculating the video metrics presented. Due to the limited space, Fig.
7 shows six video output images with memory failure of 10-2 when failures are injected. It can be seen that DPSR has good recovery precision and it can deliver good video quality with PSNR over 35 db, even for minimumsized SRAM. Accordingly, DPSR achieves good video output quality at near-threshold voltage. 2
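The quality evaluation described above (inject failures, then compute the MSE and PSNR of Eqs. (7) and (8)) can be sketched in Python as follows. This is only an illustration, not the authors' evaluation flow: the function names `inject_bit_failures`, `mse`, and `psnr` are hypothetical, a frame is represented as a flat list of 8-bit pixel values, and the failure model (each stored bit flips independently with probability Pfail) is an assumption made for the example.

```python
import math
import random


def inject_bit_failures(pixels, p_fail, rng):
    """Return a copy of 8-bit pixels in which each stored bit is flipped
    independently with probability p_fail (assumed failure model)."""
    degraded = []
    for value in pixels:
        for bit in range(8):
            if rng.random() < p_fail:
                value ^= 1 << bit  # flip this bit of the stored pixel
        degraded.append(value)
    return degraded


def mse(org, deg):
    """Mean square error between original and degraded pixel lists, Eq. (8)."""
    if len(org) != len(deg) or not org:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - d) ** 2 for o, d in zip(org, deg)) / len(org)


def psnr(org, deg, peak=255):
    """PSNR in dB for 8-bit data, Eq. (7); infinite when the inputs match."""
    err = mse(org, deg)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    frame = [rng.randrange(256) for _ in range(64 * 64)]  # synthetic frame
    faulty = inject_bit_failures(frame, 1e-2, rng)
    print("PSNR without recovery (dB):", psnr(frame, faulty))
```

Flipping every bit with equal probability treats most-significant and least-significant bits alike, which is why an unprotected memory loses quality quickly: a single flipped most-significant bit changes a pixel by 128.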

Fig. 7. Video output using different video storage techniques. Columns: original video, conventional (Pfail = 0.01), DPSR (Pfail = 0.01), and shift [13] (Pfail = 0.01). Rows: city, crew, football, Concert, Game, and Electric Guitar, each panel annotated with its PSNR and SSIM.
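The SSIM values annotated in Fig. 7 follow the structural similarity index of [23], given in Eq. (9). A minimal Python sketch of that simplified form is shown below. It is an illustration under stated assumptions rather than the authors' implementation: the function name `ssim_global` is hypothetical, the statistics are computed once over the whole frame (the method in [23] evaluates them over local windows and averages the results), and the constants use the conventional defaults K1 = 0.01 and K2 = 0.03 with an 8-bit peak value of 255.

```python
def ssim_global(x, y, peak=255.0, k1=0.01, k2=0.03):
    """Single-window SSIM between two equally sized pixel lists.

    Implements (2*mu_x*mu_y + C1)(2*sigma_xy + C2) /
               ((mu_x^2 + mu_y^2 + C1)(sigma_x^2 + sigma_y^2 + C2)).
    """
    if len(x) != len(y) or not x:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    c1 = (k1 * peak) ** 2  # stabilizes the luminance term
    c2 = (k2 * peak) ** 2  # stabilizes the contrast/structure term

    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n

    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


if __name__ == "__main__":
    original = [10, 50, 90, 130, 170, 210]
    degraded = [10, 50, 90, 2, 170, 210]  # one pixel corrupted
    print("SSIM:", ssim_global(original, degraded))
```

Because the score depends on means, variances, and covariance rather than on per-pixel error alone, two degraded frames with the same MSE can receive different SSIM values.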

TABLE 7
VIDEO PSNR METRIC COMPARISON
Columns: Videos | Conventional (Pfail = 0.001) | DPSR (Pfail = 0.001) | Conventional (Pfail = 0.01) | DPSR (Pfail = 0.01) | Ref. [13] (Pfail = 0.001) | Ref. [13] (Pfail = 0.01)
Video benchmarks: akiyo, bus, city, coastguard, crew, football, foreman, sign_irene
YouTube-8M dataset: Running, Concert, Music Video, Festival, Game, Electric Guitar, Snow, Flute, Vehicle, Planet

TABLE 8
VIDEO SSIM METRIC COMPARISON
Columns: Videos | Conventional (Pfail = 0.001) | DPSR (Pfail = 0.001) | Conventional (Pfail = 0.01) | DPSR (Pfail = 0.01) | Ref. [13] (Pfail = 0.001) | Ref. [13] (Pfail = 0.01)
Video benchmarks: akiyo, bus, city, coastguard, crew, football, foreman, sign_irene
YouTube-8M dataset: Running, Concert, Music Video, Festival, Game, Electric Guitar, Snow, Flute, Vehicle, Planet

The PSNR metric has been used extensively to describe video output quality quantitatively, but recent efforts to capture true human perception show that it may not accurately describe the video quality a human actually perceives [23]. PSNR is based on the summed error of every pixel's chrominance and luminance component values, and this alone is not necessarily a good estimate of the user's perception of the video. The structural similarity (SSIM) metric is more aware of the user's perception because it accounts for luminance, contrast, and structural changes in the video. The general form of the SSIM equation is defined as [23]

$$SSIM(x,y) = [l(x,y)]^{\alpha} \, [c(x,y)]^{\beta} \, [s(x,y)]^{\gamma}$$

$$SSIM(x,y) = \frac{(2\mu_x \mu_y + C_1)(2\sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)} \qquad (9)$$

where $l(x,y)$, the luminance comparison, is a function of the mean intensities $\mu_x$ and $\mu_y$; $c(x,y)$, the contrast comparison, is a function of the standard deviations $\sigma_x$ and $\sigma_y$; and $s(x,y)$, the structural comparison, is a function of the correlation between $x$ and $y$, $\sigma_{xy}$. Setting $\alpha = \beta = \gamma = 1$ in the first equation results in the second equation. $C_1$ ($C_2$) is a constant included to avoid instabilities when the sum of the squared means (squared standard deviations), $\mu_x^2 + \mu_y^2$ ($\sigma_x^2 + \sigma_y^2$), is near zero. The SSIM value is at most 1 and, in practice, typically lies between 0 and 1. As SSIM(x,y) gets closer to 1, the quality of video y more closely matches the quality of video x. For our testing purposes, video x is the raw, uncompressed YUV video before the decoding process, and video y is the decoded YUV video, which may or may not have had bit-shifting or correction changes applied to it. The results of these SSIM calculations for the conventional memory and DPSR are listed in Table 8. DPSR achieves a significantly higher SSIM value than the conventional memory, which has no failure correction.

5.5 Comparison with Prior Work

Table 9 compares the performance of DPSR with the state of the art. With its data-pattern enabled self-recovery ability, DPSR exhibits low implementation cost (7.94%) and reliable operation at near-threshold voltage to achieve maximum energy efficiency.

Comparing with State-of-the-Art Data-Shifting [13]: Tables 7 and 8 also compare the video output quality of the proposed DPSR and the data-shifting technique presented in [13]. As shown, the data-shifting technique [13] has slightly better quality in terms of the PSNR and SSIM metrics than our proposed technique, but it is realized with a large area overhead (~14%). This is because the shifting scheme needs to calculate the shift values based on the received fault positions and then perform shifting to store the least-significant bits in the identified faulty bitcells.

Comparing with State-of-the-Art Data-Squeezing [14]: The data-squeezing technique presented in [14] is another recently developed memory failure mitigation technique. Based on the observation that, for many general-purpose applications, the last-level cache contains large amounts of null data, this technique compresses null subblocks so that they can be allocated to memory entries with faulty cells.
This technique works well for register files and caches in general-purpose applications, which store as much as 79.23% zeros, as discussed in [34]. However, it is not suitable for videos because 8-bit video pixel data varies widely between 0 and 255, which makes zero compression difficult.

Comparing with State-of-the-Art Error Correction Code (ECC): ECCs have also been studied in ultra-low-voltage contexts to protect against memory failures [35]. Similar to redundancy-based repair mechanisms, implementing ECC requires either increasing the capacity of the memory or sacrificing part of its effective capacity to store check bits. In addition to the memory space overhead, complex logic for ECC encoding and decoding must be added, which brings a significant implementation penalty. For example, with the orthogonal Latin square codes discussed in [35], half of the memory capacity is used to store ECC bits.

In our big-data enabled memory technique, the general data patterns existing in large-scale videos have been identified and are used to achieve self-correction in the presence of memory failures. The overhead of the self-correction logic is significantly reduced compared with existing techniques. As a result, DPSR is capable of delivering the best video quality for the least area overhead.

TABLE 9
COMPARISON WITH PRIOR WORK
Metric | TCAS-I'12 [24] | DAC'15 [13] | TC'16 [14] | This Work
Fault-position awareness | No | Yes | Yes | Yes
Low-power technique | bitcell sizing | data-shifting | data-squeezing | data-pattern enabled self-recovery
Bitcell modified | Yes | No | No | No
Near-threshold operation | No (0.9 V) | Yes (-) | Yes (0.5 V) | Yes (0.5 V)
Power efficiency | Bad | Good | Good | Good
Additional logic needed | No | LUTs and shifter | Rearrangement logic and tag array, comparator, MUX | MUX
Performance overhead | - | - | extra clock (for decompression) | 0.04 ns
Video quality | acceptable | good | does not apply | good
Area overhead | 11-65% | 14% (1) | 6.3% | 7.94%
(1) Depending on the number of shifting bits.

6 CONCLUSION

In this paper, we have presented a data-pattern enabled SRAM with self-recovery ability for big video data. Based on the data patterns obtained by data-mining techniques, a simple circuit-level design technique is applied to enable self-recovery with low area overhead (7.94%). Our design successfully delivers good video quality for minimum-sized SRAM at near-threshold voltage (with a failure rate of $10^{-2}$).

ACKNOWLEDGMENT

This work was supported in part by the National Science Foundation under Grant CCF and CNS, the ND NASA EPSCoR, the ND Venture grant, the NDSU-RCA funding, and the Offerdahl Foundation. Na Gong and Jinhui Wang are the corresponding authors.

REFERENCES

[1] IDC (2012). "The Digital Universe in 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East." December. [Online]. Available:
[2] K. Kim, "Silicon Technologies and Solutions for the Data-Driven World," in Proc. IEEE International Solid-State Circuits Conference (ISSCC), Feb. 2015.
[3] N. Rastogi, "You Charged Me All Night Long," [Online]. Available:

ntern/2009/10/you_charged_me_all_night_long.html.
[4] M. E. Sinangil and A. P. Chandrakasan, "Application-Specific SRAM Design Using Output Prediction to Reduce Bit-Line Switching Activity and Statistically Gated Sense Amplifiers for Up to 1.9× Lower Energy/Access," IEEE Journal of Solid-State Circuits, vol. 49, no. 1, Jan.
[5] K. Nii, M. Yabuuchi, Y. Tsukamoto, S. Ohbayashi, S. Imaoka, H. Makino, Y. Yamagami, S. Ishikura, T. Terano, T. Oashi, K. Hashimoto, A. Sebe, G. Okazaki, K. Satomi, H. Akamatsu, and H. Shinohara, "A 45-nm Bulk CMOS Embedded SRAM With Improved Immunity Against Process and Temperature Variations," IEEE J. Solid-State Circuits, vol. 43, no. 1, Jan.
[6] O. Hirabayashi, A. Kawasumi, A. Suzuki, Y. Takeyama, K. Kushida, T. Sasaki, A. Katayama, G. Fukano, Y. Fujimura, T. Nakazato, Y. Shizuki, N. Kushiyama, and T. Yabe, "A Process-Variation-Tolerant Dual-Power-Supply SRAM With Cell in 40 nm CMOS Using Level-Programmable Wordline Driver," in Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC), Feb. 2009.
[7] K. Takeda, Y. Hagihara, Y. Aimoto, M. Nomura, Y. Nakazawa, T. Ishii, and H. Kobatake, "A Read-static-noise-margin-free SRAM Cell for Low-VDD and High-speed Applications," IEEE J. Solid-State Circuits, vol. 41, no. 1, Jan.
[8] T.-H. Kim, J. Liu, and C. H. Kim, "A Voltage Scalable 0.26 V, 64 kb 8T SRAM with Vmin Lowering Techniques and Deep Sleep Mode," IEEE J. Solid-State Circuits, vol. 44, no. 6.
[9] M.-F. Chang, S.-W. Chang, P.-W. Chou, and W.-C. Wu, "A 130 mV SRAM with Expanded Write and Read Margins for Subthreshold Applications," IEEE J. Solid-State Circuits, vol. 46, no. 2, Feb.
[10] Y.-W. Chiu, Y.-H. Hu, M.-H. Tu, J.-K. Zhao, Y.-H. Chu, S.-J. Jou, and C.-T. Chuang, "40 nm Bit-Interleaving 12T Subthreshold SRAM With Data-Aware Write-Assist," IEEE Trans. Circuits Syst. I, vol. 61, no. 9, Sep.
[11] M. K. Qureshi and Z. Chishti, "Operating Secded-based Caches at Ultralow Voltage with Flair," in Proc. Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), 2013.
[12] A. Ansari, S. Feng, S. Gupta, and S. A. Mahlke, "Archipelago: A Polymorphic Cache Design for Enabling Robust Near-threshold Operation," in Proc. IEEE Symp. on High Performance Computer Architecture (HPCA), 2011.
[13] S. Ganapathy, G. Karakonstantis, A. Teman, and A. Burg, "Mitigating the Impact of Faults in Unreliable Memories for Error-Resilient Applications," in Proc. Design Automation Conf. (DAC), 2015.
[14] A. Ferreron, D. S. Gracia, J. Alastruey-Benedé, T. Monreal-Arnal, and P. E. Ibáñez, "Concertina: Squeezing in Cache Content to Operate at Near-Threshold Voltage," IEEE Trans. on Computers, vol. 65, no. 3, Mar.
[15] N. Gong, S. Jiang, A. Challapalli, M. Panesar, and R. Sridhar, "Variation-and-Aging Aware Low Power embedded SRAM for Multimedia Applications," in Proc. 25th IEEE International SoC Conference (SoCC'12), 2012.
[16] S. Zhou, S. Katariya, H. Ghasemi, S. Draper, and N. S. Kim, "Minimizing Total Area of Low-Voltage SRAM Arrays Through Joint Optimization of Cell Size, Redundancy, and ECC," in Proc. Int. Conf. on Computer Design (ICCD), 2010.
[17] R. Agrawal, T. Imielinski, and A. Swami, "Mining Association Rules between Sets of Items in Large Databases," in Proc. Association for Computing Machinery's Special Interest Group on Management of Data (ACM SIGMOD) Conf., Washington, DC, May.
[18] "Weka 3: Data Mining Software in Java," [Online]. Available:
[19] N. Gong, S. Jiang, A. Challapalli, S. Fernandes, and R. Sridhar, "Ultra-Low Voltage Split-data-aware Embedded SRAM for Mobile Video Applications," IEEE Trans. on Circuits and Systems II, vol. 59, no. 12, Dec.
[20] H. Noguchi, Y. Iguchi, H. Fujiwara, Y. Morita, K. Nii, H. Kawaguchi, and M. Yoshimoto, "A 10T Non-precharge Two-port SRAM for 74% Power Reduction in Video Processing," in Proc. IEEE Computer Society Annual Symp. VLSI Circuits, Mar. 2007.
[21] A. R. Alameldeen, I. Wagner, Z. Chishti, W. Wu, C. Wilkerson, and S. L. Lu, "Energy-Efficient Cache Design Using Variable-Strength Error-Correcting Codes," in Proc. ISCA, 2011.
[22] J. Chang, M. Huang, J. Shoemaker, J. Benoit, S.-L. Chen, W. Chen, S. Chiu, R. Ganesan, G. Leong, V. Lukka, S. Rusu, and D. Srivastava, "The 65-nm 16-MB shared on-die L3 cache for the dual-core Intel Xeon Processor 7100 Series," IEEE J. Solid-State Circuits, vol. 42, no. 4, Apr.
[23] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, "Image quality assessment: from error visibility to structural similarity," IEEE Trans. on Image Processing, vol. 13, no. 4, Apr.
[24] J. Kwon, I. Lee, and J. Park, "Heterogeneous SRAM Cell Sizing for Low Power H.264 Applications," IEEE Trans. on Circuits and Systems I, vol. 99, no. 2, pp. 1-10, Feb.
[25] D. Zhou, S. Wang, H. Sun, J. Zhou, J. Zhu, Y. Zhao, J. Zhou, S. Zhang, and S. Goto, "A 4Gpixel/s 8/10b H.265/HEVC Video Decoder Chip for 8K Ultra HD Applications," in Proc. Int. Solid-State Circuits Conf. (ISSCC), Feb. 2016, San Francisco, CA.
[26] S. A. Pourbakhsh, X. Chen, D. Chen, X. Wang, N. Gong, and J. Wang, "Sizing-Priority Based Low-Power Embedded Memory for Mobile Video Applications," in Proc. International Symposium on Quality Electronic Design (ISQED), 2016, Santa Clara, CA.
[27] "YUV Video Sequences," [Online]. Available:
[28] "Xiph.org Video Test Media [derf's collection]," [Online]. Available:
[29] F. Frustaci, D. Blaauw, D. Sylvester, and M. Alioto, "Better-than-voltage scaling energy reduction in approximate SRAMs via bit dropping and bit reuse," in Proc. International Workshop on Power and Timing Modeling, Optimization and Simulation (PATMOS), Salvador, 2015.
[30] N. Gong, J. Edstrom, D. Chen, and J. Wang, "Data-Pattern Enabled Self-Recovery Multimedia Storage System for Near-Threshold Computing," in Proc. International Conference on Computer Design (ICCD), 2016, Phoenix, Arizona, accepted.
[31] "45-nm FreePDK." [Online]. Available:
[32] F. Sampaio, M. Shafique, B. Zatt, S. Bampi, and J. Henkel, "Energy-Efficient Architecture for Advanced Video Memory," in Proc. IEEE/ACM International Conf. on Computer-Aided Design (ICCAD), Nov. 2014.
[33] S. Venkataramani, S. T. Chakradhar, K. Roy, and A. Raghunathan, "Approximate computing and the quest for computing efficiency," in Proc. 52nd Annual Design Automation Conf. (DAC'15), Jun. 2015.
[34] N. Gong, J. Wang, S. Jiang, and R. Sridhar, "TM-RF: Aging Aware Power Efficient Register File Design for Modern Microprocessors," IEEE Trans. on Very Large Scale Integration (VLSI) Systems, vol. 23, no. 7, Jul.
[35] Z. Chishti, A. R. Alameldeen, C. Wilkerson, W. Wu, and S.-L. Lu, "Improving cache lifetime reliability at ultra-low voltages," in Proc. 42nd IEEE/ACM Int. Symp. Microarchit., 2009.

Jonathon Edstrom received the B.S. degree in computer engineering and the M.S. degree in electrical and computer engineering from North Dakota State University, Fargo, ND, in 2015 and 2017, respectively. He is currently pursuing the Ph.D. degree in electrical and computer engineering at North Dakota State University. His research focuses on data-driven intelligent energy-efficient hardware design.

Dongliang Chen received the B.S. degree in electrical engineering from Dalian University of Technology (DUT), Dalian, China. He is currently pursuing the Ph.D. degree in electrical and computer engineering at North Dakota State University, Fargo, ND. His research focuses on data-driven power-efficient mobile computing.

Yifu Gong received the B.S. degree in electrical engineering from North Dakota State University. He is currently working towards the Ph.D. degree in electrical and computer engineering at North Dakota State University, Fargo, ND. His research focuses on low-power embedded vision systems.

Jinhui Wang (M'13) received the B.E. degree in electrical engineering from Hebei University, Hebei, China, in 2004, and the Ph.D. degree in electrical engineering through a joint USA/China program between the University of Rochester and Beijing University of Technology. Dr. Wang is currently an Assistant Professor with the Department of Electrical and Computer Engineering at North Dakota State University, Fargo, ND, USA. His research interests include low-power, high-performance, and variation-tolerant integrated circuit design, 3-D IC and EDA methodologies, and thermal solutions in VLSI. He has over 100 publications and 20 patents in emerging semiconductor technologies.

Na Gong (M'13) received the B.E. degree in electrical engineering and the M.E. degree in microelectronics from Hebei University, Hebei, China, and the Ph.D. degree in computer science and engineering from the State University of New York, Buffalo, in 2004, 2007, and 2013, respectively. Dr. Gong is currently an Assistant Professor of Electrical and Computer Engineering at North Dakota State University, Fargo, ND, USA. Her research interests include data-driven energy-efficient VLSI circuits and systems and viewer-aware mobile systems, with an emphasis on memories.


An Efficient Reduction of Area in Multistandard Transform Core An Efficient Reduction of Area in Multistandard Transform Core A. Shanmuga Priya 1, Dr. T. K. Shanthi 2 1 PG scholar, Applied Electronics, Department of ECE, 2 Assosiate Professor, Department of ECE Thanthai

More information

Arithmetic Unit Based Reconfigurable Approximation Technique for Video Encoding

Arithmetic Unit Based Reconfigurable Approximation Technique for Video Encoding Arithmetic Unit Based Reconfigurable Approximation Technique for Video Encoding J.Jayakodi 1*, K.Sagadevan 2 1 ECE (Final year) IFET college of engineering, India. 2 Senior Assistant Professor, Department

More information

Overview of All Pixel Circuits for Active Matrix Organic Light Emitting Diode (AMOLED)

Overview of All Pixel Circuits for Active Matrix Organic Light Emitting Diode (AMOLED) Chapter 2 Overview of All Pixel Circuits for Active Matrix Organic Light Emitting Diode (AMOLED) ---------------------------------------------------------------------------------------------------------------

More information

1. INTRODUCTION. Index Terms Video Transcoding, Video Streaming, Frame skipping, Interpolation frame, Decoder, Encoder.

1. INTRODUCTION. Index Terms Video Transcoding, Video Streaming, Frame skipping, Interpolation frame, Decoder, Encoder. Video Streaming Based on Frame Skipping and Interpolation Techniques Fadlallah Ali Fadlallah Department of Computer Science Sudan University of Science and Technology Khartoum-SUDAN fadali@sustech.edu

More information

MPEG has been established as an international standard

MPEG has been established as an international standard 1100 IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, VOL. 9, NO. 7, OCTOBER 1999 Fast Extraction of Spatially Reduced Image Sequences from MPEG-2 Compressed Video Junehwa Song, Member,

More information

EFFICIENT DESIGN OF SHIFT REGISTER FOR AREA AND POWER REDUCTION USING PULSED LATCH

EFFICIENT DESIGN OF SHIFT REGISTER FOR AREA AND POWER REDUCTION USING PULSED LATCH EFFICIENT DESIGN OF SHIFT REGISTER FOR AREA AND POWER REDUCTION USING PULSED LATCH 1 Kalaivani.S, 2 Sathyabama.R 1 PG Scholar, 2 Professor/HOD Department of ECE, Government College of Technology Coimbatore,
