Out-of-order execution
  A. Requires extra stages in the pipeline
  B. Allows the processor to exploit parallelism between instructions
  C. Is used mostly in handheld computers
  D. A, B, and C
  E. A and B
CMP is short for
  A. Compare
  B. Common mode parallelism
  C. Cache Mostly Programs
  D. Chip Multiprocessor
  E. Concurrent Machine Programming
Coherence and consistency affect
  A. The order in which memory operations take effect
  B. How your food tastes
  C. How an OOO processor can execute add instructions
  D. The number of cores that can be in a CMP
  E. The depth of a processor's pipeline
The final is
  A. Only about topics covered on even-numbered days of the course
  B. All multiple choice, and the answers are all B
  C. Comprehensive
  D. Similar in format to the midterm
  E. C and D
CAPEs
  A. Are for super heroes
  B. Are open until the start of finals week
  C. Are very important
  D. All of the above
  E. None of the above
TA Evaluations and CAPE
  Please fill out your TA evaluations. You should have received a link to do so.
  CAPEs are also open.
  Clicker evaluation: https://www.surveymonkey.com/s/cse141sp14_swanson
Final review
  Come with questions
  Next Tuesday (the lecture will probably be brief)
  Next Thursday
  The final is comprehensive
  Look over the slides, homeworks, quizzes, and midterm
  6/10/2014 8:00am-11:00am in this room
Storage Steven Swanson
Humanity processed 9 Zettabytes in 2008*
  Welcome to the Data Age!
  *http://hmi.ucsd.edu
Solid State Memories
  NAND flash
    Ubiquitous, cheap
    Sort of slow, idiosyncratic
  Phase change, Spin torque MRAMs, etc.
    On the horizon
    DRAM-like speed
    DRAM or flash-like density
[Figure: Bandwidth Relative to Disk (log scale) plotted against 1/Latency Relative to Disk (log scale). Points: Hard Drives (2006), PCIe-Flash (2007), PCIe-PCM (2010), PCIe-Flash (2012), PCIe-PCM (2013?), DDR Fast NVM (2016?). Annotations: 5917x and 7200x, each labeled 2.4x/yr.]
Disk Density
  1 Tb/square inch
Hard drive Cost
  Today at newegg.com: $0.04/GB ($0.00004/MB)
  Desktop, 2 TB
Why Are Disks Slow?
  They have moving parts :-(
    The disk itself and a head/arm
    The head can only read at one spot.
  High-end disks spin at 15,000 RPM
    Data is, on average, 1/2 a revolution away: 2ms
    Power consumption limits spindle speed
    Why not run it in a vacuum?
  The head has to position itself over the right track
    Currently about 150,000 tracks per inch.
    Positioning must be accurate to within about 175nm
    Takes 3-13ms
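The rotational-latency and track-spacing figures on this slide can be checked with a little arithmetic. The sketch below is only an illustration; the function names are made up for this example, and it assumes the average wait is exactly half a revolution.

```python
def avg_rotational_latency_ms(rpm: float) -> float:
    """Average wait for the target sector: half of one revolution, in ms."""
    ms_per_revolution = 60_000.0 / rpm  # 60,000 ms per minute
    return ms_per_revolution / 2.0


def track_pitch_nm(tracks_per_inch: float) -> float:
    """Center-to-center track spacing in nanometers."""
    nm_per_inch = 25.4e6  # 1 inch = 25.4 mm = 25.4e6 nm
    return nm_per_inch / tracks_per_inch


latency = avg_rotational_latency_ms(15_000)   # 15,000 RPM from the slide
pitch = track_pitch_nm(150_000)               # 150,000 tracks per inch
print(f"rotational latency: {latency:.1f} ms, track pitch: {pitch:.0f} nm")
```

A 15,000 RPM spindle takes 4 ms per revolution, so half a revolution is 2 ms, and 150,000 tracks per inch puts neighboring tracks roughly 170 nm apart, which is in line with the positioning accuracy quoted above.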
Making Disks Faster
  Caching
    Everyone tries to cache disk accesses!
    The OS
    The disk controller
    The disk itself.
  Access scheduling
    Reordering accesses can reduce both rotational and seek latencies
RAID!
  Redundant Array of Independent (Inexpensive) Disks
  If one disk is not fast enough, use many
    Multiplicative increase in bandwidth
    Multiplicative increase in Ops/Sec
    Not much help for latency.
  If one disk is not reliable enough, use many.
    Replicate data across the disks
    If one of the disks dies, use the replica data to continue running and re-populate a new drive.
  Historical footnote: RAID was invented by one of the textbook authors (Patterson)
RAID Levels
  There are several ways of ganging together a bunch of disks to form a RAID array. They are called levels.
  Regardless of the RAID level, the array appears to the system as a sequence of disk blocks.
  The levels differ in how the logical blocks are arranged physically and how the replication occurs.
RAID 0
  Double the bandwidth.
  For an n-disk array, every n-th block lives on the same disk (block i is on disk i mod n).
  Worse for reliability
    If one of your drives dies, all your data is corrupt: you have lost every n-th block.
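A minimal sketch of RAID 0 block placement, assuming the usual round-robin striping with one block per stripe unit. The names `raid0_locate` and `blocks_lost` are hypothetical helpers for this example, not part of any RAID implementation.

```python
def raid0_locate(logical_block: int, num_disks: int) -> tuple:
    """Map a logical block number to (disk index, block offset on that disk)."""
    disk = logical_block % num_disks      # consecutive blocks rotate across disks
    offset = logical_block // num_disks   # position of the block within its disk
    return disk, offset


def blocks_lost(failed_disk: int, num_disks: int, total_blocks: int) -> list:
    """Logical blocks that become unreadable when one disk fails."""
    return [b for b in range(total_blocks)
            if raid0_locate(b, num_disks)[0] == failed_disk]


print(raid0_locate(5, 2))       # logical block 5 in a 2-disk array
print(blocks_lost(1, 4, 12))    # disk 1 of 4 fails in a 12-block array
```

Because consecutive blocks sit on different disks, a large sequential transfer keeps every disk busy, which is where the bandwidth gain comes from. The same layout means a single failure punches a hole in every stripe.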
RAID 1
  Mirror your data
    1/2 the capacity
    But, you can tolerate a disk failure.
  Double the bandwidth for reads
  Same bandwidth for writes.
Stripe your data across a bunch of disks
  Use one bit to hold parity information
    The number of 1s at corresponding locations across the drives is always even.
  If you lose one drive, you can reconstruct it from the others.
  Read and write all the disks in parallel.
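Even parity is the XOR of the data, so a lost drive can be rebuilt by XORing the survivors with the parity. The following is a small sketch under that assumption; the byte values and the helper name `xor_blocks` are invented for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR corresponding bytes of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data drives, two bytes each (made-up contents).
data = [b"\x0f\xa0", b"\x33\x55", b"\xf0\x0a"]
parity = xor_blocks(data)

# With the parity included, every bit position holds an even number of 1s,
# so XORing everything together gives all zeros.
check = xor_blocks(data + [parity])

# Suppose drive 1 dies: rebuild it from the surviving drives plus parity.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])
```

The same function computes the parity and performs the reconstruction, because XOR is its own inverse.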
The Flash Juggernaut
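Flash is Fast!
  Random 4KB reads from user space
                       Latency          Bandwidth
  Hard Drives          7.1ms  (1x)      2.6MB/s  (1x)
  PCIe-Flash (2007)    68us   (104x)    250MB/s  (96x)
Floating Gate Flash Operations
  [Figure: voltages applied to a floating-gate cell for the Read, Program, and Erase operations. Voltage labels shown: 0V, 1V, 5V, 20V.]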
Organizing Flash Cells into Chips
Organizing Flash Cells into Chips ~16K blocks/chip ~16-64Gbits/chip
Flash Operations
  [Figure: Block 0 ... Block n, each holding pages 0 ... n]
  Erase blocks
  Program pages
  SLC: Single Level Cell == 1 bit
  MLC: Multi Level Cell == 2 bits
  TLC: Triple Level Cell == 3 bits
Single-Level Cell (1 bit)
  Endurance: 100,000 Cycles
  Data retention: 10 years
  Read Latency: 25us
  Program Latency: 100-200us
Multi-Level Cell (2 bits)
  Endurance: 5000-10,000 Cycles
  Data retention: 3-10 years
  Read Latency: 25-37us
  Program Latency: 600-1800us
Triple-Level Cell (3 bits)
  Endurance: ~500-1000 Cycles
  Data retention: 3 years
  Read Latency: 60-120us
  Program Latency: 500-6500us
3D NAND
  SLC, MLC, and TLC NAND cells are 4F^2 devices.
    1.33-4F^2 per bit
  Higher densities require 3D designs
    Samsung has demonstrated 24 layers
    2-4x density boost
  http://bcove.me/xz2o1af5
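The per-bit area range follows from dividing the 4F^2 cell footprint by the number of bits stored per cell. The sketch below spells out that arithmetic; treating the footprint as exactly 4F^2 for all three cell types is the slide's simplification, and the variable names are only illustrative.

```python
CELL_AREA_F2 = 4.0  # planar NAND cell footprint, in units of F^2

BITS_PER_CELL = {"SLC": 1, "MLC": 2, "TLC": 3}

# Area per stored bit, in F^2, for each cell type.
area_per_bit = {name: CELL_AREA_F2 / bits for name, bits in BITS_PER_CELL.items()}

for name, area in area_per_bit.items():
    print(f"{name}: {area:.2f} F^2 per bit")
```

TLC reaches about 1.33 F^2 per bit and SLC sits at 4 F^2 per bit. Going lower than that in a planar layout means storing more bits per cell, which is why the slide turns to stacking layers instead.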
Flash Failure Mechanisms
  Program/Erase (PE) Wear
    Permanent damage to the gate oxide at each flash cell
    Caused by high program/erase voltages
    Damage causes charge to leak off the floating gate
  Program disturb
    Data corruption caused by interference from programming adjacent cells.
    No permanent damage
Making Disks out of Flash Chips
  Flash chips
    Read Pages
    Write Pages
    Erase Blocks
    Hierarchical addresses
    PE Wear
  Disks
    Read
    Write
    Flat address space
    No wear limitations
Writing Data
  The SSD maintains a map between virtual (logical) block addresses and physical flash locations.
Writing more data When you overwrite data, it goes to a new location.
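The two ideas above (a logical-to-physical map, and overwrites going to a new location) can be sketched in a few lines. This is a toy model, not any real SSD's firmware: `TinyFTL`, `PAGES_PER_BLOCK`, and the single advancing write point are assumptions made for the example, and it never reclaims stale pages.

```python
PAGES_PER_BLOCK = 4  # assumed; real blocks hold far more pages


class TinyFTL:
    """Toy translation layer: every write lands on the next erased page."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.flash = {}              # (block, page) -> data actually stored
        self.lba_map = {}            # logical block address -> (block, page)
        self.write_point = (0, 0)    # next erased page to program

    def write(self, lba, data):
        block, page = self.write_point
        if block >= self.num_blocks:
            raise RuntimeError("no erased pages left (a real FTL would clean up)")
        self.flash[(block, page)] = data
        self.lba_map[lba] = (block, page)   # the old copy, if any, is now stale
        page += 1
        if page == PAGES_PER_BLOCK:         # block full: move to the next block
            block, page = block + 1, 0
        self.write_point = (block, page)

    def read(self, lba):
        return self.flash[self.lba_map[lba]]


ftl = TinyFTL(num_blocks=2)
ftl.write(0, "old")
first_location = ftl.lba_map[0]
ftl.write(0, "new")                 # overwrite of the same logical address
second_location = ftl.lba_map[0]
print(first_location, second_location, ftl.read(0))
```

The overwrite never touches the page holding "old". Only the map entry changes, which is what lets the device avoid erasing a whole block on every update.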
Flash Translation Layer (FTL)
  [Figure: Software -> FTL -> Flash]
  User
    Logical Block Address
  Flash
    Write pages in order
    Erase/Write granularity
    Wears out
  FTL
    Logical -> Physical map
    Wear leveling
    Power cycle recovery
Centralized FTL State
  Map
    LBA   Physical Page Address
    0     Block 5   Page 7
    2k    Block 27  Page 0
    4k    Block 10  Page 2
  Write Point
    Flash pages shown: 101001011010001, 010100100101011, 101010110101001, 111111111111111, 111111111111111, 111111111111111
  Block Info Table (Next Sequence Number: 12)
    Block  Erased  Erase Count  Valid Page Count  Sequence Number  Bad Block Indicator
    0      False   3            15                5                False
    1      True    7            0                 -                False
    2      False   0            4                 9                False
Read
  1. Software: Read Data at LBA 2k
  2. FTL: Map lookup
       LBA   Physical Page Address
       0     Block 5   Page 7
       2k    Block 27  Page 0
       4k    Block 10  Page 2
  3. Flash: Flash Operation
Write (mid block)
  Write 0101101011001010 to LBA 2k
  Write Point = Block 2, Page 5
  Map
    LBA   Physical Page Address
    0     Block 5   Page 7
    2k    Block 0   Page 0
    4k    Block 10  Page 2
  Flash pages shown: 1010010111010101, 0101001010111011, 1010101101001010
  Block Info Table (Next Sequence Number: 12)
    Block  Erased  Erase Count  Valid Page Count  Sequence Number  Bad Block Indicator
    0      False   3            15                5                False
    1      True    7            0                 -                False
    2      False   0            4                 9                False
Write
  Write 0101101011001010 to LBA 2k
  Write Point: Block 2, Page 5 -> Block 2, Page 6
  Map
    LBA   Physical Page Address
    0     Block 5   Page 7
    2k    Block 0 -> 2   Page 0 -> 5
    4k    Block 10  Page 2
  Flash pages shown: 1010010111010101, 0101001010111011, 1010101101001010, 0101101011001010
  Block Info Table (Next Sequence Number: 12)
    Block  Erased  Erase Count  Valid Page Count  Sequence Number  Bad Block Indicator
    0      False   3            15 -> 14          5                False
    1      True    7            0                 -                False
    2      False   0            4 -> 5            9                False
Block Info Table
  Steps: Move Valid Pages, Erase
    Block  Erased  Erase Count  Valid Page Count  Sequence Number  Bad Block Indicator
    0      False   3            13                5                False
    1      False   7            1                 12               False
    2      False   0            3                 9                False
  Block 2: 0101011010101010, 1010001010111010, 0101011010010101, 0101110100101000, 1101000101101001, 0101011010100111, 0101110100010110, 1011101000101010
  Other flash pages shown: 1010010111010101, 0101001010111011, 1010101101001010
Block Info Table
  Steps: Move Valid Pages, Erase
  Update: Map, Valid Pg Counts, etc.
    Block  Erased  Erase Count  Valid Page Count  Sequence Number  Bad Block Indicator
    0      False   3            13                5                False
    1      False   7            1                 12               False
    2      False   0            3 -> 0            9                False
  Block 2: 0101011010101010, 1010001010111010, 0101011010010101, 0101110100101000, 1101000101101001, 0101011010100111, 0101110100010110, 1011101000101010
  Other flash pages shown: 1010010111010101, 0101001010111011, 1010101101001010, 1010001010111010, 1101000101101001, 0101011010100111
Block Info Table
  Steps: Move Valid Pages, Erase
  Update: Map, Valid Pg Counts, etc.
    Block  Erased         Erase Count  Valid Page Count  Sequence Number  Bad Block Indicator
    0      False          3            13                5                False
    1      False          7            1                 12               False
    2      False -> True  0 -> 1       0                 -                False
  Other flash pages shown: 1010010111010101, 0101001010111011, 1010101101001010, 1010001010111010, 1101000101101001, 0101011010100111
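The move-then-erase sequence in these slides can be sketched as follows. This is a simplified model under stated assumptions: each programmed page remembers which LBA it was written for, an erased page is represented by `None`, the destination block has enough erased pages, and the names `valid_page_count` and `garbage_collect` are hypothetical.

```python
def valid_page_count(pages, lba_map, block):
    """Count pages in a block that the map still points at."""
    return sum(1 for page, entry in enumerate(pages[block])
               if entry is not None and lba_map.get(entry[0]) == (block, page))


def garbage_collect(pages, erase_counts, lba_map, victim, dest):
    """Copy the victim block's valid pages into dest, then erase the victim."""
    dest_page = pages[dest].index(None)       # first erased page in dest
    for page, entry in enumerate(pages[victim]):
        if entry is None:
            continue                          # never programmed
        lba, data = entry
        if lba_map.get(lba) != (victim, page):
            continue                          # stale copy, nothing to move
        pages[dest][dest_page] = (lba, data)  # move the valid page
        lba_map[lba] = (dest, dest_page)      # update the map
        dest_page += 1
    pages[victim] = [None] * len(pages[victim])   # erase the whole block
    erase_counts[victim] += 1


# Block 0 holds a stale copy of LBA 0 plus three valid pages; block 1 is erased.
pages = [[(0, "a"), (1, "b"), (0, "c"), (2, "d")], [None] * 4]
lba_map = {0: (0, 2), 1: (0, 1), 2: (0, 3)}
erase_counts = [0, 0]

valid_before = valid_page_count(pages, lba_map, 0)
garbage_collect(pages, erase_counts, lba_map, victim=0, dest=1)
print(valid_before, lba_map, erase_counts)
```

After the call, the victim's valid page count is 0, its erase count has gone up by one, and every map entry points into the destination block, mirroring the table updates shown on the slides.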