Sensor Data Processing and Neuro-inspired Computing


Prof. Dan Hammerstrom, ECE Department, Portland State University
Former DARPA Program Manager, MTO
Maseeh College of Engineering and Computer Science

In this talk I will describe two related programs that I started and ran at DARPA. Both are aimed primarily at high-performance sensor data processing for embedded platforms, and both are examples of specialized, unconventional computing.
UPSIDE: known algorithms; the focus is creating ultra-low-power computation for processing large quantities of image data.
Cortical Processor: new algorithms; the focus is on capturing and performing inference over complex sensor data, primarily image data, on portable, low-power platforms.
Hammerstrom, 9/14/16

To Set the Stage
Source: NRC, The Future of Computing Performance: Game Over or Next Level?
Moore's law continues: we're getting more transistors with each geometry shrink. But Dennard scaling stopped: voltage decreases have stalled even as feature sizes shrink. Clock rates would have to decrease in order to hold power constant. Hardware offers lots more concurrency, but software in general can't use it all.
For power- or energy-constrained DoD embedded systems, greater power efficiency is the only path forward.
Approved for Public Release, Distribution Unlimited

The Unconventional Processing of Signals for Intelligent Data Exploitation (UPSIDE) Program
Large coordinated multi-disciplinary teams with multiple subs. All teams are using various levels of neural inspiration.
Participants: BAE Systems, University of Massachusetts, Johns Hopkins, UCSB, Stony Brook University, SEMATECH, HRL Laboratories, LLC, Purdue University, University of Notre Dame, University of Pittsburgh, Intel Corp., NIST, University of Michigan, Portland State University, New Mexico Consortium, Los Alamos National Lab, The University of Tennessee, Oak Ridge National Laboratory, Stanford University.

UPSIDE: Performance Goals
[Chart: Giga-Features*/watt (log scale, 0.000001 to 100000) versus year (2000 to 2020). Series: COTS efficiency ceiling; UPSIDE mixed-signal CMOS; UPSIDE emerging device (~100 Ops/sec consuming µwatts).]
UPSIDE goals: 3 orders of magnitude in throughput, 4 orders of magnitude in power efficiency, no loss in accuracy.

Feature Extraction in Image Processing Pipeline
We'll focus on the simple filtering at the first layer.
Simple filtering with kernels: simple filtering involves matching low-level simple cells to image segments. [Figure: typical simple-cell feature kernels.]
Example: Gabor filter. A Gabor filter is often used to find edges or textures in an image. It is parameterized by position, frequency, orientation and scale.
FOUO
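To make the kernel idea concrete, here is a minimal sketch of a real-valued Gabor kernel using the standard textbook form (a Gaussian envelope multiplied by a cosine carrier). The function names `gabor_kernel` and `response`, and the parameters `wavelength`, `sigma`, `gamma` and `psi`, are illustrative assumptions; the slides do not give the exact parameterization used in UPSIDE.

```python
import math

def gabor_kernel(size=7, wavelength=4.0, theta=0.0, sigma=2.0, gamma=0.5, psi=0.0):
    """Return a size x size real Gabor kernel as a list of rows.

    wavelength sets the frequency, theta the orientation, sigma the scale.
    All parameter names are illustrative, not taken from the slides.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate coordinates so the carrier runs along direction theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            carrier = math.cos(2 * math.pi * xr / wavelength + psi)
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel

def response(patch, kernel):
    """Dot product of an image patch with a kernel (one filter 'score')."""
    return sum(p * k for prow, krow in zip(patch, kernel) for p, k in zip(prow, krow))

kernel = gabor_kernel(size=7, wavelength=4.0, theta=0.0)
self_response = response(kernel, kernel)  # a patch identical to the kernel scores high
```

A bank of such kernels at different orientations and frequencies would play the role of the "feature kernels" matched against each receptive field.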

Computation Load for Simple Feature Matching
For each pixel, convolve each feature kernel with the pixels surrounding the pixel being processed, and choose the kernel that had the highest score.
This is a pattern recognition operation: finding the kernel that most closely matches the image in the pixels in the receptive field (RF) around the pixel being processed. Such a pattern match is really a kind of inference, where we are finding the kernel that is the most likely description of the image in the RF of the pixel. And this is just for static images.
Example: 1G-pixel images @ 24 fps, 7x7 convolutions with 24 different kernels, each convolution requiring 50 instructions per kernel.
Result: 30T operations/sec; 300W power consumption (100 Gops/W GPU).
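The slide's result can be reproduced with a few lines of arithmetic. The sketch below only multiplies the figures quoted on the slide; the variable names are illustrative. The exact products are 28.8T operations/sec and 288 W, which the slide rounds to 30T and 300W.

```python
# Figures quoted on the slide.
pixels_per_frame = 10 ** 9          # 1G-pixel image
frames_per_second = 24
kernels_per_pixel = 24
instructions_per_kernel = 50        # cost of one 7x7 convolution, per the slide
gpu_ops_per_second_per_watt = 100 * 10 ** 9   # 100 Gops/W GPU

ops_per_second = (pixels_per_frame * frames_per_second
                  * kernels_per_pixel * instructions_per_kernel)
power_watts = ops_per_second / gpu_ops_per_second_per_watt

print(f"{ops_per_second / 1e12:.1f} T operations/sec, {power_watts:.0f} W")
```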

Solution: Non-Digital, Probabilistic Computing Reduces Hierarchy
General-purpose computing hierarchy (large efficiency cost): programs, software layers (OS, drivers, ...), multiple cores, processor, functional units, Boolean logic gates, transistors.
GP digital architectures are not well matched to feature extraction from sensor images. Digital algorithms are created to search for image structures based on existing digital number-crunching architectures. Digital abstractions limit data analysis: 100s to 1,000s of digital operations per image activity, with energy wasted in excess operations, data movement and precision.
We need new computing approaches matched to image processing. Use the physics of new emerging devices to extract features. Data naturally represented in sparse form are more suitable to devices and efficient for data transfer.
UPSIDE reduced hierarchy (efficiency gain): image data, computational model, new devices with analog behavior. Computing directly with devices eliminates multiple layers of hierarchy and inefficiency.

Inference
One way to think about inference is as a decoder.
[Diagram: original message y -> encoder -> encoded message x -> noisy channel (noise) -> received message x' -> decoder -> decoded message.]

$$p(y \mid x') = \frac{p(x' \mid y)\,p(y)}{p(x')} = \frac{p(x' \mid y)\,p(y)}{\sum_{x} p(x' \mid x)\,p(x)}$$

We need to infer the most likely original message given what we received and our knowledge of the statistics of channel errors and the messages being generated.
The inference problem: choose the most likely y, using P[y | x'].
By using an inference abstraction, we can define a wide range of operations on images as inference operations, hence the UPSIDE Inference Module (IM).
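The decoder view of inference can be sketched in a few lines. The example below assumes a binary symmetric channel (each bit flips independently with probability `flip_prob`) and a tiny hypothetical codebook; neither is specified in the talk. It applies Bayes' rule, p(y | x') proportional to p(x' | y) p(y), and picks the most likely original message.

```python
def likelihood(received, codeword, flip_prob):
    """p(x' | y) for a binary symmetric channel: independent bit flips."""
    p = 1.0
    for r, c in zip(received, codeword):
        p *= flip_prob if r != c else 1.0 - flip_prob
    return p

def posterior(received, codebook, prior, flip_prob):
    """Return p(y | x') for every message y in the codebook."""
    joint = {y: likelihood(received, x, flip_prob) * prior[y]
             for y, x in codebook.items()}
    evidence = sum(joint.values())          # p(x'), the normalizing constant
    return {y: j / evidence for y, j in joint.items()}

def decode(received, codebook, prior, flip_prob):
    """Choose the most likely original message."""
    post = posterior(received, codebook, prior, flip_prob)
    return max(post, key=post.get)

# Hypothetical codebook and uniform prior, for illustration only.
codebook = {"A": "00000", "B": "11111", "C": "00111"}
prior = {"A": 1 / 3, "B": 1 / 3, "C": 1 / 3}
received = "11011"                          # "B" with one bit flipped
post = posterior(received, codebook, prior, flip_prob=0.1)
best = decode(received, codebook, prior, flip_prob=0.1)
```

With a uniform prior and a symmetric channel, the most probable message is also the one at the smallest Hamming distance, which is why best-match association can act as inference.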

One Way to Do Inference: Best Match Association
There are two kinds of associative matching: exact and best.
1) Exact match association is used extensively in computer science. The stored word is addressed where the corresponding field matches the input field (there can be more than one). [Figure: input word to match, with the field to be matched (00110010) highlighted inside a stored word; output: field returned.]
2) Best-match association, the alternative kind of matching that UPSIDE uses, finds the closest match according to some metric. It compares to all possible matches simultaneously (computationally efficient). An example metric is Hamming distance (the number of bits which are different). We can be more sophisticated in selecting our metric and use probabilities to infer the correct answer.
Best match association is a form of inference and forms the basis for the UPSIDE Inference Module. This approximating behavior, in the right kind of metric space, can lead to powerful generalization: the inference abstraction.
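The difference between the two kinds of matching is easy to show in software. The sketch below is a sequential illustration with made-up stored words; the hardware described in the talk compares all candidates simultaneously rather than looping.

```python
def hamming_distance(a, b):
    """Number of bit positions in which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("bit strings must have equal length")
    return sum(x != y for x, y in zip(a, b))

def exact_match(query, stored):
    """Indices of stored words identical to the query (may be empty or many)."""
    return [i for i, word in enumerate(stored) if word == query]

def best_match(query, stored):
    """Index of the stored word closest to the query in Hamming distance."""
    return min(range(len(stored)), key=lambda i: hamming_distance(query, stored[i]))

# Illustrative stored words; the query is the first word with one bit flipped.
stored = ["00110010", "11110000", "01010101"]
query = "00110110"

exact = exact_match(query, stored)   # no exact hit for a corrupted query
best = best_match(query, stored)     # still recovers the nearest stored word
```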

Steps to a Distributed Best Match Implementation: Attractor Memory Model
Distributed best match association is implemented by a number of recursively interconnected, non-linear processing elements. Collectively these elements constitute the network state, and an energy metric can be computed:

$$E = -\frac{1}{2} \sum_{i,j} w(i,j)\, y(i)\, y(j)$$

Constraints are soft and can be represented as an energy field, which the network tries to minimize. By setting the connection strengths appropriately, the training vectors (our kernels) become local minima in the energy space. All bits and kernels are compared simultaneously in one energy-minimizing operation.
[Figure labels: energy, attractor basin, attractor state.]
The best match is to the kernel that is closest given our metric; it is also, under certain conditions, the most likely probabilistically (which is inference). This is called an attractor memory model.
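A classic software instance of an attractor memory is a Hopfield-style network, sketched below under stated assumptions: bipolar (+1/-1) units, Hebbian outer-product weights, and asynchronous sign updates. The talk does not say which learning rule or update scheme UPSIDE used, so treat these choices and the function names as illustrative. The `energy` function is the energy metric from the slide.

```python
def train(patterns):
    """Hebbian outer-product weights with zero diagonal (an assumed rule)."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w

def energy(w, y):
    """E = -1/2 * sum over i, j of w(i, j) * y(i) * y(j)."""
    n = len(y)
    return -0.5 * sum(w[i][j] * y[i] * y[j] for i in range(n) for j in range(n))

def recall(w, probe, max_sweeps=10):
    """Asynchronously update units until the state stops changing."""
    y = list(probe)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(y)):
            field = sum(w[i][j] * y[j] for j in range(len(y)))
            new_value = 1 if field >= 0 else -1
            if new_value != y[i]:
                y[i] = new_value
                changed = True
        if not changed:
            break
    return y

# Two orthogonal stored patterns standing in for feature kernels.
p1 = [1, 1, 1, 1, -1, -1, -1, -1]
p2 = [1, -1, 1, -1, 1, -1, 1, -1]
w = train([p1, p2])

probe = [-1, 1, 1, 1, -1, -1, -1, -1]   # p1 with its first bit flipped
recalled = recall(w, probe)
```

Starting from the corrupted probe, the state slides downhill in energy and settles in the attractor of the nearest stored pattern, which is the best match.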

The Result: A New Computational Model and Implementation
New paradigm: non-Boolean, probabilistic computing.
1. Computing occurs by the physics of the devices (highly parallel).
2. Devices perform the computational equivalent of hundreds of discrete digital operations.
3. The model can be configured into hierarchies that accomplish most of the computational work required by the application.
Example: find features in sensor data (7x7 Gabor edge finding, 10 giga-pixel array). [Figure: sensor data with active edges located (in red).]
Boolean computation. Processor: Intel 6-core i7. GOPS: 6.7 (1 inference is 140 operations/kernel; 24 kernels are compared per pixel). GOPS/watt: 0.1. Compute time: 7,700 sec. Energy: 460 kilojoules (60 watts for 7,700 seconds).
Analog direct device computation. Processor: 10 x 10 array of coupled oscillators. Giga-inferences/sec: 400 (56k GOPS equivalent). Compute time: 0.04 sec. Energy: 430 millijoules.
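Several of the slide's figures follow directly from one another, as the sketch below checks. It uses only numbers quoted on the slide (variable names are illustrative) and makes no attempt to re-derive the two compute times, which depend on details the slide does not give.

```python
# Boolean (digital) baseline, as quoted on the slide.
digital_gops = 6.7
digital_power_w = 60
digital_time_s = 7700

# Analog direct-device computation, as quoted on the slide.
analog_giga_inferences_per_s = 400
ops_per_inference = 140
analog_time_s = 0.04
analog_energy_j = 0.430

digital_energy_j = digital_power_w * digital_time_s                 # 462,000 J, "460 kilojoules"
digital_gops_per_watt = digital_gops / digital_power_w              # about 0.11, "0.1"
analog_gops_equivalent = analog_giga_inferences_per_s * ops_per_inference  # 56,000, "56k"

speedup = digital_time_s / analog_time_s
energy_ratio = digital_energy_j / analog_energy_j

print(f"speedup ~{speedup:,.0f}x, energy reduction ~{energy_ratio:,.0f}x")
```

On these figures the analog array is faster by roughly five orders of magnitude and uses roughly six orders of magnitude less energy.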

UPSIDE: Unconventional Processing of Signals for Data Exploitation
DARPA Insight #1: Exploit the physics of emerging devices and mixed-signal CMOS to perform extremely fast, low-power computation. The approach is being implemented in MS CMOS for near-term gains. Origins in the Intel Non-Boolean Computing Project.
Front-end filtering (edge detection): an image feature from the CCD array (3x3 pixels) is mapped into coupled oscillators. The oscillators relax to the lowest energy state, E_min = E_x. The final energy is compared against a library of possible features (E_1, E_2, E_3, E_4); best match: E_x = E_3. Step and repeat to identify all edges (in red). Final result: filtered image. [Image source: DARPA Neovision2 Stanford Tower video.]
UPSIDE eliminates computationally intensive digital CMOS dot-product multiplication.
DARPA Insight #2: The computational method can be applied universally to almost every computing function in the front end of the image processing pipeline: object detection, object saliency/tracking, object classification (dismount, cars). [Image source: BAE Systems ARGUS-IS.]
Reduce the ISR computational power budget from kW to W, while increasing speed >100x.

UPSIDE Program Tasks
Image processing pipeline: an application driver. Recreate the traditional image processing pipeline (IPP) using hierarchies of Inference Modules (IM).
[Pipeline diagram: sensor data -> sensor/filter (filtering) -> intermediate feature extraction (the most compute intensive) -> complex feature extraction -> back-end classification (object classification); maximum entropy data reduction.]
UPSIDE Task 1: IM development and IPP implementation.
UPSIDE Task 2: Mixed-signal CMOS implementation of the computational model and system test bed. Design and fabricate a mixed-signal representation of the Inference Module in state-of-the-art mixed-signal CMOS. Implement a test bed system using mixed-signal CMOS. Validate against IPP simulation. [Circuit figure: VCO_P, VCO_N, Out; Tadashi Shibata, Intel Non-Boolean Project.]
UPSIDE Task 3: Image processing demonstration combining next-generation devices with the new computation model. Simulate the mapping of the Inference Module to specialized devices. Determine systems-level performance-price. Show simple circuit operation. [Image of CoFe2O4 epitaxial nanopillars fabricated on MgO, courtesy of R. Comes, J. Lu, and S. Wolf, University of Virginia.]

UPSIDE ARGUS-IS Image Processing Pipeline: 40GP/s, 5W
1000x more power efficient, 100x faster.
Parameters: non-volatile memory. Gate-coupled VMM and source-coupled VMM. NUC/Debayer (Andreas Andreou, JHU). Symmetric FGMOS device to be used in analog NVM circuits. 65 nm UCE includes computing with capacitor arrays. Modified commercial NVM memory technology. [Figure dimensions: 1.56 µm, 1.92 µm.]
"Training and operation of an integrated neuromorphic network based on metal-oxide memristors," M. Prezioso et al., Nature Letter, 7 May 2015, Vol. 521. Fabricated and tested on May 14. First demo of this kind: physical memristor x-bar implementation, 3x3 binary input images, 4 classes (X,I,C,V). [Micrograph of 12x12 memristor array.]

Accuracy of UPSIDE Method
UPSIDE objective: lower power with no loss in accuracy. The UPSIDE probabilistic computation approach yields image-processing accuracy equivalent to the high-precision conventional digital method.
[Figure: conventional IPP output versus UPSIDE IPP output.]

Counts are UPSIDE, with the conventional score in parentheses.

Class     True Pos        False Pos     False Neg
Car       1050 (1039)     63 (74)       2 (2)
Truck     1 (0)           12 (11)       1 (3)
Bus       0 (0)           43 (43)       1 (1)
Person    1342 (1210)     508 (623)     29 (48)
Cyclist   541 (643)       339 (278)     9 (18)

UPSIDE accuracy score = 0.677 (vs 0.675)

Google Images, 50 unique object-centered images per class. Counts are UPSIDE*, with the conventional score in parentheses.

Class     True Pos     False Pos
Car       36 (31)      2 (7)
Truck     26 (27)      6 (5)
Bus       33 (34)      4 (3)
Person    38 (39)      1 (0)
Cyclist   36 (35)      5 (6)

UPSIDE accuracy score = 0.904 (vs 0.888)
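The slide does not define its accuracy score. As an observation only, pooled precision, TP / (TP + FP), reproduces both figures quoted for the Google Images results (0.904 and 0.888); the same formula does not reproduce the 0.677 and 0.675 quoted for the first set of counts, so the definition used there remains unknown. The sketch below shows the calculation, with illustrative names.

```python
# (true positives, false positives) per class, Google Images results from the slide.
upside = {"Car": (36, 2), "Truck": (26, 6), "Bus": (33, 4),
          "Person": (38, 1), "Cyclist": (36, 5)}
conventional = {"Car": (31, 7), "Truck": (27, 5), "Bus": (34, 3),
                "Person": (39, 0), "Cyclist": (35, 6)}

def pooled_precision(counts):
    """TP / (TP + FP) summed over all classes (an assumed definition)."""
    true_pos = sum(tp for tp, _ in counts.values())
    false_pos = sum(fp for _, fp in counts.values())
    return true_pos / (true_pos + false_pos)

upside_score = pooled_precision(upside)              # 169 / 187
conventional_score = pooled_precision(conventional)  # 166 / 187
```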

Deep Learning Analog Chip
Deep learning chip architecture implemented with custom analog elements. Floating-gate analog memory for non-Boolean, probabilistic pattern matching, performing on-chip, real-time training. The approach enables highly efficient computation for object recognition, classification and tracking.
[Chip micrograph: 0.9 mm x 0.4 mm, 7 nodes (layers L1, L2, L3), with DPU, RAC and TC blocks. University of Tennessee press release and news articles about the chip and the DARPA UPSIDE program.]
Performance and efficiency: accuracy comparable to software, with 282x lower training energy than a synthesized custom digital equivalent.
Training efficiency: digital design 1.7 GOPS/W; UPSIDE chip 480 GOPS/W (282x improvement).
[Chart: recognition accuracy (0 to 1) versus percentage corruption (0 to 50), measured and baseline.]
[Block diagram: input pattern (raw data) -> analog DeSTIN engine (rich features) -> NN classifier -> classification result.]
J. Lu, S. Young, I. Arel, J. Holleman, "A 1TOPS/W Analog Deep Machine-Learning Engine with Floating-Gate Storage in 0.13um CMOS," IEEE Journal of Solid-State Circuits, Vol. 50, Issue 1, pp. 270-281, Jan. 2015.

Sensor Processing Pipeline: A Data Analysis Crisis
The front end: UPSIDE program. The back end: Cortical Processor.
[Pipeline diagram: sensor output -> front-end signal processing -> feature extraction -> higher-order feature extraction -> association -> inference -> decision making -> actionable data / motor control.]
Drowning in data, starved for knowledge. Sensor data bandwidth is exceeding processing capabilities, particularly for embedded systems. Minimal contextual information.
Data become more knowledge- and context-intensive, containing both spatial and temporal information, as they move through the pipeline.
Current computational approaches do not adequately represent complex spatial and temporal data, limiting the ability to effectively perform complex recognition for important DoD tasks like anomaly detection and scenario prediction.

The Back-End Toolbox
Still mostly empty after all these years.

Intelligent computing has been highly dependent on the Moore's law bounty. Although there have been algorithm improvements, the 4-5 orders of magnitude increases in processor speed and memory capacity over the last 45 years account for most of the ability of computers to operate in the real world.
We have not, as yet, been able to implement truly intelligent computing.
Yet we still have significant problems: drowning in sensor data; complex weapon systems that have exceeded our ability to write error-free software; cybersecurity as a major system liability.
And we can no longer rely on Moore's Law to save us.
Neural-inspired computing is still the best (perhaps only) answer, but its near-term future is far from obvious. We are starting to see a few successful applications built primarily from neural components, namely Deep Learning algorithms running on GPUs.

DARPA Cortical Processor Program
Machine learning has made significant advances in data recognition over the last decade. Industry is pushing towards data analytics using large-scale networks, primarily leveraging off-line training.
Current techniques require very large quantities of mostly labeled training data. Limited capture of temporal information. Power and performance are unconstrained: pre-train in the cloud. Any new data requires retraining.
Other applications, such as many in the DoD, have different requirements. They need to respond quickly to smaller, less comprehensive and unlabeled training data. Autonomous platforms with constrained SWaP (Space, Weight and Power). Training in real time on stand-alone platforms with only intermittent network access. Leverage context more fully.
The Cortical Processor is a Machine Learning program whose goal is to take Machine Learning to a new level by leveraging biologically inspired techniques while focusing on complex applications.

A Brief History of Machine Learning
[Timeline, 1960 to 2010, of dominant approaches: neural networks 1st wave (Perceptron); neural networks 2nd wave (multilayer neural networks, shallow); neural networks 3rd wave (multilayer neural networks, deep), with general-purpose GPU (GP-GPU). Markers: MNIST, CalTech 101, ImageNet, DARPA Deep Learning Program.]
2008/9: DARPA identifies and invests in the potential of deeply-layered neural networks, or Deep Learning machines, for rapidly analyzing sensory input and identifying salient or anomalous features and events.
2012: ImageNet win spurs new commercial interest in Machine Learning algorithms for image processing.

Image Classification Today vs. Recent Past
ImageNet (www.image-net.org): the most challenging annual competition for object class identification algorithms.
Goal: identify object classes in still images. Training: 1.2M labeled images depicting 1,000 object categories. It is harder to determine where the object is located in the image.
[Figure: Deep Learning object classifications, showing correct label probability and incorrect label probability.]
Highly tuned deep learning networks now achieve human-level accuracy in constrained image classification.

Challenges for Capturing Context
Traditional: certain unrecognizable images are misclassified. Traditional classifiers process an image by taking in the entire image at once. In that approach it is difficult to separate out context and easy to make such mistakes.
Bio-inspired: biological systems have attentional mechanisms that dynamically gate information flow. Top-down feedback forces high-level representations to be consistent with low-level details. Biological vision saccades sequentially to salient (movement, color, high spatial frequency) parts of the image.
[Figure: original image; recognition by saccade. Image source: "Deep Neural Networks are Easily Fooled: High Confidence Predictions for Unrecognizable Images," A. Nguyen et al., CVPR 2015.]
Context requires understanding an object's relation to its environment.

Video Analysis Is a Logical Next Step After ImageNet
Current techniques lack an understanding of context due to their inability to separate out objects and determine their relationships to each other in natural video.
KTH constrained dataset example (KTH Royal Institute of Technology, Stockholm, Sweden): 80%-90% accuracy is possible on constrained datasets with homogeneous backgrounds.
Hollywood2: human actions in realistic settings (IRISA/INRIA Rennes, France): 50%-60% SOA accuracy on datasets with unconstrained or realistic backgrounds.
Current video action recognition algorithms cannot accurately identify complex actions and scenarios in real-world videos.

BUT bio-inspired details have to earn their way. [Cartoon: "Gratuitous Biological Detail".]
So, what do we mean by bio-inspired?

Computational Neuroscience 101
Lateral inhibition leads to sparse activation and connectivity, creating Sparse Distributed Representations (SDR). This results in a limited-distribution sparse activation which, in hardware, can be leveraged for significant efficiency. Combinatorics are in our favor, e.g. 1000 neurons, 10 active at a time: 2.6x10^23 possible representations. Only a small number of cells are required to recognize a pattern.
Rapid learning, typically one shot: imprint a sub-vector on a patch of dendritic tree. Hebb rule: neurons that fire together, wire together. One variation is called "one and a half shot" learning, where there is some adjustment of imprinted weights.
Synapses are only possible where axons and dendrites have some physical proximity, providing a wide range of random segments; again combinatorics works in our favor.
Learning is fundamentally unsupervised. Supervised, weakly supervised, and reinforcement learning are possible.
Weights and activations are typically low precision. The expense is in representing and emulating connectivity, not in the arithmetic.
Temporal information is fundamental to neuron construction; delays are ubiquitous in dendritic trees. Dendritic trees are active: pulse signals are amplified as they proceed to the soma.
Sequence memory (predicting forward in time) is ubiquitous: HTM/CLA, Numenta (Hawkins & Ahmad).
"Pyramidal neurons: dendritic structure and synaptic integration," Nelson Spruston, Nature Reviews Neuroscience 9, 206-221 (March 2008).
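The combinatorics claim is simply "1000 choose 10", and the effect of lateral inhibition can be approximated in software by a k-winners-take-all step. The sketch below checks the count and builds one sparse representation from random activations; `k_winners_take_all` and `overlap` are illustrative helpers, not functions from HTM or any other library.

```python
import math
import random

num_neurons = 1000
num_active = 10

# Number of distinct ways to choose which 10 of 1000 neurons are active.
num_representations = math.comb(num_neurons, num_active)   # about 2.6e23

def k_winners_take_all(activations, k):
    """Keep the k most active units: a crude stand-in for lateral inhibition."""
    ranked = sorted(range(len(activations)), key=lambda i: activations[i], reverse=True)
    return set(ranked[:k])

def overlap(sdr_a, sdr_b):
    """Number of active units two sparse representations share."""
    return len(sdr_a & sdr_b)

rng = random.Random(0)
activations = [rng.random() for _ in range(num_neurons)]
sdr = k_winners_take_all(activations, num_active)
strongest = max(range(num_neurons), key=lambda i: activations[i])

print(f"{num_representations:.2e} possible representations")
```

Because two random SDRs of this size almost never share many active units, a small overlap threshold is enough to recognize a pattern, which is the sense in which "only a small number of cells are required".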

Computational Neuroscience 102
Many models are spiking, which is very favorable for hardware implementations (IBM TrueNorth).
Feedback as well as feed-forward pathways: hypothesis reinforcement; saliency (directing attention).
Spatial and temporal dilation ascending the hierarchy.
Hierarchical SDRs may allow the efficient capture of, and inference over, sparse graphs: the ability to capture complex, high-level structure. IBM's Hierarchical Context Networks (Wilcke). Close approximation to Bayesian inference.
Cortical columns: tight intra- and local inter-column connectivity, with sparse longer-range connectivity, creates a natural modular structure with more efficient connectivity utilization.
Systems built from more specialized cortical areas are now starting to appear (Eliasmith): Spaun, http://www.extremetech.com/extreme/141926-spaun-the-most-realistic-artificial-human-brain-yet
Homeostasis: the goal is for average activity; inactive neurons and synapses continuously reduce their threshold to ensure uniform activity. This keeps all neurons and synapses in the game and actively learning.
[Figure: cell-type-specific 3D reconstruction of five neighboring barrel columns in rat vibrissal cortex. Credit: Marcel Oberlaender et al., Cerebral Cortex, October 2012; 22:2375-2391.]
The Cortical Column: http://www.metz.supelec.fr/metz/recherche/ersidp/projects/cortical/root.html
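The homeostasis idea can be illustrated with a toy threshold-adaptation rule. This is a sketch under assumptions of my own, not a model from the talk: each unit receives a random drive scaled by a fixed gain, a silent unit lowers its threshold slightly, and a unit that fires raises it, so every unit drifts toward the same `target_rate`. The slide only mentions lowering the threshold of inactive units; the raising step is added here so that the rule has a stable equilibrium.

```python
import random

def simulate_homeostasis(num_neurons=20, steps=3000, target_rate=0.1,
                         step_size=0.01, seed=0):
    """Return per-neuron firing rates (second half of the run) and thresholds."""
    rng = random.Random(seed)
    # Different input gains: without adaptation, activity would be very uneven.
    gains = [0.5 + 1.5 * i / (num_neurons - 1) for i in range(num_neurons)]
    thresholds = [1.0] * num_neurons
    fired_counts = [0] * num_neurons
    measure_from = steps // 2

    for t in range(steps):
        for i in range(num_neurons):
            drive = gains[i] * rng.random()
            fired = drive > thresholds[i]
            # Silent units lower their threshold; firing units raise it.
            thresholds[i] += step_size * ((1.0 if fired else 0.0) - target_rate)
            if t >= measure_from and fired:
                fired_counts[i] += 1

    rates = [count / (steps - measure_from) for count in fired_counts]
    return rates, thresholds

rates, thresholds = simulate_homeostasis()
```

After adaptation the weakly driven units end up with low thresholds and the strongly driven units with high ones, and all of them fire at roughly the target rate, which is the "keeps all neurons in the game" behavior.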

Cortical Processor Study
The MTO Cortical Processor Study investigates systems that: eliminate the need for large training sets as a prerequisite to training; train in real time in an unsupervised or weakly supervised environment; recognize temporal as well as spatial patterns for recognition of action and anomalies; learn and perform inference over complex structure in data.
How: leverage elements of computational neuroscience. Spatial/temporal pattern recognition. One-shot learning and network re-use. Efficient performance: sparsity and lower precision reduce HW requirements.
Preliminary results: powerful spiking models; fast real-time learning; sparse distributed data representations; model-free adaptive control.
The study consists of 12 performers and runs from Q2 2015 to Q2 2016.

Context in Image Understanding
One of the most important potential capabilities of hierarchical sparse models is the ability to capture the high-order structure that exists in real-world data.
"Moving from the back of the brain to the front" (Gill Pratt).
Object detection: we need algorithms that have the potential to capture complex, subtle structure in the data and to recognize parts and their relationships.

Leverage Structure in Data to Determine Context

Sensor Fusion: Large-Scale Leverage of Context
Potential solutions to the most challenging DoD sensor fusion problems. Fusion involves the correlation of high-level structure in the various sensor streams.
Sensor data fusion is difficult for most algorithms and approaches today. It creates the capability of using learning to model and control complex systems, and helps manage signal and system complexity by automating higher-order relationships.
Sensor applications: surveillance imaging, scenario awareness, complex structure (tracking a convoy of vehicles over time).
Rinkus, Neurithmic

Algorithm/Hardware Co-design Leads to Efficiency Gains
Conventional processors are easily accessible, but are inefficient at executing cortical algorithms: constrained processor-memory partition, limited leverage of network parallelism; high precision is unnecessary; poor power utilization.
[Figure: conventional hierarchy of memory, L3 cache, L2 cache, L1 cache, GPU/CPU; 64-bit word versus 8-bit weight.]
Efficiency gained from algorithm/hardware co-design:
1) Sparse activity + sparse connectivity + virtual neurons = increased efficiency
2) Local memory removes communication bottlenecks
3) Low-precision operations = very low power consumption, fast
4) Fine-grain parallelism
[Figure: modular tiled architecture of processing elements PE 1 ... PE m. Legend: processing element, memory, arithmetic/logic, inter-communication.]
Inference (no training): 100x more power efficient. Training: 10000x more power efficient.

Example Hardware Options
Good: COTS (FPGA and GPU).
Better: available hardware (non-COTS).
University of Manchester SpiNNaker: ARM nodes, custom spike routing hardware, 130nm, 16 ARM core ASIC; as part of HBP they will move to a newer process.
Various Deep Learning and sparse processors, e.g., Eyeriss (MIT).
Micron AP (Automata Processor): a regular expression processor; the jury is still out concerning how efficient it is on cortical algorithms (UVA AP Center).
TrueNorth: 1M neurons, 256M connections, 28nm; not optimized to our algorithms and connectivity structure, and does not implement learning, but an enhanced learning version may be possible.
Best: new architectures. Paul Franzon, NCSU: Cortical Processor, HTM architecture requirements. Others.

Golden Gate Chip and TrueNorth Chip
Golden Gate Chip
  Proof-of-concept of digital neurosynaptic system at worm-scale
  256 neurons, 256k synapses
TrueNorth Chip
  Neurosynaptic system of unprecedented size, bee-scale on a single chip
  1 million neurons, 256 million synapses, 70mW
4096x more neurons, 15x smaller area/core, 100x lower power/core
Chip does not learn
Integration of memory with computation, combined with event-driven design (Scientific American 2011)
Parallel, distributed, event-driven, non-von Neumann, modular, scalable, fault-tolerant architecture. At 5.4 billion transistors, the largest IBM chip to date. (Science 2014)
Distribution authorized to U.S. Government agencies only (Critical Technology) 30 April 2015. Other requests for this document shall be referred to DARPA Microsystems Technology Office
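The scaling figures on this slide can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted on the slide. Reading "256k" and "256 million" synapses as binary-prefixed counts (256 x 1024 and 256 x 1024 x 1024) is an assumption, and the function names are hypothetical.

```python
# Figures quoted on the slide.
GOLDEN_GATE_NEURONS = 256
NEURON_FACTOR = 4096

# Assumption: the slide's "256k" and "256 million" are binary-prefixed counts.
GOLDEN_GATE_SYNAPSES = 256 * 1024
TRUENORTH_SYNAPSES = 256 * 1024 * 1024


def scaled_neurons(base=GOLDEN_GATE_NEURONS, factor=NEURON_FACTOR):
    """Neuron count implied by applying the quoted factor to the smaller chip."""
    return base * factor


def synapse_factor(small=GOLDEN_GATE_SYNAPSES, large=TRUENORTH_SYNAPSES):
    """Ratio of synapse counts under the binary-prefix assumption."""
    return large // small


if __name__ == "__main__":
    print(scaled_neurons())   # 1048576
    print(synapse_factor())   # 1024
```

Under these assumptions, 256 neurons times 4096 gives 1,048,576, which matches the slide's "1 million neurons". The synapse count grows by a smaller factor than the neuron count.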

9/14/16

A Tool to Address System Complexity?
Can learning be leveraged as an efficient system construction alternative, for system control as well as data processing?
  Expose the agent to reality rather than trying to approximate it through programmed equations
  Learn complex and subtle relationships in the data and perform inference over those structures
  Rich models allow more robust anomaly detection
  Continue learning and adaptation in situ
DoD Sensing (Global Hawk): A single Global Hawk requires 500 Mbps, 5x the total SatCom bandwidth that the entire U.S. military used during the Gulf War
Big Data: Global Data Center Traffic Projection (2012-2017 Cisco Global Cloud Index)
Software System Codebases (lines of code; chart axis marked at 1, 10, 50, and 100 million lines)
  Average iPhone App (40,000)
  Space Shuttle (400,000)
  F22 Raptor Fighter (2M)
  Hubble Space Telescope (2M)
  US Military Drone Control Software (4M)
  F35 Fighter 2013 (24M)
  Facebook (62M)
  Army Future Combat System, Aborted (63M)
  2014 Car Software, Modern High End (100M)

Related Efforts
The European Human Brain Project: BrainScaleS (Heidelberg University)
  Wafer-scale neurocomputers
  Originally intended to accelerate neuroscience, now starting to look at real applications
White House OSTP: A Nanotechnology-Inspired Grand Challenge for Future Computing
  Create a new type of computer that can proactively interpret and learn from data, solve unfamiliar problems using what it has learned, and operate with the energy efficiency of the human brain.
IARPA MICrONS (Machine Intelligence from Cortical Networks)
  Revolutionize machine learning by reverse-engineering the algorithms of the brain
Chinese Brain Inspired Computing Research (CBICR) program
  Stan Williams (HP) trip report
IBM: Machine Intelligence
  Cognitive systems which learn continuously and without supervision, predict patterns and sequences (i.e. the future) and act on it

Where To From Here?
Mammalian neocortex is reasonably uniform across sensory modalities for most mammals
It is as close to general purpose processing as nature gets
Build modular components based on the cortical algorithm
  Hybrid of UPSIDE, Deep Learning, and Cortical Processor as basic module
  With supervised, unsupervised, and reinforcement learning
  Off-line and on-line dynamic learning
Modules can be combined into larger, more complex configurations
Sub-systems and systems are then trained on the workloads rather than by hand-generated, error-prone, difficult-to-maintain microcode
  Some will be loaded with parameters trained off line
  Others will adapt on line, in place, to system workloads and system changes
Applications will include a range of sensor data processing tasks, as well as system and robotic control
Hammerstrom, Maseeh College of Engineering and Computer Science

Biological Inspiration, not Duplication
Conceptually one can think of computational intelligence as a spectrum
Among other things we need to do small c cognition before we move on to Big C cognition
And, while cognition remains our goal, I believe that it is possible to build very useful systems with the technology we have now
We have to build this field one brick at a time, moving incrementally to the right
[Figure: spectrum along an axis of Increasing Intelligence, running from "A Rock" to "The Krell". Labels: "Computing is currently here"; "Our Goal: Intelligent Signal Processing (ISP)"; "Small c cognition: most mammals"; "Big C Cognition: humans"; two "A long way!" annotations]