Data flow architecture for high-speed optical processors

Kipp A. Bauchert and Steven A. Serati
Boulder Nonlinear Systems, Inc., Boulder CO 80301

1. Abstract

For optical processor applications outside of laboratory experiments, it is desirable to streamline the data flow in order to obtain the highest possible throughput from the system. This paper presents the data flow architectures for two optical processors designed and built by Boulder Nonlinear Systems, as well as the processor designs and some experimental data.

Keywords: spatial light modulators, optical correlators, liquid crystal displays, manufacturing inspection, optical processing, multispectral analysis, security monitoring, machine vision, data flow

2. Introduction

Optical and digital processors are commonly viewed as two completely different approaches to processing data. This paper shows that under certain conditions these two types of processors share a common basis. The principles of a data flow processing architecture are applied to an optical processor. A data flow architecture attempts to improve throughput by reducing the control signals to a minimum, using processing elements that perform a predefined operation as soon as all of the data is presented. This concept is very similar to the highly parallel nature of an optical processor. A data flow architecture is commonly described through activity templates and program graphs. These same techniques are modified here and applied to an optical processor. Two optical processors that were designed and built by Boulder Nonlinear Systems are presented along with some experimental correlation data.

3. Data flow architecture

Data flow architectures were developed in the 1970s as a new scheme for improving the throughput of computers. The basic premise is that there are very few control signals and that processing is performed as soon as all of the necessary operands, or data, are present in the processor element [1]. The processor elements can be predefined or programmed on the fly to perform a single task such as addition, multiplication, or comparison. Tokens are associated with each piece of data to ensure that a new process is not started until all of the operands are present and the previous result has been passed to the next processing element. These processing elements can then be interconnected to perform complex programs at very high throughputs due to the minimal amount of overhead.

A simple program is represented in Figure 1 that calculates the value:

z = (x + y) * (x - y)

Each table is referred to as an activity template and contains a field for the operation to be carried out, one field for each of the operands, and one field for each of the results. Each result field contains the address of the input field of another activity template along with the result data.
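The firing rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary layout of a template, the names `make_template`, `run`, and `difference_of_squares`, and the use of a simple queue of ready templates are all hypothetical and are not taken from the paper.

```python
from collections import deque
import operator


def make_template(op, n_operands, destinations):
    """Hypothetical activity template: an operation, empty operand fields,
    and (template address, operand index) destinations for the result."""
    return {"op": op, "operands": [None] * n_operands, "dest": destinations}


def run(templates, initial_tokens):
    """Fire each template as soon as all of its operands are present."""
    ready = deque()   # addresses of templates whose operands are all present
    outputs = {}

    def deliver(address, slot, value):
        if address == "out":               # program output, not a template
            outputs[slot] = value
            return
        template = templates[address]
        template["operands"][slot] = value
        if all(v is not None for v in template["operands"]):
            ready.append(address)

    for address, slot, value in initial_tokens:
        deliver(address, slot, value)

    while ready:
        template = templates[ready.popleft()]
        result = template["op"](*template["operands"])
        template["operands"] = [None] * len(template["operands"])  # consume
        for address, slot in template["dest"]:
            deliver(address, slot, result)
    return outputs


def difference_of_squares(x, y):
    """The Figure 1 program: z = (x + y) * (x - y)."""
    templates = {
        "add": make_template(operator.add, 2, [("mul", 0)]),
        "sub": make_template(operator.sub, 2, [("mul", 1)]),
        "mul": make_template(operator.mul, 2, [("out", "z")]),
    }
    tokens = [("add", 0, x), ("add", 1, y), ("sub", 0, x), ("sub", 1, y)]
    return run(templates, tokens)["z"]


print(difference_of_squares(5, 3))
```

No template is told when to run; the multiply fires only because both of its operand fields have been filled by the add and subtract results.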

Figure 1 - Simple program representing a data flow architecture implemented with activity templates. [Diagram: Add, Subtract, and Multiply activity templates, each with Operand 1 and Operand 2 fields; inputs x and y, output z.]

A basic processing element is depicted in Figure 2. The activity store holds all of the necessary activity templates for the data flow program. Each activity template has a unique address that is entered into the instruction queue FIFO unit in the order of desired operation. The fetch unit receives these addresses from the instruction queue, fetches the appropriate activity template from the activity store, and formats the information into an operation packet for the operation unit. The operation unit then performs the defined operation according to the received packet, generates the appropriate result packet, and is then ready to receive the next operation packet. The update unit interprets the result packet and updates the input fields of all of the appropriate activity templates in the activity store. The update unit also tests for the availability of all operands for a given activity template and places the activity template address into the instruction queue as soon as all operands are available. This circular pipeline mechanism allows each unit to be constantly busy as long as the instruction queue is not empty.

Figure 2 - Basic processing element for a data flow architecture. [Diagram: Instruction Queue, Fetch, Operation Unit, Update, and Activity Store blocks, linked by operation packets and result packets.]

4. Optical processor

An optical processor is a highly parallel, data-dependent processor and can therefore be considered a special-purpose version of a data flow processor. The optical processor described here contains only three basic types of operations: a Fourier transform, a multiplication, and a modulus squared. The data flow program graph for this optical processor is depicted in Figure 3 and a conceptual drawing is shown in Figure 4.

Figure 3 - The data flow program graph for an optical processor. [Diagram: Operand A feeds a Fourier transform template; its result and Operand B are Operand 1 and Operand 2 of a multiplication template, followed by a second Fourier transform template and a modulus squared template that produces the result.]

Operand A is introduced into Plane A via a programmable Spatial Light Modulator (SLM), and Operand B is introduced into Plane B via another SLM. Diffraction propagation from the SLM in Plane A results in the first Fourier transform operation, depicted by the first activity template in Figure 3. The lens L1 scales and locates this Fourier transform plane to a more useful form. The second Fourier transform operation is created in a similar fashion with the SLM at Plane B and the lens L2. The multiplication of the Fourier transform of Plane A with Plane B occurs at Plane B prior to the second Fourier transform. The final modulus squared is a result of the square-law intensity detection with a CCD detector. Coherent optical processing is performed by using a laser as the coherent illumination source for the optical processor.

The optical processor performs the operations defined in the activity templates at the speed of light as soon as the operands are present. Hence, for a typical optical processor, two complex two-dimensional Fourier transforms and multiplications are performed in approximately 1 nanosecond, regardless of the number of pixels in the SLMs or detector. This optical processor can be utilized for several different calculations (see Table 1), depending solely on the input data for Operands A and B.

Figure 4 - Conceptual drawing of an optical processor. [Diagram: Plane A, lens L1, Plane B, lens L2.]
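The chain of operations in Figure 3 (Fourier transform, multiplication, Fourier transform, modulus squared) can be imitated numerically. The sketch below is a one-dimensional, discrete analogue written with only the Python standard library; it is not a model of the hardware, and the function names and sample data are assumptions made for illustration. Operand B is chosen as the conjugate spectrum of a reference signal, which corresponds to the cross-correlation case.

```python
import cmath


def dft(values):
    """Naive forward discrete Fourier transform (O(N^2), fine for a demo)."""
    n = len(values)
    return [
        sum(v * cmath.exp(-2j * cmath.pi * k * m / n) for m, v in enumerate(values))
        for k in range(n)
    ]


def optical_processor(operand_a, operand_b):
    """Fourier transform -> multiply -> Fourier transform -> modulus squared."""
    spectrum = dft(operand_a)                                # Plane A to Plane B
    product = [s * b for s, b in zip(spectrum, operand_b)]   # filter at Plane B
    field = dft(product)                                     # Plane B to detector
    return [abs(f) ** 2 for f in field]                      # square-law detection


def correlation_filter(reference):
    """Operand B for cross-correlation: conjugate of the reference spectrum."""
    return [h.conjugate() for h in dft(reference)]


reference = [1, 2, 3, 0, 0, 0, 0, 0]
scene = [0, 0, 1, 2, 3, 0, 0, 0]           # the reference shifted by 2 samples
intensity = optical_processor(scene, correlation_filter(reference))
peak = max(range(len(intensity)), key=intensity.__getitem__)
shift = (-peak) % len(scene)               # two forward transforms mirror the axis
print(peak, shift)
```

Because the second stage is a forward transform rather than an inverse one, the correlation peak appears at the mirrored position, which is why the shift is recovered as the negative of the peak index modulo the signal length.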

Table 1 - Some possible calculations for the optical processor defined in Figure 3.

Operand A   Operand B                                                         Calculation
g(x,y)      H*(f_x,f_y)                                                       Cross-correlation
g(x,y)      H(f_x,f_y)                                                        Convolution
g(x,y)      H(f_x+α,f_y+α) + H*(f_x-α,f_y-α) (Vander Lugt filter [2])         Convolution and cross-correlation separated by 2α
g(x,y)      G*(f_x,f_y)                                                       Auto-correlation
g(x,y)      A(f_x,f_y) amplitude mask                                         Spatially filtered version of g(x,y)

A data flow optical processing element is depicted in Figure 5. The data store holds all of the necessary input images and filters for the desired task. Each data template consists of an input image and filter combination and has a unique address that is entered into the instruction queue FIFO unit in the order of desired operation. The fetch unit receives these addresses from the instruction queue, fetches the appropriate data template from the data store, and formats the information into a data packet for the optical processor. The optical processor then generates the appropriate result data from the CCD detector at the correlation plane and is then ready to receive the next data packet.

An intelligent optical correlator system will have a more complex update unit than the example in Figure 2. The update unit must search every correlation image for valid correlation peaks. Then the update unit must either report these results or decide what additional data templates should be processed based on the correlation peak data. A brute force approach would simply run through a predefined set of input and filter combinations and report all of the results. The simplest approach to improving the throughput would be to utilize something on the order of a binary search tree for stepping through the data templates. The proper algorithm should be based on the type of data and the application scenario; no one algorithm will give the best results for every application.

Figure 5 - A data flow optical processing element. [Diagram: Instruction Queue, Fetch, Optical Processor, Update, and Data Store blocks, with input data entering and a report leaving the element.]
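One way to picture the update unit's two jobs, finding valid peaks and deciding which data templates to run next, is the sketch below. It is an assumption-laden illustration rather than the authors' algorithm: the local-maximum peak test, the tree of filters in which a parent filter responds whenever any of its children would, and every name in the code (`find_peaks`, `tree_search`, `fake_correlate`, the template labels) are hypothetical.

```python
def find_peaks(image, threshold):
    """Return (row, col, value) for pixels at or above threshold that are
    strictly greater than all of their neighbours."""
    peaks = []
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            value = image[r][c]
            if value < threshold:
                continue
            neighbours = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(value > n for n in neighbours):
                peaks.append((r, c, value))
    return peaks


def tree_search(node, correlate, threshold):
    """Descend a filter tree, visiting children only where a peak was found."""
    image = correlate(node["template"])
    if not find_peaks(image, threshold):
        return []                        # prune this whole branch
    if not node["children"]:
        return [node["template"]]        # leaf: report the matching template
    matches = []
    for child in node["children"]:
        matches.extend(tree_search(child, correlate, threshold))
    return matches


def fake_correlate(template):
    """Stand-in for the optical processor: a 3x3 'correlation image'."""
    responding = {"any-X", "small-X"}    # hypothetical templates that match
    centre = 9 if template in responding else 1
    return [[0, 0, 0], [0, centre, 0], [0, 0, 0]]


tree = {"template": "any-X", "children": [
    {"template": "big-X", "children": []},
    {"template": "small-X", "children": []},
]}
print(tree_search(tree, fake_correlate, threshold=5))
```

Compared with the brute force approach, a branch whose parent filter shows no peak is never sent to the processor, so the number of correlations grows with the depth of the tree rather than with the total number of templates.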

4.1. 128x128 analog optical correlator

The 128x128 analog optical correlator utilizes two Spatial Light Modulators (SLMs) for inputting the data (see Figure 6 and Figure 7). The output of the correlator is a Dalsa 128x128 high-speed CCD camera with a maximum throughput of 830 Hz. The 128x128 analog SLMs can modulate light in amplitude-only, real-axis-only, phase-only, or amplitude-phase-coupled modes [3]. The type of modulation is selected via the liquid crystal material and the input polarization state [4]. The analog nature of the device allows for many levels of modulation. Current drive electronics support 4-bit and 8-bit modulation depth while refreshing the SLM data at a rate of almost 9000 Hz and allowing useful frame rates of nearly 1500 Hz.

The theoretical full-frame load time of the VLSI chip is approximately 25 µsec, but it has only been tested to 102.4 µsec. This results in a tested continuous full-frame load rate of 9766 Hz, or equivalently 1.3 gigabits/sec. However, this does not include time for the Liquid Crystal (LC) to optically respond to the electric field, or for actual viewing time. For a Chiral Smectic LC (CSLC) device, the typical response time (10% to 90% modulation) will be approximately 50-150 µsec, which is mainly a function of electric field strength and temperature. Our current drive electronics support a load time of 113.8 µsec and response and view times as short as one load time. This time, coupled with an equivalent inverse image cycle for electrically balancing the LC, results in a useful frame time of 682.6 µsec, or a rate of 1464 Hz. For a Zero-Twist Nematic (ZTN) device, the rise time can be comparable to that of a CSLC, while the fall time limits the frame rate to approximately 500 Hz. Note that a nematic modulator responds to the amplitude of an AC field, unlike a CSLC device, which also responds to the polarity. Therefore, the true and inverse images necessary for electrical balancing result in a static image for the ZTN devices.

This correlator also utilizes a Dalsa 128x128 camera as the input device for the input SLM, loading new images into the SLM at a frame rate of 732 Hz with every other frame being inverted to maintain a balanced electrical field, resulting in a useful frame rate of 366 Hz. The filter SLM is driven from memory at 732 Hz, twice the useful rate of the input SLM, resulting in each filter being correlated with both the true and inverse input images. The output Dalsa camera is also driven at 732 Hz, with each frame being captured by a frame grabber card. The captured images are then transferred to host memory for postprocessing with a peak detection algorithm.

Figure 6 - Block diagram of 128x128 analog correlator. [Diagram: laser, Dalsa cameras, input SLM driver, filter SLM driver, postprocessing board, and computer.]
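The timing figures quoted for the 128x128 device can be checked with a little arithmetic. The sketch below is only a consistency check. Treating the useful frame as six load-time slots (load, response, and view for the true image, then the same three for the inverse image) is an interpretation of the text, and the function and variable names are illustrative.

```python
def load_rate_hz(load_time_us):
    """Full-frame load rate for a given load time in microseconds."""
    return 1e6 / load_time_us


def data_rate_gbps(pixels, bits_per_pixel, load_time_us):
    """Raw data rate in gigabits per second for one full-frame load."""
    return pixels * bits_per_pixel / (load_time_us * 1e-6) / 1e9


# Tested load time of 102.4 us at 8-bit depth.
tested_rate = load_rate_hz(102.4)                     # about 9766 Hz
tested_bits = data_rate_gbps(128 * 128, 8, 102.4)     # about 1.3 Gbit/s

# Assumption: six slots of one load time each (true and inverse cycles).
useful_frame_us = 6 * 113.8                           # close to the quoted 682.6 us
useful_rate = 1e6 / useful_frame_us                   # close to the quoted 1464 Hz

# Camera-fed input SLM: every other 732 Hz frame is the inverse image.
camera_fed_useful_hz = 732 / 2                        # 366 Hz

print(round(tested_rate), round(tested_bits, 2), int(useful_rate), camera_fed_useful_hz)
```

The small difference between 6 x 113.8 µsec and the quoted 682.6 µsec is consistent with the load time having been rounded for publication.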

Figure 7 - Photograph of 128x128 analog correlator.

4.2. 256x256 binary optical correlator

The 256x256 binary optical correlator utilizes two Spatial Light Modulators (SLMs) for inputting the data (see Figure 8 and Figure 9). The output is a Dalsa 256x256 high-speed CCD camera with a maximum sustained throughput of 220 Hz. The 256x256 binary SLMs can modulate light in binary amplitude or binary phase [5]. The type of modulation is selected by rotating an output analyzer. Current drive electronics refresh the SLM at a sustained rate of over 18 kHz and support a useful frame rate of over 2000 Hz.

The theoretical full-frame load time of the 256x256 VLSI chip is approximately 25 µsec, but it has only been tested to 51.2 µsec. This results in a tested continuous full-frame load rate of 19531 Hz, or equivalently 1.3 gigabits/sec. However, this does not include time for the LC to optically respond to the electric field, or for actual viewing time. Our current drive electronics support a load time of 55.4 µsec, response and view times as short as one load time, and a total true/inverse cycle time of 8 load cycles. This results in a useful frame time of 442.8 µsec, or a rate of 2258 Hz.

Each SLM is driven from a memory bank of 512 images. The Dalsa 256x256 camera feeds into a frame grabber for capturing the correlation images. The captured images are then transferred to host memory for postprocessing with a peak detection algorithm. Some sample input and actual correlation images can be seen in Figure 10 and Figure 11.
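As a rough worked example for the binary device, the quoted frame time can be turned into a frame rate, an implied slot length, and the time needed to step once through a 512-image memory bank. The bank sweep time is not a figure from the paper; it is derived here on the assumption that every image in the bank is displayed for exactly one useful frame. Variable names are illustrative.

```python
USEFUL_FRAME_US = 442.8      # quoted true/inverse cycle time
LOAD_CYCLES_PER_FRAME = 8    # quoted number of load cycles per true/inverse cycle
BANK_IMAGES = 512            # quoted memory bank size per SLM
TESTED_LOAD_US = 51.2        # quoted tested full-frame load time

frame_rate_hz = 1e6 / USEFUL_FRAME_US                     # about 2258 Hz
slot_us = USEFUL_FRAME_US / LOAD_CYCLES_PER_FRAME         # about 55.35 us per slot
tested_gbps = 256 * 256 * 1 / (TESTED_LOAD_US * 1e-6) / 1e9   # 1 bit per pixel

# Assumption: one useful frame per stored image, no idle time between frames.
bank_sweep_ms = BANK_IMAGES * USEFUL_FRAME_US / 1000.0    # about 227 ms

print(int(frame_rate_hz), round(slot_us, 2), round(tested_gbps, 2), round(bank_sweep_ms, 1))
```

The implied slot of about 55.35 µsec agrees with the quoted 55.4 µsec load time to within rounding.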

Figure 8 - Block diagram of 256x256 binary correlator. [Diagram: laser, Dalsa camera, input SLM driver, filter SLM driver, postprocessing board, and computer.]

Figure 9 - Photograph of 256x256 binary correlator.

Figure 10 - BigX input and correlation from 256x256 binary optical correlator.

Figure 11 - SmallXO input and correlation with a SmallX filter in the 256x256 binary optical correlator.

5. Summary

Similarities between data flow processors and optical processors have been drawn. Both achieve high throughputs by processing data in a highly parallel fashion as soon as the data is presented. Some background was given on the description of a data flow processor through the use of activity templates and program graphs. These same principles were applied to describe a generic optical processor. The designs for both the 128x128 analog and the 256x256 binary optical processors have been described in detail. Experimental correlation results were presented for the 256x256 binary optical processor.

6. Acknowledgments

The authors would like to thank the United States Department of Agriculture (contract #96-33610-3088) and Pacific Northwest National Labs (contract #323592-A-A6) for funding the development of the two optical processors.

7. References

[1] J. B. Dennis, Data Flow Supercomputers, Computer, Volume 13, Number 11, pp. 48-56, IEEE, November 1980.
[2] J. W. Goodman, Introduction to Fourier Optics, McGraw-Hill, San Francisco, 1968.
[3] S. A. Serati, G. D. Sharp, & R. A. Serati, 128 x 128 analog liquid crystal spatial light modulator, Optical Pattern Recognition VI, Volume 2490, pp. 378-387, SPIE, Bellingham, April 1995.
[4] K. A. Bauchert, S. A. Serati, G. D. Sharp, & D. J. McKnight, Complex phase/amplitude spatial light modulator advances and use in a multispectral optical correlator, Optical Pattern Recognition VIII, Volume 3073, pp. 170-177, SPIE, Bellingham, April 1997.
[5] D. J. McKnight, K. M. Johnson, & R. A. Serati, 256 x 256 liquid-crystal-on-silicon spatial light modulator, Applied Optics, Volume 33, Number 14, pp. 2775-2784, May 1994.