Multi-representational Architectures: Incorporating Visual Imagery into a Cognitive Architecture
Soar Visual Imagery (SVI)
27th Soar Workshop
Scott Lathrop, John Laird
OUTLINE
  REVIEW
  CURRENT ARCHITECTURE (SVI)
  EXPERIMENTAL RESULTS
  FUTURE WORK
WHAT IS VISUAL IMAGERY?
  Which city is closer to Ann Arbor: South Bend, Indiana or Columbus, Ohio?
  [Map labels: 178 miles, 192 miles]
WHAT IS VISUAL IMAGERY?
  Which is wider in the center: the lower peninsula of Michigan or Ohio?
  [Map labels: ~200 miles, ~220 miles]
WHAT IS VISUAL IMAGERY?
  VISUAL IMAGERY
    VISUAL-SPATIAL (sentential/algebraic algorithms)
      Location, orientation
      Sentential, quantitative representations
      Linear algebra and computational geometry algorithms
    VISUAL-DEPICTIVE (depictive/ordinal algorithms)
      Shape, color, topology, spatial properties
      Depictive, pixel-based representations
      Image algebra algorithms
MULTI-VISUAL REPRESENTATIONS
  Abstract, non-committal level
    Representation: abstract symbols
    Processing: symbolic manipulation
    Uses: qualitative visual and spatial reasoning
    Example: object(mich), object(ohio), south(ohio,mich), in(aa,mich), sw(aa,mich), center(columbus,ohio), etc.
  Intermediate layer
    Representation: hybrid abstract and quantitative symbols
    Processing: sentential, algebraic manipulation
    Uses: quantitative spatial reasoning
    Example:
      Michigan  shape: rectangle  location: <10,20,0>
      AA        shape: point      location: <20,2,0>
      Ohio      shape: square     location: <15,-5,0>
  Concrete, committal level
    Representation: iconic / depictive symbols
    Processing: algebraic or depictive manipulation
    Uses: visual feature recognition, quantitative spatial reasoning
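The relationship between the intermediate (quantitative) layer and the abstract layer can be illustrated with a minimal Python sketch. It is not SVI code: the class `QuantObject` and the functions `south_of` and `abstract_facts` are hypothetical names, and it assumes that the second coordinate of each location increases toward the north. It only shows how a qualitative symbol such as south(ohio,mich) could be derived from the example locations on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuantObject:
    """Hypothetical intermediate-layer entry: a name, a shape label, a metric location."""
    name: str
    shape: str
    location: tuple  # (x, y, z); assumption: +y points north


def south_of(a, b):
    """True if a lies south of b under the +y-is-north assumption."""
    return a.location[1] < b.location[1]


def abstract_facts(objects):
    """Derive abstract, qualitative symbols from the quantitative entries."""
    facts = [f"object({o.name})" for o in objects]
    for a in objects:
        for b in objects:
            if a is not b and south_of(a, b):
                facts.append(f"south({a.name},{b.name})")
    return facts


# Example values copied from the slide's intermediate layer.
mich = QuantObject("mich", "rectangle", (10, 20, 0))
ohio = QuantObject("ohio", "square", (15, -5, 0))

facts = abstract_facts([mich, ohio])
print(facts)
```

The abstract facts commit to less than the coordinates they were derived from, which is the sense in which the top level is non-committal and the lower levels are committal.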
REVIEW SUMMARY
  Why research visual imagery?
    Best-of-both-worlds, multi-representational approach
      Abstract symbolic representations and computations
      Perceptually-based quantitative and depictive representations
    Add new capability
      Visual-spatial reasoning
      Visual-feature retrieval and reasoning
    Gain computational advantage
  Previous architecture and experiments focused exclusively on quantitative representations and visual-spatial tasks
  Open research questions as of the last Soar workshop
    What is a visual image's internal representation? Is there more than one format/data structure?
    What is the relationship between high-level vision and visual imagery, and how does that constrain the architecture?
[Figure: Soar and SVI architecture diagram]
  Soar: Short-Term Memory (S1; goals, current state), Semantic Memory, Episodic Memory, Learning (chunking, reinforcement), Appraisal Detector
  Operators: Task; Vision (Attend-Visual-Object, Attend-Visual-Spatial); Imagery (Construct, Transform, Generate, Query)
  Labels between Soar and SVI: Visual ID, Explicit Object Features, Spatial Relationships
  SVI memories: Visual LTM (object shape, object color); Object Map STM (object location, topology, orientation, size; metric representation); Visual Buffer STM (depictive, pixel-based)
  SVI processes: Visual LTM Listeners, Object Map Listeners, What Inspector, Where Inspector, Refresher (stimulus-based refresh)
  Other labels: Rendered Image, Visual Scene
  Legend: memory, process, symbols, visualizable symbols, operator, control path, data pathway
[Figure: architecture diagram, Construct example]
  Same Soar components and legend as the preceding architecture diagram
  Imager: Constructor, driven by the Construct operator
  Labels on the paths into SVI: Visual ID, General Shape; Spatial Relationships, Size, General Shape
  Example symbolic structure: visual-object place-setting with has-a links to visual-objects napkin, fork, and plate; visual-id values (23, 12, 1); viewpoint (top, front); spatial relationships (left-of, above, below, left, right); topology (connected, disconnected)
  SVI components shown: Visual LTM (object shape, object color), Visual LTM Listeners, Object Map STM (object location, topology, orientation, size; metric representation), Object Map Listeners, Visual Buffer STM (depictive, pixel-based), What Inspector, Where Inspector
[Figure: architecture diagram, Transform and Generate]
  Same Soar components and legend as the preceding architecture diagrams
  Imager: Refresher and Manipulator, driven by the Transform and Generate operators
  Manipulator targets: Viewpoint, Object, Global
  Manipulated attributes and actions: Location (Move), Orientation (Orient), Size (Resize)
  Labels on the paths into SVI: Shape, Color, Spatial Relationships
  SVI components shown: Visual LTM (object shape, object color), Visual LTM Listeners, Object Map STM (object location, topology, orientation, size; metric representation), Object Map Listeners, Visual Buffer STM (depictive, pixel-based), What Inspector, Where Inspector
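The Move, Orient, and Resize labels in the Transform diagram suggest simple operations over the metric attributes held in the Object Map. The following Python sketch illustrates that idea under stated assumptions: `MapObject`, `move`, `orient`, and `resize` are hypothetical names, orientation is a single heading in degrees, and size is a uniform scale factor. SVI's actual data structures and interface are not given in the slides.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MapObject:
    """Hypothetical Object Map entry holding metric attributes."""
    visual_id: str
    location: tuple     # (x, y, z)
    orientation: float  # heading in degrees (assumption)
    size: float         # uniform scale factor (assumption)


def move(obj, delta):
    """Return a copy of obj translated by delta = (dx, dy, dz)."""
    new_location = tuple(c + d for c, d in zip(obj.location, delta))
    return replace(obj, location=new_location)


def orient(obj, degrees):
    """Return a copy of obj rotated by the given number of degrees."""
    return replace(obj, orientation=(obj.orientation + degrees) % 360)


def resize(obj, factor):
    """Return a copy of obj scaled by a positive factor."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return replace(obj, size=obj.size * factor)


# Imagine the fork moved, turned, and shrunk; the original entry is untouched.
fork = MapObject("fork", (0.0, 0.0, 0.0), 0.0, 1.0)
imagined = resize(orient(move(fork, (2.0, 0.0, 0.0)), 90.0), 0.5)
print(imagined)
```

Returning new objects instead of mutating the stored one keeps the perceived entry separate from the imagined variant, which is a design choice of this sketch and not a claim about SVI.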
[Figure: architecture diagram, Query]
  Same Soar components and legend as the preceding architecture diagrams
  Query operator: Inspect Visual Features, Inspect Spatial Relationships
  Labels between Soar and SVI: Visual ID, Explicit Object Features, Spatial Relationships
  SVI components shown: Imager, Visual LTM (object shape, object color), Visual LTM Listeners, Object Map STM (object location, topology, orientation, size; metric representation), Object Map Listeners, Visual Buffer STM (depictive, pixel-based), What Inspector, Where Inspector, Refresher (stimulus-based refresh)
  Other labels: Rendered Image, Visual Scene
KEY POINTS
  Central Cognition (Soar)
    Abstract, symbolic visual representations
    Domain knowledge (goals, states, task constraints)
    Controls construction, transformation, generation, and inspection
  Vision / Visual Imagery (SVI)
    Quantitative and depictive visual representations
    Leverages mechanisms provided by visual perception
    Constructs and generates what it is told
    Provides perceptions based on what it sees
  Enables novel composition of previously perceived objects
  Reacquires knowledge abstracted away during initial perception
DEPICTIVE EXPERIMENT: ALPHABET FEATURES
  Presentation of letter name and feature (300 ms), then response (yes or no); RT measured
    Example: letter "A", feature "curve"
  Emphasized inspection of object features
    curve
    symmetry
    enclosed space
  Depictive (pixel) representations
  Shape (vertices) stored in Visual LTM, so the visual representation had to be constructed
  External environment, non-visual interaction
DEPICTIVE EXPERIMENT: SYMMETRY
  Transform representation along axis of symmetry
  Make comparison by subtracting out differences
  New capability (+)
  No correlation with human data (-)
    Transformation in one cycle
    No maintenance of visual representation
  [Plot: time (ms), axis 400 to 1000, by letter (E G V R M H P T I J F Q W L); series: Human, Soar; annotation: Transform]
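The two steps named on this slide, transforming the representation along the axis of symmetry and subtracting out differences, can be sketched on a small pixel grid. This is an illustrative Python example and not the experiment's implementation: the function names, the 5x5 letter bitmaps, and the choice of a vertical axis of symmetry are all assumptions.

```python
def mirror(image):
    """Reflect a pixel grid about its vertical axis (left-right flip)."""
    return [row[::-1] for row in image]


def difference_count(a, b):
    """Count the pixels at which two equally sized grids differ."""
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def is_symmetric(image, tolerance=0):
    """Compare the image with its mirrored copy by subtracting out differences."""
    return difference_count(image, mirror(image)) <= tolerance


# Hypothetical 5x5 bitmaps standing in for depictive letter representations.
LETTER_A = [
    [0, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
]
LETTER_F = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
]

print(is_symmetric(LETTER_A), is_symmetric(LETTER_F))
```

Because the whole flip and comparison happen in a single call, a sketch like this has no notion of elapsed time, which parallels the slide's note that a one-cycle transformation did not correlate with human response times.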
EXPERIMENTAL DOMAIN CONSTRAINTS
  Interactive domain, not a question-and-answer task
  Emphasize the interaction between bottom-up visual perception and top-down visual imagery processing to evaluate the perceive-imagine-reperceive cycle
  Evaluate both visual-spatial and visual-depictive imagery
  Exercise the major visual imagery functionalities (construction, transformation, generation, inspection)
SCOUT DOMAIN
  Radio report: "Blue 1, this is Blue 2, enemy vehicle at grid 654321, moving northwest, out."
  Radio report: "Blue 1, this is Blue 2, currently set at grid 123456, oriented southeast, out."
  Agent is a leader of another scout/sensor vehicle
  Teammate calls in verbal report of enemy
  Agent may not be able to see enemy but can imagine its location and orientation based on verbal report
  Agent cannot see teammate but can imagine its location and orientation based on verbal reports
  Agent imagines what it and teammate can see (field of view)
  Agent imagines other paths enemy can take based on tactics/doctrine, current direction, and key terrain (e.g. hill mass, buildings, bridges)
  Agent determines if locations provide adequate coverage. Can agent readjust self or teammate? (Decision)
  LEGEND: Agent, Teammate, Scout, Enemy
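Imagining what a vehicle can see from a reported location and orientation can be reduced, in the simplest case, to a field-of-view test. The Python sketch below is an illustration only: `parse_grid` and `can_see` are hypothetical names, the split of a six-digit grid into three easting and three northing digits follows the common military grid convention, the 90-degree field of view is an arbitrary default, and terrain occlusion (hill masses, buildings) is ignored. The grid values and headings are illustrative.

```python
import math


def parse_grid(grid):
    """Split a six-digit grid into (easting, northing), three digits each."""
    return int(grid[:3]), int(grid[3:])


def can_see(observer_xy, heading_deg, target_xy, fov_deg=90.0, max_range=float("inf")):
    """True if the target lies inside the observer's field-of-view cone.

    heading_deg is a compass heading: 0 = north, 90 = east, clockwise.
    """
    dx = target_xy[0] - observer_xy[0]
    dy = target_xy[1] - observer_xy[1]
    distance = math.hypot(dx, dy)
    if distance == 0:
        return True
    if distance > max_range:
        return False
    # Compass bearing from observer to target (atan2 arguments swapped on purpose).
    bearing = math.degrees(math.atan2(dx, dy)) % 360
    # Smallest absolute angle between the bearing and the heading.
    offset = abs((bearing - heading_deg + 180) % 360 - 180)
    return offset <= fov_deg / 2


# Illustrative values: an observer oriented southeast and a reported enemy position.
teammate = parse_grid("123456")
enemy = parse_grid("654321")
SOUTHEAST, NORTHWEST = 135.0, 315.0

print(can_see(teammate, SOUTHEAST, enemy))
print(can_see(teammate, NORTHWEST, enemy))
```

A coverage decision of the kind the slide describes could repeat this test for each imagined enemy path and each friendly vehicle, then readjust a vehicle when no cone contains the path.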
NUGGETS & COAL
  NUGGETS
    Answered some of the questions from last year
      Types of representations
      Intersection between high-level vision and visual imagery
    Architectural components are relatively stable
    Simulation is up and running
  COAL
    Determining when to use which representation without a big switch
    Details of specific algorithms are unclear
    Processing with concurrent visual perception and visual imagery is unknown (resource constraints)