Performance Analysis with Vampir VIRTUAL INSTITUTE HIGH PRODUCTIVITY SUPERCOMPUTING

Performance Analysis with Vampir

Outline
Part I: Welcome to the Vampir Tool Suite
  Event Trace Visualization
  Vampir & VampirServer
  The Vampir Displays
Part II: Vampir Hands-On
  Visualizing and analyzing NPB-MZ-MPI / BT

SPEEDUP 16 TUTORIAL: PERFORMANCE AND ENERGY MONITORING AND ANALYSIS (BASEL, SWITZERLAND, SEPTEMBER 16, 2016)

Event Trace Visualization with Vampir
Alternative and supplement to automatic analysis
Show dynamic run-time behavior graphically at any level of detail
Provide statistics and performance metrics
Timeline charts
  Show application activities and communication along a time axis
Summary charts
  Provide quantitative results for the currently selected time interval

Visualization Modes (1)
Directly on front end or local machine
% vampir
[Diagram: Multi-Core Program, Score-P, Trace File, Vampir. Labels: small/medium sized trace; thread parallel.]

Visualization Modes (2)
On local machine with remote VampirServer
% vampirserver start
% vampir
[Diagram: Many-Core Program, Score-P, Trace File, VampirServer, LAN/WAN, Vampir. Labels: large trace file (stays on remote machine); parallel application.]

The main displays of Vampir
Timeline Charts:
  Master Timeline
  Process Timeline
  Counter Data Timeline
  Performance Radar
Summary Charts:
  Function Summary
  Message Summary
  Process Summary
  Communication Matrix View

Hands-on: Visualizing and analyzing NPB-MZ-MPI / BT

Help! Where is my trace file?
If you followed the Score-P hands-on up to the trace experiment:
% ls $HOME/NPB3.3-MZ-MPI/bin.scorep/scorep_bt-mz_trace
profile.cubex  scorep.cfg  traces/  traces.def  traces.otf2
If you did not follow to that point, take a prepared trace:
% ls /usr/local/speedup/scorep_bt-mz_trace
profile.cubex  scorep.cfg  traces/  traces.def  traces.otf2

Starting Vampir on MiniHPC
Start Vampir on MiniHPC, if you followed the Score-P hands-on:
% vampir $HOME/NPB3.3-MZ-MPI/bin.scorep/scorep_bt-mz_trace/traces.otf2
Start Vampir on MiniHPC, if you did not follow to that point:
% vampir /usr/local/speedup/scorep_bt-mz_trace/traces.otf2
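The two cases above can be folded into one check. The following is a minimal sketch, not part of the tutorial: `find_trace` is a hypothetical helper name, and the only assumption it makes is that an experiment directory contains a `traces.otf2` file, as in the listings above.

```shell
# Hypothetical helper (not from the slides): print the path of the first
# traces.otf2 found in the given experiment directories, in the order given.
# Returns non-zero if none of the directories contains one.
find_trace() {
    for dir in "$@"; do
        if [ -f "$dir/traces.otf2" ]; then
            printf '%s\n' "$dir/traces.otf2"
            return 0
        fi
    done
    echo "no traces.otf2 found" >&2
    return 1
}

# Possible usage on MiniHPC, preferring your own trace over the prepared one:
#   trace=$(find_trace "$HOME/NPB3.3-MZ-MPI/bin.scorep/scorep_bt-mz_trace" \
#                      /usr/local/speedup/scorep_bt-mz_trace) && vampir "$trace"
```

Listing your own experiment directory first means the prepared trace is only used as a fallback.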

Starting VampirServer on MiniHPC
Start VampirServer on MiniHPC:
% vampirserver start
Launching VampirServer...
Submitting batch job (this might take a while)...
VampirServer 9.1.0 (r10418)
Licensed to Universitaet Basel
Running 4 analysis processes... (abort with vampirserver stop 28974)
VampirServer <28974> listens on: dmi-cl-login.dmi.p.unibas.ch:30004
Copy host:port from the "listens on" line.
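Copying host:port by hand is easy to get wrong, so the address can also be pulled out of the output text. This is a sketch under assumptions: `vampirserver_endpoint` is a hypothetical helper name, and it relies only on the "listens on:" wording shown above, which may differ in other VampirServer versions.

```shell
# Hypothetical helper (not from the slides): read VampirServer start-up
# output on stdin and print the host:port that follows "listens on:".
# It also tolerates the slide-style line break (marked with "\") between
# the label and the address. Prints nothing if no address is found.
vampirserver_endpoint() {
    tr -d '\\' | tr '\n' ' ' |
        sed -n 's/.*listens on:[[:space:]]*\([^[:space:]]*:[0-9][0-9]*\).*/\1/p'
}

# Possible usage on MiniHPC:
#   endpoint=$(vampirserver start | vampirserver_endpoint)
```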

Start Vampir
Open a port forwarding to MiniHPC to be able to access the VampirServer:
% ssh -N -L 30000:dmi-cl-login.dmi.p.unibas.ch:30004 \
      dmi-cl-login.dmi.unibas.ch
Use the host:port from the VampirServer output as the forwarding target.
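The forwarding command has three variable parts: the local port, the host:port reported by VampirServer, and the login host you connect through. A minimal sketch of how they fit together follows; `tunnel_command` is a hypothetical helper name, and it only prints the command so it can be inspected before it is run.

```shell
# Hypothetical helper (not from the slides): print the ssh port-forwarding
# command for a VampirServer endpoint.
#   $1  local port Vampir will connect to (30000 on the slides)
#   $2  host:port reported by VampirServer
#   $3  login host to tunnel through
tunnel_command() {
    case "$1" in
        ''|*[!0-9]*) echo "local port must be numeric" >&2; return 1 ;;
    esac
    printf 'ssh -N -L %s:%s %s\n' "$1" "$2" "$3"
}

# Possible usage with the values from the slides:
#   tunnel_command 30000 dmi-cl-login.dmi.p.unibas.ch:30004 dmi-cl-login.dmi.unibas.ch
```

The `-N` option tells ssh not to run a remote command, so the session exists only to carry the forwarded port; Vampir on the local computer then connects to localhost on the chosen local port.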

Start Vampir on local computer

Use the Open Other option

Select Remote File

Server is localhost
Port is 30000
Connection type Socket

Visualization of the NPB-MZ-MPI / BT trace
Navigation Toolbar
Function Summary
Master Timeline
Function Legend

Visualization of the NPB-MZ-MPI / BT trace
Master Timeline
Detailed information about functions, communication and synchronization events for a collection of processes.

Visualization of the NPB-MZ-MPI / BT trace
Process Timeline
Detailed information about different levels of function calls in a stacked bar chart for an individual process.

Visualization of the NPB-MZ-MPI / BT trace
Typical program phases
  Initialisation Phase
  Computation Phase

Visualization of the NPB-MZ-MPI / BT trace
Counter Data Timeline
Detailed counter information over time for an individual process.

Visualization of the NPB-MZ-MPI / BT trace
Performance Radar
Detailed counter information over time for a collection of processes.

Visualization of the NPB-MZ-MPI / BT trace
Zoom in: Initialisation Phase
Context View: Detailed information about function initialize_.

Visualization of the NPB-MZ-MPI / BT trace
Find Function
Execution of the function initialize_ results in higher page fault rates.

Visualization of the NPB-MZ-MPI / BT trace
Computation Phase
The computation phase results in higher floating-point operation rates.

Visualization of the NPB-MZ-MPI / BT trace
Zoom in: Computation Phase
MPI communication results in lower floating-point operation rates.

Visualization of the NPB-MZ-MPI / BT trace
Zoom in: Finalisation Phase
Early reduce bottleneck.

Visualization of the NPB-MZ-MPI / BT trace
Process Summary
Function Summary: Overview of the accumulated information across all functions and for a collection of processes.
Process Summary: Overview of the accumulated information across all functions and for every process independently.

Visualization of the NPB-MZ-MPI / BT trace
Process Summary
Find groups of similar processes and threads by using summarized function information.

Summary and Conclusion

Summary
Vampir & VampirServer
  Interactive trace visualization and analysis
  Intuitive browsing and zooming
  Scalable to large trace data sizes (20 TiByte)
  Scalable to high parallelism (200,000 processes)
Vampir for Linux, Windows, and Mac OS X

http://www.vampir.eu
vampirsupport@zih.tu-dresden.de