White Paper Group Capacity and the Mystery of the Unenforced Limit Fabio Massimo Ottaviani - EPV Technologies


SEGUS Inc, 14151 Park Meadow Drive, Chantilly, VA 20151, 800.327.9650, www.segus.com. SEGUS, EPV 2012 2014

1 Introduction

Most sites pay software costs to IBM and other ISVs under the WLC (Workload License Charges) pricing policy. Under this policy, license fees depend on CPU usage (measured in MSUs) rather than on machine capacity. CPU usage is calculated as a 4-hour rolling average [1]; depending on the workload characteristics, this value can be much lower than the capacity of the machine, which is normally oversized to guarantee service levels during a few peak hours.

The bad news is that the WLC software license fee is a monthly fee, based on the maximum value of the measured 4-hour rolling average. The complexity of today's systems and workloads, together with human error, makes it quite likely that a company ends up paying for the full capacity of the machine most of the time.

To guarantee the expected savings, IBM introduced the option to limit the MSUs which can be used in the 4-hour rolling average:
- by a single LPAR (defined capacity limit);
- by a group of LPARs (group capacity limit).

The defined capacity limit can be very useful in preventing certain LPARs, normally those running non-business-critical workloads, from increasing the overall software costs.

[1] The sum of the measured 4-hour rolling averages for all the LPARs in the CPC.
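The billing mechanism described above can be illustrated with a minimal sketch. It assumes hourly MSU samples (a simplification chosen for readability) and averages over whatever samples are available until the first four hours have elapsed; the function names and the sample values are hypothetical and are not taken from any IBM tool.

```python
from collections import deque


def rolling_4h_average(hourly_msu, window=4):
    """Return, for each hour, the average of the last `window` hourly samples."""
    recent = deque(maxlen=window)  # keeps only the most recent samples
    averages = []
    for value in hourly_msu:
        recent.append(value)
        # Assumption: before the window is full, average what is available.
        averages.append(sum(recent) / len(recent))
    return averages


def monthly_billable_peak(hourly_msu):
    """The monthly fee is driven by the maximum of the rolling average."""
    return max(rolling_4h_average(hourly_msu))


# Hypothetical hourly MSU consumption with a two-hour spike to 900 MSUs.
sample = [400, 400, 400, 400, 900, 900, 400, 400, 400, 400]
averages = rolling_4h_average(sample)
peak = monthly_billable_peak(sample)
print(averages)
print(peak)
```

In this made-up series the instantaneous peak is 900 MSUs, but the value that would drive the monthly fee is the highest rolling average, which is considerably lower because the spike is short.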

The group capacity limit is much more important: it can guarantee that you don't pay for more than the limit value (or more than the sum of the limit values if more than one LPAR group has been created). This is why the majority of z/OS sites use group capacity limits to protect against the risk of unplanned software costs.

Unfortunately, it can happen that the group capacity limit is not enforced as expected, leading to undesired results. After a short introduction to group capacity concepts, we will discuss this issue based on the experience of one of our customers.

2 Group Capacity Overview

The group capacity limit is an extension of defined capacity, allowing customers to limit the MSUs which can be used in the 4-hour rolling average by a group of LPARs [2]. Users can easily create groups of LPARs, and apply a capacity limit to each of them, by setting the Group Name and Group Limit parameters in the LPAR definitions on the Hardware Management Console.

The following basic rules have to be fulfilled:
- an LPAR can only belong to one group;
- all the LPARs in a group have to run on the same machine.

Additional limitations apply:
- the LPAR must run with shared processors;
- the LPAR must run with wait completion equal to NO;
- the operating system must be z/OS V1R8 or higher;
- hardware capping must not be used to limit the CPU used by the LPAR.

[2] Group and defined capacity limits can coexist and work together.

WLM (Workload Manager) uses the partition definitions and the limits to calculate a minimum and a maximum entitlement for each LPAR in the group:

- the minimum entitlement is the guaranteed share the LPAR can get when in contention; it is calculated as MIN((WGT x GROUP MSU / SUM(WGT)), DEF MSU) if DEF MSU GT 0;
- the maximum entitlement is the maximum share the LPAR can get; it is calculated as MIN(DEF MSU, GROUP MSU) if DEF MSU GT 0.

The table in Figure 1 shows an example of group and defined capacity settings as reported in the Group Capacity configurations view [3]:

GROUP CAPACITY CONFIGURATION - THU, 25 JAN 2012

CEC   GROUP   SYSTEM  LPAR-NAME  CEC MSU  GROUP MSU  WEIGHT  DEF MSU  MIN ENT  MAX ENT  CAP  OLD Z/OS  DED  WC=Y
SER1  Z10ALL  SYS1    LPR1          1329       1010     136        0    137.4     1010  N    N         N    N
SER1  Z10ALL  SYS2    LPR2          1329       1010     717        0    724.2     1010  N    N         N    N
SER1  Z10ALL  SYS3    LPR3          1329       1010       5        9      5.1        9  N    N         N    N
SER1  Z10ALL  SYS4    LPR4          1329       1010      70      126     70.7      126  N    N         N    N
SER1  Z10ALL  SYS5    LPR5          1329       1010      36        0     36.4        0  N    N         N    N
SER1  Z10ALL  SYS6    LPR6          1329       1010      36        0     36.4        0  N    N         N    N

Figure 1

Only one group (Z10ALL) has been created on the SER1 machine. The group capacity limit is set to 1010 MSUs. Defined capacity limits have also been assigned to SYS3 and SYS4 (9 and 126 MSUs) to limit their entitlement.

The four flags at the end of each row indicate whether the LPAR definition complies with the group capacity limitations described above; a flag set to Y marks a violation:
- CAP: hardware capping is used;
- OLD Z/OS: the z/OS release is older than 1.8;
- DED: the CPUs are dedicated;
- WC=Y: wait completion is equal to YES.

[3] All the figures present standard views from our EPV for z/OS product.
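The two entitlement formulas can be checked against the weights and limits of Figure 1 with a short sketch. The function name is hypothetical, and the handling of LPARs without a defined capacity (DEF MSU equal to 0) is an assumption: the sketch uses the weight-based share as the minimum and the group limit as the maximum, as the SYS1 and SYS2 rows suggest.

```python
def entitlements(weights, group_msu, defined_msu):
    """Return {lpar: (min_entitlement, max_entitlement)} in MSUs."""
    total_weight = sum(weights.values())
    result = {}
    for lpar, weight in weights.items():
        # Weight-based share of the group limit: WGT x GROUP MSU / SUM(WGT)
        share = weight * group_msu / total_weight
        defined = defined_msu.get(lpar, 0)
        if defined > 0:
            result[lpar] = (min(share, defined), min(defined, group_msu))
        else:
            # Assumption: without a defined capacity only the group terms apply.
            result[lpar] = (share, group_msu)
    return result


# Weights and defined capacities as reported in Figure 1.
weights = {"SYS1": 136, "SYS2": 717, "SYS3": 5, "SYS4": 70, "SYS5": 36, "SYS6": 36}
defined = {"SYS3": 9, "SYS4": 126}

ent = entitlements(weights, 1010, defined)
for lpar, (low, high) in ent.items():
    print(f"{lpar}: MIN ENT {low:.2f}  MAX ENT {high}")
```

Because the weights add up to 1000, each minimum entitlement is simply the weight multiplied by 1.01, which matches the MIN ENT column of Figure 1 after rounding to one decimal.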

3 The mystery of the unenforced limit

At a customer site, group capacity is used to control the software costs of 6 LPARs running on an IBM 2097-717. Their group and defined capacity definitions are reported in Figure 1. Looking at the EPV Management Summary view, they realized that something strange had happened in the last month.

CEC   DATE     INST MSU  USED MSU  BASELINE
SER1  2012-01      1329      1070      1010
SER1  2011-12      1329       933      1010
SER1  2011-11      1329       973      1010
SER1  2011-10      1329       965       985
SER1  2011-09      1329       913       985
SER1  2011-08      1329       904       970
SER1  2011-07      1329       911       970
SER1  2011-06      1329       956       970
SER1  2011-05      1329       920       950
SER1  2011-04      1329       940       950
SER1  2011-03      1329       883       950
SER1  2011-02      1329       952       950
SER1  2011-01      1329       944       950

Figure 2

The monthly peak of the MSUs used in the 4-hour rolling average (USED MSU) in January 2012 is 60 MSUs more than the group capacity limit (BASELINE).

The soft capping algorithms used by defined and group capacity cannot be extremely precise, so the MSUs used may slightly exceed the limits (see also February 2011 in the above figure). This is an advantage for the customer, who doesn't have to pay for these extra MSUs: the charge is based on the minimum of the limit set and the MSUs used.
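The comparison between used MSUs and the baseline can be sketched as a simple check over a few rows of Figure 2. The tolerance of 10 MSUs is a hypothetical threshold chosen only for illustration, since the paper does not quantify normal soft capping imprecision, and the function names are likewise assumptions.

```python
# (month, used MSU, baseline MSU), copied from a few rows of Figure 2.
FIGURE_2 = [
    ("2012-01", 1070, 1010),
    ("2011-12", 933, 1010),
    ("2011-02", 952, 950),
    ("2011-01", 944, 950),
]

# Hypothetical tolerance for normal soft capping imprecision, in MSUs.
TOLERANCE_MSU = 10


def overshoot(used, limit):
    """MSUs used above the limit (0 when the limit was respected)."""
    return max(0, used - limit)


def billable_msu(used, limit):
    """Minimum of limit and usage; valid only while the limit covers every LPAR."""
    return min(used, limit)


def suspicious_months(rows, tolerance):
    """Months whose overshoot is larger than the accepted tolerance."""
    return [month for month, used, limit in rows if overshoot(used, limit) > tolerance]


flagged = suspicious_months(FIGURE_2, TOLERANCE_MSU)
print(flagged)
```

With this threshold, February 2011 (2 MSUs over) passes as ordinary imprecision, while January 2012 (60 MSUs over) is flagged for investigation.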

However, 60 MSUs seemed too high to be considered normal soft capping imprecision, so they decided to investigate further.

CEC: SER1 BY GROUP

                                     Z10ALL  Z10ALL  NOLIMIT
DATE     TYPE  MODEL   MSU  TOTAL    LIMIT    USED     USED
2012-01  2097  717    1329   1070     1010     975       95
2011-12  2097  717    1329    975     1010     933
2011-11  2097  717    1329    913     1010     973
2011-10  2097  717    1329    873      985     965
2011-09  2097  717    1329    865      985     913
2011-08  2097  717    1329    913      970     904
2011-07  2097  717    1329    867      970     911
2011-06  2097  717    1329    861      970     956
2011-05  2097  717    1329    856      950     920
2011-04  2097  717    1329    879      950     940
2011-03  2097  717    1329    728      950     883
2011-02  2097  717    1329    823      950     952
2011-01  2097  717    1329    883      950     944

Figure 3

Besides the Z10ALL group, the WLC by Group view (see Figure 3) reports an additional NOLIMIT group, which used 95 MSUs, but only in January 2012.

Drilling down to the day level, the problem appears to be restricted to January 26th, which is also the peak of the month.

CEC: SER1 BY GROUP

DATE        DAY  TYPE  MODEL   MSU  TOTAL  Z10ALL  NOLIMIT
02/01/2012  WED  2097  717    1329    704     704
01/31/2012  TUE  2097  717    1329    712     712
01/30/2012  MON  2097  717    1329    745     745
01/29/2012  SUN  2097  717    1329    419     419
01/28/2012  SAT  2097  717    1329    823     823
01/27/2012  FRI  2097  717    1329    929     929
01/26/2012  THU  2097  717    1329   1070     975       95
01/25/2012  WED  2097  717    1329    964     964
01/24/2012  TUE  2097  717    1329    816     816
01/23/2012  MON  2097  717    1329    767     767
01/22/2012  SUN  2097  717    1329    350     350
01/21/2012  SAT  2097  717    1329    784     784
01/20/2012  FRI  2097  717    1329    907     907
01/19/2012  THU  2097  717    1329    882     882
01/18/2012  WED  2097  717    1329    943     943
01/17/2012  TUE  2097  717    1329    867     867
01/16/2012  MON  2097  717    1329    786     786
01/15/2012  SUN  2097  717    1329    336     336
01/14/2012  SAT  2097  717    1329    630     630
01/13/2012  FRI  2097  717    1329    841     841
01/12/2012  THU  2097  717    1329    761     761
01/11/2012  WED  2097  717    1329    787     787
01/10/2012  TUE  2097  717    1329    851     851
01/09/2012  MON  2097  717    1329    761     761
01/08/2012  SUN  2097  717    1329    318     318
01/07/2012  SAT  2097  717    1329    661     661
01/06/2012  FRI  2097  717    1329    740     740
01/05/2012  THU  2097  717    1329    771     771
01/04/2012  WED  2097  717    1329    816     816
01/03/2012  TUE  2097  717    1329    785     785
01/02/2012  MON  2097  717    1329    792     792

Figure 4

Drilling down further still, the mystery was solved.

CEC: SER1 - WORKLOAD: z/OS - 4 HOUR MOVING AVG BY HOUR - THU, 26 JAN 2012

GROUP    SYSTEM  TYPE  MODEL   MSU    0    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15   16   17   18   19   20   21   22   23
Z10ALL   SYS1    2097  717    1329   69   74   71   51   34   31   37   48   59   69   71   67   63   57   57   55   56   69   58   58   49   41   43   55
Z10ALL   SYS2    2097  717    1329  708  664  648  663  660  646  648  638  655  687  741  797  834  836  805  818  840  855  864  832  823  813  777  727
Z10ALL   SYS3    2097  717    1329    6    1    2    4    5    6    6    6    6    6    6    7    7    7    7    6    6    6    6    6    6    6    6    6
Z10ALL   SYS4    2097  717    1329   45   51   45   42   30   21   20   23   27   31   35   38   37   38   39   43   65   54   34   34   32   32   31   29
NOLIMIT  SYS5    2097  717    1329    8    3    4    6    7    8    8    8    8    8    8    9    9    9    9   23   35   47    8    8    8    8    8    8
NOLIMIT  SYS6    2097  717    1329    8    3    4    6    7    8    8    8    8    8    8    9    9    9    9   22   37   48    8    8    8    8    8    8
TOTAL                               844  796  774  772  743  720  727  731  763  809  869  927  959  956  926  967 1039 1070  978  946  926  908  873  833

Figure 5

For some reason, the SYS5 and SYS6 LPARs were not included in the Z10ALL group and were therefore not controlled by the group capacity limit. So, in the peak hour, they used about 95 MSUs which, on top of the 975 MSUs used by the Z10ALL group, led to a total of 1070 MSUs being used.

4 Elementary, my dear Watson!

The explanation was, as often happens, very simple. Looking at the EPV Exceptions, they found an alert pointing to a wrong Group Capacity definition.

GROUP CAPACITY CONFIGURATION - THU, 26 JAN 2012

CEC   GROUP   SYSTEM  LPAR-NAME  CEC MSU  GROUP MSU  WEIGHT  DEF MSU  MIN ENT  MAX ENT  CAP  OLD Z/OS  DED  WC=Y
SER1  Z10ALL  SYS1    LPR1          1329       1010     136        0    137.4     1010  N    N         N    N
SER1  Z10ALL  SYS2    LPR2          1329       1010     717        0    724.2     1010  N    N         N    N
SER1  Z10ALL  SYS3    LPR3          1329       1010       5        9      5.1        9  N    N         N    N
SER1  Z10ALL  SYS4    LPR4          1329       1010      70      126     70.7      126  N    N         N    N
SER1  Z10ALL  SYS5    LPR5          1329       1010      36        0     36.4     1010  Y    N         N    N
SER1  Z10ALL  SYS6    LPR6          1329       1010      36        0     36.4     1010  Y    N         N    N

Figure 6
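The flags of Figure 6 and the hourly values of Figure 5 can be combined in a small sketch. It assumes that any flag set to Y places the LPAR outside the group, which is the reading suggested by the limitations listed in the Group Capacity Overview; the class and function names are hypothetical. The hourly series for SYS5 and SYS6 are copied from Figure 5, and the 975 MSUs of the Z10ALL group are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class LparFlags:
    name: str
    cap: bool      # CAP: hardware capping is used
    old_zos: bool  # OLD Z/OS: release older than 1.8
    ded: bool      # DED: dedicated CPUs
    wc_yes: bool   # WC=Y: wait completion equal to YES


def in_group(lpar):
    """Assumption: a single raised flag is enough to exclude the LPAR."""
    return not (lpar.cap or lpar.old_zos or lpar.ded or lpar.wc_yes)


# Flags as reported in Figure 6 (only SYS5 and SYS6 have CAP = Y).
figure6 = [
    LparFlags("SYS1", False, False, False, False),
    LparFlags("SYS2", False, False, False, False),
    LparFlags("SYS3", False, False, False, False),
    LparFlags("SYS4", False, False, False, False),
    LparFlags("SYS5", True, False, False, False),
    LparFlags("SYS6", True, False, False, False),
]
nolimit = [lpar.name for lpar in figure6 if not in_group(lpar)]

# 4-hour rolling average by hour (0 to 23) on 26 January 2012, from Figure 5.
hourly = {
    "SYS5": [8, 3, 4, 6, 7, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 23, 35, 47, 8, 8, 8, 8, 8, 8],
    "SYS6": [8, 3, 4, 6, 7, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 22, 37, 48, 8, 8, 8, 8, 8, 8],
}

# MSUs consumed outside the group limit, hour by hour.
nolimit_by_hour = [sum(hourly[name][hour] for name in nolimit) for hour in range(24)]
peak_hour = max(range(24), key=lambda hour: nolimit_by_hour[hour])

Z10ALL_USED_AT_PEAK = 975  # value stated in the text for the peak hour
total_at_peak = Z10ALL_USED_AT_PEAK + nolimit_by_hour[peak_hour]
print(nolimit, peak_hour, nolimit_by_hour[peak_hour], total_at_peak)
```

The uncapped consumption peaks in the same hour as the overall monthly peak, which is why the excess went straight into the billed value.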

On January 26th, it was decided to hard cap SYS5 and SYS6 before running a new application performance test. Unfortunately, as explained in the WLM manual, when the limitations described in Section 2 (Group Capacity Overview) are not fulfilled:

"All partitions which do not conform to these rules are not considered part of the group. WLM will dynamically remove such partitions from the group and manage the remaining partitions towards the group limit."

In all fairness to the customer, we have to say that the hardware capping limitation was documented neither in the z/OS 1.10 WLM manual (which contains the sentence quoted above) nor in the z/OS WLM manuals prior to 1.10. In the earlier manuals, the description of the limitations was incomplete and read as follows:

"WLM will only manage partitions with shared CPs and running on z/OS V1R8. All partitions which do not conform to this rule will not be considered as part of the group."

5 Conclusions

The group capacity limit is a very powerful tool which can protect z/OS customers from unexpected and undesired software cost increases. However, it is important to be aware that LPAR definitions have to comply with the group capacity rules and limitations.

In this paper we described a real-life situation where a lack of awareness of these limitations caused an increase of about 60 MSUs in the monthly peak of the 4-hour rolling average. In this case, the oversight led to extra costs of around $78,000.

SEGUS Inc is the North American distributor for EPV products. For more information regarding EPV for z/OS, please visit www.segus.com or call (800) 327-9650.