Diagnosing Intermittent MySQL Problems

About Me
  You can contact me at baron@percona.com

Percona
  MySQL Consulting, Support, Training, & Engineering
  Percona Server: enhanced version of MySQL
  Percona XtraBackup: hot InnoDB backups
  Percona Toolkit: tools for DBAs and sysadmins

Percona Events
  Webinars
    Once a month. Free! See percona.com/webinars
    Watch recordings of past webinars if you missed them
  Conferences
    See percona.com/live
    Percona Live London, October 24-25
    Percona Live Washington D.C., January 12th
    Percona Live MySQL Conference & Expo, Santa Clara, April 10-12

Today's Agenda
  Diagnosing intermittent MySQL problems
    What kind of problems are we talking about?
    Why are they hard to solve?
    What approaches can solve them successfully?
    What tools can help you do it more quickly?
    How can you set up and use these tools?
    How do you interpret the results?
  Case Studies

Intermittent Problems
  Happen at random times
  Hard to observe in action
  No obvious reason

What Kinds Of Problems?
  In general, we see three kinds:
    Randomly slow query
    Sudden error message
    Server-wide stalls
  Real customer examples:
    "My server seems to freeze for ten seconds to a minute at random times. Suddenly, everything clears up again. It seems to happen for no reason."
    "I get sporadic 'too many connections' errors. Increasing max_connections doesn't help. This is not related to my peak load."

How Hard Can It Be?
  It's hard to troubleshoot when you can't see it.
    "Our graphs show this happens for 1 to 3 minutes once or twice a week."
  It's hard to get support when it's not reproducible.
    "Our support staff thinks that we are imagining it."
    "We filed a bug, but it was closed because we can't create a test case."
  It can go on forever.
    "We've been working on this for nearly 5 months."

Why Does This Happen?
  More CPUs
  More memory
  More popularity
  Cloud computing

How Not To Do It
  DON'T try to use tuning scripts
  DON'T try to change server settings
  DON'T try rebooting everything
  DON'T do $random_stab_in_the_dark
  DON'T try upgrading or replacing components

The Fruits of Trial-And-Error
  I think this might be related to your networking. Can you try buying a new switch?
  Oh, that didn't solve it? Hmmm... let me think.
  I saw someone else on the Internet with a problem like this. They said that switching from Debian to Red Hat fixed it. Can you try that?
  It still happens? Oh wow. What version of Java are you using? Can you [upgrade|downgrade] that?

The Fruits of Trial-And-Error
  ... Time passes ...
  Sorry, I really don't know. Well, this is a free forum, so at least this didn't cost you anything.

Measure, Measure, Measure
  You cannot fix what you cannot measure.

How Do I Measure?
  You have to measure in three ways:
    Completely. Schwartz's Law: whatever you don't measure is the data you need.
    Correctly timed. If you measure in 5 minute increments and it happens for 10 seconds, you'll never see it.
    Correctly scoped. If you're looking at the whole server instead of measuring the specific piece that's having trouble, you'll mix data.
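
The "correctly timed" point can be made concrete with a little arithmetic. The following is a minimal sketch, assuming a point-in-time gauge (such as Threads_running) sampled at a fixed interval, and a stall that begins at a random offset relative to the sampling. The helper name catch_chance is hypothetical and not part of any tool mentioned here.

```shell
# catch_chance DURATION INTERVAL
# Chance that at least one point-in-time sample lands inside a stall of
# DURATION seconds when samples are taken every INTERVAL seconds
# (assumes the stall starts at a random offset relative to the sampling).
catch_chance() {
  awk -v d="$1" -v i="$2" 'BEGIN {
    p = d / i
    if (p > 1) p = 1        # a stall longer than the interval is always seen
    printf "%.1f%%\n", p * 100
  }'
}

# Usage:
#   catch_chance 10 300   # 10-second stall, 5-minute samples
#   catch_chance 10 1     # same stall, 1-second samples
```

Under these assumptions, a 10-second stall has roughly a 1-in-30 chance of coinciding with any given 5-minute sample, which is why one-second sampling is used in the examples that follow.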

What Should I Measure?
  Everything.
  Yes, it's a lot of data. See Schwartz's Law.

I Never See It Happen
  You need automatic tools watching for it.
  We've developed good tools for this.

Using Percona Toolkit
  Percona Toolkit = Maatkit + Aspersa
  The primary tools for this are:
    pt-stalk: wait for something to happen, then execute...
    pt-collect: gather tons of diagnostic data for a short time
    pt-sift: look for needles in the pt-collect haystack

Finding a Trigger
  Find a reliable way to detect the problem
  Getting this right is the foundation!
  Use this as a trigger for pt-stalk.

Example
  $ mysqladmin ext -i1 | awk '
      /Queries/{q=$4-qp;qp=$4}
      /Threads_connected/{tc=$4}
      /Threads_running/{printf "%5d %5d %5d\n", q, tc, $4}'
  Columns: queries per second, Threads_connected, Threads_running
    798   136     7
    767   134     9
    828   134     7
    683   134     7
    784   135     7
    614   134     7
    108   134    24
    187   134    31
    179   134    28
   1179   134     7
   1151   134     7
   1240   135     7
   1000   135     7
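
To pick a trigger from output like the sample above, it can help to filter for the samples that stand out. This is a minimal sketch, assuming the three-column format shown (queries per second, Threads_connected, Threads_running) has been saved to a file. The function name flag_stalls and the file name samples.txt are hypothetical.

```shell
# flag_stalls LIMIT
# Reads lines of "queries_per_sec threads_connected threads_running" on
# stdin and prints only the samples whose threads_running exceeds LIMIT.
flag_stalls() {
  awk -v limit="$1" 'NF == 3 && $3 + 0 > limit + 0 { print }'
}

# Usage (samples.txt is an assumed capture of the output above):
#   flag_stalls 15 < samples.txt
```

On the sample data, a limit of 15 keeps only the three rows where throughput drops to between 108 and 187 queries per second while Threads_running jumps to between 24 and 31.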

Configuring pt-stalk
  Set THRESHOLD=15
  Set VARIABLE=Threads_running
  Then start a screen session, and run pt-stalk as root
  You may need to install and enable:
    GDB for backtraces (wait analysis)
    Oprofile for server profiling
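
The idea of triggering when a variable crosses a threshold can be sketched as a small filter. This is not pt-stalk's implementation, only a simplified stand-in, assuming one Threads_running value per line on stdin. The function name first_trigger and its CYCLES argument (how many consecutive samples must exceed the threshold) are hypothetical.

```shell
# first_trigger THRESHOLD CYCLES
# Reads one numeric sample per line on stdin. Prints the 1-based index of
# the sample at which the value has exceeded THRESHOLD for CYCLES samples
# in a row, then stops. Prints nothing if that never happens.
first_trigger() {
  awk -v threshold="$1" -v cycles="$2" '{
    if ($1 + 0 > threshold + 0) { run++ } else { run = 0 }
    if (run >= cycles + 0) { print NR; exit }
  }'
}

# Usage (values.txt is an assumed file of one Threads_running value per line):
#   first_trigger 15 1 < values.txt
```

Requiring more than one consecutive sample is one possible way to avoid firing on a single brief spike, at the cost of starting data collection slightly later.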

Looking At The Data

Using pt-sift

Case Study

Thanks!
  Contact me at baron@percona.com
  We can help with all your MySQL needs! Visit http://www.percona.com/mysql-support/
  Contact sales at http://www.percona.com/