Using Boosted Decision Trees to Separate Signal and Background

Using Boosted Decision Trees to Separate Signal and Background in $B \to X_s \gamma$ Decays
James Barber, University of Massachusetts, Amherst (jbarber@student.umass.edu)
16 August, 2006
James Barber p.1/19

PEP-II/BaBar at SLAC

BaBar Detector

$e^+ e^- \to \Upsilon(4S) \to B\bar{B}$
Beam energies: 9 + 3.1 GeV = 12.1 GeV in the lab frame, 10.58 GeV in the center-of-mass frame.
$\Upsilon(4S)$ resonance: 10.58 GeV
$2 \times m_B = 10.558$ GeV
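The lab-frame and center-of-mass energies above can be cross-checked with a few lines of arithmetic. This is a minimal sketch, assuming a head-on collision and neglecting the electron mass, so that $\sqrt{s} = 2\sqrt{E_1 E_2}$. The function name `cm_energy` is hypothetical.

```python
import math

def cm_energy(e_high, e_low):
    """Center-of-mass energy (GeV) of two head-on beams, particle masses neglected."""
    return 2.0 * math.sqrt(e_high * e_low)

# Rounded beam energies quoted on the slide.
lab_total = 9.0 + 3.1
rounded = cm_energy(9.0, 3.1)

# Beam energies printed on the event display later in the talk (HER, LER).
display = cm_energy(8.990, 3.112)

print(f"lab-frame total:            {lab_total:.1f} GeV")
print(f"sqrt(s), rounded energies:  {rounded:.3f} GeV")
print(f"sqrt(s), display energies:  {display:.3f} GeV")
```

With the rounded energies the result falls slightly below 10.58 GeV; the HER and LER values from the event display reproduce the quoted figure to within rounding.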

Decay of a B meson

$b \to s\gamma$
Quark content: $B^0$: $\bar{b}d$, $B^+$: $\bar{b}u$, $\bar{B}^0$: $b\bar{d}$, $B^-$: $b\bar{u}$
The b quark can decay to an s quark via this process: Radiative Penguin Decay.
Predicted and experimental branching fractions:
$\mathcal{B}_{theo}(B \to X_s \gamma) = (3.61 \pm 0.49) \times 10^{-4}$
$\mathcal{B}_{exp}(B \to X_s \gamma) = (3.55 \pm 0.26) \times 10^{-4}$
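How closely the predicted and measured branching fractions agree can be expressed as a pull. This is an illustrative sketch only, assuming the two uncertainties are independent and Gaussian so that they add in quadrature; the slide itself does not make this comparison, and the function name `pull` is hypothetical.

```python
import math

def pull(value_a, err_a, value_b, err_b):
    """Difference between two values in units of their combined uncertainty."""
    combined = math.sqrt(err_a ** 2 + err_b ** 2)
    return (value_a - value_b) / combined

# Branching fractions in units of 1e-4, as quoted on the slide.
theory, theory_err = 3.61, 0.49
experiment, experiment_err = 3.55, 0.26

agreement = pull(theory, theory_err, experiment, experiment_err)
print(f"theory - experiment = {agreement:.2f} combined standard deviations")
```

A pull well below one means the two numbers agree within their uncertainties, which is why smaller uncertainties are needed before any deviation could show up.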

How the detector can tell the difference
IT CAN'T!!
Separation must be done in software.
[Event display: The PEP-II/BaBar B-Factory, Run 2405635, HER: 8.990 GeV, LER: 3.112 GeV]

Expected amounts of Signal and Background
[Plot: Photon Energy Spectrum (simulation), Events / 20 MeV versus $E^*_\gamma$ (GeV) from 1.6 to 3.2 GeV, with curves for expected continuum, expected $B\bar{B}$, and expected signal]
For $10^2$ to $10^3$ signal events there are $10^5$ to $10^6$ background events!
3 or 4 orders of magnitude worth of background must be suppressed to get a signal.

Practice on Fake Data!
Monte Carlo (MC) Simulated Data
Event variables:
Life-like variables: leptonmomentumcm, costhetagamma, econe1, econe2, ..., econe18, egammab
Type and class variables: type, class

Separating with Variables
Pick a variable with high separation power.
Divide the data in a nice way.
[Plot: nice, fake data; Signal and Background histograms over 0 to 10, split into rejected and accepted regions]

Separating with Variables
No Such Variable!
[Plot: leptonmomentumcm (GeV) from 1 to 2.6 GeV for Signal and Background]
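The single-variable cut shown on the fake-data plot can be sketched in a few lines. The event values, the threshold of 4.0, and the function name `accept` are all invented for illustration; they are not taken from the analysis.

```python
# Toy events: (value of the separating variable, true class). Invented numbers.
events = [
    (1.0, "background"), (2.0, "background"), (3.5, "background"),
    (4.5, "signal"), (5.5, "background"), (6.0, "signal"), (7.5, "signal"),
]

def accept(events, threshold):
    """Keep the events whose variable is at or above the cut value."""
    return [(x, cls) for x, cls in events if x >= threshold]

accepted = accept(events, threshold=4.0)
n_sig = sum(1 for _, cls in accepted if cls == "signal")
n_bkg = sum(1 for _, cls in accepted if cls == "background")
print(f"accepted: {n_sig} signal, {n_bkg} background")
```

Even in this toy sample one background event survives the cut. When the signal and background distributions overlap as much as they do for leptonmomentumcm, no single threshold gives a clean separation, which motivates combining many variables.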

Enter: Boosted Decision Trees (BDTs)
BDTs are an advanced (complicated) method of separation that learns to distinguish between signal and background events.
BDTs need to be taught to separate signal and background: they are trained on MC events.
Train on an event sample representative of the data range.
All training events form the root node of a BDT.
[Diagram: MC Data, MC Events, Root Node]

How a Decision Tree works
Start with some number of Monte Carlo events in a root node.
Pick a variable/value combination to separate the events.
If a node has > specified purity, or < specified number of events, stop separating.
Otherwise, continue separation.
Leaf nodes are classified as Signal or Background.
[Diagram: example tree. The root node (S/B 52/48) splits on PMT Hits (< 100, ≥ 100) into B 4/37 and S/B 48/11. S/B 48/11 splits on Energy (< 0.2 GeV, ≥ 0.2 GeV) into S/B 9/10 and S 39/1. S/B 9/10 splits on Radius (< 500 cm, ≥ 500 cm) into S 7/1 and B 2/9.]
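The tree-growing steps listed above can be sketched as a short recursive function. This is a simplified illustration, not the code used in the analysis: the slide does not say how the variable/value combination is chosen, so the Gini impurity is assumed here, events are unweighted, and all names and toy values (PMT hits, energy) are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels (0 = background, 1 = signal)."""
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(rows, labels):
    """Return the (variable index, cut value) with the lowest summed child impurity."""
    best, best_score = None, float("inf")
    for var in range(len(rows[0])):
        for cut in sorted({r[var] for r in rows}):
            below = [y for r, y in zip(rows, labels) if r[var] < cut]
            above = [y for r, y in zip(rows, labels) if r[var] >= cut]
            if not below or not above:
                continue  # a split must put events on both sides
            score = len(below) * gini(below) + len(above) * gini(above)
            if score < best_score:
                best, best_score = (var, cut), score
    return best

def build_tree(rows, labels, max_purity=0.95, min_events=2):
    """Grow a tree. A leaf is "S" or "B"; a node is (var, cut, below, above)."""
    p_signal = sum(labels) / len(labels)
    purity = max(p_signal, 1.0 - p_signal)
    stop = purity > max_purity or len(rows) < min_events
    split = None if stop else best_split(rows, labels)
    if split is None:
        return "S" if p_signal >= 0.5 else "B"
    var, cut = split
    lo = [(r, y) for r, y in zip(rows, labels) if r[var] < cut]
    hi = [(r, y) for r, y in zip(rows, labels) if r[var] >= cut]
    return (var, cut,
            build_tree([r for r, _ in lo], [y for _, y in lo], max_purity, min_events),
            build_tree([r for r, _ in hi], [y for _, y in hi], max_purity, min_events))

def classify(tree, row):
    """Follow the cuts from the root down to a leaf."""
    while not isinstance(tree, str):
        var, cut, below, above = tree
        tree = below if row[var] < cut else above
    return tree

# Toy training sample: (PMT hits, energy in GeV), with invented values.
rows = [(50, 0.1), (80, 0.3), (150, 0.1), (120, 0.5), (200, 0.4), (90, 0.6)]
labels = [0, 0, 0, 1, 1, 0]
tree = build_tree(rows, labels)
print(tree)
print(classify(tree, (130, 0.5)), classify(tree, (60, 0.5)))
```

The stopping rule mirrors the slide: a node becomes a leaf once its purity exceeds the threshold or it holds too few events, and the leaf takes the label of its majority class.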

How a BDT works
Misclassified events are given boosted weights.
A new root node is made, and a new tree generated.
500 or 1000 trees are made in this way, with boosting after each classification.
A BDT learns from its mistakes!
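The reweighting step can be illustrated with one round of an AdaBoost-style update. The slide does not name the boosting algorithm, so AdaBoost is an assumption here, and `boost_weights` is a hypothetical helper rather than part of any package.

```python
import math

def boost_weights(weights, misclassified):
    """One AdaBoost-style round: scale up misclassified events, then renormalize.

    Returns the new weights and the boost strength alpha.
    """
    total = sum(weights)
    err = sum(w for w, wrong in zip(weights, misclassified) if wrong) / total
    if err <= 0.0 or err >= 0.5:
        # A perfect tree or one no better than guessing gives nothing to boost.
        return list(weights), 0.0
    alpha = math.log((1.0 - err) / err)
    boosted = [w * math.exp(alpha) if wrong else w
               for w, wrong in zip(weights, misclassified)]
    norm = sum(boosted)
    return [w / norm for w in boosted], alpha

# Four equally weighted events; the tree got the last one wrong.
weights = [0.25, 0.25, 0.25, 0.25]
misclassified = [False, False, False, True]
new_weights, alpha = boost_weights(weights, misclassified)
print(new_weights, alpha)
```

After the update the misclassified event carries half of the total weight, so the next tree is pushed to classify it correctly.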

Testing a forest of BDTs
All events not used for training are used for testing.
These events are run through every tree we created in training.
Every time an event is classified as signal, its $N_{signal}$ is incremented.
Calculate a likelihood, $l = N_{signal} / N_{trees}$.
Background tends towards 0, signal tends towards 1.
Plot likelihood to see separation.
[Diagram: MC Data and Testing Data; MC Events and Testing Events]
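The likelihood defined above is simply the fraction of trees that vote "signal". A minimal sketch follows, in which each trained tree is replaced by a hypothetical stand-in (a single cut on an invented `energy` value) so that the example is self-contained.

```python
def likelihood(event, trees):
    """l = N_signal / N_trees: the fraction of trees classifying the event as signal."""
    n_signal = sum(1 for tree in trees if tree(event))
    return n_signal / len(trees)

# Hypothetical stand-ins for trained trees: each returns True for "signal".
trees = [lambda e, cut=c: e["energy"] >= cut for c in (1.8, 2.0, 2.2, 2.4)]

signal_like = likelihood({"energy": 2.5}, trees)      # passes every stand-in tree
ambiguous = likelihood({"energy": 2.1}, trees)        # passes two of the four
background_like = likelihood({"energy": 1.5}, trees)  # passes none
print(signal_like, ambiguous, background_like)
```

With 500 or 1000 real trees the same ratio takes many more distinct values between 0 and 1, which is what the likelihood histograms display.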

Likelihood Values
[Plot: Signal and Background Separation; likelihood from 0 to 1 for Signal, BBbar, and Continuum]

Determining separation quality
Calculate a Figure of Merit, Q:
$Q = \frac{S}{\sqrt{S + B' + C'}}$
$B' = B (1 + f B_{error})$
$C' = \frac{1}{f} C$
S: sum of signal events selected
B: sum of $B\bar{B}$ background events selected
C: sum of continuum events selected
f: on-peak percentage (0.9)
$B_{error}$: accounts for uncertainty in the exact amount of $B\bar{B}$ background in data
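The figure of merit can be evaluated with a small helper. This sketch assumes one reading of the slide, namely $Q = S / \sqrt{S + B' + C'}$ with $B' = B(1 + f B_{error})$ and $C' = C / f$; the exact form of the two correction terms is uncertain in the transcript, and the event counts and the `b_error` value below are invented.

```python
import math

def figure_of_merit(s, b, c, b_error, f=0.9):
    """Q = S / sqrt(S + B' + C') under the assumed reading of the slide.

    s, b, c  -- selected signal, BBbar, and continuum events
    b_error  -- fractional uncertainty on the BBbar background (invented here)
    f        -- on-peak fraction, 0.9 on the slide
    """
    b_prime = b * (1.0 + f * b_error)
    c_prime = c / f
    return s / math.sqrt(s + b_prime + c_prime)

# Invented counts, for illustration only.
q_clean = figure_of_merit(s=400, b=100, c=100, b_error=0.1)
q_dirty = figure_of_merit(s=400, b=1000, c=1000, b_error=0.1)
print(f"Q with little background: {q_clean:.2f}")
print(f"Q with more background:   {q_dirty:.2f}")
```

Whatever the exact correction terms, Q grows when the selection keeps signal and falls when background leaks through, so maximizing it rewards better separation.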

Determining separation quality
[Plot: Efficiencies from 0 to 1 versus EGammaStar (GeV) from 2 to 2.8 GeV for signal, BBbar background, and continuum]
Calculate the efficiency of the separation: take the total events kept by the selection algorithm and divide by the total starting events.
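The efficiency calculation is a single ratio per event category. The counts below are invented for illustration, and `efficiency` is a hypothetical helper name.

```python
def efficiency(n_kept, n_start):
    """Fraction of the starting events that the selection keeps."""
    if n_start == 0:
        raise ValueError("no starting events")
    return n_kept / n_start

# Invented counts per category: (kept by the selection, total before selection).
counts = {
    "signal": (450, 1000),
    "BBbar": (30, 10000),
    "continuum": (80, 100000),
}
efficiencies = {name: efficiency(kept, start) for name, (kept, start) in counts.items()}
for name, eff in efficiencies.items():
    print(f"{name:10s} {eff:.4f}")
```

Repeating the calculation in bins of photon energy gives curves like those on the efficiency plot: high for signal and close to zero for both backgrounds is the desired outcome.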

What I did
Get this all working!
Find the parameter configuration which gives the maximum Figure of Merit, Q.
Some parameters tested + results (749,683 total events):

Training events   Trees   MinEvents/node   Cuts   Q
350,000           1,000   50               50     13.99
350,000           500     50               50     13.81
100,000           1,000   50               50     18.21
100,000           1,000   100              50     18.18

And the winner is...
Training events: 100,000, Trees: 1,000, MinEvents/node: 50, Cuts: 100, Q = 18.37
[Plot: Signal and Background Separation; likelihood from 0 to 1 for Signal, BBbar, and Continuum]
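Choosing the winner amounts to taking the maximum Q over the configurations that were tried. The sketch below uses only the five configurations and Q values quoted on the slides; the dictionary keys are hypothetical names, and the talk does not describe how the scan was actually run.

```python
# Configurations and Q values as quoted on the slides.
results = [
    {"training": 350_000, "trees": 1000, "min_events": 50, "cuts": 50, "Q": 13.99},
    {"training": 350_000, "trees": 500, "min_events": 50, "cuts": 50, "Q": 13.81},
    {"training": 100_000, "trees": 1000, "min_events": 50, "cuts": 50, "Q": 18.21},
    {"training": 100_000, "trees": 1000, "min_events": 100, "cuts": 50, "Q": 18.18},
    {"training": 100_000, "trees": 1000, "min_events": 50, "cuts": 100, "Q": 18.37},
]

# The best configuration is the one with the largest figure of merit.
best = max(results, key=lambda r: r["Q"])
print(best)
```

In these numbers the largest change in Q comes from the number of training events, while the tree count, minimum events per node, and number of cuts shift it by a few tenths at most.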

Conclusion
Better signal and background separation reduces uncertainties.
Important to make precision measurements.
Increase the sensitivity for new physics.
Possible to have a previously unseen heavy particle in the penguin loop: H?