Simultaneous Experimentation With More Than 2 Projects


Alejandro Francetich
School of Business, University of Washington Bothell
May 12, 2016

Abstract

A researcher has $n > 2$ projects she can undertake; one and only one of these projects can succeed, but there is uncertainty about which one will work out. She may experiment on any subset of the $n$ projects over any interval of time. Each additional project undertaken entails a cost, but simultaneous experimentation generates more data. Due to the complexity and intractability of the problem, we cannot hope for a closed-form, complete solution. Instead, we present the numerical solution for the case $n = 3$. Provided the cost is not too high, or the researcher is sufficiently patient, the optimal research strategy is as follows. If the researcher is sufficiently confident about a given project, she takes on this favored project alone. If she is sufficiently confident about which project will not work out, but not so much about which of the other two will, she takes on the latter two simultaneously. Finally, when she is sufficiently unsure about the projects, she takes on all three at once. Continued failure on a project pushes confidence towards the other projects. But the researcher does not give up on the failing project; rather, she takes on other projects as well despite the higher costs and knowing that all but one of them are doomed to fail. A conjecture regarding the structure of the optimal strategy for the general problem is provided.

Keywords: Experimentation, two-armed bandits, multi-choice bandits, negatively correlated arms, Poisson process

JEL Classification Numbers: D83, D90

Email address: aletich@uw.edu. This work features research undertaken while I was at the Decision Sciences Department at Bocconi University as a postdoctoral fellow. I am deeply indebted to David Kreps, Alejandro Manelli, Pierpaolo Battigali, Massimo Marinacci, and Juan Camilo Gomez for their support, guidance, and encouragement. I gratefully acknowledge financial support from ERC advanced grant 324219. Any remaining errors and omissions are all mine.

1 Introduction

Imagine there is an archipelago of islands, and a treasure ship is sunk within it. An explorer is after the treasure. The treasure is known to be buried on one of the islands, but not where exactly. The explorer can organize an expedition to one island at a time, or she can organize simultaneous expeditions to multiple islands. It is more costly to set up multiple expeditions, and all but one of said expeditions are doomed to fail; but simultaneous expeditions can cover more ground faster.

In standard experimentation problems, decision makers are allowed to experiment on at most one project at a time. In the context of binary problems, Francetich (2016) shows that, when decision makers have the option, experimenting on more than one project simultaneously is beneficial even if it is known that only one of the projects is fruitful ex post. For instance, an academic research question may be true or false, and the cause of a disease may be a virus or bacteria.

But what if there are more than 2 islands in the archipelago? The problem becomes intractable or far too cumbersome. With $n$ projects, the number of possible sets of projects to undertake is $2^n$. Moreover, the state space of the problem, the simplex of posterior beliefs, is multidimensional. Thus, the Bellman equation and pasting conditions involve partial differential equations, and the regions over which the decision maker switches from one set of projects to the other are characterized by surfaces rather than by simple cutoffs.

As a way around these technical and computational limitations, we present the numerical solution to the problem for the case $n = 3$.[1] Provided the cost is not too high, or the researcher is sufficiently patient, the optimal strategy dictates conducting research as follows. If the researcher is sufficiently confident about a single project, she takes on this favored project alone. If she is sufficiently confident about which project will not work out, but not so much about which one of the other two will, she takes on the latter two simultaneously. Finally, when she is sufficiently unsure about the projects, she takes on all three at once. Continued failure on a project pushes the researcher's confidence towards the neglected projects. But the researcher does not give up on the failing project; rather, she takes on other projects as well despite the higher cost of simultaneous research, and knowing that all but one of the projects are doomed to fail. For the general problem of $n > 2$ projects, a conjecture regarding the structure of the optimal strategy is provided.

[1] As another way around said issues, Francetich and Kreps (2014) explore heuristics in a similar problem.

2 The Model

There is a finite set of $n$ projects $X = \{x_0, \ldots, x_{n-1}\}$ on which a decision maker (henceforth, DM) can experiment. The DM allocates her time between the different subsets of $X$. The set of allocations of a divisible unit of time between the subsets of $X$ is $\mathcal{A} := S^{2^n - 1}$, the $(2^n - 1)$-dimensional simplex. Given a labelling of the subsets of $X$, $2^X = \{A_j \subseteq X : j = 0, \ldots, 2^n - 1\}$, the $j$-th component $\alpha_j$ of vector $\alpha \in \mathcal{A}$ denotes the fraction of time spent on $A_j$.

There is a (flow) research cost $c > 0$ to undertaking each project. Successes yield a gross reward of 1, and they arrive over time for project $i = 0, \ldots, n-1$ according to a Poisson process with arrival rate $\lambda I(\omega = \omega_i)$, where $I(\cdot)$ is the indicator function, $\lambda > c$ is the known arrival rate, and $\omega \in \Omega := \{\omega_0, \ldots, \omega_{n-1}\}$ is the ex-ante unobserved state of nature. In words, it is known that one and only one of these projects is profitable to undertake, and exactly how profitable it is, but there is uncertainty as to which one is the profitable one. Payoffs are discounted at the subjective rate $\rho > 0$.

The DM starts with a prior $\pi_0$ over the states of nature; this prior is a point in $\Pi := S^{n-1}$, the $(n-1)$-dimensional simplex. If $\pi \in \Pi$ represents the beliefs of the DM, her expected immediate payoff from experimenting on subset $A \subseteq X$ for a time interval of length $\Delta > 0$ is:

$$\lambda \Delta \sum_{i=0}^{n-1} \pi_i I(x_i \in A) - c \Delta \, \#A.$$

In addition, she observes whether any successes arrive over $\Delta$ for each of the projects $x \in A$. In particular, by working on a single project, she cannot distinguish between the event of an arrival for one of the other projects and the event of failure of arrival altogether.

Let $\pi_t = (\pi_{0,t}, \ldots, \pi_{n-1,t})$ denote the period-$t$ posterior. At any moment, observing an arrival makes the posterior jump to 1 for the successful project and to 0 for the rest. By spending time on all projects, either nothing new is learned, or the model uncertainty is resolved immediately. This is due to the symmetry in arrival rates; the event of failure of arrival is equally likely for all of the projects.
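As a small illustration of the expected immediate payoff above, the following Python sketch enumerates the $2^n$ subsets of projects and evaluates $\lambda \Delta \sum_i \pi_i I(x_i \in A) - c \Delta \#A$ for each. The function names and the prior `(0.5, 0.3, 0.2)` are hypothetical choices made for this example; only $\lambda = 0.75$, $c = 0.3$, and $\Delta = 0.01$ echo values used later in the text.

```python
from itertools import combinations


def all_subsets(n):
    """Return the 2**n subsets of the project indices {0, ..., n-1}."""
    return [frozenset(s) for k in range(n + 1) for s in combinations(range(n), k)]


def expected_flow_payoff(belief, subset, lam, cost, dt):
    """lam*dt*sum_{i in subset} belief[i] - cost*dt*|subset| (expected immediate payoff)."""
    success_mass = sum(belief[i] for i in subset)  # probability the good project is in subset
    return lam * dt * success_mass - cost * dt * len(subset)


if __name__ == "__main__":
    belief = (0.5, 0.3, 0.2)  # illustrative prior, not taken from the paper
    for subset in all_subsets(3):
        payoff = expected_flow_payoff(belief, subset, lam=0.75, cost=0.3, dt=0.01)
        print(sorted(subset), round(payoff, 6))
```

This comparison is purely myopic: it ignores the information generated by working on several projects at once, which is what the dynamic problem adds.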
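The more interesting dynamics take place when the DM spends time on non-empty proper subsets of $X$, namely, when she works on some but not all projects. Given $\alpha \in \mathcal{A}$, let $\alpha^i$ denote the fraction of time spent on project $i$, be it exclusively or as part of a larger set of projects:

$$\alpha^i = \sum_{j : x_i \in A_j} \alpha_j.$$

If no arrival results over $[t, t + \Delta t)$, the posterior for project $x_i$ follows from Bayes' rule:

$$\pi_{i,t+\Delta t} = \frac{\pi_{i,t} \, e^{-\alpha^i \lambda \Delta t}}{\sum_{k=0}^{n-1} \pi_{k,t} \, e^{-\alpha^k \lambda \Delta t}}.$$

When the DM devotes the whole interval to a single set $A$, so that $\alpha^k = 1$ for $x_k \in A$ and $\alpha^k = 0$ otherwise, the denominator becomes $\left(\sum_{k : x_k \in A} \pi_{k,t}\right) e^{-\lambda \Delta t} + \sum_{k : x_k \notin A} \pi_{k,t}$. As $\Delta t$ shrinks, we obtain:

$$\dot{\pi}_{i,t} = -\lambda \pi_{i,t} \left( \alpha^i - \sum_{k=0}^{n-1} \alpha^k \pi_{k,t} \right),$$

which, for a project $x_i \in A$ when the DM works on the single set $A$, reads $\dot{\pi}_{i,t} = -\lambda \pi_{i,t} \left(1 - \sum_{k : x_k \in A} \pi_{k,t}\right)$. While working unsuccessfully on some but not all of the projects, the DM becomes progressively pessimistic about them and optimistic about the neglected ones. See Figure 1.

The environment is stationary, and the state variable of the problem is the belief of the DM, $\pi \in \Pi$. Let $w : \Pi \to \mathbb{R}$ denote the (optimal, average) value function; $w$ satisfies the Bellman equation stated below.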
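The no-arrival belief update can be sketched in a few lines of Python. The sketch assumes the standard Bayes form in which each prior $\pi_i$ is weighted by $e^{-\alpha^i \lambda \Delta t}$, the probability of seeing no arrival when project $i$ is the profitable one, and then normalized over all projects; this coincides with the posterior in the text when the DM works on a single set. Representing an allocation as a dictionary from sets of project indices to time fractions is a choice made here for illustration, as are the function names.

```python
import math


def time_shares(alpha, n):
    """Per-project time fractions alpha^i, given alpha: {frozenset of indices: fraction}."""
    return [sum(w for subset, w in alpha.items() if i in subset) for i in range(n)]


def update_no_arrival(belief, alpha, lam, dt):
    """Posterior after observing no arrival over an interval of length dt."""
    shares = time_shares(alpha, len(belief))
    # Weight each prior by the probability of no arrival if that project is the good one.
    weights = [p * math.exp(-a * lam * dt) for p, a in zip(belief, shares)]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    prior = [0.5, 0.3, 0.2]  # illustrative prior, not taken from the paper
    # Working on project 0 alone: belief in project 0 falls, the others rise.
    print(update_no_arrival(prior, {frozenset({0}): 1.0}, lam=0.75, dt=1.0))
    # Working on all three projects: failure is uninformative, beliefs are unchanged.
    print(update_no_arrival(prior, {frozenset({0, 1, 2}): 1.0}, lam=0.75, dt=1.0))
```

The second call illustrates the remark in the text that, when all projects are undertaken, either nothing new is learned or the uncertainty is resolved at once.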

Bellman equation:

$$w(\pi) = \max_{\alpha \in \mathcal{A}} \left\{ \sum_{j=0}^{2^n - 1} \alpha_j \left( \lambda \Delta \sum_{i=0}^{n-1} \pi_i I(x_i \in A_j) - c \Delta \, \#A_j + \frac{E_{A_j, \pi}\left[ C(w, \nabla w, \pi) \right]}{\rho} \right) \right\},$$

where $C$ is the continuation value of the problem, which depends on the distribution of posteriors and on the value function and its gradient. The optimal strategy is a stationary strategy, recommending an allocation of time $\alpha \in \mathcal{A}$ as a function of the state $\pi \in \Pi$.

Figure 1: Evolution of posteriors when the DM works on the projects in set $A \subseteq X$. Panel (a): the DM works on $A = \{x_0\}$. Panel (b): the DM works on $A = \{x_0, x_1\}$. The curved lines pointing to the corners represent the jump in the posterior in the event of success. The straight lines represent the gradual updating of beliefs while no successes are observed.

3 Numerical Solution for n = 3

With $n = 3$ projects, the DM can spend her time on 8 different research agendas: $\emptyset$ (namely, doing no research at all), $\{x_0\}$, $\{x_1\}$, $\{x_2\}$, $\{x_0, x_1\}$, $\{x_0, x_2\}$, $\{x_1, x_2\}$, and $X$. To compute the optimal strategy numerically, we transform the control problem into a discrete-time programming problem and adjust the length of the time period. We also discretize the state space. The length of the time period and the fineness of the state space, as well as the arrival rate, the discount rate, and the cost, are the parameters of the problem.

Figure 2 presents the optimal strategy computed using MATLAB, specifying a state space of $200 \times 200$ points and a length of time of $\Delta = 0.01$.[2] The arrival rate is always $\lambda = 0.75$; the different subfigures depict the optimal strategy for different values of $\rho$ and $c$. The axes represent the probabilities $\pi_0$ and $\pi_1$, respectively.[3] The different shaded areas of the (lower) triangle represent different recommended subsets of projects given the posteriors. The subsets are color coded as follows: grey = $\emptyset$; blue = $\{x_0\}$; yellow = $\{x_1\}$; red = $\{x_2\}$; green = $\{x_0, x_1\}$; purple = $\{x_0, x_2\}$; orange = $\{x_1, x_2\}$; and white = $X$.

[2] The MATLAB code is available from the author upon request.

[3] The triangles depicted in figure 2 are projections of the 2-dimensional simplex onto the $(\pi_0, \pi_1)$ plane. Such projections are easier to compute, and the resulting figures are practically identical.

Along the boundaries, the decision maker splits her time evenly between the subsets recommended on each of the corresponding neighboring regions. She must split her time in this way for the path of posteriors to be well defined. On the interior of these regions, beliefs evolve in different directions; by shifting back and forth, beliefs are pushed in a single direction.[4]

[4] Such splitting ensures what Klein and Rady (2011) call admissibility of the strategies. For more on admissibility in the binary version of the present problem, see Francetich (2016).

Figure 2a features the optimal strategy when $\rho = 0.1$ and $c = 0.3$. If the DM is sufficiently confident about a given project, namely if her prior is sufficiently close to one of the corners of the state space, she takes on the favored project alone. When the DM is sufficiently confident about which project will not work out, but not so much about the other two, she takes on the latter two simultaneously. Finally, when she is sufficiently unsure about the projects, she takes on all three at once: information is valuable to her, and the cost of each project is not too high.

Compare with figure 2b, where we have $\rho = 0.1$ but $c = 0.65$. Now, the DM works on at most two projects at once: she appreciates information, but it is too costly to take on all three projects simultaneously. Finally, figure 2c depicts the case $\rho = 100$ and $c = 0.7$. The cost is even higher, and the DM is far too impatient to appreciate the information that comes from experimenting. Thus, she takes on a single project if she is sufficiently confident about it, and otherwise gives up altogether.

Figure 3 describes the path of posteriors and the research dynamics under the strategy in figure 2a. Figure 3a reproduces figure 2a in the 2-dimensional simplex. Assume the prior falls in the region where the DM starts working on $x_0$ alone. While working unsuccessfully on it, her posterior starts moving towards the orange region (figure 3b); eventually, she takes on project $x_2$ as well (figure 3c). Continued failure now pushes the posterior gradually in the direction of the $(0, 1, 0)$ corner. When beliefs reach the frontier of the yellow and orange regions, the DM holds on to $x_0$ and splits her time evenly between $x_1$ and $x_2$ (figure 3d). Eventually, if no successes are observed, she becomes sufficiently unsure and takes on the third project as well. At this point, she works on all three projects at once until the winner is identified.

4 Conjecture for the General Case

Based on the analysis of section 3, and given the results in Francetich (2016), we pose the following conjecture for the structure of the optimal strategy for a generic $n \in \mathbb{N}$.
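The numerical approach of section 3 (discretize time and the belief simplex, then solve the resulting dynamic program) can be illustrated with a short value-iteration sketch in Python. This is not the author's MATLAB code. It rests on several assumptions made here for brevity: it computes a total discounted value rather than the average value $w$, it considers only pure choices of a subset (no time splitting along boundaries), it uses linear interpolation on a coarse triangular grid, and all names and default settings other than $\lambda = 0.75$, $\rho = 0.1$, and $c = 0.3$ are hypothetical.

```python
import itertools
import numpy as np


def solve_three_projects(lam=0.75, rho=0.1, cost=0.3, dt=0.1, N=20,
                         tol=1e-6, max_sweeps=5000):
    """Value iteration for n = 3 on a triangular grid over (pi0, pi1).

    Returns (beliefs, values, policy): an (M, 3) array of grid beliefs, the
    total discounted value at each one, and the chosen tuple of project indices.
    """
    beta = np.exp(-rho * dt)       # one-period discount factor
    hit = 1.0 - np.exp(-lam * dt)  # P(arrival within dt | project is the good one)
    ii, jj = np.meshgrid(np.arange(N + 1), np.arange(N + 1), indexing="ij")
    inside = ii + jj <= N
    gi, gj = ii[inside], jj[inside]
    beliefs = np.stack([gi, gj, N - gi - gj], axis=1) / N
    corners = [(N, 0), (0, N), (0, 0)]  # grid indices of the degenerate beliefs
    actions = [s for k in range(4) for s in itertools.combinations(range(3), k)]

    plans = []  # per-action quantities that do not change across sweeps
    for A in actions:
        on = np.array([float(k in A) for k in range(3)])
        mass = beliefs @ on                         # P(good project is in A)
        reward = (lam * mass - cost * len(A)) * dt  # expected immediate payoff
        stay = 1.0 - hit * mass                     # P(no arrival this period)
        post = beliefs * np.where(on > 0, 1.0 - hit, 1.0) / stay[:, None]
        # Barycentric (linear) interpolation weights on the triangular lattice.
        u = np.clip(post[:, 0] * N, 0.0, N)
        v = np.clip(post[:, 1] * N, 0.0, N - u)
        i = np.minimum(np.floor(u).astype(int), N - 1)
        j = np.minimum(np.floor(v).astype(int), N - 1)
        fu, fv = u - i, v - j
        lower = fu + fv <= 1.0
        up = (~lower).astype(int)
        w0 = np.where(lower, 1.0 - fu - fv, fu + fv - 1.0)
        w1 = np.where(lower, fu, 1.0 - fv)
        w2 = np.where(lower, fv, 1.0 - fu)
        plans.append((A, reward, stay, i, j, up, w0, w1, w2))

    V = np.zeros((N + 1, N + 1))  # entries outside the triangle stay at 0
    for _ in range(max_sweeps):
        Q = np.empty((len(actions), len(gi)))
        for a, (A, reward, stay, i, j, up, w0, w1, w2) in enumerate(plans):
            cont = w0 * V[i + up, j + up] + w1 * V[i + 1, j] + w2 * V[i, j + 1]
            jump = sum(beliefs[:, k] * V[corners[k]] for k in A)  # value after a success
            Q[a] = reward + beta * (hit * jump + stay * cont)
        new = Q.max(axis=0)
        converged = np.max(np.abs(new - V[gi, gj])) < tol
        V[gi, gj] = new
        if converged:
            break
    policy = [actions[a] for a in Q.argmax(axis=0)]
    return beliefs, new, policy
```

Calling `solve_three_projects()` returns, for each grid belief, the subset that maximizes the discretized Bellman right-hand side; coloring the grid points by that subset would produce a plot in the spirit of Figure 2, though a grid this coarse should not be expected to reproduce the paper's figures.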

Figure 2: Optimal strategy for $n = 3$. Panel (a): optimal strategy for $\rho = 0.1$ and $c = 0.3$. Panel (b): optimal strategy for $\rho = 0.1$ and $c = 0.65$. Panel (c): optimal strategy for $\rho = 100$ and $c = 0.7$.

Conjecture. Partition the parameter space into $n + 1$ different regions labeled $k = 0, 1, \ldots, n$. For parameters in region $k$, the optimal strategy recommends undertaking at most $k$ of the $n$ projects at a time. In region 0, the optimal strategy recommends never doing any research. For parameters in region 1, the state space $\Pi$ is partitioned into up to $n + 1$ regions (4 when $n = 3$); the $n$ outer regions represent the sets of beliefs where a single project is recommended, while the (possibly empty) inner region dictates when the DM should give up. For parameters in region $k = 2, \ldots, n$, the state space is partitioned into $\sum_{j=1}^{k} \binom{n}{j}$ regions. On each of these, a $j$-subset of $X$ is recommended, for $j = 1, \ldots, k$. Singletons are recommended in the neighborhood of the corners of the simplex. In the neighborhood of points with $j$ equal non-zero entries and $n - j$ zero entries, $j$-sets are recommended. For parameters in region $n$, the full set $X$ is recommended in the neighborhood of the point $\pi = (1/n, \ldots, 1/n)$.
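The region count in the conjecture, the sum of the binomial coefficients $\binom{n}{j}$ for $j = 1, \ldots, k$, is easy to tabulate. The Python sketch below does so and lists the subsets that would be recommended; the function names are hypothetical, and the count applies to parameter regions $k \geq 1$ without the additional "give up" region that may appear when $k = 1$.

```python
from itertools import combinations
from math import comb


def conjectured_region_count(n, k):
    """Number of belief regions, sum_{j=1}^{k} C(n, j), for parameter region k."""
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    return sum(comb(n, j) for j in range(1, k + 1))


def recommended_subsets(n, k):
    """All non-empty subsets of {0, ..., n-1} with at most k projects."""
    return [s for j in range(1, k + 1) for s in combinations(range(n), j)]


if __name__ == "__main__":
    for k in (1, 2, 3):
        print(k, conjectured_region_count(3, k), recommended_subsets(3, k))
```

For $n = 3$ this gives 3, 6, and 7 regions for $k = 1, 2, 3$: three singletons, then the three pairs, then the full set $X$.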

On the boundaries of the different regions of the state space, the optimal strategy recommends splitting time equally between the subsets recommended on each of the neighboring regions.

Figure 3: Belief and research dynamics under the optimal strategy for $n = 3$. Panel (a): optimal strategy from figure 2a. Panel (b): the DM takes on project $x_0$ alone. Panel (c): the DM takes on both $x_0$ and $x_2$. Panel (d): the DM keeps working on $x_0$ while alternating between $x_2$ and $x_1$; eventually, if no successes occur, she takes on all three projects. The arrows pointing to the corners represent the jump in the posterior in the event of success. The lines pointing inward represent the gradual updating of beliefs as the DM works unsuccessfully.

This strategy dictates conducting research as follows. Depending on the cost and discount-rate ranges, the DM takes on at most $k$ out of the total $n$ projects at a time. If the posterior falls sufficiently near a corner of the simplex, the DM is sufficiently confident about the corresponding project and focuses on it. For posteriors along the faces of the simplex and nearby, but sufficiently far from the corners, the DM takes on multiple projects at once, namely the ones corresponding to the given face of the simplex. If the cost and discount rate fall in region $n$, namely if the cost is sufficiently low or the DM is sufficiently patient, this strategy dictates working on all projects at once when the DM is sufficiently unsure about the projects, even though simultaneous research is more expensive and ultimately only one project can succeed.

The dynamics of beliefs and research when parameters fall in region $n$ are as follows. In the neighborhood of the corners of the belief simplex, the optimal strategy recommends focusing on the favored project. Thus, if the DM is sufficiently confident in a project, she starts working on it exclusively. As long as she does not encounter a success, she becomes gradually pessimistic about this project and gradually optimistic about the neglected ones. As her posterior moves inward in the state space, she progressively takes on additional projects one at a time (possibly alternating between sets of projects along the boundaries), without abandoning the failing project. Eventually, she takes on all of the projects simultaneously until the fruitful one is identified.

References

Francetich, A. (2016). Managing multiple research projects. Working Paper.

Francetich, A. and Kreps, D. (2014). Choosing a good toolkit: An essay in behavioral economics. Working Paper.

Klein, N. and Rady, S. (2011). Negatively correlated bandits. Review of Economic Studies 78:693-732.