Multiproduct Firms, Information, and Loyalty

Bharat N. Anand
Harvard Business School

Ron Shachar
Tel-Aviv University

Abstract

The information set of consumers who are uncertain about product attributes is a central ingredient in the analysis of marketing researchers and the strategies of marketing practitioners. Here, we suggest that the profile of a multiproduct firm is an important element in the information set about each one of its products. As an example of such profiles, Honda is known to produce gasoline-efficient cars, whereas Volvo is perceived to focus on safety. The model includes product differentiation and heterogeneous consumers who are uncertain about product attributes. As in previous studies, a consumer's information set for a product includes noisy signals she obtains about the product's attributes, and the parameters of her prior distribution. Unlike previous studies, her prior distribution of a product's attributes depends on the firm's profile (which is defined as the distribution of attributes across the firm's products). We show that by revising the information set in this way, one can explain interesting empirical regularities concerning consumer behavior. For example, the consumer is loyal to a multiproduct firm even when it does not offer a product that matches her preferences better than the products of competing firms. Our model also offers a parsimonious way of thinking about brand extension strategies and maps new channels of spillovers within a multiproduct firm. We estimate the model and test its implications using panel data on television viewing choices. The estimated model also allows for state dependence parameters and unobserved heterogeneity. Our structural estimates come from maximizing a simulated likelihood function and using importance sampling. The empirical results support the model and its implications. We find that the profile of a multiproduct firm is an important element in the information set of consumers. Finally, we illustrate the empirical bias in standard choice models that do not structure the information set as suggested here.

We are grateful to Dmitri Byzalov for outstanding research assistance, and to Luis Cabral, Zvi Eckstein, Tulin Erdem, Barry Nalebuff, Ariel Pakes, Peter Reiss, Manuel Trajtenberg, and seminar participants at Econometrics in Tel-Aviv, Marketing Science 2001, Harvard, and Yale for helpful comments.

Anand: Soldiers Field Road, Boston, MA 02163. Phone: (617) 495-5082; Fax: (617) 495-0355; email: banand@hbs.edu.
Shachar: Faculty of Management, Tel Aviv University, Tel Aviv, Israel, 69978. Phone: 972-3-640-6311; Fax: 972-3-640-5621; email: rroonn@post.tau.ac.il.

Key words: multiproduct firms, incomplete information, consumer loyalty, unobserved heterogeneity, brand extensions, television networks.

1 Introduction

The information set of consumers who are uncertain about product attributes is a central ingredient in the analysis of marketing researchers and the strategies of marketing practitioners. Here, we suggest that the profile of a multiproduct firm is an important element in the information set for each one of its products. [1] For example, consumers are likely to expect that any Volvo automobile is safer than most cars. We show that by revising the information set to include firms' profiles, one can explain interesting empirical regularities concerning consumer behavior. We test our model using a dataset on television viewing choices and find that the informational role of multiproduct firms is both statistically and behaviorally significant.

Outline of the Model

In today's economy, virtually no firm produces one product only. [2] In many cases, these firms have distinct profiles. Honda, for example, is known to produce gasoline-efficient cars, whereas Volvo is perceived to focus on safety. Previous studies of brand choice do not include a firm's profile in the information set. Yet these studies reveal key elements of the information set, such as word of mouth, media coverage, advertising, and previous experience with a product. [3] One avenue through which information flows within a multiproduct firm is revealed by Erdem (1998). She shows that consuming a product affects a consumer's information set about other products of the same firm. [4]

Although our model differs from previous ones in the structure of the information set, our setting is somewhat similar to many of these studies. Specifically, we model product differentiation and heterogeneous consumers who are uncertain about product attributes. Their utility is a function of the match between their tastes and product attributes. For example, families like vans more than singles do. In previous studies of multiproduct firms, each firm offers a portfolio of products at each point in time. In contrast, in the model studied here, at each point in time t there are J multiproduct firms, and each of them offers a new product. Thus, each product is consumed only once in the time frame of our study. [5]

[1] "profile (n.): A set of data portraying the significant features of something; a concise biographical sketch." (Merriam-Webster's Collegiate Dictionary, 2001).
[2] See Ogiba (1988) and Kesler (1989).
[3] See, for example, Eckstein, Horsky, and Raban (1998), Erdem and Keane (1996), Crawford and Shum (2000), Shachar and Anand (1998), and Anand and Shachar (2001).
[4] We highlight the differences between Erdem's approach and ours later on.
[5] Multiple appearances of a product serve studies that focus on dynamic learning from experience, but not ours, which focuses on the prior distribution. These differences express themselves in the data sets as well. We use data on purchases of multiple products of the same firm, while previous studies have used data on multiple purchases of the same product.

Following previous studies, we assume that consumers receive unbiased noisy signals on the attributes of each product. These signals come from previous experience with the good, media coverage, and other sources. Under these assumptions, a consumer's expected utility from a product is a function of these product-specific signals and her prior beliefs about product attributes. We depart from previous studies by hypothesizing that these prior beliefs depend on the profile of the multiproduct firm.

A firm's profile is characterized by (a) the mean of the attributes across all products that it offers and (b) the variance of these attributes. Hereafter, we refer to the first characteristic as the firm's image and to the inverse of the variance as the firm's precision. For example, if automobiles had only one attribute, miles per gallon, then the profile of any automaker would be completely characterized by the average miles per gallon across all its models and by the variance of this variable. For any given product, the consumer's prior distribution of its attributes will then depend on this overall profile. In particular, the expectation of the prior is equal to the match between the consumer's taste and the mean attribute of the firm.

We show that the purchase probability of a product is a function of (a) the match between the consumer's taste and the product attributes observed by the researcher, and (b) the match between the consumer's tastes and the firm's image. Notice that the purchase decision depends on the firm's profile even if the individual has not previously consumed any of this firm's products. The prediction that the purchase probability depends on the firm's profile differentiates the suggested model from previous ones. The model, and other testable implications, are presented in section 3, where we show that the effect of a firm's profile on the purchase probability is not fixed. This effect should be smaller for consumers who are relatively well-informed about a specific product. It is also expected to be smaller for better-known products. Finally, the effect should be smaller for firms with a diverse product profile.

Implications

The behavioral and managerial implications of the model are presented in detail in sections 3 and 6, and briefly here.

[5, continued] Although the setting of a firm offering a single product in any period might seem restrictive, it can be reinterpreted as a multiproduct firm offering a portfolio of products at a point in time, with t indexing the product categories.

First, the model introduces a new source of consumer loyalty. As mentioned above, the purchase probability depends on product attributes and on a firm's profile. While the first element is product specific, the second is common to all products of a given firm. This common element generates loyalty to multiproduct firms. This loyalty expresses itself in an individual's tendency to purchase a product from the firm whose image best fits her taste even when the specific product does not match her preferences better than the products of competing firms. (We call this phenomenon "excess loyalty.") For

example, since it is virtually impossible to know whether you like a book before you read it, individuals frequently make decisions based on a writer's style. Thus, a consumer who likes Ernest Hemingway's no-nonsense writing style on themes such as war, blood sports, and crime, and about heroes who confront tremendous odds with "grace under pressure," is likely to purchase his book Green Hills of Africa. This consumer would turn out to be surprised. Notice that although such behavior leads the individual to suboptimal choices in some cases, it is still optimal behavior given her uncertainty about product attributes.

A consumer's excess loyalty is explained in previous studies by including an unobserved individual-firm match parameter. Thus, while the traditional approach leaves the explanation of loyalty in a "black box," we present a behavioral foundation for such loyalty. Our explanation, based on the informational role of multiproduct firms, thus contrasts with the traditional statistical solution (unobserved heterogeneity) to explain excess loyalty. The behavioral and statistical sources of loyalty have different managerial implications. Note that our model includes both sources of loyalty (behavioral and statistical) as well as state dependence parameters. Each of these sources can generate consumer loyalty. In section 4, we demonstrate how these different sources can be distinguished using a panel dataset.

The second implication of the model relates to the issues of brand extensions and brand alliances. In section 6, we illustrate the consequences of extension decisions on a firm's market share. When considering an extension, a brand manager is typically concerned about the spillover effects on the firm's other products. It turns out that these spillover effects are richer than one might expect. Consider the following case. A firm adds a new product that does not change its image. While it may seem that such a change should not affect the market share of the firm's other products, we show that it does. The reason is that such a change increases the firm's precision, and as a result, consumers place less weight on the signals they receive on each product. Thus, our model offers a parsimonious way of thinking about brand extensions and brand alliances in terms of their effect on the mean and variance of a firm's profile, and maps new channels of spillovers.

Third, we also illustrate the empirical bias in standard choice models that results from not including the firms' profiles in the information set. Our Monte Carlo experiments, using a specific and reasonable set of parameters, reveal that this bias is significant. The estimate of the consumer taste parameter is downward biased by about 40%.

Empirical Application

Even though our model is relevant to several markets, it describes the media and entertainment industries particularly well. In the last decade the television, movie, music,

publishing, and new media industries have been growing rapidly. [6] The relevant characteristics of these industries, as described in detail later, are: (1) consumers are bound to be uncertain about product attributes (because of constant changes in product attributes); (2) there is high product differentiation; (3) heterogeneity in consumer preferences is large; and (4) multiproduct firms with distinct profiles happen to be the norm in these markets. [7] Thus, the setting of our model, while relevant to many industries, is especially germane here, and might be useful to researchers and practitioners seeking to understand consumer choices and firms' strategies.

Since the model is especially relevant to the entertainment industry, we test the model based on data from the television industry. By using a panel dataset on television viewing choices from 1995 and data on show attributes, we estimate the model and test its implications. In our application, the products are television shows and the major multiproduct firms are the four national television networks. [8] We describe the datasets in section 2 and estimate the model and test its implications in section 4.

The relevance of the model and of the various applications detailed above depends on its empirical validity. The results, presented in section 5, show that the data support the model. Specifically, our non-structural estimation finds that firms' images affect product viewing choices. It turns out that this effect is as important as the effect of product attributes themselves. This evidence suggests that the new source of information introduced in this study is both statistically significant and behaviorally important.

Our structural estimation, based on maximum simulated likelihood methods, reveals that these results hold even when all the restrictions of the model are imposed. For example, one restriction concerns the distribution of the unobserved signals on show attributes. This and other restrictions require multivariate integration of the likelihood function. To deal with the resulting complexity, we employ importance sampling.

[6] See the survey of technology and entertainment in the November 21, 1998 issue of The Economist.
[7] These include, among others, Disney, AOL, NBC, Penguin Books, Julia Roberts, and Harrison Ford. Disney, for example, is family-oriented and thus avoids controversial shows even on ABC, the television network that it acquired.
[8] Mankiw (1998) presents the network television industry as a good example of an industry with established multi-product firms. He, like others, refers to such firms as brands, and writes: "Establishing a brand name and ensuring that it conveys the right information is an important strategy for many businesses, including TV networks." Furthermore, he reproduces a New York Times article (September 20, 1996) that reads: "In television, an intrinsic part of branding is selecting shows that seem related and might appeal to a particular certain audience segment. It means 'developing an overall packaging of the network to build a relationship with viewers, so they will come to expect certain things from us,' said Alan Cohen, executive vice-president for the ABC-TV unit of the Walt Disney Company in New York." These quotes suggest that the television industry is an appropriate setting in which to examine the empirical questions at hand.

The structural estimation reveals substantial heterogeneity across viewers and networks in

the precision and/or number of signals they obtain about product attributes. For example, we find that people are better informed about shows on ABC and NBC. Our estimation also reveals that the informative role of firms' profiles is substantial. Specifically, when forming their expected utilities, individuals place the same weight on this information as on all the signals they receive about the products. Notice that this is consistent with the non-structural results. Using our structural estimates and two different measures of loyalty, we find that the informational source of loyalty is more important than that due to unobserved heterogeneity.

Related Literature

One of the closest studies to ours is Erdem (1998). She examines the extent to which a consumer's perception of product quality is affected by his experience with other products of the same firm. Her paper follows theoretical studies that have focused on the informational role of multiproduct firms, referred to there as umbrella brands (Wernerfelt [1988], Montgomery and Wernerfelt [1992], and Cabral [2001]). While these studies focus on experience goods, our model is relevant mostly to products that have at least one non-experience attribute. [9] A consumer who knows the attributes of such a product is assumed to be certain about her utility from the good. For example, many consumers buy a computer without testing it first. [10]

The difference between the focus of previous studies and ours expresses itself in the modeling approaches. Erdem shows that a consumer's perception of product quality depends on his experience with other products of the same firm. Our model predicts that the purchase decision depends on the firm's image even if the individual did not consume any of this firm's products previously. Indeed, in the relevant markets for our model, consumers often have an informed perception of a firm's offering even without direct experience. They may have gained this information through word of mouth or media coverage. For example, most consumers are aware of the differences between Japanese and Swedish automakers even without having driven any such cars.

The second, and related, difference between Erdem's study and ours is that her study focuses on the role of firms as signals of product quality. In our setting, the signals convey information both on the vertical attributes of a product and on its horizontal characteristics. [11]

[9] These types of products are sometimes called search goods.
[10] The products in our data set, television shows, cannot be categorized as pure search goods, because even a full description of the products cannot resolve all of the uncertainty that an individual has about her utility. However, it is clearly the case that information about the products' attributes decreases consumers' uncertainty. For example, the uncertainty that an individual faces about a show's attributes decreases when she knows its cast demographics. These attributes allow us to apply our model to the television data.
[11] The terms vertical and horizontal differentiation are often used in the industrial organization literature. While all consumers have either positive or negative tastes for the vertical attributes of a product, some have positive, and others negative, tastes for horizontal attributes.

Over

time, product diversity has been dramatically increasing in many markets. This factor is likely to intensify consumer uncertainty about the horizontal attributes of products. As a result, the informational role of firms as signals of these attributes has increased as well.

Sullivan (1990) is the first to present non-experimental evidence for spillovers in umbrella-branded products. She provides a methodology to analyze spillovers and uses it to measure two instances of spillover effects. Her study shows that negative spillovers resulted from the Audi 5000's problems with sudden acceleration and that positive spillovers resulted from Jaguar's first major model change in 17 years. Sullivan's approach elegantly exploits cases in which one can easily identify a unique event. Our methodology does not require such natural experiments. Furthermore, while she uses aggregate data, we use individual-level panel data that allow us to account for individual-specific effects.

The flow of information between a firm and its products has been examined using experimental data. Morrin (1999) shows that brand extensions can modify the perceived profile of a multiproduct firm. Simonin and Ruth (1998) and Park and Srinivasan (1994) demonstrate a similar phenomenon with respect to brand alliances.

In another study somewhat similar to ours, Moshkin and Shachar (2001) show that state dependence can be due to asymmetric information and search cost. The main difference between their study and the one presented here is in the aim of the study: they focus on state dependence, which is a specific aspect of loyalty. State dependence is defined as repeat purchase in sequential periods. We, on the other hand, focus on loyalty across any two periods (not necessarily sequential ones). This difference then expresses itself in the modeling approaches. [12]

A brief description of the datasets in section 2 is used to motivate the description of the model in section 3. There, we describe the structure of the consumer's information set and the implications of the model. Section 4 discusses estimation issues, and the results are presented in section 5. In section 6, we examine several applications of the model. Section 7 concludes.

[12] For example, while the information set in Moshkin and Shachar (2001) depends on the previous purchase, ours does not. Also, their model focuses on asymmetric information, while ours focuses on the informational role of multi-product firms.

2 Data

Although the setting of the model is not industry-specific, we start by presenting the data. This approach is intended to make the presentation of the model intuitive. Our data include television viewing choices, viewers' demographic characteristics, and show

(product) attributes. The first two, from Nielsen Media Research, are described first. Show attribute data, created specially for this study, are described subsequently.

2.1 The Nielsen Data

We obtained data on individuals' viewing choices and characteristics from Nielsen Media Research, which maintains a sample of over 5,000 households nationwide. [13] Nielsen installs a People Meter (NPM) for each television set in the household. The NPM uses a special remote control to record arrivals and departures of individual viewers, as well as the channel being watched on each television set. Although criticized frequently by the networks, Nielsen data still provide the standard measure of ratings for both network executives and advertising agencies.

Although the NPM is calibrated for measurements each minute, the data available to us provide quarter-hour viewing decisions, measured as the channel being watched at the midpoint of each quarter-hour block. The Nielsen data set records specific viewing choices for the four major networks only: ABC, CBS, NBC, and FOX.

We focus on viewing choices for network television during prime time, 8:00 to 11:00 PM, using Nielsen data from the week starting Monday, November 6, 1995. Thus, we observe viewers' choices in 60 time slots. Figure 1 provides the prime-time schedule for the four networks over this week. This study confines itself to East Coast viewers, to avoid problems arising from ABC's Monday night programming. [14] Finally, we eliminate from the sample viewers who never watched television during weeknight prime time and those younger than six years of age. From this group, we randomly selected individuals with a probability of 0.5, giving us 1675 viewers. The other 1556 are kept for predictive validation.

Nielsen also reports the age and the gender of each individual, and the income, education, cable subscription, and county size for each household. The definitions and summary statistics of the variables that we create appear in table 1.
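The slot arithmetic described above can be sketched as follows. Three prime-time hours give 12 quarter-hour blocks per night, and the 60 slots reported are consistent with five nights (an inference from 60 / 12, not a statement in the text). The sketch assumes a hypothetical per-minute channel log for one viewer on one night and takes the minute containing each block's midpoint as the recorded choice; the function name, the log format, and that midpoint convention are illustrative assumptions, not Nielsen's actual procedure.

```python
SLOT_MINUTES = 15            # length of one quarter-hour block
PRIME_TIME_MINUTES = 180     # 8:00 to 11:00 PM
SLOTS_PER_NIGHT = PRIME_TIME_MINUTES // SLOT_MINUTES  # 12 blocks per night


def quarter_hour_choices(minute_log):
    """Reduce a per-minute channel log (180 entries) to one choice per block.

    Assumption: the choice for a block is the channel recorded in the minute
    that contains the block's midpoint (minute 7 of each 15-minute block).
    """
    if len(minute_log) != PRIME_TIME_MINUTES:
        raise ValueError("expected one entry per prime-time minute")
    midpoint_offset = SLOT_MINUTES // 2
    return [
        minute_log[slot * SLOT_MINUTES + midpoint_offset]
        for slot in range(SLOTS_PER_NIGHT)
    ]


# Hypothetical viewer: ABC for the first 20 minutes, then NBC until 11:00 PM.
log = ["ABC"] * 20 + ["NBC"] * 160
choices = quarter_hour_choices(log)

# Five nights of 12 blocks each reproduces the 60 time slots.
total_slots = 5 * SLOTS_PER_NIGHT
```

In this hypothetical log the viewer switches channels during the second block, before that block's midpoint, so the second block is recorded as NBC and the five minutes of ABC viewing within it are not captured by the quarter-hour measure.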
[13] Using 1990 Census data, the sample is designed to reflect the demographic composition of viewers nationwide. The sample is revised regularly, ensuring, in particular, that no single household remains in the sample for more than two years.
[14] ABC features Monday Night Football, broadcast live across the country; depending on local starting and ending times of the football game, ABC affiliates across the country fill their Monday night schedule with a variety of other shows. Adjusting for these programming differences by region would unnecessarily complicate this study.

2.1.1 Show Attributes

We have coded the show attributes for the relevant week based on prior knowledge, publications about the shows, and viewing each one of them. Following previous studies, we categorize shows

based on their genre and their cast demographics. Rust and Alpert (1984) present five show categories (action drama, psychological drama, comedies, movies, and sport) and show that viewers differ in their preferences over these categories. We use the following categories: situation comedies, also called sitcoms (31 shows fall into this category), action dramas (10 shows), and romantic dramas (7 shows). The base group includes news magazines and sports events, which Goettler and Shachar (2002) found to be similar.

Shows were also characterized by their cast demographics. Shachar and Emerson (2000) demonstrate that the demographic match between an individual and a show's cast plays an important role in determining viewing choices. For example, younger viewers tend to watch shows with a young cast; older viewers prefer an older cast. We use the following categories: Generation-X, if the main characters in a show are older than 18 and younger than 34 (21 shows fall into this category); Baby Boomer, if the main show characters are older than 35 and younger than 50 (12 shows); Family, if the show is centered around a family (11 shows); African-American (7 shows); Female (15 shows); and Male (22 shows).

Table 2 and figure 2 illustrate the differences in show attributes across the four networks. One can see, for example, that FOX is more likely to air romantic dramas centered around Generation-X characters, whereas ABC is more likely to offer male-starring or family sitcoms catering to baby boomers. These statistics highlight the differences in the network profiles.

3 The Model

We start by describing the setting of the model. Then, we introduce the utility function and the information set of the individual. Last, we present several implications of the model setup.

3.1 The setup

There are $J$ multiproduct firms. In each period $t$, each firm offers a single product. Each product of firm $j$ is offered only once within the studied time frame. In the empirical example there are four television networks, with each network broadcasting only one show within each time slot $t$. Thus, in the empirical example $J = 4$; a period, $t$, is also called a time slot and lasts 15 minutes; and a product is a television show.

There are $I$ individuals, who are indexed by $i$. They face $J+1$ mutually exclusive and exhaustive alternatives. The $(J+1)$th alternative is the no-purchase option. In each period $t$, individual $i$ makes a choice, $C_{i,t}$, from among these $J+1$ options indexed by $j$. Thus, $C_{i,t} = j$

when individual $i$ chooses alternative $j$ at time $t$. Since each product is offered only once, individuals cannot learn from experience about product attributes within the studied time frame. [15] Although the setting of a firm offering a single product in any period might seem restrictive, it can be reinterpreted as a multiproduct firm offering a portfolio of products at a point in time, with $t$ indexing the product categories.

3.2 The utility

The utility from the first $J$ alternatives is:

\[ U_{i,j,t} = X_{j,t}\beta_i + (\eta_{j,t} + \varepsilon_{i,j,t}) + \alpha_{i,j} + \delta_{i,j,t}\, I\{C_{i,t-1} = j\} \qquad (1) \]

The first element of the utility represents the match between the product's attributes, $X_{j,t}$, and the individual's preferences, $\beta_i$. [16] The parameter vector $\beta_i$ is a function of observable and unobservable individual characteristics. The specifics of $X_{j,t}$ and $\beta_i$ for the television example are presented in detail in section 4.1. For example, the interaction $X_{j,t}\beta_i$ includes the element $ActionDrama_{j,t}\,(\beta^{Male}_{AD}\, Male_i)$, where $ActionDrama_{j,t}$ is a binary variable equal to one if the show on network $j$ in period $t$ is an action drama, and zero otherwise; and the binary variable $Male_i$ is equal to one if individual $i$ is male, and zero otherwise. If men like action dramas, the parameter $\beta^{Male}_{AD}$ would be positive.

The utility is also a function of the product's attributes not observed by the researcher. These are represented by the second element of the utility, $(\eta_{j,t} + \varepsilon_{i,j,t})$. Common unobserved effects are captured by the parameter $\eta_{j,t}$, while transitory and personal effects are represented by the random variable $\varepsilon_{i,j,t}$. Specifically, since some of the attributes are unobserved by the researcher, some of the match elements will be unobserved as well. The parameter $\eta_{j,t}$ can be thought of as the mean (across individuals) of these unobserved matches, and $\varepsilon_{i,j,t}$ can be thought of as the deviations from that mean. Another interpretation of the parameter $\eta_{j,t}$ is the quality of the product. Since the term "quality" might be misleading when describing television shows, we henceforth use the term "unexplained popularity" to describe this parameter. [17]

[15] Consequently, our setting is different from previous studies which examined the role of uncertainty on product attributes (Eckstein, Horsky, and Raban 1988, Erdem and Keane 1996, Erdem 1998, and Crawford and Shum 2000). In these studies, each product was offered multiple times within the studied time frame; therefore the researchers focused on the dynamic learning of consumers through their experience with the product. Here, instead, consumers can rely on the profile of the multiproduct firm to resolve their uncertainty on product attributes.
[16] The variable $X_{j,t}$ is a $K$-dimensional row-vector, and the parameter $\beta_i$ is a $K$-dimensional vector.
[17] In the industrial organization literature the element $X_{j,t}\beta_i$ is called the horizontal dimension of utility, and $\eta_{j,t}$ the vertical dimension.
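As a rough numerical illustration of Equation (1), the sketch below adds up its four terms for one individual, one firm, and one period. The function name, the argument names, and all numbers are hypothetical and chosen only for illustration; they are not estimates from the paper, and in the model $\varepsilon_{i,j,t}$ is a random draw rather than a fixed input.

```python
def utility(x_jt, beta_i, eta_jt, eps_ijt, alpha_ij, delta_ijt, prev_choice, j):
    """Evaluate the four terms of Equation (1) for individual i, firm j, period t."""
    # Observed match between product attributes and tastes: X_{j,t} * beta_i
    match = sum(x * b for x, b in zip(x_jt, beta_i))
    # State dependence enters only if firm j was chosen in period t-1
    state_dependence = delta_ijt if prev_choice == j else 0.0
    return match + (eta_jt + eps_ijt) + alpha_ij + state_dependence


# Hypothetical values: three binary show attributes and the matching tastes.
x_jt = [1, 0, 1]
beta_i = [0.5, -0.2, 0.3]

# Same product, evaluated for a viewer who was already on firm 2 ...
u_stay = utility(x_jt, beta_i, eta_jt=0.1, eps_ijt=0.0,
                 alpha_ij=0.2, delta_ijt=1.0, prev_choice=2, j=2)
# ... and for a viewer who was on a different firm in the previous period.
u_switch = utility(x_jt, beta_i, eta_jt=0.1, eps_ijt=0.0,
                   alpha_ij=0.2, delta_ijt=1.0, prev_choice=3, j=2)
```

With these inputs the two utilities differ by exactly the state dependence parameter, which is the only term in Equation (1) that depends on the previous period's choice.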

The other elements in the utility concern its dynamic features. Specifically, the third element of the utility, $\alpha_{i,j}$, represents the unobserved match between individual $i$ and firm $j$. This parameter is one of the sources of consumer loyalty to a multiproduct firm. It is the only element in the utility that does not change over time (and thus does not have an index $t$). It appears in individual $i$'s utility with each of the products offered by firm $j$. A positive $\alpha_{i,j}$ increases individual $i$'s propensity to purchase each one of firm $j$'s products. Earlier, we referred to this unobserved heterogeneity parameter as a "black box" explanation for loyalty because it represents a statistical (not behavioral) solution to account for consumer loyalty to a firm.

The fourth and last component of utility, $\delta_{i,j,t}\, I\{C_{i,t-1} = j\}$, represents the state dependence in choices. The indicator function $I\{\cdot\}$ is equal to one if the individual purchased the product offered by firm $j$ in the previous period. The parameter $\delta_{i,j,t}$ is a function of observable and unobservable individual characteristics, product attributes, and time. There are various reasons for state dependence. For example, after getting used to a firm's product, an individual might find it easier or cheaper to use the subsequent product offered by the same firm. Previous studies of television viewing choices find strong evidence of state dependence even when unobserved heterogeneity is accounted for. [18] Thus, we include this element in order to avoid misspecification of the utility. The specifics of $\delta_{i,j,t}$ for the television example are presented in detail in section 4.1. For example, $\delta_{i,j,t}$ includes the element $\delta^{Female}\,(1 - Male_i)$. If women were less inclined than men to flip channels, the parameter $\delta^{Female}$ would be positive.

Both $\alpha_{i,j}$ and $\delta_{i,j,t}$ lead to consumer loyalty. However, while the effect of state dependence is limited to two sequential periods, the unobserved individual-firm match leads to loyalty in any two periods.

[18] In the television industry this well-known phenomenon is called the lead-in effect. Darmon (1976) introduces the concept of channel loyalty and Horen (1980) estimates a lead-in effect, both using aggregate ratings models. Rust and Alpert (1984) use individual-level data to estimate an audience flow model, in which viewers are described as being in one of five states according to: whether the television was previously on or off; if it was on, whether it was tuned to the same channel as the current viewing option; and whether this option is the start or continuation of a show. Shachar and Emerson (2000) allow state dependence to vary across shows and across demographically defined viewer segments. Goettler and Shachar (2002) demonstrate that the cost of switching remains when unobserved heterogeneity is accounted for.

The utility from the outside (non-purchase) alternative is:

\[ U_{i,J+1,t} = Y^{\gamma}_{i,t}\gamma + (\eta_{J+1,t} + \alpha_{i,J+1} + \varepsilon_{i,J+1,t}) + \delta_{out,i}\, I\{C_{i,t-1} = J+1\} \qquad (2) \]

where $Y^{\gamma}_{i,t}$ is a vector of individual $i$'s observable characteristics, and the remaining elements are

analogous to those definedearlierforthej alternatives. 19 Thus, the outside utility is a function of the individual s observed and unobserved characteristics and state dependence. 3.3 Information set The individual is uncertain about X j,t and η j,t. The dynamic nature of the modern economy makes it difficult for consumers to stay informed about the attributes of all the alternatives. Today, product diversity is constantly increasing, and new products are introduced often. 20 The television industry experiences these changes as well. Each fall, the networks introduce new shows and change thetimeatwhichmanyveteran showsareaired. This,obviously,makes itdifficult for viewers to stay informed about the shows that appear on a specific time slot for all the networks. While some information on the attributes of television shows is available in daily newspapers, many other show attributes remain unclear. Uncertainty about X j,t and η j,t leads to uncertainty about (η j,t +X j,t β i ). Since this expression represents the contribution of product attributes to utility, we term it attribute utility. We denote this element as ξ i,j,t η j,t + X j,t β i. A consumer s information set includes (a) a prior distribution of products attributes, and (b) product-specific signals such as media coverage, word-of-mouth, and previous experience with the good. We depart from previous studies by hypothesizing that the prior beliefs depend on the profile of the multiproduct firm. Specifically, the prior distribution of individual i on ξ i,j,t is: 21 ξ i,j,t N(µ i,j, where, by definition, 22 1 ς µ i,j ) (3) µ i,j E t [ξ i,j,t ]=E t [η j,t ]+E t [X j,t ]β i (4) and 1 ς µ = V t [ξ i,j,t ] i,j The multiproduct firm s profile for individual i is characterized by two parameters: (1) µ i,j 19 These characteristics might change over time for reasons that would become clear in Section 4. 
20 For example, even in a small country like Israel, there are 3000 new products in the supermarkets every year (Ha aretz, November 1998). 21 We actually assume that both η j,t and X j,t follow a normal distribution, and present the resulting distribution for the attributes utility. 22 E t [ ] istheexpectedvalueacrosstimeslots. 13

and (2) ς µ i,j. We assume that E t [η j,t ], E t [X j,t ] and V t [ξ i,j,t ] are known to all consumers. In other words, while the individual is uncertain about product attributes, she knows the profile of each firm. For example, while most consumers do not know exactly the attributes of each automobile offered by Honda, they know that Honda tends to produce gasoline-efficient cars. 23 Consumers also get product-specific signals through word-of-mouth, media coverage, previous experience with the good, and advertising. Since we do not observe these signals, we can model them as a single signal instead of multiple signals without loss of generality. Specifically, the signal that individual i receives on the product offered by firm j at period t is: where S i,j,t = ξ i,j,t + ω i,j,t (5) ω i,j,t N(0, 1 ς ω i,j ) (6) The unbiased signal is assumed to be noisy, because none of these sources of information is precise. For example, even viewing previous episodes of a show does not provide exact information on the current one, because the focus of the show varies from week to week. While in one episode the plot is centered on romantic issues, the next one might be focused on career matters. We allow the precision of the signal, ς ω i,j,todiffer across individuals, i, andfirms, j. Differences in ς ω i,j across the firms may result from differences in the number of signals individuals receive, since the products of some firms simply may be more well-known than of others. Intuitively, if 1 ς =0(i.e., the signal is not noisy), then exposure to a signal resolves all uncertainty ω i,j about product attributes. If ς ω i,j > 0, by contrast, an individual who is exposed to such a signal would still not be sure about what the product attributes are. 24 23 Similarly, consider the attributes of newspapers (these may include, for example, the amount of space devoted to sensational news items or trash, and the amount of investigative reporting). 
While the level of each attribute may differ with each daily edition of the newspaper, it is still the case that The New York Times has an image that is quite different from The New York Post. In this example, a newspaper is the multiproduct firm and each issue of the newspaper is a product. Similarly, consider movie stars (where the movie star is the analog of the multiproduct firm, and each movie is a product). Although most movie stars have acted in movies of different genres, it is still the case that most movies in which Bill Murray participates are quite different from Sylvester Stallone s, resulting in distinct images of actors. 24 For example, an individual may know that E.R. is a popular drama set in the emergency room of a hospital, but may know nothing else about the show (e.g., the cast demographics, the show s time slot, the degree of action or romantic content, etc.). 14

3.4 Expected utility

Having described the setup of the model, the utility function, and the consumer's information set, we are ready to solve for the expected utility. Since the only element in the utility that the individual is uncertain about is her attribute utility, $\xi_{i,j,t}$, we start by solving for the expected attribute utility.

Individual $i$ updates her prior using the signal to form a posterior distribution of the attribute utility, $\xi^{p}_{i,j,t} \sim N(\mu^{p}_{i,j,t}, 1/\varsigma^{p}_{i,j})$. The mean, $\mu^{p}_{i,j,t}$, and precision, $\varsigma^{p}_{i,j}$, of this posterior distribution are given by:[25]

$$\mu^{p}_{i,j,t} = \frac{1}{\varsigma^{p}_{i,j}}\left[\varsigma^{\mu}_{i,j}\,\mu_{i,j} + \varsigma^{\omega}_{i,j}\,S'_{i,j,t}\right] \tag{7}$$

$$\varsigma^{p}_{i,j} = \varsigma^{\mu}_{i,j} + \varsigma^{\omega}_{i,j} \tag{8}$$

where $S'_{i,j,t}$ is the realization of the signal. The precision of the posterior, $\varsigma^{p}_{i,j}$, is the sum of the precisions of the two sources of information. More importantly, the expected attribute-utility, $\mu^{p}_{i,j,t}$, is a weighted combination of the product-specific signal realization, $S'_{i,j,t}$, and the mean of the firm's profile for individual $i$, $\mu_{i,j}$. The weight on each element is a positive function of its precision. For example, the weight on $\mu_{i,j}$, which we denote by $\theta_{i,j}$, is

$$\theta_{i,j} \equiv \frac{\varsigma^{\mu}_{i,j}}{\varsigma^{\mu}_{i,j} + \varsigma^{\omega}_{i,j}} \tag{9}$$

The more precise the product-specific signal is, the less important a firm's profile is in determining the expected attribute-utility from the product.

Since $S'_{i,j,t} = \xi_{i,j,t} + \omega'_{i,j,t}$, where $\omega'_{i,j,t}$ is the realization of $\omega_{i,j,t}$, we can rewrite equation (7) as:

$$\mu^{p}_{i,j,t} = \theta_{i,j}\,\mu_{i,j} + (1 - \theta_{i,j})\,\xi_{i,j,t} + \omega^{0}_{i,j,t} \tag{10}$$

where

$$\omega^{0}_{i,j,t} \equiv \frac{\varsigma^{\omega}_{i,j}}{\varsigma^{\mu}_{i,j} + \varsigma^{\omega}_{i,j}}\,\omega'_{i,j,t}$$

The researcher does not observe $\omega^{0}_{i,j,t}$. It is distributed normally with mean 0 and variance

$$\sigma^{2}_{\omega^{0},i,j} \equiv \frac{\varsigma^{\omega}_{i,j}}{\left(\varsigma^{\mu}_{i,j} + \varsigma^{\omega}_{i,j}\right)^{2}}$$

[25] See DeGroot [1989].
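The updating rule in equations (7)-(9) can be sketched in a few lines of Python. This is a minimal illustration only: the function name `update_beliefs`, the `Posterior` container, and the numerical inputs are assumptions introduced here and are not part of the paper.

```python
from dataclasses import dataclass


@dataclass
class Posterior:
    mean: float       # posterior mean of the attribute utility, eq. (7)
    precision: float  # posterior precision, eq. (8)
    theta: float      # weight placed on the firm's profile, eq. (9)


def update_beliefs(profile_mean, profile_precision, signal, signal_precision):
    """Normal-normal update of the attribute utility for one consumer-firm pair.

    All argument names are hypothetical labels for mu, varsigma^mu,
    the signal realization, and varsigma^omega.
    """
    if profile_precision <= 0 or signal_precision < 0:
        raise ValueError("need profile_precision > 0 and signal_precision >= 0")
    posterior_precision = profile_precision + signal_precision      # eq. (8)
    theta = profile_precision / posterior_precision                 # eq. (9)
    posterior_mean = theta * profile_mean + (1 - theta) * signal    # eq. (7)
    return Posterior(posterior_mean, posterior_precision, theta)


# Illustrative numbers only: a fairly tight profile and a noisier signal.
post = update_beliefs(profile_mean=1.0, profile_precision=3.0,
                      signal=2.0, signal_precision=1.0)
print(post)  # Posterior(mean=1.25, precision=4.0, theta=0.75)
```

With these inputs the consumer places three quarters of the weight on the firm's profile, so the posterior mean moves only a quarter of the way from the profile mean toward the signal.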

Both $\theta_{i,j}$ and $\sigma^{2}_{\omega^{0},i,j}$ can be considered as measures of how ill-informed the individual is: $\theta_{i,j}$ is a decreasing function of the precision of the signal, $\varsigma^{\omega}_{i,j}$, and $\sigma^{2}_{\omega^{0},i,j}$ is decreasing in it once $\varsigma^{\omega}_{i,j} > \varsigma^{\mu}_{i,j}$; both vanish as the signal becomes perfectly precise. For example, whenever the signal is noisy ($1/\varsigma^{\omega}_{i,j} > 0$), we get that $\theta_{i,j} > 0$, and thus the individual relies on the firm's profile when forming her expected attribute-utility. In contrast, whenever the signal is not noisy ($1/\varsigma^{\omega}_{i,j} = 0$), it follows that $\theta_{i,j} = \sigma^{2}_{\omega^{0},i,j} = 0$ and thus $\mu^{p}_{i,j,t} = \xi_{i,j,t}$. In other words, when the signal is not noisy, the individual is fully informed, and thus her expected attribute utility is equal to her actual attribute utility. Thus, the full information model is nested within our model.

Using (4), we can rewrite the individual's expected attribute-utility as:

$$\mu^{p}_{i,j,t} = \theta_{i,j} E_t[\eta_{j,t}] + (1 - \theta_{i,j})\,\eta_{j,t} + \left[\theta_{i,j} E_t[X_{j,t}] + (1 - \theta_{i,j})\,X_{j,t}\right]\beta_i + \omega^{0}_{i,j,t} \tag{11}$$

and her expected utility as:

$$\begin{aligned} E[U_{i,j,t}] = {} & \theta_{i,j} E_t[\eta_{j,t}] + (1 - \theta_{i,j})\,\eta_{j,t} + \left[\theta_{i,j} E_t[X_{j,t}] + (1 - \theta_{i,j})\,X_{j,t}\right]\beta_i \\ & + \omega^{0}_{i,j,t} + \varepsilon_{i,j,t} + \alpha_{i,j} + \delta_{i,j,t}\, I\{C_{i,t-1} = j\} \end{aligned} \tag{12}$$

In the next subsection, we derive the implications of this model. In order to assess their novelty, we compare them to the implications of a model that differs from the suggested model in one aspect: a consumer's information set is not a function of multiproduct firms' profiles. In other words, the prior distribution is $\xi_{i,j,t} \sim N(\mu^{0}_{i,j}, 1/\varsigma^{\mu,0}_{i,j})$, where $\mu^{0}_{i,j}$ and $\varsigma^{\mu,0}_{i,j}$ are not a function of the firm's profile. It is easy to show that in such a case, the expected utility is:

$$E[U_{i,j,t}] = \eta_{j,t} + X_{j,t}\beta_i + \alpha^{1}_{i,j} + \varepsilon_{i,j,t} + \omega^{1}_{i,j,t} + \delta_{i,j,t}\, I\{C_{i,t-1} = j\} \tag{13}$$

where

$$\alpha^{1}_{i,j} = \alpha_{i,j} + \mu^{0}_{i,j} \quad \text{and} \quad \omega^{1}_{i,j,t} = \left[\frac{\varsigma^{\omega}_{i,j}}{\varsigma^{\mu,0}_{i,j} + \varsigma^{\omega}_{i,j}}\right]\omega_{i,j,t}$$

3.5 Implications

We describe below the implications for an individual's purchase probability, which is directly related to her expected utility.
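As a rough numerical check on the variance of the scaled signal noise, the sketch below simulates the noise term and compares its sample variance with the closed-form expression, the signal precision divided by the squared sum of the two precisions. The function names, seed, and precision values are assumptions made for illustration; only the Python standard library is used.

```python
import random
import statistics


def scaled_noise_variance(profile_precision, signal_precision):
    """Closed-form variance of the scaled signal noise."""
    return signal_precision / (profile_precision + signal_precision) ** 2


def simulate_scaled_noise(profile_precision, signal_precision,
                          draws=100_000, seed=0):
    """Draw signal noise with variance 1/signal_precision and scale it by 1 - theta."""
    rng = random.Random(seed)
    weight = signal_precision / (profile_precision + signal_precision)  # 1 - theta
    noise_sd = (1.0 / signal_precision) ** 0.5
    return [weight * rng.gauss(0.0, noise_sd) for _ in range(draws)]


# Hypothetical precisions: the profile is twice as precise as the signal.
sample = simulate_scaled_noise(profile_precision=2.0, signal_precision=1.0)
print("simulated:", statistics.pvariance(sample))
print("analytic: ", scaled_noise_variance(2.0, 1.0))

# The analytic variance shrinks toward zero as the signal becomes very precise.
for signal_precision in (0.5, 2.0, 8.0, 1000.0):
    print(signal_precision, scaled_noise_variance(2.0, signal_precision))
```

The loop at the end shows the limiting case discussed in the text: for a very precise signal the scaled noise variance is close to zero, which is the full-information benchmark.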
The first implication is that the purchase probability of a product is a function of the multiproduct firm's profile. Since the profile is a function of the set of products offered by the firm, this means that the attributes of any product offered by a firm will affect the demand for any other product produced by this firm.[26] This first implication highlights the spillover effects within a multiproduct firm. The magnitude of the spillover effects is determined by $\theta_{i,j}$ (see equation 10). Notice that the spillover effect varies across individuals and firms. Specifically, the effect is a negative function of the precision of the product-specific signals, $\varsigma^{\omega}_{i,j}$, and a positive function of the precision of the firm's profile, $\varsigma^{\mu}_{i,j}$. In other words, the spillover effects are large for firms that offer a homogeneous line of products and small for (a) individuals who received many signals and (b) products that are well known.

The second implication is that the inclusion of the multiproduct firm's profile in the information set leads to consumer loyalty. The expected utility (equation 12) includes the element $\theta_{i,j} E_t[X_{j,t}]\beta_i$ that does not vary over time (since $E_t[X_{j,t}]$ is the same in every period). It appears in individual $i$'s utility for each product offered by firm $j$. Thus, a positive match between the firm's image and the consumer's taste ($E_t[X_{j,t}]\beta_i > 0$) increases her propensity to purchase each one of this firm's products. We term this source of loyalty "informational attachment." This loyalty expresses itself in an individual's tendency to purchase a product from the firm whose image best fits her taste even when the specific product does not match her preferences better than the products of the competing firms. (One might also term this phenomenon "excess loyalty."[27]) Notice that the benchmark model seems to include a similar attachment element, $\alpha_{i,j} + \mu^{0}_{i,j}$. However, while in the benchmark model $\alpha_{i,j}$ and $\mu^{0}_{i,j}$ cannot be distinguished empirically, one can separate these two effects using the approach presented here (as illustrated in section 4.4).
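The spillover channel can be illustrated with a small numerical sketch. It keeps only the attribute part of the expected utility, with a single scalar attribute per product and no unexplained popularity or noise terms; the function name, the four-product line-up, and all numbers are assumptions chosen for illustration.

```python
def expected_attribute_utility(product_index, attributes, taste, theta):
    """theta * E_t[X] * beta + (1 - theta) * X_k * beta for a scalar attribute.

    `attributes` lists the attribute level of each product offered by the firm,
    so their average plays the role of the profile mean E_t[X].
    """
    profile_mean = sum(attributes) / len(attributes)
    own_attribute = attributes[product_index]
    return theta * profile_mean * taste + (1 - theta) * own_attribute * taste


taste = 2.0   # hypothetical beta_i
theta = 0.5   # hypothetical weight on the firm's profile

before = expected_attribute_utility(0, [1.0, 1.0, 1.0, 1.0], taste, theta)
# The firm replaces its fourth product with one the consumer likes less.
after = expected_attribute_utility(0, [1.0, 1.0, 1.0, -1.0], taste, theta)
print(before, after)  # 2.0 1.5

# A fully informed consumer (theta = 0) is unaffected by the change.
print(expected_attribute_utility(0, [1.0, 1.0, 1.0, -1.0], taste, 0.0))  # 2.0
```

The expected utility of the unchanged first product falls by theta times the change in the profile mean times the taste parameter, which is the sense in which the magnitude of the spillover is governed by theta.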
Thus, while the benchmark model leaves the explanation of loyalty in a black box, we present a behavioral foundation for such loyalty. Our explanation, based on the informational role of multiproduct firms, contrasts with the traditional statistical solution (unobserved heterogeneity) to explain excess loyalty.

[26] Thus, when a firm replaces some of its products with new ones that differ in their attributes from the previous ones, it is altering its image and, therefore, affecting the demand for each one of its existing products. As an illustration, Julia Roberts's image as an actress in light movies may have changed after Mary Reilly.

[27] Recall that the model includes additional sources of loyalty: (a) the unobserved individual-firm match, $\alpha_{i,j}$, and (b) the state dependence parameters, $\delta_{i,j,t}$.

4 Estimation and identification issues

We start by specifying the exact functional forms used in the estimation. Then we construct the likelihood function and, finally, we discuss the sources of identification of the model's parameters.

4.1 Functional form and structure

4.1.1 Utilities

The following equations present each of the elements in the utility for the television example and specify its full structure.

Attribute utility

We formulate the attribute utility as:

$$\begin{aligned} \eta_{j,t} + X_{j,t}\beta_i = {} & \eta_{j,t} + \beta_{Gender}\, I\{\text{the gender of viewer } i \text{ and show } j,t \text{ is the same}\} \\ & + \beta_{Age0}\, I\{\text{the age group of viewer } i \text{ and show } j,t \text{ is the same}\} \\ & + \beta_{Age1}\, I\{\text{the distance between the age group of viewer } i \text{ and the cast of show } j,t \text{ is one}\} \\ & + \beta_{Age2}\, I\{\text{the distance between the age group of viewer } i \text{ and the cast of show } j,t \text{ is two}\} \\ & + \beta_{Family}\, I\{i \text{ lives with her family and show } j,t \text{ is about family matters}\} \\ & + \beta_{Race}\, Income_i\, I\{\text{one of the main characters in show } j,t \text{ is African American}\} \\ & + Sitcom_{j,t}\,(\beta_{Sitcom} Y^{\beta}_i + v^{Sitcom}_i) \\ & + ActionDrama_{j,t}\,(\beta_{AD} Y^{\beta}_i + v^{AD}_i) \\ & + RomanticDrama_{j,t}\,(\beta_{RD} Y^{\beta}_i + v^{RD}_i) \end{aligned}$$

where $Y^{\beta}_i$ is a vector of individuals' observable characteristics that includes $Teens_i$, $GenerationX_i$, $BabyBoomer_i$, $Older_i$, $Female_i$, $Income_i$, $Education_i$, $Family_i$, and $Urban_i$. These variables are defined in the appendix. The parameters $v^{Sitcom}_i$, $v^{AD}_i$, and $v^{RD}_i$ capture unobserved individual tastes for sitcoms, action dramas, and romantic dramas, respectively.

The first six $\beta$ parameters capture the effect of cast demographics on choices. As mentioned above, previous studies demonstrated that viewers have a higher utility from shows whose cast demographics are similar to their own. Thus, we expect to find that: (1) $\beta_{Age0} > \beta_{Age1} > \beta_{Age2}$, (2) $\beta_{Gender} > 0$, (3) $\beta_{Family} > 0$ and (4) $\beta_{Race} < 0$. We use an individual's income as a proxy for her race, since information on race is not included in our data set.[28]

The taste for different show categories is a function of observed ($Y^{\beta}_i$) and unobserved ($v^{Sitcom}_i$, $v^{AD}_i$, $v^{RD}_i$) individual characteristics. Each of these interactions between show category and individual characteristics is captured through a unique parameter. For example, the interaction between an Action Drama show and a female viewer is captured via $\beta^{Female}_{AD}$. All other parameters are denoted accordingly.

While we can, theoretically, estimate an $\eta_{j,t}$ for each time slot-alternative combination (subject to one normalization), we choose to fix the unexplained popularity parameter ($\eta$) for the duration of each show. Consequently, a half-hour show and a two-hour movie both have one $\eta$ parameter. Given our interest in uncovering fundamental attributes of the shows, this is a natural restriction.

[28] The proportion of African-Americans in the highest income category is disproportionately low, while it is disproportionately high in the lowest income category. This relationship persists for all income categories in between as well (U.S. Census Bureau 1995). Nielsen designed the sample to reflect the demographic composition of viewers nationwide and used 1990 Census data to achieve the desired result. We found that the income categories and the proportion of African-Americans in the Nielsen data closely match those in the U.S. population (National Reference Supplement 1995). Although our data set does not include information about race, Nielsen has it and reports its aggregate levels.

State dependence parameters

The utility presented in equation (1) included a state dependence element, $\delta_{i,j,t} I\{C_{i,t-1} = j\}$. Here we specify the structure of $\delta_{i,j,t}$ and extend the state dependence to include another element. Specifically, we formulate the state dependence in the network utility as:

$$\delta_{i,j,t}\, I\{C_{i,t-1} = j\} + \delta_{InProgress}\, I\{C_{i,t-1} \neq j\}\, I\{\text{the show on } j \text{ started at least 15 minutes ago}\}$$

where

$$\begin{aligned} \delta_{i,j,t} = {} & Y^{\delta}_i \delta_Y + X^{\delta}_{j,t} \delta_X + v^{\delta}_i \\ & + \delta_{First15}\, I\{\text{the show on } j \text{ started within the past 15 minutes}\} \\ & + \delta_{Last15}\, I\{\text{the show on } j \text{ is at least one hour long and will end within 15 minutes}\} \\ & + \delta_{Continuation}\, I\{\text{the show on } j \text{ started at least 15 minutes ago}\} \end{aligned}$$

where the observed variables included in $Y^{\delta}_i$ are $Teens_i$, $GenerationX_i$, $BabyBoomer_i$, $Older_i$, $Female_i$, $Family_i$, and cable subscription status ($Basic_i$ and $Premium_i$), and the vector $X^{\delta}_{j,t}$ includes the following show categories: $Sitcom_{j,t}$, $ActionDrama_{j,t}$, $RomanticDrama_{j,t}$, $NewsMagazine_{j,t}$ and $Sport_{j,t}$. The binary variable $Basic_i$ is equal to one for the one third of the population that has access to basic cable offerings only, and the binary variable $Premium_i$ is equal to one for the one third of the population that has both basic and premium cable offerings.
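To make the timing indicators in the network state-dependence specification concrete, the following sketch evaluates the state-dependence contribution to the utility of network j in one time slot. The parameter values are placeholders invented for this example (they are not estimates), the names `TimingParams` and `state_dependence_term` are hypothetical, and the sketch assumes that `minutes_since_start` is measured at the beginning of the current quarter-hour slot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimingParams:
    # Placeholder values for illustration only; not taken from the paper.
    first15: float = -0.25
    last15: float = 0.5
    continuation: float = 0.75
    in_progress: float = -0.5


def state_dependence_term(base, watched_j_last_slot, minutes_since_start,
                          show_length, params=TimingParams()):
    """State-dependence contribution to the utility of network j.

    `base` stands for the individual and show-category part of delta
    (observed characteristics, show categories, and the unobserved component).
    """
    started_recently = minutes_since_start < 15
    in_continuation = minutes_since_start >= 15
    ends_soon = show_length >= 60 and show_length - minutes_since_start <= 15
    if watched_j_last_slot:
        return (base
                + params.first15 * started_recently
                + params.last15 * ends_soon
                + params.continuation * in_continuation)
    # Viewer was elsewhere: only the in-progress penalty can apply.
    return params.in_progress * in_continuation


# Viewer stayed on network j; last quarter hour of a one-hour show.
print(state_dependence_term(1.0, True, minutes_since_start=45, show_length=60))   # 2.25
# Viewer was on another alternative; the show on j is already 30 minutes in.
print(state_dependence_term(1.0, False, minutes_since_start=30, show_length=60))  # -0.5
```

Separating the viewer-specific `base` from the timing parameters mirrors the structure of the specification, in which the timing terms are common across viewers.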
The binary variable $Sport_{j,t}$ is equal to one for the sport shows (Monday Night Football on ABC and Ice Wars on CBS), and the binary variable $NewsMagazine_{j,t}$ is equal to one for news magazines (e.g., 48 Hours on CBS).

We allow the state dependence parameters to vary across age groups, gender, and family status because previous studies (Bellamy and Walker 1996) find preliminary evidence suggesting differences in the use of the remote control across these groups. We also allow these parameters to differ across individuals for unobserved reasons through $v^{\delta}_i$. These parameters can differ across show types as well. For example, we may expect $\delta$ to be smaller for sports shows, since there is no clear plot in these shows when compared with dramas. Finally, we allow the state dependence parameters to depend on time. Specifically, we expect $\delta$ to be low in the first 15 minutes of a show, since the viewers have not had enough time to get hooked by the show. For the same reason, we expect the state dependence to be high during the last 15 minutes of a show. Furthermore, we expect that the state dependence is higher during a show than between shows ($\delta_{Continuation} > 0$). Last, $\delta_{InProgress}$ applies to individuals who were not watching network $j$ in the previous time slot. Since the tendency to tune into a network to watch a show that has already been running for at least 15 minutes should be lower than for a show which has been on the air for less than 15 minutes, $\delta_{InProgress}$ is expected to be negative.

We formulate the state dependence parameters in the outside utility as:

$$\begin{aligned} \delta_{out,i}\, I\{C_{i,t-1} = (J+1)\} = {} & \left(Y^{\delta}_i \delta_Y + v^{\delta}_i + \delta_{out} + \delta_{Hour}\, I\{\text{the time is either 9:00 PM or 10:00 PM}\}\right) I\{C_{i,t-1} = (J+1)\} \\ & + \delta_{FOX10:00}\, I\{C_{i,t-1} = FOX\}\, I\{\text{the time is 10:00 PM}\} \end{aligned}$$

Individual characteristics ($Y^{\delta}_i \delta_Y + v^{\delta}_i$) are included in exactly the same way for the outside alternative because they are meant to represent behavioral attributes intrinsic to individuals. Since the outside alternative includes the option to watch non-network shows, we allow the state dependence parameters to change on the hour. Notice that many non-network shows end on the hour, and thus we expect $\delta$ to be lower at that time ($\delta_{Hour} < 0$). Furthermore, since FOX ends its national broadcasting at 10:00 PM, our data cannot distinguish between viewers who stayed with FOX thereafter and those who chose the outside alternative. Thus, we expect $\delta_{FOX10:00}$ to be positive.
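The state-dependence term for the outside alternative can be sketched in the same spirit. The function name, the string labels for the previous choice and the clock time, and the default parameter values are all assumptions made for this illustration; in particular the numbers are placeholders rather than estimates.

```python
def outside_state_dependence(base, last_choice, clock,
                             delta_out=0.5, delta_hour=-0.25, delta_fox_10=0.75):
    """State-dependence contribution to the utility of the outside alternative.

    `base` stands for the individual part shared with the network utility
    (observed characteristics and the unobserved component); `last_choice` is a
    hypothetical label such as "outside", "FOX", or "ABC"; `clock` is "HH:MM".
    """
    on_the_hour = clock in ("21:00", "22:00")
    term = 0.0
    if last_choice == "outside":
        # Persistence in the outside alternative, weaker on the hour.
        term += base + delta_out + delta_hour * on_the_hour
    if last_choice == "FOX" and clock == "22:00":
        # FOX viewers at 10:00 PM cannot be told apart from outside viewers.
        term += delta_fox_10
    return term


print(outside_state_dependence(1.0, "outside", "21:00"))  # 1.25
print(outside_state_dependence(1.0, "FOX", "22:00"))      # 0.75
print(outside_state_dependence(1.0, "ABC", "21:15"))      # 0.0
```

With a negative `delta_hour`, a viewer who was already outside the networks is slightly less likely to stay there at 9:00 PM or 10:00 PM, while a positive `delta_fox_10` shifts former FOX viewers toward the outside alternative at 10:00 PM.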
Outside alternative

Other than the standard observable individual characteristics (age, gender, income, education, family status, and area of residence), the vector $Y^{\gamma}_{i,t}$ includes the variables $Basic_i$, $Premium_i$, $All_i$, and $Same_{i,t}$. The cable subscription status is included here since the outside alternative includes the option of watching non-network shows: viewers with basic or premium cable have a larger variety of choices, which can lead to a higher utility. The variable $All_i$ is equal to the average time the individual watched television during the previous days of the