
Effect of Location and Assortment on Category Consideration, Learning, and Choice
Bhoomija Ranjan∗, Paul Ellickson and Mitchell Lovett
June 30, 2016
Job Market Paper
Preliminary, do not cite without consulting authors first
Abstract
Retailers aim to maximize profits given the constraints of space and existing infrastructure. They frequently face the problem of department management: overhauling the assortments and locations of multiple product categories that share a common area in the store. Although retailers frequently perform category and departmental resets, an empirical analysis of the effects of these department layout changes has not been done. To analyze this question, we exploit a “natural” experiment: a large-scale reset of the dairy department in one supermarket store, with two other stores acting as controls. With a rare dataset of floor plans and category planograms, we characterize 11 reset treatments related to location and assortment changes. Analyzing both aggregate and household-level purchase data, we present descriptive evidence that the reset produced a significant (2.6%) improvement in sales. We find that the changes affect purchase probabilities through the channel of attention/consideration, and induce learning among customers. We then specify a structural model of demand that incorporates multi-category consideration, learning, and choice at the individual level. The model enables us to leverage the exogenous variation in location and assortments to identify the effects on attention/consideration and choice. Preliminary results indicate that the location of the category within the store layout has a significant effect on consideration. We find that being adjacent to popular categories has a negative effect on consideration among customers who have tried the category earlier, but has a much larger positive effect among customers who have never bought the category, thus inducing trial. Our learning estimates also indicate that consumers’ perceptions of category match values are positively biased on average, which leads them to try the product, but far fewer individuals make it a regular feature of their ongoing shopping basket.
∗Doctoral Student in Marketing, Simon School of Business, University of Rochester, Email: [email protected]. A version of this paper will also serve as the first essay in my dissertation. We would like to thank Bowen Luo and Austin Stone for excellent research assistance. We are grateful to Ron Goettler, Garett Johnson, Avery Haviv & Yufeng Huang for their support and insightful comments through the project. We also thank Marketing seminar participants at the University of Rochester for their meaningful discussions. All remaining errors are our own.
1 Introduction
A central challenge facing modern ‘brick and mortar’ retailers is how to maximize sales given the physical constraints of their space and display infrastructure. To improve space allocation, they “reset” the layout within the store, including whole categories or departments. According to the Food Marketing Institute, the average supermarket makes approximately $12 per square foot of selling area per week. Even a small 0.5% increment in sales due to better space optimization is significant compared to the average industry profit margin of 1.5%. The retailer has to decide the space allocation for a set of categories within a common location in the store, such as the produce section or dairy section. This task of department management involves deciding not only what assortment each category should carry (category management), but also where the categories should be displayed within the physical context of the store itself (layout planning). Not only are the profit implications of department management important, but ease of navigating the store is a major factor for 92% of consumers in choosing their primary store (FMI, 2012[10]). Hence, department management is a vital problem for the retailer.
The narrower problem of category assortment management has been the subject of a large number of marketing studies. The seminal work by Dreze, Hoch and Purk (1994)[13] analyzes an experiment involving 60 stores in the Dominick’s Finer Foods supermarket chain that investigates the efficient allocation of space within a category and what strategies to use for optimal assortment. They find that categories are over-allocated in terms of shelf space, and prescribe removing lower-performing SKUs from the shelves. This result is corroborated by a number of later studies (Broniarczyk, et al. (1998)[3], Boatwright & Nunes (2001)[2], Gourville & Soman (2005)[12], Chernev (2005)[5]), and a Nielsen report indicates that 40% of retailers implemented this basic suggestion (Nielsen, 2010[22]). Later work (Kahn & Wansink 2004)[16] clarified that it is perceived variety that has a positive effect on consumer perceptions and sales, and that not all SKUs contribute to the perception of variety. Attention also shapes consumers’ perceptions of assortment and choice in a category. Broniarczyk, et al. (1998)[3] point out that the availability of a consumer’s favorite product(s) attracts their attention and moderates the negative effect that a reduction in SKUs has on assortment perceptions. Chandon, et al. (2009)[4] use eye-tracking experiments to show that in-store marketing factors, such as the number and position of facings, and the horizontal and vertical positions of items, have a significant effect on attracting consumer attention within a category display. These existing papers analyze single categories alone, or proceed on an independent category-by-category basis.
However, no category is an island. Category locations, assortments, and merchandizing can have effects on neighboring categories as well. Dreze, Hoch & Purk (1994)[13] and Bezawada, et al. (2009)[1] both find that bringing known complements closer together increases the sales of each, and Hong, Misra & Vilcassim (2015)[14] find that a category's assortment has positive effects on its own sales and negative effects on neighbors’ sales when the categories are neither complements nor substitutes. The authors argue that these results can be explained by categories competing for the consumer’s attention, and corroborate this theory with online eye-tracking experiments. However, barring Dreze, Hoch & Purk (1994)[13], the field studies do not have exogenous sources of location variation.
In this paper, we investigate how category location, assortment, and merchandizing shape consumers’ choices of the category and its neighbors. We are aided in this enterprise by a unique “natural” experiment: a departmental reset performed by a major retailer in one of its stores. The layout change involved a major reorganization of 15 dairy categories. The reset was instigated by an unrelated technology upgrade that was not specific to dairy, meaning that the timing and store selection for this reset are plausibly exogenous. The basic idea for identifying the desired location and assortment effects is to take advantage of this (conditionally) exogenous variation in locations and assortments induced by the reset, and to compare the outcomes with control stores that did not undergo any changes. The essential identification strategy uses “diff-in-diff” and “diff-in-diff-in-diff” approaches.
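As a rough illustration of the “diff-in-diff” logic, the sketch below computes the estimate from four group means: the change in the treated store minus the change in the control stores. The function name and the sales figures are hypothetical and invented for illustration; they are not taken from the paper's data, and the paper's own specifications may include additional controls.

```python
from statistics import mean

def diff_in_diff(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Difference-in-differences from four lists of outcomes.

    Returns (post - pre change in the treated store)
    minus (post - pre change in the control stores).
    """
    treated_change = mean(treat_post) - mean(treat_pre)
    control_change = mean(ctrl_post) - mean(ctrl_pre)
    return treated_change - control_change

# Hypothetical daily category sales (units), invented for illustration only.
estimate = diff_in_diff(
    treat_pre=[100, 102, 98],
    treat_post=[110, 108, 112],
    ctrl_pre=[90, 92, 88],
    ctrl_post=[93, 95, 91],
)
print(estimate)  # treated change of 10 minus control change of 3
```

Subtracting the control-store change removes shocks common to all stores (for example, seasonality), which is why the control stores matter for interpreting the post-reset change.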
Our analysis exploits a unique dataset that includes the department’s physical layout, as well as planograms that provide information on the exact location of every item in the department. These documents allow us to construct data on, for example, which SKUs moved or changed arrangement, the adjacency of categories, and the type of display in which each category was contained. We characterize the reset changes faced by the dairy categories through a set of 11 ‘reset treatments’ that affect location and assortment. We use this information, along with individual and aggregate sales data, to analyze the natural experiment.
Our experimental analyses strongly indicate that location influences category choice, which we theorize as operating through attention and consideration. For instance, our results show that moving to high(low)-traffic locations improves (reduces) category purchase probabilities. Similarly, increasing facings and the visual space the category occupies also increases category purchase. These results tie in well with previously established results in the literature (e.g., Chandon, et al. (2009)[4]). We also find differences between the short- and long-term effects of the reset that are directionally consistent with consumer learning.
We then develop a structural model of consumer category purchase decisions that involves consideration, choice, and learning. This model is developed to explain the patterns observed in the data and to capture the underlying consumer processes that drive them. Such a model can help shed light on the cross-category and inter-temporal dynamics observed in the data. In our model, we build on the existing literature and hypothesize that consumers evaluate categories once they ‘pay attention’ to them. We assume that consumers have a passive attention process that shapes consideration. The level of attention is influenced by the category’s location, assortment, and the characteristics of its neighbors, among other factors. Hence, we integrate into a structural model the adjacency effects noted by Hong, Misra & Vilcassim (2015)[14].
Our model follows a “consider-then-choose” framework, shared by many papers in the consideration set literature. This general framework incorporates the idea that consumers’ effective choice sets do not include all the brands/SKUs in the category. Rather, they restrict their attention to a smaller set of items, the consideration set, from which they ultimately make their choice. The literature contains many different assumptions regarding what makes consumers consider certain items, but not others. Attention-focused explanations include advertising (Goeree 2008[11], Draganska & Klapper 2011[8], Terui, Ban & Allenby 2011[27]), in-store display and feature ads (Nierop, et al. 2010)[28], and shelf space and position (Dehmamy & Otter 2015)[7]. By excluding the variables that influence attention from the choice process, the model can separately identify the consideration and choice processes.1 Our identification focuses on location as being excluded from consumption utility.
We further differentiate our work from the existing literature by modeling consideration and choice among multiple categories, rather than a single one. In that sense, our model also relates to the literature on multi-category purchases. This literature mostly focuses on correlations between categories, without accounting for their in-store locations. Manchanda, Ansari & Gupta (1999)[18] study the cross-price effects of two sets of known complementary categories (laundry detergent & fabric softener, cake mix & cake frosting) and point out that correlations might exist between two non-complementary categories (they call it co-incidence), because these categories might be bought in the same shopping trip. Others in this literature focus on cross-category price effects with aggregate data (Song & Chintagunta 2006[25]) and consider the role of budget constraints (e.g., Mehta 2007[19] and Mehta & Ma 2012[20]). Thus, we integrate the multi-category literature with the consideration set literature.
Finally, because we find evidence of learning and dynamics in the preferences over time, our structural model also incorporates consumer learning in the form of a Bayesian learning model (Erdem and Keane, 1996[9]; Narayanan & Manchanda, 2009[21]; Shin, Misra & Horsky, 2012[24]). However, in our model, each consumer learns about each category’s match quality through their own consumption experiences. Integrating learning into the complete model allows a rich inter-temporal response to location changes. Such changes can increase attention to and consideration of a category, which for newcomers can induce trial, a short-term positive effect. However, trial leads to consumer learning, which can increase or decrease purchases in the long term, depending on the distribution of the “true” quality in the consumer population. Having both consideration and learning influence the choice process enables the structural model to predict flexible dynamic patterns of response to the reset, including persistent decreases in category purchase incidence due to lower attention or worse assortment; or an initial increase in purchases due to increased attention, followed by a decrease due to learning that the category is worse than expected; or a small initial increase in purchases due to rising consideration, followed by further increases, suggesting the category is better than expected.
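To make the learning mechanism concrete, the following is a minimal sketch of a standard normal-normal Bayesian update, in the spirit of the learning literature cited above. It is an assumption for illustration only: the function name, the normality of beliefs and signals, and all numbers are hypothetical, and the paper's own specification may differ.

```python
def update_belief(prior_mean, prior_var, signal, signal_var):
    """One normal-normal Bayesian update of a match-quality belief.

    The posterior precision is the sum of the prior and signal precisions;
    the posterior mean is the precision-weighted average of prior and signal.
    """
    post_var = 1.0 / (1.0 / prior_var + 1.0 / signal_var)
    post_mean = post_var * (prior_mean / prior_var + signal / signal_var)
    return post_mean, post_var

# Hypothetical consumer whose prior (1.0) is optimistic relative to the
# consumption signals received after trial (all values are invented).
belief_mean, belief_var = 1.0, 1.0
for signal in [0.2, -0.1, 0.0]:
    belief_mean, belief_var = update_belief(belief_mean, belief_var, signal, signal_var=1.0)
    print(round(belief_mean, 3), round(belief_var, 3))
```

In this sketch an upward-biased prior is pulled down with each consumption experience, which is one way an initial post-reset increase in trial could be followed by lower repeat purchase.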
We apply our structural model to a subsample of 2,100 individuals. Our preliminary results support our theory that attention influences category consideration. We find that there is an “attention-stealing” effect if a category is placed next to a high-frequency popular category. This result corroborates the findings of Hong, Misra & Vilcassim (2015)[14]. However, if a consumer has never bought a category, then its chances of being tried by the consumer will be significantly improved by placing it next to a category that the consumer already purchases frequently. Therefore, we find a dual role played by categories, sometimes bringing attention to their neighbors, and at other times stealing attention from them, depending on the consumer’s prior knowledge of the category. Apart from this, we also find significant effects of assortment on consumption utility. Finally, our results indicate an important role for learning, but only among a subset of consumers with relatively little experience in a category. We find that consumers' expectations are generally biased upward relative to their true match qualities, which encourages them to try new categories upon consideration. However, post-trial, many of these consumers realize that the (match) quality is not as high as they expected, leading to lower retention. This is consistent with the motivation for a reset: that there is untapped potential that can be exploited by shifting attention to existing categories. Understanding which categories have this potential and which do not can assist managers in planning future resets. Hence, these structural model results form the basis of our planned counterfactual analyses, which can guide the retailer in choosing how to improve the department layout.
1Dehmamy & Otter (2015)[7] also make an argument that purchase incidence vs. quantity purchases can be used to identify the consideration process.
The rest of the document is organized as follows. Section 2 provides a conceptual framework for our analysis and structural model. We present our data in Section 3, and our experimental analysis and preliminary evidence in Section 4. Section 5 lays out our structural model. We discuss the details of the structural model and estimation in Sections 6 and 7. Finally, Section 8 concludes.
2 Conceptual Framework
Our work highlights the role of category location and assortment in influencing consumer attention, category choice and learning about category quality. Figure 1 presents the framework we use for our analysis and structural model.
Figure 1: Model Conceptual Framework
We conceptualize consumers’ shopping trips as involving consideration of, and choice among, multiple categories. Our focus is entirely on the level of the category (i.e., we do not model brand choices within the category). The key variables we study are assortment, location, habit (state dependence), quality expectations, and price. We explain the role of each variable in our conceptual framework based on a three-stage view of the category choice process.
The first stage is consideration. Following the literature on category assortment effects (Chandon, et al., 2009[4]), attention is a central influence on consideration.2 An individual considers a category only if the level of attention the category obtains is sufficiently high.
2An alternate way of modeling this would be a more “rational expectations” type approach where consumers decide to consider only if the expected payoff from choice is high enough (e.g., Seiler, 2013[23]). We choose to focus on the passive attention approach both because of the broad support in the existing literature for visual attention being central and because, in our context, the price variation needed to identify such a search model is limited.
We conceptualize attention as a passive process that is influenced by assortment, location, in-store promotions, and habit (state dependence) (see arrows pointing to consideration/attention). Assortment can influence attention through the size of the category. For instance, the number of facings assigned to a category proxies for the total visual space that the category occupies in the store. If a category occupies a large area visually, it may be more likely to attract consumers’ attention (Chandon et al., 2009[4]). Similarly, a greater number of brands and items can produce more or less visual attention due to numerosity perception biases related to subgrouping (Krishna, 2008[17]). Location influences attention through the type of display case, the orientation of the category relative to the general store traffic patterns, and the adjacent categories. For instance, consumers may notice categories placed in high-traffic areas or in a certain style of display case. Adjacent categories can generate attention spillovers, whereby a high-frequency category that obtains a large amount of attention steals attention from neighboring categories. Alternatively, such a category might lead to positive spillovers to neighbors, especially ones that are less well known. In this case, while visiting the shelf for one category, the individual notices (perhaps for the first time) an adjacent display. Relatedly, shopping habits shape attention through greater awareness of and memory for the category, as well as through repeated patterns in how the individual generally moves through the retail environment. Finally, in-store promotions can shape attention. In our context, the prominent type is shopper club discounts, which are labelled in the aisle.
The second stage of our process is choice given consideration. We follow the literature on consideration (Draganska & Klapper 2011[8]) and take consideration to mean that the individual will make a conscious choice about the option (category) that is considered, which could be to not buy from the category. During this process, the consumer examines prices, recalls category quality expectations, and evaluates any observable attributes (e.g., the assortment of brands and items). We represent the inputs to this process via the arrows pointing into choice (i.e., assortment, price, and quality expectations). We note that an important distinction in our framework is that location does not directly influence the utility of a choice, but rather influences only attention. We argue that this is, on its face, a reasonable assumption for products like those we study in the dairy department.
The third stage in the framework is the post-purchase process that involves consumption and the resultant changes in consumer expectations and habits. Purchase and consumption generate experience that provides signals of the true quality and causes the consumer to update their expectations about the category quality (e.g., Shin, Misra, & Horsky 2012[24]). In addition, purchase and consumption shape memory and habit, which can lead the category to receive greater attention in future shopping visits. For example, having visited a category in recent days could trigger memory and resultant attention to the category as the individual walks through the department containing that category.
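The first two stages of the framework imply that a purchase requires both consideration and a favorable choice, so the purchase probability factors as P(consider) × P(choose | consider). The sketch below illustrates this factorization. The logistic functional forms, the function names, and the index values are assumptions made purely for illustration; they are not the paper's specification, which is laid out in its later sections.

```python
import math

def logistic(x):
    """Map a real-valued index to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def purchase_probability(attention_index, utility_index):
    """Consider-then-choose purchase probability.

    attention_index: hypothetical index built from location, assortment size,
        promotions, and habit (drives consideration only).
    utility_index: hypothetical index built from assortment, price, and
        quality expectations (location is excluded from it by assumption).
    """
    p_consider = logistic(attention_index)
    p_choose_given_consider = logistic(utility_index)
    return p_consider * p_choose_given_consider

# Moving a category to a more visible location raises only the attention index.
low_attention = purchase_probability(attention_index=-1.0, utility_index=0.5)
high_attention = purchase_probability(attention_index=1.0, utility_index=0.5)
print(low_attention < high_attention)  # True
```

Keeping location out of the utility index mirrors the exclusion restriction described in the text: a location change can shift purchases only through consideration.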
3 Data
The data for this project come from a major U.S. supermarket chain. Due to confidentiality requirements, the name of the chain and some details of the data are hidden. The data and analysis focus on a reset of the dairy department in a particular store of the chain that occurred in the last week of August 2015. In addition, we obtain data on two other stores, which were selected by the chain as the best “control” stores.
The dairy department in this chain consists of 22 product categories. These account for ∼ 10% of daily store sales, and have 1550 unique SKUs across stores. The categories and their daily dairy share are presented in Table 1. Yogurt, milk and shredded cheese are the main revenue generators in dairy. The product categories are defined by the chain for operational reasons, but they largely follow definitions based on substitution. For example, the categories “Milk,” “Yogurt,” “Hummus,” and “Butter & Margarine” contain close substitutes. However, others do not, with “Pasta & Sauce” being the most problematic because the prices differ markedly between pasta and sauce, and these two sets of products are complements in many usage situations. Despite these concerns, throughout we use the product categories defined by the chain for consistency with the data provider.
Category                     Avg. Daily Dairy Share (%)
Butter & Margarine
Chunk Cheese                 6.54
Cold Cuts                    7.17
Cottage Cheese & Ricotta     2.70
Cream Cheese                 2.67
Dairy Creamer                6.11
Desserts                     1.60
Hummus                       2.34
Juice                        6.95
Lunchables                   1.08
Milk                         14.92
Pasta & Sauce                0.76
Pepperoni Rack               0.99
Pepperoni Stick Rack         0.42
Plant Based Beverages        2.27
Refrigerated Baked Goods     3.14
Shredded Cheese              10.68
Sliced Cheese                2.22
Sour Cream & Dips            2.33
Yogurt                       16.51
Table 1: Descriptive statistics of dairy categories
The data we obtained for this research consist of three distinct datasets. The first contains floor plans and planograms for the entire dairy department of all three stores. This rare dataset provides a detailed accounting of every inch of shelf-space in the dairy department, allowing us to track where items (i.e., stock-keeping units or SKUs) were located, how many facings (the visual space of an item on the shelf) items have, and how much the items moved during the reset. The second dataset contains the census of primary shoppers from the individual-level shopper-club data for the three stores. This shopper club information contains visit-level sales data for every purchase each individual made over the sample period. The third dataset contains daily aggregate point-of-sale (POS) data on unit and dollar sales for each store. In the remainder of this section, we discuss each of these datasets in turn.
3.1 Location Data
The location data contains detailed floor plans and planograms for the dairy department for the treatment and control locations. These floor plans and planograms provide a snapshot of how dairy was arranged within the respective stores between May 1 and Dec 31, 2015. The reset occurred during the last week of August, which gives us approximately 120 days both before and after the reset for which we know the exact floor layout.
We are limited in exactly what we are able to describe regarding the reset, but we can say that one part of the floor layout change involved replacing a central 40 ft. refrigerated coffin-style cheese bar with a 56 ft. regular cheese bar. At the same time as this change, many other category shelf-space and location changes occurred. Figure 2 shows an example store floor plan without any category indicators.
Figure 2: Changes in treatment store dairy layout before and after reset
The floor layout plans are very useful, but they do not provide detailed data on SKUs. We therefore also obtained planograms for each of the categories. A planogram is a map of how items are arranged within the category. Figure 3 shows a sample planogram. Each (colored) box represents a facing where a SKU is displayed. The planogram also records Item IDs (hidden in the figure) that can be linked back to the respective SKUs.
Together, the planograms and floor plans provide two important dimensions that are not available in standard scanner datasets. First, they provide rich location information about every item of every category in dairy. For each location, we are able to identify which locations are its neighbors (i.e., adjacency) and what type of display case it uses (e.g., side case, wall case, cheese bar). Further, we can characterize where on the shelf each item is. Second, they provide information about facings. With this information, we can characterize not only the number of items and brands in a category, but also the number of facings.
3.2 Individual-Level Data
The individual-level data is a shopper-club panel of dairy purchases for 61,509 consumers who constitute the primary consumers of the three store locations - Treatment, Control A and Control B3. The data includes the date and store visited, categories purchased, and units and dollar amounts spent in the process. The panel data customers constitute ∼ 30% of the stores’ daily traffic and ∼ 80% of daily dairy sales during May-Dec 2015.
We further augment this panel data with historical dairy purchases for the same individuals going back one year to April 2014. Using this panel data, we construct standard state-dependence measures of recency (such as weeks since last category purchase or weeks since last shopping trip) and frequency (such as no. of purchases within the category in the past 180 days). We also record if the individual has purchased at all within a category since April 2014. Table 2 shows descriptive statistics for these measures for the dairy categories. Yogurt is the most purchased as well as the most recently purchased category on average, whereas Pickles & Salads is the least recently purchased category on average.
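The recency and frequency measures described above can be sketched as follows for one individual and one category. This is a minimal illustration under stated assumptions: the function name, the use of calendar dates as input, and the convention of counting only purchases strictly before the current visit are hypothetical choices, not details confirmed by the paper.

```python
from datetime import date, timedelta

def recency_frequency(purchase_dates, visit_date, window_days=180):
    """Return (weeks since last purchase, no. of purchases in the window).

    Only purchases strictly before visit_date are used. Recency is None when
    the individual has no earlier purchase in the category ("never bought").
    """
    prior = [d for d in purchase_dates if d < visit_date]
    if not prior:
        return None, 0
    weeks_since_last = (visit_date - max(prior)).days / 7
    window_start = visit_date - timedelta(days=window_days)
    frequency = sum(1 for d in prior if d >= window_start)
    return weeks_since_last, frequency

# Hypothetical purchase history for one household in one category.
history = [date(2015, 1, 5), date(2015, 6, 1), date(2015, 7, 6)]
print(recency_frequency(history, visit_date=date(2015, 7, 20)))
```

Here the last purchase is 14 days before the visit, and the January purchase falls outside the 180-day window, so only two purchases count toward frequency.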
3.3 Aggregate Data
The aggregate data consists of daily SKU-level sales for the dairy categories of the treat- ment and control locations from May 1 - Dec 31, 2015. The data has information on SKU attributes such as brand and size, and records daily unit and dollar sales for each SKU. We describe our operationalization of price and promotion variables below.
3A consumer’s primary store is defined as the store in the supermarket chain where she spends the maximum dollar amount in the past 52 weeks.
Category                    Mean weeks since      Mean no. of purchases     % of customers
                            last purchase (sd)    in past 6 months (sd)     never bought*
Butter & Margarine          9.03 (12.76)          5.29 (5.00)               32.61
Chunk Cheese                13.25 (15.89)         3.96 (4.58)               40.68
Cold Cuts                   15.16 (18.00)         3.38 (4.04)               44.96
Cottage Cheese & Ricotta    17.12 (17.87)         2.77 (3.80)               53.10
Cream Cheese                16.29 (17.12)         2.52 (2.98)               47.14
Desserts                    21.72 (20.10)         1.94 (3.20)               61.56
Hummus                      17.76 (18.83)         2.68 (3.47)               62.49
Lunchables                  23.78 (22.26)         2.39 (4.12)               84.03
Pasta & Sauce               25.86 (20.68)         1.16 (1.89)               81.76
Pickles & Salads            27.58 (22.86)         1.24 (1.99)               79.78
Refrigerated Baked Goods    16.90 (16.93)         2.58 (3.30)               51.53
Shredded Cheese             9.10 (13.21)          6.54 (6.46)               33.30
Sliced Cheese               18.08 (19.08)         2.43 (3.21)               60.78
Sour Cream & Dips           13.49 (15.78)         3.34 (3.55)               42.58
Yogurt                      7.82 (13.05)          9.71 (9.20)               31.66
* Proportion of customers with no purchase in the category from April 2014 to the start of the data
Table 2: Descriptive statistics of recency and frequency measures
3.3.1 Prices
Constructing SKU-specific prices is straightforward using the above information4. The primary challenge at the SKU-level is that we must normalize prices to $/oz. or $/fl.oz. for comparison across heterogeneous unit sizes within category. However, constructing category prices introduces significant challenges.
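As a minimal sketch of the SKU-level steps, the code below normalizes a shelf price to a per-ounce price and fills days without sales using the price from the next day with positive sales, as the accompanying footnote describes. The function names and the list-with-None representation of missing days are assumptions for illustration, and the numbers are invented.

```python
def price_per_oz(price, size_oz):
    """Normalize a SKU price to dollars per ounce (or per fluid ounce)."""
    return price / size_oz

def impute_from_next_sale(daily_prices):
    """Fill None entries with the price from the next day that has a price.

    Trailing days with no later sale cannot be filled and stay None.
    """
    filled = list(daily_prices)
    next_price = None
    # Walk backwards so each missing day sees the nearest later observed price.
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is None:
            filled[i] = next_price
        else:
            next_price = filled[i]
    return filled

# Hypothetical 32 oz. SKU priced at $3.20, and a price series with gaps.
print(price_per_oz(3.20, 32))
print(impute_from_next_sale([1.0, None, None, 1.2, None]))
```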
The retailer follows an every-day low price (EDLP) strategy, resulting in little variation in SKU-level gross prices over time. The retailer does provide some shopper club discounts, and consumers sometimes use coupons, resulting in small variation in net prices (gross price minus the discount). Figure 4 shows the weighted average gross and net prices in the yogurt category in the Treatment and Control A stores during the sample period5. The average gross (net) price of yogurt changes post-reset by less than 0.4 (0.5) cents. We also note that much of the change in the treatment store, where the variation is larger, reflects assortment changes, which are not a meaningful source of price variation for estimating price sensitivity.
Despite the limited variation in price, and the fact that it is not central to our study, we conservatively include price to control for potential omitted variable bias.6 To
4For days that a SKU sold no units, its price was imputed with the price from the next day with positive sales. On two days, Oct. 10-11, 2015, the treatment store had a refrigerator malfunction, which was both highly unusual for the store. Hence, we dropped these two days from our sample.
5We first calculate average prices for each subcategory within the category, and the weights are chosen to be the sub-category’s share of category sales. The prices remain very similar if we do this two-step process at the brand or SKU-level.
6However, we strongly caution against a causal interpretation of our price sensitivity estimates for three reasons. First, unlike the typical variation sought to estimate price sensitivity, the primary variation we obtain with the above measure is across individuals, rather than over time within individual. Second, the pricing measure subsumes individual differences in product preferences into the prices. For instance, individuals who consistently buy items that are priced above the average price of the category due to item-level preferences could appear to have positive preferences for price. Third, as mentioned earlier,
produce price series that aggregate over SKU prices to the category level, we combine two suggestions offered by Manchanda, et al. (1999)[18].7 Specifically, for the visits with purchase, we use the price paid as the category price, otherwise we use the individual-specific weighted price where the weights are the share of brands ever bought by the individual (since April 2014). If the individual has never purchased the category, we use the weighted net price calculated from the aggregate data. Although we include price as a control in order to be conservative, we do not interpret the estimated coefficients as price sensitivity per se.
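The three-way fallback described above can be sketched in Python. This is only an illustration under stated assumptions: the function name `category_price` and all of its arguments are hypothetical, and the weighting shown is one plausible reading of "share of brands ever bought by the individual."

```python
from typing import Dict, Optional


def category_price(
    price_paid: Optional[float],
    brand_prices: Dict[str, float],
    individual_brand_shares: Dict[str, float],
    aggregate_weighted_price: float,
) -> float:
    """Category price for one individual on one store visit (illustrative)."""
    # Case 1: the visit includes a category purchase, so use the price paid.
    if price_paid is not None:
        return price_paid
    # Case 2: no purchase on this visit, but the individual has bought before.
    # Weight current brand prices by the individual's historical brand shares.
    total_share = sum(individual_brand_shares.values())
    if total_share > 0:
        weighted = sum(
            share * brand_prices[brand]
            for brand, share in individual_brand_shares.items()
        )
        return weighted / total_share
    # Case 3: the individual never purchased the category, so fall back to
    # the weighted net price computed from aggregate data.
    return aggregate_weighted_price
```

For example, an individual whose past purchases were 75% brand "a" (priced at $0.10/oz.) and 25% brand "b" (priced at $0.20/oz.) would be assigned a category price of $0.125/oz. on a visit without a purchase.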
(a) Treatment Store (b) Control A Store
Figure 4: Comparison of gross and net price in yogurt
3.3.2 Promotions
The retailer does not employ display advertising for dairy products and follows an EDLP strategy. However, there are periods when the shopper club card or coupons provide discounts, which we can identify based on differences between the reported gross and net prices. The data do not report what portion of the monetary discounts is in the form of promotions (shopper club discounts) or coupons. For the purposes of the present analysis, we interpret these monetary discounts as shopper club discounts.8 Such shopper club discounts on items are marked with stickers, so they may draw consumers’ attention towards the category, and, of course, the discount itself can increase the net utility of purchase through price.
some category definitions (e.g., Pasta & Sauce) are not consistent with a product substitution definition for the category. As a result, price sensitivity may be confounded with complementarity leading to more price-seeking consumers.
7They offer two alternative approaches: an individual-weighted average price of brands, where the weights are the share of brands bought by each individual, and an approach that uses as the category price the price paid when an item is purchased, and otherwise it uses the weighted average price.
8Even if these monetary discounts were all through coupon redemptions, it does not pose much of a problem. Instead of our current assumption that the promotional stickers catch the consumer’s eye and attract their attention, we would need to assume that all consumers are aware of the coupons for each category. The presence of a coupon then attracts their attention towards the category. This is identical to the assumptions made for display and feature in standard marketing models.
We define two variables related to these promotions: (a) whether the category has any promotion, and (b) average promotion depth for promoted items. Table 3 shows the descriptive statistics of category promotions. A category is likely to have at least one SKU on promotion ∼ 42% of the days during our sample period. While on promotion, ∼ 4% of the SKUs have a price reduction of ∼ 12.5% from their regular price.
                                                           Min    Median    Mean     Max
Propensity of Promotion across categories                  0.02     0.42    0.49    0.99
Conditional on promotion, mean promotion depth
  (as % of SKU price)                                      0.12    12.53   13.89   94.34
Conditional on promotion, percent of SKUs on promotion     0.33     4.23    8.09   68.18
Table 3: Descriptive statistics of Category Promotions
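As a rough illustration of how the promotion measures summarized in Table 3 could be derived from gross and net prices, consider the sketch below. The function `promotion_measures` and its input layout are assumptions made for exposition; the paper does not describe its implementation.

```python
from typing import List, Tuple


def promotion_measures(prices: List[Tuple[float, float]]) -> Tuple[bool, float, float]:
    """prices: one (gross, net) pair per SKU in a category on a given day.

    Returns (any promotion, mean depth in % among promoted SKUs,
    percent of SKUs on promotion).
    """
    # A SKU counts as promoted when its net price is below its gross price.
    depths = [100.0 * (gross - net) / gross for gross, net in prices if net < gross]
    any_promo = len(depths) > 0
    mean_depth = sum(depths) / len(depths) if any_promo else 0.0
    pct_skus = 100.0 * len(depths) / len(prices) if prices else 0.0
    return any_promo, mean_depth, pct_skus


# Hypothetical category with four SKUs, two of them discounted.
any_promo, mean_depth, pct_skus = promotion_measures(
    [(2.0, 2.0), (4.0, 3.5), (1.0, 1.0), (5.0, 4.0)]
)
```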
3.3.3 Other controls
Apart from the variables mentioned above, we also need to control for seasonality and daily store traffic measures. We control for the latter with the number of daily store-level transactions and weekend dummies. For capturing seasonality, we allow for category-specific month trends. Also, we construct a measure of how much monthly category sales deviate from the annual average.9 This helps us to control for non-linear time-trends in category sales.
4 Experimental Analysis and Preliminary Evidence
In this section we focus on “non-structural” evidence related to the influence of the reset. In particular, we argue that the reset serves as a kind of ‘natural experiment’ that we can use to measure the causal effect of the reset and the related category-level treatments. We first describe an initial analysis employing a simple diff-in-diff for the total reset effect and the corresponding results. We then describe the category-level reset treatments, the appropriate diff-in-diff-in-diff, and the results from this detailed analysis. Finally, we present preliminary evidence consistent with the idea that consumers are learning and changing their preferences after the reset initiates consideration for categories that were previously not considered as often.
4.1 Diff-In-Diff Analysis
In this analysis, we aim to measure the average effect of the reset on store sales. The diff-in-diff analysis compares the change in dairy sales in the treatment store with that of
9This measure is constructed from aggregate monthly category sales data between April 2014 and May 2015.
$$\text{Sales}_{st} = \alpha_s + \alpha_{t>T} + \alpha_{s,t>T} + X_{st}\beta + \varepsilon_{st}, \qquad (1)$$
where the subscript $s$ denotes store, $t$ denotes time, and $T$ the time of the reset. $\text{Sales}_{st}$ stands for dollar sales of dairy in store $s$ at time $t$.10 The store fixed effect $\alpha_s$ controls for store-level differences between treatment and control locations. The time fixed effect $\alpha_{t>T}$ captures the (post-pre) difference in sales common across stores. $X_{st}$ contains other time- and store-varying covariates including week effects, day-of-week effects, and measures of daily traffic. Finally, $\alpha_{s,t>T}$ measures the difference-in-difference estimate of the effect of the layout reset on dairy sales.
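To make the specification concrete, the following sketch estimates the interaction coefficient by ordinary least squares on simulated data. Everything here is an assumption for illustration: the data are synthetic with a built-in effect of 325, the helper `did_estimate` is hypothetical, and week and day-of-week effects are omitted for brevity.

```python
import numpy as np


def did_estimate(sales, store, post, treated_store, transactions):
    """OLS estimate of the store-by-post interaction (the diff-in-diff term)."""
    stores = np.unique(store)
    # Store fixed effects; the first store is the omitted baseline.
    store_dummies = np.column_stack(
        [(store == s).astype(float) for s in stores[1:]]
    )
    # Interaction: 1 only for the treated store in the post-reset period.
    interaction = ((store == treated_store) & (post == 1)).astype(float)
    X = np.column_stack(
        [np.ones(len(sales)), store_dummies, post, interaction, transactions]
    )
    coef, *_ = np.linalg.lstsq(X, sales, rcond=None)
    return coef[-2]  # coefficient on the interaction column


# Simulated daily data: store 0 is treated, stores 1 and 2 are controls.
rng = np.random.default_rng(0)
n_days, reset_day = 200, 100
store = np.repeat(np.array([0, 1, 2]), n_days)
day = np.tile(np.arange(n_days), 3)
post = (day >= reset_day).astype(float)
transactions = rng.normal(3000, 200, size=store.size)
sales = (
    np.array([2000.0, 500.0, 800.0])[store]      # store-level differences
    + 100.0 * post                               # common post-reset shift
    + 325.0 * ((store == 0) & (post == 1))       # true treatment effect
    + 3.9 * transactions                         # traffic control
    + rng.normal(0, 10, size=store.size)         # idiosyncratic noise
)
effect = did_estimate(sales, store, post, 0, transactions)
```

Because the store dummies absorb level differences and the post indicator absorbs the common shift, the interaction coefficient recovers the simulated effect up to sampling noise.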
The two major threats to validity for this analysis are selection on the timing of the treatment and selection of the store that was treated. Neither the timing nor the choice of treatment store for the reset was driven by predictions about future demand (i.e., selection). The timing was driven by an external calendar related to implementing new software that managed operational aspects not specific to the dairy category. Many different stores were in line for this IT implementation, and this store was chosen for this timing essentially “at random.” The store was selected to reset the dairy department (other stores reset other departments); according to conversations with managers, this choice, although not randomly assigned, was a function of experimentation as much as forecasted opportunity.
The retailer identified the two control stores based on their own assessment of best match. The two control locations, in fact, closely resemble the treatment store in key observables including the time since the store opened/was remodeled (within the last 15 years), competing supermarkets within a 5-mile radius, whether the store has an in-house pharmacy, number of aisles within the store, and the categories adjoining the dairy department. Since the stores are within a 10-mile radius of one another, they also share similar market conditions and customer demographics. The primary consumers going to these stores also share similar characteristics both in terms of dairy expenditure per shopping trip and time between visits (see appendix A). Hence, the evidence suggests that the stores serve as effective controls, allowing us to treat the reset in the treatment store as a natural experiment. In other words, in the absence of this treatment, we would expect the stores to follow similar demand trends.
The data also appear consistent with these arguments for the validity of the “natural” experiment. Figure 5 shows the trends in average daily11
dairy sales across the treatment and control locations. The black line in between July and October denotes the time of the reset. The treatment and control locations show very similar time series patterns in dairy sales before reset. This gives credence to the common trends assumption for performing the diff-in-diff analysis. The treatment store is bigger in size, accounting for the higher level of sales. However, these time-invariant differences between stores will be absorbed via store fixed effects, and do not pose a problem to the diff-in-diff analysis. The small increase in the treatment store following the reset is
10We also conduct this analysis using individual-level data.
11Daily average constructed as total weekly sales of dairy divided by the number of days. This smooths out the day-to-day variation in sales. If the week included national holidays when the treatment/control locations were closed, the number of days is adjusted accordingly.
difficult to see from the figure, but apparent upon closer inspection.
Figure 5: Daily sales in dairy across locations
Table 4 shows the results from the diff-in-diff analysis12. The average increase in sales is $325 per day, which accounts for ∼ 2.6% of daily dairy sales at the treatment store pre-reset. Thus, the reset appears to have led to a practically meaningful increase in sales. Note also that an unreported analogous diff-in-diff on individual-level data provides a similar, statistically significant 2.6% increase in spending per individual (aggregating across visits).
                               Estimate    Std. Error
Post-Reset                      1173.68        479.53   *
Treatment Store                 -347.61        135.71   **
No. of Transactions                3.93          0.08   ***
Post-Reset*Treatment Store       325.03        152.15   *

Regressors include store, week and day fixed effects, and no. of daily transactions.
Significance: * - 5% , ** - 1%, *** - 0.1%
Adjusted R-squared: 0.9361
Table 4: Difference-in-Difference Analysis on Aggregate Store Data
4.2 Diff-in-Diff-in-Diff Analysis
The diff-in-diff analysis can reveal the average treatment effect from the reset, but it does not provide information about the various kinds of location and assortment changes that took place as part of the reset. For this second analysis design, we use a diff-in-diff-in-diff (DDD) approach. Before we present the details of the model, we first introduce the category-level reset treatments related to location and assortment.
12Note that this analysis is performed with aggregate store-level dairy sales data alone.
4.2.1 The Category-level Reset Treatments
The reset affected each of the 22 dairy categories in the treatment store differently.13 For instance, Yogurt increased its space allocation from 24 ft. to 36 ft., but remained in the same wall-fed display case. Desserts & Toppings, on the other hand, moved to the opposite side of the aisle, changed display cases and added 15 SKUs to its assortment. Three other categories (Milk, Egg Beaters & Juice) had no changes in assortment or location. In fact, the majority of changes occurred to the categories in the Wall Case, Cheese Bar, and Side Case, which include 15 categories. The other seven categories were in locations that are physically more distant or quite different in appearance (e.g., a pepperoni rack at the end of the cheese bar) and none of these moved. Hence, we focus on the 15 categories for the DDD analysis.14
We categorize 11 broad types of reset actions/treatments implemented by the retailer at the category-level. We use the following definitions for each reset action:
(i) Moved - If the center of the category moved from its original position
(ii) Rearranged - Whether an item (SKU) within the category was displaced horizontally and/or vertically. No. of rearrangements counts the no. of SKUs rearranged within the category.
(iii) Changed Display - Whether the category’s display case was changed to the cheese bar, wall case or a side case. Note that one can move a category without changing its display case, but not vice versa.
(iv) Split Category - If two categories shared the same location earlier, and were separated post-reset
(v) Changed Orientation - If the category was moved from left to right (L → R) to the outer section of the dairy department, or right to left (R → L) towards the inner section of dairy, with respect to the direction of traffic
(vi) ∂Items, ∂Brands - No. of items (SKUs)/brands changed in the category post-reset. The former is equal to #(Items Added) + #(Items Deleted) in the category. A similar definition is used for ∂Brands. No brands were removed post-reset, hence the number of brands changed equals the number of brands added to the category.
(vii) ∂Facings - Difference between the no. of facings allocated to the category post-reset and pre-reset. This proxies for changes in space allocation to the category.
Table 5 shows the descriptive statistics of these actions. The reset actions from Table 5 can be characterized as changes in location (columns 3-10) and assortment (columns 11-13).
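The assortment-related definitions in (vi) and (vii) can be illustrated with a short sketch. The function name, the SKU identifiers, and the facing counts below are hypothetical; the log-difference follows the definition given in the notes to the DDD tables (log after reset minus log before reset).

```python
import math
from typing import Set, Tuple


def assortment_changes(
    items_pre: Set[str], items_post: Set[str], facings_pre: int, facings_post: int
) -> Tuple[int, int, float]:
    """Return (dItems, dFacings, dLogFacings) for one category (illustrative)."""
    added = items_post - items_pre      # SKUs present only after the reset
    deleted = items_pre - items_post    # SKUs present only before the reset
    d_items = len(added) + len(deleted)
    d_facings = facings_post - facings_pre
    d_log_facings = math.log(facings_post) - math.log(facings_pre)
    return d_items, d_facings, d_log_facings


# Hypothetical category: one SKU dropped, two added, facings up from 24 to 36.
d_items, d_facings, d_log_facings = assortment_changes(
    {"sku_a", "sku_b", "sku_c"}, {"sku_b", "sku_c", "sku_d", "sku_e"}, 24, 36
)
```

Note that ∂Items counts additions and deletions together, so a category can have a large ∂Items even when its total number of items is unchanged.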
Changes in location include moving, changing the display case and orientation, and splitting the category into two or more sub-categories. As mentioned in section 2, we argue that location does not generate utility from purchase directly and instead affects only consideration. We argue that in our setting there are four ways location is likely to
13Since dairy requires use of cooling infrastructure like wall-fed freezers, the reset was contained within the same 120ft X 40ft area of the store, and did not disrupt the arrangement of other categories.
14We do use the eliminated 7 dairy categories to control for whether the individual’s shopping trip took them near the dairy department.
influence consumers’ consideration. First, the new location can have different contextual cues to remind the shopper to consider buying, such as type of display case and orientation to traffic. Second, moving categories to high-traffic areas such as the wall case or the central cheese bar may attract attention from more consumers than lower traffic areas (and vice versa). Third, a location change can confuse regular customers of a category who then have to find the category in its new location. Such confusion could reduce consideration. Fourth, moving a category to a new location can change its neighbors. If there are spillovers in attention between neighboring categories, this again changes consumers’ consideration of the category in question. This last change is not explicitly in our treatments, but rather captured in covariates, which we discuss shortly.
The last three treatments in Table 5 pertain to changes in category assortment, which as we mentioned previously are conceptualized to affect both consumption utility and consideration (attention). Adding or deleting SKUs (or brands) from the category changes the choice set faced by consumers, and hence, the utility they derive from category consumption. Assortment changes can also mechanically change the total visual space of the category (number of facings) and change the perception of it (number of brands and items).
Looking at columns 3-13 of Table 5, their linear independence indicates all 11 treatments can be separately identified from one another. For instance, four categories (Butter & Margarine, Shredded Cheese, Sour Cream & Dips, and Yogurt) moved to a different location on the same wall-fed case, without changes in display case or orientation. Chunk Cheese moved to a different style of display case without a change in orientation, and nine categories moved to a different display case and changed orientation (see Table 5).
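The linear-independence condition can be checked mechanically by comparing the rank of the category-by-treatment matrix with its number of columns. The sketch below uses a small hypothetical matrix, not the actual Table 5 values, and the helper name `separately_identified` is an assumption.

```python
import numpy as np


def separately_identified(treatments: np.ndarray) -> bool:
    """True if the treatment columns are linearly independent (full column rank)."""
    return bool(np.linalg.matrix_rank(treatments) == treatments.shape[1])


# Hypothetical rows = categories; columns = [moved, changed display, changed orientation].
independent = np.array(
    [
        [1, 0, 0],  # moved only
        [1, 1, 0],  # moved and changed display
        [1, 1, 1],  # moved, changed display, changed orientation
        [0, 0, 0],  # untouched
    ],
    dtype=float,
)
# If two treatments always occur together, they cannot be told apart.
collinear = np.array([[1, 1], [1, 1], [0, 0]], dtype=float)

ok = separately_identified(independent)
not_ok = separately_identified(collinear)
```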
4.2.2 DDD Analysis Approach
We use a modified DDD analysis approach to measure the effect of each reset action (treatment). We generalize Equation 1 to
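$$y_{ijst} = \alpha_i + \alpha_j + \alpha_s + \alpha_{t>T} + \alpha_{j,s} + \alpha_{j,t>T} + \alpha_{s,t>T} + \sum_{k=1}^{11} \tau_{k,j,s,t>T}\,\gamma_k + X_{ijst}\beta + \varepsilon_{jst}, \qquad (2)$$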
where $i$ indicates individual, $j$ stands for category, and $k = 1, \dots, 11$ stands for treatment type. Therefore, $\tau_{k,j,s,t>T}$ is a dummy that is 1 if category $j$ in the treatment store underwent treatment $k$ during the reset. The dependent variable $y_{ijst}$ includes measures of purchase incidence, purchase quantity, and churn (defined later).
In the above DDD specification, we control for store-, category- and post-reset fixed effects, and combinations of these. These fixed effects absorb time-invariant unobservables that affect categories in each store, store-invariant unobservables affecting categories post-reset, and so on. The effect $\alpha_{s,t>T}$ picks up unobservables that affect the treatment store in the post-reset period, i.e., are common across all categories. In addition, we include individual fixed effects $\alpha_i$. $X_{ijst}$ includes category-, time- and store-varying covariates including controls for category-month-of-year effects, day-of-week effects, measures of daily traffic, year-ago seasonality controls, seasonal time trends, state dependence variables for recency and frequency of purchase within category, category-weekend/holiday effects, and category-specific controls for price, promotions, and depth of promotions.
Finally, we estimate the treatment effects $\gamma_k$ by comparing the categories with $\tau_k = 1$ and $\tau_k = 0$ in the treatment store post-reset. We modify the standard DDD design: rather than allowing unrestricted category-specific treatment effects, $\alpha_{j,s,t>T}$, we constrain them to be the sum of the effects of the reset treatments applied to the category. Formally, we assume that category-specific treatment effects are
$$\alpha_{j,s,t>T} = \sum_{k=1}^{11} \tau_{k,j,s,t>T}\,\gamma_k, \qquad j = 1, \dots, J. \qquad (3)$$
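The additive restriction amounts to a matrix product of the treatment indicators with the vector of treatment effects. The sketch below illustrates this with two of the purchase-incidence estimates reported in Table 6 (Changed Display to Cheese Bar, 0.026, and Split Category, -0.035); the assignment of treatments to the three example categories is hypothetical.

```python
import numpy as np


def category_effects(tau: np.ndarray, gamma: np.ndarray) -> np.ndarray:
    """Implied category-level effects: each row of tau dotted with gamma."""
    return tau @ gamma


# Columns: [changed display to cheese bar, split category].
gamma = np.array([0.026, -0.035])
tau = np.array(
    [
        [1.0, 1.0],  # hypothetical category receiving both treatments
        [1.0, 0.0],  # hypothetical category that only changed display
        [0.0, 0.0],  # hypothetical untreated category
    ]
)
effects = category_effects(tau, gamma)
```

Under this restriction, a category that received both treatments would have an implied effect equal to the sum of the two coefficients, which is what allows 11 treatment effects to be estimated from 15 categories.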
Because our study focuses on location, we also include in $X_{ijst}$ variables that are related to adjacency and are not part of the DDD treatments, per se, but do have variation induced by the reset. We conceptualize the adjacency effects on purchase likelihood as arising from attention and consideration. We term categories the individual visits regularly as high-frequency categories for that individual15. Consumers may be more likely to try a category if it is placed next to a category they visit often. Because the 11 treatments that we specify above do not account for adjacency effects, we add four variables to capture adjacency at the individual level: (1) is category $j$ next to a high-frequency category $j'$ for consumer $i$, (2) has individual $i$ never purchased from category $j$, which is next to high-frequency category $j'$ for consumer $i$, (3) how many categories surrounding $j$ are on promotion, and (4) how many categories not surrounding $j$ are on promotion. Together, we use these measures to assess the effect of adjacency. To measure these adjacency effects we need to be able to separate category complementarities, which are interactions between categories, from adjacency. This separation is possible because the benefit from complementarities between categories is constant, whereas, within individual, the location of those categories varies across stores and over time.16
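A minimal sketch of how the four adjacency variables might be constructed for one individual and one category follows. The function names, the data layout, and the example values are all assumptions for illustration; the high-frequency rule mirrors the median-propensity definition in the accompanying footnote, without its fallback for individuals with little purchase history.

```python
from statistics import median
from typing import Dict, Iterable, Set, Tuple


def high_frequency_categories(propensity: Dict[str, float]) -> Set[str]:
    """Categories whose purchase propensity exceeds the individual's median."""
    cutoff = median(propensity.values())
    return {c for c, p in propensity.items() if p > cutoff}


def adjacency_variables(
    category: str,
    neighbors: Dict[str, Set[str]],
    high_freq: Set[str],
    ever_bought: Set[str],
    on_promo: Set[str],
    all_categories: Iterable[str],
) -> Tuple[int, int, int, int]:
    adj = neighbors[category]
    # (1) Is the category next to a high-frequency category for this individual?
    next_to_hf = int(any(n in high_freq for n in adj))
    # (2) Same, but only if the individual has never bought the category.
    never_bought_next_to_hf = int(next_to_hf == 1 and category not in ever_bought)
    # (3) Number of adjacent categories on promotion.
    n_adj_promo = sum(1 for n in adj if n in on_promo)
    # (4) Number of non-adjacent categories on promotion.
    n_nonadj_promo = sum(
        1 for c in all_categories
        if c != category and c not in adj and c in on_promo
    )
    return next_to_hf, never_bought_next_to_hf, n_adj_promo, n_nonadj_promo


# Hypothetical individual and layout.
propensity = {"Yogurt": 0.5, "Butter": 0.3, "Hummus": 0.05, "Desserts": 0.1}
hf = high_frequency_categories(propensity)
result = adjacency_variables(
    "Hummus",
    {"Hummus": {"Yogurt", "Desserts"}},
    hf,
    ever_bought={"Yogurt", "Butter"},
    on_promo={"Desserts", "Butter", "Yogurt"},
    all_categories=list(propensity),
)
```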
We also include controls for initial location and assortment including variables for the type of case and for the number of items, brands, and facings in the category before the reset (in the control stores, these have the same value in the post-reset period). Including these variables implies that our treatment effects are measured as the effect of changes in location and assortment.
Now that we have described the model formulation, we discuss the threats to validity for this DDD approach. The exogenous timing assumption and use of (valid) control stores address many of the potential concerns. However, the additional difference not only produces additional controls in the form of fixed effects, as described above, but also imposes additional demands, because the treatments are assigned at the category-store level rather than at the store level. As long as the treatment was not assigned to categories in
15Category c is a high-frequency category for individual i if her propensity to buy c exceeds her median propensity to buy in the dairy categories. If i has no purchases in dairy in the previous year or has fewer than 5 shopping trips, we define six categories - Butter & Margarine, Chunk Cheese, Cold Cuts, Shredded Cheese, Sour Cream & Dips and Yogurt - as high-frequency categories for i. Since these are high-revenue categories, the average individual is more likely to consider these categories for purchase.
16Complementarities are sometimes evaluated by including cross-price or cross-promotional variables. If the estimated cross-price coefficients were negative, we would infer the presence of complementarities. In our context, since the variation in prices is minimal, we do not estimate a full set of cross-price coefficients, but include the cross-category promotions for non-adjacent categories.
anticipation of a specific store-time-category level demand shock, the treatments, $\tau_k$, are conditionally exogenous. The main remaining “issue” related to such anticipated shocks is the prediction of unmet demand. Such unmet demand could be tapped by the category-level treatments. However, philosophically, such untapped demand is exactly the cause of our treatment effects. Our theoretical view is that the unmet demand is due to consumers not considering products from which they would otherwise obtain sufficient utility to choose to buy them. If there are no such opportunities, then there is no room for the reset to generate benefit. Hence, we view our estimates as conditional on the nature of the untapped potential.17
Another potential concern for the assortment-related treatments could be if the retailer has pre-existing contracts with major brands to allocate space to them. If such contracts exist, then one can question whether the retailer has autonomy to experiment with the category assortments. However, the retailer has a policy of accepting no slotting allowances, and claims complete control over space allocation and assortment decisions within the stores.
4.2.3 DDD Results
Table 6 shows the results of the DDD analysis.18 We estimate Equation 2 with three dependent variable specifications - category purchase incidence (Column 1), log purchase quantity (Column 2) and churn (Column 3). Category purchase incidence records a binary outcome - whether individual i bought category j during shopping visit (s, t). The second outcome records the quantity purchased. We normalized all the units to ounces or fluid ounces. Using the log enables us to express the treatment effect as a percentage increase in quantity purchased. Finally, we define individual i as a churned customer for category j if i has not purchased j in the past 13 weeks (91 days).19
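The churn definition can be expressed as a simple date comparison. The sketch below is illustrative only: the function name is hypothetical, and two boundary choices are assumptions not specified in the text, namely that a purchase exactly 91 days ago still counts as within the window, and that an individual with no recorded purchase is treated as churned.

```python
from datetime import date
from typing import Optional


def is_churned(last_purchase: Optional[date], visit: date, window_days: int = 91) -> bool:
    """True if the individual has not bought the category within the window."""
    # Assumption: no recorded purchase is treated as churned.
    if last_purchase is None:
        return True
    # Assumption: a purchase exactly `window_days` ago is still in the window.
    return (visit - last_purchase).days > window_days


# Last purchase 106 days before the visit: churned under the 13-week window,
# but not under the 26-week (182-day) robustness window.
churned_13w = is_churned(date(2015, 6, 1), date(2015, 9, 15))
churned_26w = is_churned(date(2015, 6, 1), date(2015, 9, 15), window_days=182)
```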
The results in Table 6 show patterns that are consistent with our primary thesis. Because in most cases the three dependent variables present a common picture (i.e., direction of effect), we focus our interpretation on the probability of purchase and discuss the other measures primarily when they differ meaningfully. First, across all specifications, the reset treatment effects support the theory that location influences consumer attention and consideration. Moving an item reduces its probability of purchase. The number of items rearranged within the category reduces the propensity to purchase. Splitting categories into sub-categories also has a negative effect. These effects suggest that the reset can break existing shopping habits and increase the difficulty of finding the category or preferred items within the category (reducing the probability of purchase). Further, these effects reflect the negative side of a reset that constrains how frequently a retailer wants to adjust a department.
Second, switching display cases or orientation changes the contextual cues of the category. Moving the category to high-traffic areas (the wall case or the cheese bar) increases the purchase propensity and quantity. Moving to a low-traffic area (the side
17We consider this conditioning as similar to the conditioning that every study on advertising faces in terms of conditioning on the kind of advertisement creatives that were used. Of course, if better ads were used this would lead to higher advertising elasticities, but this fact doesn’t invalidate the measurement of advertising elasticities, and neither should it invalidate the measurement of store resets.
18Only the focal parameter estimates are reported here. Refer to Appendix B for the remaining estimates.
19We try a similar specification with a cap of 26 weeks (182 days), and results remained similar.
                                           (1)          (2)            (3)
                                           Purchase     Log Purchase   Churn
                                           Incidence    Quantity

                                           (0.001)      (0.003)        (0.002)
Log # of Items Rearranged                  -0.007***    -0.019***       0.001
                                           (0.001)      (0.002)        (0.001)
Changed Display to Cheese Bar               0.026***     0.060***      -0.033***
                                           (0.002)      (0.007)        (0.004)
Changed Display to Wall Case                0.026***     0.068***      -0.112***
                                           (0.002)      (0.005)        (0.003)
Changed Display to Side Case               -0.026***    -0.078**        0.001
                                           (0.002)      (0.006)        (0.004)
Split Category                             -0.035***    -0.094***       0.161***
                                           (0.003)      (0.008)        (0.003)
Changed Orientation L → R                  -0.042***    -0.106***       0.052***
                                           (0.003)      (0.010)        (0.005)
Changed Orientation R → L                  -0.001        0.003         -0.040***
                                           (0.002)      (0.005)        (0.003)
∂Log Brands                                -0.083***    -0.210*         0.042***
                                           (0.006)      (0.019)        (0.012)
∂Log Items                                  0.094***     0.275***       0.065***
                                           (0.007)      (0.022)        (0.012)
∂Log Facings                                0.051***     0.133***      -0.217**
                                           (0.003)      (0.010)        (0.005)

Adjacency
Adjacent to high frequency category        -0.005***    -0.014***       0.075***
                                           (0.000)      (0.000)        (0.000)
Never Bought* Adj. to high freq. cat.       0.004***     0.007***      -0.280***
                                           (0.000)      (0.000)        (0.000)
No. of adj. cat. on promo.                  0.001***     0.003***       0.000
                                           (0.000)      (0.000)        (0.000)
No. of non-adj. cat. on promo.              0.000        0.000          0.000
                                           (0.000)      (0.000)        (0.000)

Location & Assortment Size before reset     X            X              X
State Dependence                            X            X              X
Category-price controls                     X            X              X
Category-promotion controls                 X            X              X
Category-promotion depth controls           X            X              X
Category-control store fixed effects        X            X              X
Category-treatment store fixed effects      X            X              X
Category-weekend fixed effects              X            X              X
Category-month fixed effects                X            X              X
Individual fixed effects                    X            X              X

No. of Individuals                          65,019       65,019         65,019
No. of Observations                         15,297,720   15,297,720     15,297,720
Adjusted R-squared                          0.4219       0.4469         0.2367

∂log Brands, ∂log Items, ∂log Facings have all been defined as log(Items/Brands/Facings after reset) - log(Items/Brands/Facings before reset)
Robust standard errors in parentheses
Significance: * - 5% , ** - 1%, *** - 0.1%
Table 6: Difference-in-Difference-in-Difference Analysis
case) reduces purchase propensities and quantity. Similarly, changing the orientation of categories from right to left and moving them to the interior of the dairy department (again, a high traffic area) decreases churn, whereas changing orientation from left to right and moving to the outer part of the dairy area decreases the purchase propensity and quantity, and increases churn.
Third, considering the treatments that dealt with changes in assortment, a one-unit increase in the log of category size (∂Log Facings) increases the probability of category purchase incidence by 5 percentage points, so a 1% increase in facings corresponds to roughly 0.05 percentage points. Similarly, a one-unit increase in ∂Log Items raises the probability of purchase by 9 percentage points, but also increases the probability of churn. Combining the churn result with those for purchase incidence and quantity indicates that, although the assortment changes are beneficial for frequent buyers, they increase the uncertainty for infrequent buyers and push them to non-purchase. Changing the brands within the assortment has a detrimental effect across specifications.
Fourth, the adjacency effects also reveal a nuanced influence of neighboring categories. Recall that moving a category’s location also automatically changes its neighbors. Being adjacent to a high-frequency category reduces the probability of purchase by 0.5 percentage points (correspondingly a 1.4% decrease in quantity and a 7.5 percentage point increase in churn). This points towards an attention-stealing story, where more popular neighbors steal customers’ attention away from the category (see Hong et al., 2016 [14] for a related result). However, if the individual has never bought in the category, then having an adjacent high-frequency category increases the chances of trial vis-a-vis being next to a low-frequency category. This implies that consumers are more willing to try, but note that once the consumer picks this category, she is less likely to pick it the next time (due to the dominant negative effect), pointing to lower retention and purchase frequency post-trial. Similarly, a category’s probability of purchase increases if an adjacent category has a discount/promotion, but is unaffected by the discounts in non-adjacent categories. This result suggests that adjacency is important to capturing attention, and that the adjacency effects are not simply proxying for category complementarities.
The above results suggest that location influences choice, which we theorize as operating through attention and consideration. Further, these influences are practically important in magnitude. However, the last set of results about adjacency point to a distinction between short and long-term effects. In the next set of analyses, we explore whether the patterns in the data are consistent with learning being initiated by the reset.
4.3 Evidence of Learning
To shed initial light on the post-purchase stage of our conceptual framework, we wish to investigate whether the reset induced learning among consumers. Traditionally, shifts in market share and changes in purchase probabilities are taken as evidence of learning. We therefore plot the evolution of purchase propensities over time across categories.20 Figure 6 shows this evolution for two categories, Sour Cream & Dips and Yogurt, across the treatment vs. average of the control stores. The black line in the center denotes the day of the layout reset. Sour Cream & Dips is a much lower share category than Yogurt, and hence many individuals would not have tried it in the past. However, because it
20Purchase propensity is calculated from the data by dividing the number of purchases in the category by the total number of store visits for each month in the sample period.
was moved and split from another category, it now occupies a different location that will be adjacent to new consumers’ high-frequency categories. Thus, the reset brought it new attention from many consumers, inducing them to try it, significantly changing its purchase probabilities compared to the control locations. In particular, the purchase probabilities narrow the gap over the months after the reset. On the other hand, yogurt is a high-share category and many more consumers have ample experience in the category. Unsurprisingly, the reset appears to induce less change (though there is still a slight rise in purchase right after the reset) in Yogurt’s purchase probabilities versus the control stores. Similar plots for each category are available in Appendix C. These patterns suggest there may be different short-term and long-term effects of the reset on categories.
(a) Pickles & Salads (b) Yogurt
Figure 6: Pickles & Salads and Yogurt purchase probabilities across stores
We implement the above idea in the DDD context by looking at the reset effects on purchase probabilities over time. If learning is occurring in our context, we should find long-term effects of the reset actions on purchase probabilities that are different from the short-term effects. To test this hypothesis, we repeat DDD specification (1) but add a separate long-term effect for each reset action. In this case, we define short-term as the first month after the reset (Sept. 2015), and long-term as the three months after that (Oct.-Dec. 2015). Table 7 shows the results of this analysis. All the reset actions have significant effects on purchase probabilities in the long term, and some have significant differences from the short-term effects (we report the differences in the table). The effect of being moved worsens with time, perhaps suggesting that consumers do not return to prior consideration levels for the moved categories and instead find new substitutes. The effect of having the category split improves with time, suggesting the newer prominence and assortment of a split category eventually builds up habit (state dependence) or expectations around the higher quality. Interestingly, the effect of the difference in log brands becomes more negative with time, suggesting that the long-term effect of changes in brands is negative. Finally, the effect of the difference in log items increases with time, suggesting that after some time consumers learn that they like the new variety. The reset effects become significantly stronger in the long term for the assortment-related variables (i.e., for changes in Items and Brands). Since the assortment resets affect consumers’ utility of consumption, this evidence is also consistent with learning. Overall, we take these preliminary investigations as indicative
of the need to model the dynamic effects of the reset through learning.
Purchase Incidence               Short-Term Effects    Long-Term Differences from Short-Term
[...]                            [...] (0.001)         [...] (0.002)
Log # of Items Rearranged        -0.006*** (0.001)      0.000   (0.001)
Changed Display to Cheese Bar     0.026*** (0.004)     -0.001   (0.004)
Changed Display to Wall Case      0.028*** (0.003)     -0.003   (0.003)
Changed Display to Side Case     -0.025*** (0.003)      0.000   (0.003)
Split Category                   -0.041*** (0.004)      0.009*  (0.0064)
Changed Orientation L→R          -0.044*** (0.005)      0.005   (0.004)
Changed Orientation R→L           0.001    (0.003)     -0.002   (0.003)
∂Log Brands                      -0.068*** (0.009)     -0.017*  (0.009)
∂Log Items                        0.076*** (0.010)      0.021*  (0.009)
∂Log Facings                      0.055*** (0.005)     -0.006   (0.005)

Controls included (X): Adjacency controls; Location & Assortment Size before reset; State Dependence; Category-price controls; Category-promotion controls; Category-promotion depth controls; Category-control store fixed effects; Category-treatment store fixed effects; Category-weekend fixed effects; Category-month fixed effects; Individual fixed effects.
No. of Individuals: 65,019
No. of Observations: 15,297,720
Adjusted R-squared: 0.4219

Notes: ∂Log Brands, ∂Log Items, and ∂Log Facings are each defined as log(Brands/Items/Facings after reset) - log(Brands/Items/Facings before reset). Robust standard errors in parentheses. Significance: * 5%, ** 1%, *** 0.1%.
Table 7: Difference-in-Difference-in-Difference Analysis with short-term and long-term changes
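Because the long-term column of Table 7 is reported as a difference from the short-term effect, the implied long-term effect of a reset action is the sum of the two columns. The following is a minimal sketch of that arithmetic for a few rows, assuming the additive reading of the table; the dictionary keys and the function name are illustrative, and standard errors are ignored.

```python
# Point estimates copied from Table 7 as (short-term effect, long-term difference).
# Names are illustrative; this sketch ignores sampling uncertainty.
TABLE7 = {
    "Split Category": (-0.041, 0.009),
    "dLog Brands": (-0.068, -0.017),
    "dLog Items": (0.076, 0.021),
    "dLog Facings": (0.055, -0.006),
}


def implied_long_term(short_term, difference):
    """Implied long-term effect = short-term effect + reported difference."""
    return short_term + difference


implied = {name: round(implied_long_term(s, d), 3) for name, (s, d) in TABLE7.items()}
for name, value in implied.items():
    print(f"{name}: {value:+.3f}")
```

Under this reading, the brands effect moves further below zero and the items effect moves further above zero in the long term, which matches the direction described in the text.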
4.4 Summary
The descriptive analyses above demonstrate that the layout reset has an overall positive effect on total sales and on individual-level category purchase incidence. The evidence indicates that the influence of the reset operates through a combination of location and assortment effects. As discussed earlier, we theorize that these location effects do not affect choice utility directly, but instead shape attention, and, as a result, the likelihood of considering a product. Thus, our evidence on location effects, including the influence of the reset actions and the adjacency results, supports an attention effect. We also theorize that assortment effects directly influence both attention/consideration and category preferences (i.e., changing the individual’s perceived quality of the category). We find evidence supporting assortment effects as well. Further, we provided preliminary evidence that some of these location and assortment effects change over time and that the aggregate purchase probabilities change in ways consistent with learning. Such patterns could arise from consumers learning their preferences due to trial induced by the location and assortment changes from the reset. As a collection, these results are consistent with our three-stage conceptual framework. In the next section, we develop a model of consumer demand that can capture the consideration, choice, and learning processes that this DDD evidence supports.
5 Model
We propose that consumers consider and purchase multiple categories in the same shopping trip. In our model, consumers have limited attention, and location (among other factors, including habit) shapes that attention. More attention on a category implies a higher likelihood of consideration. There are no upper limits or lower limits to the number of categories consumers can consider. We envision the following scenario: consumers walk down the dairy aisles and if a category attracts their attention, they stop and think whether to buy in the category or not. We call this process of attracting attention ‘consideration.’ In this scenario, consumers are not aware of prices while walking down the aisle. Prices are revealed once they consider the category. Hence, this is a passive model of attention, where consumers do not plan attention allocation.21
Once a consumer considers a category, they exert effort to make optimal choices based on the information they have at the time, including prices and expectations about their ‘match-value’ with the category. After purchase, consumers gain new information about the quality of the category through their consumption experience, and, as a result, update their expectations about the quality of the category, thereby shaping future purchases.
21In other words, it is not a “rational” model of search where consumers decide optimally whether the anticipated benefits of effortful search/consideration are worth the costs (à la Seiler, 2013[23]). The important modeling distinction is whether consumers make consideration “choices” based on expectations about the net utility for the category. In Seiler’s paper, consumers have expectations on prices and evaluate their expected utility from consumption to decide whether to search in the category. This is equivalent to saying that consumers are aware of prices (and perhaps other features of the category) initially. In our context, it is improbable that consumers are aware of prices in all dairy categories, particularly ones that they don’t regularly visit. Further, given the EDLP strategy, prices don’t vary much, so the incentive to shop for prices is limited. As a result, we formulate a model of passive attention rather than active search.
Thus, in our model, increasing attention can increase the chance of purchase, which can shift expectations either up or down so that the net effect of attention shifters can be short-lived or develop a long-term benefit that is larger than the initial trial bump.
Formally, an individual $i \in \{1, \ldots, N\}$ goes to a store $s \in \{1, \ldots, S\}$ at “time” $t \in \{1, \ldots, T_i\}$, where “time” is an actual store visit. The individual can consider any of the $j \in \{1, \ldots, J\}$ categories in the dairy department. If she considers category j, then $C_{ijts} = 1$ (otherwise 0), and she then makes an optimal (myopic) choice of whether to purchase from the category, $y_{ijts} = 1$, or not ($y_{ijts} = 0$). On purchase, she receives an experience signal upon consumption, which she uses to update her beliefs about category quality. These beliefs form the basis for her future choices. We now develop each aspect of the model in the following sections.
5.1 Consideration
We posit that the first-stage decision of whether to consider category j depends on whether j attracts individual i’s attention22. At the category level, following Nierop et al. (2010)[28], we write a multivariate probit (MVP) model of consideration:

$$C^*_{ijts} = X_{ijts}\alpha + \varepsilon_{ijts}, \qquad j = 1, \ldots, J$$

$$C_{ijts} = I[C^*_{ijts} > 0] \tag{4}$$

$$\vec{\varepsilon}_{its} = (\varepsilon_{i1ts}, \ldots, \varepsilon_{iJts})' \sim N(0, \Sigma).$$
The variables $X_{ijts}$ measure the effect of category j on individual i’s attention. These variables include a range of individual-specific and category-specific controls for how recently, and/or frequently, i purchases j. These variables follow roughly those used in the DDD analysis, including category-specific controls for promotion, size, assortment, and seasonality, and controls for store traffic and weekend effects. In addition, $X_{ijts}$ includes our focal shifters of consideration, which we discuss shortly. The unobserved stochastic shocks $\vec{\varepsilon}_{its}$ are distributed multivariate normal with covariance matrix Σ, measuring the unobserved correlations across categories. If consideration utility $C^*_{ijts}$ is positive, we say that i considers j23.
The setup in (4) above allows the consumer to consider multiple categories simultaneously. The $X_{ijts}$ include layout-specific dummies for the type of display case and the orientation of category j; indicators for whether j is adjacent to categories that i purchases frequently and whether j’s adjacent categories are on promotion; individual state-dependence variables for time since last store visit and last category purchase; and controls for seasonality (linear category-month trends and the monthly sales deviation from the annual average for the category) and store traffic (number of daily store transactions and a weekend dummy). This creates relationships between categories based on spatial proximity and individual purchasing patterns.
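To make the consideration stage in (4) concrete, the following sketch simulates consideration indicators for one shopping trip by drawing correlated normal shocks and thresholding the latent index at zero. It is an illustration under assumed inputs, not the estimation code: the index values standing in for Xα, the correlation matrix, and the function names are all hypothetical.

```python
import math
import random


def cholesky(matrix):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(matrix)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(matrix[i][i] - s)
            else:
                lower[i][j] = (matrix[i][j] - s) / lower[j][j]
    return lower


def draw_consideration(index, corr, rng):
    """Draw C_j = 1[index_j + eps_j > 0] with eps ~ N(0, corr)."""
    lower = cholesky(corr)
    z = [rng.gauss(0.0, 1.0) for _ in index]
    eps = [sum(lower[j][k] * z[k] for k in range(j + 1)) for j in range(len(index))]
    return [1 if index[j] + eps[j] > 0 else 0 for j in range(len(index))]


# Hypothetical values of X*alpha for three categories and a hypothetical
# correlation matrix (unit diagonal, as required by the scale normalization).
index = [0.4, -0.2, 1.1]
corr = [[1.0, 0.3, 0.0],
        [0.3, 1.0, 0.0],
        [0.0, 0.0, 1.0]]
rng = random.Random(0)
draws = [draw_consideration(index, corr, rng) for _ in range(5000)]
shares = [sum(d[j] for d in draws) / len(draws) for j in range(3)]
print(shares)  # each share should be close to the marginal probability Phi(index_j)
```

Because each shock has unit variance, the simulated consideration share of a category approximates the standard normal CDF evaluated at its index, while the off-diagonal correlation only changes how often categories are considered together.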
22Honka, Hortacsu, Vitorino (2014)[15] use survey data to distinguish awareness/attention from consideration. In our setting, we cannot distinguish between these two empirically and use these terms interchangeably.
23Note that $C^*_{ijts}$ can be multiplied by a different positive constant, and the outcome $C_{ijts}$ remains unchanged. Hence, the coefficients at the category consideration level are identified up to scale, and we need to normalize the diagonal elements of Σ to 1, making it a correlation instead of a covariance matrix.
Economically, two categories are complements when their cross-price effects on demand are negative, that is, when a price increase in one category lowers demand in the other. Classical examples of complements include laundry detergent and fabric softener, cake mix and cake frosting, and so forth. Complementarity between categories depends directly on the categories’ characteristics, and is independent of the spatial distance between them in stores. However, spatial correlations can enhance/detract from the baseline complementarity between categories through the channel of consumer attention. For instance, consumers might be more/less likely to buy from categories adjacent to their most frequently purchased categories. Or, having a promotion in a category may pull consumers’ attention away from adjacent categories. Our model allows for these relationships to exist across the J categories through the X matrix. All other unobserved cross-category correlations, including complementarity, are captured in the off-diagonal elements of the correlation matrix Σ.24
5.2 Consumption Utility & Learning
Moving from the consideration stage, individual i then chooses whether to buy within the considered categories, given her current belief of category quality $Q_{ijt}$. In a learning framework similar to Shin, Misra & Horsky (2012)[24], consumers update their quality beliefs for each category j over successive category purchases. After every category purchase, consumers receive a quality signal arising from their consumption experience. These quality signals are centered around the true match quality between individual i and category j, $Q_{ij}$,

$$Q^E_{ijt} \sim N(Q_{ij}, \sigma^2_{E,j}), \tag{5}$$

where $\sigma^2_{E,j}$ is the variance of the experience signals.
Initially, consumer i starts from an initial quality belief $Q_{ij0}$ about the true mean quality $Q_{ij}$,

$$Q_{ij0} \sim N(\mu_{ij0}, \sigma^2_{ij0}), \tag{6}$$

where $\mu_{ij0}$ and $\sigma^2_{ij0}$ are the mean and variance of the initial beliefs about category j’s quality for individual i at time 0. In the Bayesian learning setup, the prior beliefs at time t are the same as the posterior beliefs at time t − 1, which are distributed as

$$Q_{ij,t-1} \sim N(\mu_{ij,t-1}, \sigma^2_{ij,t-1}). \tag{7}$$

We assume that quality beliefs at time t − 1 are normally distributed, so that posterior beliefs at time t also remain normal after Bayesian updating. At time t, individual i goes to store s and, conditional on consideration, her utility from category j is given as

$$U^*_{ijts} = Q_{ij,t-1} + W_{ijts}\beta + \omega_{ijts}, \qquad j \in \{k : C_{ikts} = 1\}$$

$$\vec{\omega}_{its} = (\omega_{i1ts}, \ldots, \omega_{iJts})' \sim N(0, \Omega). \tag{8}$$
24Technically, complementarity operates at the level of consumption utility, not consideration. In practice, the correlation among unobservables in the consideration and choice stages cannot be identified separately without other restrictive assumptions. As a result, we allow it only at the consideration stage, which also proxies for the choice stage.
$W_{ijts}$ includes variables that affect the consumption utility of the category, such as price, and the number of unique SKUs and brands in the category. Since i can buy multiple categories in the same shopping trip, the utility structure across categories $U^*_{ijts}$ again follows a multivariate probit structure. The stochastic error terms for all J categories ($\vec{\omega}_{its}$) are distributed multivariate normal with covariance matrix Ω. Identification restrictions similar to those on Σ above need to be placed on Ω. However, it may be difficult (or impossible) to identify cross-category correlations in both the consideration stage and the purchase incidence stage (Nierop et al., 2010[28]). Hence, we restrict Ω to be a diagonal matrix, which reduces (8) above to a series of independent binary probits.
Given the information set at the end of period t− 1 from (7), Qij,t−1 is a stochastic variable. We assume that individuals are myopic and risk-neutral and hence, base their decision to buy/not buy on the current expected utility with respect to quality beliefs:
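$$E_{ij,t-1}(U^*_{ijts}) = E_{i,t-1}(Q_{ij,t-1}) + W_{ijts}\beta + \omega_{ijts}, \qquad j \in \{k : C_{ikts} = 1\}$$

$$\phantom{E_{ij,t-1}(U^*_{ijts})} = \mu_{ij,t-1} + W_{ijts}\beta + \omega_{ijts}$$

$$y_{ijts} = I[E_{ij,t-1}(U^*_{ijts}) > 0]. \tag{9}$$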
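The consumer buys in the category if expected consumption utility $E_{ij,t-1}(U^*_{ijts})$ is positive. Following DeGroot (1971)[6], we can write out the posterior mean and variance at time t in terms of the initial mean and variance of the quality beliefs:

$$\mu_{ij,t} = \sigma^2_{ij,t}\left(\frac{\mu_{ij,0}}{\sigma^2_{ij,0}} + \frac{\sum_{\tau=1}^{t} y_{ij,\tau}\, Q^E_{ij,\tau}}{\sigma^2_{E,j}}\right), \qquad \sigma^2_{ij,t} = \left(\frac{1}{\sigma^2_{ij,0}} + \frac{\sum_{\tau=1}^{t} y_{ij,\tau}}{\sigma^2_{E,j}}\right)^{-1}. \tag{10}$$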
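Shin, Misra & Horsky (2012)[24] rewrite the above in terms of the perception bias $\nu_{ij,t} = \mu_{ij,t} - Q_{ij}$, the difference between the mean quality belief at time t and the match quality, and the signal noise $\eta^E_{ijt} = Q^E_{ijt} - Q_{ij}$. Thus,

$$\mu_{ij,t} = Q_{ij} + \frac{(\sigma^2_{E,j}/\sigma^2_{ij,0})\,\nu_{ij,0} + \sum_{\tau=1}^{t} y_{ij,\tau}\,\eta^E_{ij,\tau}}{(\sigma^2_{E,j}/\sigma^2_{ij,0}) + \sum_{\tau=1}^{t} y_{ij,\tau}}. \tag{11}$$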
As the number of purchases of category j, $\sum_{\tau=1}^{t} y_{ij,\tau}$, increases, $\mu_{ij,t}$ tends to the match quality $Q_{ij}$ as the resultant bias term tends to zero. After the Bayesian updating from Equation 10, $\mu_{ij,t}$ acts as the mean belief for i at the (t + 1)th shopping trip.
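The updating rule can be checked numerically. The sketch below computes the posterior mean once with the standard normal-normal update and once in the perception-bias form, and confirms that the two agree. All numbers are hypothetical, the function names are illustrative, and only trips with a purchase contribute a signal, so the list of signals has one entry per purchase.

```python
def posterior_mean_var(mu0, var0, signals, var_e=1.0):
    """Normal-normal update of quality beliefs after the observed experience signals."""
    n = len(signals)
    var_t = 1.0 / (1.0 / var0 + n / var_e)
    mu_t = var_t * (mu0 / var0 + sum(signals) / var_e)
    return mu_t, var_t


def posterior_mean_bias_form(q_true, nu0, var0, noises, var_e=1.0):
    """Same posterior mean, written with the initial bias nu0 and the signal noises."""
    ratio = var_e / var0
    return q_true + (ratio * nu0 + sum(noises)) / (ratio + len(noises))


# Hypothetical numbers: true match quality 1.0, pessimistic initial belief 0.2.
q_true, mu0, var0 = 1.0, 0.2, 0.5
noises = [0.3, -0.1, 0.2, -0.4]          # eta = signal - true quality
signals = [q_true + e for e in noises]   # one experience signal per purchase

mu_t, var_t = posterior_mean_var(mu0, var0, signals)
mu_alt = posterior_mean_bias_form(q_true, mu0 - q_true, var0, noises)
print(mu_t, mu_alt, var_t)
```

With the signal variance fixed at 1, four purchases move the mean belief from 0.2 to roughly 0.73, most of the way to the true match quality of 1.0, and the weight on the initial bias shrinks further with every additional purchase.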
For defining the initial conditions, we let the means and variances of the initial perception bias $\nu_{ij,0}$ and uncertainty $\sigma^2_{ij,0}$ be functions of past purchases $NP_{ij}$ and whether they have never bought the category $NB_{ij}$. These variables are defined at the time of individual i’s first visit to the store. The mean and variance of the initial perception bias are

$$\nu_{ij,0} = \delta^Q_{0,j} + \delta^Q_1 NB_{ij}$$
Note that µij,0 = Qij + νij,0, and hence, defining (12) for the initial belief mean µij,0 or the initial perception bias νij,0 is equivalent.
Finally, to close the model, we assume the distribution of true match qualities $Q_{ij}$ to be normal for each category. Hence,

$$Q_{ij} \sim N(Q_j, \sigma^2_{Q_j}), \qquad j = 1, \ldots, J. \tag{13}$$
5.3 Identification
In this section, we informally discuss the identification of the above model. Broadly, this discussion covers three aspects. First, how do we identify the effects of the location and assortment-related variables? Second, how do we separate consideration from learning? And finally, given our learning model, how do we identify the learning-related parameters?
Our main research questions relate to measuring the effect of location and assortment variables on consumer choices. As discussed previously, our design leverages the reset timing, which was assigned randomly due to IT software changes. We do not repeat these arguments here, but note that the same essential arguments provide exogenous variation in assortment and location across individuals in this structural analysis. The main difference relates to the non-linear model components. We discuss the critical aspects of these model components next.
In our data, consideration and choice are not separately observed by the researcher (i.e., consideration is unobserved to the researcher). To properly identify these functions, we require a valid excluded variable that enters consideration and not utility. Location is an ideal exclusion restriction. In-store location trivially satisfies that it does not affect the utility of consumption and it also does not affect the trade-off between outside good consumption and inside good consumption, where the outside good is disposable income to be spent on other products and services. Hence, location influences choices only through consideration, which, in our model, is determined by attention.25 With location as a natural exclusion, we can identify the two processes. While this is our primary exclusion restriction, we also assume that consumers do not observe category prices without considering the category and that category quality expectations do not influence consideration. We argue that these assumptions are consistent with a passive attention process. We do include variables related to state dependence (reflecting habits and shopping patterns) and promotions (reflecting attention grabbing in-store display) in the consideration process (and not in the utility/choice process). However, we make these modeling choices to ensure consistency between the theoretical basis of our passive attention model and the variables that should be included in such a process, rather than to obtain identification of our model.
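The role of the exclusion restriction can be illustrated with a single category. Assuming independent unit-variance errors in both stages (as under the diagonal restrictions used in this paper), the purchase probability is the product of a consideration probability and a conditional purchase probability, and location moves only the first factor. The baseline indices below are hypothetical; 0.369 is the Coffin Case: Outside estimate from Table 8, measured relative to the side case.

```python
import math


def norm_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def purchase_probability(consideration_index, utility_index):
    """P(consider) * P(buy | consider) for one category with independent errors."""
    return norm_cdf(consideration_index) * norm_cdf(utility_index)


# Hypothetical baseline indices for a category displayed in a side case.
base_consideration = -0.5
utility_index = 0.2

side_case = purchase_probability(base_consideration, utility_index)
# Location shifts only the consideration index; the utility index is untouched.
coffin_case = purchase_probability(base_consideration + 0.369, utility_index)
print(side_case, coffin_case)
```

In this sketch the ratio of the two purchase probabilities equals the ratio of the two consideration probabilities, which is the sense in which location affects choices only through consideration.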
Regarding the learning model, we need to identify the population-level true qualities, the variance across individuals of the true qualities, and the functions for the initial mean bias and initial uncertainty, respectively, $\{Q_j, \sigma^2_{Q_j}, \nu_{ij,0}, \sigma^2_{E,j}\}$. First, to obtain this (limited) set of learning parameters, following Shin, Misra & Horsky (2012)[24], we have
25As noted previously, although we prefer to model consideration as a function of attention, we cannot rule out that some or all of the location effects could also be explained by a rational model of search. However, a comparison of these two approaches is beyond the scope of the current paper.
already fixed the variance of the experience signals, $\sigma^2_{E,j}$, to 1. For the population mean of the true qualities, estimates are based on the “long-run” average propensity to purchase in the category. The variance of the true quality distribution across individuals is estimated from the variation across individuals in this long-run average propensity to purchase in the category. For the initial conditions parameters, we use the long series of past purchases in our panel that are prior to our focal estimation period. For the initial mean bias, we use variation across categories in the average initial purchase propensity, as well as the average difference in the initial propensity to purchase for those that had never bought (vs. bought) the category in the pre-estimation period. For the initial uncertainty, technically two data features are used in estimation. First, the pace of the shift from the initial propensity to the long-run propensity is related to the number of purchases and whether the individual never bought in the category. Second, the uncertainty induces extra noise in the choice process (see Equation 12), so that as more purchases are made, choices become more predictable. Qualitatively, both of these features help with estimating uncertainty. We note that, because of the need for information about “long-run” vs. initial propensities to purchase, we restricted our data for the structural analysis to the subset of individuals with 5 or more visits during the data period. This removes around 30% of the individuals, but individuals who visit less often are less important to overall sales.
6 Estimation
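The full set of parameters for the Bayesian consideration and learning model is given as $\Theta = \{\alpha, \Sigma, \beta, \{Q_j, \sigma^2_{Q_j}\}_{j=1}^{J}, \delta^Q, \delta^\sigma\}$. Hence, the category consideration and choice stages are now given as

$$C^*_{its} = X_{its}\alpha + \varepsilon_{its}, \qquad \varepsilon_{its} \sim N(0, \Sigma)$$

$$E_{ij,t-1}[U^*_{ijts}] = Q_{ij} + \frac{(\sigma^2_{E,j}/\sigma^2_{ij,0})\,\nu_{ij,0} + \sum_{\tau=1}^{t-1} y_{ij,\tau}\,\eta^E_{ij,\tau}}{(\sigma^2_{E,j}/\sigma^2_{ij,0}) + \sum_{\tau=1}^{t-1} y_{ij,\tau}} + W_{ijts}\beta + \omega_{ijts}.$$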
Given the above model, the data likelihood is written as

$$L(y \mid X, W, \Theta) = \prod_{i=1}^{N} \int_{Q_{i1}} \cdots \int_{Q_{iJ}} \int_{\eta^E} \Big( \cdots \Big)\, d\eta^E_{ijt}\, dQ_{i1} \cdots dQ_{iJ}.$$
In the maximum likelihood paradigm, calculating the above likelihood function would require summing over each of the $2^J - 1$ possible consideration sets for each individual, making it unwieldy. Moreover, it would require simulating and integrating over the experience shock signal noises, $\eta^E_{ijt}$, a high-dimensional integral. However, in the Bayesian paradigm, if we draw the latent consideration ($C^*_{ijts}$) and purchase ($U^*_{ijts}$) utilities (Tanner & Wong, 1987[26]), we need to make only J comparisons. Further, the $\eta^E_{ijt}$ can be drawn from a prior distribution of N(0, 1). These simplifications reduce the demands at each
computational iteration and also allow the data to inform the distributions and set of consideration sets that we integrate over, which can dramatically improve the computational performance. For these reasons, we chose to estimate the model using a Markov Chain Monte Carlo (MCMC) method. For the purposes of the current study, we also fix Σ to be a diagonal matrix, which significantly reduces computation costs.26
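The combinatorial burden described above is easy to see by enumeration. The sketch below lists every non-empty consideration set for a hypothetical number of categories and compares the count with $2^J - 1$; the value of J and the function name are chosen only for illustration.

```python
from itertools import combinations


def nonempty_consideration_sets(num_categories):
    """Enumerate every non-empty subset of the categories {0, ..., J-1}."""
    categories = range(num_categories)
    return [subset
            for size in range(1, num_categories + 1)
            for subset in combinations(categories, size)]


J = 10  # hypothetical number of categories, for illustration only
sets = nonempty_consideration_sets(J)
print(len(sets), 2 ** J - 1)  # both counts equal 1023
```

With the latent utilities drawn as in the data augmentation approach, each trip instead requires only one sign comparison per category, that is, J comparisons rather than a sum over all of these sets.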
To complete the Bayesian model, we need to specify our prior. We assume standard conjugate distributions for the regression coefficients α and β, for the initial condition parameter vectors, $\delta^Q$ and $\delta^\sigma$, and for each of the J true quality population means in the vector $Q_j$. We use diffuse priors for these normally distributed parameters. For the precisions of the true quality population means, we draw each precision $\sigma^2_{Q_j}$ using a gamma distribution parameterized as a scaled chi-squared distribution with a scale of 1 and J + 3 degrees of freedom. Thus, the joint posterior likelihood of the parameters conditional on the data is
$$L(\Theta \mid X, W, y) \propto L(y \mid X, W, \Theta)\,\pi(\Theta) \propto \Big[\, \cdots \,\Big]\,\pi(\Theta). \tag{14}$$
We estimate the model with a Metropolis-in-Gibbs MCMC sampler. In Appendix D, we provide the details of the full conditional posterior distributions and the sampling algorithm. For the estimation of the parameters, we discard the first 50,000 iterations of the sampling chain as burn-in, and keep the next 100,000 iterations for analysis from which we retain every 10th draw. We inspect the iteration plots to determine that the sampler converged to the stationary posterior distribution.
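The burn-in and thinning schedule can be written down directly. The sketch below applies the settings stated in the text (50,000 burn-in iterations, 100,000 kept iterations, every 10th draw retained) to a list of iteration indices standing in for parameter draws. Which draw within each block of ten is retained is a convention assumed here, and the function name is illustrative.

```python
def retained_draws(chain, burn_in, keep, thin):
    """Drop the burn-in, keep the next `keep` iterations, retain every `thin`-th one."""
    kept = chain[burn_in:burn_in + keep]
    return kept[thin - 1::thin]


# Iteration indices stand in for parameter draws; settings follow the text.
chain = list(range(1, 150_001))
draws = retained_draws(chain, burn_in=50_000, keep=100_000, thin=10)
print(len(draws), draws[0], draws[-1])
```

Under these settings, the posterior summaries are based on 10,000 retained draws.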
We note that we subsample our data to accomplish two goals. First, our learning model requires a reasonable number of observations per individual due to the individual level heterogeneity. As a result, we subsample to include only individuals with at least 5 shopping trips. This criterion reduced the sample of individuals to 42,108. Second, due to the computation time, estimating the model on the full sample using the Bayesian estimation technique is prohibitive. Instead we use approximately a 5% sample of 2,100 individuals. This sample was drawn randomly from the set of those with at least 5 shopping trips.
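The two-step subsampling can be sketched as a filter followed by a random draw without replacement. The household identifiers and trip counts below are synthetic and exist only to make the example runnable; the function name, seeds, and sample size in the example are likewise illustrative.

```python
import random


def subsample_households(trip_counts, min_trips, sample_size, seed):
    """Keep households with at least `min_trips` visits, then sample without replacement."""
    eligible = sorted(h for h, n in trip_counts.items() if n >= min_trips)
    return random.Random(seed).sample(eligible, sample_size)


# Synthetic trip counts for illustration only; ids and counts are made up.
rng = random.Random(1)
trip_counts = {household: rng.randint(1, 12) for household in range(1000)}
sample = subsample_households(trip_counts, min_trips=5, sample_size=50, seed=2)
print(len(sample), min(trip_counts[h] for h in sample))

# Sampling fraction implied by the figures in the text: 2,100 of 42,108.
print(round(2100 / 42108, 4))
```

The last line confirms that 2,100 of the 42,108 eligible individuals is just under a 5% sample.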
26This choice of fixing Σ to diagonal was determined after estimating a model without learning with the full correlation matrix. In that model we found that all of the marginal distributions for the correlations covered zero.
Parameter                                    Posterior Mean    Posterior Std. Dev.

Adjacency:
  Adjacent to high-frequency category           -0.185            0.09
  Never Bought*Adj. to high-freq. cat.           0.414            0.05
  Adjacent category on promotion                -0.010            0.03

Category Size & Promotion:
  Log no. of facings                             0.085            0.08
  Promotion dummy                                0.160            0.04
  Promotion depth                                2.543            0.72

State Dependence:
  Purchased last month                           0.115            0.05
  Log weeks since last visit                     0.386            0.02
  Log weeks since last purchase                 -0.693            0.03

Location:
  Coffin Case: Outside                           0.369            0.13
  Coffin Case: Inside                            0.355            0.13
  Cheese Bar: Outside                            0.192            0.11
  Cheese Bar: Inside                             0.183            0.09
  Wall-fed case                                  0.172            0.08
  Side case                                      0.000            -

Other controls:
  Weekend                                        0.215            0.03
  Log no. of daily transactions                 -0.185            0.10
  Monthly sales dev. from annual avg./1000       0.068            0.01
  School Start dummy                            -0.016            0.05
  Category fixed effects                         X
  Category-month trends                          X
Table 8: Consideration stage estimates
We now present results from a preliminary formulation of our model that does not yet incorporate the effects of the full set of reset treatments (discussed in Section 4.2) on the attention variables. We therefore view the current results as a proof of concept, rather than final results. The estimates are reported in Tables 8 and 9, and correspond to the parameters related to consideration and utility, respectively. We begin our discussion with consideration and then turn to the parameters related to utility and learning.
7.1 Consideration Model Parameters
We begin with our focal variables related to adjacency. We find that adjacency plays an important role in consideration. Moreover, our results are consistent with the analyses of section 4.2.3. First, being adjacent to a category tha