How is big data changing marketing?


(piano music) – Technology allows companies to track their customers’ behavior
like never before. It’s made it easier for companies to run their own experiments,
and when it comes to marketing, they can use
insights from data analysis to customize advertising and sales down to the individual consumer. So what are companies learning, and how are they using
the data they collect? Welcome to The Big Question, the monthly video series
from Chicago Booth Review. I’m Hal Weitzman, and with
me to discuss the issue is an expert panel. Sanjog Misra is the Charles
H. Kellstadt Professor of Marketing and Neubauer
Family Faculty Fellow at Chicago Booth. His research uses data-driven models to examine how consumers make choices and firms make decisions
on pricing, distribution, and sales force management. Ted Buell is Head of
Insights and Analytics for Google’s sales and marketing
organization in the US, in which capacity he advises
Fortune 500 companies in retail, tech, and telecom
on using digital marketing. Andrew Appel is President and CEO of IRI, a provider of big data
and predictive analytics for companies in the
consumer packaged goods, healthcare, retail, and media industries. And Mike Boush is Senior
Vice President, E-Business, and Chief Digital Officer at
Discover Financial Services, the credit card issuer, direct bank, and electronic payment services company. And all three of our industry
panelists are Booth MBAs, so welcome back to the
University of Chicago. Sanjog Misra, let me start with you. When we say big data, what do we mean by big data in this context? – So let me just start by, kind
of, thinking about big data the way firms typically
think about big data, which is to, one, think
about the volume of data, the amount of data that we have and that’s grown exponentially in the last, you know, decade or two. Then there’s the aspect of
the different types of data that we’ve been seeing,
and that’s kind of, there’s been, you know, an explosion in the variety of data that we see, the structured and unstructured data, textual data,
nontextual data, and so on. Finally, we have the velocity
at which the data comes in, and that’s the speed, so
we have real time data that we didn’t have. The cadence of data has changed a lot, and that’s typically
how firms have kind of defined big data. My own perspective is slightly different. I think big data is kind
of a term that captures different things to different people, in particular the way I
like to think about it is, data is big every time it
makes you feel small, right? So the idea being, this
concept is a function not only of the amount
and of the different types of data you have, but
also the infrastructure and the people that you have
that deal with this data, so that’s the way I think about data. – Okay, what about at Google, Ted Buell? Is it, you are dealing
with vast amounts of data, so how do you, I mean,
that’s obviously different. How do you parse that for
companies you work with? – Yeah, I think overall big data is really about helping businesses
and marketers specifically make better decisions. So there’s different forms of it. I think one form is the data
that companies and marketers collect based on the
relationship with the customer, whether it’s purchase
history or their preferences. I think another form of
data is the interactions that consumers have with that company’s or that marketer’s assets,
whether it’s their website or their stores or different aspects. And then the third I would say
is how brands and marketers use existing products,
whether it’s to understand consumer behavior at large, so how, what are people searching for
on Google and how does that, those trends inform
their marketing strategy, or what kind of videos are
people watching on YouTube, and ultimately how does
that behavior help them understand how to navigate the space. – Okay. Andrew Appel, how has
this changed marketing? What can we do now that we couldn’t do before this revolution started? – Oh, I think in a lot of ways. I think you, you know,
I guess I think of data as the breadth of information
that you can get around a business decision that
needs to be made or activated. And so there is, you
know, if you think about the typical day in the
life of any one of our kids or consumers, right, they
are leaving a footprint of information that is
exponentially bigger than it was ten years ago. And that footprint, right, of
all the different behaviors and all the different sites they’re on or where they physically are
and what device they are on and how they interact with
others in the ecosystem has an extraordinary impact on marketers, because if the ultimate
role of marketing is to get consumers to ultimately
buy an additional product or have a propensity to
buy an additional product in the future, if you take
kind of the economic view, then you have all this
explosion of information about what consumers are
doing at any given moment, and the question is how you then analyze that at speed with sophisticated models
to ultimately shape how you stimulate those consumers to take a different decision. – Okay. Mike Boush, what about in your
industry, in your company, how are you, what are
you doing differently now that you couldn’t do,
you know, say ten years ago? – I think that there is such
a large set of information that has been available to use
for financial services firms and insurers and so forth
for a long period of time that we’re comfortable in
the industry using big data, and it’s not as recent a phenomenon for some industries as others. I think what’s changed
in the last few years, first, is the observational data points that are attached to mobile devices, and each time somebody
clicks on the internet, they’re teaching it what
they’re interested in, and a lot of that data
was just not computable in a practical sense a
few years ago for us. And now it’s rounding out the picture of the consumer for us, and
I think it’s allowing us to determine the difference
between what somebody states and what somebody does, and
in a lot of cases, you know, in past research, we
would have to ask somebody what they thought, and now we can infer it through their actions. – Kind of revealing consumer preferences in an interesting way. Sanjog Misra, this hasn’t
just affected marketing, it’s affected how
companies make decisions, so tell us about that. How has that changed? – What firms have figured out about big data is that the data doesn’t
exist only on consumers; it also exists about your employees, it exists about your business practices, about your suppliers, about competition, about the environment,
and when you start to look at all of that, kind of,
in some cohesive way, there’s a lot of things
that you could do now that you couldn’t do before, right? So there’s been projects
that I’ve worked on where we’ve optimized sales
force compensation plans because you can think
about the amount of data that a particular
salesperson generates through his or her actions over
time, that in itself is a huge amount of data
at the individual level. So what can we learn about how they react to their compensation environment,
and then re-optimize it? So that’s something that’s
entirely new and different that we couldn’t have done
five, ten, twenty years ago. We’ve re-optimized things
like the standard marketing aspects, like promotions
for a firm like MGM, where you’ve got millions
upon millions of customers and each one of them reacts
completely differently to a set of offers, and now
you start thinking about not who do you target and
how do you segment markets, but also how do I individualize the offers that I actually make to these customers? How can I personalize them? And this has been an
idea that’s been around for a really long time, the
idea of targeting individuals with an individualized offer. Now we are at a point where
we have enough information as well as computing power
to actually make that happen. – Okay, and you’ve done
some research on that. – I have. So I was giving an example of like, the sales force
compensation project was for a large medical devices company. We re-optimized, kind
of, promotions for MGM, there’s a project that
I’ve been working on currently, trying to figure out how we dynamically price. When I say dynamic
pricing, for each customer we try and figure out what
their willingness to pay would be for a particular
variant of the product, and that’s done in about
18 milliseconds, right? So that’s the other aspect of something that we have to talk about,
which is with large amounts of data, there’s also,
data is also perishable. It’s only relevant for a
really short amount of time, and you have to make the correct action in that short amount of time. So there’s a page loading,
so for example, you know, you’re going through Google
and you want to figure out what particular creative to
show someone as a page loads up. You have about anywhere in
between 25 and 200 milliseconds to make that decision, and
there’s an entire industry that’s competing on that
front, so there’s a lot of interesting things going on there. – And I would say if you’re a marketer the fact that certain data only is good for a certain period of time, I think that’s actually a good thing that consumers are generating
more and more data. Obviously potential challenges with that, but that actually gives you an opportunity to understand those moments
and those different ways consumers are making decisions
or evaluating decisions for your product, for your
category, for your brand, and I think that can inform what you do. – Andrew Appel, you’re working with lots of different companies. What kinds of data are
most useful for them? – Oh, look, you can boil it
down to an incremental data set that allows them to
improve the effectiveness of a business process,
whether it be supply-chain, store, or marketing, right? And so, you know, we’re amalgamating 50,
60 different data sets, whether it’s what’s physically on a shelf, what’s in a supply-chain,
what’s been sold last week, what’s somebody’s viewed
on a television show, where they spend their money outside of a grocery store,
inside of a grocery store. Each incremental piece of
data has, for it to be useful, obviously has a marginal
value on improving the efficiency or
effectiveness of the process. And so that’s how we think about each incremental piece of this big puzzle that we’re putting together: how does it drive efficiency
and effectiveness in, kind of, the business to business
collaboration between the companies and then how does it drive
more accurate insight into the individual
consumer decision-making. And some of it decays and
some of it’s not, right? Some of these behaviors,
there’s a lot of data that’s really valuable for a minute, and then there’s an equally
large set of information that’s valuable for two years. ‘Cause people’s buying behaviors
don’t shift that quickly. And so what drove someone
to kind of bias themselves for a certain purchase
on a certain retailer or a certain store or
a certain product today is not that different than a year ago. You know, it will be five years from now, and now a little bit of
context about where they are, for example, like mobile
context can help you target an advertisement or a
promotion at the point of the decision, that’s different. – Okay, Ted Buell, do you
think companies are collecting enough data or the right kinds of data? – I think companies are certainly
collecting a lot of data because of, first of all, there
is a lot of data collected and second of all, there
are a lot of resources to be able to do that. I think the key is really
the second point, which is, are they collecting the right
data or, more specifically, are they evaluating the right data? So I think that’s one of the
biggest challenges we see today is not just are they, are marketers, collecting the right amount of data, but the right data to make decisions. And so we look a lot at,
what are the data points that really show intent for
shopping for a retail category or shopping for financial
services, whatever it may be, and then ultimately what
are those data points that show intent and can
inform a business decision. – So also, let me just add one thing. Like data, so for your answer,
is incredibly messy, right? It comes at lots of different levels, and so very few companies,
actually, have a lot of savvy at taking
divergent, different datasets and getting them into a usable form to then make decisions with. And so there’s, you know, they
come in different timeframes, they come in different intervals, they come at different levels. Sometimes you get the full set,
sometimes you get the panel, sometimes they come at a brand
level versus at a SKU-level, and so, you know, a lot of
the players that, you know, are good at, are moderately good at using their own data sets, but
once they have to incorporate any other data that’s
second or third party, it becomes very challenging. – Well, and it’s almost too convenient to say it’s universally true, but the more data, the
better the model will become, and in some ways, there isn’t too much data, there is only whether the
model is taking into account the right things. Specifically, there are a lot of devices that are throwing off data all the time and so IOT has become– – IOT? – The internet of things,
devices that record statuses and machines and positions and so forth, they can be used to, as you mentioned, solve business problems that
were really just matters of forecasting in prior
years, but there wasn’t enough information to make a better forecast. So, on the question of whether businesses have enough data, more data to make better predictions seems like a generally
acceptable, good thing. – And to build on that,
you hear more and more about machine learning, or
artificial intelligence, and there are a number of
companies that do it really well. But Mike, to your point, it’s
really the breadth of data and the amount of data that
those companies can actually process to inform and make
that learning smarter. So in some cases, companies
think about the breadth of data as almost a risk or a debt, just because it’s so cumbersome, yet, Mike, to your point,
the large amount of data can actually start making your processes and your learning better. – Well, and there’s another
half to this which is the ability for the
system to actually take the prescriptive action on its own. So you have data for insights
and data for modeling and data for analytics,
but then the next evolution is the system just is
gonna use the analytics that come out of the data and effectively execute the decision to improve the effectiveness
of the process. So, we see that in, for example, marketing optimization, right? It’s a very easy example to say, if I can look at the ROI
or ROAS or sales lift of a digital campaign
across 14 different factors, and then I can just rebalance
the purchasing automatically, I don’t need a person in the middle of it who’s too slow to respond, right? I can just reshape, you
know, the next four weeks or the next week’s purchases
based on the return of the last five weeks or
based on the cells that are statistically
driving higher value, if those are the factors
that I wanna optimize. – Management by machines. It sounds fantastic, but
I mean, what proportion of companies are actually doing this? This all sounds very, sort of, futuristic. Are companies really using
data correctly, Sanjog Misra? Or optimally? – So I’ll answer the first
part of your question which is, are companies actually doing this? The answer’s yes and no. So in certain contexts, yes. So I teach a course at Booth
called Algorithmic Marketing which focuses on essentially
what I see is the shift that’s coming in the next 3 to 5 years, where a number of areas where humans have been making decisions,
that’s gonna be replaced by some combination of
data and algorithms, and we wanna be ahead of that curve. So in certain industries,
so take for example, kind of online advertising,
display advertising, you know, the timeframe is such that
you literally have somewhere, like I said, about 100 to 200 milliseconds to make a decision, it’s
impossible for humans to be involved. There’s entire auctions being run within that small time
interval, and there’s, you know, there’s not gonna
be a human intervention that kind of allows you
to participate there. Where I see things changing are places where we haven’t actually
yet tapped into the use of algorithms and data in big ways. So, like HR analytics, for example. Human Resources is still kind
of untouched by this big data revolution; it’s only starting,
with a few firms emerging that wish to
automate this entire process. Things like performance evaluation using machine learning and
the data that’s collected within the firm to evaluate employees. New product design issues, you can think about not only using data to think about these
traditional marketing decisions like pricing and distribution, but also how do we design new products on the fly? Can we customize products on the fly? And that’s something that’s
kind of going to emerge. And then more broadly, it’s
a question of taking all of this data and thinking
about making, you know, decisions more broadly,
not just within business, but outside business, right? Can we improve the efficiency
of supplemental nutrition programs, like SNAP, or food stamps, for example? Or charitable giving, right? These are also businesses or
government, kind of, decisions that can be improved with data,
but we’re not quite there. There’s been efforts in that direction, but we’re not quite there yet. – I would say that the concept
of management by machine needs to, obviously,
needs to be approached with caution, however, because I do think that
we’re in the early days of using machine learning algorithms to make behavioral
predictions and in some cases, we can hook them up to operational systems which execute those
predictions without some type of intervention, but in a
lot of cases that I’ve seen and the folks that I’ve
worked with and talked to, the data that is
represented in the dataset, if it reflects human interactions, then it contains biases,
which the model can pick up and model into itself. And so one can accidentally
recreate what would happen in human behavior
through a computer model, and it can be rife with the biases. I’m hoping that prudent
management of this technology over time will allow us
to make better decisions than humans would, rather
than just take the totality of the human decision pool
and model what would happen if humans were to do it. – Andrew Appel, are
companies making optimal use of the data they collect? – You know, I still think
we’re in the early innings, call it the second inning, of this journey for companies to use data, and frankly, they may never get there because the expansion of the
datasets is happening faster than they can get their arms
around the data they have. But, you know, we work with a
lot of the leading retailers in the country and a lot
of leading manufacturers and they do a lot of
analytics around datasets they’ve been using for a while, and they’ll look at them, you know, and maybe add one, you know, they’ll crank out a PowerPoint
that has the insights that come with it, but
the idea that you have a, kind of an on-demand
analytics capability that, you know, crosses both, whether
it’s the store dimension or the consumer data
dimension that has, you know, a 360 view of consumer behavior
that then you’re able to, like, pull out, you know,
micro-insights on behaviors that happened yesterday, we’re a while from that. The other thing is, I think companies are
reasonably good at using their own data and are just learning how to integrate it with third-party data to kind of enrich it with the behaviors that happen outside of their core dataset, whether at Discover, as a credit card company, or at a large retailer that
has a loyalty program. – Okay, Sanjog Misra? – So I wanted to go back to
something that Mike mentioned in terms of, you know, the
idea of machine learning using all of this data and
essentially replicating the biases that human beings have. So one of the things that’s happened is, firms and researchers
have also figured out that you can create your own datasets, and experimentation
has taken off as a tool for decision making unlike, you know, anything that we’ve seen before. What I’m seeing is, like,
firms like Netflix, right, where every single decision is tested out, there’s A/B tests that are
done, data’s generated, and decisions are made
on the basis of that. We see this with Google,
experimentation’s being built into many of their tools,
like the 360 tool at Google, or on Facebook, you know,
anything you want to do, you have the ability to
test, create your own data, and I think on the research front, what people have also figured out is that the traditional approach
to machine learning and data science is not
what we want for business decision-making or for
economic decision-making. What you want is, you want
to marry causal inference and machine learning in clever ways so as to get at the truth, right? So machine learning is
extremely good at prediction, which is great, but if you think about policy interventions
or business decisions, it’s not about predicting
what’s gonna happen, it’s about telling you how,
how things might change because of a marketing intervention or some policy intervention
that I might take, and that in itself is an
interesting concept because we’ll never have data about the
counterfactual, right? I can tell you what’s gonna
happen based on what I’ve seen, but I can’t show the same person an ad and not show them an ad, and that’s really what I’d like to do. I’d like to generate data
from both those worlds and then look at the difference. So I think a lot of kind of
the cutting edge research that’s going on is at this confluence of machine learning economics
and causal inference, so as to be able to
answer these questions, hopefully with some degree of accuracy. – Are companies typically
comfortable running experiments with their own customers? Mike, are you running
experiments at Discover? – I think in terms
of marketing experiments with data that customers give
us permission to access, yes. The goal of all of these algorithms is to make a better customer experience, to make a better experience
and have more positive outcomes for customers, which hopefully will mean more positive outcomes for businesses. So, I don’t think that
it is under-aspirational to try to predict relevance
or predict, you know, what somebody might like. I do think that the
sophistication of the models that are in play right now, you know, they’re in the early innings, and it takes large sets
of behavior in order to, to get down to an individual choice. – Looking at marketing in
regards to experiments, you see the more progressive
and most progressive marketers using experimentation
to understand things, ’cause there are a number of
different tools and platforms in place to measure the
effectiveness of marketing today, and they’ve been around
for a number of years, but to the point that was earlier made, in order to be able to see a
new trend or an emerging trend, you do have to have this
culture of test and learn, and use experimentation to
test certain hypotheses, otherwise you’re in a bit
of a lean back and observe, which is okay in some cases,
but I think the rapid rate at which consumers are
experiencing your brand and experiencing your products today, I think we see the best
brands and marketers leaning forward, and
really using experiments to figure out, you know, to either test or prove or disprove their hypotheses. – It may be reasonable to just think that there are false
positives to these models, and so to the extent that you’re choosing which advertisement to
put in front of a person, if you miss it’s okay. It’s when we talk about
automatically executing against the outcomes
of some of these models that it’s worth being
measured, and understanding that there are false
positives to these models. And this is really pattern recognition, but there can be points that lie outside of the pattern that are in the dataset. So, banks and insurance
companies use large datasets to predict losses, fraud, and so forth, and those things are
gathered over patterns, but there are individual points in there that don’t fit the curve. And we wanna be careful when
we execute against those. – Andrew Appel? – Oh I was just gonna say,
you also have to, you know, recognize that each and
every individual interaction is unique in and of itself, and so the models don’t
scale down to the level of personalization that
we think about in theory. Because you just can’t
get the datasets together, I mean, the buy-through rate
for particular advertisements, by the time it goes
through, did someone see it? Did they take an action? Did they buy an item, right? And then what were the other,
you know, numerous factors that led them into buying an item? Was it on sale? Was it not on sale? Did they know they were
gonna buy it anyway, right? So the idea of like, on-demand,
real-time test and learn runs into the reality that
people aren’t spending, you know, a billion dollars to decide which of the 400 different
experiences, and even then, even if I as an individual
company, let’s say Discover, create 400 permutations of test and learn to look for the best returns, right, you’re not sure that there aren’t hundreds of other contextual
things around that person that are biasing the results of the sample because people are so overexposed and so constantly being, you know, there’s so many factors
into the decision model. So, it is actually quite hard to get to these little micro-segments because it’s hard to get
the flow of information accurate enough from what ad to target to decision to result at scale. I mean, we see this all the time, just to get purchase data on
consumers from, let’s say, a loyalty dataset, you know,
you need hundreds of millions to be able to look at,
you know, 50 permutations of an ad of a hundred million items or something like that,
let alone four thousand. And it just, you know,
the models just die out. – And I think as you look
across different industries, I think some are harder to
do that than others, right? Based on the velocity of
the data that’s available and how many data points
you are collecting, ’cause you’re right, there are so many different permutations of that consumer journey,
and what they experienced and how they ended up making the decision, and so, you know, you do see a little bit of variance among industries. – So what is nice about
all of that, though, is that now that we have so much more data about exposures and
indications and expressions of interest in clicks
and likes and so forth, we do get better attribution overall, but we have a larger set of data with which to describe a particular
consumer decision journey, and so we can model a
path out more accurately, but there might be a dozen
sub-paths inside of that, and so as we get more specific, while we’re more confident
in the overall outcome, we’re less confident in
the individual journey as you dial it down more. – Yeah, it’s effectively
like, the consumer experience is increasing in complexity at a rate and the science and the
attribution is increasing, and the question is which
one’s moving faster. – Yep. – The fragmentation of the behavior, or the science to figure out
what predicts the behavior? – So if this micro-segmentation down to the individual consumer
is not yet happening, how are companies using data
to segment their audiences? Ted Buell? – Sure, so, I think if you, so I work in digital marketing,
so I work with big retailers to help them understand
how they can use Google to grow profits for their business, I think it’s the data that they have, that they own and they use, and how they compare it
with digital marketing, for example, Google campaign
to reach the right consumers at the right time that are
in market for their product. And so I think the goal, obviously, is to get down to the
one-to-one marketing level, which is hard to scale, particularly if you have a consumer, a set of consumers who’ve
shopped your stores or purchased your product in the past, but rather starting to use, because of the consumer
behavior on the web and all of these data points, you can actually start to learn based on this set of consumers
who else looks like that. – I mean, today, for some of the most sophisticated
marketers in the world, you know, doing their
optimization twice a year in terms of how they
do resource allocation is a step forward versus once a year, and to go from, you know,
12 segments of brand for which they develop,
basically, a kind of an activation or marketing program once a year, to go to two is considered an advancement. So I think the most
sophisticated retailers are re-scoring their
files and their score, you know, weekly, and they’re creating thousands
of sub-characteristics of their tens of millions of consumers and they’re developing
personalized interactions based on these thousands
segments on a, you know, adjusted and rolled
forward on a weekly basis. That’s about, you know, in the pure digital world, I suspect a Capital
One or something can go a little bit farther because
they’re basically not living in the physical world, but
in the physical world of CPGs and retailers and other companies
that actually sell products, that would be highly sophisticated, and everyone wants to get to,
you know, kind of a daily, you know, a daily re-scored by person, by channel of demand, with an adjustment for
context or something, so that’s, you know, the
vision is to basically have an individual score
for individual products by individual demand lever that you can then kind of
update based on context, right? Am I sitting in front of a hair shelf, what am I likely to buy? If I’m in a ??? or something. – Sanjog Misra? – Yeah so, I think the way
I see the points being made is that it’s about matching
the cadence of the data with the cadence of the decisions, right? And some, and this heterogeneity
that Andrew brings up is a, is an important
one, so there’s gonna be certain industries,
retail being one of them, where the cadence of
decision-making is just not feasible to go in and change the prices
of a hundred thousand SKUs at your supermarket on a daily basis, that’s just not going to happen. Whereas if you’ve got an
online firm, you know, Amazon can go in and tweak
their prices in, you know, a fraction of a second and
that’s perfectly plausible. So I think the cadence
of the decision-making will kind of lead to
the degree to which data and these algorithms will
contribute to the bottom line. That’s not to say that in retail this isn’t important, right? There’s just a different
perspective on this, there’s other aspects of things
which complicate matters, and there it might be that, look, our decisions are going
to be made once a month, or once a quarter, but we have
just a larger volume of data and we can take the richness of data to better inform our
decision-making process even at that level of cadence. The point I was making
about these micro-segments and about, kind of this
real-time decision-making was in the context where
the decision-making cadence is really fast, where you can do things and you want to do things, right? Where that it’s possible
to do those things, and I think at some point
what’s gonna happen is that you’re gonna get to a place where, just looping back to my
idea about experimentation, we’re going to go to a world where you’re not going to be able to rely on just historical data
or just experimentation, for the reasons that were brought up. There are biases in
historical, observed data, because human beings generated it, and human beings have biases. There’s no bias in experimental data, but it’s not generally
applicable and not scalable for long-term
decision-making, we know that. So essentially what we want is we want some amount of
experimental variation to be injected into our
datasets moving forward, and I think the way I see kind
of the next phase of big data is, like, firms realizing the type of data that they would like to have in the future and making sure that, you know, we go down a path which
will allow us to get that data so that we can make better
decisions moving forward. It’s a grand vision, but
I think sooner or later we’ll get there. – Well on that big vision,
unfortunately our time is up. My thanks to our panel,
Sanjog Misra, Ted Buell, Andrew Appel, and Mike Boush. For more research,
analysis, and commentary, visit us online at, and join us again next time
for another Big Question. Goodbye. (piano music)


  1. Big data is being used to manipulate elections, marketing directly to our emotions. It is being collected, traded, sold, over and over again, generating huge revenue streams for corporations, and in the case of elections, results. Not transparently, it has invaded our most personal and private corners of our lives.

