#084: Bayesian Statistics and the Digital Analyst with Dr. Elea Feit

Do you model professionally? Would you like to? Or, are you uncertain? These are the topics of this episode: Bayesian statistician (among other official roles that are way less fun to say) Dr. Elea Feit joined the gang to discuss how we, as analysts, think about data and put it to use. Things got pretty deep, including the exploration of questions such as, “If you run a test that includes a holdout group, is that an A/B test?” This episode ran a little long, but our confidence level is quite high that you will be totally fine with that.

Miscellany Mentioned on the Show

Episode Transcript

[music]

0:00:05 Speaker 1: Welcome to The Digital Analytics Power Hour. Tim, Michael, Moe, and the occasional guest, discussing analytics issues of the day. Find them on Facebook at facebook.com/analyticshour. And their website analyticshour.io. And now, The Digital Analytics Power Hour.

[music]

0:00:28 Michael Helbling: Hi everyone. Welcome to The Digital Analytics Power Hour. This is Episode 84. On a scale of zero to one, would you consider yourself a Bayesian statistician? So much of what we do as analysts doesn’t really resolve itself into clear-cut answers to questions. And frankly, I’m unsure of all we’re gonna even get into today in the podcast. But my clear recommendation is to stay tuned because something in here is gonna be useful. Let’s start with what we know for sure. That is my two co-hosts, Moe Kiss, Analytics Manager at The Iconic. Hi Moe.

0:01:08 Moe Kiss: Hey, how’s it going?

0:01:09 MH: It’s going good. Thanks for Americanizing your greeting on my behalf.

[laughter]

0:01:14 MK: You’re welcome.

0:01:15 MH: And Tim Wilson, senior director of analytics at Search Discovery. Hello Tim.

0:01:20 Tim Wilson: How are you feeling, Michael?

0:01:22 MH: Oh, that is such a great question.

[laughter]

0:01:25 MH: I’m gonna hold off on answering that, ’cause this would go in a whole different direction. And I am Michael Helbling, I lead the analytics practice here at Search Discovery. But enough with the known, we needed a guest who could help us unify all the statistical thinking that we’re getting into these days into useful applications. Dr. Elea McDonnell Feit is a Bayesian statistician. She is an assistant professor of Marketing at Drexel University, a senior fellow at the Wharton Customer Analytics Initiative, and recently wrote the book “R for Marketing Research and Analytics,” and now I am reasonably certain she is a guest on our show. Welcome to the show, Elea.

0:02:08 Elea McDonnell Feit: I am so excited to be here. Thanks for having me.

0:02:11 MH: We’re excited. I think this has been a long time coming so we’re pretty stoked to have you on the show.

0:02:17 TW: The first choice was Thomas Bayes but we found out that he was not available. [laughter]

0:02:21 MK: He turned up dead.

0:02:22 TW: He passed. [0:02:23] ____ personally. Yeah.

[laughter]

0:02:25 MH: #TooSoon, Tim.

[laughter]

0:02:30 MH: To get us kinda started, there’s so many directions we’re gonna go, but I wanna maybe start with just understanding what you do today. You’re a professor, so you’re teaching students, you’re teaching classes. Let’s start with that, just some background information on the kinds of things you’re doing in your day-to-day.

0:02:47 EF: Yeah. So I’ve been at Drexel for four years now, and when I got to Drexel, my big goal was to kind of infuse digital into the marketing curriculum. So I teach an undergrad class called Data-Driven Digital Marketing, and I like to say it’s a spoonful of digital marketing sugar to help the analytics medicine go down.

[laughter]

0:03:06 EF: The students think they’re coming in to class to learn how to do marketing on like Snapchat or whatever the latest social media platform is that they’re really excited about and I’m like, “But we’re gonna look at the data behind it,” and that [laughter] gives me a chance to teach them a little bit about data analysis in a context that they really like.

0:03:27 MH: That’s awesome.

0:03:30 EF: So we go… There’s all kinds of stuff. They actually… As kind of a centerpiece of the class, they have to do some sort of social… Sorry, not social, some sort of digital marketing campaign on behalf of a small client in the Philadelphia area where we are. So that could be improving a website through an A/B test. That could be running an email A/B test. That could be buying ads on Facebook and trying to figure out if those ads have positive ROI. Any number of those things.

0:03:58 MH: Do they ever?

0:04:00 EF: Do they ever?

[laughter]

0:04:01 MH: Sorry, trigger. There’s no trigger warning here. [laughter]

0:04:05 EF: Sometimes, not all the time. Actually, it’s fun to watch them actually run through the numbers and get to the negative ROI and think, “Oh crap. All my marketing dreams are not true.”

[laughter]

0:04:25 TW: That might be the best thing you could be doing for the next generation of marketers and analytics people, is letting them see that right away and early.

0:04:34 EF: Well, and with money that they spent themselves, on an ad campaign that they were dreaming would be perfect. When we teach intro marketing, it’s just a field of dreams, it’s only the sort of Don Draper Mad Men pitch meeting part, and we never get into the, “I actually ran the campaign, I had a whole bunch of problems when I was getting the whole thing set up, I couldn’t figure out how to set up the targeting in Facebook.” I have students who’ve been in two or three marketing classes that discussed targeting and didn’t really get it until we’re in the back end of Facebook picking the targeting for their ad. I’m like, “These are the people who are gonna see your ad, that’s targeting.” So…

0:05:17 MH: Maybe. That’s who Facebook’s gonna claim is gonna see your ad. And if it doesn’t work, do you teach them to actually double down on what they’re spending?

0:05:22 EF: Shh, don’t tell the students.

[laughter]

0:05:29 EF: They’re beginners, right? We have to give them some training wheels. I really enjoy that group. I love the sort of math-phobic students that you often get. So I get Marketing majors, and I’m trying to give them a little bit of this analytics. And sometimes I’m really nervous about it, but when you can rebuild people’s confidence, maybe they had a bad math teacher in high school or something that really knocked them down a few pegs, and if you can build them back up they can actually be really good at that stuff.

0:06:00 EF: So that’s my undergrad class, the other class is a full-semester class in marketing experiments. So A/B testing, multivariate testing. It mostly attracts students who’re intending to be data scientists, who maybe don’t have a lot of marketing exposure but know a lot of machine learning methods. And I like to say I beat them with a causality stick for 10 weeks.

[laughter]

0:06:23 EF: To help them really understand the value of randomization in figuring out how business decisions are gonna affect outcomes. So that’s the other class, covers all of that stuff from both a Bayesian and a classical perspective. So one of the things I was really excited to talk to you guys about, since I was coming here today, is, what should be in my courses for students? What do current digital analytics professionals think that future digital analytics professionals should be learning in college?

0:06:54 MK: I actually was about to ask you a question about whether we should all be going back to university to do your course. Because as you’re talking through some of this stuff, I can’t help but think, I mean, so many of us in the industry never had an education like that. We never learned the stats behind an A/B test. We just started doing it and trying to figure it out along the way, which is super dangerous. So, yeah, from my perspective, all I hear when I’m listening to you talk about this course is like, “How do I sign up?”

0:07:25 EF: Yeah. I’m sure the admissions office at Drexel would be happy to talk to you. We’ve a large number of international students. [laughter] who pay a lot of tuition to come to Drexel.

0:07:38 MH: I almost feel like academia is doing a better job of trying to ask the question of what’s needed. And they’re turning to, in many ways, a group of people who’re ill-equipped to answer, kinda to Moe’s point that there’s a level of… I don’t know whether it’s sticking our heads in the sand or whether it’s because we haven’t really figured it out. Like if you take a… You’re teaching an accounting course and you’re asking people who’ve been in the accounting profession for 5 years or 10 years or 20 years what do they really need to know. And there’re… Things will change and they will evolve, but you’ve got the people who’ve really figured it out who are tenured at companies and this is how we’re doing accounting. I feel like there’s still a ton of companies that are still plodding along without that appropriate background, in some ways education by vendor which is increasingly terrifying, I’m realizing.

0:08:33 MH: I mean, ask one of our favorite guests his opinion of one of the more predominant A/B testing platforms and you’ll get a hell of an earful on that. And so it’s kind of this ugly cycle where we’ve sort of been taught bad habits or shortcuts that are really, really scary. Like I still… And I’ve had it explained to me multiple times, I’ve gone through courses, I’ve read The Signal and the Noise, is that the name of Nate Silver’s book? And I’m like, I don’t intuitively get Bayesian versus frequentism. I’ve read a definition, like are you, undergrad or graduate, able to really articulate what Bayesian statistics is and why that mindset or why that approach or way of thinking actually brings value from an embracing of uncertainty or however it brings value? That was, I don’t know if that’s 17 questions plus a diatribe.

0:09:32 EF: So am I able to articulate to undergraduate or graduate students, why I’m Bayesian. That’s like asking someone why they’re Catholic, right? It’s a deeply-held belief about how we should think about uncertainty. [chuckle]

0:09:50 MH: Well, with Catholicism, right, you could define what that doctrine is, so I guess it’s more like what… I sort of feel like if you can articulate what a Bayesian mindset is or what… Then it should become evident.

0:10:04 EF: Yeah. I can try. I don’t know that I’m… So one of the things you should be aware of…

0:10:08 MH: We’ve got six hours.

0:10:10 EF: I know. I know.

0:10:11 MH: So you wanted to pull out the white board, we’re good to go.

0:10:13 EF: So this is always a tough question, but the main thing that Bayesians do is that… To some extent, all properly trained statisticians are obsessed with uncertainty. We want to not just know that we have uncertainty but we want to measure as precisely as we can how much uncertainty we actually have. And the Bayesian approach has proven to be really useful in settings like… Well, one of the first settings was in figuring out where to aim a cannon or other ballistic thing to get the maximum effect against your enemy, when you don’t know a lot of things like the wind and other things that might affect exactly where this ballistic thing is going to land. And so the Bayesian approach looks at that as a decision problem. So Bayesians are focused on how do we make decisions under uncertainty like this. Whereas I don’t think classical statisticians are as focused on decisions, and they also have a different approach. So every quantity that exists out in the world, I think of as having a probability distribution. Like, everything has some uncertainty, and I’m just obsessed with knowing how much uncertainty that is. So I would never say something like, “Next week on Tuesday we’ll have 50,239 visitors to our website.” I would always give you a range for that ’cause I’m… Am I making any sense?

0:11:47 MH: Well, yeah. But you would say, “Next week, I would expect 95% of the time… ” Well even trying to explain it, trying to articulate a confidence level, you would…

0:11:58 EF: Yeah, I might say exactly that. I might say… So, and that’s the one thing. So, classical statisticians, because of the way they think about uncertainty, they characterize uncertainty as, “What would happen if I replicated this process over and over.” And so they say these really backward-sounding sentences. But Bayesians are just straightforward about it, they say “I believe that, with 95% probability, that website traffic on Tuesday will be between 30,000 and 50,000 users.” And, on top of that, you should make a decision. Like, say, if you’re trying to size, you should size for 50,000 people. Or maybe you should size for 60,000 people because if this website goes down and crashes, that would be really, really bad.
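
For readers who want to see the shape of that kind of statement in code, here is a minimal R sketch. The Tuesday visit counts and the normal approximation are made-up stand-ins for a real posterior predictive distribution, not anything computed on the show.

```r
# Made-up visit counts from past Tuesdays (illustrative numbers only)
tuesday_visits <- c(41200, 38900, 45100, 36500, 47800, 43300, 39700, 44600)

# A rough "95% probability" interval for next Tuesday, using a normal
# approximation as a simple stand-in for a full Bayesian posterior predictive
m <- mean(tuesday_visits)
s <- sd(tuesday_visits)
round(qnorm(c(0.025, 0.975), mean = m, sd = s))
# lower and upper bounds of the interval for these made-up numbers
```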

0:12:42 MK: I actually, I was about to ask, ’cause as you’re talking through your obsession with uncertainty, which I find super fascinating, how that plays out in a business context, because often decision-makers, I don’t know, I guess in a footnote somewhere they wanna hear about uncertainty but when it comes to your recommendation they just wanna, “Hey, you should size your website for 60,000 visitors,” or whatever the decision is. And so, I really like how you just framed that of, “I’m still gonna give you the range but then I’m gonna tell you, this is the action that you should take based on the upper limits of that range.”

0:13:20 EF: Right, that’s…

0:13:21 MK: I mean, is that how you would position it to the business?

0:13:22 EF: Yes, that’s exactly how I try to position things when I’m talking to decision-makers. So, just because I have uncertainty doesn’t mean I’m not sure about what to do.

[laughter]

0:13:33 EF: Does that make sense? Just because I’m not sure about a particular quantity in the world, I’m not sure how many people are gonna show up on Tuesday next week, that doesn’t mean I can’t decide what to do, especially if I can quantify that uncertainty and kind of give it a range. Then I actually am more powerful because I can say, “This is likely to take this much web capacity.” Another great example I heard from a guy who was an analytics professional at Best Buy, and he worked with Geek Squad and Geek Squad was always reporting the mean to their customers. A customer walks in, says, “I wanna get my computer fixed.” The part’s on backorder, so they would quote the mean time that they expect to take to repair the computer. And, of course, that means that more than… Like half of the customers, ’cause half of the time it’s over the mean…

0:14:27 MH: Now, wait a minute, are we confusing mean and median? ‘Cause that’s actually something I do understand. I’m not certain…

[laughter]

0:14:34 EF: Okay, so yeah, I was being sloppy about mean vs median there. You caught me. For a lot of distributions, they’re close to each other. But you get the point. There were plenty of people who were not getting the average amount of time. And so, they just changed to reporting the 95th percentile time. They rank the times in the past by how long it’s taken and try to hit the 95th percentile on that, and that’s what they quote the customer. Which means it’s probably gonna be less than that for most people, which is what people want. They wanna hear that their computer came back faster than what is expected.

0:15:09 MH: Back to your traffic forecasting, you’re actually… If you are, as opposed to saying, “You should be planning for 60,000,” if you’re thinking in uncertainty and you’re saying, “I think it’s gonna be this range,” then the discussion becomes, “How comfortable are you taking a risk that we crash the site?” ‘Cause it may be pretty damn cheap to be, “80% of the time, you’re gonna be fine,” and it may get really expensive to say, “No, I wanna be 99% of the time,” and therefore being able to think about that spread.

0:15:40 EF: Yeah, and Bayesians actually have a name for that, they call it “the loss.” So the loss… If I make a decision and it turns out the truth is something different than what I was expecting or what I just planned for, what’s the loss? So, in that case, it’s a case of what we call “asymmetric loss,” so if I oversize, I waste a little bit of money because I had more web server capacity available than I really needed. But, on the other hand, if I crash the website I lose a lot more, and that is what we call “asymmetric loss.” So, yeah, we’re really thinking about not only our uncertainty about quantities, but also what should we do in the face of that uncertainty, which of course is a function of how much we’ll lose if we make the wrong decision.
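
A minimal sketch of that decision logic in R, with made-up costs and a made-up traffic forecast, just to show how asymmetric loss pushes the capacity choice well above the expected traffic:

```r
set.seed(42)
# Assumed forecast distribution for Tuesday traffic (illustrative numbers)
traffic_draws <- rnorm(10000, mean = 42000, sd = 4000)

cost_per_unused_slot <- 0.01    # small cost for capacity we never used
cost_of_crash        <- 50000   # large cost if demand exceeds capacity

expected_loss <- function(capacity) {
  over  <- pmax(capacity - traffic_draws, 0) * cost_per_unused_slot
  crash <- (traffic_draws > capacity) * cost_of_crash
  mean(over + crash)
}

capacities <- seq(40000, 70000, by = 1000)
losses <- sapply(capacities, expected_loss)
capacities[which.min(losses)]   # the capacity with the lowest expected loss
```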

0:16:31 TW: There’s a point where I feel like we wind up in forecasting land which as an analyst, I’ve never been asked to predict traffic or revenue. That’s probably me pointing at people, especially in retail, especially around the holidays or like, “Look, we need to predict demand.” I found myself sometimes using forecasting as a way to look for anomalies in saying, “How did my actual results compare to a forecast if those didn’t exist?” What are the other, I’m trying to think through what are the ways that that thinking of uncertainty, when we’re doing historical data analysis, and it may be that I’m completely framing it wrong, looking at past behaviour on the site. I wanna find the most common paths. I wanna find fallout points. I wanna optimize my site in the absence of an A/B test. I’m trying to wrap my head around where the uncertainty mindset… I’m really liking thinking of Bayesianism as embracing uncertainty, ’cause I’m getting way, way better about understanding uncertainty, so that’s actually closing that gap. I don’t know, is that a fair question?

0:17:40 EF: Yeah. The way I think of it is that the numbers from the past are what they are, and they have no uncertainty. We know… Well, there might be a little bit of measurement uncertainty like, maybe the tracking system went down or something happened. But, for the most part, when we’re looking in, say, Google Analytics and we’re saying, “How many people came to the site?” and, “Which sources of traffic converted at the highest rate?” I don’t feel like we have to say a whole lot about the uncertainty on those. And so in this, what I call the historical slicing and dicing framework that most web analytics professionals spend most of their time doing, the uncertainty doesn’t really matter. If you’re just saying, “This is how many people came last Tuesday,” it’s just a fact and it doesn’t have uncertainty.

0:18:32 TW: ‘Cause I used to think that’s the challenge when statisticians work with samples and we’re saying, “We have all the historical data. We have the population,” but the little light bulb that’s gone on for me is that if I’m looking at comparisons between… And the simple one is between my traffic sources. I wanna look at that and say, “Yes, I can say that paid search drove more traffic than organic search,” but there’s value in trying to look at that and saying, “What is the variability in how much traffic came from paid search versus organic search?” Are they really different? Or yes, the absolute population is different but are those channels really different?

0:19:08 EF: What I was about to say is that, when we’re looking at historical data, we’re implicitly asking questions about the future. When I look at my conversion rate by different source of traffic, I’m implicitly looking for a difference that I believe will be true in the future so that I can act on it. I find out that paid search is delivering high conversion rate on particular keywords, “Let’s buy more of those keywords,” right? There’s an implicit prediction about the future in there and if we adopt more statistical techniques we can be more explicit about that prediction and more clear about what our uncertainty is about that future prediction. And what we’re doing is we’re saying, “Well, I expect the future to be like the past,” and actually then quantifying uncertainty with data often means looking at, “How much data do I actually have here? And is it enough to say that there’s a real difference? Or is there a possibility that I would have seen this big of a difference between two sources just by random chance even if they were truly the same?”

0:20:13 EF: That’s the core thing that we’re… That’s why we do confidence intervals, is to figure out, “Do we have enough data to come to a conclusion? If these things have a true long-run mean, is it different or not?”
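
As a concrete illustration of “do we have enough data to call this a real difference,” here is a small R sketch with invented conversion counts for two channels; prop.test is one reasonable tool for this kind of comparison, but nothing in the example comes from the episode.

```r
# Invented numbers: conversions and sessions for two traffic sources
conversions <- c(paid = 310, organic = 270)
sessions    <- c(paid = 9800, organic = 9500)

# Two-sample test of proportions; the output includes a 95% confidence
# interval for the difference in conversion rates
prop.test(x = conversions, n = sessions)
# If that interval includes zero, a gap this size could plausibly be chance
```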

0:20:26 MK: Can I just ask? And this is my own personal mullings of the last recent little while which I haven’t figured out remotely.

0:20:36 TW: You mean like for the last five minutes or the last like three months?

0:20:39 MK: No, like last three months. God, Tim. I’m a deep, thoughtful person.

[chuckle]

0:20:43 TW: I have got some uncertainty about what you mean by that phrase. It raises [0:20:47] ____ confusion.

0:20:47 MK: This is getting way too deep for me.

[chuckle]

0:20:50 MH: Don’t worry. I have a question. It’ll bring us right back to the surface after now.

0:20:54 EF: Yeah, I wanna come up. I wanna come up for air, please. [chuckle]

0:20:58 MK: Okay, so bringing you back. You had a statement that I’m interested in hearing your perspective on a little bit further ’cause like I said, I haven’t figured out my thoughts on this yet, which is that past behavior and past data is acceptable to help us predict what’s going to happen in the future. And I’m just gonna do a dot, dot, dot and leave it open for you to comment on.

0:21:21 EF: I think it’s important… Really good statisticians are very explicit about what they expect is going to be stable in the future. That’s part of, when I make a statement as a Bayesian statistician that says, “I believe that the traffic on next Tuesday is going to be between 30,000 and 50,000 users with 95% probability.” Embedded in that analysis are some clear assumptions about whether next Tuesday is gonna be like last Tuesday, usually in some very specific way. Bayesians usually build a model for that, which is just our way of saying that we have an equation that describes what we think Tuesday is going to be like as a function of what day of the week it was, and any special holidays. That model has my assumptions baked into it. I actually like the process of building that model because it makes me lay bare and explicit how I think the future is related to the past. But yeah, everything I say when I make that probability statement kind of has my assumptions baked into it.
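
One way to picture “writing the model out as an equation”: a minimal R sketch of a traffic model with day-of-week and holiday terms. The daily_traffic data frame is entirely made up, and a plain lm is standing in for whatever Bayesian model you would actually fit; it is here only to show the assumptions being made explicit.

```r
# Hypothetical daily traffic data (made-up numbers, just to make this runnable)
set.seed(7)
daily_traffic <- data.frame(
  visits      = round(rnorm(140, mean = 40000, sd = 5000)),
  day_of_week = rep(c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"), 20),
  holiday     = sample(c(TRUE, FALSE), 140, replace = TRUE, prob = c(0.05, 0.95))
)

# The "equation": traffic as a function of day of week and holidays
fit <- lm(visits ~ day_of_week + holiday, data = daily_traffic)

# Forecast next Tuesday with an explicit 95% interval; the assumption that
# next Tuesday is like past Tuesdays is laid bare in the model formula
next_tuesday <- data.frame(day_of_week = "Tue", holiday = FALSE)
predict(fit, newdata = next_tuesday, interval = "prediction", level = 0.95)
```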

0:22:29 EF: And you could examine my assumptions and say, “I don’t agree. I don’t think all Tuesdays are alike,” or whatever it is about my equation by writing it out as an equation. Then you can see it and say, “I don’t agree,” or, “I do agree,” or, “That seems pretty reasonable.” That’s part of the modeling process. I really like to say that I model professionally.

[laughter]

0:22:58 MH: Nice. So good.

0:23:00 EF: It’s a great line in a bar.

[laughter]

0:23:03 EF: And our listeners can’t see us but I’m like a middle-aged kinda ordinary-looking woman, so I definitely don’t model clothing professionally. I model data professionally.

[chuckle]

0:23:17 S1: That’s awesome.

0:23:18 EF: So I don’t know. Yeah, I think it’s better to be explicit than implicit about that. Why would we look at the data on what happened to our website in the past if we didn’t think it had something to do with the future? We wouldn’t even need to look at it, because we’re making decisions about now and in the future. We’re not making decisions about the past so we don’t really care about the past at all, except that it can drive our future perspective. And maybe one way we could kind of come back up to the surface on this is to talk a little bit about how digital analysts have been trained to look at the past. That’s the main thing that the tools teach you how to do, they don’t teach you to think about the future very much. Look at the user interface for Google Analytics. What’s the first thing you see? You see a time series plot of the traffic to the website over the last, what is it, 30 days or something?

0:24:13 MH: I’m sure, yeah.

0:24:14 EF: There’s not even like a little dotted line, making you think about the future.

[chuckle]

0:24:18 EF: It’s just… Here it is. This is what happened, last month. And so I think we might be training people to think about just cataloging and reporting on the history. I like to call that a data retriever. Analysts who, all they can do, is find out answers about what happened in the past.

0:24:39 TW: What breed of analysts do you have? Oh, we have retrievers.

[laughter]

0:24:42 TW: That’s one of the most popular types of analysts.

0:24:46 MH: One thing that’s inherent in being able to have a more forward-looking view is bringing together all of the assumptions and things that will impact your model or your prediction, right? One of the big challenges I’ve found for analysts is just building their library of possible things that could make a difference. How do you teach students or talk to people about how to develop sort of that… I don’t know if it’s a set of experiences, skill set, some of it’s in the data. But some of it’s not. Some of it’s experience. How would you approach that?

0:25:24 EF: Yeah, I actually think the Digital Analytics Community might be better at building that library than say the Data Science Community. Let me just complain about my data science students, my future data scientists. They’re really well trained to take any dataset and build a predictive model off of it. They get some dataset on whether or not people converted with all their behaviors at the website. And then they know all these fancy machine learning methods that basically sort through all those variables to try to find the ones that are most closely related to the conversion. But I think, that’s where domain knowledge and just being a good marketer can come in. If you can just sit back before you look at the data and start thinking about like, “Okay, what do I think is affecting conversion?” Maybe there’s certain pages that I think that, if you looked at, you’d be more likely to convert.

0:26:20 EF: Because those pages describe product benefits that I think would convince people. Or maybe, if you came in through a certain channel, like a blog post, you were already so deeply engaged in the product and the blog post said something good about the product, you’re more likely to convert. In a lot of ways the data scientists are just like, “It’s all data. We’ll just let the computer sort it out.” But I think that digital analysts have a lot to bring to the table in terms of intuition about just good marketing know-how, about what is likely to affect outcomes. And just having looked at a lot of data, what is likely to affect outcomes.

0:26:55 TW: I think that’s another way that the tools have really taken a big chunk out of critical thinking from analysts, is actually not treating everything as equal as opposed to putting the hat on of… My bias is towards finding things that I can impact, that actually matter. There’s a bunch of stuff I can’t impact. I can’t impact what my competitors are doing. When we put direct traffic next to paid search, next to organic search, and again that’s the easy one to fall back on. Sure, just run it through a model and the solution is to drive more direct traffic on desktop. Well, good luck.

0:27:31 EF: Or send more people through blog posts. How are you possibly going to do that? If it’s send more people through search, through paid search, I can maybe affect that a little bit. The other… Let’s loop back to input. But if we go back, the first thing is, are these things likely to be associated with conversions? Let’s take conversion. Have you ever opened the GA interface for the first time and shown it to a completely green person?

0:28:00 MH: I’m doing that today, so give me some tips.

[laughter]

0:28:02 EF: It’s not gonna be fun.

[laughter]

0:28:05 MK: It’s gonna be overwhelming.

0:28:05 EF: Have you looked at that menu on the left-hand side? I don’t mean to pick on GA. I use GA because they actually have a very nice… They have a website that they run where they make the GA data available for students. So I use GA, and don’t be just…

0:28:22 MH: Not the API though?

0:28:23 EF: Not the API, just…

0:28:26 MH: But if this podcast has any sway, we now can say that we’ve put in the plug for them to open up the reporting API for the store.

0:28:32 EF: Oh my God, that would be awesome! Oh my God!

0:28:35 MH: Look at that, Google, you’re punishing academia. You’re punishing the analysts of tomorrow because you are not giving API access to the store data.

0:28:44 EF: Yeah. My data science students would love that. My marketing students, the undergrads would probably not like that. They’re gonna be still looking at GA, but getting back to where I was going with this, that left-hand side menu is overwhelming, right?

0:29:00 MH: Yeah.

0:29:02 EF: I’m looking at… Say I’m looking at conversion.

[laughter]

0:29:04 TW: Have you ever opened up Adobe Analytics?

0:29:06 MK: Yeah. I was gonna say.

[laughter]

0:29:08 MK: You ain’t seen nothing yet.

[laughter]

0:29:12 EF: And like you said, nothing is elevated to be more important than anything else. It’s just… We can slice your conversions by these 20 different ways. And a beginner student is like, “Well which way should… There’s 20 different ways. Am I supposed to look at all 20?” Don’t you think we could… I think you, Michael, you could sort those items. What’s your number one thing that is related to conversion? Not whether we can have impact, that’s another issue. What’s the number one thing that’s related to conversion in this kinda data?

0:29:45 S?: What in GA? Oh geez.

0:29:47 EF: I mean what kind of behavior that’s available in GA is most closely relate… If you were looking for something that was related to conversion, where would you look first?

0:29:55 S?: Or maybe you’re fake.

0:29:55 EF: You are being graded. You are being graded.

0:29:57 S?: Yeah, I know. I’m feeling intense amount of pressure right now.

[laughter]

0:30:03 S?: I mean, obviously, the stupidest answer is, do they add something to the cart? Or start the checkout process?

0:30:09 EF: Okay.

0:30:10 S?: That would…

0:30:10 EF: Fair enough.

0:30:11 MH: Or have they purchased previously?

0:30:14 S?: Well, but is that readily available in Google Analytics data?

0:30:16 EF: Might be.

0:30:17 S?: Not really.

0:30:18 EF: Maybe.

0:30:19 S?: Not in a useful way. Not without a lot of work. But like…

0:30:23 MH: I don’t like tests. Maybe I shouldn’t go back to school [chuckle]

0:30:26 EF: But I guess my point is, that you have some ideas…

0:30:31 S?: Yeah. In my life, as an analyst…

0:30:32 EF: For beginners…

0:30:33 S?: I have a list of things, yep.

0:30:35 EF: Right, which is the beginning of building a predictive model. You have an intuition.

0:30:39 S?: Yeah.

0:30:39 EF: You kinda know what things are important. The next big thing that Tim brought up before is, whether or not we can actually influence those things. If you wanna predict something, there’s actually two kinds of predictive models. One predictive model is called an umbrella model. That’s like, say you wanna predict the rain. The reason you wanna predict the rain is you wanna know whether or not to take an umbrella out. You don’t really care why it’s raining, you just wanna know if it’s gonna rain. And in that case, knowing that carting or past purchases are related to future purchases can help you make a really good prediction about revenue next month.

0:31:19 EF: Which could be useful, right? Boss probably wants to know what revenue next month is gonna look like. The other type of model that Tim is probably more interested in is what I kind of call a rain-dance model. A rain-dance model is, I wanna know, if I dance, will it rain? If I put these extra video features on my website, will I get higher conversions? If I send out an email blast, will I get higher conversions? Those are rain-dance problems. Those are ideas I have for how I’m gonna make it rain. Those kinds of models take more work. Someone with 18 months of data science training can’t just build those kinds of models; you have to be more thoughtful.

0:31:56 TW: Is there a point, if you don’t really… Actually, just thinking through it that way and saying, if I’m asking this question and I can give the answer… If I can build this model, before you even try to build the model, figure out if you’re building an umbrella or a rain-dance model. I feel like that’s a little bit of a challenge, saying, “Hey, we have perfectly explained everything that happens and there’s nothing we can do with it,” we might have invested an enormous amount in it, as opposed to the flipside of saying, “I can impact where my paid advertising is sending people to land on the site. If I can build a model that shows there’s some predictive capability of where I send them and where they land and what the impact is,” then I’ve built a rain-dance model or am I…

0:32:45 EF: Yeah, that is a rain-dance model. Figuring out where the right landing page would be, would be a rain-dance kinda model.

0:32:52 MH: I think that’s your new nickname, Tim. You’re Tim “rain-dance” Wilson.

[laughter]

0:33:00 MK: This show is actually trucking along in a beautiful direction, which doesn’t happen to us all the time. I’m just gonna veer us straight off…

[laughter]

0:33:08 MK: I’m gonna veer us straight off the road and go in a totally different direction. [chuckle] As you’ve been talking through some of these concepts, one thing that’s kinda sticking in my mind, and it’s probably ’cause I have a personal stake in this topic, is what are you teaching your students or how are you teaching your students about their biases when they approach problems? Because I’m thinking, even in that rain-dance model you just suggested, if you go in there being like, “If I dance harder, will it rain more?” You could be starting off from that assumption that dancing leads to rain. And so, I’m just curious to hear, is there any… ‘Cause you did mention earlier wanting feedback about what we should be teaching students. Is there anything about teaching students about biases, and particularly, something like confirmation bias in analysis?

0:34:01 EF: I don’t do this with the undergrads, but with the grad students, by the end of class, I say, “Say it with me: Randomization will set you free.” [chuckle] So what do I mean by that? They know that; they leave the class and they know that randomization is really important. Some of them don’t actually know what randomization is, but they know it’s really important. And what I mean by randomization is if you have a true rain-dance problem; you wanna know, “If I dance, will it make it rain?” Then what you really should do is take half of the places in the world and do the rain-dance, and take another half of the places in the world and don’t do the rain-dance, or maybe it’s days, you can do days, you could do places, you could do people; do a rain-dance for some people, but not for other people, and compare those two groups, that is the sort of gold standard for proving that the rain-dance works. And that’s what we call an A/B test, right?
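
A tiny R sketch of what “let the computer decide” looks like in practice; the user table and group labels here are hypothetical, not anything from the episode.

```r
set.seed(1)
# Hypothetical pool of users to randomize
users <- data.frame(id = 1:20000)

# Random assignment: the computer, not a decision-maker, picks who gets the
# "rain dance" treatment, so the assignment can't be tangled up with anything
# else about the users
users$group <- sample(c("treatment", "control"), size = nrow(users), replace = TRUE)

table(users$group)  # roughly half and half
# Later: compare outcomes between the two randomly assigned groups,
# e.g. with prop.test on conversion counts
```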

0:35:00 MH: Well, or even longitudinally. I, years ago, had a case where I was asked, “What’s the best time for this brand to post on Facebook?” And so I said, “Oh, I can do that historical analysis.” And they posted at 10:00 AM, every Monday, Wednesday, Friday. That was my data. And I’m like, “Well, you have not given me… You’re gonna have to mix it up. I’m gonna have to do some sort of longitudinal randomization.” I think those words can go together.

0:35:26 EF: Yes. That goes [0:35:28] ____… I like that.

0:35:28 MH: In order to figure out. And, “Oh, by the way, I’m changing the content, I’m changing the calls to action, I’m changing the type of posts.” So, “Oh, it turns out, I can’t answer… ” When you’ve been doing one thing… You’re dancing on Tuesdays in Sour Lake, Texas and seeing if it rains, that’s not gonna tell you whether dancing causes rain, right?

0:35:52 EF: Yes. That is absolutely right. So, yes, if you look at historical data, there’s two problems that can happen. One is the one you mentioned where, “I always do the same damn thing, so how do I know what would happen if I did something different?” [chuckle] I work with a company that sends catalogs every month on the first Monday or Tuesday, except June; and they’ve done this for years. So I can’t tell you what would happen if you doubled up on catalogs in December because they’ve never done it. [chuckle] So that’s one problem, and statisticians would call that lack of variation in the data. So there’s just not enough variation in the data for me to figure out what’s going on. The other problem is, and it’s more evil. The evil decision-makers of the past could have always been doing the dance when they knew it was going to rain.

0:36:39 MH: Oh!

[laughter]

0:36:44 EF: You’ve never looked at data like that. Do you wanna know the technical word for that?

0:36:47 MH: Ooh, yes!

0:36:48 EF: It’s called endogeneity; I hate that word. But the idea is if I’m looking at historical data, where decision-makers in the past have been acting in a smart way… And the classic is advertising, right? If I’m trying to figure out if my ads work or have positive ROI, and I look at historical data, there’s probably a heavy-up of advertising around the holidays. Almost everybody does it. So it’s gonna look like advertising makes sales because every time we advertise, the sales go up, but it’s really the going up of the sales that is causing the advertising, and not the other way around. So that’s a really important concept, and randomization breaks it. Randomization breaks that problem. I say, “No, a decision-maker didn’t decide to do the dance. I actually decided. I had the computer randomly decide who was gonna see it.” And that’s the core thing that gets me excited about A/B testing, is I can know for sure from the A/B test that A is better than B.

0:37:49 MH: Is it fair to say the design of experiments is kind of a superset of A/B testing? That you can design experiments that give you what you need that aren’t necessarily an A/B test?

0:38:01 EF: Oh, yeah. For sure. In fact, I’m not even sure what A/B test means. What does A/B test mean? This is actually important ’cause I’m writing a paper about A/B testing. [chuckle]

0:38:10 MH: Well, so the A is usually your control… No. [laughter]

0:38:17 EF: No, I’m curious. What does an A/B test mean to you? Moe… Oh, I’m cold calling again.

0:38:22 MH: That’s okay. As long as it’s not me.

0:38:23 EF: Do you know about cold calling?

0:38:24 MK: Oh, man. [laughter]

0:38:27 TW: Yeah, no.

0:38:27 MH: Good, put him on the hot seat. No, put her on the hot seat.

0:38:30 MK: Fine. Fine. Fine.

0:38:31 MH: [0:38:31] ____ answer her question.

0:38:34 MK: Yeah. If I totally faff this up, I want no judgment. Please be kind listeners. So… Fine.

0:38:41 MH: Please just answer the question. [laughter]

0:38:44 MK: So to me, yeah, you have group A, which is your control, and they’re exposed to one variant of whatever it is that you’re exposing them to. And then you have group B, who are your variant, who see something different or are exposed to a different feature, page, whatever it is that you’re testing. So that’s in really high-level layman’s terms, so that I don’t get judged if I faff it up.

0:39:06 EF: So if I do a test where the control is I don’t send an email, and the treatment is I do send an email, is that an A/B test?

0:39:13 MH: Yeah…

0:39:14 MK: I don’t think that’s a fair comparison.

0:39:17 MH: As long as it’s random, and as long as you have a metric you can effectively test or measure between them.

0:39:25 EF: Yeah. So I was just curious what you guys thought because sometimes when I say A/B test, people do think like Moe did, and say A, it has to be something versus something else, and some people specifically call what I described, a holdout test.

0:39:40 MH: Oh, yeah.

0:39:41 TW: I was thinking holdout.

0:39:42 MK: Now, I have a term to explain that. Nice.

0:39:45 EF: Yeah. Well, I don’t know. I think we should define this, and we’re The Digital Analytics Power Hour… Well, I hope I can include myself in The Digital Analytics Power Hour family, now, but can The Digital Analytics Power Hour say, “We are gonna universally call A/B tests, things that compare two active marketing treatments? And randomized holdouts are things that compare an active marketing treatment to something that’s not, to nothing?” I don’t know.

0:40:11 MH: Yeah. And it’s not the function of The Digital Analytics Power Hour to create industry standards or… [laughter] We would leave that in the hands of the Digital Analytics Association, or other…

0:40:26 EF: So the kind of beginner version of this issue, and it’s just a personal problem I have when I’m writing up this paper is… So I’m working on a paper, where we actually look at A/B testing from a decision-theoretic point-of-view rather than a statistical point-of-view. And I have to make a decision, and if I’m doing a test between two versions of a website, let’s say, two versions of a homepage, and I have to make a decision, and I have some data, and my data is not clear… So this is one of the questions I wanted to ask you. So say, I do an A/B test, and I do A versus B, and it turns out that A and B perform about the same and it’s a homepage, what should I do?

0:41:11 MH: Yeah, Tim. What should she do? [laughter]

0:41:14 TW: First off, I’m glad we’re back on the test ’cause I feel like the direct marketing and the email marketing winds up being… In a sense, there’s an easier component to that. And that’s where holdouts come in ’cause you don’t have a holdout scenario when you’re doing an A/B test on a site. And the vendors have conditioned us to talk about web experiences for A/B, and I think that’s why that’s a piece, there. I think that it would be helpful for us to accept that many tests, ’cause in practice, many tests don’t turn out a winner. Winner is a binary term. I think that’s problematic. That’s where all this uncertainty kind of comes in as well. People love to talk about how you see a big winner and then the test effect deteriorates over time or whatever that is.

0:42:05 EF: Oh, I don’t know about that. Explain that to me.

0:42:07 TW: Well, it’s basically, you do an A/B test, and I can point to at least one article, ’cause they’re all over the place, that says, “Oh, this had a 5% lift, and it was statistically significant.” And then you run the same test again… I’m gonna butcher what this is… It’s kind of like doing an A/A test that shows significance, when it’s an A/A test. That being another…

0:42:29 EF: Well, that happens. You know significance happens.

0:42:32 TW: Yes. Yeah, yeah.

0:42:34 EF: 5% of the time, at random, [chuckle] when it truly wasn’t significant.

0:42:38 TW: But the other is that when… So an A/B test, where you say, “Oh, this one shows a certain lift. We were getting this result, this changed. So has this other result. We shift to using our B variation, and then over time, it winds up back exactly where it started.” And trying to explain why that happened, and people seem to write about that fairly often, but not the people who are selling that as a service or a solution. I mean ’cause if you take the data scientist and you throw a bandit into it and say, “You know what, we’re not gonna turn this stuff off. We’re gonna let it keep running because even if it’s down to 5%, we kinda wanna be able to re-normalize to whatever seems to be working better.” Keep the fuel running, as opposed to saying, “I’m gonna run this test. I’m gonna give it absolute truth, which there is no truth. I’m gonna switch to that truth and then move forward.” That’s what we have freaking taught all marketers and analysts to do. And that was six tirades, and maybe, I just dodged the question. I get partial credit?

0:43:47 MH: Yeah. I was like, “I’m not sure you answered the question there, Tim.”

0:43:50 TW: Did I get partial credit, though, for just…

0:43:52 EF: I don’t even remember the question, anymore.

0:43:55 TW: Yeah. Look at that.

0:43:55 MH: All I know is, Tim was like, “This is how I got through college.”

0:44:00 TW: There’s no such thing as truth and it’s sort of like, “Well, then why answer the question at all?”

0:44:03 MH: No, there is no truth, right? I mean, that’s…

0:44:06 S?: Tell that to the Catholics, Tim.

0:44:07 MK: Tim for politician.

0:44:09 MH: The fact was, you almost… I’ve gotten to where I rattle off Matt [0:44:12] ____ that we’re operating under conditions of uncertainty, there’s a cost to reducing uncertainty, and we can’t eliminate uncertainty, and Elea, you almost quoted… You said that first part, and Matt wouldn’t take credit for being the one who coined that, it’s just it finally sunk in with me that…

0:44:27 EF: It’s just that we’re both Bayesians. [chuckle] So we see the same thing. [chuckle] We learned the same catechisms from the Reverend Thomas Bayes, right?

0:44:34 S?: That’s right.

[laughter]

0:44:37 TW: Yeah. I have no idea what the question was.

0:44:38 MH: I think she asked you what you would do if you didn’t find statistical significance in an A/B test on the homepage, if there was…

0:44:46 MK: Oh, yeah! No, if they were the same.

0:44:49 EF: Yeah. If it’s not statistically different.

0:44:50 TW: I would dig in a little bit, but then I’d be okay… If I was testing a fucking button color, and there was no difference, I’d probably go back and say, “Yeah. You know what, deep orange versus deep blue, it doesn’t fucking matter.” So that’d be applying some level of, I’m doing this test because it’s not a significant test in marketing strategic terms, but I was conditioned that I could change the button color or the button text. If it was, “Wow, these are fundamentally different. Wait, maybe if I dig in a little bit more, peel the onion back, and this is where we were having the Simpson’s paradox discussion before we started… Maybe it’s because I’ve lumped too much stuff together.” So there’s some level of digging in, there’s some level of applying some subject matter expertise and moving on.

0:45:38 EF: Yeah and it depends on what levers you’re able to pull, back to your point about what I can and can’t actually do in business practice. So say it’s the homepage, the button on the homepage has to be the same for everybody. So it really doesn’t matter if there’s subgroups that are more affected by blue than orange, than other people. I just need to pick one. And actually, at that point, a Bayesian would say, “I don’t care. Just pick one. Whatever you want.” [chuckle] That is the perfect time to let the most important person in the room decide, “Who cares? It doesn’t matter. It doesn’t affect the business outcome, but we should just pick one and move on.” A lot of statisticians would say, “Collect more data.” That’s ridiculous.

0:46:20 MH: Oh, yeah, just run the test until you get significance. Is that not a good idea? You don’t just run the test until it’s almost a significance.

[laughter]

0:46:28 MH: ‘Cause all of this is just going right over my head.

0:46:31 MK: I’m learning gazillions in this episode. So can we just focus on me for a moment.

[laughter]

0:46:38 EF: Sure. I know. I have so much lighter content to share with you guys. How did we get so deep?

0:46:48 MK: So, okay, I have a question, and I’d like to hear how you’d approach this problem. So we’re definitely not experts in A/B testing. It’s kind of in its infancy, where I work. And one of the problems we’re having in my particular team is that we always come back to measuring the success of an A/B test by a reduction in a particular category of customer service contacts. Now the problem is, even if we have 10,000 people in the control and 10,000 people in the variant, the percentage of those who contact customer service for a very specific reason is so small that I’m hesitant to make any decisions on it. What would you do? ‘Cause I’ve got no idea at the moment.

0:47:37 EF: So that’s actually a really important and practical problem in marketing that I don’t think statistics is tackling as well as we… The statisticians are just not providing solutions ’cause statisticians are used to measuring things like the length of your foot, which is normally distributed across the population. Whereas, you’re talking about a low-frequency event, a particular type of customer service contact. And as those events become very low-frequency, what happens is the required sample sizes just explode. It could be in the millions. It depends on how low the probability is. If it’s one in 1,000, or… If it’s one in 1,000, then in 10,000 you might see ten, right? [chuckle]

0:48:22 MK: That’s about what I’m looking at.

0:48:26 MH: I’ll cut this out if I fail the course at this point, but does the t-test not take into account the size of both the successful versus…

0:48:36 EF: So for conversions, you should be using a prop.test instead of a t-test. It’s very closely related, but what happens is the variance gets very large when that happens. So when the variance is large, it pushes the sample sizes up. So when there’s a lot of noise in the data. So the t-tests… There’s several ways to think of the t-tests, but I’m gonna describe it in sort of a Bayesian interpretation of it. And then, I’m just apologizing if there’s any real statisticians listening and they don’t like the way I’m describing it. I’m describing it in a Bayesian way, but…

[laughter]

0:49:07 MH: Oh, we lost them a long time ago. We lost them, like episode three.

0:49:11 EF: But the t-test is telling you, “Is there a high likelihood that the true average of these things is different?” Or in a prop.test, it’s the true proportion of people who call customer services, is it different between the A and the B? And that is a function of the variation in the data. So these low-probability events drive a lot of variation in the data, and the difference between the two things. So the bigger the differences, the more certain we can be that there’s a real difference between the things. If one causes half the people to call customer service, and the other one is running one in 1,000, we’re not gonna need a lot of sample to figure that out. So when you’re looking for kind of minute changes in the website that probably aren’t gonna change that customer service contact much, and it doesn’t happen very often, you’re gonna need massive, massive sample sizes to detect it. And that’s been a big issue in the academic literature lately. So some guys published a paper a few years ago, where they were looking at display ads on Yahoo. These were display ads shown on Yahoo homepage. Do you remember Yahoo had a homepage?

0:50:20 EF: And they had this low probability of clicking or converting in response to that ad. And the whole point of the paper is saying, “Oh, my God, the sample sizes here are in the millions.” There’s probably not enough data in the world for advertisers to figure out whether or not their advertisements are working if what they’re looking for are these kind of low-probability events. And there’s kind of no way around it. I guess, what I would suggest to you, Moe, is think about are there any proxies that you could use that are further up in the funnel that would be related to those customer service contacts that might give you a read because they’re not so low-probability. Maybe, looking at the page where you get the contact information for that service event, and more people are gonna do that than actually call and… I mean, I get you that that’s probably what the business cares about ’cause that’s expensive. So trying to get those down, but is there some kind of upstream measure you can use where you can get statistical significance, even if you can’t get significance on those actual customer service contacts? Does that make sense?
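
To put rough numbers on why rare outcomes blow up sample sizes, here is an illustrative R sketch using power.prop.test; the rates are invented, not taken from the Yahoo paper or from Moe’s data.

```r
# Detecting a 20% relative drop in a rare customer-service contact rate
# (0.10% down to 0.08%) at 80% power -- n per group lands in the hundreds
# of thousands
power.prop.test(p1 = 0.0010, p2 = 0.0008, power = 0.8, sig.level = 0.05)

# The same 20% relative drop on a common outcome (10% down to 8%) needs
# only a few thousand per group
power.prop.test(p1 = 0.10, p2 = 0.08, power = 0.8, sig.level = 0.05)
```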

0:51:24 MK: That does. And, I mean, that’s kind of how I’ve been thinking. But to be honest, the way you framed it is super helpful. I really, really appreciate it.

0:51:32 MH: I think, more importantly, I’m on the Yahoo homepage, and I’m not finding any display advertising. [laughter]

0:51:37 EF: Apparently, there was at one point.

0:51:39 MH: No. Oh, I’m sure there was. I think that [0:51:42] ____ may have come out of that study that they were like, “Yeah, nobody’s gonna pay for this.”

0:51:46 EF: Well, the actual punchline of the paper was, “No one knows whether they should pay for this or not because in order to figure out… ” So they were comparing ads for the product against public service announcements; that was their A/B test. But they figured out that the sample sizes that they would need are so enormous that they would never be able to figure out whether ads are better than public service announcements. And that’s a pretty big punchline, right? Nobody has enough of an advertising budget [chuckle] to buy enough advertising to figure out if it works. That’s not true for all channels, by the way. Direct mail is actually extremely effective. So you can find significant effects for direct mail or email pretty easily.

0:52:28 TW: Yeah. Absolutely.

0:52:29 MH: Man, this… Well, there’s so many things. And the bad news is we have to start to wrap up. But I’ve written down about four amazing quotes that I’m gonna steal from you. Like, “Randomization will set you free.” I loved that.

0:52:47 EF: It will! It totally will!

0:52:49 MH: This has been really awesome.

0:52:51 MK: It’s been utterly terrific. Yep, couldn’t agree more.

0:52:54 MH: What we really need to do is just find ways to hang out on the side, just to talk more [laughter] ’cause I’ve got lots more questions. Anyway, this is, I think, excellent and very helpful. There’s so many more angles, but I think we’ve gotta move on. One of the things we love to do on the show is called “the last call.” That’s where we go around the horn and we just talk about something we’ve seen recently that we think is cool, or of use to our listeners. So you’re our guest, Elea, so would you like to go first?

0:53:28 EF: Sure. I saw this tweet recently. This is related to what we’ve been talking about. I saw this tweet recently that I can’t attribute, because I can’t find it again and I can’t remember who gave it to me. So maybe it’s the tweet that I dreamed. But when you find an interesting effect in your data or an interesting difference between two groups, the first thing you should ask yourself is, do I have enough data to say these are truly different? And the second thing I should ask myself is, is it a fluke? And yeah, that was those two questions. I thought that was brilliant. It’s quite easy, no math: just, do I have enough data to say that this is a real difference? And is it a fluke?

0:54:09 MK: Nice.

0:54:10 MH: See, there are cases for why more than 140 characters actually works out. ‘Cause there’s no way that’d fit all in the…

0:54:16 EF: This is the tweet I dreamed, so maybe my tweets are… The tweets I dream are verbose.

0:54:22 MH: You should just tweet it. Get it out there, we’ll retweet it.

0:54:26 EF: If you’re listening and you’re the person who gave that tweet, tweet at me @eleafeit, so that I know who you are.

0:54:35 MH: Alright. Moe, what about you? Do you have a last call?

0:54:38 MK: I do. I don’t know if anyone’s ever mentioned it before, ’cause it’s not super-recent. But a while back we were at Superweek, and Charles Farina from Analytics Pros did a presentation, and he talked through some integrations and I ended up tweeting him. Long story short, he talked me through how to set up a tool called SessionCam. But the truth is… And that’s a fantastic article. It’s on his blog and we’ll share that, but it’s also just… For him as a person, he’s kind of inspired me to have a little bit of a tinker around with some other GA integrations. So I set up another tool on Friday on a pet project, so shout out to Charles for getting me off my backside and…

0:55:23 MH: Funky Charles Farina, we’re gonna make it a thing.

[laughter]

0:55:28 MH: Everybody funky Charles Farina. Alright. Tim do you wanna go or do you want me to go?

0:55:33 TW: Why don’t you go?

0:55:35 MH: So I wanna make a comment though ’cause I think one of the things I felt like I heard or observed in you describing the difference between sort of classical statistics and Bayesian, is this almost a sense of application that comes from the Bayesian approach because you’re considering this concept of asymmetric loss and other things like that. And that’s one of the things that I’ve been really trying as my team is developing, I’m trying to give them tools for application. And because, like your data science students, they’re super smart, have all these abilities to build predictive models of the data but don’t know the marketing application or context sometimes. My first is probably a repeat of something I’ve said on the show before but I’m starting with the beginning which is “Scientific Advertising” by Claude Hopkins as what I’m having people read to get started on that, so that’s my first last call.

0:56:32 MH: And then my second one is personal. There was a really great article recently by Anna Talerico, who I believe was one of the founders of Ion Interactive. She wrote about how she lost her ability to do deep work and what she was doing to regain it, 'cause as CEO she was spread so thin. That really resonated with me, 'cause I feel like I go through that exact same thing and then struggle to actually engage deeply and do things like analysis or any kind of meaningful, real, deep thinking. Really good article, so that's my other one. Okay Tim, what have you got?

0:57:12 TW: So I'm gonna give a listener comment and then I'm gonna tie it into a last call. A guy I know and I were going back and forth a little on LinkedIn, and his comment was, "Whenever I listen to the podcast, I always feel like my scope of work is super small. Machine learning, AI, text mining, and I'm trying to explain why customers hate my promo or why Adobe should never be allowed to touch our implementation." So he's having some challenges with his Adobe implementation. He also had earlier told me that I sounded like I was drunk on the Superweek episode.

0:57:42 TW: So we were going back and forth a little bit about, "Hey, what if we're still in the world of 'give me my reports, be a data retriever, make recommendations, tell me about the campaign'? How do we actually break out of it?" And recently, 'cause I recently took a new job, it happens to be at a company that pushes for a lot of ongoing learning. So the very, very tactical last call is the free Codecademy subscription for learning Python. I've been down the R path, and I've had various reasons that I need to ramp up on Python, and Codecademy, which I've dabbled with in the past, I was just struck by how well they make these little bite-sized things. You could literally carve off 20 minutes every morning for two weeks and go from zero to some basic level of understanding with something like Python or Git or whatever you want. Not directly applicable to digital analytics, and it doesn't cover some of the challenges we have of how we actually work with our data, but it's a form of a course to go through.

0:58:48 EF: Alright, I didn’t know we were allowed to have two last calls.

0:58:50 MH: Oh yeah, totally.

0:58:52 EF: So I have another last call?

0:58:53 MH: Get back in there, absolutely.

0:58:54 EF: So this is an old trick I learned in market research at General Motors, and I think it applies broadly to any analysis project we might take on. Before we were allowed to undertake any market research project or any kind of analysis project, we had to get the decision maker to give us a decision criterion, which is sort of an "if this, then that" statement about how your analysis relates to what they're going to do. So get them to say something like, "If you find out that…" Well, in our case with cars it was, "If I find out that this car has a styling appeal level below 50%, I will redesign the car," or some other major investment that they're gonna make on the basis of this. It immediately gets rid of the nice-to-know stuff and gets right down to the nitty-gritty of the "if this, then that."

0:59:48 TW: I am a huge fan of the "I believe this is going on" framing: don't just state your hypothesis, but say what action you're gonna take if the hypothesis holds up, 'cause it's a way to validate that there's some actionability there. But I like the "if this, then that."

1:00:02 EF: At GM we actually put it… We'd record the statement and put it on the first slide of every report-out deck we made for that decision maker.

1:00:10 TW: I like you.

1:00:12 EF: It wasn't a promise… We didn't hold them to it, but we reminded them of what they told us when we started.

1:00:18 TW: That’s fantastic.

1:00:19 MH: Okay, you've been listening to the show and you have a lot of questions, ideas, and things that you want to ask Dr. Elea McDonnell Feit, so you should. We would love to hear from you, and the way to reach out to us is via Twitter, our Facebook page, or The Measure Slack. Are you on The Measure Slack?

1:00:42 EF: Not yet. I'm not really a general analytics person.

1:00:46 MK: Oh, we’re gonna make you one. [chuckle]

1:00:49 MH: No, no. This is good because actually I think, A, your students will find this a useful group, and I think you can also give assignments to the people in Measure Slack when you want opinions. So this may be a community that you find useful.

1:01:04 TW: Yes.

1:01:05 EF: Okay.

1:01:06 MH: Certainly we do. But anyways, those are the ways. On Twitter you can reach out to Dr. Feit at @EleaFeit. We would love to hear from you. One thing for everyone out there: thank you so much. We've been asking over the past few episodes for more ratings and reviews on iTunes. We're not sure why we need those, but thank you for your response in giving those. [chuckle] I happened to come across our iTunes page and I saw them and, frankly, it's really heartwarming. Thank you so much for those ratings and reviews, keep 'em coming.

1:01:43 MH: Okay, we would love to hear from you. Reach out to us in those ways. Dr. Feit, thank you so much for being on the show. This was amazing. We always say this, and we really mean it this time: "My goodness, [chuckle] we've barely scratched the surface of what we wanted to talk about." Also, I really hope we were recording at the very beginning when I described how the show would go, because it almost kind of went that way, didn't it? [laughter] And on that note, and for my two co-hosts Moe and Tim: all of you out there, start being uncertain, but keep analyzing.

[music]

1:02:24 S1: Thanks for listening and don’t forget to join the conversation on Facebook, Twitter or Measure Slack Group. We welcome your comments and questions. Visit us on the web at analyticshour.io, facebook.com/analyticshour or @AnalyticsHour on Twitter.

[music]

1:02:44 Speaker 6: So smart guys wanted to fit in so they made up a term called analytic. Analytics don’t work.

[music]

1:02:53 MH: The show will go like this: I'll get us kicked off, ask you that first question, and we'll just get into a conversation at that point. Then you and Tim will just talk for about 45 minutes until I interrupt.

[laughter]

1:03:06 EF: What about Moe?

1:03:07 MH: That's not a you thing, that's a Tim thing.

1:03:11 MK: Tim, are you freaking out that I’m talking about stuff before the show?

1:03:14 TW: That’s what you do, Moe. [laughter] It’s what you do.

1:03:18 MH: Bayesian statistician.

1:03:22 MK: Statistician. Nailed it.

1:03:24 MH: Statistician. Gettin' there, it's been an afternoon of heavy drinking.

1:03:29 S?: Yay!

1:03:31 EF: Why are you so excited?

1:03:34 MK: I’m like, “How do we get you somewhere into a bar, where we can just sit around and have a few drinks and keep chatting ’cause… “

1:03:43 EF: Yeah. I live in Philadelphia, so if you’re ever in a bar in Philadelphia just call me and I’ll come there.

1:03:50 S?: Where are we?

[laughter]

1:03:55 MK: Okay, wait. I've got one more question now that this show has pumped me up.

[laughter]

1:04:01 MK: Sorry.

1:04:02 S?: Are you serious?

1:04:03 MK: Yes. I’m deadly serious.

1:04:06 MH: Rock flag and uncertainty.

[music]

4 Responses

  1. Ben says:

    Recording this should have made you rethink the entire “hour” in the name of the podcast. Maybe add an “s” to it and WHEN you have Dr. Feit back on you can let the recording last for a few more hours.
    Great Stuff!

