For our special International Women’s Day episode, we committed a type one error and peeked at our results, so we are releasing this winner three days early. As good analysts, we set out to optimise the podcast by swapping out Tim and Michael for two guests (it’s rare for Tim to be in the control group, but he’s an outlier either way). Unfortunately, it turns out we confused testing with personalisation, so we invited along a family member, Michele Kiss, as well as CRO expert Valerie Kroll, to talk about the evolution of the space from conversion rate optimisation (CRO) to experimentation. In Val’s words, good experimentation programs are all about optimising to de-risk product feature roll-outs and marketing tactics, all the while learning about our users and prospects. Stay tuned for the three tips from our guests on how to set up the best version of an experimentation framework, as well as the stats on the show’s gender breakdown since our start in 2015!
Ideas, Concepts and Frameworks Mentioned on the Show
- Valerie Kroll
- Michele Kiss (different from, but related to, Moe Kiss!)
- International Women’s Day
- Conversion Rate Optimization
- Ton Wesseling
- (Conference) CXL Live
- Summary of Bangaly Kaba’s CXL Live 2018 talk – “The Path to 1 Billion: Lessons Learned from Growing Instagram”
- Lukas Vermeer
- Testing Prioritization Frameworks: PIE, ICE, and PXL
- 10 Conversion Optimization Myths that Just Won’t Go Away
- Matt Gershoff
- Multi-Armed Bandits
- (Book) Why We Sleep: Unlocking the Power of Sleep and Dreams
- (Book) White Fragility: Why It’s So Hard for White People to Talk About Racism
- (FB Group) Moms in Tech
- Lily Serna
- (Book) Curious: Life hacks through maths
- The Measure Slack
00:04 Announcer: Welcome to the Digital Analytics Power Hour. Tim, Michael, Moe, and the occasional guest, discussing digital analytics issues of the day. Find them on Facebook at facebook.com/analyticshour and their website, analyticshour.io. And now, the Digital Analytics Power Hour.
00:27 Moe Kiss: Hello everyone, welcome to the Digital Analytics Power Hour. This is episode 136. And today in the host chair, you have Moe Kiss. Once upon a time ago, well, about a year ago, the Digital Analytics Power Hour started a new tradition. It’s officially a tradition because we’ve done it two years in a row. And that tradition is to honor International Women’s Day with a kickass show brought to you by all-female voices from our industry. Basically, today is a great chance for me to hang out with two industry people I adore, and well, talk data and hopefully, a bit of shit. But before we get into that, I wanted to share a little data with you because it is International Women’s Day. Tim and I, big shock, we disagree on how to count this particular metric, but one of the things that we’ve started looking at is the gender breakdown of people that we have on the show. I argue we shouldn’t include the host, Tim argues that we should include the hosts, but yeah, you can see where we get to with this discussion.
01:29 Moe: But I have a couple of stats for you. In the year 2015, 97% of the show’s contributors were male. That’s including our hosts: 97% of the voices on the show were male. In 2016, it was 99%, in 2017, it was 83%, 2018 was 64%, and last year in 2019, it was 61%. That means almost 40% of the voices that came to you last year were female. Now, it’s still pretty early in the year but so far, including today’s show, we’re at 56% male voices and 44% female voices, which, I don’t know, I’m pretty freaking excited about. I wanted to share those numbers with you. But on to the juicy stuff, oh god, I’m so excited about today’s episode. We’re gonna be talking about conversion rate optimization, or CRO. You’re probably gonna recognise our two guests from a conversion conference that came to you last year, where there was an X-Files themed movie poster that went viral because everyone thought that they’d somehow made a mistake. Apparently it’s a real thing, but anyway. First I’d like to welcome to the show, Valerie Kroll. She is the Optimization Director at Search Discovery and President of the Digital Analytics Association. She’s also held senior data roles at UBS and the American Medical Association, and is probably more organised than Tim himself, which is terrifying. Welcome to the show, Val.
03:02 Valerie Kroll: Thank you so much for having me. Big claims on Tim. Woof. I’m probably gonna get feedback on that one later. [chuckle]
03:07 Moe: Oh, I’m sure you will. Now, next off, we have someone who is a blood relative of one of the hosts. She’s a former co-worker of one of the hosts and a great friend of all three of us. But somehow, it’s taken 136 episodes to get her on and I’ve never heard the end of it. Today, we have with us Michele Kiss. Yes folks, you heard that right, my sister. Nope, we are not the same person. She is a senior partner at Analytics Demystified, where it feels like she’s been forever but in reality, just a short seven years. And prior to that, she held senior roles at Red Door Interactive and Kelley Blue Book. Welcome to the show, Michele.
03:49 Michele Kiss: I exist. I’m a person, a separate, entirely different person.
03:54 Moe: And sometimes, we like to call her Shum, so feel free to let that spread as well.
04:00 Michele: No, we do not.
04:01 Moe: We do.
04:03 Michele: Who can play this game?
04:05 Moe: Okay, I’m gonna stop there, sibling rivalry and whatnot. But yeah, conversion rate optimization, I’m really pumped to talk about it because I have a gazillion questions and thoughts on this topic. But Val, maybe we could just start with a good old definition and a little bit of how this trade has evolved.
05:29 Moe: But it sounds like people often describe CRO as kind of A/B testing. And if I’m doing A/B testing, then I’m doing CRO. Is that an immature mindset or does that summarise it well? It seems like the two really go hand in hand.
05:46 VK: Yeah, I think that there’s a lot where they do go together, but you’re right that they often do get conflated. I think that the difference between running A/B tests and having a formal CRO program is the underlying process and the way that you’re plugged into the broader organisation. You could be a digital analyst, like I was when I first got into the experimentation space, that ran an occasional A/B test to help prove out some ideas that my stakeholders were coming to the table with. But it wasn’t until we built out things like an intake process and a prioritization model, and made sure that we were always testing and taking advantage of every time we could learn from a pair of eyeballs, that it became a real program. And that’s really the difference between the two.
06:31 Moe: Michele, I’d like to bring you in a little bit because you’ve spoken at quite a few different conversion conferences. And I think one of the things that’s always a challenge is, how much is this the analyst’s job and how much is this its own separate function in the business? And I know, in some of your work in consulting, you’ve been involved in some of these programs. I guess, where do you think it should sit?
06:54 Michele: Yeah, I think it’s gonna depend a little bit on where your organisation is in terms of maturity because a lot of times, a single person is expected to fulfil multiple roles. And I think, as Val said when she started, being the analyst who also does some form of testing and optimization is an incredibly common place to start. But I think that as a company matures, as they’re trying to do more, you can’t possibly expect people to fulfil multiple roles, but I do see them as incredibly connected to each other. And sometimes, the places where I’m surprised certain conversations aren’t happening, are the ones where it’s like, “Wait, analytics and optimization should be so much more hand-in-hand than what I’m seeing examples of,” because I think that really is key to how it’s going to be successful.
07:50 Moe: One of the things that I have seen kind of play out client-side is, yeah, the two functions really should be hand-in-hand, but this view that somehow analytics needs to be QA-ing everything that a CRO… How do I refer to a CRO person in a short term? Is that CRO-er?
08:12 Moe: What’s the slang? I don’t know, Val, do you know?
08:16 VK: [chuckle] Well, actually, I think, CRO is way more of a popular term in Europe than I think it is in the US, and I hear experimenter or optimizer much more commonly over in our parts in the US, I would say.
08:29 Michele: I hear optimizer.
08:30 Moe: Oh, optimizer. Okay.
08:33 Michele: Yeah.
08:34 Moe: One of the things, getting back to my point, that I struggle with internally is, yeah, the two functions, analytics and optimization, should go hand-in-hand, but I feel like sometimes one might be blocking the other, if that makes sense. And in that vein, it’s normally analytics who are trying to QA the work of the optimizer and make sure they’re doing everything “right”, and in the process, slowing them down. And I had the good fortune of meeting Ton Wesseling at Superweek and I… Yeah, I mean, he’s a CRO genius. He’s fantastic. And he was like, “No, you should be able to set up your program and then walk away and leave it.” And then you’re just like, “Wait, how much damage could… ” Could things go really bad if you set things up and then just leave the optimizers to run with it, if they really don’t understand A/B testing in a whole lot of detail? This is the conundrum that I mean. How hand-in-hand should they be to ensure that one can work efficiently without getting blocked, essentially, but still has the right checks and balances?
09:38 VK: I’ll take my first half of this and I’d love to hear your thoughts too Michele, is if I had to pick a home for optimization experimentation to live and I had the choices of potentially an IT team where there’s just mostly engineers, or it’s in marketing or product or analytics, I’m gonna pick analytics. Because the most dangerous thing you can do is invest in the cost of testing, and then end up with outcomes where you can’t make smarter, better decisions that are less risky or are really gonna lead to true business impact where those results hold. And I also think that there’s a lot of roles for analysis when it comes to CRO, and not just on the back end once you’ve reached your stopping point but there’s a whole exercise of evidence gathering, ideally some quantitative data being a part of that to help identify what are the right opportunities to execute on. And how can I make sure that this test is going to put me in a better spot than before I invested the time to test. I’ve never seen analytics be a blocker because it’s only enriched every part of the CRO process from my perspective.
10:49 Michele: It probably depends on who’s defining blocker ’cause I could certainly see blocker being like, “But it slowed us down. We were trying to do these many tests per quarter and we had to spend all this time checking with all these checks and balances.” My view is that connected teams and functions are always a good thing. And I don’t feel like there are a lot of departments where we would ever say that everyone should get out of the way and they should all be doing their own thing and we should just let them run with it, ’cause I think that ultimately everybody has to be working together. It’s like, “Oh, well, marketing could get so many more campaigns run if legal would just butt out.” And it’s like, “Oh no, no, no, no. They need to work together.” And I feel like that’s gonna be the same thing with analytics and optimization, where there’s going to be a certain extent of analytics helping to set up good practices, and then a CRO team being able to move a little bit nimbly. But at the same time, your analysts are looking at your data all the time, they’re seeing what’s changing, they’re seeing what kind of behaviour and patterns there are. And you always want that to be feeding into an optimization program. You don’t wanna go through this one analysis exercise to come up with 10 ideas at the start, and then never look at data again. I think they just have to stay connected.
12:15 Moe: Yeah, that makes sense. I guess I’m trying to understand the interplay of speed and the most accurate results that are possible. And I don’t know, maybe this problem is super specific to my team where CRO does sit in another space but it’s this thing where it’s always like, “Okay, well if we really want to make sure the results are valid, then we need to wait for an analyst to analyze those results,” versus someone interpreting them in a tool which then can make things a little tricky, I guess.
12:48 Michele: I think you have to look at the potential for impact but also the potential for harm.
12:53 Moe: Exactly.
12:54 Michele: So, if you are running too fast and too furious, is there a risk that you will ultimately harm your business results? In which case, slowing down… It’s gambling in some ways. It’s like you’re gonna have to make these trade-offs. Do you do the risky thing or do you play it a little bit more safe? And some of that may be within the culture of your organisation. Are you a riskier company? Are you a “Move fast, make mistakes” kinda company? Or are you more of a, “No, we methodically wanna do all the right things” company?
13:24 Moe: But it’s funny, actually, and this was another thing I was chatting to Ton about. We obviously had a very short, rapid, but intense conversation. It seems like for lots of companies, the ideal case is that you have one metric that each test is trying to optimize towards. But in reality, that’s not my experience. And that’s just my own personal experience: there are always multiple metrics, ideally a primary and then secondaries. We call them nanny metrics, so if you make a change here, you wanna make sure that there isn’t a huge increase in customer service tickets, for example. And so you’re increasing the cognitive load on that optimization person, because they’re constantly trying to make this trade-off of, “Is the increase here worth a decrease there? And is that fair and reasonable?” And Ton’s advice was like, “Well, you should have one metric.” And I’m like, “Yeah, but that just doesn’t happen.” So for both of you: what’s your experience with that? Are most companies really struggling with this trade-off of one metric going up, another going down, or is this just my own little land?
14:24 VK: That happens all the time. So we call that the cannibalization analysis, because a lot of times you’re running tests, especially if you’re talking about website testing, where the user has already landed on the page. If you’re not testing to get more people to that page, then you’re asking people to make a different choice once they land. Which means typically, something is gonna be sacrificed for this more valuable action, right? And I think that the optimization analyst should never feel stuck at the end of an analysis, ideally, because you fill out a decision plan grid on the front end to say, “What are we gonna do for each of these outcomes, what will that mean and what are we gonna do?” And so you understand what the risks in those trade-offs are. So, perhaps you’re trying to get an audience to authenticate, and so then that means they’re not gonna interact with the page content.
15:14 VK: So I do get it that the primary goal here is to get more people to log in, so I see a lot of tests aligning to that, but sometimes you need to look at something a little bit shorter term to switch the KPI to make sure that that KPI’s sensitive enough for you to see how someone has behaved differently because of the challenge you’ve introduced. So, I really think that a lot of the rules that people put out there or some of the hyperbolic statements that people make is trying to drive towards a good discussion. And there’s good reason that they put that in place but there’s so much art to experimentation. So, you really do need to understand how to manipulate those risks and pull those levers.
15:55 Michele: Yeah, I would agree, because some of the best tests that I’ve seen run are the ones where we said at the start, “We’re gonna do this thing, we know it’s gonna hurt this other thing.” That’s just rational based on behaviour, and we are willing to accept a 30% drop or whatever that number is. And having that defined up front, knowing that that was the “uh-oh, no go” line. And some of the worst ones that I’ve seen are the like, “We’re just gonna launch it. Oh, that went down, is that bad?” They don’t know how this works…
16:26 VK: And how bad? [chuckle]
16:28 Michele: Yeah, and if you can say that at the start, even if it’s a SWAG, even if it’s a guess, I think it does set you up better to have those conversations after, when you’re getting the results.
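The pre-agreed thresholds Michele describes can be sketched as a simple decision rule. This is only an illustration: the function, metric names, and thresholds are hypothetical, not anything the guests prescribe.

```python
def evaluate_test(primary_lift, guardrail_change, min_lift=0.0, max_guardrail_drop=-0.30):
    """Decide the outcome of a test against thresholds agreed before launch.

    primary_lift: relative change in the primary KPI (0.05 means +5%)
    guardrail_change: relative change in the guardrail ("nanny") metric
    max_guardrail_drop: worst acceptable guardrail change (-0.30 means -30%)
    """
    if guardrail_change < max_guardrail_drop:
        return "no go: guardrail breached"
    if primary_lift > min_lift:
        return "ship"
    return "iterate: no primary win"

# A +8% trial lift with a 10% drop in free sign-ups stays inside the pre-agreed bounds.
print(evaluate_test(primary_lift=0.08, guardrail_change=-0.10))  # ship
```

The point is not the code but the ordering: the "uh-oh, no go" line is written down before the test runs, so nobody is left asking "that went down, is that bad?" afterwards.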
16:40 Moe: Do you think it’s a cop-out? I’m gonna preface this up front and say I feel like it’s a cop out. So I’m gonna talk about my particular situation, big shock, using the show for insider knowledge.
16:51 Michele: Moe’s therapy hour, right?
16:52 Moe: It really is Moe’s therapy hour. So, you have sign-up-for-free users and then you have people trialling a paid product. I feel like that’s always gonna be a trade-off. When you’re trying to get more people to trial a paid product, I feel like the fallback is always like, “As long as there isn’t a reduction in sign-up-for-free users. As long as that is steady, and we see an uplift in trials, that’s good.” And I’m like, “I’ve never seen a case where that happens.” But that’s constantly the bar that we set, because if you’re optimizing a page towards getting someone to try a paid product, you’re normally gonna see a reduction in free sign-ups. So it always feels like you’re just setting up the CRO team to fail in that case, if those are the metrics that you’re asking them to perform to. Is that reasonable or completely batshit crazy?
17:42 VK: I love that question.
17:43 Michele: I feel like there needs to be some math on that to say, “What percentage of those free users are going to eventually convert to a paying user, and so therefore, what amount of free can slip?” Also, what’s the cost to our business to run those free accounts, and how does that trade off against the money that we’re being paid from paid users? ’Cause it costs money to provide a service essentially for free. And even if it’s napkin or whiteboard math, something that tries to do a little bit of that, that says we can handle a 2% or a 10% or a 20% drop. But I agree, to think that you’re not gonna influence anything else is probably pretty unrealistic in most businesses.
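Michele’s napkin math can be made concrete. One possible sketch: value each free sign-up by its chance of eventually converting, net of serving cost, then compute how many free sign-ups a trial uplift can afford to lose. Every number and name below is invented for illustration.

```python
def breakeven_free_drop(extra_paid_trials, value_per_trial,
                        free_to_paid_rate, paid_value, free_serving_cost):
    """How many free sign-ups can we lose before the test is net negative?

    Each lost free sign-up costs its chance of eventually converting to paid,
    offset by the cost we no longer pay to serve the free account.
    """
    value_per_free = free_to_paid_rate * paid_value - free_serving_cost
    gain = extra_paid_trials * value_per_trial
    return gain / value_per_free

# Hypothetical numbers: 100 extra trials worth $40 each; 5% of free users later
# convert to a $400 plan; a free account costs $2 to run.
print(round(breakeven_free_drop(100, 40, 0.05, 400, 2)))  # 222 free sign-ups
```

Even a rough version of this turns “as long as free sign-ups don’t drop” into a defensible “we can absorb a drop of up to N”.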
18:28 Moe: Val’s shaking her head a lot.
18:32 VK: [chuckle] Yeah, I don’t even have anything to add. I think that that’s… Is it riskier to not make a new decision and try new messaging or try a new layout? Is the other choice to just stay stagnant? ‘Cause that seems risky, too. So really understanding what you can learn from these experiments, is just as important as making sure you’re able to measure and monitor percentage increases or dips. So, if running those experiments is gonna get you closer to understanding something new about your user, then that might be worth the cost to test even if you have to stick with your control variation, if that makes sense?
19:08 Moe: Yeah. Okay, so I wanna turn to an example that I heard at CXL Live, and if you’re in the CRO world and you haven’t been to CXL Live, it’s a really terrific conference. All three of us have actually been, so a little plug for them ’cause I think it’s a really fantastic event. But I heard… And I think his name is Bangaly? That is how it’s pronounced, he’s from the Instagram team, and he talked through basically how they changed their user sign-up flow and then basically all of their metrics completely tanked. And the learning was that you can’t make a bunch of big changes to a user sign-up flow in one hit. But I also think from a UX perspective, that can be incredibly limiting because sometimes you understand fundamentally that a user flow does need to change and there is gonna be the time it takes for users to get used to the new flow, etcetera. And I feel like sometimes in CRO land, it can make people really apprehensive to do a big overhaul of any kind of UX in one hit. Yeah, I guess I wanna understand, is that what you’re seeing that people are getting more reluctant to make big changes? ‘Cause sometimes, I don’t know, I’ve… Especially, I work at a company where there’s a lot of designers and sometimes they’re like, “This just isn’t working, we wanna completely change this.” But the CRO person, I feel like that would put them in an odd position ’cause they wanna measure every small change. How are most businesses handling this at the moment?
20:35 VK: I think that, in my opinion, there’s been an over-correction to some of the attitudes towards making sure we can measure each incremental change, ’cause we wanna make sure we can tie causality to the things we’re publishing. But I think the common denominator of how much change is too much change lies in the hypothesis statement. If the reason you’re changing sign-up is because your hypothesis is, “I believe if we were to introduce progressive pages, or we wanted to break this up into multiple pages, then we’ll see more people sign up,” well, it’s really hard to step into that, right? And so making that change across multiple pages, even though that’s “multiple changes”, it’s one test idea. And so, sometimes you have to put that big piece out there that ties back to your hypothesis statement, and then if it’s really important to your go-forward strategy to understand if there was any small piece of that that was the major contributor, you can always backtest. But making sure that it’s rolling out with an experiment in the first place is obviously keeping you from making any decisions that are gonna damage your business, and that’s the goal, right?
21:45 Moe: And it’s funny, Matt Gershoff… I was reading one of the articles you shared before the show, Val, which we’ll make sure to link. She has amazing resources, so we’ll make sure to link those. Where Matt Gershoff’s point is basically like, “Well, if you’re changing a headline, you’re actually changing multiple words at once and everyone seems to be fine with that so it’s reasonable that yes, if you did wanna overhaul some UX stuff, you could change multiple things on a page at once and that wouldn’t be the end of the world.” But yeah, that’s really interesting about the over-correcting ’cause that’s what I’m seeing as well. It’s like you need to be able to be certain of causality of every single minor change you’re making, which in some cases, I think, can be really limiting to designers.
22:25 Michele: I would be really curious as well to know how that example would have gone if it was a huge success. I feel like then it would be the story talking about, “You have to make these… You have to make dramatic changes. If you wanna affect your business, you need to really go out there. We overhauled our entire sign-up flow and we quadrupled sign-ups and blah, blah, blah.” And then it would be this huge success story talking about how you have to take risks. Sometimes, the learnings are also influenced by what the results were.
22:51 Moe: Yeah, that… Ooh, ooh…
22:55 Moe: Ooh. The politics. And Val, you also shared a really interesting point from Lukas from CXL last year. Do you mind just explaining that a little bit? ’Cause that really helped me solidify this point.
23:07 VK: Everyone was so excited for Lukas to talk at CXL because obviously, Booking.com has some really unique practices as it relates to CRO. The whole idea of testing is democratized throughout their entire organisation and they’ve done some things that a lot of organisations really strive towards. And plus, he’s just really fun and engaging as a speaker. But one of the things he brought up was, “How do you manage swim lanes?” So, “Okay, Lukas, we’ve heard you talk about how everyone in your company has the power to run a test. How do you deal with that? All these tests coming live, and how are you able to keep those separate?” And his whole mentality is, “I don’t worry about that, because I have a way to monitor at an overall level, and if I ever wanted to drill into interaction effects or to make sure that things were not muddying my data, I could always drill into it. But I would much rather the failures happen during an experimental phase than rolling out sequential things that in combination actually do hurt my business.” And that really flipped it for me in thinking about what I was really protecting against with building out these really discrete swim lanes from beginning to end. It’s a really interesting way to approach it, I think.
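The “drill into interaction effects if needed” idea can be sketched from the 2×2 assignment grid of two overlapping tests: if the tests don’t interact, the lift of running both variants together should be roughly the sum of the individual lifts. The data below is invented purely to show the arithmetic.

```python
# Visitors and conversions in each cell of two overlapping tests, A and B.
cells = {
    ("control", "control"): (1000, 100),
    ("variant", "control"): (1000, 110),
    ("control", "variant"): (1000, 112),
    ("variant", "variant"): (1000, 121),
}

def rate(cell):
    visitors, conversions = cells[cell]
    return conversions / visitors

# If the tests don't interact, the combined lift should be roughly additive.
lift_a = rate(("variant", "control")) - rate(("control", "control"))
lift_b = rate(("control", "variant")) - rate(("control", "control"))
combined = rate(("variant", "variant")) - rate(("control", "control"))
interaction = combined - (lift_a + lift_b)

print(f"interaction effect: {interaction:+.4f}")  # near zero: tests barely interact
```

A real check would also ask whether the interaction term is larger than its sampling noise, but the shape of the drill-down is this simple.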
24:18 Moe: Yeah, it’s funny, I was having a similar conversation again with Ton, I told you we got through a lot of content, where one of the things that I was really struggling with is that our CRO team is… We’ll have a test, it runs for two weeks, we’ll have an outcome, the next test is already ready to go, and I’m like, “Shouldn’t the results from the test we’ve just run be feeding into what we do next?” And his view was like, “Well actually, if you do that, then you’re going too slowly, because by the time you then build that and it gets through design, yadda, yadda, yadda, your next test won’t be ready to go for probably three weeks, and then you wait a week for another sprint, and it’ll be a month between every test that you do.” But his view was that, “Well, in that interim stage, you should test something different.” So the next test should not be on the same page. Yeah, you should have your swim lanes. So you then go run this other thing, and then you iterate from your learnings from the test that you just ran, and then you run that again in the next two weeks when it’s ready to go. Which gets us onto a topic that he shared with me, which I get the impression Val is really passionate about, and that I have not delved into too much, which is testing prioritization. So can you talk us through: what is testing prioritization, and how have you seen it work well?
25:35 VK: Yeah. So the idea is that if you’re doing a really good job of getting a healthy pipeline of test ideas coming from all different walks of life in your business, you’ll have more test ideas than you can actually execute on. And that’s good, that’s really healthy. But that also puts a lot of pressure on you to figure out: which one should I pick to go next? Which one am I gonna invest in testing? And so the idea of coming up with this objective prioritization framework is what helps you sift through those ideas and lets the best ones rise to the top. And when prioritization frameworks were first hitting the scene, really popular ones were PIE, which scores the Potential, the Importance and the Ease, and the ICE model, which scores Impact, Confidence and Ease.
26:19 VK: But basically you score your test ideas, and then you sum it across, and it’s out of a score of 10 or whatever, and then you can rank them by their score and that’s how you choose which one goes first. And so, I’m not sure if it was PIE or ICE that I chose first when I was at the American Medical Association, but one thing that never happened with someone submitting a test idea is someone saying, “I have this incredible idea, here it is, but it’s probably not gonna win and it’s probably gonna have really low impact and it’s gonna be a nightmare to get out the door.”
26:53 VK: And so I was like, “Cool. So I’ve ‘objectively’ scored all these tests and they’re all a 10.” So where did that get me? [chuckle]
27:04 Moe: Oh, wow.
27:05 VK: I wasn’t any better off than just taking that list of ideas and using it LIFO, last in first out. So the way that prioritization models have matured a little bit, I think, is to get a little bit more objective in figuring out how to score things, potentially giving binary points, asking more questions. Like the PXL framework asks about where your evidence came from, and that can give a test more credit. But honestly, my favourite prioritization models are the ones that everyone can understand, to the point where you never have to talk about it. Because if you’re spending airtime debating the points of a certain test, then you’re really not focused on the high-value conversation. So [chuckle] I love the ones where you just figure out what works best, set it, and can walk away from it, I think.
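As a concrete picture of the PIE-style scoring Val describes: each idea gets Potential, Importance and Ease out of 10 and is ranked by the average. The idea names and scores here are invented.

```python
# PIE scores (1-10) for some invented test ideas.
ideas = {
    "simplify signup form":  {"potential": 8, "importance": 9, "ease": 4},
    "new hero headline":     {"potential": 5, "importance": 6, "ease": 9},
    "reorder pricing tiers": {"potential": 7, "importance": 8, "ease": 7},
}

def pie_score(scores):
    """Average of Potential, Importance and Ease, per the PIE framework."""
    return (scores["potential"] + scores["importance"] + scores["ease"]) / 3

ranked = sorted(ideas, key=lambda name: pie_score(ideas[name]), reverse=True)
for name in ranked:
    print(f"{pie_score(ideas[name]):.2f}  {name}")
```

Val’s complaint is visible even in a toy version: because submitters supply their own scores, everything drifts towards the top of the scale and the ranking stops discriminating.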
27:51 Moe: Yeah. So Ton’s approach was to basically split tests into what bit of the site that they’re working on, and I really hope I’m not paraphrasing this incorrectly. Which bit of the site it’s addressing and then, using prior tests’ uplift as basically any prior information you have from other tests, to help re-rank. Basically, what was the actual observed uplift that you saw? And then you can use that to try and help you determine what the expected uplift is from a new test. And I really liked that but then it brought me back to the like, “Maybe you’re gonna… You’re probably gonna need someone in analytics to help you with this bit.” So then every time you wanna re-prioritize your tests, you’re gonna be dependent on analytics again and that sent me into a tailspin. And Michele, do you do much work with testing prioritization?
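Ton’s re-ranking idea, as Moe relays it, amounts to treating observed uplifts from past tests on each area of the site as the prior expectation for new ideas in that area. A minimal sketch under that reading; the areas, history, and idea names are all invented:

```python
from statistics import mean

# Observed relative uplifts from past tests, grouped by area of the site.
history = {
    "checkout": [0.04, -0.01, 0.06],
    "homepage": [0.00, 0.01],
    "pricing":  [0.08],
}

def expected_uplift(area, default=0.0):
    """Use the mean of past uplifts in an area as the prior for new ideas there."""
    past = history.get(area)
    return mean(past) if past else default

new_ideas = [("simplify checkout", "checkout"),
             ("new hero", "homepage"),
             ("reorder tiers", "pricing")]
ranked = sorted(new_ideas, key=lambda idea: expected_uplift(idea[1]), reverse=True)
print([name for name, area in ranked])
```

The appeal is that the ranking input is observed data rather than self-reported scores; the cost, as Moe notes, is that refreshing it leans on analytics every cycle.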
28:42 Michele: The majority of the work that I’ve done with it has been looking at things like… At one of the organisations that I worked at, we would run these revenue calculations of what we thought certain changes were gonna impact. So it’s like, we wanna add this ad unit to the page. And if we do that, we think that we can sell this much, we think we can charge this much, but we think these other two ads are gonna take this hit in their sell-through, and we think this performance, etcetera. And having a whole bunch of assumptions that we could toggle and play around with, and see how sometimes you would make wild changes and assumptions and it would make almost no difference, because what you were looking at testing was pretty small. But one of the things that I definitely learned in that was that it is key to keep that track record, because I literally had the same product managers who came to me every month like, “This is a 30% lift!” And I’m like, “Dude, you’ve literally never brought me an idea that had that lift ever.” So, “Cool, let’s talk about something more realistic.” And just keeping everybody really reasonable and level-set, and being able to do that in terms of… Also just the bottom line, what the revenue impact would be, and just keeping track of that over time. So you have your estimate, and then were you right? You’re probably super wrong, but over time you’ll get better at estimating it.
30:11 Moe: Is that part of the program though? Whoever is managing your CRO program really needs to understand human biases and heuristics, and understand how to challenge people’s perceptions of how good a test is going to be?
30:24 VK: I think so, but I also think that one of the soft skill characteristics that I think is really common in people in CRO is their humility because of how often they can be wrong. But also helping people understand why their gut isn’t always the truth. So you have to have some of those change management conversations for sure.
30:44 Michele: Yeah. I’ve seen it done multiple ways. I’ve seen the person who has really hard conversations about prioritization, but maybe doesn’t make a lot of friends. And then I’ve seen the complete opposite end of the spectrum, the person that’s really good at working with all of these people to get everybody to agree that your test isn’t as important as his test. And I think it’s somewhere in the middle there that there’s the success, ’cause you’re gonna make some people unhappy and make other people happy. But it’s a really hard line to walk, and I think it takes a certain set of soft skills to do that.
31:20 Moe: So one of the pieces of homework that Val gave me before the show, was the top 10 myths from an article that CXL put out, which I freaking loved. And there’s one myth in particular that I was like, “Damn, I’ve been thinking this for ages,” which is, if only three in 10 tests is a winner, that’s okay. Can you chat to us about why that thinking might be problematic? ‘Cause I think as you start to get into CRO, you’re really trying to be like, “Well, tests are gonna fail. That’s okay. That’s the whole point of testing. Only some should win,” I suppose. And this is harking back to our prioritization discussion. But yeah, what’s problematic about this thinking?
32:01 VK: I think it sets like an artificial barrier. You think like, “Oh, if I’m at a 25% win rate, then I can just keep doing what I’m doing.” So it doesn’t challenge you. And I think just to go back a little bit to what you’re saying in your conversation with Ton, I agree with some of that thinking about making sure you’re keeping track of where those wins were. But another key piece of information that I found to help be predictive of tests that could potentially be winners is where that idea came from and what evidence do we have to support it? So is it just because it was in an email from my CEO on a Sunday night because they saw the competitors doing it? Or is it something that we discovered in the data and we validated via usability and we also have some heat mapping to say that this is a great opportunity? Okay. Well, those three sources, that’s probably going to be a much more likely to win test. So I think that the issue is, there are things you can do with your prioritization and being smart about the choices you make that can make sure that you are winning more. But on the other hand, if you’re… Especially if your program’s just starting off, and you have that 80% win rate or 100% win rate, you might be picking up some low hanging fruit or you also might not be really challenging yourself enough to get into business level testing. So we just can’t live and die by that metric, basically, in my opinion.
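Val's point about weighting ideas by their evidence can be sketched as a toy scorer, in the spirit of the PXL-style prioritization frameworks mentioned in the show notes. The weights, evidence categories, and idea names below are made up for illustration, not from any published framework.

```python
# Hypothetical weights: ideas backed by independent evidence sources
# score higher than someone's Sunday-night opinion.
EVIDENCE_WEIGHTS = {
    "analytics": 3,   # discovered in the data
    "usability": 2,   # validated with users
    "heatmap": 1,     # supporting behavioural signal
    "opinion": 0,     # someone just thinks so
}

def evidence_score(sources):
    """Sum the weight of each evidence source backing a test idea."""
    return sum(EVIDENCE_WEIGHTS.get(s, 0) for s in sources)

ideas = {
    "ceo-sunday-email": ["opinion"],
    "checkout-friction-fix": ["analytics", "usability", "heatmap"],
}
ranked = sorted(ideas, key=lambda k: evidence_score(ideas[k]), reverse=True)
print(ranked)  # the well-evidenced idea ranks first
```

A real prioritization framework would also fold in effort, traffic, and page value; the scorer only shows the evidence dimension Val is calling out.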
34:13 Michele: I think of it as… And maybe this tells you a little bit about my personality type. Apparently, I’m going back to gambling examples. But I think about it as a way to continue to grow your business, but to know that you’re doing it in a way that’s actually successful. So you can sometimes try the crazy things. You can sometimes do the things that just need to be done, and you’re just making sure that they’re not harming some of your key metrics. But being able to grow and progress, and maybe in some cases give your organisation a little bit of a kick, ’cause we call it something cool like CRO, and get people to take action. But I do think that there’s certainly a part of that that’s true, that some of the things that you are going to test are things you would have done anyway, you just wouldn’t have had any measurement of them. Over time, you probably would have made those changes.
35:11 Moe: But then should you test it, if you’re gonna do them anyway?
35:14 Michele: But what if you’re wrong to do ’em? You think you’re supposed to, but what if it’s a huge failure.
35:18 VK: Or what if you deployed it the wrong way? There’s a client that we’re working with that wants to keep up with cutting-edge features to their product, which is the site, and they wanna add auto replenish. Well, you’re not gonna be able to get to that insight in one test, ’cause you need to figure out where to introduce it. Should you call it auto replenish to your users? Should it be named something else? Where do you need to explain it? Where does that appear in that happy path? So that’s what I like to think of as the learning agenda, because there’s lots of different little learnings that you need to chip away at until you get to that POV at the end that says, “This is the best way for us to approach adding auto replenish as a feature.” And for any question you could have had about the different choices you could have made, you’ve tested that and you know that this is the optimal way. And then you can refine from there. But especially if you’re making a big departure, you really do wanna make sure that you’re not doing any damage, like we’ve touched upon a couple times here, but also that you’re making the best choices possible. That you’re not just slapping that on and saying, “Well, it didn’t hurt. So let’s roll with it.”
36:30 Moe: So, and in that scenario, you would be testing the messaging and where it’s placed on the product and that sort of thing, basically over a fairly significant period of time, I’m guessing, with lots of different variations.
36:42 VK: Yeah. And I mean, you learn as you go with this. So even if you say on the onset, “I have these 12 questions that my product team is asking themselves on how to approach adding this feature.” After you’ve run those first two or three tests, then you might realise, “Well, I don’t need to run this one anymore. I actually have this new question.” And so it’s this living, breathing cycle of developing that POV. And when you’re able to truly inform the best way to address your users, that type of learning is richer than any win along the way because you can take that to other projects.
37:18 Moe: Speaking of learning along the way, Matt Gershoff presented about multi-armed bandits which he often does at Superweek, which was really interesting. But one of the things he mentioned, it just sounds to me like multi-armed bandit is the perfect approach for lots of CRO things. But that there is this barrier where people, I guess they’re really stuck, and they’re thinking about there needs to be a start and a finish and then when the test finishes, you roll it out to everyone. And that’s not the approach that you have with a multi-armed bandit. This is exactly where I think that that hand-in-hand analytics relationship needs to come into play because I can’t see this being something that’s really driven by a CRO team. You’re really gonna need your analytics team to be super involved if that’s something that you wanna start discussing is like, “What is right for an A/B test and what is right for a multi-armed bandit approach?” Are people in the industry… Are there experimentation programs mature enough that they’re starting to have those conversations, or is it still very much based on A/B testing and it makes sense because there’s a start and a stop?
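The continuous allocation Moe describes, with no fixed start and finish, can be sketched with Thompson sampling, one common bandit algorithm (an assumption here, since no specific method is named on the show). The conversion rates and arm names are hypothetical.

```python
import random

random.seed(42)

# Hypothetical "true" conversion rates the bandit has to discover.
true_rates = {"A": 0.05, "B": 0.15}
wins = {v: 1 for v in true_rates}    # Beta(1, 1) prior successes
losses = {v: 1 for v in true_rates}  # Beta(1, 1) prior failures

for _ in range(5_000):
    # Sample a plausible rate for each arm from its posterior,
    # then show the visitor whichever arm drew the highest sample.
    arm = max(true_rates,
              key=lambda v: random.betavariate(wins[v], losses[v]))
    if random.random() < true_rates[arm]:
        wins[arm] += 1
    else:
        losses[arm] += 1

for v in true_rates:
    shown = wins[v] + losses[v] - 2
    print(f"{v}: shown {shown} times")
# There is no stopping time to pick: traffic simply drifts toward
# the better arm as the posteriors sharpen.
```

The contrast with an A/B test is exactly the one Moe raises: nothing here "finishes" and gets rolled out to everyone; the allocation itself is the rollout.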
38:24 VK: I think that we’ve been trained to be very fixated on the frequentist approach, and to have that fixed horizon before you launch your test, ’cause we know the dangers of stopping early, and things like that. But some of the new features that the tools have, and some of the new tools like Conductrics and what their offering provides, allow you to think about things a little bit differently. Your choices are different. And to even just pause on that for a second, I think that there are some programs that aren’t even mature enough to pick their stopping time accurately with the full choices that they truly have at their disposal.
38:58 VK: If you’re using a calculator where you’re only inputting your raw conversions, your total visitors, and the percentage lift, that means that that calculator is making multiple choices on your behalf; it’s not letting you choose and customize the amount of risk that you wanna take for this testing opportunity. And I actually think… I was using the Search Discovery calculator that we have before I joined, so I don’t feel bad [chuckle] plugging that. But I think that it does a really nice job of taking those pieces and making it conversational, so that you can really customize how much sample you truly need to make the decision that you’re gonna make. So fast forward to the multi-armed bandit solution, very similar feelings on that. Depending on the business problem and the appetite you have for risk, it’s the perfect choice in a lot of cases, especially if you know how to wield that tool. [chuckle]
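The hidden choices Val is pointing at can be made explicit with the standard two-proportion sample-size formula (a textbook normal approximation, not any particular vendor's calculator). The baseline rate, lift, significance level, and power below are all illustrative defaults, which is precisely the point: a simple calculator picks them for you.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_arm(baseline, relative_lift, alpha=0.05, power=0.80):
    """Visitors needed per arm to detect a relative lift over a baseline
    conversion rate, two-sided test, normal approximation."""
    p1 = baseline
    p2 = baseline * (1 + relative_lift)
    z_a = NormalDist().inv_cdf(1 - alpha / 2)  # significance choice
    z_b = NormalDist().inv_cdf(power)          # power choice
    p_bar = (p1 + p2) / 2
    num = (z_a * (2 * p_bar * (1 - p_bar)) ** 0.5
           + z_b * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
    return ceil(num / (p2 - p1) ** 2)

# Loosening the risk you accept changes the answer substantially.
for a in (0.05, 0.10):
    n = sample_size_per_arm(baseline=0.03, relative_lift=0.10, alpha=a)
    print(f"alpha={a}: ~{n:,} visitors per arm")
```

Every argument with a default is a decision about risk that a raw conversions-and-visitors calculator would be making silently on your behalf.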
39:49 Michele: I think sometimes as well, there’s a lot of people that… A lot of what’s in us that wants to be more in control and wants to be able to say, “We did this, we ran this, we ran this test, we did this till here, blah, blah, blah.” And the idea of relinquishing some of that control and maybe feeling like then you can’t take as much credit if it’s successful, that I think is a little bit of a struggle that people have to come to terms with as they’re leveraging all kinds of different solutions. I think that’s a bit of a hard struggle is to be able to give up some of the control for the success.
40:30 Moe: And that’s one thing that actually really scares me, which I’ve seen play out through different optimization efforts. And this is where I’ll hark back to the, “Should you test it if you know you’re gonna roll it out anyway?” And this is more of like, if there’s a bug or something and then people wanna test it so that they can then communicate to the business, “Fixing this bug resulted in this much more revenue,” or whatever the case is. And then it becomes about being able to demonstrate your value to the business, and not actually fixing a really shitty customer pain point. I think it can become political really quickly where people can use it to justify their opinions, almost.
41:12 VK: Yeah. That irks me so bad. I was just shaking my head the entire time you were telling that story.
41:17 VK: Because think about that, think of it… You’re not in a laboratory. Sure, it’s an experiment, but you’re not in a laboratory. You’re talking about an actual business. You’re talking about actual users who are trying to give you money. So, what? Are you gonna send people to what you know is a suboptimal experience for the duration of this test? That’s not a CRO program to me. That’s rookie league stuff because that means you’re not connected into the business value that you’re providing your organisation. Add that to your list. If you’re so concerned about… Let’s say you discovered it. Great, add it to your list of contributions, but don’t do damage. First rule, do no harm, right?
41:57 Moe: Yeah. Val, you did have some advice that I wanted to touch on which was around internal marketing and gamification. What did you mean by that?
42:07 VK: So when I was first at the AMA, we were the first analytics team, and we were the first to be bringing optimization to the organisation. And so we tried lots of things to get people excited and on board. And no tactic was beneath me to get someone excited [chuckle] about what this new capability could provide. And even back to our earlier conversation about giving some of those bad test ideas, we would name tests in Optimizely. Our CMO had an idea that we all really did not wanna roll with. So we named the test Rod CMO, and it was versus the control. And so it was like Rod versus the control. One time a developer and a UX designer were at odds about the way something should go forward. And so we named it Ben and Tim. Those were the names of those variations.
42:55 VK: So those are things actually in the tool to get people excited about the outcomes. And so we did things like whiteboard takeovers. We would do guerilla tactics, like joining people’s stand-ups to talk about ways we could insert testing and getting people to vote for which variation they thought won. All those individual tactics. Sure, it took our time and that was an investment, but it really got people excited about it and made it not seem like… No one’s gonna come to a meeting called, “Learn How To Write A Hypothesis Statement.” But…
43:26 VK: If you’re like, “Hey, who wants to prove the CMO wrong?” They can get jazzed about that. [chuckle]
43:33 Moe: Oh my God. I’m completely stealing this idea. This is brilliant. Having said that, I feel like for me, it’s not actually getting people excited about experimentation. Everyone’s excited and everyone wants to do it. It’s more about, how do we make sure that we’re running the tests that are the most valuable.
43:50 Michele: Almost getting them to slow down.
43:51 Moe: Yeah, well Michele, what techniques have you found valuable in getting people behind experimentation?
43:56 Michele: I definitely agree with internal promotion. And depending on the organisation that you’re in, it’s probably… It’s a little different if it’s all one company that all do the same thing but sometimes there might be different brands within one organisation. And you can show all the cool stuff you’re doing with one brand and then suddenly everybody who wasn’t that interested is like, “Oh, I wanna do that. I want what she has.” And I think that having a little bit of that brand versus brand, can help to get people excited about it.
44:28 VK: The FOMO effect.
44:31 Michele: Yes.
44:31 Moe: Yeah.
44:33 Moe: Okay, before we start moving towards a wrap-up, Michele as an analyst, I’d like you to have your analyst hat on and Val, as an optimization person, I’d like you have your hat on. Let’s make it competitive. Not at all, but…
44:48 Moe: Kinda like how your gamification works. Can you give me what you think your top three must-haves are, for an optimization program? What are the three core things that if you’re not doing, you’re gonna be in deep shit? Maybe I should have given you forewarning about this.
45:03 Michele: Do we have to give all three right at once? ‘Cause one thing that I think is critical is that you are doing analysis and exploration of your data, because I think that you’re gonna get really good opportunities and ideas from that and if you’re barely even looking at the data that you have, it’s hard to draw that value. And I’ve gone to some conversion conferences, thinking that I’m preaching to the choir and occasionally I’ll talk to people and I’m like, “Oh, how do you come up with test ideas?” And they’re like, “Well, we just make ’em up.” And it’s this huge miss. It was really surprising to me, but I feel like it is definitely a missed opportunity.
45:38 Moe: So using analysis and data to inform which tests to run. Nice.
45:44 Michele: Yep. And come up with ideas, find the gaps.
45:47 Moe: Val, number one?
45:48 VK: My first one is, understanding the tools and the statistics that are powering your recommendations and analysis, because if you don’t understand that, then you’re no better off than buying a dart board. So understanding the statistics and the way that you’re helping support de-risk decision-making is numero uno in my book.
46:10 Moe: Okay. Michele, number two?
46:12 Michele: I think combining your optimization program with, generally, just what you’re doing as a business, because I do feel like I see a lot of examples of companies who are testing on this one path, but then everything else… They’re like, “We’re gonna change this on the pricing page.” But then, “Oh, we’re gonna do a whole website redesign.” It’s like they’re just totally divorced from each other, and I feel like you have optimization people spinning like hamsters in wheels, doing this ticky-tacky stuff, instead of it all being a part of your overall strategy. And I just think it’s critical that those things are together.
46:48 Moe: Yeah, nice. Val, number two?
46:50 VK: It’s actually very closely related to what you were just saying, Michele, and it’s partially making sure that you have that handshake with the analyst team, if you’re not already ingrained in each other’s work, because I guarantee you that they’re gonna have a measurement framework that you’re gonna wanna be plugged into, to understand how you can optimize towards the most important goals for your organisation. And how to get in the mindset of your stakeholders, to know what motivates them and how their success is tracked, because to Michele’s point, if you’re optimizing something over here and you’re like, “Cool, cool, cool, not a priority, not even gonna prioritize rolling out that win, ’cause it’s not worth it.” Then again, there’s a cost to testing and you don’t wanna waste any of that time.
47:32 Moe: Yeah, nice. Number three, Michele.
47:33 Michele: I see a lot of organisations that have relatively high turnover, and recently I bumped into one where the person who’s in charge of optimization basically had this long deck that literally listed everything. Every test had its own slide, and it was one place you could always go to. And I remember having this like, “Hey, have we ever tested this thing?” And he’s like, “Yep, 18 months ago, we did this. This is what it looks like. This was the winner. Did we roll it out or not? What date?” I’m like, it’s easy to say, but I find that it doesn’t happen very often. And I think it can let you look back at where you’ve come from, see what’s happened, and also learn from older tests, carry through that learning and that education, and then maybe not repeat some of the same mistakes, if people are gonna just do the same thing over and over again. And it’s probably not the most fun thing to do, to be the person that documents all the history, but I don’t know, I guess there’s historians, so they must like it.
48:41 Michele: But it’s so valuable to be able to look back and do that.
48:44 Moe: I think storing what you’ve done somewhere is… It seems to be something that so many organisations don’t master. And so then, yeah, you have turnover or people don’t know what was tested two years ago. And should you revisit that test, or what were the results then? Is it gonna be the same, different? Yeah, I’m not gonna lie, I haven’t cracked that.
49:03 Michele: The thing that made it really good, to me, was that it was so easy to digest. It wasn’t, “Here’s a folder with every analysis of every test we’ve ever done, and there’s zero chance anybody’s ever gonna read it.” It was very punchy and like, “Here’s the visuals. Here’s exactly what it looked like. Yes, we shipped this. We didn’t… ” And it was just really well-summarized. I think people have to be able to use it and digest it as well.
49:29 Moe: Okay, lucky last. Val?
49:31 VK: [chuckle] I’ve loved all of yours, Michele, by the way. And my third suggestion is also not very sexy, but it’s investment in process, because you could have the best test ideas in the world, you could have the best technology deployed super efficiently, but if you can’t get unblocked to get these things out the door, or you don’t have a regular communications plan so that other people know what you’re doing, you’re gonna get stopped. And so, I bet you would be surprised to know the proportion of people and clients that we work with that aren’t actively running test cycles, because there is so much in building that process in the upfront phases and thinking about how to make this all come together. But then it’s beautiful once you’re over that hump, because then you’re running seamlessly and you can have that high-velocity program, right? But the process isn’t always the sexiest thing, the most fun, but…
50:23 Moe: See, I told you, you and Tim are in cahoots because Tim would talk about process too, and the perfect documentation which is done well, well in advance. So, I’m gonna be in so much trouble because we have gone slightly over but it’s because this is clearly gonna be one of my absolute favourite episodes of the year. I admire both of you so incredibly much, so it’s a real privilege to get to speak with you both, especially on such an amazing day as International Women’s Day. But before we wrap up, we do have last calls. So Michele, do you want to take it away?
50:58 Michele: Yeah. So I listen to the podcast and many other places, and I hear an awful lot of, “I read this great book and I read this book. It changed my life,” quoting Moe about the sleep book.
51:14 Moe: Yes, Why We Sleep, love it.
51:16 Michele: But I always listen to them and I’m like, “Dude, I have a big job, I have a second job. I have a three-year… ” There’s not enough hours in a day. And I found something at one point a couple of months ago called Blinkist, and it’s like literally CliffsNotes but for books. And they go through some of the popular titles that people have heard of where it’s like, “I wanna read it but it feels like the book is… “
51:44 Moe: I don’t have 10 hours.
51:46 Michele: Yeah, I just don’t have time. And sometimes the book feels very stretched out, like you could have told this in 1000 words. And so, I found that that’s a really good way to be able to feel like I’m catching up on things when time is of the essence.
52:00 Moe: Nice. And Val?
52:02 VK: So my recommendation is a book, so hopefully Michele, this is…
52:08 Michele: We didn’t plan that.
52:09 VK: ‘Cause apparently I’m the lame-o who recommends books, which I swear I usually don’t do. But for, I don’t know, maybe six months now, I’ve been on this journey exploring how women’s groups are exclusive and not more inclusive of other underrepresented groups. And I think, Michele, this is actually something you’ve been exploring, ’cause I saw some of the changes you made on Measure Slack and some of those, which is awesome. But there’s this book that was recommended to me by one of the Diversity and Inclusion experts at my women-only coworking space, and it’s called White Fragility: Why It’s So Hard for White People to Talk About Racism.
52:46 VK: And I’ve been listening to it, and I don’t know how fast I have the setting, but it’s four or five hours all in. But it’s an incredible book. The first 20% was just defining things like prejudice and racism and giving some context to why this book needs to exist. She’s a sociologist, and it’s just so well done, and it really opened my eyes to things like: racism doesn’t mean you’re a bad person. It’s about the systems. And it’s just been really interesting, and it’s a book that I wouldn’t necessarily have picked up, but I’m really glad it was recommended to me because I’ve actually learned a lot.
53:23 Moe: Oh, I’m really excited to read that.
53:25 Michele: Am I allowed two? [chuckle]
53:26 Moe: It’s called a twofer, Michele. A twofer.
53:29 Michele: Oh, I didn’t know to introduce it as that. Val, you just reminded me of something that I actually think would be very good, especially given International Women’s Day, but also like what you were just talking about with women and groups and things like that. This is a discussion that’s come up a lot in a group that I’m in on Facebook. And normally Facebook mom groups are like the seventh circle of hell. Everybody is horrible and judgmental and it’s just a terrible place to be. But for anybody who is struggling with balancing this analytics career, or optimization career, whatever you’re in, and parenthood, there’s a group on Facebook called Moms in Tech. And it very much feels like people who are having similar struggles of how do I balance all of these things. And that’s actually a topic that’s come up a lot: racial and religious diversity and things, even within our group.
54:22 Moe: Nice. So, believe it or not, I also have a book. Yeah, so apparently everyone reads except Michele. She just goes straight for the short approach, which I like. There is a woman here in Sydney called Lily Serna. She is a product analyst at Atlassian, but she also hosts a show called Letters and Numbers, which is basically… She’s a mathematician and she’s really involved in trying to help children learn math. But she wrote a book called Curious: Life Hacks with Maths. And I said to someone the other day, I was reading it, and they’re like, “Isn’t that a kids’ book?” And I’m like, “No.” It’s written for adults, but I would argue that probably the target audience is high school kids, maybe 20s. But it teaches you all these little things, like when you’re throwing a dinner party, how to hang a picture, how to figure out whether you should order a medium or a large pizza, how to cut a cake with the most effective number of cuts so that you don’t have wastage.
55:25 Moe: And it’s like, I love it because everyone tells me in this industry, you have to learn math. And she’s finally found a way to explain it to me in a really relatable way, plus she’s also just a really lovely human and is doing a lot for women in the math space in Australia. So Lily Serna, she’s completely incredible, her book Curious: Life Hacks with Maths. I haven’t finished it yet, but I’m excited. I’ll let you know as I hear more about it. So I can’t yet… I’ve gotta wrap, it’s over. But we would love to hear from you on the show so please reach out. Both Michele and Val are super active in the Measure Slack community and on Twitter. And so feel free to reach out to them with any questions, we’ll be sharing every single link we can find in our show notes. And thank you for listening.
56:16 Announcer: Thanks for listening, and don’t forget to join the conversation on Facebook, Twitter or Measure Slack Group. We welcome your comments and questions. Visit us on the web at analyticshour.io, Facebook.com/analyticshour or at analyticshour on Twitter.
56:36 Speaker 5: So smart guys want to fit in. So they’ve made up a term called analytic. Analytics don’t work.
56:44 Speaker 6: Analytics. Oh my god, what the fuck does that even mean?
56:53 Moe: Val, can you just do a little bit of talking, I’m gonna check your WAV files.
56:56 VK: Yes, I would love to do a little bit talking so you can check my WAV files. [chuckle] I feel like my little bumps on this line are smaller than your bumps.
57:08 Moe: Oh, you do have baby bumps.
57:11 Michele: That totally sounded wrong.
57:15 Michele: I mean, this isn’t super surprising. I feel like this is just data and evidence that says that the Kiss girls are really loud and talk a lot.
57:22 Moe: Well, that’s also true and I’m really glad I started recording so we captured that.
57:28 Michele: Hey, my bumps got smaller.
57:30 Moe: Your bumps got smaller.
57:32 Michele: My volume bumps.
57:33 Moe: Fuck.
57:37 Moe: Okay, Tim is online. I know he’s in a meeting but just let me see what happens.
57:44 VK: I dare you to text him saying, “We need help, Val’s bumps are too small.”
57:53 Ashton Kutcher: So I just started making the investments on my own.
57:55 Marc Maron: With your own money?
57:56 AK: With my own capital.
57:57 MM: Yeah.
57:57 AK: And I invested in a company called Optimizely, which is an A/B testing tool.
58:03 Michele: Rocket flag and conversion rate optimization. Did I do it right?
58:08 Moe: Michele, it’s rock flag. Rock flag.
58:11 Michele: Okay. Well, it only took 136 shows for me to figure that out. I’m gonna go Google it now.
58:16 Moe: Okay.