#111: Automation in Analytics with Erik Driessen

We thought we deserved a break from the podcast, so we went looking for some AI to take over the episode. Amazon Polly wasn’t quite up to the task, unfortunately, so we wound up sitting down as humans with another human — Erik Driessen from Greenhouse — to chat about the different ways that automation can be put to use in the service of analytics: from pixel deployment to automated alerts to daily reports, there are both opportunities and pitfalls!

Episode Links

Episode Transcript

00:01 Amazon Polly: Welcome to the Digital Analytics Power Hour. Tim, Michael, Moe, and the occasional guest discussing digital analytics issues of the day. Find them on Facebook at facebook.com/AnalyticsHour and on their website, analyticshour.io. And now, the Digital Analytics Power Hour.

00:27 Michael Helbling: Hi everyone, welcome to the Digital Analytics Power Hour. This is Episode 111. You know what is great? Automation. This whole show is now fully automated, automated, automated, automated. Okay, but seriously, there are so many great ways to bring automation to the role of the digital analyst. A wise man, for a human, named J. Schelling said it this way: "You always want to look for smart, lazy analysts: smart enough to get the job done, and lazy enough to not want to do it over and over again." Like writing and delivering the introduction to a podcast. Okay, well, let's get into it. Tim, are you a fan of automation?

01:09 Tim: Absolutely.

01:10 MH: Moe, have you delved into automation yet in your new world?

01:13 Moe: Absolutely, actually I’m gonna probably touch on that today ’cause I need some advice.

01:18 MH: And I am Michael Helbling. And my automation capabilities are mostly limited to keyboard shortcuts in Excel.

[laughter]

01:25 MH: So obviously we needed a guest who could give us and our listeners a leg up on thinking about automation, analytics, and its possibilities. Erik Driessen is an analytics evangelist at Greenhouse. He's spent the last seven and a half years at that company and held numerous different roles there. But luckily, today he is our guest. Welcome to the show, Erik.

01:49 Erik Driessen: Hey everyone!

01:50 Moe: Howdy.

01:51 ED: I feel like… How you going…

01:53 MH: How you going, exactly. So this will be a test, you know, because we’ve heard, Erik, that you’re very talkative. And so… No, I’m just kidding.

[laughter]

02:05 MH: Let’s get into automation, and start breaking it down a little bit, so maybe we could start with just some of the categories that you see, Erik, in your work and then we’ll go into more depth on each of those.

02:20 ED: Well, basically, what we do with my team is, actually, we found out that in our day-to-day activities there's a lot of stuff we just do every day in a repetitive way. And some of those things are not really fun to do. For example, placing tags, that's not the most fun thing to do, or running a data check every day for the same client. So we try to automate a lot of that stuff that's automatable.

02:52 Tim: Well, how does the data check piece work? 'Cause I've known people who say, "Well, I've got automation, in that I've set up my own little report that I can look at every day to see if anything has gone awry," and then there's the side of, "We're gonna set up alerts to trigger if something has gone awry." Both of those have some downsides, the latter being that alerts, like in Adobe Analytics, can be pretty cumbersome to set up and they're not necessarily easy to tune. So how do you do the automation of the data checks?

03:28 ED: Basically, what we do with the team is, whenever we see an opportunity like that, we try to plan a full day with the whole team. We call it "The Hackathon," and we just invest heavily in a solution that we think is valuable. One of those for us was the data checks, and what we did is, we basically approached it the same way as the Google Analytics data checks, because, I don't know if you know the alerting function in that system, but it's also pretty limited.

[laughter]

03:57 ED: Yeah, that's a nice way to put it even, pretty limited. But basically, with the Reporting API you can do a lot of really advanced checks, because you have access to all the data that's in the Google Analytics Reporting API. So what we did first is set up a prototype where we get data from Google Analytics through the API and send a message to Slack based on what we see. The check could be a flatline check, or just a report on numbers from the previous day. Then we tried to set it up so that a lot of non-technical people can use the system as well. So right now we basically have a Google Sheet where people can fill out a check for a client, and they can even set the name they wanna use for a metric in a notification, because sometimes "quantity" really represents the number of insured people, for example, and you wanna show that as a readable message. All those things are linked together, so right now we have a system that allows non-technical people to set up an alert based on anything that's available in the API, and it sends a human-readable message to Slack whenever something is wrong, or just a number that needs to be reported on for a campaign.
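To make that concrete, here is a minimal sketch of the kind of flatline check Erik describes, using the GA Reporting API v4 and a Slack incoming webhook. The view ID, goal metric, webhook URL, and user handles are hypothetical stand-ins for what would come out of the Google Sheet config.

```python
import requests
from google.oauth2 import service_account
from googleapiclient.discovery import build

VIEW_ID = "123456789"  # hypothetical GA view ID
SLACK_WEBHOOK = "https://hooks.slack.com/services/T000/B000/XXXX"  # hypothetical

# Authenticate against the GA Reporting API v4 with a service account.
creds = service_account.Credentials.from_service_account_file(
    "service-account.json",
    scopes=["https://www.googleapis.com/auth/analytics.readonly"],
)
analytics = build("analyticsreporting", "v4", credentials=creds)

# Pull yesterday's total for one goal metric.
response = analytics.reports().batchGet(body={
    "reportRequests": [{
        "viewId": VIEW_ID,
        "dateRanges": [{"startDate": "yesterday", "endDate": "yesterday"}],
        "metrics": [{"expression": "ga:goal1Completions"}],
    }]
}).execute()
total = int(response["reports"][0]["data"]["totals"][0]["values"][0])

# Flatline check, using the human-readable metric name from the sheet
# ("quantity" might really mean "insured people") and the assigned users.
if total == 0:
    requests.post(SLACK_WEBHOOK, json={
        "text": ":rotating_light: Insured people flatlined yesterday "
                "(0 goal completions). <@analyst> <@optimizer>"
    })
```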

05:12 Moe: So my question is though, in a previous role, we had so many notifications set up, it just got to the point where people kind of stopped giving a shit and particularly with email notifications, they would just filter all the messages out ’cause they were getting 120 notifications a day. What’s the culture like at your company? Is there one owner for a particular notification, is it going to a group distro? How are you managing that?

05:40 ED: Well, we mainly use Slack for the analytics notifications right now, and what we do is, whenever something is wrong, so whenever there's a flatline of an important campaign metric, for example, we make sure to assign users through the message as well: the people who are responsible for a campaign, which is mostly someone who's a campaign optimizer at our agency, and an analyst who is responsible for the Google Analytics implementation, for example. They get a direct notification as well. So everyone who is in the channel knows that something's up, but there are two people assigned to the notification too, so you get a direct message, and also an alert on your computer, that something is off that you need to look into.

06:22 Tim: Which is… that's kind of… It seems like there are two challenges with alerts. One is getting them tuned so that you don't have false positives; that's definitely the "oh, we set the threshold too conservatively and now we're inundated with notifications" problem. It's funny, with Alarmduck, which is a solution built for Adobe Analytics, I've actually seen that work: when the notification pops up, it's typically a legit issue. So I guess, how much tuning has to happen? And then, separately, you still need people to recognize there's value in spending the time, even if you've streamlined it in the Google Sheet. Do you have a struggle getting people to say, "This matters"? To me, people are most interested in alerts when they're least helpful: after something is broken and it was missed. It's closing the doors of the barn after the horses have already gotten out.

07:25 ED: We look at, in the case of Google Analytics, all the goals that are configured and used in a campaign; we have an automated campaign reporting system as well. Everything that's configured as a goal in that system, we currently configure for the campaign people in the sheet, so they don't set up the notifications themselves. But we make sure that every notification is for a metric that's used in the campaign optimization process. We do not track a metric because we think it's interesting; we only track metrics when we know they are used in campaigns. And because of that, we also think it's really important that whenever that thing flatlines, it should be a big alarm to anyone. So we make sure that people get instant notifications, and that both someone from the analytics team and someone from the campaign team are updated about the issue.

08:18 Moe: And do you have notifications for spikes as well? The other day one of the engineers was like, "Oh, this is so weird. We've got this huge influx of traffic," and it was from some random IP address, which turned out to be a bot. So do you… Yeah, 'cause you keep talking about flatlining. Do you have the inverse of that, where something is hitting you and suddenly your metric is totally stuffed, as I experienced?

08:43 ED: We don't have that right now. We are looking into ways to look at, like, the central differences, or, what I think is a really good example as well, what Tim showed last year at Super Week with the anomaly detection: basically, make a prediction for today and see if the current data point is inside or outside of that prediction. But we do try to create custom notifications based on what the business needs, and we didn't really get this request yet. What we do have is, for example, a check that monitors whether your running A/B test data is coming into Google Analytics. So when you connect, for example, VWO or Optimizely to Google Analytics, you wanna make sure that as soon as your tests are running, so from the start date of your test, the data comes into Analytics. We have a notification for that, so we get a notification when it's not happening. And we have something for funnel logic as well. For our campaigns, we often set up a goal for the Thank You page, the Personal Details page, baskets, and stuff like that. It would be really weird if, like, the Personal Details page had fewer hits than the Thank You page. So we also do a logic check like that. That's not really a spike check or anything, but it should not be possible for someone to enter the Thank You page without having first visited the Personal Details page, so we should get an instant notification when that happens.
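A sketch of what such a funnel-logic check might look like, reusing the `analytics` service object, `VIEW_ID`, and `SLACK_WEBHOOK` assumed in the earlier sketch; the page paths are hypothetical.

```python
def pageviews(analytics, view_id, path):
    """Yesterday's pageviews for one exact page path."""
    resp = analytics.reports().batchGet(body={
        "reportRequests": [{
            "viewId": view_id,
            "dateRanges": [{"startDate": "yesterday", "endDate": "yesterday"}],
            "metrics": [{"expression": "ga:pageviews"}],
            "dimensionFilterClauses": [{"filters": [{
                "dimensionName": "ga:pagePath",
                "operator": "EXACT",
                "expressions": [path],
            }]}],
        }]
    }).execute()
    return int(resp["reports"][0]["data"]["totals"][0]["values"][0])

details = pageviews(analytics, VIEW_ID, "/personal-details")  # hypothetical paths
thanks = pageviews(analytics, VIEW_ID, "/thank-you")

# Nobody should reach the thank-you page without passing the details page,
# so more thank-you hits than details hits means the measurement is broken.
if thanks > details:
    requests.post(SLACK_WEBHOOK, json={
        "text": f":warning: Funnel logic broken: /thank-you ({thanks}) "
                f"has more hits than /personal-details ({details})."
    })
```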

10:05 Moe: Do you ever feel like there are circumstances, though, where the amount of effort to set up these notifications… 'Cause from a reporting perspective, I know I've had this happen plenty of times: something takes you five minutes every day, and you figure that to automate it is probably gonna take you two and a half weeks. Is that actually worth it in terms of investment? Is that a scenario that's come up for you yet?

10:34 ED: Oh yeah, it does. With a guy in my team, we basically made a prioritization of all the repetitive tasks that we do, sort of a priority list where you look both at the amount of effort that's required to automate a task and at the impact it would make. But what we try to do now is, we sit together with the team and we define a topic, because for a lot of automation things we don't really have the time to invest in them during our daily jobs. So we plan a Hackathon where we just work on it with the whole team, and at the end of the Hackathon we have a minimum viable product: something that works, that proves the point. Then we assign a product owner to the product to make sure that it keeps developing and gets new features added to it, and we also try to set a target. So, within two or three months, we wanna make sure that, for example, 20 of our clients are connected to the reporting notifications, so we know that they're actually used as well.

11:35 Tim: But that's interesting, 'cause mostly we've been talking about alerting, and then, Moe, you're bringing up reporting. It's interesting when you're thinking about automation: there's the time-savings aspect of it, which I feel like applies to reporting. Regardless, I'm gonna have to do this, I'm gonna have to spend five minutes a day, or 15 minutes on Monday mornings, and doing a time-savings analysis is one way to go at it. With the alerts, where we started, there's this separate part, because nobody's going to look at the data every day or every hour to monitor it. So it seems like that's a different calculus: if you don't have automation, it's not gonna happen, and there could be a business loss, because something could break and not be caught. And I feel like there are probably other factors that go into when to automate. I mean, I've probably automated stuff where the time savings didn't fully justify it, but it was just kind of the psychological…

12:43 Moe: Annoyingness.

12:45 Tim: Annoyingness of it and that, oh, if I actually get to go build something, then I’ll get the little charge out of having automated it. I’ll probably learn something along the way. And the side benefit is I’ll never have to do that again. And I might be able to repurpose it, maybe that’s another factor. Maybe I can’t justify it for automating this task, but now I’ve got something that’s three quarters built that I can automate the next task and just convince myself that it all works out as being the right thing to do.

13:15 Moe: Yeah, I’d agree with that.

13:17 ED: I actually have a good example of how the notification system saves us from having a lot of bad data. Besides the daily check, we also have a sort of semi-real-time check, because the real-time reporting in GA is also pretty limited. We have another system that checks the data of the past four hours, every hour. It's way more flexible, because we can use all the possible fields in the Reporting API of Google Analytics. The only downside is that we have data from four hours ago instead of right now, but it does give us an indication: when an important campaign measurement seems to be dropping, we get a notification within four hours. And we have examples where, normally, when that happens, say a campaign measurement breaks and flatlines at 3:00 in the afternoon, the day itself might just look like a bad day. If you don't have any notifications turned on, the second day is totally gone, and then on the third day you finally see it: you've missed a full day of data. When the data is checked every four hours, you know within a day that something is probably wrong, and you can look into it. So that's a big data quality saver as well with automation.
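A sketch of how that hourly, four-hour-window check could work against the same API, again assuming the service object and webhook from the first sketch. The key detail is that the window is shifted back to allow for GA's processing lag.

```python
from datetime import datetime, timedelta

# Today's goal completions broken down by hour.
resp = analytics.reports().batchGet(body={
    "reportRequests": [{
        "viewId": VIEW_ID,
        "dateRanges": [{"startDate": "today", "endDate": "today"}],
        "metrics": [{"expression": "ga:goal1Completions"}],
        "dimensions": [{"name": "ga:dateHour"}],
    }]
}).execute()
rows = resp["reports"][0]["data"].get("rows", [])
by_hour = {r["dimensions"][0]: int(r["metrics"][0]["values"][0]) for r in rows}

# Check the four-hour window ending four hours ago, since fresher data may
# not be in the Reporting API yet (midnight edge case ignored for brevity).
window = [(datetime.now() - timedelta(hours=h)).strftime("%Y%m%d%H")
          for h in range(4, 8)]
if all(by_hour.get(h, 0) == 0 for h in window):
    requests.post(SLACK_WEBHOOK, json={
        "text": ":rotating_light: No goal completions in the last reported "
                "four-hour window. Possible broken measurement."
    })
```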

14:35 Moe: Do the clients necessarily understand the value of this?

14:38 ED: No. [laughter]

14:39 Moe: 'Cause I'm just thinking: if you had missing data, by the time it got to day three, the client would be completely pissed. But in this circumstance, they don't know that. So, yeah, do they realize how awesome it is, or do you kind of have to prove the value to them?

15:01 ED: They start realizing it. With one client, we're actually in talks about setting up a different version that sends a notification to their developers whenever a metric drops. Where the message for our campaign team is really just the name of the metric they report on, for the developers we would include all the data layer variables that are required for a successful measurement. So when a measurement drops, they know, "Oh, we have SKU XYZ missing in the data layer, let's look into that so we have the right numbers again." And if you really wanna do it in a proper way, I think you should even monitor it as soon as the measurement fires, when a customer lands on the thank you page, so that you just validate: does the whole ecommerce data layer contain all the values that are required for a proper ecommerce measurement? And when a field is missing, you send a notification right away, instead of waiting for it to end up in your reports, checking it four hours later, and then sending a message out.
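A sketch of the kind of data-layer validation Erik is describing, written as a plain function over a captured ecommerce payload. The required field names and the example payload are hypothetical, and `SLACK_WEBHOOK` is the same assumed webhook as in the earlier sketches.

```python
import requests

# Hypothetical required fields for a valid ecommerce measurement.
REQUIRED = ["transactionId", "transactionTotal", "transactionProducts"]
REQUIRED_PER_PRODUCT = ["sku", "name", "price", "quantity"]

def missing_fields(data_layer):
    """Return the required fields missing from an ecommerce payload."""
    missing = [f for f in REQUIRED if f not in data_layer]
    for i, product in enumerate(data_layer.get("transactionProducts", [])):
        missing += [f"transactionProducts[{i}].{f}"
                    for f in REQUIRED_PER_PRODUCT if f not in product]
    return missing

# Example payload captured on the thank-you page; note the missing SKU.
payload = {"transactionId": "T-1001", "transactionTotal": 99.95,
           "transactionProducts": [{"name": "Widget", "price": 99.95,
                                    "quantity": 1}]}
problems = missing_fields(payload)  # -> ['transactionProducts[0].sku']
if problems:
    requests.post(SLACK_WEBHOOK, json={
        "text": ":warning: Thank-you page data layer is missing: "
                + ", ".join(problems)})
```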

16:06 Tim: So, alerting is kind of one way to check for data integrity, based on looking at the data flowing in. In the market, Alarmduck, and then the native things within Adobe Analytics and Google Analytics, are what feel like the market solutions for that, and they don't have nearly the marketing behind them that the other way of checking data does, which is the crawl-based auditing: the ObservePoints, the Hub'Scans. Do any of you guys have much experience with that kind of crawling and tag-checking on a recurring schedule? And do you have any thoughts about it?

16:52 Moe: Not really. I probably should look at it, I guess. To be honest, this whole conversation’s making me feel really guilty that at the moment the way I stumble on a problem is someone being like, “Oh, this looks weird.” And me being like, “Yeah, that does look weird, let’s do some digging.”

17:08 Tim: I feel like if you sat and watched Erik's presentation at Super Week, you'd be like, "Oh my God, I'm doing nothing, I'm terrible." As for the auditing and crawling tools, I always struggle with those, because we have clients who use them, but that's another one of those cases where the investment to set them up can be really, really high. What especially made me think of it, Erik, was when you were talking about the data layer on the "Order Complete" page. That's a lot easier to check in the data that's coming in, 'cause you've got data being passed in and you can say, "Did the data get passed in? Was it complete?"

17:50 Tim: Now, that's one step removed from saying, "Is the data layer right there?" Some other little things could have broken along the way. If you wanna do an audit and you wanna do it thoroughly, then you've gotta have something that can crawl all the way through and make the purchase, and you have to configure all the rules to say, "Here's how you check that everything looks valid in all of those data layer things," and I'm sure I'll upset some of those companies. At the level of saying, "Did a tag go away?", totally get it, 'cause you can just point it at the site and say, "Crawl it, make sure that the Adobe tag, or the Google tag, or whatever tags I wanna have on every page are there." It's when it comes to saying, "Are these custom dimensions, are these eVars, being populated under the correct scenario?"

18:36 Tim: To me, it winds up coming back to that question: what's the cost to set up that automation? And it's inherently gonna be kind of a blanket thing. I like the way you talked about the alerting system you guys have set up: "Look, we're trying to focus on the metrics being used for this campaign." It's hard to imagine a tag auditing tool having its configuration updated with each campaign, or even what that would look like. It just feels like a very high-cost thing to try to maintain, whereas alerting seems like it can be a lower cost that gets lower and lower as you build out your solution more and more.

19:19 ED: Yeah, well, we did do some tests with proper crawling, which is what you're referring to. We mainly use Python within the team to create our projects and do our automation stuff, but also our more advanced analyses. And there are these things called headless browsers, I don't know if any of you have heard of them, but you can basically simulate browser behavior with them. The hard part is that you need to program every behavior that a user needs to perform to get from the home page to the thank you page. You have to program it into your code, and it may work for the current website, but as we all know as analysts, websites tend to change a lot, and the measurements often break. So your crawling system will probably break just as often, because for some reason they decided to change the path of the click-outs to the checkout, and then not only might your whole measurement be broken, but your whole auditing system might be broken as well.
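For illustration, a minimal sketch of that headless-browser approach using Selenium with Chrome. The URL, CSS selectors, and expected dataLayer event are all hypothetical, and hard-coding them is exactly the fragility Erik is pointing at.

```python
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options

opts = Options()
opts.add_argument("--headless")
driver = webdriver.Chrome(options=opts)
try:
    # Script one path from product page to checkout (hypothetical selectors).
    driver.get("https://www.example.com/product/123")
    driver.find_element(By.CSS_SELECTOR, "#add-to-basket").click()
    driver.find_element(By.CSS_SELECTOR, "#checkout").click()

    # Inspect the dataLayer the tags would read from.
    events = driver.execute_script("return window.dataLayer || []")
    assert any(e.get("event") == "checkout" for e in events), \
        "checkout event missing from dataLayer"
finally:
    driver.quit()
```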

20:19 Tim: Plus the number of paths you can go down. How many different options: I bought one product, I bought three products, I used a coupon code, I checked that my shipping address was the same as my billing address. All these things that, if there's a bug in the system, could be what causes a tag to misfire. But chances are you scripted or recorded one path through, which is good if there's a catastrophic break; that would catch it. But if there's a corner case, "Oh, it was this scenario where somebody was checking out on a Samsung Galaxy running IE," or something weird, chances are you're not gonna script all of those. So even if you've done that, you'd still wanna have the alerting system as a secondary check, and in many cases, it feels like it would catch it first.

21:17 MH: Yeah, there are sort of these multiple points of entry into these types of tools: as you start, then along the way, and then sort of riding along the top. In the US, ObservePoint is one of those tools, and they've created this thing where you can record the journey, sort of, like, "Here's the path the customer has taken; if tagging gets broken along this path, let me know." But the problem is, if that path changes, or, to your point, the website changes, you're always having to go back and re-architect those paths, so you need some sort of alerting function in the mix as well.

21:56 Moe: And you need to automate your automation.

22:00 MH: Yeah, exactly.

22:00 Tim: But even then, you can record the path, but think of, say, an Adobe implementation where you'd say, "Well, at each step, these are the 17 patterns or values that I expect for each one of these variables." The same thing for GA: if you're populating custom dimensions and custom metrics, you have to say, "Here's the step I want you to follow; check that the tag's there; check that the tag has these values," or, even more robust but harder to do, "Here are the patterns to look for."

22:30 MH: Right.

22:31 Tim: 'Cause you wanna find that it's not passing a null value into something. Do you say it's passing a value, it's passing the exact value, it's passing a value that fits this pattern?

22:41 MH: Is it a properly formatted value and all those kinds of things.

22:45 Moe: I wanna just throw this up in the air. As we're talking about this, part of me is just like, "I don't know if it's worth it." The use case that Erik's talking about, a notification when you've got a campaign running and the success of the campaign depends on some metric, yeah, that sounds completely reasonable. But checking that every single tag fires every single time? I'm like, "I don't know." [chuckle]

23:12 Tim: Well, that's the thing: I don't think anybody actually does that; they never get to that. It has the capability to do that, and I'm pretty convinced it gets sold as though you're gonna do that, because nobody's thinking through what it would actually take to do it, and then to maintain it, which is what you guys have brought up: "Oh wait, now we've redesigned the site, or we've swapped something out, and now all of those crawlers are busted and we have to go back and tweak them."

23:40 Moe: Yeah.

23:40 MH: Yeah, and I think, to your point, Moe, it is a matter of scale in a certain sense. If you know your domain space well enough that you can get in there and do checks on a very efficient basis, then you don't necessarily need to employ that. But what if you're managing hundreds of websites in some capacity? Then you have no ability to cross-check all of them, and in that scenario you've gotta use tools as a force multiplier for checking certain things. I do actually think this is a space that could really incorporate machine learning effectively in different ways, and I'm sure some of the tools out there do, at least in some capacity.

24:29 Tim: I would say that when Adobe talked about their use of machine learning, or AI even, with Adobe Sensei, the cool demos were in the visual image stuff. At least as of a year or two ago, they were pointing at their anomaly detection and calling that machine learning, which is a little bit of a stretch. My experience with that is that in theory it sounds good: come up with a forecast and see if you fall outside the 95% or 99% expectation. But boy, it is really hard to get it to not trigger false positives, and you pretty quickly run into that "oh crap, I'm getting alerts" problem. The times I've set that up, trying to use things that are in the machine learning realm, I wind up going back and saying, "Screw that, I'm just gonna set a threshold: for this metric, alert if it drops below 10 in any given hour; for this metric, if it drops below 1,000 in any hour." And those wind up being more reliable. But that may be coming.
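A sketch of the plain-threshold fallback Tim describes: hypothetical hand-tuned per-hour floors instead of a statistical forecast, traded off against false positives.

```python
# Hand-tuned per-hour floors for a few metrics (hypothetical values).
THRESHOLDS = {"orders": 10, "sessions": 1000}

def breached(hourly_values):
    """Return a message for every metric that fell below its floor."""
    return [f"{metric} = {value} (floor {THRESHOLDS[metric]})"
            for metric, value in hourly_values.items()
            if metric in THRESHOLDS and value < THRESHOLDS[metric]]

print(breached({"orders": 7, "sessions": 1450}))
# -> ['orders = 7 (floor 10)'] -- blunt, but fewer false positives
#    than a badly tuned forecast interval.
```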

25:38 Moe: So can we talk about reporting now?

25:40 MH: Yes.

25:40 Moe: The thing I get really excited about. [laughter]

25:42 MH: Right, after I copy and paste our Twitter metrics and our YouTube metrics into my dashboard every day. [laughter]

25:52 Moe: Okay, the reason I really wanna talk about it is because, at the moment, yes, I am automating some reporting. And, big shock, I was chatting to my sister about it. I was basically going through and building out 16 metrics that I don't need right now, but that I know someone's gonna ask me for over the next six or nine months. So I was like, "I'll build them out now in all of my code," and this is where I definitely want some advice, and Erik, you might have some thoughts on this. What I was doing was creating all the calculations in my code, 'cause I don't know if I'm gonna stick with the reporting tool that I've got, and then I'm like, "Oh, I can pick up my code and just move it to another visualization tool if I want." My sister's like, "You're such an idiot. Whatever. Just use the calculated metrics, it's fine. Build something that you need for the next three months, and then in six months, if you need all of those 16 other metrics, build them then." I was like, "But I'm trying to future-proof it." And she's like, "You're a team of one. Build what you need for now and then deal with the hurdle when you get to it." So, who's right?

27:00 ED: Well, to be honest, I think your sister is. [laughter] I'm a big fan of creating the bare minimum you need right now to prove a point. In my opinion, it doesn't really make sense to make a report bigger or more advanced than it needs to be because you think it will be required in the future. And on the other end, I think it's also good, when someone has an extra request for you that you don't have ready right away, that you really involve them in the updates you're implementing for the report. So you don't add a metric because it's something you've always done and you know they're gonna need it; you add the metric because the business directly asked you to add it. That makes it a direct response to a business request, instead of an assumption that it will probably be a business request in the future.

27:54 Moe: There’s lots of eye-rolling going on, I know he’s right. And so is she. [chuckle]

27:58 Tim: Well… [chuckle] Well, I have watched the automation of reports. It is so predictable, and this is on the consulting or agency side with a client, where you try to say, "Hey, we get it, you need to see something on a recurring basis and we'll give you this basic report. This is not answering all your…" No matter how many times you say, "This is just so you're getting your KPIs and you can look at them," guaranteed, 100% of the time, it will come back with, "Can we add this? We have this one question, can we add this to the report?" And it's like, "No. It's one question. We can answer the question and move on." And then the reports start to bloat. And I feel like, as an analyst, it's easy to get sucked into that: "Oh cool, I'm gonna make this report bigger."

28:48 Tim: And I can go back 15 years: you wind up with reports that are 35 pages long. And then you say, "Oh, I'm gonna buy a BI tool so I can automate the production of this 35-page report that no one is actually accessing, but hey, once it's automated, who cares? What's the harm? If it's automated, it costs us nothing to produce." And I think there actually is harm in having reports go out. It's like the alerts: it's going out every day and literally no one's looking at it, and the analytics team is saying, "We're doing a good job, 'cause we're producing this daily report through automation," and no one's actually looking at it. That was like three rants blended into one, so I don't know how coherent…

29:33 Moe: Yeah, that was lots of rants. I agree, but I don't know. There's also, and this is where the whole effort, cost, and time balance comes into it: if you're writing the code, it's gonna take you a couple of minutes now, versus in three weeks when they look at your MVP and they're like, "Oh, can we add, like, a year-on-year calculation just so that we know this?", and going back then to make that change is gonna take you, like, four hours. I don't know, I totally see both perspectives. But there's this little bit of me that's like, "You know that they're gonna ask to see year-on-year, because that's what X executive always asks for." Isn't it worth the three or four minutes now to write one extra calc? I don't know. I see both. Yeah.

30:24 Tim: Putting my rant aside, I more often do what you just described. [chuckle] And it's like, they may not have asked for it, but I know they're going to as soon as they see it. They're gonna say, "Oh, can we get…"

30:32 Moe: It’s also like the quick win of someone being like, “Oh, I need this thing now.” And you’re like, “Oh yeah, I can add that to the report. Give me a couple of days.” And then you go back the next morning and you’re like, “Boom, I’ve added it.” And they’re like, “You’re amazing!”

30:47 Tim: I've got recent cases where they're like, "We need a report that does X," and I'm like, "Literally, the report you asked for has a drop-down that literally gives you that. I didn't really wanna pro… But I did, and I thought ahead." And they're like, "Oh." And then two weeks later they ask for something that's also in it, and I'm like, "It's a one-page report with some drop-downs." So, sorry, now we're getting to…

31:14 MH: So is that a form of automation though?

31:17 Tim: What?

31:18 MH: Predicting the future?

31:19 Moe: No, but that's about, when you're doing automation, should you go beyond what your stakeholder has asked for, if you know that it's something that…

31:30 Tim: Well, but there's also a way to go beyond. I've done this too, where I'm like, "They're gonna ask for this," so I went ahead and pulled 10 metrics in the automation, but they've only asked for these three, so I didn't expose the other ones. It cost me literally nothing other than a little bit of thought to prep for that, so that if they ask for it, it's not already exposed, 'cause they hadn't asked for it yet, but it's 80% of the way there and I can go enable it. I still feel like reporting in general is the… Even a lot of BI tools are like, "Look, you can make this dashboard, and the executive can look at it every day, and it's automated." And you're like, "Yeah!" And 95 times out of 100 they're not looking at it, or if they do look at it, it's not answering the questions they expect it to answer, and you're potentially on the slippery slope of trying to continue to automate things that really shouldn't be automated.

32:30 Moe: So here’s the other thing is the danger zone, not the drinking game, like the place as the analyst.

32:37 Tim: There’s a drinking game called Danger Zone?

32:39 Moe: Oh, Tim, that's a long story we're gonna have to take offline. When, and this is what always scares me, when reports especially start to get automated, whose job is it? Is this where you should be using alerts? The reports are just getting automated, you never have to look at them, and then suddenly a metric falls off a cliff. Is that the exact use case where you should have all your alerts set up? Because ultimately, once it's automated, and this was happening a bit at my last company, who owns the report? Is it the person that built it? The person that uses it? The relevant analyst for the team? Whose job is it to make sure that the report continues to be accurate? The ownership of that can be really tough, 'cause sometimes the people doing the automating are not the analysts on the team or the marketer that's actually using it.

33:36 ED: Well, to me, you basically summed up the very reason why we started the whole notification project. [laughter] So that's really, really nice. Basically, my view on the whole thing we do is: we do a lot of implementation work, which is the hard part of getting good data in. Then you can do analysis based on the data, which is something we as analysts can do, and we can create reports based on the data that are easier to use by the non-analysts in the business. But a lot of the decisions we make based on a report are basically if-X-then-Y decisions: you see something happening in the data. The easiest check we could automate was a flatline, because no one is looking at a report every day, but it's really important when there's a flatline in a report for an important metric. That's the very reason why we set up notifications: people don't tend to look at reports every day, but some of the checks we should do in those reports are really easy to automate. So why don't we just automate them, to make life better for all of us, basically?

34:43 Tim: I will make the case for not looking at every report. I've had cases where, if I'm sending a daily report, I know that nobody's looking at it every day, but at times it's the most expedient way to make sure that whenever they do need it, and they may not look at it for a month, they have the most recent data; they at least know that it's arriving and they can pull it. That still feels a little… If those are reports going into your inbox, that still feels kind of old school, but it's also somewhat the reality. Years ago I drew up sort of my conceptual platonic ideal of a good dashboard, and it was really nothing but notifications on a dashboard; everything else was grayed out. You have 20 metrics that are on the dashboard, but the actual dashboard only showed the ones that were out of spec based on your target or your needs. I think that's a precursor to push notifications: "Look, if you don't get notified, assume that everything is basically what you would expect it to be." The report is only gonna show something that's out of whack. Although people sometimes still wanna know how much traffic came to the site yesterday.

36:07 Moe: I was gonna say, like, as an analyst, and this is what scares me sometimes, you automate something, so you stop looking at it every day, and so you stop being so close to the numbers. And then someone says to you, "What was traffic yesterday?" and you're like, "It's on that dashboard." Whereas when you kind of have to do it every day, there's this subject matter expertise you get by osmosis, just from having to do whatever that two-minute task is that forces you to stay really close to the numbers.

36:38 Tim: Michael's gonna tell his story about the report he set up for himself that he looked at every day.

36:43 MH: I’m not, ’cause this has nothing to do with automation. It was an automated report.

36:47 Tim: It wasn’t automated. [chuckle]

36:48 MH: It was.

36:49 Tim: And you were looking at it every day.

36:51 MH: I did look at it every day. But it was for that reason: these were basically context metrics, just setting the context for everything. That way, as other information flies in, you can quickly sort it in your own mind. That's all.

37:08 Tim: I have a feeling like we’ve pissed off half our audience in one way or another with this whole discussion. But I don’t know how or where.

37:15 MH: That’s fine, let’s get started with the second half.

37:17 Moe: And when have you ever been worried about pissing people off?

[laughter]

37:20 MH: Yeah, Tim. Seriously. If you’re pissed off by this, fuck you.

37:26 Tim: That’s great.

[laughter]

37:31 Tim: We hadn’t sworn yet on the show, so that was the one reason.

37:33 Moe: I said shit, that counts.

37:35 Tim: Oh, you did? I'm sorry, I must have missed it. That's probably the other aspect of automation of reporting. There's automation where a tool has actually got the data available: it's automated, going somewhere that somebody can access whenever they want, and it's current data. And then there's automation of reports that are pushed through email or something else. But those both count as automation.

38:02 Moe: So okay, this is actually related to the topic. Erik, who makes sure that all of your scripts that do the automation are running? Do you have notifications to let you know that all of your Python code ran successfully? And how do you guys manage that?

38:21 Tim: And what monitors the scripts that are running those notifications? And is it a whole Inception kind of thing?

38:26 ED: Well, two things that are, I think, pretty important about automation. First of all, we don't have notifications for our own notification systems, but I do really like the concept.

38:36 Tim: Do you have an intern who just sits there and watches the whole time? ‘Cause that would be awesome.

38:40 ED: What we do with all our automation projects is, we measure them. We have a Measurement Protocol hit go out for every notification that gets sent, so we know how much value we could potentially have added to the business. We have another project where we place tags automatically, and we measure every tag we've placed automatically with a Measurement Protocol hit, so we can show how much automation we do. So that's one thing we do. And now I have to think about the other part of the question.
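A sketch of "measuring the automation" via the (Universal Analytics era) Measurement Protocol: one event hit per notification sent. The property ID and event names are hypothetical.

```python
import uuid
import requests

def record_notification(check_name):
    """Send one Measurement Protocol event per notification we send."""
    requests.post("https://www.google-analytics.com/collect", data={
        "v": "1",                  # protocol version
        "tid": "UA-000000-1",      # hypothetical UA property
        "cid": str(uuid.uuid4()),  # anonymous client ID
        "t": "event",
        "ec": "automation",          # event category
        "ea": "notification_sent",   # event action
        "el": check_name,            # which check fired
    })

record_notification("flatline:insured-people")
```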

39:09 Tim: Well, how do you actually monitor… How do you know if the script breaks?

39:13 ED: Yeah, so we don't monitor whether the script breaks, but we did set it up in a really scalable way. Instead of having one big project running, in our case on AWS, Amazon Web Services, we set it up so one script gets all the notification lines from the Google Sheet, and that fires another script per notification that we wanna check. So if one of the notifications is configured incorrectly, or has a corrupted setup or whatever, that's the only notification that breaks. We don't monitor that right now, but we could potentially set that up as well. It's pretty easy to do, actually. So probably when I'm done with this show, I'm gonna write some tickets in our system and have someone build that. That's a really good one.
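A sketch of that fan-out pattern on AWS Lambda via boto3, assuming a hypothetical worker function name: the dispatcher reads one row per configured check and invokes the worker asynchronously, so a single corrupt row only breaks its own check.

```python
import json
import boto3

lam = boto3.client("lambda")

def dispatch(check_rows):
    """Fire one async worker invocation per configured check."""
    for row in check_rows:  # one row per line in the Google Sheet
        lam.invoke(
            FunctionName="run-single-data-check",  # hypothetical worker Lambda
            InvocationType="Event",                # async, fire-and-forget
            Payload=json.dumps(row).encode(),
        )

dispatch([
    {"client": "acme", "metric": "ga:goal1Completions", "type": "flatline"},
    {"client": "globex", "metric": "ga:transactions", "type": "daily_report"},
])
```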

39:58 Tim: The next hackathon, the first hour of the next hackathon is gonna be…

40:02 ED: But yeah, we did think of scalability, making sure that every function we wanna run, so each data check, is handled independently from the others. I think we have tens of checks configured right now, and normally, if you had one big project and one of them was badly configured, the whole project would break: no notification would be sent, and no data check would be run by the system. Now that only happens to the single check that's badly configured. But we don't notify people, to answer your question.

40:37 Tim: Slacker. [laughter] It's like, our code, we don't have to worry about other people, the developers, screwing up other stuff. Our code is rock solid.

40:46 ED: We never make mistakes.

40:47 MH: That’s right.

[chuckle]

40:49 ED: Actually, when I uploaded the project for the first time, I did a big screw-up, so that was pretty funny. I thought we were uploading the real-time checks, and I was really excited about the project, so I did it on a Saturday morning. I looked for the first real-time function that we had in AWS, and I actually overwrote a client's real-time Snowplow pipeline function with our notification system, and I didn't know how to roll it back. Luckily, it was a test pipeline, so it wasn't used by the client yet, but I got a pretty bad feeling when that happened.

[laughter]

41:26 ED: But you learn from your mistakes, right?

41:28 MH: Automation.

41:29 Moe: Absolutely.

41:30 MH: Double-edged sword.

41:30 Moe: I’ve got lots of learning to prove it.

41:34 Tim: This is awesome. Alright, well, we do need to head towards wrap up, but as we get started on that, this has been a great conversation, and I don’t think automation could have helped it be more better. So.

41:46 MH: That is just the sort of delusional subjective assessment that human beings tend to make.

41:51 Tim: Great. Unique. And I would say, well, Erik did a presentation at Super Week, and we will throw in the link to those…

41:57 ED: Oh yeah. Excellent.

42:00 Tim: Slides, which don't have all the context, 'cause it… But you'll learn about the Haddon Matrix?

42:05 ED: Yeah.

42:05 Tim: The Haddon Matrix and other exciting things, and then you'll be able to pepper Erik with questions on the details through the Measure Slack.

42:14 Tim: I've edited out the un-funny joke that Michael made about my cousin, the Large Hadron Collider, at this point, because I found it mildly offensive. Anyway, one thing we don't edit out on the show is our last calls, so we just go around the horn and talk about something we found interesting. Moe, do you have a last call?

42:32 Moe: Okay, mine is just for the lols today. I feel like everyone needs a laugh. So, Football Federation Australia has had this big article published about a survey they put out. Basically, the survey was a failure because it shows that everyone really doesn't care about football, AKA soccer, but the actual failure, when you read the article, is the visualizations of the survey results. If you ever need an example of bad data visualization mistakes, this is 100% the article to read. It is so entertaining. Everyone in Australia was reading it, and we've had the most robust discussion about how bad it is. So, yeah, if you need a laugh today, or just to know that you're doing your job properly, we'll share the link to that article.

43:25 Tim: Very nice. Can I go next? Mine is a companion…

43:29 MH: Yeah.

43:29 Tim: Last call.

43:30 MH: Hey Tim, what’s your last call?

[laughter]

43:34 Tim: Well, one, I don't think I've ever done a last call that came through Pawel Kapuscinski, and both of you have.

43:43 Moe: What?

43:43 Tim: So, I'm gonna rectify that. He shared a link with me from the Financial Times called "The science behind good charts." It talks about pie charts versus bar charts and why, but what's kind of interesting is that it actually has sort of a quiz. The examples are a little bit contrived: you're trying to pick the third-largest segment on a pie chart that even a person who's terrible with pie charts would never actually make. But it's useful to go through and just pay attention to how much harder you have to think to try to answer these questions, and you'll get some of them wrong. So it's also a data viz oriented thing, but it's a little more interactive than just reading: it's got reading about the concepts combined with actually trying to apply them by answering quiz questions.

44:39 MH: Very nice.

44:40 Tim: Courtesy of Pawel.

44:42 MH: Alright. Erik, what’s your last call?

44:45 ED: Well, I'm actually a big reader, so my last call is a book, and it's pretty interesting because it's a book from 1976. It's by Joseph Weizenbaum, and it's called Computer Power and Human Reason. Basically, he created it…

45:01 Tim: From 1976, you said?

45:02 ED: Yeah. So basically, he created a chatbot in '76 at MIT. That's pretty awesome all by itself. And basically, what he saw was that people were projecting human emotions onto this machine, even though they knew it was a machine, and that made him think: even if we're able to put a lot of computer power into making human decisions, we never really thought about which decisions we should hand over to computers. So basically, his whole book is an argument that goes from explaining the concept of computers to whether or not we should hand over certain decisions to computers at all, which is a really interesting thought, especially since it's a book from '76.

45:47 Tim: Is chapter seven about having computers do alerting notifications on your web traffic?

45:55 ED: Yes. No, no, it’s not. [laughter]

45:57 Tim: He was very prescient.

45:58 MH: If it was, that’d be amazing. I love that.

46:01 Moe: I love that we’re talking about this amazing old book and I’m like, “Haha look how stupid someone is.”

46:06 Tim: I'm looking at that article right now, and oh my God, these are pretty atrocious visualizations.

46:11 MH: They’re good, they’re really good.

46:13 Tim: They’re, yeah.

46:14 MH: And let's not take away from the fact that there is a guy from MIT who created a chatbot in 1976. So he's…

46:20 Tim: Especially since we threw MIT under the bus on the last episode of the podcast.

[chuckle]

46:23 MH: Exactly. Alright.

46:26 Tim: What’s your last call, Michael?

46:28 MH: Well, I'm so glad you asked. We can't do a show about alerts and that kind of stuff without me bringing up my two close friends and colleagues who created Alarmduck. So if you have Adobe Analytics and you wanna integrate anomaly detection into Slack, you should definitely check out Alarmduck. Sorry, that's just a little plug there. Those guys are awesome, and Alarmduck is pretty cool. But really, what I wanna talk about is… well, I have two more things I wanna talk about, 'cause there's one that's fun. I'll do the fun one first, and then I'll just sort of get a vibe for whether we've got more time.

47:04 Tim: You check with the moderator.

47:06 MH: Yeah, I'll check with the moderator, see if he'll let me eke out a three-fer. There's this guy I randomly ran across on Twitter who's got this new comic series that just blew up. He's publishing it on Instagram, it's called Strange Planet, and his comics are hilarious and a little bit quirky, and I love them. And you should check them out, because I think you will love them too. Okay, now, you've been listening and you've been thinking, "Well, it's obvious to me that if you automated this aspect of this thing, as we've been discussing, you could do this, that, or the other thing." I don't really know what you've been thinking, I'm not a mind reader, but you might have been thinking about things pertaining to the show, and we'd love to hear about it. The best way to talk to us is on the Measure Slack, where you can reach all of us, including Erik. Also, we have our new LinkedIn group, so go on LinkedIn, search for it, and join; we'll let you in. We don't really do anything on it yet, but that's definitely something. And then, obviously, Twitter, you can reach out to us there as well. Erik, thank you so much for coming on the show, it's been delightful.

48:15 ED: Thank you for having me.

[laughter]

48:15 Moe: Wow.

48:16 ED: There you go.

48:20 MH: So you know what’s funny? So, here’s the funny thing, Erik. One time I didn’t stop after saying that, and everybody gave me a bunch of grief. And so now, I’m like, “Well, I’ll stop” and then you’re just sort of like…

48:32 Tim: Erik listens enough. He knows like… Yeah, he’s not really looking for a response, he’s gonna talk [48:36] __.

48:37 MH: Yeah. Just gonna go right through.

[laughter]

48:42 MH: I can't win for losing. Alright, no, it's fine. Anyways, it has been awesome having you on the show, and I think I can join both of my co-hosts, Tim and Moe, in telling every analyst out there: get smarter, get lazier, but gosh darn it, keep on analyzing.

[music]

49:04 AP: Thanks for listening, and don’t forget to join the conversation on Facebook, Twitter, or Measure Slack group. We welcome your comments and questions. Visit us on the web at analyticshour.io, facebook.com/analyticshour or at Analytics Hour on Twitter.

49:24 S?: So smart guys want to fit in. So they made up a term called analytic. Analytics don’t work.

49:32 S?: Analytics. Oh my God. What the fuck does that even mean?

49:41 Moe: Seriously, the last few episodes I'm like, "Something's wrong. Tim looks really concerned." And then I realized it's just his face…

49:49 MH: One of the 17 reasons that I’ve now decided that Australia’s on the short list for our next move.

49:56 Moe: You just take photos, go kayaking.

[chuckle]

50:03 MH: You just summed it up. Yeah, here's what Tim does: he takes photos and goes kayaking.

50:08 ED: Basically, two years ago, I heard about Super Week through your podcast.

50:13 Moe: Really?

50:13 ED: And then a year later I got to present at Super Week, and that got me onto the podcast. So, wooh, it's kind of a nice circle.

50:21 Moe: That is a great circle.

50:23 MH: After today, we will never speak again, but just…

[laughter]

50:28 Moe: You know a joke's not funny when you have to explain it, right?

50:31 MH: Well, I know, but then Tim’s gonna come and edit out the parts and it’s just gonna sound really…

50:36 Tim: I’m not sure, I’m not sure, I still don’t get it.

50:40 Moe: Sorry just sorting that out.

50:41 Tim: It’s alright, I was pointing off. So I was pointing at you, but I realized it was not, it wasn’t in the camera there.

50:47 Moe: I know you were doing this on… It’s fine, I got it, don’t worry, with you.

50:52 Tim: That’s as funny as your pregnant pause at the beginning of the episode.

50:55 MH: We might have already edited that whole thing out, who knows?

[laughter]

51:02 MH: I’m gonna go back to bed now.

51:06 MH: Rock, flag, and automation.

[music]
